Automatic translation of TypeTokenFilterFactory to Elasticsearch turns blacklists into whitelists

Description

TypeTokenFilter in Lucene acts as a blacklist by default, removing some tokens according to their type. It can act as a whitelist if the "useWhitelist" parameter is set to true.

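For Elasticsearch analyzers, we translate TypeTokenFilterFactory to Elasticsearch's "keep_types" filter type, and we forbid the use of "useWhitelist". Since "keep_types" only keeps tokens of the listed types, a filter that acts as a blacklist in Lucene ends up acting as a whitelist in Elasticsearch: the translation effectively inverts the meaning of the filter.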

What we should do instead is mandate the use of "useWhitelist", and throw an exception when it is either missing or set to anything other than "true".

Note: there doesn't seem to be an equivalent to the blacklist mode of TypeTokenFilterFactory in Elasticsearch.
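
A minimal sketch of the proposed check, for illustration only. The class and method names (KeepTypesTranslationSketch, translate) are hypothetical and not taken from the existing code base. The sketch also assumes that the token types are passed directly as a comma-separated "types" parameter, which is a simplification, and it represents the resulting Elasticsearch filter definition as a plain map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: illustrates the proposed validation, not the actual translator.
final class KeepTypesTranslationSketch {

    private KeepTypesTranslationSketch() {
    }

    // Translates TypeTokenFilterFactory parameters to a "keep_types" filter definition.
    // Rejects any definition that is not explicitly a whitelist.
    static Map<String, Object> translate(Map<String, String> luceneParams) {
        String useWhitelist = luceneParams.get("useWhitelist");
        // Strict comparison: a missing value or anything other than "true" is rejected,
        // because "keep_types" can only express the whitelist behavior.
        if (!"true".equals(useWhitelist)) {
            throw new IllegalArgumentException(
                    "TypeTokenFilterFactory can only be translated to 'keep_types' when"
                    + " 'useWhitelist' is set to 'true', but got: " + useWhitelist);
        }

        // Assumption: types are given inline, separated by commas.
        List<String> types = new ArrayList<>();
        String rawTypes = luceneParams.get("types");
        if (rawTypes != null) {
            for (String type : rawTypes.split(",")) {
                String trimmed = type.trim();
                if (!trimmed.isEmpty()) {
                    types.add(trimmed);
                }
            }
        }

        Map<String, Object> definition = new LinkedHashMap<>();
        definition.put("type", "keep_types");
        definition.put("types", types);
        return definition;
    }
}
```

With this check, a definition that omits "useWhitelist" (a blacklist in Lucene) fails fast instead of being silently translated to a whitelist.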

Environment

None

Status

Assignee

Yoann Rodière

Reporter

Yoann Rodière

Components

Fix versions

Affects versions

5.6.1.Final
5.7.0.Final

Priority

Major