Automatic translation of TypeTokenFilterFactory to Elasticsearch turns blacklists into whitelists

Description

TypeTokenFilter in Lucene acts as a blacklist by default, removing some tokens according to their type. It can act as a whitelist if the "useWhitelist" parameter is set to true.

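For Elasticsearch analyzers, we translate TypeTokenFilterFactory to Elasticsearch's "keep_types" filter type, which acts as a whitelist, and we forbid the use of "useWhitelist". As a result, a filter that removes the listed token types in Lucene keeps only those types in Elasticsearch: the translation inverts the meaning of the filter.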
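To make the inversion concrete, here is a minimal sketch in plain Java that imitates the two modes without depending on Lucene or Elasticsearch. The class `TypeFilterModes`, the `Token` holder and the `filter` method are hypothetical names introduced for illustration only; they are not part of either library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class TypeFilterModes {

    // Hypothetical holder: a token reduced to its text and its type, e.g. "<NUM>".
    public static final class Token {
        final String text;
        final String type;

        public Token(String text, String type) {
            this.text = text;
            this.type = type;
        }
    }

    // Imitates the semantics described in this ticket:
    // useWhitelist=false drops tokens whose type is listed (blacklist),
    // useWhitelist=true keeps only tokens whose type is listed (whitelist).
    public static List<String> filter(List<Token> tokens, Set<String> types, boolean useWhitelist) {
        List<String> kept = new ArrayList<>();
        for (Token token : tokens) {
            boolean listed = types.contains(token.type);
            if (listed == useWhitelist) {
                kept.add(token.text);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Token> tokens = List.of(
                new Token("call", "<ALPHANUM>"),
                new Token("42", "<NUM>"));
        Set<String> types = Set.of("<NUM>");

        // Lucene default (blacklist): the number is removed.
        System.out.println(filter(tokens, types, false)); // prints [call]
        // Whitelist mode, as with "keep_types": only the number survives.
        System.out.println(filter(tokens, types, true));  // prints [42]
    }
}
```

With the same list of types, the two modes return complementary results, which is why translating a default (blacklist) definition to "keep_types" changes what gets indexed.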

What we should do instead is require "useWhitelist" and throw an exception when it is either missing or set to anything other than "true".
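The proposed check could look roughly like the sketch below. It is not the actual translation code: `KeepTypesTranslator` and `translate` are hypothetical names, the token types are assumed to be already resolved into a list, and the strict comparison against the literal "true" is an assumption about how the parameter should be parsed. Only the "keep_types" filter type and its "types" parameter come from Elasticsearch.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeepTypesTranslator {

    // Hypothetical translation step: builds a "keep_types" definition from
    // TypeTokenFilterFactory parameters, refusing anything but whitelist mode.
    public static Map<String, Object> translate(Map<String, String> luceneParams, List<String> types) {
        String useWhitelist = luceneParams.get("useWhitelist");
        // Assumption: only the exact string "true" is accepted.
        if (!"true".equals(useWhitelist)) {
            throw new IllegalArgumentException(
                    "TypeTokenFilterFactory can only be translated to Elasticsearch"
                    + " when 'useWhitelist' is set to 'true', but got: " + useWhitelist);
        }
        Map<String, Object> definition = new LinkedHashMap<>();
        definition.put("type", "keep_types");
        definition.put("types", types);
        return definition;
    }

    public static void main(String[] args) {
        // Accepted: explicit whitelist mode.
        System.out.println(translate(Map.of("useWhitelist", "true"), List.of("<NUM>")));

        // Rejected: the parameter is missing, so Lucene would act as a blacklist.
        try {
            translate(Map.of(), List.of("<NUM>"));
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Failing fast here means a blacklist definition is rejected at translation time instead of being silently applied with the opposite meaning.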

Note: there doesn't seem to be an equivalent to the blacklist mode of TypeTokenFilterFactory in Elasticsearch.

Environment

None

Assignee

Yoann Rodière

Reporter

Yoann Rodière

Labels

None

Suitable for new contributors

None

Feedback Requested

None

Priority

Major