Test (and document) alternatives to analyzer discriminators in Search 6

Description

Reasons to remove analyzer discriminators:

  • We can't implement them in the Elasticsearch backend

  • They are bad practice anyway: storing the result of applying different analyzers in the same field is a great way to get unpredictable search results.

There is, however, an alternative. Let's take the example of a language-based analyzer discriminator.
Instead of using the same field for every language, and just changing the analyzer based on the language of the document, we could change the field based on the language: for English we put the data in "myField_en", for French we put it in "myField_fr", etc. As long as the list of supported languages is known in advance (and honestly, why wouldn't it be?), we can easily write a type bridge with this behavior.
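
To make the idea concrete, here is a minimal sketch in plain Java. It only illustrates the routing logic: `LanguageFieldRouter` and its methods are hypothetical names, not Hibernate Search APIs, and a real type bridge would write to index fields rather than to a map.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper: picks a per-language field instead of switching analyzers. */
final class LanguageFieldRouter {
    private final String baseName;
    private final Set<String> supportedLanguages;

    LanguageFieldRouter(String baseName, Set<String> supportedLanguages) {
        this.baseName = baseName;
        // The list of supported languages is assumed to be known in advance.
        this.supportedLanguages = Collections.unmodifiableSet(new HashSet<>(supportedLanguages));
    }

    /** Returns the field name for a language, e.g. "myField_en" for "en". */
    String fieldNameFor(String language) {
        String normalized = language.toLowerCase(Locale.ROOT);
        if (!supportedLanguages.contains(normalized)) {
            throw new IllegalArgumentException("Unsupported language: " + language);
        }
        return baseName + "_" + normalized;
    }

    /** Builds a one-entry "document" mapping the language-specific field to the value. */
    Map<String, String> route(String language, String value) {
        Map<String, String> document = new LinkedHashMap<>();
        document.put(fieldNameFor(language), value);
        return document;
    }
}
```

Each per-language field can then be mapped to a single, fixed analyzer, so a given field never mixes the output of different analyzers.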

We could even write a type bridge with a custom annotation and custom marker annotations to re-use the same bridge on multiple entities. Example of use:
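
A minimal sketch of what such usage might look like, in plain Java. The annotation names (`@LanguageMarker`, `@MultiLanguageField`), the `Book` entity and the reflection-based `index` method are all hypothetical; they stand in for a real type bridge and do not use Hibernate Search APIs.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

final class MultiLanguageSketch {

    /** Hypothetical marker: the property holding the document's language. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface LanguageMarker {}

    /** Hypothetical marker: a property to index into "<name>_<language>". */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface MultiLanguageField {
        String name();
    }

    /** Example entity; any entity using the same markers could share the bridge. */
    static final class Book {
        @LanguageMarker
        String language;

        @MultiLanguageField(name = "title")
        String title;

        Book(String language, String title) {
            this.language = language;
            this.title = title;
        }
    }

    /** Stand-in for the bridge: reads the markers and routes values to per-language fields. */
    static Map<String, String> index(Object entity) {
        try {
            String language = null;
            for (Field field : entity.getClass().getDeclaredFields()) {
                if (field.isAnnotationPresent(LanguageMarker.class)) {
                    field.setAccessible(true);
                    language = (String) field.get(entity);
                }
            }
            if (language == null) {
                throw new IllegalStateException("No language found on " + entity.getClass());
            }
            Map<String, String> document = new LinkedHashMap<>();
            for (Field field : entity.getClass().getDeclaredFields()) {
                MultiLanguageField marker = field.getAnnotation(MultiLanguageField.class);
                if (marker != null) {
                    field.setAccessible(true);
                    document.put(marker.name() + "_" + language, (String) field.get(entity));
                }
            }
            return document;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because the logic depends only on the markers, the same bridge can be applied to any entity that declares a language property and one or more multi-language properties.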

Let's not expose this as an API right now, but at least let's test this solution. If it works, let's document it in the migration guide (HSEARCH-3283). If it doesn't, we might have to consider adding some sort of support for analyzer discriminators in the Lucene backend, perhaps as an extension.

Environment

None

Assignee

Yoann Rodière

Reporter

Yoann Rodière

Labels

None

Suitable for new contributors

None

Feedback Requested

None

Priority

Major