AnalyzerDiscriminator is called for parent entity fields

Description

As I understand it, an analyzer discriminator is meant to be defined on classes or fields that should not use the standard analyzer. However, DocumentBuilderIndexedEntity.allowAnalyzerDiscriminatorOverride() applies the discriminator to all unprocessed fields in the document, which may include fields that the discriminated analyzer should not be applied to.

The discriminator itself cannot tell whether the field passed to it belongs to the entity the discriminator is defined on.

Consider the following setup (irrelevant parts omitted):

The bridge creates custom fields per language, i.e. we'd get fields like "texts.name_<language>" and "texts.description_<language>".

When the document is created/updated, the fields from Article (articleNumber and internalText) are added to the document first. Then the first text is processed and during that operation the analyzer discriminator is called.

At that moment, if the text was a German text, the document would contain the following fields:

The discriminator would determine the analyzer to be, let's say analyzer_de. Because the fields from Article have not been processed yet, the fieldToAnalyzerMap would now look like this:

As you can see, fields like internalText would now use analyzer_de instead of the default analyzer, which does not seem to be intended.
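The effect described above can be modelled with a small, self-contained sketch. This is not the Hibernate Search implementation: the class and method names are hypothetical, and the field names are inferred from the description (articleNumber and internalText from Article, plus the per-language fields created by the bridge). It only illustrates what happens when the discriminated analyzer is assigned to every unprocessed field in the document.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified model of the reported behavior.
class DiscriminatorOverrideSketch {

    // Assigns the discriminated analyzer to every document field that has
    // not been processed yet, regardless of which entity added the field.
    static Map<String, String> applyToUnprocessed(List<String> documentFields,
                                                  Set<String> processedFields,
                                                  String analyzerName) {
        Map<String, String> fieldToAnalyzerMap = new LinkedHashMap<>();
        for (String field : documentFields) {
            if (!processedFields.contains(field)) {
                fieldToAnalyzerMap.put(field, analyzerName);
            }
        }
        return fieldToAnalyzerMap;
    }

    public static void main(String[] args) {
        // Assumed document content while the first German text is processed.
        List<String> documentFields = List.of(
                "articleNumber", "internalText",
                "texts.name_de", "texts.description_de");

        // Nothing has been marked as processed yet, so the Article fields
        // are mapped to analyzer_de as well.
        System.out.println(
                applyToUnprocessed(documentFields, Set.of(), "analyzer_de"));
    }
}
```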

Possible approaches to a solution:

  1. only apply the discriminator on the fields which have been added by the current entity (possibly even passing it to buildDocumentFields() in order to use it for embedded entities as well)

  2. pass some information to the discriminator whether the field was added by the current entity or not, enabling the discriminator to decide itself

I'd prefer solution no. 1, since it would make it easy to pass the analyzer down to embedded entities and use it there, unless another discriminator overrides it.
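As a rough illustration of solution no. 1, the sketch below restricts the override to the fields that the current entity added. It is a standalone model rather than a patch against DocumentBuilderIndexedEntity: the names are hypothetical, and it assumes the document builder can track which fields each entity contributed.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of solution no. 1: scope the override to own fields.
class ScopedDiscriminatorSketch {

    // Only fields added by the entity carrying the discriminator receive
    // the discriminated analyzer. Fields missing from the returned map are
    // meant to keep the default analyzer.
    static Map<String, String> applyToOwnFields(List<String> documentFields,
                                                Set<String> fieldsAddedByCurrentEntity,
                                                String analyzerName) {
        Map<String, String> fieldToAnalyzerMap = new LinkedHashMap<>();
        for (String field : documentFields) {
            if (fieldsAddedByCurrentEntity.contains(field)) {
                fieldToAnalyzerMap.put(field, analyzerName);
            }
        }
        return fieldToAnalyzerMap;
    }

    public static void main(String[] args) {
        List<String> documentFields = List.of(
                "articleNumber", "internalText",
                "texts.name_de", "texts.description_de");
        // Assumption: the text entity contributed only the "texts." fields.
        Set<String> ownFields = Set.of("texts.name_de", "texts.description_de");

        System.out.println(
                applyToOwnFields(documentFields, ownFields, "analyzer_de"));
    }
}
```

With this scoping, articleNumber and internalText never enter the map, so they would fall back to the default analyzer.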

Currently, I can think of a few workarounds:

  1. define a discriminator on each entity

  2. set the default analyzer on the root entity (or on each embedded entity as well? I use the programmatic API, so I'm not sure here; the programmatic API also doesn't seem to support that directly)

  3. define a discriminator that determines the analyzer based on the field suffix
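Workaround no. 3 could look roughly like the sketch below. To keep it self-contained, it declares a local FieldDiscriminator interface as a stand-in whose single method mirrors the shape of Hibernate Search's Discriminator.getAnalyzerDefinitionName(value, entity, field). The language set, the "analyzer_" naming scheme, and the class names are assumptions for illustration. Returning null is intended to mean "do not override the default analyzer".

```java
import java.util.Set;

// Hypothetical sketch of a suffix-based discriminator.
class SuffixDiscriminatorSketch {

    // Local stand-in for the discriminator contract (an assumption, so the
    // example compiles without Hibernate Search on the classpath).
    interface FieldDiscriminator {
        String getAnalyzerDefinitionName(Object value, Object entity, String field);
    }

    static class SuffixDiscriminator implements FieldDiscriminator {
        private final Set<String> languages;

        SuffixDiscriminator(Set<String> languages) {
            this.languages = languages;
        }

        @Override
        public String getAnalyzerDefinitionName(Object value, Object entity, String field) {
            if (field == null) {
                return null;
            }
            int idx = field.lastIndexOf('_');
            if (idx < 0 || idx == field.length() - 1) {
                return null; // no language suffix: keep the default analyzer
            }
            String suffix = field.substring(idx + 1);
            // Only known language suffixes select a language analyzer.
            return languages.contains(suffix) ? "analyzer_" + suffix : null;
        }
    }

    public static void main(String[] args) {
        FieldDiscriminator d = new SuffixDiscriminator(Set.of("de", "en"));
        System.out.println(d.getAnalyzerDefinitionName(null, null, "texts.name_de"));
        System.out.println(d.getAnalyzerDefinitionName(null, null, "internalText"));
    }
}
```

Because the decision depends only on the field name, fields such as internalText are left alone even when the discriminator is invoked for them.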

Activity

Yoann Rodière May 11, 2020 at 7:43 AM

No longer relevant: @AnalyzerDiscriminator has been removed in Hibernate Search 6.
See for how to implement similar functionality in Hibernate Search 6.

Out of Date

Details

Created September 11, 2013 at 12:51 PM
Updated May 11, 2020 at 7:44 AM
Resolved May 11, 2020 at 7:44 AM