Lucene's unified highlighter should produce multiple highlighted snippets for multi-valued fields

Description

By default, Lucene’s unified highlighter concatenates all snippets into a single string. For example, if we have a multi-valued field with:

Applying a unified highlighter to it will result in:

While using other highlighter types, or if using the Elasticsearch backend, the result will be:

To get Lucene’s unified highlighter to behave the same way, we’d need custom implementations of both PassageFormatter and FieldHighlighter.

Alternatively, we might simply expose the ellipsis parameter of DefaultPassageFormatter to users through a Lucene-specific unified highlighter DSL.
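
To make the two output shapes concrete, here is a minimal plain-Java model of the difference. It deliberately does not use any Lucene classes: SnippetShapes, mark, joined and separate are hypothetical names introduced only for illustration, and the "<b>" tags and "... " separator are assumed values passed in by the caller. It is a sketch of the behavior described above, not the actual highlighter implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model only; none of these names exist in Lucene or Hibernate Search.
class SnippetShapes {

    // Wraps every occurrence of the term in tags, roughly what a passage formatter does.
    static String mark(String passage, String term, String preTag, String postTag) {
        return passage.replace(term, preTag + term + postTag);
    }

    // Shape described as the current default: all passages joined into one string,
    // with the ellipsis inserted between consecutive passages.
    static String joined(List<String> passages, String term, String ellipsis) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < passages.size(); i++) {
            if (i > 0) {
                sb.append(ellipsis);
            }
            sb.append(mark(passages.get(i), term, "<b>", "</b>"));
        }
        return sb.toString();
    }

    // Shape described as the desired one: one highlighted snippet per passage.
    static List<String> separate(List<String> passages, String term) {
        List<String> out = new ArrayList<>();
        for (String passage : passages) {
            out.add(mark(passage, term, "<b>", "</b>"));
        }
        return out;
    }
}
```

In this model, exposing the ellipsis only changes the separator used by joined, while the custom-formatter route corresponds to returning the list produced by separate.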

Activity

Yoann Rodière March 31, 2023 at 1:15 PM

+1 for aligning the behavior with Elasticsearch. Requalifying as a bug, since the current behavior is not the one we want and will need to change in a backwards-incompatible way.

Fixed

Created March 28, 2023 at 10:58 AM
Updated June 2, 2023 at 1:33 PM
Resolved May 31, 2023 at 8:01 AM