Range fields and predicates for these fields

Description

When a document indexes a range, searching for all documents whose range contains a given value does not perform very well: we have to run a query such as rangeStart <= myValue && myValue <= rangeEnd. Each clause on its own can match a large subset of documents, so computing the intersection of the two result sets can be very resource-intensive.
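
To make the cost concrete, here is a minimal plain-Java sketch of the two-clause evaluation. It is not Hibernate Search or Lucene code: the class TwoClauseRangeSketch, its methods, and the array-based "index" are hypothetical, and a real index would evaluate each clause very differently. It only shows that each clause can match about half the documents while the intersection is tiny.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; document ids are array positions.
final class TwoClauseRangeSketch {

    // Clause 1: ids of documents with rangeStart <= value.
    static Set<Integer> startAtMost(int[] starts, int value) {
        Set<Integer> ids = new HashSet<>();
        for (int id = 0; id < starts.length; id++) {
            if (starts[id] <= value) {
                ids.add(id);
            }
        }
        return ids;
    }

    // Clause 2: ids of documents with value <= rangeEnd.
    static Set<Integer> endAtLeast(int[] ends, int value) {
        Set<Integer> ids = new HashSet<>();
        for (int id = 0; id < ends.length; id++) {
            if (value <= ends[id]) {
                ids.add(id);
            }
        }
        return ids;
    }

    // Intersection of both clauses: documents whose range contains value.
    static List<Integer> containing(int[] starts, int[] ends, int value) {
        Set<Integer> left = startAtMost(starts, value);
        Set<Integer> right = endAtLeast(ends, value);
        List<Integer> result = new ArrayList<>();
        for (int id : left) {
            if (right.contains(id)) {
                result.add(id);
            }
        }
        Collections.sort(result);
        return result;
    }
}
```

With 1000 documents where document i holds the range [10*i, 10*i + 9], looking up the value 5000 makes each clause match roughly 500 documents, yet only document 500 survives the intersection.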

See this example (in Infinispan)

Lucene has dedicated field types and queries that presumably perform better, since they are specialized:

  • org.apache.lucene.document.IntRange

  • org.apache.lucene.document.LongRange

  • org.apache.lucene.document.DoubleRange

  • etc.

We should expose range field types, with the corresponding predicates (contains a given value/range, intersects a given range, etc.).
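
As a rough sketch of what these predicates would mean for a one-dimensional range, assuming inclusive bounds: the ClosedIntRange class below is hypothetical, written only for illustration, and is neither the existing Range<T> type nor a proposed API.

```java
// Hypothetical value type for illustration only; bounds are assumed inclusive.
final class ClosedIntRange {
    final int min;
    final int max;

    ClosedIntRange(int min, int max) {
        if (min > max) {
            throw new IllegalArgumentException("min must be <= max");
        }
        this.min = min;
        this.max = max;
    }

    // "contains a given value": the value lies between both bounds.
    boolean containsValue(int value) {
        return min <= value && value <= max;
    }

    // "contains a given range": the other range lies entirely inside this one.
    boolean containsRange(ClosedIntRange other) {
        return min <= other.min && other.max <= max;
    }

    // "intersects a given range": the two ranges share at least one value.
    boolean intersects(ClosedIntRange other) {
        return min <= other.max && other.min <= max;
    }

    // "within a given range": this range lies entirely inside the other one.
    boolean isWithin(ClosedIntRange other) {
        return other.containsRange(this);
    }
}
```

For an indexed range [10, 20], a "contains value" predicate would match 15 but not 21, and an "intersects" predicate would match [15, 25] but not [21, 30].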

On the mapper side, we could use the Range<T> util type we already have in Hibernate Search.

I think we only need to support one dimension, even though Lucene supports up to 4 dimensions.

Elasticsearch also exposes this feature: https://www.elastic.co/guide/en/elasticsearch/reference/current/range.html

Created March 23, 2021 at 10:53 AM
Updated September 9, 2024 at 12:59 PM