Predicates with reduced scope (narrowed target type) in predicate DSL

Description

Use case: I create a query on indexes A and B, but I want some predicates to match only for index A. This can be for one of two reasons:

  1. The business requirements state that this predicate is optional for index B. This can happen for example if a targeted field does not exist in B, and thus the predicate makes no sense in B. See for example https://stackoverflow.com/questions/53414076/hibernate-search-query-for-multiple-entities/53414429#53414429

  2. The predicates would not work on type B, because of field type conflicts: for example an "sku" field which is numeric in A, but text in B.

Solution 1 : a type() predicate

*This solution only addresses the first use case.* It does not allow targeting fields with conflicting types.

The type predicate would allow filtering by mapped type. It would only accept an indexed type, and would be translated into a predicate filtering by index name.

API-wise we would have to allow any Object to be passed to represent the type, unless we define the type() predicate in a mapper-specific extension, in which case we could require a Class<?> for the POJO mapper.

Syntax (maybe):

Implementation: we would have to add a dedicated, internal field to differentiate between indexes.
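
To make the idea concrete, here is a minimal, self-contained sketch of how such a predicate could behave. It is not the Hibernate Search API: the class `TypePredicateSketch`, the stand-in types `A` and `B`, the field name `__index_name` and the string form of the resulting filter are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class TypePredicateSketch {

    // Stand-ins for two mapped entity types (hypothetical).
    public static class A {}
    public static class B {}

    // Hypothetical name of the dedicated, internal field holding the index name.
    static final String INDEX_NAME_FIELD = "__index_name";

    // Indexed types targeted by the query, with the name of their index.
    private final Map<Class<?>, String> indexNamesByType = new LinkedHashMap<>();

    public TypePredicateSketch with(Class<?> type, String indexName) {
        indexNamesByType.put(type, indexName);
        return this;
    }

    // type() only accepts an indexed type of the scope, and is translated
    // into a filter on the index name (rendered here as a plain string).
    public String type(Class<?> type) {
        String indexName = indexNamesByType.get(type);
        if (indexName == null) {
            throw new IllegalArgumentException(
                    type.getName() + " is not an indexed type of this scope");
        }
        return INDEX_NAME_FIELD + ":" + indexName;
    }
}
```

With a scope built as `new TypePredicateSketch().with(A.class, "indexA").with(B.class, "indexB")`, calling `type(A.class)` yields a filter on `indexA` only, while passing a type that is not indexed in the scope is rejected.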

For Elasticsearch, a type query will not work, because 1. we use the same type name for all indexes and 2. the concept of type is going to be removed in future Elasticsearch versions (ES8 or ES9 IIRC).

Still for Elasticsearch, we might take advantage of the internal _entity_type field, but that field may not always be present (see ).

Solution 2 : a re-scoped composite predicate

The predicate would allow reducing the scope, so that:

  1. The predicate only matches on documents of the selected types.

  2. AND the predicate gives access to a predicate factory that works on the selected types only, thus being more permissive (see use case #2).
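
The two properties above can be modeled with a small, self-contained sketch. It is not the Hibernate Search API: the `Scope` class, its `match` and `narrowedTo` methods, the index names "A" and "B" and the string rendering of predicates are hypothetical, chosen only to illustrate the "sku" conflict from use case #2.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;

public class RescopedPredicateSketch {

    // Hypothetical model of a scope: index name -> (field name -> field type).
    public static final class Scope {
        private final Map<String, Map<String, Class<?>>> fieldTypesByIndex;

        public Scope(Map<String, Map<String, Class<?>>> fieldTypesByIndex) {
            this.fieldTypesByIndex = fieldTypesByIndex;
        }

        // Predicate factory: rejects a field that is missing from the scope
        // or whose type differs between indexes of THIS scope.
        public String match(String field, Object value) {
            Set<Class<?>> types = new HashSet<>();
            for (Map<String, Class<?>> fields : fieldTypesByIndex.values()) {
                Class<?> type = fields.get(field);
                if (type != null) {
                    types.add(type);
                }
            }
            if (types.size() != 1) {
                throw new IllegalStateException("Field '" + field
                        + "' is missing or has conflicting types in " + fieldTypesByIndex.keySet());
            }
            return field + ":" + value;
        }

        // Re-scoped composite predicate: (1) only matches documents of the selected
        // indexes, (2) builds its content with a factory limited to those indexes.
        public String narrowedTo(List<String> indexNames, Function<Scope, String> content) {
            Map<String, Map<String, Class<?>>> narrowed = new LinkedHashMap<>();
            for (String name : indexNames) {
                if (!fieldTypesByIndex.containsKey(name)) {
                    throw new IllegalArgumentException("Index not in scope: " + name);
                }
                narrowed.put(name, fieldTypesByIndex.get(name));
            }
            String inner = content.apply(new Scope(narrowed));
            return "(index in " + narrowed.keySet() + " AND " + inner + ")";
        }
    }

    // Example scope: "sku" is numeric in index A but text in index B.
    public static Scope exampleScope() {
        Map<String, Class<?>> a = new HashMap<>();
        a.put("sku", Integer.class);
        Map<String, Class<?>> b = new HashMap<>();
        b.put("sku", String.class);
        Map<String, Map<String, Class<?>>> model = new LinkedHashMap<>();
        model.put("A", a);
        model.put("B", b);
        return new Scope(model);
    }
}
```

In this model, `exampleScope().match("sku", 42)` fails because of the type conflict between A and B, whereas `exampleScope().narrowedTo(Arrays.asList("A"), s -> s.match("sku", 42))` succeeds, since the inner factory only sees index A.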

Syntax:

Implementation:
For low-level predicates, same as solution 1. However, if we want to support the second use case (field type conflicts between indexes), we need Elasticsearch not to throw exceptions when we send data that cannot be interpreted for the non-relevant indexes. For example, if the indexes in the sub-scope define field "myField" with a string type, and indexes outside of this sub-scope define this same field with a numeric type, Elasticsearch must not raise an error when we ask for all documents where "myField" matches "someStringThatDefinitelyIsNotANumber".

Created November 21, 2018 at 3:13 PM
Updated May 7, 2024 at 11:15 AM