Allow to add predicates with reduced scope in predicate DSL

Description

Use case: I create a query on indexes A and B, but I want some predicates to match only for index A. This can be for one of two reasons:

  1. The business requirements state that this predicate is optional for index B. This can happen for example if a targeted field does not exist in B, and thus the predicate makes no sense in B. See for example https://stackoverflow.com/questions/53414076/hibernate-search-query-for-multiple-entities/53414429#53414429

  2. The predicates would not work on index B because of field type conflicts: for example, an "sku" field that is numeric in A but text in B.

Solution 1: a type() predicate

*This solution only addresses the first use case.* It does not allow targeting fields with conflicting types.

The type predicate would allow to filter by mapped type. It would only accept an indexed type, and would be translated into a predicate filtering by index name.

API-wise we would have to allow any Object to be passed to represent the type, unless we define the type() predicate in a mapper-specific extension, in which case we could require a Class<?> for the POJO mapper.

Implementation: we would have to add a dedicated, internal field to differentiate between indexes. The Elasticsearch type query will not work, because 1. we use the same type name for all indexes and 2. the concept of type is going to be removed in future Elasticsearch versions (ES8 or ES9 IIRC).
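To make the translation concrete, here is a minimal sketch of how a type() predicate could be turned into a filter on a dedicated internal field. Everything in it is an assumption for illustration: the class name, the register() method and the "__index_name" field name are hypothetical and are not part of any existing API. It also assumes index names need no JSON escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from an existing API.
final class TypePredicateSketch {

    // Hypothetical name for the dedicated, internal field holding the index name.
    static final String INDEX_NAME_FIELD = "__index_name";

    // Mapped type -> index name, as the mapper would know it.
    private final Map<Class<?>, String> indexNameByType = new LinkedHashMap<>();

    TypePredicateSketch register(Class<?> type, String indexName) {
        indexNameByType.put(type, indexName);
        return this;
    }

    // Translates type( someClass ) into a JSON term query on the internal field.
    // Only indexed types are accepted, as described above.
    String type(Class<?> type) {
        String indexName = indexNameByType.get(type);
        if (indexName == null) {
            throw new IllegalArgumentException(type.getName() + " is not an indexed type");
        }
        return "{\"term\":{\"" + INDEX_NAME_FIELD + "\":\"" + indexName + "\"}}";
    }
}
```

Requiring a Class<?> here corresponds to the mapper-specific variant mentioned above; a mapper-agnostic variant would take an Object instead.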

Solution 2: a scope() composite predicate

The scope() predicate would allow to reduce the scope of predicates, so that:

  1. The predicates are only taken into account for a specific subset of indexes. So results that do not match these predicates may be returned for other indexes.

  2. Any predicate built in the "sub-scope" is only required to be valid for indexes in that sub-scope.
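The first point can be illustrated with a small in-memory model. This is only a sketch of the intended semantics, not of any real implementation: the Doc class and the scope() helper are hypothetical names introduced here.

```java
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical in-memory model of the scope() semantics.
final class ScopeSemanticsSketch {

    // Stand-in for an indexed document: the index it lives in, plus its field values.
    static final class Doc {
        final String index;
        final Map<String, Object> fields;

        Doc(String index, Map<String, Object> fields) {
            this.index = index;
            this.fields = fields;
        }
    }

    // The inner predicate constrains only documents from the sub-scope;
    // documents from any other index pass unconditionally.
    static Predicate<Doc> scope(Set<String> subScopeIndexes, Predicate<Doc> inner) {
        return doc -> !subScopeIndexes.contains(doc.index) || inner.test(doc);
    }
}
```

With scope( Set.of( "A" ), ... ), a document from index B is accepted whatever its "sku" field contains, while documents from index A still have to satisfy the inner predicate.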

Syntax:

    // f2 performs check in the sub-scope only, and is thus more permissive (see use case #2)
    .scope().on( "index1", "index2" ).predicate( f2 -> f2.match().onField( ... ).matching( ... ) )
    // Index names would work, but allowing to pass mapper-specific types would probably be better:
    .scope().on( MyType.class, MyOtherType.class ).predicate( f2 -> f2.match().onField( ... ).matching( ... ) )
    // Or maybe:
    SearchScope scope = <retrieve a scope from the mapper APIs>
    .scope().on( scope ).predicate( f2 -> f2.match().onField( ... ).matching( ... ) )

Implementation:
We will need some code to handle
For low-level predicates, same as solution 1. However, if we want to support the second use case (predicates that may not make sense for some indexes), we need Elasticsearch not to throw exceptions when we send data that cannot be interpreted for the non-relevant indexes. For example, if the indexes in the sub-scope define field "myField" with a string type, and indexes outside of this sub-scope define this same field with a numeric type, Elasticsearch must not raise an error when we ask for all documents where "myField" matches "someStringThatDefinitelyIsNotANumber".
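One possible shape for the generated Elasticsearch query is sketched below. It is an assumption, not a decided design: it combines a "not in the sub-scope" clause with the inner match query, and sets the match query's "lenient" option, which tells Elasticsearch to ignore format-based errors such as a text value sent to a numeric field. The class name, method name and "__index_name" field are hypothetical, and the sketch assumes its inputs need no JSON escaping.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch of the JSON a scoped match predicate might produce.
final class ScopedMatchJsonSketch {

    // Hypothetical name for the dedicated, internal field holding the index name.
    static final String INDEX_NAME_FIELD = "__index_name";

    static String scopedMatch(List<String> subScopeIndexes, String field, String value) {
        String indexes = subScopeIndexes.stream()
                .map(name -> "\"" + name + "\"")
                .collect(Collectors.joining(","));
        // Clause 1: the document lives outside the sub-scope, so it passes unconditionally.
        String outsideSubScope = "{\"bool\":{\"must_not\":{\"terms\":{\""
                + INDEX_NAME_FIELD + "\":[" + indexes + "]}}}}";
        // Clause 2: the inner predicate; "lenient" avoids errors on indexes
        // where the field has an incompatible type.
        String lenientMatch = "{\"match\":{\"" + field
                + "\":{\"query\":\"" + value + "\",\"lenient\":true}}}";
        // A document is returned if either clause matches.
        return "{\"bool\":{\"should\":[" + outsideSubScope + "," + lenientMatch
                + "],\"minimum_should_match\":1}}";
    }
}
```

Whether "lenient" covers every predicate type that would need this behavior is an open question; the sketch only covers the match case described above.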

Reporter

Yoann Rodière

Priority

Major