Composite projections

Description

Follow-up on HSEARCH-3222.

The idea would be to provide a type-safe solution to building objects that encompass multiple projections, for example a "Color" object that requires a projection on the "red", "green" and "blue" fields.

We would offer the ability to build such a projection:

  • Through the projection DSL: target.projection().composite()…

  • When calling .asProjections() with multiple parameters in HibernateOrmSearchQueryResultDefinitionContext, plus a BiFunction/TriFunction/etc. acting as a "hit transformer": this would make it possible to create a type-safe query, returning results with a known type even when using projections.
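As an illustration of the "hit transformer" idea, here is a minimal sketch, not the actual Hibernate Search API. Projection, Projections, TriFunction and the map-based hit are hypothetical stand-ins invented for this example; only the Color use case and the notion of a three-argument transformer come from the description above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical three-argument function: java.util.function has no TriFunction.
@FunctionalInterface
interface TriFunction<A, B, C, R> {
    R apply(A a, B b, C c);
}

// Hypothetical stand-in for a projection: extracts a typed value from a raw hit.
interface Projection<T> {
    T extract(Map<String, ?> hit);
}

final class Color {
    final int red;
    final int green;
    final int blue;

    Color(int red, int green, int blue) {
        this.red = red;
        this.green = green;
        this.blue = blue;
    }
}

final class Projections {
    private Projections() {
    }

    // Projection on a single field, cast to the expected type.
    static <T> Projection<T> field(String name, Class<T> type) {
        return hit -> type.cast(hit.get(name));
    }

    // Composite projection: runs three projections on the same hit,
    // then applies the transformer to get a single typed result.
    static <A, B, C, R> Projection<R> composite(TriFunction<A, B, C, R> transformer,
            Projection<A> p1, Projection<B> p2, Projection<C> p3) {
        return hit -> transformer.apply(p1.extract(hit), p2.extract(hit), p3.extract(hit));
    }
}

final class ColorProjectionDemo {
    public static void main(String[] args) {
        Projection<Color> colorProjection = Projections.composite(
                Color::new,
                Projections.field("red", Integer.class),
                Projections.field("green", Integer.class),
                Projections.field("blue", Integer.class));

        Map<String, Object> hit = new HashMap<>();
        hit.put("red", 255);
        hit.put("green", 128);
        hit.put("blue", 0);

        Color color = colorProjection.extract(hit);
        System.out.println(color.red + ", " + color.green + ", " + color.blue);
    }
}
```

The caller gets a Projection<Color> rather than a List<?> or Object[], which is the type safety this ticket is after.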

In the future (not in this ticket), this will be particularly useful when combined with multi-valued nested object fields, allowing to get a List<Color> from a projection for example.

Draft of the APIs: https://github.com/yrodiere/hibernate-search/tree/projections-syntax-attempt2 , https://github.com/yrodiere/hibernate-search/commit/5f9f56f587d3e44344b773bdefeb70c8fedf1be1

We decided to give up on the fluid API for composite projections, for two reasons:

  1. We decided to remove the fluid APIs for predicates and sorts, so let's be consistent

  2. Having one interface per number of parameters is complex to implement (or at least very redundant), and it also makes implementing extensions much more complex, because we'll have to implement each interface in each backend...

Environment

None

Activity

Yoann Rodière
November 12, 2018, 8:24 AM
Edited

Some thoughts I had this weekend over implementing this. I hope it helps. Let's discuss that when you start?

Hit collectors cannot work for composite projections; we need to change them. The main problem is that hit collectors assume a "flat" structure for hits, whereas composite projections will obviously allow tree-like structures that don't fit well in this model.

So, we need something else.

I think one thing that could work would be to move the responsibility of "collecting" hit elements out of the engine, and leave it to the backend (which knows what each projection is supposed to do).

The idea would be that:

  1. the aggregator just offers a "add(T)" method (T being the type of the reference, loaded object, projection, ...) instead of a "nextCollector" method.

  2. the projections do not add their result to a collector, but return it.

  3. the hit-scoped hit collector disappears in favor of a query-scoped "hit mapper". The "hit mapper" will not collect the hits, but instead it will allow to:

    • convert a reference to the type expected by the mapper (e.g. DocumentReference => PojoReference) using its Object convertReference(DocumentReference) method.

    • inform the mapper that an object will have to be loaded through its Object planLoading(DocumentReference) method, which returns a key for later retrieval of the loaded object.

    • trigger loading of objects through its LoadResult load() method that returns another SPI, which exposes a Object getLoaded(Object key) method.

This should make it easier to work with (potentially nested) composite projections.

However, the hit collector is needed for several reasons, so we'll have to remove it carefully.

Here are the three main changes I think will be necessary in order to even begin to implement composite projections. Unfortunately we'll probably have to implement them all in the same commit, as they are somewhat intertwined.

First, the HitCollector is needed because it abstracts over the number of elements per hit: some projections only return one element per hit, some others return multiple elements per hit (a list).
We can work around that by removing asProjections (plural) from SPIs and using composite projections instead wherever it was needed in the ORM/JavaBean mappers (namely in org.hibernate.search.mapper.javabean.search.dsl.query.impl.JavaBeanQueryResultDefinitionContextImpl#asProjections and in org.hibernate.search.mapper.orm.search.impl.FullTextQueryResultDefinitionContextImpl#asProjections).

Second, the HitCollector is needed because we have multiple different methods of result collection: asReference, asObject, ... and thus multiple types of collectors. We can work around that by having multiple types of HitMapper just as we had multiple types of HitCollector: ReferenceHitCollector will become ReferenceHitMapper and will only expose convertReference(), LoadingHitCollector will become LoadingHitMapper and will only expose planLoading() and load(), and ProjectionHitCollector will become ProjectionHitMapper and will extend both ReferenceHitMapper and LoadingHitMapper. I would recommend implementing the various "mappers" in the engine for now, probably instantiating them in org.hibernate.search.engine.common.impl.MappedIndexSearchTargetImpl#queryAsProjection et al.
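A rough sketch of what that hierarchy could look like. The interface and method names are the ones proposed above; the bodies, the two-field DocumentReference and the InMemoryHitMapper class are assumptions made only so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the engine's document reference (assumed shape).
final class DocumentReference {
    final String indexName;
    final String id;

    DocumentReference(String indexName, String id) {
        this.indexName = indexName;
        this.id = id;
    }
}

interface ReferenceHitMapper {
    // Converts a reference to the type expected by the mapper.
    Object convertReference(DocumentReference reference);
}

interface LoadResult {
    // Returns the object loaded for a key previously returned by planLoading().
    Object getLoaded(Object key);
}

interface LoadingHitMapper {
    // Registers an object to load and returns a key for later retrieval.
    Object planLoading(DocumentReference reference);

    // Triggers loading of every planned object.
    LoadResult load();
}

interface ProjectionHitMapper extends ReferenceHitMapper, LoadingHitMapper {
}

// Hypothetical implementation backed by a map that plays the role of the database.
final class InMemoryHitMapper implements ProjectionHitMapper {
    private final Map<String, Object> database;
    private final List<DocumentReference> planned = new ArrayList<>();

    InMemoryHitMapper(Map<String, Object> database) {
        this.database = database;
    }

    @Override
    public Object convertReference(DocumentReference reference) {
        return reference.indexName + "#" + reference.id;
    }

    @Override
    public Object planLoading(DocumentReference reference) {
        planned.add(reference);
        return planned.size() - 1; // the key is the position in the loading plan
    }

    @Override
    public LoadResult load() {
        List<Object> loaded = new ArrayList<>();
        for (DocumentReference reference : planned) {
            loaded.add(database.get(reference.id));
        }
        return key -> loaded.get((Integer) key);
    }
}

final class HitMapperDemo {
    public static void main(String[] args) {
        Map<String, Object> database = new HashMap<>();
        database.put("1", "entity-1");
        InMemoryHitMapper mapper = new InMemoryHitMapper(database);

        Object key = mapper.planLoading(new DocumentReference("books", "1"));
        LoadResult result = mapper.load();
        System.out.println(result.getLoaded(key));
    }
}
```

The mapper is query-scoped: planLoading() is called once per hit that needs an entity, load() once per query, and getLoaded() once per planned key.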

Third, the HitCollector is needed to, well, collect hits. Without it, we need to change the implementation of projections to return the hits differently. Thus we'll have to make the projections return their result from their extract method. However, it's not that simple, as generating the results of a full-text query is a multiple-step process: first we extract everything from the query, then we load whatever we need from the database, then we insert what was loaded from the database into the extracted results.
I think we'll have to make projections expose two methods:

  • Object extract(ProjectionHitMapper, <backend-specific parameters>) extracts data from the hits, plans loading and returns a temporary extracted result.

  • T transform(Object extractedData, LoadingResult loadingResult) gets passed the extracted data returned by extract, and turns the temporary result into the type of the projection. Most of the time it just casts and returns the extracted data, but it can also treat the extracted data as a loading key and use the LoadingResult to return a loaded object. It can also (and that'll be useful for composite projections) apply a function to the extracted data.

The two methods will be called in two passes: first call extract() for every single hit, then call LoadingHitMapper#load(), then call transform() for every single hit.
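The two-pass flow could be sketched as follows. This is only an illustration under simplifying assumptions: hits are plain maps, planLoading() takes a document id rather than a DocumentReference, and SearchProjection, FieldProjection, ObjectProjection, CompositeProjection, SimpleHitMapper and TwoPassExecutor are hypothetical names, not existing classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

interface LoadingResult {
    Object getLoaded(Object key);
}

// Simplified hit mapper (assumption: a document id replaces DocumentReference).
interface ProjectionHitMapper {
    Object planLoading(String documentId);
    LoadingResult load();
}

interface SearchProjection<T> {
    // Pass 1: pull raw data out of the hit, planning loading if needed.
    Object extract(ProjectionHitMapper mapper, Map<String, ?> hit);
    // Pass 2: turn the temporary result into the final projected value.
    T transform(Object extractedData, LoadingResult loadingResult);
}

final class FieldProjection<T> implements SearchProjection<T> {
    private final String field;
    private final Class<T> type;
    FieldProjection(String field, Class<T> type) { this.field = field; this.type = type; }
    public Object extract(ProjectionHitMapper mapper, Map<String, ?> hit) { return hit.get(field); }
    public T transform(Object extractedData, LoadingResult loadingResult) { return type.cast(extractedData); }
}

final class ObjectProjection implements SearchProjection<Object> {
    public Object extract(ProjectionHitMapper mapper, Map<String, ?> hit) {
        return mapper.planLoading((String) hit.get("id")); // the extracted data is a loading key
    }
    public Object transform(Object extractedData, LoadingResult loadingResult) {
        return loadingResult.getLoaded(extractedData);
    }
}

final class CompositeProjection<T> implements SearchProjection<T> {
    private final Function<List<?>, T> transformer;
    private final List<SearchProjection<?>> children;
    CompositeProjection(Function<List<?>, T> transformer, List<SearchProjection<?>> children) {
        this.transformer = transformer;
        this.children = children;
    }
    public Object extract(ProjectionHitMapper mapper, Map<String, ?> hit) {
        List<Object> extracted = new ArrayList<>();
        for (SearchProjection<?> child : children) {
            extracted.add(child.extract(mapper, hit));
        }
        return extracted; // one temporary result per child, possibly nested
    }
    public T transform(Object extractedData, LoadingResult loadingResult) {
        List<?> extracted = (List<?>) extractedData;
        List<Object> transformed = new ArrayList<>();
        for (int i = 0; i < children.size(); i++) {
            transformed.add(children.get(i).transform(extracted.get(i), loadingResult));
        }
        return transformer.apply(transformed);
    }
}

// Toy mapper: the map plays the role of the database, and the key is the id itself.
final class SimpleHitMapper implements ProjectionHitMapper {
    private final Map<String, Object> database;
    private final List<String> planned = new ArrayList<>();
    SimpleHitMapper(Map<String, Object> database) { this.database = database; }
    public Object planLoading(String documentId) { planned.add(documentId); return documentId; }
    public LoadingResult load() {
        Map<Object, Object> loaded = new HashMap<>();
        for (String id : planned) {
            loaded.put(id, database.get(id));
        }
        return loaded::get;
    }
}

final class TwoPassExecutor {
    static <T> List<T> execute(SearchProjection<T> projection, ProjectionHitMapper mapper,
            List<Map<String, ?>> hits) {
        List<Object> extracted = new ArrayList<>();
        for (Map<String, ?> hit : hits) {
            extracted.add(projection.extract(mapper, hit)); // first pass, every hit
        }
        LoadingResult loadingResult = mapper.load(); // single loading step
        List<T> results = new ArrayList<>();
        for (Object data : extracted) {
            results.add(projection.transform(data, loadingResult)); // second pass, every hit
        }
        return results;
    }
}
```

Because CompositeProjection delegates both passes to its children, a composite can contain another composite without the executor knowing about the nesting.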

Miscellaneous notes:

1. Initially implement composite projections without the transforming function; it can be added later.
2. We may want to take this opportunity to simplify the architecture by removing the asObject method from SPIs and considering mappers should simply use projections for that. But we can't really do it for asReference, because that would force mappers to provide an ObjectLoader even when using asReference and not needing any object loading.
3. Important: make sure that we can add support for more parameters (type-safe transforming function) without breaking APIs. One way to do that would be to give a different name to the DSL method that creates a composite projection from a vararg of projections, e.g. compositeList() vs. composite(), or composite() vs. compositeTransformed(), or compositeList() vs. compositeTransformed(). Another way would be to use the same name, but not allow a function in that same method.

Assignee

Guillaume Smet

Reporter

Yoann Rodière

Labels

None

Suitable for new contributors

None

Feedback Requested

None

Components

Fix versions

Priority

Major