Fully asynchronous automatic indexing

Description

Important: See "Approach 2: entity change events" in HSEARCH-3281 (Reintroduce support for clustered applications; closed).

As highlighted by the use case here:

Our "async" worker delegates only the write-to-index step to a background thread; it still has to gather all data synchronously in the application thread.

This is not the first time this has been requested, and we might be able to batch some operations in the background work. For example, we plan to let the MassIndexer be hinted on how to optimally load associations for a given type; the same hints could be applied here.

In effect, it could be implemented by simply feeding the queue of an always-available MassIndexer.
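To make the queue idea concrete, here is a minimal sketch in plain Java. It assumes that the application thread only enqueues a reference (entity type and identifier), and that a background thread drains the queue in batches and does all the loading and indexing. Every name in it (`BackgroundIndexingQueue`, `ReindexRequest`, `batchIndexer`) is hypothetical; none of this is existing Hibernate Search API, and error handling and retries are left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch: these types are illustrative, not Hibernate Search API.
final class BackgroundIndexingQueue implements AutoCloseable {

    /** Identifies an entity to reindex; no entity data is gathered at submit time. */
    static final class ReindexRequest {
        final String entityType;
        final Object entityId;

        ReindexRequest(String entityType, Object entityId) {
            this.entityType = entityType;
            this.entityId = entityId;
        }
    }

    private final BlockingQueue<ReindexRequest> queue = new LinkedBlockingQueue<>();
    private final Consumer<List<ReindexRequest>> batchIndexer;
    private final int maxBatchSize;
    private final Thread worker;
    private volatile boolean running = true;

    /**
     * @param batchIndexer assumed to load the referenced entities (possibly with
     *        per-type loading hints) and write them to the index
     */
    BackgroundIndexingQueue(Consumer<List<ReindexRequest>> batchIndexer, int maxBatchSize) {
        this.batchIndexer = batchIndexer;
        this.maxBatchSize = maxBatchSize;
        this.worker = new Thread(this::drainLoop, "background-indexer");
        this.worker.setDaemon(true);
        this.worker.start();
    }

    /** Called from the application thread: only enqueues, never loads data. */
    void submit(String entityType, Object entityId) {
        queue.add(new ReindexRequest(entityType, entityId));
    }

    private void drainLoop() {
        List<ReindexRequest> batch = new ArrayList<>();
        try {
            // Keep going until close() was called and the queue is empty.
            while (running || !queue.isEmpty()) {
                ReindexRequest first = queue.poll(100, TimeUnit.MILLISECONDS);
                if (first == null) {
                    continue;
                }
                batch.add(first);
                // Group whatever else is already waiting, up to the batch limit.
                queue.drainTo(batch, maxBatchSize - 1);
                batchIndexer.accept(new ArrayList<>(batch));
                batch.clear();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Stops accepting the "running" state and waits for queued requests to be indexed. */
    @Override
    public void close() {
        running = false;
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because only identifiers cross the thread boundary, the background side is free to group requests by type and load them in bulk. The queue here lives in memory only, so it does not address the crash scenario described in the note on persistence.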

Note that one important aspect of this would be persistence of entity change events: if the server crashes after the transaction is committed, we need to be aware of the entities that still need reindexing.
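The persistence requirement can be illustrated with a small sketch. It assumes an "outbox"-style design: each change event is stored together with the entity change, and is removed only after the corresponding indexing succeeded, so that anything still stored after a crash is known to need reindexing. The in-memory map below merely stands in for a durable store such as a database table; `ChangeEventOutbox`, `ChangeEvent` and the `indexer` callback are hypothetical names, not Hibernate Search API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch: these types are illustrative, not Hibernate Search API.
final class ChangeEventOutbox {

    /** A stored record stating that one entity needs reindexing. */
    static final class ChangeEvent {
        final long id;
        final String entityType;
        final String entityId;

        ChangeEvent(long id, String entityType, String entityId) {
            this.id = id;
            this.entityType = entityType;
            this.entityId = entityId;
        }
    }

    // Stand-in for a durable store. In a real system this would be written
    // in the same transaction as the entity change (assumption).
    private final Map<Long, ChangeEvent> pending = new LinkedHashMap<>();
    private long nextId = 1;

    /** Stores an event; conceptually part of the committing transaction. */
    synchronized ChangeEvent append(String entityType, String entityId) {
        ChangeEvent event = new ChangeEvent(nextId++, entityType, entityId);
        pending.put(event.id, event);
        return event;
    }

    /** Returns a snapshot of the events that have not been indexed yet. */
    synchronized List<ChangeEvent> pendingEvents() {
        return new ArrayList<>(pending.values());
    }

    /**
     * Indexes pending events in order. An event is removed only after the
     * indexer returned normally; if the indexer throws (or the process dies),
     * that event and all later ones stay pending for the next run.
     *
     * @return the number of events indexed and removed
     */
    int processPending(Consumer<ChangeEvent> indexer) {
        int processed = 0;
        for (ChangeEvent event : pendingEvents()) {
            indexer.accept(event);
            acknowledge(event);
            processed++;
        }
        return processed;
    }

    private synchronized void acknowledge(ChangeEvent event) {
        pending.remove(event.id);
    }
}
```

With this ordering, a failure between indexing and acknowledgement leads to an entity being reindexed twice rather than not at all, which is harmless as long as reindexing is idempotent.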

Activity

Yoann Rodière, May 31, 2021 at 8:29 AM

This will effectively be solved in HSEARCH-3280. Closing as duplicate (even though this ticket is the original one, https://hibernate.atlassian.net/browse/HSEARCH-3280 already has more detailed sub-tasks).

Yoann Rodière, June 4, 2019 at 6:15 AM

There is previous work here: https://github.com/hibernate/hibernate-search/pull/979

We cannot use it directly, but we should have a look before we start working on an implementation using Kafka to store the events (so that we can eventually use Debezium).

Duplicate

Details

Created September 29, 2016 at 9:45 AM
Updated May 31, 2021 at 8:30 AM
Resolved May 31, 2021 at 8:29 AM
