Simplify and improve ordering and parallelism of Elasticsearch indexing

Description

One simplification in particular would be to always execute indexing works (Index/Delete) in bulk requests, even when there is only one work to execute. That shouldn't affect performance much, and it would definitely make the code simpler.
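As a rough illustration of the "always bulk" idea, the sketch below builds an Elasticsearch `_bulk` request body (newline-delimited JSON) from a list of works, using the same code path whether the list holds one work or many. `BulkableWork` and `BulkBodyBuilder` are hypothetical names made up for this example, not Hibernate Search classes, and the sketch assumes IDs and index names need no JSON escaping.

```java
import java.util.List;

// Hypothetical work type for illustration only; not the Hibernate Search API.
final class BulkableWork {
    final String action;     // "index" or "delete"
    final String index;      // target index name
    final String id;         // document ID
    final String sourceJson; // document source; null for deletes

    BulkableWork(String action, String index, String id, String sourceJson) {
        this.action = action;
        this.index = index;
        this.id = id;
        this.sourceJson = sourceJson;
    }
}

// Hypothetical helper: one code path for one work or many works.
final class BulkBodyBuilder {
    // Assumption: values contain no characters that require JSON escaping.
    static String toBulkBody(List<BulkableWork> works) {
        StringBuilder body = new StringBuilder();
        for (BulkableWork work : works) {
            // Action/metadata line.
            body.append("{\"").append(work.action).append("\":{\"_index\":\"")
                    .append(work.index).append("\",\"_id\":\"")
                    .append(work.id).append("\"}}\n");
            // Source line, present for index works only.
            if (work.sourceJson != null) {
                body.append(work.sourceJson).append('\n');
            }
        }
        return body.toString();
    }
}
```

With this shape there is no separate "single work" execution path to maintain: a lone work is simply a bulk of size one.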

When that's done, many of the improvements implemented in the Lucene backend as part of could be applied to the Elasticsearch backend as well:

Queue works instead of worksets (simplifies configuration, e.g. will be easier to implement)
Use a single thread pool for the whole backend (share resources across indexes)
Do not batch works that don't benefit from batching, i.e. non-bulkable works such as purges, search queries, ... In the case of Elasticsearch, that would mean sending them to the REST client as soon as they are submitted to the orchestrator.
Maybe use multiple queues per orchestrator, in order to execute multiple works for the same index in parallel
Maybe move to a common, global orchestrator for indexing
More?
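
The routing described in the list above could look roughly like the following sketch. It is not the actual orchestrator: `WorkRouter` and its members are hypothetical names, works are reduced to plain strings, and the "REST client" is a plain list. Picking a queue by hashing the document ID is an assumption of this sketch, chosen so that works on the same document stay in submission order while works on different documents can be processed in parallel.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names and structure are assumptions, not Hibernate Search code.
final class WorkRouter {
    // One queue per parallel "lane"; each queue would be drained into bulks by its own task.
    final List<List<String>> queues = new ArrayList<>();
    // Stand-in for works handed directly to the REST client.
    final List<String> sentImmediately = new ArrayList<>();

    WorkRouter(int queueCount) {
        for (int i = 0; i < queueCount; i++) {
            queues.add(new ArrayList<>());
        }
    }

    // Assumption: the same document ID always maps to the same queue,
    // so the relative order of works on one document is preserved.
    int queueIndexFor(String documentId) {
        return Math.floorMod(documentId.hashCode(), queues.size());
    }

    void submit(String documentId, String work, boolean bulkable) {
        if (!bulkable) {
            // Non-bulkable works (purge, search, ...) gain nothing from batching: send right away.
            sentImmediately.add(work);
            return;
        }
        queues.get(queueIndexFor(documentId)).add(work);
    }
}
```

A real implementation would replace the lists with concurrent queues drained by tasks running on the shared, backend-wide thread pool.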

Linked issues

follows up on

HSEARCH-3822

Restore support for concurrent Lucene work execution - at least during mass indexing

Fixed

Details
Assignee: Yoann Rodière
Reporter: Yoann Rodière
Components:
Sprint: None
Fix versions: 6.0.0.Beta6
Priority: Major

Created March 26, 2020 at 8:53 AM

Updated March 31, 2020 at 11:52 AM

Resolved March 30, 2020 at 11:15 AM
