Simplify and improve ordering and parallelism of Elasticsearch indexing

Description

One simplification we could apply in particular is to only ever execute indexing works (Index/Delete) in bulks, even when there is only one. That shouldn't affect performance too much, and it would definitely make the code simpler.
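
As a rough illustration of that "bulk-only" code path, here is a minimal sketch that always goes through the _bulk endpoint, even for a single index or delete action. It uses the Elasticsearch low-level REST client; the BulkableWork interface and its bulkActionLines() method are hypothetical placeholders for whatever work abstraction the backend ends up with, not an existing API.

{code:java}
import java.io.IOException;
import java.util.List;

import org.apache.http.HttpHost;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

public class BulkOnlyExecutor {

    // Hypothetical work abstraction: each work renders its own _bulk action line(s),
    // newline-terminated as required by the _bulk endpoint.
    public interface BulkableWork {
        String bulkActionLines();
    }

    private final RestClient restClient;

    public BulkOnlyExecutor(RestClient restClient) {
        this.restClient = restClient;
    }

    public static BulkOnlyExecutor connectTo(String host, int port) {
        return new BulkOnlyExecutor(
                RestClient.builder(new HttpHost(host, port, "http")).build());
    }

    // Always use the _bulk endpoint, even for a single work:
    // one code path, at the cost of a slightly larger payload for lone works.
    public Response execute(List<? extends BulkableWork> works) throws IOException {
        StringBuilder body = new StringBuilder();
        for (BulkableWork work : works) {
            body.append(work.bulkActionLines());
        }
        Request request = new Request("POST", "/_bulk");
        request.setJsonEntity(body.toString());
        return restClient.performRequest(request);
    }
}
{code}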

Once that is done, many of the improvements implemented in the Lucene backend could be applied to the Elasticsearch backend as well:

  • Queue individual works instead of worksets (this simplifies configuration and should make further improvements easier to implement)

  • Use a single thread pool for the whole backend (share resources across indexes)

  • Do not batch works that don't benefit from batching, i.e. non-bulkable works such as purge, search queries, etc. For Elasticsearch, that would mean submitting them to the REST client as soon as they are submitted to the orchestrator (a rough sketch covering this and the two previous points follows the list)

  • Maybe, use multiple queues per orchestrator in order to execute multiple works for the same index in parallel (see the keyed-queue sketch after the list)

  • Maybe, move to a common, global orchestrator for indexing

  • More?
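
Sketched below is one possible shape for the first three points combined: a single queue of individual works and a single worker thread shared by the whole backend, with non-bulkable works handed to the REST client immediately instead of being queued. All names (ElasticsearchWork, BackendOrchestrator, MAX_BULK_SIZE) are illustrative assumptions, not existing Hibernate Search types, and error handling and futures are omitted.

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class BackendOrchestrator {

    // Hypothetical work abstraction shared by all indexes of the backend.
    public interface ElasticsearchWork {
        boolean isBulkable(); // index/delete: true; purge, flush, ...: false
        void execute();       // performs the REST call (or contributes to a bulk)
    }

    private static final int MAX_BULK_SIZE = 100;

    // One queue and one worker thread for the whole backend, shared across indexes.
    private final BlockingQueue<ElasticsearchWork> queue = new LinkedBlockingQueue<>();
    private final ExecutorService executor = Executors.newSingleThreadExecutor();

    public BackendOrchestrator() {
        executor.submit(this::processLoop);
    }

    public void submit(ElasticsearchWork work) {
        if (!work.isBulkable()) {
            // Non-bulkable works gain nothing from batching:
            // send them to the REST client as soon as they are submitted.
            work.execute();
            return;
        }
        queue.add(work);
    }

    private void processLoop() {
        List<ElasticsearchWork> bulk = new ArrayList<>();
        try {
            while (!Thread.currentThread().isInterrupted()) {
                bulk.clear();
                // Block for the first work, then drain whatever else is already queued.
                bulk.add(queue.take());
                queue.drainTo(bulk, MAX_BULK_SIZE - 1);
                for (ElasticsearchWork work : bulk) {
                    work.execute(); // in practice: one _bulk request for the whole batch
                }
            }
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public void shutdown() {
        executor.shutdownNow();
    }
}
{code}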

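And one possible shape for the "multiple queues per orchestrator" idea: works are routed by document id, so two works touching the same document keep their relative order, while works on different documents can run in parallel. The queue count and the routing key are arbitrary choices made for the sake of the example.

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class KeyedQueueOrchestrator {

    private static final int QUEUE_COUNT = 4;

    // One single-threaded executor per "lane": works submitted to the same lane
    // keep their relative order; different lanes execute in parallel.
    private final ExecutorService[] lanes = new ExecutorService[QUEUE_COUNT];

    public KeyedQueueOrchestrator() {
        for (int i = 0; i < QUEUE_COUNT; i++) {
            lanes[i] = Executors.newSingleThreadExecutor();
        }
    }

    // Route by document id so that two works affecting the same document
    // can never be reordered relative to each other.
    public void submit(String documentId, Runnable work) {
        int lane = Math.floorMod(documentId.hashCode(), QUEUE_COUNT);
        lanes[lane].submit(work);
    }

    public void shutdown() {
        for (ExecutorService lane : lanes) {
            lane.shutdown();
        }
    }
}
{code}
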
Environment

None

Assignee

Yoann Rodière

Reporter

Yoann Rodière

Labels

None

Suitable for new contributors

None

Feedback Requested

None

Components

Fix versions

Priority

Major