Lucene terms aggregations (discrete facets) may return wrong results for any sort other than the default one

Description

Internally, we rely on Lucene's built-in faceting capabilities, i.e. the Facets class. However, this class only allows us to retrieve the top N facets by document count.
To retrieve facets in any other order, we would have to retrieve the top Integer.MAX_VALUE facets by document count and sort them ourselves, which is clearly unacceptable.
So currently, we just retrieve the top N facets by document count, then sort those N facets by whatever sort the user requested. This makes little sense, since facets outside the top N by count are never considered, and it is not consistent with Elasticsearch's behavior.
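
To illustrate the problem, here is a minimal sketch in plain Java. It does not use Lucene or our actual code: the class name, the method names and the sample counts are all made up for the example. It only contrasts "take the top N by count, then sort" with "sort, then take the top N" for an ascending term sort.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical demo class; not part of Hibernate Search or Lucene.
public class FacetSortDemo {

    // Mimics the behavior described above: keep the top N terms by document count,
    // then re-sort only those N terms by the requested sort (here: term, ascending).
    static List<String> topNByCountThenSortByTerm(Map<String, Integer> counts, int n) {
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder()))
                .limit(n)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    // What a user would expect: sort all terms by the requested sort, then keep the first N.
    static List<String> sortByTermThenTopN(Map<String, Integer> counts, int n) {
        return counts.keySet().stream()
                .sorted()
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up facet counts: term -> document count.
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("apple", 1);
        counts.put("banana", 5);
        counts.put("cherry", 3);
        counts.put("date", 4);

        System.out.println(topNByCountThenSortByTerm(counts, 2)); // [banana, date]
        System.out.println(sortByTermThenTopN(counts, 2));        // [apple, banana]
    }
}
```

With these sample counts, "apple" is missing from the first result because it is not among the top 2 terms by document count, even though it comes first in term order.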

Maybe we should implement our own collectors instead of relying on lucene-facets. That's what Elasticsearch did, and it would allow us to solve other problems ( in particular).

Environment

None

Assignee

Unassigned

Reporter

Yoann Rodière

Labels

None

Suitable for new contributors

None

Pull Request

None

Feedback Requested

None

Components

Fix versions

Affects versions

Priority

Major