Add support for Elasticsearch 5

Description

Adding a ticket, since I'm working on it. I will update the required changes below as I find new ones.

Potential blockers:

  • Support for specifying analyzers in elasticsearch.yml has been removed (https://www.elastic.co/guide/en/elasticsearch/reference/5.x/analysis-custom-analyzer.html): we have to use the REST API to declare analyzers (see HSEARCH-2219)

  • Analyzer definitions are now index-scoped so you can't declare global analyzers and have to declare the analyzers for each index (more or less each Hibernate root entity); this is highly inconvenient. This makes solving all the more important: expecting users to declare analyzers themselves on the Elasticsearch server is now a no-no (see comments on this ticket).
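To make the two blockers above concrete, here is a rough sketch (not taken from the actual implementation) of what declaring the same analyzer once per index through the REST API could look like. The analyzer name, the filter chain and the index names are made up for illustration; only the settings.analysis.analyzer layout of the index creation body comes from the Elasticsearch documentation.

```python
import json

# Hypothetical analyzer definition; the name and the filter chain are examples only.
ANALYZERS = {
    "folding_analyzer": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": ["lowercase", "asciifolding"],
    }
}


def index_creation_body(analyzers):
    """Return a JSON body for index creation (PUT /<index>) carrying analyzer definitions."""
    return {"settings": {"analysis": {"analyzer": dict(analyzers)}}}


def creation_requests(index_names, analyzers):
    """Map each index path to its own body: definitions are index-scoped, so they are repeated."""
    return {"/" + name: index_creation_body(analyzers) for name in index_names}


if __name__ == "__main__":
    # Hypothetical index names, roughly one per root entity.
    for path, body in creation_requests(["book", "author"], ANALYZERS).items():
        print("PUT", path)
        print(json.dumps(body, indent=2))
```

The same definitions end up duplicated in every request body, which is the inconvenience described above.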

External work required:

Changes that would require dropping support for 2.0 (or introducing dialects):

  • The string datatype disappeared and has been replaced by text and keyword. What we need is probably text, except for non-analyzed fields, which must be keywords (as text fields have to be analyzed).

  • null_value is no longer supported on the text datatype: we currently use it for the indexNullAs feature

  • sorting on text fields now requires enabling fielddata loading in the mapping

  • DeleteByQuery is a core feature again, with its own API. The plugin has been removed.

  • The default scripting language is now Painless, which is very similar to Groovy (only script parameters must be prefixed with params.)

  • For projections, the "fields" keyword when querying is now "stored_fields" and using "_source" in there is disallowed. Source filtering must be used to access the _source. e.g. ?_source_include=foo

  • arcDistanceInKm has been renamed to arcDistance and now returns meters: https://www.elastic.co/guide/en/elasticsearch/reference/5.0/breaking_50_scripting.html#_geopoint_scripts
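A minimal sketch of how the mapping and projection points above might translate into request fragments for ES 5. The helper names, parameters and field names are hypothetical and do not reflect the actual Hibernate Search code; the sketch also simply refuses a null replacement on analyzed fields, which is only one possible way to handle the indexNullAs problem.

```python
def string_field_mapping(analyzed, sortable=False, index_null_as=None):
    """Build an ES 5 mapping for a field that used the 'string' datatype in ES 2."""
    if not analyzed:
        # Non-analyzed strings become keyword fields, which still accept null_value.
        mapping = {"type": "keyword"}
        if index_null_as is not None:
            mapping["null_value"] = index_null_as
        return mapping
    if index_null_as is not None:
        # Assumption of this sketch: fail fast, since text fields reject null_value.
        raise ValueError("null_value is not supported on text fields")
    mapping = {"type": "text"}
    if sortable:
        # Sorting on a text field requires fielddata to be enabled explicitly.
        mapping["fielddata"] = True
    return mapping


def projection_body(query, stored_fields, source_includes):
    """Build a search body using stored_fields plus source filtering instead of 'fields'."""
    return {
        "query": query,
        "stored_fields": list(stored_fields),
        "_source": list(source_includes),
    }


if __name__ == "__main__":
    print(string_field_mapping(analyzed=True, sortable=True))
    print(string_field_mapping(analyzed=False, index_null_as="_null_"))
    print(projection_body({"match_all": {}}, ["title"], ["foo"]))
```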

Changes that will probably also work with ES 2.x (see HSEARCH-2437):

  • "filtered" queries are no longer supported and must be replaced by "bool" queries with a "must" and a "filter"

  • the "queryString" keyword for query string queries does not work anymore; we must use "query_string" (I wonder why we didn't in the first place)

  • The syntax we used with ES 2 for search scripts ("script_fields": {"_distance": {"params": {...}, "script": "..."}}) seems inconsistent with the documentation and doesn't work in ES 5.

  • the size parameter in bucket aggregation queries (used for faceting) used to accept a 0 value, meaning "Integer.MAX_VALUE". This was a deprecated feature and is no longer possible. See https://www.elastic.co/guide/en/elasticsearch/reference/2.4/search-aggregations-bucket-terms-aggregation.html#_size

  • affects Elasticsearch 5.0 too (not only 2.4.1).
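The query-side changes above could be sketched as follows. All helper names are hypothetical, and the nested "script" object with an "inline" key is my reading of the ES 5.0 search documentation rather than something verified against both server versions. The script source is passed in by the caller because the parameter syntax differs between Groovy and Painless.

```python
def bool_from_filtered(query, filter_clause):
    """Replace a 'filtered' query with the equivalent 'bool' query."""
    return {"bool": {"must": query, "filter": filter_clause}}


def query_string(text):
    """Build a query string query using the 'query_string' keyword."""
    return {"query_string": {"query": text}}


def script_field(source, params):
    """Build one script_fields entry, with params nested inside the script object."""
    # Assumption: 'inline' is the key holding the script source in ES 5.0.
    return {"script": {"inline": source, "params": dict(params)}}


def terms_facet(field, size):
    """Build a terms aggregation with an explicit size, since 0 is no longer accepted."""
    if size <= 0:
        raise ValueError("size must be a positive integer")
    return {"terms": {"field": field, "size": size}}


if __name__ == "__main__":
    # Hypothetical field names and values throughout.
    print(bool_from_filtered(query_string("title:foo"), {"term": {"status": "published"}}))
    print(script_field("doc['location'].arcDistance(params.lat, params.lon)",
                       {"lat": 48.85, "lon": 2.35}))
    print(terms_facet("category", 100))
```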

See my branch where I'm poking around to see what needs to be done: https://github.com/yrodiere/hibernate-search/tree/HSEARCH-2434

Environment

None

Status

Assignee

Yoann Rodière

Reporter

Yoann Rodière

Components

Fix versions

Affects versions

5.6.0.Beta3

Priority

Major