Define analyzers via the REST API

Description

Defining analyzers or any index.* properties in elasticsearch.yml is deprecated and will no longer work in Elasticsearch 5 (the next version after 2.3).
We should move away from this approach and instead include the analyzer definitions when creating the index.
Here is the API for it: https://www.elastic.co/guide/en/elasticsearch/reference/2.3/indices-update-settings.html#update-settings-analysis

This will work for all analyzer definitions built on the default implementations that ship with Lucene / Elasticsearch. Each tokenizer, filter, and char filter can be given a name.
One can also pass a fully qualified class name instead of the short name (to be verified).
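
For illustration, here is a minimal sketch of the kind of settings body that could be sent when creating the index. It only builds and prints the JSON; it does not talk to a cluster. The function name build_index_settings and the component names (my_analyzer, my_html_strip, my_standard, my_stop) are made up for this example, and the component types used (html_strip, standard, stop, lowercase) are stock Elasticsearch ones.

```python
import json


def build_index_settings(analyzer_name):
    """Build an index-creation body declaring one custom analyzer.

    All names starting with "my_" are hypothetical, chosen for this example.
    """
    return {
        "settings": {
            "analysis": {
                # Each component is declared under a name of our choosing...
                "char_filter": {
                    "my_html_strip": {"type": "html_strip"},
                },
                "tokenizer": {
                    "my_standard": {"type": "standard"},
                },
                "filter": {
                    "my_stop": {"type": "stop", "stopwords": "_english_"},
                },
                # ...and the analyzer refers to its components by name.
                "analyzer": {
                    analyzer_name: {
                        "type": "custom",
                        "char_filter": ["my_html_strip"],
                        "tokenizer": "my_standard",
                        # "lowercase" is a built-in filter, used without a declaration.
                        "filter": ["lowercase", "my_stop"],
                    }
                },
            }
        }
    }


body = build_index_settings("my_analyzer")
print(json.dumps(body, indent=2))
```

The printed JSON would be the request body of the index creation call (PUT on the index name). Whether a fully qualified class name is accepted in place of a "type" value is still the open point mentioned above.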

What about custom implementations of Tokenizer / Filter? The natural way in Elasticsearch is to write and deploy a plugin that contains a small implementation registering the tokenizers or filters by name, with the actual implementations in a JAR. The main gotcha is that the implementation classes must implement Elasticsearch interfaces.

How far should we go to help users deploy their custom analyzer implementations?

  • build the plugin distribution?

  • check for the presence of the named analyzers or components (with which ES API)?

  • change AnalyzerDef to adopt a string-based naming solution like Elasticsearch's?
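
On the second point, one possible approach (an assumption, not a decision) would be to read the index settings back and compare the analyzer names found there with the ones we expect. The sketch below works on an already-parsed settings response, assumed to have the shape index name / "settings" / "index" / "analysis" / "analyzer". The helper name missing_analyzers and the sample data are hypothetical. Note that this would only see analyzers declared in the index settings, not built-in ones or ones contributed by a plugin.

```python
def missing_analyzers(settings_response, index_name, required):
    """Return the required analyzer names not declared in the index settings.

    settings_response is assumed to be the parsed JSON of a "get settings"
    call, i.e. {index: {"settings": {"index": {"analysis": {...}}}}}.
    """
    analysis = (
        settings_response.get(index_name, {})
        .get("settings", {})
        .get("index", {})
        .get("analysis", {})
    )
    declared = set(analysis.get("analyzer", {}))
    return sorted(set(required) - declared)


# Hypothetical response for an index named "books".
sample_response = {
    "books": {
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        "my_analyzer": {"type": "custom", "tokenizer": "standard"}
                    }
                }
            }
        }
    }
}

print(missing_analyzers(sample_response, "books", ["my_analyzer", "other_analyzer"]))
```

With the sample data, only other_analyzer is reported as missing. An index without any analysis section reports every required name as missing.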

Activity

Yoann Rodière, November 14, 2016 at 9:43 AM

Continuing the discussion from (which is a duplicate): this issue will be addressed in 5.6.0-CR1 only if we have enough time. It's a low-priority issue for now.

Fixed

Details

Created April 12, 2016 at 6:19 PM
Updated December 23, 2016 at 11:44 AM
Resolved December 19, 2016 at 8:11 PM