Define analyzers via the REST API

Description

Defining analyzers or any other index.* properties in elasticsearch.yml is deprecated and will no longer work in Elasticsearch 5 (the next version after 2.3).
We should move away from this approach and instead provide the analyzer definitions when creating the index.
The relevant API is documented at https://www.elastic.co/guide/en/elasticsearch/reference/2.3/indices-update-settings.html#update-settings-analysis

This will work for all analyzer definitions based on the reasonable default implementations provided by Lucene / Elasticsearch. Each tokenizer, filter and char filter can be given a name.
One can also pass a fully qualified class name instead of the short name (to be verified).
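To make the idea concrete, here is a minimal sketch of what sending the analyzer definition at index creation could look like. It only assumes the standard "settings" / "analysis" layout of the Elasticsearch create-index body. The names my_analyzer, my_tokenizer, my_stop and my_html_strip, as well as both helper functions, are hypothetical and chosen for illustration; this is not how Hibernate Search does or will do it.

```python
import json
import urllib.request


def build_index_settings():
    """Return a create-index body declaring a custom analyzer by name.

    All "my_*" names are hypothetical; the component types (html_strip,
    standard, stop, lowercase) are built into Elasticsearch.
    """
    return {
        "settings": {
            "analysis": {
                "char_filter": {
                    "my_html_strip": {"type": "html_strip"},
                },
                "tokenizer": {
                    "my_tokenizer": {"type": "standard"},
                },
                "filter": {
                    "my_stop": {"type": "stop", "stopwords": ["a", "the"]},
                },
                "analyzer": {
                    # The analyzer refers to its components by name only.
                    "my_analyzer": {
                        "type": "custom",
                        "char_filter": ["my_html_strip"],
                        "tokenizer": "my_tokenizer",
                        "filter": ["lowercase", "my_stop"],
                    }
                },
            }
        }
    }


def create_index(base_url, index_name, body):
    """PUT the body to <base_url>/<index_name>; needs a reachable cluster."""
    request = urllib.request.Request(
        url=f"{base_url}/{index_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Against a local node this would be called as create_index("http://localhost:9200", "my_index", build_index_settings()), where the URL and index name are again assumptions.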

What about custom implementations of Tokenizer / Filter? The natural way in Elasticsearch is to write and deploy a plugin, which contains a small class registering the tokenizers or filters by name, along with the actual implementations in a JAR. The main gotcha is that the implementation classes must implement Elasticsearch interfaces.

How far should we go in helping users deploy their custom analyzer implementations?

  • build the plugin distro?

  • check the presence of the named analyzers or components (which ES API)?

  • change @AnalyzerDef to adopt a string-based naming solution like Elasticsearch's?
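For the second question, one possible approach (a sketch, not a settled answer) would be to read the index settings back and compare the analyzer names found there with the ones we expect. The function below assumes the response has the shape returned by GET /<index>/_settings, that is <index> → "settings" → "index" → "analysis" → "analyzer"; the function name and the sample data are hypothetical. Note that analyzers built into Elasticsearch are not listed in the index settings, so such a check only covers custom definitions.

```python
def missing_analyzers(settings_response, index_name, required_names):
    """Return the required analyzer names not defined in the index settings.

    settings_response is assumed to be the parsed JSON returned by
    GET /<index_name>/_settings.
    """
    analysis = (
        settings_response.get(index_name, {})
        .get("settings", {})
        .get("index", {})
        .get("analysis", {})
    )
    defined = set(analysis.get("analyzer", {}))
    return sorted(set(required_names) - defined)


# Hypothetical response for an index named "my_index".
sample_response = {
    "my_index": {
        "settings": {
            "index": {
                "analysis": {
                    "analyzer": {
                        "my_analyzer": {"type": "custom", "tokenizer": "standard"}
                    }
                }
            }
        }
    }
}

missing = missing_analyzers(
    sample_response, "my_index", ["my_analyzer", "other_analyzer"]
)
```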

Environment

None

Activity

Yoann Rodière
November 14, 2016, 9:43 AM

Continuing the discussion from (which is a duplicate): this issue will be addressed in 5.6.0-CR1 only if we have enough time. It's a low-priority issue for now.

Assignee

Yoann Rodière

Reporter

Emmanuel Bernard

Labels

None

Suitable for new contributors

None

Feedback Requested

None

Priority

Major