Serialization doesn't handle some built-in Lucene attributes

Description

First, some interfaces are not supported (see the sketch after this list):

  • org.apache.lucene.search.BoostAttribute

  • org.apache.lucene.search.FuzzyTermsEnum.LevenshteinAutomataAttribute

  • org.apache.lucene.search.MaxNonCompetitiveBoostAttribute

  • org.apache.lucene.analysis.NumericTokenStream.NumericTermAttribute

  • org.apache.lucene.analysis.tokenattributes.PositionLengthAttribute

  • org.apache.lucene.analysis.tokenattributes.TermToBytesRefAttribute

  • org.apache.lucene.analysis.tokenattributes.BytesTermAttribute
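
A minimal sketch, assuming the serializer keeps its current style of one branch per known interface, of the getters such branches would have to read for a few of the interfaces listed above. The class MissingAttributeBranches and the returned map are hypothetical stand-ins for the real message-building code; the Lucene types and getters are taken from the Lucene API.

    import java.util.LinkedHashMap;
    import java.util.Map;

    import org.apache.lucene.analysis.tokenattributes.BytesTermAttribute;
    import org.apache.lucene.analysis.tokenattributes.PositionLengthAttribute;
    import org.apache.lucene.search.BoostAttribute;
    import org.apache.lucene.util.Attribute;
    import org.apache.lucene.util.BytesRef;

    public class MissingAttributeBranches {

        // Captures the raw values that serializer branches for some of the missing
        // interfaces would need; writing them into the actual message is left out.
        public static Map<String, Object> capture(Attribute attribute) {
            Map<String, Object> values = new LinkedHashMap<>();
            if (attribute instanceof BoostAttribute) {
                values.put("boost", ((BoostAttribute) attribute).getBoost());
            }
            if (attribute instanceof PositionLengthAttribute) {
                values.put("positionLength", ((PositionLengthAttribute) attribute).getPositionLength());
            }
            if (attribute instanceof BytesTermAttribute) {
                // BytesRef instances are mutable and reused by the token stream: copy.
                BytesRef term = ((BytesTermAttribute) attribute).getBytesRef();
                values.put("bytesTerm", BytesRef.deepCopyOf(term));
            }
            return values;
        }
    }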

Second, attributes implementing multiple interfaces are not supported, because the serializer assumes only one interface is implemented. This is the case for org.apache.lucene.analysis.tokenattributes.PackedTokenAttributeImpl, for instance. We should make it possible for one attribute in Java to generate multiple attributes in the message.
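
A minimal sketch of how the serializer could discover every attribute interface implemented by a single AttributeImpl, so that one Java attribute can produce one message attribute per interface. The class AttributeInterfaceCollector is hypothetical; the Lucene types are real.

    import java.util.LinkedHashSet;
    import java.util.Set;

    import org.apache.lucene.util.Attribute;
    import org.apache.lucene.util.AttributeImpl;

    public class AttributeInterfaceCollector {

        // Collects every org.apache.lucene.util.Attribute sub-interface implemented by
        // the given implementation, walking superclasses so that interfaces inherited
        // from e.g. CharTermAttributeImpl are included as well. The serializer could
        // then emit one message attribute per collected interface instead of one per impl.
        public static Set<Class<? extends Attribute>> interfacesOf(AttributeImpl impl) {
            Set<Class<? extends Attribute>> result = new LinkedHashSet<>();
            for (Class<?> clazz = impl.getClass(); clazz != null; clazz = clazz.getSuperclass()) {
                collect(clazz, result);
            }
            return result;
        }

        private static void collect(Class<?> clazz, Set<Class<? extends Attribute>> result) {
            for (Class<?> itf : clazz.getInterfaces()) {
                if (Attribute.class.isAssignableFrom(itf) && itf != Attribute.class) {
                    result.add(itf.asSubclass(Attribute.class));
                }
                // Attribute interfaces may themselves extend other attribute interfaces.
                collect(itf, result);
            }
        }
    }

On a PackedTokenAttributeImpl this should report CharTermAttribute, TermToBytesRefAttribute, TypeAttribute, PositionIncrementAttribute, PositionLengthAttribute and OffsetAttribute. Alternatively, Lucene's own AttributeImpl.reflectWith(AttributeReflector) callback reports each property keyed by its attribute interface and could serve the same purpose.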

And finally, polymorphism is ignored. org.apache.lucene.collation.tokenattributes.CollatedTermAttributeImpl, for instance, extends org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl without implementing additional interfaces; it simply overrides a method. Solving the first two issues should solve this one too, though, provided every Attribute method is simply a getter.
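
To illustrate that last point, a sketch (hypothetical helper, not existing code) of reading the term through the interface only, which is all the serializer needs once the first two issues are fixed: the concrete class, whether CharTermAttributeImpl or a subclass such as CollatedTermAttributeImpl, only influences the values the getters return, not the shape of the message.

    import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

    public class CharTermReader {

        // Reads the term text through the CharTermAttribute interface only; which
        // concrete AttributeImpl backs it is irrelevant to the serializer.
        public static String termText(CharTermAttribute attribute) {
            return new String(attribute.buffer(), 0, attribute.length());
        }
    }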

Environment

None

Activity

Sanne Grinovero
October 6, 2016, 12:55 PM

Because of how the backends are currently designed, we never need to serialize Lucene attributes.

When a Lucene Document is serialized, it has not been processed by the analysis chain yet.

Assignee

Unassigned

Reporter

Yoann Rodière

Labels

None

Suitable for new contributors

None

Pull Request

None

Feedback Requested

None

Priority

Major