Serialization doesn't handle some builtin Lucene attributes

Description

First, some interfaces are not supported (see the dispatch sketch after this list):

  • org.apache.lucene.search.BoostAttribute

  • org.apache.lucene.search.FuzzyTermsEnum.LevenshteinAutomataAttribute

  • org.apache.lucene.search.MaxNonCompetitiveBoostAttribute

  • org.apache.lucene.analysis.NumericTokenStream.NumericTermAttribute

  • org.apache.lucene.analysis.tokenattributes.PositionLengthAttribute

  • org.apache.lucene.analysis.tokenattributes.TermToBytesRefAttribute

  • org.apache.lucene.analysis.tokenattributes.BytesTermAttribute
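
To make the gap concrete, here is a minimal sketch of the kind of per-interface dispatch involved — not the actual serializer code, and the MessageWriter sink with its write* methods is purely hypothetical. Any attribute interface outside the hard-coded set, such as the ones listed above, falls through to the unsupported branch:

```java
import java.util.Iterator;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
import org.apache.lucene.util.Attribute;

public final class AttributeSerializationSketch {

    // Hypothetical sink standing in for the serialized message format.
    interface MessageWriter {
        void writeCharTerm(String term);
        void writeOffsets(int startOffset, int endOffset);
    }

    static void serializeAttributes(TokenStream stream, MessageWriter out) {
        Iterator<Class<? extends Attribute>> it = stream.getAttributeClassesIterator();
        while (it.hasNext()) {
            Class<? extends Attribute> attClass = it.next();
            if (attClass == CharTermAttribute.class) {
                out.writeCharTerm(stream.getAttribute(CharTermAttribute.class).toString());
            }
            else if (attClass == OffsetAttribute.class) {
                OffsetAttribute offsets = stream.getAttribute(OffsetAttribute.class);
                out.writeOffsets(offsets.startOffset(), offsets.endOffset());
            }
            else {
                // BoostAttribute, PositionLengthAttribute, TermToBytesRefAttribute,
                // BytesTermAttribute, ... all end up here and are not serialized.
                throw new UnsupportedOperationException(
                        "Unsupported attribute interface: " + attClass.getName());
            }
        }
    }
}
```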

Second, attributes implementing multiple interfaces are not supported, because the serializer assumes each attribute implementation exposes exactly one interface. This is the case, for instance, of org.apache.lucene.analysis.tokenattributes.PackedTokenAttributeImpl. We should make it possible for one Java attribute to generate multiple attributes in the message.
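
For illustration, this small snippet (assuming lucene-core on the classpath) shows that PackedTokenAttributeImpl is one implementation class exposing several attribute interfaces at once, which is why a single Java attribute needs to be able to produce several attributes in the message:

```java
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
import org.apache.lucene.analysis.tokenattributes.PackedTokenAttributeImpl;
import org.apache.lucene.analysis.tokenattributes.PositionIncrementAttribute;
import org.apache.lucene.analysis.tokenattributes.TypeAttribute;
import org.apache.lucene.util.AttributeImpl;

public final class PackedTokenAttributeExample {
    public static void main(String[] args) {
        AttributeImpl impl = new PackedTokenAttributeImpl();

        // A single implementation class exposes several Attribute interfaces,
        // so a serializer keyed on "one interface per impl" loses data:
        System.out.println(impl instanceof CharTermAttribute);          // true
        System.out.println(impl instanceof OffsetAttribute);            // true
        System.out.println(impl instanceof PositionIncrementAttribute); // true
        System.out.println(impl instanceof TypeAttribute);              // true
    }
}
```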

And finally, polymorphism is ignored. org.apache.lucene.collation.tokenattributes.CollatedTermAttributeImpl, for instance, extends org.apache.lucene.analysis.tokenattributes.CharTermAttributeImpl without implementing additional interfaces; it simply overrides a method. Solving the first two issues should solve this one too, though, provided every Attribute method is simply a getter.
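
A sketch of one possible way to handle that, assuming we dispatch on the Attribute interfaces via instanceof rather than on the concrete implementation class (CollatedTermAttributeImpl lives in the lucene-analyzers-common module):

```java
import java.text.Collator;
import java.util.Locale;

import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.collation.tokenattributes.CollatedTermAttributeImpl;
import org.apache.lucene.util.AttributeImpl;

public final class PolymorphicAttributeExample {

    // Assumed approach: check the Attribute interface, not the concrete class,
    // so subclasses that merely override a method are serialized exactly like
    // their parent implementation.
    static String extractTerm(AttributeImpl impl) {
        if (impl instanceof CharTermAttribute) {
            // We only read through the interface getter; the subclass override
            // does not matter here.
            return ((CharTermAttribute) impl).toString();
        }
        return null;
    }

    public static void main(String[] args) {
        AttributeImpl collated = new CollatedTermAttributeImpl(Collator.getInstance(Locale.ROOT));
        System.out.println(extractTerm(collated));
    }
}
```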

Activity


Sanne Grinovero October 6, 2016 at 12:55 PM

Because of how the backends are currently designed, we never need to serialize Lucene attributes.

When a Lucene Document is serialized, it has not been processed by the Analysis chain yet.
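
As a small illustration of that point (a sketch, not backend code): a freshly built Lucene Document only carries raw field values; token attributes only come into existence once the fields are run through an Analyzer at indexing time.

```java
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field.Store;
import org.apache.lucene.document.TextField;

public final class DocumentBeforeAnalysisExample {
    public static void main(String[] args) {
        Document doc = new Document();
        doc.add(new TextField("body", "some raw, not-yet-analyzed text", Store.YES));

        // At this point the document only carries the raw field value; there is
        // no TokenStream and therefore no attributes to serialize.
        System.out.println(doc.getField("body").stringValue());

        // Attributes only appear when the field is run through the analysis
        // chain, which happens later, at indexing time.
    }
}
```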

Won't Fix

Details

Created October 6, 2016 at 10:47 AM
Updated October 6, 2016 at 12:56 PM
Resolved October 6, 2016 at 12:55 PM
