Wrong analyzers used in IndexWriter

Description

When an IndexWriter is first opened during a transaction commit, it is assigned the analyzer connected to the first entity written.
If other entities are saved to the same index during the same transaction, the writer reuses that first analyzer (instead of the one specified for each entity).
I have a test case showing the problem ready to commit, but I need your opinion on how to solve it.

I think the problem is that we register a ScopedAnalyzer for each DocumentBuilder, but there should be one per DirectoryProvider?
In this case we should check at startup that no entities sharing an index define conflicting Analyzer rules.
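
The startup check suggested above could look roughly like the following. This is only a sketch under stated assumptions: `AnalyzerConflictCheck` and `findConflicts` are hypothetical names, not Hibernate Search API, and analyzers are represented by plain class-name strings rather than real Analyzer instances.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Hibernate Search): reports field names that
// entities sharing one index map to different analyzers.
final class AnalyzerConflictCheck {

    // Input: entity name -> (field name -> analyzer class name), for ONE shared index.
    static List<String> findConflicts(Map<String, Map<String, String>> fieldAnalyzersByEntity) {
        Map<String, String> firstAnalyzer = new LinkedHashMap<>(); // field -> analyzer first seen
        Map<String, String> firstEntity = new LinkedHashMap<>();   // field -> entity that declared it first
        List<String> conflicts = new ArrayList<>();

        for (Map.Entry<String, Map<String, String>> entity : fieldAnalyzersByEntity.entrySet()) {
            for (Map.Entry<String, String> field : entity.getValue().entrySet()) {
                String previous = firstAnalyzer.putIfAbsent(field.getKey(), field.getValue());
                if (previous == null) {
                    // First time this field name is seen in the index.
                    firstEntity.put(field.getKey(), entity.getKey());
                } else if (!previous.equals(field.getValue())) {
                    // Same field name, different analyzer: report it.
                    conflicts.add(field.getKey() + ": " + firstEntity.get(field.getKey())
                            + " uses " + previous + ", " + entity.getKey()
                            + " uses " + field.getValue());
                }
            }
        }
        return conflicts;
    }
}
```

A non-empty result at startup would mean the shared index cannot be served by a single scoped analyzer.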

Another solution would be to allow complete flexibility in analyzer definitions, but reopen the IndexWriter whenever the entity type differs
from the last one indexed.

Activity

Sanne Grinovero September 14, 2008 at 12:25 AM

Actually, there is a nice "addDocument(Document doc, Analyzer analyzer)" method;
sorry I didn't see it before.
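
To illustrate the idea of passing the analyzer with each document, here is a minimal sketch. It deliberately uses stand-in types so it runs without Lucene: `PerDocumentAnalyzerSketch`, `register` and `index` are hypothetical names, and an analyzer is modelled as a simple `Function<String, String>`. Only the shape of `addDocument(doc, analyzer)` mirrors the method quoted above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Stand-in model, NOT Lucene code: shows choosing the analyzer per document
// instead of fixing one analyzer when the writer is opened.
final class PerDocumentAnalyzerSketch {
    // entity type name -> analyzer for that entity (assumed registry)
    private final Map<String, Function<String, String>> analyzerByEntity = new HashMap<>();
    // what ended up "in the index", in insertion order
    final List<String> indexed = new ArrayList<>();

    void register(String entityType, Function<String, String> analyzer) {
        analyzerByEntity.put(entityType, analyzer);
    }

    // Models addDocument(Document doc, Analyzer analyzer): the analyzer is an
    // argument of the call, so nothing is remembered from the previous document.
    private void addDocument(String text, Function<String, String> analyzer) {
        indexed.add(analyzer.apply(text));
    }

    // Looks up the entity's own analyzer right before adding the document.
    void index(String entityType, String text) {
        Function<String, String> analyzer = analyzerByEntity.get(entityType);
        if (analyzer == null) {
            throw new IllegalArgumentException("No analyzer registered for " + entityType);
        }
        addDocument(text, analyzer);
    }
}
```

With this shape, two entity types written to the same index in one transaction each get their own analyzer, regardless of which one was written first.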

Emmanuel Bernard September 13, 2008 at 11:41 PM

I prefer the latter.
The same field names across various entity types are quite common, so conflicts will happen.

Alternatively we can use a wrapper analyzer that can switch from one underlying analyzer to another depending on some inputs from the code.
Right before adding a document, switch to the right analyzer. It should work.
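
A rough sketch of such a switching wrapper, assuming stand-in types so that it runs without Lucene: `SwitchingAnalyzer`, `register` and `switchTo` are hypothetical names, and an analyzer is modelled as a `Function<String, String>` rather than a real Analyzer subclass.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Stand-in model, NOT a Lucene Analyzer: a wrapper that delegates to whichever
// underlying analyzer was selected last.
final class SwitchingAnalyzer implements Function<String, String> {
    // entity type name -> underlying analyzer
    private final Map<String, Function<String, String>> delegates = new HashMap<>();
    // delegate currently in use; null until switchTo is called
    private Function<String, String> current;

    void register(String entityType, Function<String, String> analyzer) {
        delegates.put(entityType, analyzer);
    }

    // Call this right before adding a document of the given entity type.
    void switchTo(String entityType) {
        Function<String, String> analyzer = delegates.get(entityType);
        if (analyzer == null) {
            throw new IllegalArgumentException("No analyzer registered for " + entityType);
        }
        current = analyzer;
    }

    @Override
    public String apply(String text) {
        if (current == null) {
            throw new IllegalStateException("switchTo must be called before analysis");
        }
        return current.apply(text);
    }
}
```

The wrapper holds mutable state, so this sketch is only safe when a single thread switches and writes; concurrent writers would need per-thread state or synchronization.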

Fixed

Created September 13, 2008 at 6:13 PM
Updated December 10, 2008 at 3:18 PM
Resolved October 10, 2008 at 12:02 PM