Better tests for IndexReader passed to Filter to be consistent with latest writes


Complex issue; I don't have time to boil it down to a simple test case, but will attempt to explain clearly:

Recent index changes are not visible through the IndexReader passed to a Lucene Filter set on a FullTextQuery.


  1. I create a new Foo, id: 1, and persist it through entityManager.persist()

  2. I examine the indexes with Luke; they are updated and Foo #1 is present.

  3. I perform a simple Lucene search using Hibernate Search, and Foo #1 is fetched.

Now, I run another query, this time using a query Filter that reads from the IndexReader passed to the getDocIdSet(IndexReader reader) method like so:

I would expect this to return 1, since I just persisted a Foo with ID 1. However, it returns 0.

If, however, I check out an IndexReader instance from the searchFactory and perform the same command like so:

Now the reader successfully returns 1, for the entity I had recently persisted.

Currently I work around this issue by manually checking out an IndexReader from the searchFactory, passing it to my Filter, and checking it back in after the query runs. But this is pretty clunky.

Shouldn't the Filter be getting the same current IndexReader?




Sanne Grinovero
December 1, 2012, 12:43 AM

I'm finally back from several trips and will inspect this tomorrow: looks quite bad.

Sanne Grinovero
January 1, 2013, 4:38 PM

Hi Clark,
I developed a functional test to verify this; could you please have a look at it:

Did you notice that the Filter instance is invoked multiple times? The filter needs to be applied on each sub-reader: you get one reader instance to process for each segment in the index. Since the set of segments changes after each change you make to the index, adding a new element as in your test means you'll be processing at least two segments.

Note that when you invoke

you're not operating on sub-readers but on a recursive IndexReader which includes all current segments.
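
To make the difference concrete, here is a toy model in plain Java. It is only a sketch: a "segment" is modelled as a list of document ids, and `SegmentViewSketch`, `docFreqInSegment` and `docFreqTopLevel` are hypothetical names, not Lucene or Hibernate Search API. It shows how a count taken inside one segment can be 0 while the count over all segments is 1.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: a segment is a list of document ids, the index is a list of segments.
// None of these names are Lucene or Hibernate Search classes.
final class SegmentViewSketch {

    // What a per-segment reader can see: matches inside this one segment.
    static int docFreqInSegment(List<String> segment, String id) {
        int count = 0;
        for (String docId : segment) {
            if (docId.equals(id)) {
                count++;
            }
        }
        return count;
    }

    // What a top-level reader can see: matches summed over every segment.
    static int docFreqTopLevel(List<List<String>> index, String id) {
        int total = 0;
        for (List<String> segment : index) {
            total += docFreqInSegment(segment, id);
        }
        return total;
    }

    public static void main(String[] args) {
        List<String> older = Arrays.asList("7", "8"); // segment written earlier
        List<String> newest = Arrays.asList("1");     // segment holding the new Foo #1
        List<List<String>> index = Arrays.asList(older, newest);

        // The filter is called once per segment, so it sees each count separately.
        for (List<String> segment : index) {
            System.out.println("segment " + segment + " -> " + docFreqInSegment(segment, "1"));
        }
        System.out.println("top-level -> " + docFreqTopLevel(index, "1"));
    }
}
```

In this model, a filter that only looks at the result of its first invocation (the older segment) would report 0 even though the document is in the index.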

Why: filtering needs to be applied on a per-segment basis to make caching more effective. Each cached DocIdSet is kept around for as long as its segment is valid, so a minimal change to the index won't invalidate all the processing already done.
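
The caching argument can be sketched the same way. This is a toy model under stated assumptions, not Lucene's caching implementation: segments are identified by a made-up name such as `_0`, they are assumed never to change once written, and `PerSegmentCacheSketch` with its `computations` counter is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: segments are named, immutable lists of field values.
// This is not Lucene's caching API; all names here are hypothetical.
final class PerSegmentCacheSketch {
    private final String wanted;
    private final Map<String, List<Integer>> cache = new HashMap<>();
    int computations = 0; // how many segments were actually scanned

    PerSegmentCacheSketch(String wanted) {
        this.wanted = wanted;
    }

    // Returns the positions of matching docs in one segment, scanning it only once.
    List<Integer> matchingDocs(String segmentName, List<String> docs) {
        List<Integer> cached = cache.get(segmentName);
        if (cached != null) {
            return cached; // segment unchanged, reuse the earlier result
        }
        computations++;
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < docs.size(); i++) {
            if (docs.get(i).equals(wanted)) {
                hits.add(i);
            }
        }
        cache.put(segmentName, hits);
        return hits;
    }

    public static void main(String[] args) {
        PerSegmentCacheSketch filter = new PerSegmentCacheSketch("red");
        List<String> seg0 = Arrays.asList("red", "blue", "red");

        // First query: the index has a single segment.
        filter.matchingDocs("_0", seg0);

        // A write adds a second segment; the second query visits both.
        List<String> seg1 = Arrays.asList("red");
        filter.matchingDocs("_0", seg0);
        filter.matchingDocs("_1", seg1);

        System.out.println("segments scanned: " + filter.computations);
    }
}
```

The second query only scans the new segment, so the counter ends at 2 rather than 3. A cache keyed on the whole index would have had to redo everything after the write.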

Sanne Grinovero
January 1, 2013, 4:40 PM

Changed priority from critical as I think it's not really a bug - might need some clarifications on the docs?

Clark Duplichien
January 5, 2013, 12:17 AM

Thanks for the investigation and explanation, Sanne.
I was attempting to use a Lucene Filter to limit searched records to a subset of records whose IDs are or are not present in another field of the same index (but in different records). After your explanation, I follow the javadoc better, and can understand why this use case wouldn't work out in a filter: the sub-reader filtering the current index segment can only read the values of the referenced field within that same segment.
Given this, I don't think there's any clarification to be made in the hibernate-search docs, either.

Sanne Grinovero
January 5, 2013, 12:27 AM

Hi Clark, thanks for confirming it's not a bug: feeling better

You're right, for your use case you would need a top-level IndexReader, so the approach of checking one out yourself, as you described in the JIRA description, is good.
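
As a rough illustration of why this use case needs the top-level view, here is a plain-Java toy model. Everything in it is an assumption made for the example: `Doc`, `refId`, `referencedIds` and `accept` are hypothetical and do not correspond to Lucene or Hibernate Search types. A document is kept only if some other document references its id.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: hypothetical types, not Lucene or Hibernate Search API.
final class CrossSegmentLookupSketch {

    static final class Doc {
        final String id;
        final String refId; // id of another record this one points at, or null

        Doc(String id, String refId) {
            this.id = id;
            this.refId = refId;
        }
    }

    // Collect every id that is referenced by some document in the given view.
    static Set<String> referencedIds(List<Doc> view) {
        Set<String> refs = new HashSet<>();
        for (Doc d : view) {
            if (d.refId != null) {
                refs.add(d.refId);
            }
        }
        return refs;
    }

    // Keep the documents of one segment whose id appears in the visible references.
    static List<String> accept(List<Doc> segment, Set<String> visibleRefs) {
        List<String> kept = new ArrayList<>();
        for (Doc d : segment) {
            if (visibleRefs.contains(d.id)) {
                kept.add(d.id);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Doc> segmentA = Arrays.asList(new Doc("1", null));
        List<Doc> segmentB = Arrays.asList(new Doc("2", "1")); // references doc 1
        List<Doc> wholeIndex = new ArrayList<>(segmentA);
        wholeIndex.addAll(segmentB);

        // References looked up inside the segment being filtered: doc 1 is missed.
        System.out.println(accept(segmentA, referencedIds(segmentA)));

        // References looked up through the whole index: doc 1 is kept.
        System.out.println(accept(segmentA, referencedIds(wholeIndex)));
    }
}
```

The second call mirrors the workaround: the set of references is gathered from a view of the whole index and handed to the filter, which still processes one segment at a time.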



Sanne Grinovero


Clark Duplichien


