AndDocIdSet makeDocIdSetOnAgreedBits() returns wrong values
Activity

Hardy Ferentschik November 2, 2010 at 6:38 PM
The problem is that the DocIdSetIterator returned by SortedVIntList behaves differently on advance(int target) than the iterators returned by DocIdBitSet and OpenBitSet. Let's take the example from the test. Assume the following doc ids are in the set: 0, 5, 6 and 10. We get the DocIdSetIterator from the DocIdBitSet and call next() until we point to the third element (iterator.docID() == 6). Now we call iterator.advance(6).
The algorithm in AndDocIdSet assumes that the advance call will return 6, basically not moving to another element. This is also how the iterators of DocIdBitSet and OpenBitSet behave. The iterator returned by SortedVIntList, however, returns 10.
The question is who is right. The DocIdSetIterator.advance javadoc says:
Advances to the first beyond the current whose document number is greater than or equal to target
It also shows some pseudo code:
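A minimal sketch of that behaviour, assuming the javadoc's loop is the usual nextDoc()-based one (recalled from the Lucene 3.0 javadoc, not copied from this issue). StrictArrayIterator and StrictAdvanceDemo are hypothetical stand-ins, not Lucene classes; they only show why advance(6) from position 6 yields 10 under this reading:

```java
// Hypothetical stand-in for a DocIdSetIterator over a sorted int array.
// Not a Lucene class; it only mirrors the docID()/nextDoc()/advance() contract.
class StrictArrayIterator {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    private final int[] docs;
    private int index = -1; // -1 means "not started yet"

    StrictArrayIterator(int... sortedDocs) {
        this.docs = sortedDocs;
    }

    int docID() {
        if (index < 0) {
            return -1;
        }
        return index < docs.length ? docs[index] : NO_MORE_DOCS;
    }

    int nextDoc() {
        if (index < docs.length) {
            index++;
        }
        return docID();
    }

    // Strict reading of the javadoc: nextDoc() runs before the comparison,
    // so the iterator always moves beyond the current document.
    int advance(int target) {
        int doc;
        while ((doc = nextDoc()) < target) {
            // keep stepping until doc >= target
        }
        return doc;
    }
}

class StrictAdvanceDemo {
    public static void main(String[] args) {
        StrictArrayIterator iterator = new StrictArrayIterator(0, 5, 6, 10);
        iterator.nextDoc(); // 0
        iterator.nextDoc(); // 5
        iterator.nextDoc(); // 6
        System.out.println(iterator.advance(6)); // prints 10
    }
}
```

With the doc ids 0, 5, 6 and 10 from the example, the loop steps forward before it compares, so it can never return the document the iterator is already on.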
Reading this documentation I would think SortedVIntList behaves correctly, but I think OpenBitSet is the more commonly used. I am surprised that no one has noticed this before.
On our AndDocIdSet side we can actually cater for this problem by comparing iterator.docID() == targetPosition before we call advance. If they match we don't have to call advance at all, because the iterator is already at the right position.
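The guard described above could look roughly like this. It is a sketch, not the actual AndDocIdSet patch: GuardedAdvance, advanceIfNeeded and the array-backed SketchIterator are hypothetical names, and the iterator deliberately uses the strict advance semantics so the guard has something to protect against:

```java
// Hypothetical array-backed iterator with strict advance() semantics;
// it stands in for the iterator returned by SortedVIntList.
class SketchIterator {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    private final int[] docs;
    private int index = -1;

    SketchIterator(int... sortedDocs) {
        this.docs = sortedDocs;
    }

    int docID() {
        if (index < 0) {
            return -1;
        }
        return index < docs.length ? docs[index] : NO_MORE_DOCS;
    }

    int nextDoc() {
        if (index < docs.length) {
            index++;
        }
        return docID();
    }

    int advance(int target) {
        int doc;
        while ((doc = nextDoc()) < target) {
            // keep stepping until doc >= target
        }
        return doc;
    }
}

class GuardedAdvance {
    // Only call advance() when the iterator is not already on the target,
    // so strict and lenient iterators give the same answer.
    static int advanceIfNeeded(SketchIterator iterator, int targetPosition) {
        if (iterator.docID() == targetPosition) {
            return targetPosition;
        }
        return iterator.advance(targetPosition);
    }

    public static void main(String[] args) {
        SketchIterator iterator = new SketchIterator(0, 5, 6, 10);
        iterator.nextDoc();
        iterator.nextDoc();
        iterator.nextDoc(); // docID() == 6
        System.out.println(advanceIfNeeded(iterator, 6)); // prints 6
        System.out.println(advanceIfNeeded(iterator, 7)); // prints 10
    }
}
```

The check costs one comparison per call and leaves iterators that already stay in place unaffected.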
Need to follow up with the Lucene guys as well.

Sanne Grinovero October 27, 2010 at 12:00 PM
Thanks a lot for spotting this and providing a test case, very useful.
Details
Assignee: Unassigned
Reporter: Christian Mader
Priority: Major

Depending on the DocIdSets list, AndDocIdSet fails to compute correct values in makeDocIdSetOnAgreedBits(). Please see the attached test cases: test_middle() fails, and in my opinion it shouldn't.