Facet collection not threadsafe
Description
Activity

Hardy Ferentschik May 15, 2014 at 9:32 AM
> Hardy, thanks. That really helps.
No problem. I'm actually glad to hear that this setup helped. I created it a while back as a performance baseline for upcoming work on improving faceting. I am planning to extend it once I start the actual implementation.
> I've extended it quite a bit to more closely mimic how we run queries, and to run multi-threaded.
Cool. If you come up with something useful and worth sharing, feel free to create a pull request.
> Good news seems to be that things are okay.
Good news for us indeed. Good luck with the bug hunt.

adamb May 14, 2014 at 8:04 PM
Hardy, thanks. That really helps. I've extended it quite a bit to more closely mimic how we run queries, and to run multi-threaded. Good news seems to be that things are okay. Bad news is that this might be in our code:

Hardy Ferentschik May 14, 2014 at 8:42 AM
I actually have a test harness - https://github.com/hferentschik/hsearch-faceting-perf - I have just never seen this before. The harness is geared towards comparing Lucene faceting against Search faceting, but it still executes faceting multi-threaded, and I've never had an issue. Since I am planning to continue with some faceting-related issues, I'll extend the tests and see whether I can trigger a concurrency issue.

Emmanuel Bernard May 14, 2014 at 6:06 AM
Hello, I know some have used Byteman rules to implement concurrency unit tests. Do you think that would be an appropriate tool here?

adamb May 13, 2014 at 7:57 PM
Hardy,
Starting from your last point and moving backward, I guess that's the point – all that has to happen is for the FacetManager or the search to have to handle two requests at the same time.
I can look at extracting a test case, but I'll be honest, I'm not exactly sure how best to do that for you. It sounds like I need to write a small application using Hibernate and Hibernate Search that loads thousands of objects into a table and then sets up two threads to perform different faceted searches simultaneously. I can definitely do that ... but I wonder if there might be some test fixtures in place that might be more conducive? Not trying to be argumentative, just trying to be proactive in using everyone's time here.
thanks,
adam
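A rough sketch of the two-thread check described above, in plain Java with no Hibernate dependencies. `ConcurrentFacetCheck`, `Result`, and `countMismatches` are hypothetical names, not part of Hibernate Search or the linked harness. The check assumes a single-valued facet field, so the facet counts for a query should add up to its hit total; the real faceted search would be plugged in as the `search` function.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

final class ConcurrentFacetCheck {

    /** Hypothetical result holder: the query it belongs to, its hit count, and the sum of its facet counts. */
    static final class Result {
        final String query;
        final long hits;
        final long facetTotal;

        public Result(String query, long hits, long facetTotal) {
            this.query = query;
            this.hits = hits;
            this.facetTotal = facetTotal;
        }
    }

    /**
     * Runs each query `rounds` times on its own thread, with all threads released together,
     * and returns how many results had facets that did not match their own query.
     */
    static int countMismatches(Function<String, Result> search, List<String> queries, int rounds)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(queries.size());
        CyclicBarrier start = new CyclicBarrier(queries.size());
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String q : queries) {
                Callable<Integer> task = () -> {
                    start.await(); // release all threads at once to maximise overlap
                    int bad = 0;
                    for (int i = 0; i < rounds; i++) {
                        Result r = search.apply(q);
                        // Assumption: single-valued facet, so facet counts sum to the hit count.
                        if (!r.query.equals(q) || r.hits != r.facetTotal) {
                            bad++;
                        }
                    }
                    return bad;
                };
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> f : futures) {
                total += f.get();
            }
            return total;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A non-zero return value from `countMismatches` under concurrent load, with zero when the same queries run one at a time, would point at a threading problem somewhere between running the query and collecting its facets.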
Details
We have noticed that under heavy load, our Hibernate Search generated facets no longer pertain to the Lucene query that generated them. E.g., if a user loads:
http://core.tdar.org/search/results
they should see at least 364,000 documents. But, when you have a script like the following running at the same time, the user will often get a result closer to 230 documents:
From what I can surmise from how the code works, and from the fact that facet collection is done in a separate method from the actual query, it may be possible that the Lucene workers or threads are not being retained between the time of building the query, processing it, and collecting the facets.
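To make the suspected failure mode concrete, here is a minimal plain-Java sketch of how facets could end up belonging to a different query when the query and the facet collection are separate calls that share state. It illustrates the hypothesis only and says nothing about how Hibernate Search is actually implemented; `SharedFacetStateDemo`, `SharedStateSearcher`, `runQuery`, and `collectFacets` are all hypothetical names. Latches force the interleaving "A queries, B queries, A collects" so the outcome is deterministic.

```java
import java.util.concurrent.CountDownLatch;

final class SharedFacetStateDemo {

    /** Hypothetical searcher that keeps the facet count of the most recent query in a shared field. */
    static final class SharedStateSearcher {
        private volatile long lastFacetCount; // shared by every caller

        void runQuery(long matchingDocs) {
            lastFacetCount = matchingDocs;
        }

        long collectFacets() {
            return lastFacetCount;
        }
    }

    /** Forces the order: A queries, B queries, A collects. Returns the facet count A observes. */
    static long facetCountSeenByA(long docsForA, long docsForB) throws InterruptedException {
        SharedStateSearcher searcher = new SharedStateSearcher();
        CountDownLatch aQueried = new CountDownLatch(1);
        CountDownLatch bQueried = new CountDownLatch(1);
        long[] seenByA = new long[1];

        Thread a = new Thread(() -> {
            searcher.runQuery(docsForA);
            aQueried.countDown();
            try {
                bQueried.await(); // B's query slips in before A collects its facets
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            seenByA[0] = searcher.collectFacets();
        });
        Thread b = new Thread(() -> {
            try {
                aQueried.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            searcher.runQuery(docsForB);
            bQueried.countDown();
        });

        a.start();
        b.start();
        a.join();
        b.join();
        return seenByA[0];
    }
}
```

With the numbers from this report, `facetCountSeenByA(364_000L, 230L)` returns 230: request A ran the large query but collected the facets of B's query. Keeping the facet state per request (for example, returning it from the query call instead of storing it on a shared object) would remove this particular interleaving.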
Under normal load, it is nearly impossible to elicit this situation.
I'm having a hard time figuring out the best way to provide a "test case" for this, beyond providing the script above and a description of how to reproduce. If there's a test case in Hibernate Search that might be conducive, please point me in that direction.
Code running the query and generating the facets is available here:
[processing search]
https://bitbucket.org/tdar/tdar.src/src/tip/src/main/java/org/tdar/core/service/SearchService.java#cl-329
[facet collection]
https://bitbucket.org/tdar/tdar.src/src/tip/src/main/java/org/tdar/core/service/SearchService.java#cl-369