Add match all terms option when matching in the DSL API
Description
Details
Assignee: Unassigned
Reporter: Guillaume Smet
Priority: Critical
Activity
Yoann Rodière, July 1, 2020 at 11:02 AM
I guess the use case is already addressed by SimpleQueryString for the most part, but there are still details that are not...
In particular, I know the simple query string can apply an AND between two clauses (separated by a space), but I'm not sure what happens when a clause is tokenized into two separate terms (e.g. wi-fi). It's quite possible that we end up with an OR between the two terms, and then we're back to the same old problem.
Elasticsearch has quite advanced settings to define how to behave when trying to match multiple terms (what this ticket is about) or multiple fields, and I think it would be worth having a look.
There are also minor differences between the match predicate and the simple query string predicate that could make the match predicate preferable in some cases: no support for DSL converters, no fuzzy option that can be set by the developer, ... All very minor, but they exist.
Anyway, this is definitely low-priority, at least for Search 6.0.0.Final. I might end up postponing to 6.1 if we don't have enough time.
Guillaume Smet, June 30, 2020 at 5:00 PM
For me this one can probably be closed. The simple query string work was a direct follow-up to that one.
Sanne Grinovero, April 16, 2012 at 12:21 PM
Hi Guillaume,
thanks! We can split the tasks if you happen to find the time to contribute your draft; to my mind, the most valuable contribution is the tests and the requirements expressed in Java. Otherwise, I look forward to you finishing your other project.
Guillaume Smet, April 16, 2012 at 11:44 AM
Hi Sanne,
The Hibernate Search part is easy - and I already have it somewhere - but the Lucene one isn't that easy, depending on your analyzers.
I had to put it on hold for the time being as I have to concentrate my efforts on another project. I'll be back on it as soon as I've finished my other project (should be end of June) if no one beats me to it.
Guillaume
Hi,
A bit of context:
We are early adopters of Hibernate Search and we have had very few problems with it (apart from the @IndexedEmbedded problem we helped to fix in 3.4.1, no problems so far).
When the DSL API was introduced, I tried it and I found the problem I describe below. I decided to use the QueryParser API (and the MultiFieldQueryParser API) as a workaround. The fact is that:
- we now use Hibernate Search in every application we have;
- the DSL API is really nice and, as we introduced QueryDSL in our application, we now use a lot of DSL-like APIs and I would like to be able to use the Hibernate Search one too;
- I thought it was a deliberate choice but I recently found an example so weird that I can't believe it's the intended behaviour.
So this problem isn't new: it has existed since the first version of the DSL API.
Now, the description of the problem:
- we use the following analyzer to index a field in our entity:
- the content of the field is something like XXXX-AAAA-HAGYU-19910
- if you search for an exact match "XXXX-AAAA-HAGYU-19910" with the QueryParser, you get a few results: namely the results which have all the different parts (XXXX, AAAA, HAGYU and 19910) in any order. That's the behaviour I expect considering my analyzer.
- if you search using the DSL API, you get ALL the results containing at least ONE token, so A LOT of results in our case.
My expectation is that the DSL API should work as the Lucene parser works and it should return the same results.
The problem is that in ConnectedMultiFieldsTermQueryBuilder, we don't use the QueryParser to build the Lucene query but a getAllTermsFromText() method, which uses the analyzer to get all the terms, and from those we build an OR query.
So when I search for XXXX-AAAA-HAGYU-19910, the DSL API searches for "XXXX" OR "AAAA" OR "HAGYU" OR "19910".
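To make the difference concrete, here is a minimal sketch in plain Java. It is a toy model, not Hibernate Search or Lucene code: the TermMatchSketch class, its analyze() method (a stand-in for an analyzer that splits on non-alphanumeric characters) and the sample documents are all assumptions made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

public class TermMatchSketch {

    // Hypothetical stand-in for the analyzer: lowercases the text and splits it
    // on anything that is not a letter or a digit ("XXXX-AAAA" -> [xxxx, aaaa]).
    static Set<String> analyze(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+"))
                .filter(token -> !token.isEmpty())
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    // OR semantics: the document matches if it contains at least one query term.
    static boolean matchesAny(String document, String query) {
        Set<String> documentTerms = analyze(document);
        return analyze(query).stream().anyMatch(documentTerms::contains);
    }

    // AND semantics: the document matches only if it contains every query term,
    // in any order.
    static boolean matchesAll(String document, String query) {
        return analyze(document).containsAll(analyze(query));
    }

    // Filters the documents with either AND (requireAll) or OR semantics.
    static List<String> search(List<String> documents, String query, boolean requireAll) {
        return documents.stream()
                .filter(d -> requireAll ? matchesAll(d, query) : matchesAny(d, query))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Made-up sample values, for illustration only.
        List<String> documents = List.of(
                "XXXX-AAAA-HAGYU-19910",
                "19910-HAGYU-AAAA-XXXX",
                "XXXX-BBBB-OTHER-00001",
                "ZZZZ-CCCC-NOPE-00002");
        String query = "XXXX-AAAA-HAGYU-19910";

        System.out.println("OR:  " + search(documents, query, false));
        System.out.println("AND: " + search(documents, query, true));
    }
}
```

With these sample values, the OR variant also returns the third document, which only shares the XXXX token with the query, while the AND variant returns only the first two documents, which contain all four tokens in any order.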
I really think it's a mistake and that we should use the *QueryParser API to build the Lucene Query and have the correct behaviour.
If needed, I can provide any further information and/or a test case. I just want to be sure you consider it a bug before working further on this. Otherwise I'll stick to using the *QueryParser API.
Thanks for your feedback.