Stopword analysis may result in empty exception being thrown in keyword query

Description

When you enter text such as "the" in a keyword query on a field that uses stopword analysis, Hibernate Search throws an exception. This is undesirable because there doesn't seem to be an easy way to prevent this kind of input from reaching a keyword query: whether the query fails depends on how the input text is analyzed.

Instead, I believe a term like this should simply be ignored and a warning should be logged.

{{{
org.hibernate.search.errors.EmptyQueryException: HSEARCH000146: The query string 'the' applied on field 'tag_kw_pt' has no meaningfull tokens to be matched. Validate the query input against the Analyzer applied on this field.
at org.hibernate.search.query.dsl.impl.ConnectedMultiFieldsTermQueryBuilder.createQuery(ConnectedMultiFieldsTermQueryBuilder.java:111)
at org.hibernate.search.query.dsl.impl.ConnectedMultiFieldsTermQueryBuilder.createQuery(ConnectedMultiFieldsTermQueryBuilder.java:81)

}}}

Activity

Fabio Massimo Ercoli February 9, 2021 at 5:52 PM

We already cover this case and already have a test for it.

Yoann Rodière February 8, 2021 at 9:27 AM

I believe this bug no longer exists in Hibernate Search 6. Could you check and close this ticket if it’s been fixed? Thanks.

Yoann Rodière March 3, 2017 at 9:03 AM

An easy way out is to simply catch the exception and handle it in whatever way suits you. Ignore it if that's acceptable.

As for changing Hibernate Search to handle this case, I'm a bit torn. On the one hand, yes, throwing an exception seems a bit harsh. But on the other hand, what in the world can an empty query mean? Some developers may want to not return any result in this case, forcing users to enter meaningful queries. Different use cases may require a different behavior, and simply logging a warning doesn't give users the ability to choose.

I guess we could add an option to allow developers to choose the behavior when there are no meaningful terms in a query. This would make even more sense if we introduce the fluid API described in the comments of HSEARCH-2498.
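
Both ideas above (catching the exception, and letting the developer choose the behavior) can be approximated in caller code. The following is a minimal, self-contained sketch, not Hibernate Search code: `EmptyQueryException` is a local stand-in for the Hibernate Search exception of the same name, and `EmptyQueryPolicy`, `SafeSearch`, and the `Supplier`-based search are hypothetical names introduced only for illustration.

```java
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Local stand-in so the sketch compiles on its own; real code would catch
// Hibernate Search's EmptyQueryException instead (assumption).
class EmptyQueryException extends RuntimeException {
    EmptyQueryException(String message) {
        super(message);
    }
}

// Hypothetical policy: what to do when the query has no meaningful terms.
enum EmptyQueryPolicy { RETURN_NO_RESULTS, FAIL }

final class SafeSearch {

    // Runs the search and applies the chosen policy if the query turns out to be empty.
    static <T> List<T> run(Supplier<List<T>> search, EmptyQueryPolicy policy) {
        try {
            return search.get();
        } catch (EmptyQueryException e) {
            if (policy == EmptyQueryPolicy.FAIL) {
                throw e;
            }
            // The caller chose to treat "nothing to match" as "no hits".
            return Collections.emptyList();
        }
    }

    public static void main(String[] args) {
        // Simulates a keyword query whose input consisted only of stopwords.
        Supplier<List<String>> stopwordOnly = () -> {
            throw new EmptyQueryException("no meaningful tokens");
        };
        System.out.println(run(stopwordOnly, EmptyQueryPolicy.RETURN_NO_RESULTS));
    }
}
```

In real code the `Supplier` would wrap the query construction and execution, so the policy is decided per call site rather than globally.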

Jan-Willem Gmelig Meyling March 2, 2017 at 11:14 PM

I'm facing the same issue, and I'm also wondering how to check whether a query is empty after the analyzers have run. Have you found a workaround? How can one apply the analyzers to a query string to find out that it is in fact empty before executing the query?
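
One possible pre-check, sketched here in plain Java, is to tokenize the input the way the field's analyzer would and test whether anything survives. The `STOPWORDS` set and the lowercase-and-split tokenization are simplified stand-ins (assumptions, not Hibernate Search or Lucene code). In a real application the tokens would have to come from the analyzer actually configured for the field, otherwise the check and the query may disagree.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class StopwordPrecheck {

    // Hypothetical stopword list; the real one is whatever the field's analyzer uses.
    static final Set<String> STOPWORDS =
            new HashSet<>(Arrays.asList("a", "an", "and", "of", "the"));

    // Simplified stand-in for analysis: lowercase, split on runs of
    // non-letter/non-digit characters, then drop stopwords.
    static List<String> meaningfulTokens(String input) {
        List<String> tokens = new ArrayList<>();
        for (String raw : input.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!raw.isEmpty() && !STOPWORDS.contains(raw)) {
                tokens.add(raw);
            }
        }
        return tokens;
    }

    // True when nothing is left to match, i.e. the case that triggers the exception.
    static boolean isEmptyAfterAnalysis(String input) {
        return meaningfulTokens(input).isEmpty();
    }

    public static void main(String[] args) {
        System.out.println(isEmptyAfterAnalysis("the"));
        System.out.println(isEmptyAfterAnalysis("the quick fox"));
    }
}
```

A caller could skip the keyword query (or show a validation message) whenever `isEmptyAfterAnalysis` returns true.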

Diego Salvi December 19, 2013 at 4:40 PM

I've got the same issue when running keyword queries against fields configured with stopwords.

May I suggest using a strategy similar to ConnectedMultiFieldsPhraseQueryBuilder.createQuery(FieldContext)?

ConnectedMultiFieldsPhraseQueryBuilder.createQuery(FieldContext):[144,148]

Implementing similar code in ConnectedMultiFieldsTermQueryBuilder.createQuery(FieldContext,ConversionContext) would return an empty query for a keyword query that is empty after stopword removal.

From:

ConnectedMultiFieldsTermQueryBuilder.createQuery(FieldContext,ConversionContext):[110:112]

To:

ConnectedMultiFieldsTermQueryBuilder.createQuery(FieldContext,ConversionContext):[110:112]

Out of Date

Details

Created April 16, 2013 at 3:06 PM
Updated February 9, 2021 at 5:52 PM
Resolved February 9, 2021 at 5:52 PM