Hibernate Search / HSEARCH-2642

Automatic translation of TypeTokenFilterFactory to Elasticsearch turns blacklists into whitelists

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.6.1.Final, 5.7.0.Final
    • Fix Version/s: 5.8.0.Beta2
    • Component/s: backend-elasticsearch
    • Labels: None

      Description

      TypeTokenFilter in Lucene acts as a blacklist by default, removing some tokens according to their type. It can act as a whitelist if the "useWhitelist" parameter is set to true.
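To make the two modes concrete, here is a minimal sketch in plain Java. It does not use Lucene classes; `TypeFilterSemantics` and `accept` are hypothetical names chosen for illustration, and the token types `<NUM>` and `<ALPHANUM>` are only sample values.

```java
import java.util.Set;

// Hypothetical illustration of the blacklist/whitelist semantics; not Lucene code.
class TypeFilterSemantics {

    // Returns true when a token of the given type should be kept.
    // Blacklist mode (useWhitelist == false): keep types that are NOT listed.
    // Whitelist mode (useWhitelist == true): keep only types that ARE listed.
    static boolean accept(String tokenType, Set<String> types, boolean useWhitelist) {
        return useWhitelist == types.contains(tokenType);
    }

    public static void main(String[] args) {
        Set<String> types = Set.of("<NUM>");
        // Same type list, opposite outcomes depending on the mode.
        System.out.println(accept("<NUM>", types, false));
        System.out.println(accept("<NUM>", types, true));
    }
}
```

With the same list of types, flipping the mode flips which tokens survive, which is why translating one mode as the other changes the analyzer's output.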

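      For Elasticsearch analyzers, we translate TypeTokenFilterFactory to Elasticsearch's "keep_types" filter type, which keeps only the listed token types, and we forbid the use of "useWhitelist". As a result, a filter that acts as a blacklist in Lucene is translated into a whitelist in Elasticsearch, which inverts its meaning.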

      What we should do instead is mandate the use of "useWhitelist", and throw an exception when it is either missing or set to anything other than "true".
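The proposed check could look roughly like the sketch below. This is not the actual Hibernate Search code: `TypeTokenFilterTranslation` and `requireWhitelistMode` are hypothetical names, the parameters are assumed to arrive as a plain string map, and the strict comparison against "true" is an assumption about how the value would be matched.

```java
import java.util.Map;

// Hypothetical sketch of the validation proposed above; not the Hibernate Search implementation.
class TypeTokenFilterTranslation {

    // Returns normally only when "useWhitelist" is present and equal to "true".
    // Throws when the parameter is missing or has any other value, because
    // "keep_types" cannot express the blacklist mode.
    static void requireWhitelistMode(Map<String, String> params) {
        String value = params.get("useWhitelist");
        if (!"true".equals(value)) {
            throw new IllegalArgumentException(
                    "TypeTokenFilterFactory can only be translated to 'keep_types' when"
                    + " 'useWhitelist' is set to 'true', but got: " + value);
        }
    }

    public static void main(String[] args) {
        requireWhitelistMode(Map.of("types", "types.txt", "useWhitelist", "true")); // accepted
        try {
            requireWhitelistMode(Map.of("types", "types.txt")); // missing parameter
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```

Failing fast here means a blacklist definition is rejected at translation time instead of silently behaving as a whitelist.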

      Note: there doesn't seem to be an equivalent to the blacklist mode of TypeTokenFilterFactory in Elasticsearch.

            People

            • Assignee: Yoann Rodière (yrodiere)
            • Reporter: Yoann Rodière (yrodiere)