Explicit support for indexing free-form (dynamic) entities

Description

Indexing free-form objects is already possible today, but it is a well-kept secret: the technique looks like a hackish workaround and lacks explicit API support.

Today (i.e. with version 4.3.x), the available approach is as follows:

  • Indexing
    For indexing, you need a "placeholder object" with a rather
    smart custom @ClassBridge.
    Analyzers can also be custom, as can numeric types, etc.
    ModeShape uses this approach, and we added some tests to Search Engine as examples, since Hibernate Search users have occasionally asked for more dynamic properties as well.

  • Queries
    If you have a purely dynamic schema, likely because you're building a framework on top of Hibernate Search, you would use such placeholders exclusively.
    Since you have a single placeholder object, you'll only be able to
    target this type, and queries will return lists of this type. This means that
    user types are built on top of this type as an additional layer,
    probably with a special field to represent which protostream schema we're
    referring to; the HQL query transformer will then either add a
    restriction or enable a filter.
    There is room for improvement in the lower-level details, for example
    by exposing some control over the usage of the custom field
    org.hibernate.search.ProjectionConstants.OBJECT_CLASS.
    I guess the results Loader could also take advantage of this, but
    it doesn't seem an urgently needed patch either.

  • Query DSL + metadata API
    These APIs don't provide any useful helpers for purely dynamic input. This needs to be explored.

  • Tika and filesystem-stored documents
    To be kept in mind as interesting input examples.
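
To make the Indexing and Queries points more concrete, here is a minimal, dependency-free Java sketch of the pattern. It does not use the Hibernate Search API: the names Placeholder, SCHEMA_FIELD, toIndexFields and find are all hypothetical, and a plain map stands in for a Lucene document. toIndexFields models the work a custom class bridge would do (one index field per dynamic property, plus a discriminator field), and find models a query that is restricted to one user-level type through that discriminator.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DynamicEntitySketch {

    /** Hypothetical placeholder: the only type the index would know about. */
    static final class Placeholder {
        final String schemaName;              // which user-level type this instance represents
        final Map<String, Object> properties; // free-form payload

        Placeholder(String schemaName, Map<String, Object> properties) {
            this.schemaName = schemaName;
            this.properties = properties;
        }
    }

    /** Hypothetical name of the discriminator field (an assumption, not a Hibernate Search constant). */
    static final String SCHEMA_FIELD = "__schema";

    /** Stand-in for a custom class bridge: emits one index field per dynamic property. */
    static Map<String, String> toIndexFields(Placeholder p) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : p.properties.entrySet()) {
            fields.put(e.getKey(), String.valueOf(e.getValue()));
        }
        // Written last so that a payload key cannot overwrite the discriminator value.
        fields.put(SCHEMA_FIELD, p.schemaName);
        return fields;
    }

    /** Stand-in for a query: exact match on one field, restricted to one schema. */
    static List<Placeholder> find(List<Placeholder> all, String schemaName, String field, String value) {
        List<Placeholder> hits = new ArrayList<>();
        for (Placeholder p : all) {
            Map<String, String> doc = toIndexFields(p);
            if (schemaName.equals(doc.get(SCHEMA_FIELD)) && value.equals(doc.get(field))) {
                hits.add(p); // results are always placeholders, never user types
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Placeholder> all = new ArrayList<>();
        all.add(new Placeholder("Book", Collections.<String, Object>singletonMap("title", "Dune")));
        all.add(new Placeholder("Film", Collections.<String, Object>singletonMap("title", "Dune")));

        // Both entries share the same field value; only the schema restriction tells them apart.
        List<Placeholder> books = find(all, "Book", "title", "Dune");
        System.out.println(books.size() + " hit(s) for schema " + books.get(0).schemaName);
    }
}
```

In a real setup the flattening would live in the placeholder's class bridge and the schema restriction would be added by the query layer (as a restriction or a filter), but the shape of the problem is the same: a single indexed type, with user types layered on top through a discriminator field.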

Environment

None

Assignee

Unassigned

Reporter

Sanne Grinovero

Labels

Suitable for new contributors

None

Pull Request

None

Feedback Requested

None

Priority

Major