Explicit support for indexing free-form (dynamic) entities
Description
Activity
Yoann Rodière June 9, 2017 at 8:37 AM
Closing as duplicate of HSEARCH-1401, since that other ticket is a bit more documented.
Hardy Ferentschik September 12, 2013 at 1:33 PM
Can we add some examples to this issue? What do these "dynamic entities" look like? How are the fields configured and what do queries look like?
Emmanuel Bernard July 31, 2013 at 4:34 PM (edited)
The goal is to offer:
an abstraction on top of object navigation (via reflection) to support alternate data structures
an API that can reference entities by something other than their Class (the query DSL and programmatic mapping are two examples)
Sanne Grinovero July 31, 2013 at 3:53 PM (edited)
This is an example of how it can be achieved today:
https://github.com/hibernate/hibernate-search/blob/277449eb02a367d76e32f7fd92ef9c57fa6a1f0c/engine/src/test/java/org/hibernate/search/test/bridge/PropertiesExampleBridgeTest.java
Note that the indexing and query example looks particularly clumsy because the test lives in the engine module, so it is written without the ORM syntactic sugar (nor the one from Infinispan Query).
Of course it could be made smarter: for example, the DynamicIndexedValueHolder could hold several Properties, some for simple text, some for numbers, and others that carry a specific boosting option as a value.
I don't know exactly how we would like this to evolve in 5.0 and beyond; that needs to be discussed. The point is that CapeDwarf uses it in this more flexible way, as does Infinispan via remote queries (even from clients in different languages like C#, Ruby or Python), as does ModeShape. We need to inspect their use cases, but some are yet to be defined.
Hardy Ferentschik July 31, 2013 at 3:41 PM
So what does this "rather smart" class bridge look like? And what do queries look like?
It's possible today to index free-form objects, but it is a well-kept secret: it looks more like a hackish workaround and lacks explicit API support.
Today (i.e. with version 4.3.x) this is the possible solution:
Indexing
For indexing, you need a "placeholder object" which has a rather
smart custom @ClassBridge.
Analyzers can also be custom, as can numeric types, etc.
ModeShape uses this approach, and we added some tests in Search Engine as an example, since Hibernate Search users have occasionally also asked for more dynamic properties.
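The placeholder approach described above can be sketched roughly as follows. This is a minimal, library-free illustration, not the code from the linked test: `DynamicEntity`, `SketchDocument`, `DynamicEntityBridgeSketch`, the `__type` discriminator field and the `txt.`/`num.` prefixes are all hypothetical names chosen for this example. In a real setup the `write` logic would live in the `set(...)` method of a custom bridge attached to the placeholder with @ClassBridge, and it would add fields to a Lucene Document instead of a map.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical placeholder entity: a single indexed type carrying arbitrary properties.
final class DynamicEntity {
    final String typeName; // logical "user type", e.g. "book"
    final Map<String, Object> properties = new LinkedHashMap<>();

    DynamicEntity(String typeName) {
        this.typeName = typeName;
    }

    DynamicEntity with(String name, Object value) {
        properties.put(name, value);
        return this;
    }
}

// Hypothetical stand-in for a Lucene Document: field name -> indexed values.
final class SketchDocument {
    final Map<String, List<String>> fields = new LinkedHashMap<>();

    void add(String field, String value) {
        fields.computeIfAbsent(field, k -> new ArrayList<>()).add(value);
    }
}

// Mirrors what a custom class bridge would do at indexing time:
// inspect the placeholder and emit one index field per dynamic property.
final class DynamicEntityBridgeSketch {
    static final String TYPE_FIELD = "__type"; // assumed discriminator field name

    static void write(DynamicEntity entity, SketchDocument document) {
        // Record which logical type this placeholder instance represents.
        document.add(TYPE_FIELD, entity.typeName);
        for (Map.Entry<String, Object> e : entity.properties.entrySet()) {
            Object value = e.getValue();
            if (value == null) {
                continue; // nothing to index for this property
            }
            // Assumed convention: numbers and text go to differently prefixed fields,
            // so a real bridge could index the former as numeric fields.
            String prefix = (value instanceof Number) ? "num." : "txt.";
            document.add(prefix + e.getKey(), String.valueOf(value));
        }
    }
}
```

The field names are only known when an instance is indexed, which is why the metadata and query helpers cannot know about them in advance.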
Queries
If you have a purely dynamic schema - likely because you're building a framework on top of Hibernate Search - you would use such placeholders exclusively.
Since you have a single placeholder object, you can only target this
type, and queries return lists of this type. This means that user
types are built on top of this type as an additional layer: probably
a special field records which protostream schema we're referring to,
and the HQL query transformer then either adds a restriction or
enables a filter.
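As a rough illustration of that additional layer, the sketch below rewrites a query against a logical user type into a query against the placeholder type by appending a restriction on a discriminator field. Everything here is a hypothetical stand-in: `TypeRestrictionSketch`, `Term`, the `__type` field name and the in-memory `search` method only model the idea and are not Hibernate Search or Infinispan APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class TypeRestrictionSketch {
    static final String TYPE_FIELD = "__type"; // assumed discriminator field name

    // One exact-match clause; every clause of a query must match.
    static final class Term {
        final String field;
        final String value;

        Term(String field, String value) {
            this.field = field;
            this.value = value;
        }
    }

    // Rewrites a query on a logical user type into a query on the placeholder
    // type by adding a restriction on the discriminator field.
    static List<Term> restrictToType(String userType, List<Term> userQuery) {
        List<Term> rewritten = new ArrayList<>(userQuery);
        rewritten.add(new Term(TYPE_FIELD, userType));
        return rewritten;
    }

    // Toy in-memory "index": each document is a map of field name -> value.
    static List<Map<String, String>> search(List<Map<String, String>> index, List<Term> query) {
        List<Map<String, String>> hits = new ArrayList<>();
        for (Map<String, String> doc : index) {
            boolean matchesAll = true;
            for (Term t : query) {
                if (!t.value.equals(doc.get(t.field))) {
                    matchesAll = false;
                    break;
                }
            }
            if (matchesAll) {
                hits.add(doc);
            }
        }
        return hits;
    }
}
```

Without the added clause, a query on a property name shared by two user types would return placeholders of both types.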
There is room for improvement in the lower-level details, for example
by exposing some control over the usage of the custom field
org.hibernate.search.ProjectionConstants.OBJECT_CLASS.
I guess the results Loader could also take advantage of this, but it
doesn't seem to be an urgently needed patch either.
Query DSL + metadata API
These APIs don't provide any useful helper for purely dynamic input. Needs to be explored.
Tika and filesystem-stored documents
To be kept in mind as interesting input examples.