It's possible today to index free-form objects, but it is a well-kept secret: it looks like a hackish workaround and lacks explicit API support.
Today (i.e. with version 4.3.x) this is the possible solution:
For indexing, you need a "placeholder object" with a rather
smart custom @ClassBridge.
Analyzers can be custom too, as can the handling of numeric types, etc.
ModeShape uses this approach, and we added some tests to Search Engine as examples, since Hibernate Search users have occasionally also asked for more dynamic properties.
If you have a purely dynamic schema, most likely because you're building a framework on top of Hibernate Search, you would use such placeholders exclusively.
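To make the placeholder idea concrete, here is a minimal sketch in plain Java. It is not Hibernate Search code: DynamicEntity, IndexField and DynamicEntityBridge are hypothetical stand-ins, so that the example runs without any dependency. In a real 4.3.x setup the logic of toFields would presumably live in the bridge implementation referenced by @ClassBridge and write into the Lucene document instead of returning a list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical placeholder: the single indexed type carrying free-form properties.
final class DynamicEntity {
    final String id;
    final Map<String, Object> properties = new LinkedHashMap<>();

    DynamicEntity(String id) {
        this.id = id;
    }
}

// Stand-in for one field that a bridge would add to the index document.
final class IndexField {
    final String name;
    final Object value;
    final boolean numeric;

    IndexField(String name, Object value, boolean numeric) {
        this.name = name;
        this.value = value;
        this.numeric = numeric;
    }
}

// Stand-in for the logic a custom class bridge would hold (assumption, not the real API).
final class DynamicEntityBridge {
    static List<IndexField> toFields(DynamicEntity entity) {
        List<IndexField> fields = new ArrayList<>();
        for (Map.Entry<String, Object> entry : entity.properties.entrySet()) {
            Object value = entry.getValue();
            if (value == null) {
                continue; // nothing to index for a missing value
            }
            if (value instanceof Number) {
                // numbers keep their type so they could be indexed as numeric fields
                fields.add(new IndexField(entry.getKey(), value, true));
            } else {
                // everything else is indexed through its string form
                fields.add(new IndexField(entry.getKey(), value.toString(), false));
            }
        }
        return fields;
    }
}
```

The point of the sketch is that the field names come from the data rather than from annotations on a class, which is what makes the schema dynamic.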
Since you have a single placeholder object, you'll only be able to
target this type, and queries will return lists of this type. This means
that user types are built on top of this type as an additional layer:
probably a special field represents which protostream schema we're
referring to, and the HQL query transformer then either adds a
restriction or enables a filter.
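The "additional layer" can be sketched as follows, again in plain Java with hypothetical names (FieldClause, SchemaTypeRestriction and the reserved field name are assumptions, not existing API). Every indexed placeholder carries a field naming its schema, and the transformer appends a clause on that field to whatever the user asked for.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in for a single "field must equal value" query clause.
final class FieldClause {
    final String field;
    final String value;

    FieldClause(String field, String value) {
        this.field = field;
        this.value = value;
    }
}

// Stand-in for what a query transformer would do to scope a query to one user type.
final class SchemaTypeRestriction {
    // Hypothetical reserved field holding the schema name of each placeholder.
    static final String TYPE_FIELD = "__schema_type__";

    // Returns a new clause list: the user's clauses plus the type restriction.
    static List<FieldClause> restrictTo(String schemaName, List<FieldClause> userClauses) {
        List<FieldClause> all = new ArrayList<>(userClauses);
        all.add(new FieldClause(TYPE_FIELD, schemaName));
        return all;
    }

    // Toy matcher: an indexed placeholder matches when every clause is satisfied.
    static boolean matches(Map<String, String> indexedFields, List<FieldClause> clauses) {
        for (FieldClause clause : clauses) {
            if (!clause.value.equals(indexedFields.get(clause.field))) {
                return false;
            }
        }
        return true;
    }
}
```

Whether this is applied as an extra restriction on each query or as a filter enabled once is the choice mentioned above; the sketch only shows the restriction variant.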
There is room for improvement in the lower level details, for example
by exposing some control over the usage of custom field
I guess the results Loader could also take advantage of this, but
it doesn't seem an urgently needed patch either.
Query DSL + metadata API
These APIs don't provide any useful helper for purely dynamic input. This needs to be explored.
Tika and filesystem-stored documents
To be kept in mind as interesting input examples.