Multiple entity types sharing the index shouldn't necessarily prevent single-term deletions
Description
Activity
Emmanuel Bernard, January 19, 2015 at 3:53 PM
To clarify, we need to do the following.
In the ORM case, detect that several entities share the same index but either:
do not use the same identifier field name
are from the same ORM mapped hierarchy
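A minimal sketch of the check described above, not actual Hibernate Search code. `IndexedType`, the demo entity classes and all method names are hypothetical, and "same ORM mapped hierarchy" is approximated by a shared topmost Java superclass; a real implementation would presumably consult the ORM metadata instead. The check is deliberately conservative: it only accepts an index when one of the two conditions holds for all the types at once.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class SharedIndexSafetySketch {

    // Hypothetical entities: Dog and Cat share the mapped root Animal, Invoice stands alone.
    static class Animal {}
    static class Dog extends Animal {}
    static class Cat extends Animal {}
    static class Invoice {}

    /** Hypothetical descriptor: an indexed entity type and the name of its id field. */
    static final class IndexedType {
        final Class<?> type;
        final String idFieldName;

        IndexedType(Class<?> type, String idFieldName) {
            this.type = type;
            this.idFieldName = idFieldName;
        }
    }

    /** True when no two types write their identifier into the same field. */
    static boolean distinctIdFieldNames(List<IndexedType> types) {
        Set<String> seen = new HashSet<>();
        for (IndexedType t : types) {
            if (!seen.add(t.idFieldName)) {
                return false;
            }
        }
        return true;
    }

    /** Topmost superclass below Object; an assumed stand-in for the ORM root entity. */
    static Class<?> root(Class<?> type) {
        Class<?> top = type;
        while (top.getSuperclass() != null && top.getSuperclass() != Object.class) {
            top = top.getSuperclass();
        }
        return top;
    }

    /** True when all types resolve to the same root class. */
    static boolean sameHierarchy(List<IndexedType> types) {
        Set<Class<?>> roots = new HashSet<>();
        for (IndexedType t : types) {
            roots.add(root(t.type));
        }
        return roots.size() <= 1;
    }

    /** Single-term delete is considered safe if either condition rules out id ambiguity. */
    static boolean singleTermDeleteSafe(List<IndexedType> types) {
        return types.size() <= 1 || distinctIdFieldNames(types) || sameHierarchy(types);
    }
}
```

With these definitions, `Dog` and `Cat` sharing the field `id` pass the check, while `Dog` and `Invoice` sharing `id` do not.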
Sanne Grinovero, January 9, 2015 at 2:23 PM
The above PR is a related improvement, but it's not the same thing; I'll create a new JIRA for that.
Sanne Grinovero, January 7, 2015 at 8:21 PM (edited)
For 1. yes that would work but only on ORM, not on Infinispan.
It turns out it actually works all the time for Infinispan, at least I think so. It took me a while to think through all the possible cases, but I concur with Gustavo; see the comments on ISPN-5103. Infinispan also uses a Transformer interface, which includes a marker for the key type in the key term, so you can figure out which reverse Transformer needs to be applied. From there it's Id->Entry, and since a Cache is a Map there is no ambiguity.
There are limitations on different cache instances sharing the same index, but then again that's not something we encourage, as the Search engine is per-cache.
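To illustrate why a key-type marker removes the ambiguity, here is a small sketch. The marker characters and method names are made up for this example and should not be read as Infinispan's actual term format or Transformer API.

```java
final class KeyMarkerSketch {

    /** Encodes a cache key as an index term, prefixed with a marker for its type. */
    static String keyToTerm(Object key) {
        if (key instanceof String) {
            return "S:" + key;
        }
        if (key instanceof Integer) {
            return "I:" + key;
        }
        if (key instanceof Long) {
            return "L:" + key;
        }
        throw new IllegalArgumentException("No transformer for " + key.getClass());
    }

    /** Picks the reverse transformation from the marker and rebuilds the key. */
    static Object termToKey(String term) {
        String payload = term.substring(2);
        switch (term.charAt(0)) {
            case 'S':
                return payload;
            case 'I':
                return Integer.valueOf(payload);
            case 'L':
                return Long.valueOf(payload);
            default:
                throw new IllegalArgumentException("Unknown key marker in " + term);
        }
    }
}
```

Under this scheme the String key "42" and the Integer key 42 map to different terms, so a delete by term can only ever hit the entry stored under that exact key.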
"incompatible encodings" in 2
I was thinking that, at least for some known field bridge implementations that we ship, we could know for example that the output of a DateBridge using an encoding like "01-01-2011" would never be ambiguous with a second type sharing the same index which uses a Long as identifier. We could figure out such a matrix, but I'm not sure it's worth it, so I'd focus on case 1, which seems possibly quite common in practice.
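Such a matrix could look roughly like the following sketch. The enum, its constants and the regular expressions are hypothetical and only mirror the example in this comment (a dashed date versus a decimal Long); they do not describe the encodings of the bridges that actually ship.

```java
import java.util.List;
import java.util.regex.Pattern;

final class IdEncodingMatrixSketch {

    /** Hypothetical catalogue of id encodings with the shape of the terms they emit. */
    enum IdEncoding {
        DASHED_DATE(Pattern.compile("\\d{2}-\\d{2}-\\d{4}")),
        DECIMAL_LONG(Pattern.compile("-?\\d+")),
        FREE_STRING(Pattern.compile(".*", Pattern.DOTALL));

        final Pattern shape;

        IdEncoding(Pattern shape) {
            this.shape = shape;
        }

        boolean canProduce(String term) {
            return shape.matcher(term).matches();
        }
    }

    /** Hand-maintained rule: two different, fully constrained encodings never overlap. */
    static boolean neverCollide(IdEncoding a, IdEncoding b) {
        return a != b && a != IdEncoding.FREE_STRING && b != IdEncoding.FREE_STRING;
    }

    /** True when every pair of encodings sharing the index is known to be disjoint. */
    static boolean allDisjoint(List<IdEncoding> encodings) {
        for (int i = 0; i < encodings.size(); i++) {
            for (int j = i + 1; j < encodings.size(); j++) {
                if (!neverCollide(encodings.get(i), encodings.get(j))) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

The rule in `neverCollide` only holds because the two constrained shapes listed here happen to be disjoint; every new encoding added to the catalogue would need to be checked against the existing ones by hand, which is part of why the effort may not pay off.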
As a follow-up to the discussions on https://issues.jboss.org/browse/ISPN-5103: performance analysis in Infinispan highlighted that there is a significant performance benefit in enabling single-term deletes rather than query-based deletes.
We were already aware of this, and Hibernate Search applies the better performing single-term delete operations (and update operations) when safe, but the definition of "when safe" is very conservative.
Currently the optimisation is only enabled when both of these are true:
org.hibernate.search.indexes.impl.PropertiesParseHelper.isIndexMetadataComplete(Properties, WorkerBuildContext) returns true (the default but can be configured otherwise)
there is a single type in the index (the IndexManager owns only one entity type)
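Expressed as code, the current rule amounts to something like the sketch below. The class and method names are hypothetical; the boolean argument stands for the result of `isIndexMetadataComplete` and the set for the entity types owned by the IndexManager.

```java
import java.util.Set;

final class CurrentDeleteRuleSketch {

    /**
     * Mirrors the two conditions above: the index metadata is declared complete
     * and exactly one entity type lives in the index.
     */
    static boolean singleTermDeleteEnabled(boolean indexMetadataComplete,
                                           Set<Class<?>> typesInIndex) {
        return indexMetadataComplete && typesInIndex.size() == 1;
    }
}
```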
In reality even if the IndexManager were to contain multiple types, this would be safe as long as we can guarantee that no type has an ambiguity on the keyword used for the ID encoding.
Some cases in which this is actually possible to figure out:
there are multiple types but they are mapped in an inheritance mode on ORM, which would force them to have a unique ID across different instance types
all types sharing the index have field bridges on the primary ID which would necessarily result in incompatible encodings
The second case might be complex to hit in practice (or just complex to evaluate), but I think the first case is probably the most common reason for one to possibly have multiple types sharing the same index, so that would be a great enhancement.