Automatic reindexing on asymmetric association updates

Description

Let's imagine a model with two entity types A and B, pointing to each other through an association: A.b and B.a.
A is an indexed entity, and its indexed document includes some parts of B through the association A.b (e.g. there's a field b.c in A's indexed document).

When the user updates the association between A and B, we expect them to update both sides of the association: A.b and B.a. If the user only updates B.a, we know that the entity previously pointed to by B.a needs to be reindexed, but we don't know which entity that was, since B.a has already been updated (set to null or to another entity).

There are two problems with this limitation:

1. ORM allows updating only one side of the association (the owning side), which means users are likely to update only one side from time to time. When they do, Hibernate Search will silently "forget" to reindex. Best case, users will spot the problem, assume Search is buggy, and painstakingly change their application to update both sides of the association; worst case, they will not spot the problem and will ship a buggy application.
2. This effectively precludes us from allowing unidirectional associations, where A.b doesn't exist and is replaced with a query: if we did that, we wouldn't be able to reindex when the association is updated on the B side.
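To make the failure mode concrete, here is a minimal plain-Java sketch. It does not use any Hibernate ORM or Hibernate Search API: `A`, `B`, and `toReindexFromCurrentState` are hypothetical stand-ins, and the "resolver" only mimics the idea of walking the current state of B.a to decide what to reindex.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the indexed entity type A.
class A {
    B b; // inverse side; A's indexed document embeds b.c
}

// Hypothetical stand-in for the contained entity type B.
class B {
    A a;      // owning side of the association
    String c; // value that ends up as "b.c" in A's indexed document
}

class AsymmetricUpdateDemo {

    // Mimics a resolver that can only see the *current* state of B:
    // it walks B.a to find the A instances to reindex.
    static List<A> toReindexFromCurrentState(B b) {
        List<A> result = new ArrayList<>();
        if (b.a != null) {
            result.add(b.a);
        }
        return result;
    }

    public static void main(String[] args) {
        A a1 = new A();
        A a2 = new A();
        B b = new B();
        b.c = "some value";

        // Initial, consistent state: both sides point to each other.
        a1.b = b;
        b.a = a1;

        // Asymmetric update: only the owning side B.a is changed.
        b.a = a2;

        List<A> toReindex = toReindexFromCurrentState(b);
        System.out.println(toReindex.contains(a2)); // prints "true"
        System.out.println(toReindex.contains(a1)); // prints "false"
    }
}
```

In this sketch a1 is never scheduled for reindexing, even though its indexed document still contains b.c: the previous value of B.a is simply no longer reachable from the current state.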

In Search 5, we used to partially support such asymmetric association updates: if B was completely deleted and the association B.a was left untouched, we walked through that association and reindexed A. However, this only worked when the association was updated implicitly through deletion of B, and when the association was eager.

I think we should support asymmetric association updates. We would only have to store in the work plan, for each entity and each relevant association from that entity, the list of entities removed from associations:

  • When a *ToOne association is set to null, or simply changed, we'd use the entity update event to find the previous value of that association.

  • When a *ToMany association is updated or replaced, we'd use the collection update or replace event to find the entities removed from the association.

  • When an entity is deleted, we'd use the before delete event to load every relevant association.

This might be a bit heavy resource-wise, but we would only need to do that for associations from an "indexedEmbedded" or otherwise contained type to its containing type. There may be a way to optimize this.
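As a rough illustration of the bookkeeping described above, the work plan could hold something like the following. This is only a sketch under assumptions: `RemovedAssociationTracker` and its methods are hypothetical names, not existing Hibernate Search classes, and the old and new association values are assumed to be handed in by whatever ORM event listener turns out to be able to provide them.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical work plan component: for each entity and each relevant
// association, remembers the entities removed from that association.
class RemovedAssociationTracker {

    // entity -> association name -> entities removed from that association
    private final Map<Object, Map<String, Set<Object>>> removed = new IdentityHashMap<>();

    private static Set<Object> newIdentitySet() {
        return Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());
    }

    private Set<Object> removedSet(Object entity, String association) {
        return removed
                .computeIfAbsent(entity, e -> new HashMap<>())
                .computeIfAbsent(association, a -> newIdentitySet());
    }

    // *ToOne set to null or changed: the previous value was removed.
    void toOneChanged(Object entity, String association, Object oldValue, Object newValue) {
        if (oldValue != null && oldValue != newValue) {
            removedSet(entity, association).add(oldValue);
        }
    }

    // *ToMany updated or replaced: every old element missing from the new content was removed.
    void toManyChanged(Object entity, String association,
            Collection<?> oldElements, Collection<?> newElements) {
        Set<Object> stillPresent = newIdentitySet();
        stillPresent.addAll(newElements);
        for (Object element : oldElements) {
            if (!stillPresent.contains(element)) {
                removedSet(entity, association).add(element);
            }
        }
    }

    // Entity deleted: everything loaded from the association counts as removed.
    void deleted(Object entity, String association, Collection<?> loadedElements) {
        removedSet(entity, association).addAll(loadedElements);
    }

    // All entities that lost an association and therefore need reindexing.
    Set<Object> entitiesToReindex() {
        Set<Object> result = newIdentitySet();
        for (Map<String, Set<Object>> byAssociation : removed.values()) {
            for (Set<Object> entities : byAssociation.values()) {
                result.addAll(entities);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Object a1 = new Object(), a2 = new Object(), a3 = new Object();
        Object b1 = new Object(), b2 = new Object();
        RemovedAssociationTracker tracker = new RemovedAssociationTracker();
        tracker.toOneChanged(b1, "a", a1, a2);
        tracker.toManyChanged(b2, "as", Arrays.asList(a2, a3), Arrays.asList(a3));
        System.out.println(tracker.entitiesToReindex().size()); // prints 2 (a1 and a2)
    }
}
```

The sketch compares entities by identity, on the assumption that everything happens within a single session. It only covers the storage side; whether the old values can actually be extracted from the events is still the open question.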

The big question is, of course, would extracting this information be easy, or even possible at all?

Environment

None

Status

Assignee

Unassigned

Reporter

Yoann Rodière

Labels

None

Suitable for new contributors

None

Pull Request

None

Feedback Requested

None

Components

Fix versions

Priority

Major