Add means to disable automatic indexing temporarily (per session)

Description

Automatic indexing is fine for many purposes but in some cases it should be possible to temporarily disable it, e.g. when importing data.

Consider our case:

We have a somewhat complex index which uses @IndexedEmbedded quite a lot, mostly for text entities and for fairly static entities that need to contribute to the index.

Example:

As you can see, articles have a list of texts as well as a status which itself has a list of texts (descriptions).

During imports, we only load the articles and texts, while the status is only referenced by its id. Even if we loaded it, we would not need its texts for the import.

Thus, at the end of the import, the session would contain all articles and article texts that are needed. When the articles are indexed, the indexer will also need the status and status texts and load them.

So far, so good, but here's the problem:

In some cases, we need to run the import in one transaction. For performance reasons we flush and clear the session from time to time, since once imported, the articles and the texts are no longer needed.

At the end of the transaction, however, the indexer tries to load the status and status texts that are referenced by the articles. Because of the flush and clear operations, the articles that are to be indexed are no longer attached to the session, so this results in lazy initialization exceptions.

In this case it would be better to disable automatic indexing during the import and manually rebuild the index afterwards.

I know that there was a similar request for Hibernate Search 3.x (HSEARCH-387), which was rejected for lack of a clean way to accomplish this.

However, with Hibernate 4, there might be some way.

A few thoughts on how that could work:

  • Add some property to the session that could be set via EntityManager.setProperty(...)
    a. Use that property to disable the entity listener or indexer or
    b. pass those properties to the EntityIndexListener and let it decide whether to add/update the entity or skip indexing (this might be more flexible but perform worse)
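
To make option (b) more concrete, here is a minimal sketch of the decision such a listener could make. It is plain Java and does not use any Hibernate Search API: the class name, the property key and the method are all hypothetical, and the map merely stands in for what EntityManager.getProperties() would return for the current session.

```java
import java.util.Map;

/**
 * Sketch of option (b): the listener consults per-session properties
 * before queuing an entity for indexing.
 * All names here are hypothetical; none of this is existing Hibernate Search API.
 */
final class IndexingToggleSketch {

    /** Hypothetical property key; Hibernate Search defines no such key. */
    static final String AUTO_INDEXING_ENABLED = "example.search.auto_indexing.enabled";

    private IndexingToggleSketch() {
    }

    /**
     * Decides whether a changed entity should be queued for indexing.
     * The map stands in for the session properties (assumption).
     * Indexing stays enabled unless the property is explicitly set to
     * Boolean.FALSE or the string "false" (case-insensitive).
     */
    static boolean shouldIndex(Map<String, Object> sessionProperties) {
        Object value = sessionProperties.get(AUTO_INDEXING_ENABLED);
        if (value == null) {
            return true; // property not set: automatic indexing stays on
        }
        return !"false".equalsIgnoreCase(value.toString());
    }
}
```

An import would then call something like em.setProperty(IndexingToggleSketch.AUTO_INDEXING_ENABLED, false) before it starts and rebuild the index manually afterwards. Whether the persistence provider keeps such a custom property visible to the listener is an assumption that would need to be verified.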

Currently, there might be a few workarounds:

  • Preload all entities needed for the indexing operation. This might be tedious depending on the complexity and might require much more memory.

  • Use a transient property and examine it in the EntityIndexListener.

  • Break up imports into several transactions, but that might not always be an option.

Environment

JBoss 7.1
Hibernate 4.0

Assignee

Yoann Rodière

Reporter

Thomas Göttlich

Suitable for new contributors

None

Pull Request

None

Feedback Requested

None

Priority

Major