Improve performance of MassIndexer through Eager fetching

Description

The MassIndexer loads every instance of a given entity type from the database. It then indexes each instance, traversing its tree of @IndexedEmbedded objects. That traversal executes a lot of additional queries. These queries are avoidable, since we know beforehand that the data is needed for the indexing operation. It would therefore make sense to eagerly fetch all associations that are marked with @IndexedEmbedded. This should significantly speed up the MassIndexer.
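To make the cost concrete, here is a toy model of the statement counts involved. It is only a sketch: it does not call Hibernate, the class and method names (FetchQueryCount, FakeDatabase and so on) are invented for illustration, and it ignores the batching that the real MassIndexer performs.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of query counts; it does not call Hibernate. */
public class FetchQueryCount {

    /** Counts the statements a hypothetical database receives. */
    static final class FakeDatabase {
        int queries = 0;

        // One statement returning n root rows (ids stand in for entities).
        List<Integer> loadRoots(int n) {
            queries++;
            List<Integer> ids = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                ids.add(i);
            }
            return ids;
        }

        // One statement per lazily initialized association.
        String loadAssociation(int rootId) {
            queries++;
            return "foreign-" + rootId;
        }

        // One statement that joins the association into the root query.
        List<String> loadRootsWithJoin(int n) {
            queries++;
            List<String> rows = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                rows.add(i + ":foreign-" + i);
            }
            return rows;
        }
    }

    /** Lazy loading: one root query plus one query per entity. */
    public static int lazyQueryCount(int entities) {
        FakeDatabase db = new FakeDatabase();
        for (int id : db.loadRoots(entities)) {
            db.loadAssociation(id); // triggered while traversing the embedded object
        }
        return db.queries;
    }

    /** Join fetching: the association arrives with the root query. */
    public static int joinQueryCount(int entities) {
        FakeDatabase db = new FakeDatabase();
        db.loadRootsWithJoin(entities);
        return db.queries;
    }

    public static void main(String[] args) {
        System.out.println("lazy: " + lazyQueryCount(1000));
        System.out.println("join: " + joinQueryCount(1000));
    }
}
```

In this model the lazy path grows linearly with the number of entities, while the joined path stays at a single statement. Real numbers depend on batch sizes and caching.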

Activity

Yoann Rodière, March 8, 2022 at 11:34 AM

Note that this would mostly make sense only for single-valued associations, since eagerly fetching multi-valued associations can itself introduce performance issues (combinatorial explosion in the result set).
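A rough way to see the explosion: when several sibling collections of the same root are join-fetched in one query, the rows returned for that root multiply. The sketch below only does that arithmetic. The class name JoinFetchRowCount is invented for illustration, and it assumes left outer joins, so an empty collection still contributes one row.

```java
/** Arithmetic sketch of result-set size under join fetching; no database involved. */
public class JoinFetchRowCount {

    /**
     * Rows returned for one root entity when every listed collection is
     * join-fetched in the same query: the product of the collection sizes.
     * An empty collection still yields one row under a left outer join.
     */
    public static long rowsPerRoot(int... collectionSizes) {
        long rows = 1;
        for (int size : collectionSizes) {
            rows *= Math.max(size, 1);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(rowsPerRoot());          // no collections joined
        System.out.println(rowsPerRoot(10));        // one collection of 10 elements
        System.out.println(rowsPerRoot(10, 20, 5)); // three sibling collections
    }
}
```

A single-valued association adds columns but never multiplies rows, which is why it is the safer candidate for eager fetching.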

Sanne Grinovero, November 4, 2014 at 4:24 PM

Correct, the MassIndexer would need some new configuration options.
A fetch profile would be useful for the second phase, but an easy improvement I had in mind is to let the first phase, which currently just loads the stream of IDs, use a custom named query, so people could use a "join fetch" in their query if they want to.

That would only require you to declare the named query and to specify its name on the MassIndexer. The one small API complication is that the MassIndexer can run for multiple types, so you would need to specify which type each named query applies to.
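The per-type bookkeeping could look roughly like the following. This is purely a hypothetical sketch: NamedQueryOverrides, forType and resolve are invented names, not an existing Hibernate Search API, and Book and Author are placeholder entity types.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical per-type configuration; not an existing Hibernate Search API. */
public class NamedQueryOverrides {

    // Placeholder entity types, used only by the demo in main().
    static class Book {}
    static class Author {}

    private final Map<Class<?>, String> queryNameByType = new HashMap<>();

    /** Registers the named query to use when loading the given type. */
    public NamedQueryOverrides forType(Class<?> type, String namedQuery) {
        queryNameByType.put(type, namedQuery);
        return this;
    }

    /** Returns the registered name, or the fallback when the type has no override. */
    public String resolve(Class<?> type, String fallback) {
        return queryNameByType.getOrDefault(type, fallback);
    }

    public static void main(String[] args) {
        NamedQueryOverrides overrides = new NamedQueryOverrides()
                .forType(Book.class, "Book.withJoinFetch"); // hypothetical query name
        System.out.println(overrides.resolve(Book.class, "default"));
        System.out.println(overrides.resolve(Author.class, "default"));
    }
}
```

Types without an override fall back to the default loading query, so only the entities that benefit from a "join fetch" need any extra declaration.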

Marc Schipperheyn, November 4, 2014 at 3:52 PM

I'm not sure I'd be happy with that level of fine-tuning. It would add a lot of tuning and annotations in many areas.

Adrian Meredith, November 4, 2014 at 3:36 PM

To use a fetch profile it has to be activated, right? The MassIndexer doesn't know it exists, so it won't use it. Ideally we would need a new API call in the MassIndexer builder,
e.g.
.usingFetchProfile("search")

Marc Schipperheyn, May 3, 2010 at 7:26 PM
Edited

After reviewing fetch profiles I was initially very enthusiastic. They seem to be the answer to some of the infuriating issues with n+1.
However, on review they don't work, or at least not as I would expect. Running the MassIndexer on an entity

@FetchProfiles({
    @FetchProfile(name = "search", fetchOverrides = {
        @FetchProfile.FetchOverride(entity = MyClass.class, association = "foreign", mode = FetchMode.JOIN)
    })
})
@Entity
@Table
public class MyClass implements IOffer, Serializable {

with a method

@OneToOne(optional = false, cascade = CascadeType.ALL, fetch = FetchType.LAZY)
@JoinColumn(name = "FK_ForeignID", nullable = false, updatable = false)
@IndexedEmbedded
public Foreign getForeign() {
    return foreign;
}

I see a separate query being executed for getForeign for each MyClass. This does not happen when I map getForeign as a FetchType.EAGER association.

This doesn't seem to happen only with Search. I also see this behaviour with other types of Hibernate Core queries. I also don't really understand why there is no FetchMode for inner joins.

Anyway, this doesn't really concern Hibernate Search, other than that FetchProfile doesn't appear to work with it.

Details

Created May 2, 2010 at 6:00 PM
Updated January 26, 2024 at 3:52 PM
