Fetch only fields needed for indexing during index build
Description
follows up on
Activity

Yoann Rodière, April 20, 2023 at 1:36 PM
Not at the moment, no. We do update issues when we work on them.

masrawi, April 20, 2023 at 7:48 AM
Is there any update on this issue?

Yoann Rodière, February 1, 2022 at 3:50 PM (edited)
That being said, with bytecode enhancement enabled, I believe it’s possible to mark even basic fields (e.g. a string) as lazy, in which case the fetch graph may allow us to get rid of basic fields that we don’t need in the loading query.
I just checked: that does not work. Fetch graphs do not affect the loading of basic attributes or formulas, except if they are lazy and part of a non-default LazyGroup (in which case they wouldn’t be loaded in the current implementation either, so it doesn’t matter).
So, if we were to automatically apply entity graphs, it would only affect associations.

Yoann Rodière, February 1, 2022 at 3:48 PM
We rely heavily on the @Formula annotation to add subqueries to entities, for example, and we don’t need those for reindexing.
Right. That definitely won’t be addressed by fetch graphs (though that’s unfortunate). I’ll reopen your ticket HSEARCH-4471; we’ll need a different solution for you.

masrawi, February 1, 2022 at 1:56 PM (edited)
We rely heavily on the @Formula annotation to add subqueries to entities, for example, and we don’t need those for reindexing.
Details
Assignee: Unassigned
Reporter: Magnus Hovén
Components
Fix versions
Priority: Major

In cases where collections must be fetched eagerly in application code but none of those collections are indexed, it would save a lot of indexing time if only the columns and collections needed for indexing were fetched.
In my case, indexing 530 000 entities, each with 40 columns and 7 collections, takes around 1 hour when collections are eagerly fetched, but only 3 minutes when they are not fetched.
If it were possible to fetch only the needed columns and collections, index build performance would improve a lot in these specific cases.
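
For scale, the figures reported above work out to roughly 147 entities per second with eager collection fetching versus roughly 2,944 entities per second without it, a speedup of about 20x. The following is a minimal Java sketch of that arithmetic only. The class and method names are hypothetical and not part of Hibernate Search, and the inputs are the approximate timings quoted in this description ("around 1 hour" and "3 minutes"), so the results are estimates.

```java
import java.time.Duration;

// Hypothetical helper, for illustration only: not part of Hibernate Search.
public class IndexingThroughput {

    // Average number of entities indexed per second over the elapsed time.
    static double entitiesPerSecond(long entityCount, Duration elapsed) {
        return (double) entityCount / elapsed.getSeconds();
    }

    // How many times faster the "fast" run is compared to the "slow" run.
    static double speedup(Duration slow, Duration fast) {
        return (double) slow.getSeconds() / fast.getSeconds();
    }

    public static void main(String[] args) {
        long entities = 530_000L;                         // entity count from the report
        Duration eagerFetch = Duration.ofHours(1);        // "around 1 hour" (approximate)
        Duration withoutFetch = Duration.ofMinutes(3);    // "only 3 minutes"

        System.out.printf("With eager collections:    %.0f entities/s%n",
                entitiesPerSecond(entities, eagerFetch));
        System.out.printf("Without collection fetch:  %.0f entities/s%n",
                entitiesPerSecond(entities, withoutFetch));
        System.out.printf("Speedup:                   %.0fx%n",
                speedup(eagerFetch, withoutFetch));
    }
}
```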