JSR-352: Split job parameter fetchSize into 2 parameters

Description

The current mass indexer has 2 separate size parameters:

  • objectLoadingBatchSize: Sets the batch size used to load the root entities.

  • idFetchSize: Specifies the fetch size to be used when loading primary keys of objects to be indexed.

In JSR-352, we use a single value (fetchSize) for both idFetchSize and objectLoadingBatchSize. See:

  • org.hibernate.search.jsr352.massindexing.impl.steps.lucene.PartitionMapper.buildScrollableResults(StatelessSession, Session, Class<?>, Set<Criterion>)

  • org.hibernate.search.jsr352.massindexing.impl.steps.lucene.EntityReader.buildScrollUsingCriteria(StatelessSession, PartitionBound, Object, JobContextData)

We may want to split them in the future.
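As a rough illustration of what such a split could look like, here is a minimal sketch that resolves two sizes from job parameters. It is not the actual implementation: the class SplitFetchSize, the nested Sizes type, and the fallback to the existing fetchSize value are all assumptions made for this example, and the two parameter names are simply borrowed from the mass indexer parameters listed above. It relies only on the fact that JSR-352 job parameters are passed as java.util.Properties.

```java
import java.util.Properties;

// Hypothetical sketch: none of these names come from the Hibernate Search code base.
public class SplitFetchSize {

    /** Immutable pair of sizes resolved from the job parameters. */
    static final class Sizes {
        final int idFetchSize;
        final int objectLoadingBatchSize;

        Sizes(int idFetchSize, int objectLoadingBatchSize) {
            this.idFetchSize = idFetchSize;
            this.objectLoadingBatchSize = objectLoadingBatchSize;
        }
    }

    /**
     * Reads the two sizes from the job parameters.
     * Assumption: when a specific parameter is absent, the single
     * "fetchSize" value is used for it, as it is today for both.
     */
    static Sizes resolve(Properties jobParameters) {
        String legacy = jobParameters.getProperty("fetchSize");
        int idFetchSize = parse("idFetchSize", jobParameters.getProperty("idFetchSize"), legacy);
        int objectLoadingBatchSize = parse("objectLoadingBatchSize",
                jobParameters.getProperty("objectLoadingBatchSize"), legacy);
        return new Sizes(idFetchSize, objectLoadingBatchSize);
    }

    private static int parse(String name, String specific, String legacy) {
        // Prefer the dedicated parameter, otherwise fall back to the shared one.
        String raw = specific != null ? specific : legacy;
        if (raw == null) {
            throw new IllegalArgumentException("No value for '" + name + "' or 'fetchSize'");
        }
        int value = Integer.parseInt(raw.trim());
        if (value <= 0) {
            throw new IllegalArgumentException("'" + name + "' must be positive, got " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        Properties params = new Properties();
        params.setProperty("fetchSize", "100");     // shared value
        params.setProperty("idFetchSize", "1000");  // overrides the shared value for IDs only
        Sizes sizes = resolve(params);
        System.out.println(sizes.idFetchSize + " " + sizes.objectLoadingBatchSize); // prints "1000 100"
    }
}
```

Falling back to the shared value is only one possible design; dropping fetchSize entirely and giving each parameter its own default would work just as well.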

See also this comment: https://github.com/yrodiere/hibernate-search/pull/21#issuecomment-309377142

Activity

Yoann Rodière, July 17, 2017 at 8:44 AM

Hmm ok. But it's a shame to introduce the "partition" keyword while this parameter could be used in a context where partitions are irrelevant (e.g. with an HQL "scope"). What about idLoadingFetchSize/objectLoadingFetchSize then?

Mincong Huang, July 15, 2017 at 8:54 AM (edited)

Agree with "idFetchSize". As for "entityFetchSize", I don't like it because "entity" has a special meaning in our JSR-352 implementation: it means entity level. Here's the list of all the levels:

  • Job level: the highest level. It means a job execution, which consists of several entity types to index.

  • Entity level: the 2nd level. It means the resources dedicated to indexing the given entity type: an entity-level progress monitor, and a set of partitions for indexing the entities of this entity type.

  • Partition level: the 3rd level. A partition indexes a subset of entities of a given entity type.

  • Chunk level: the lowest level. Items processed between checkpoints are referred to as a "chunk". (§8.2.1 Chunk, Spec 1.0)

I think "partitionFetchSize" is preferable here.

Yoann Rodière, July 13, 2017 at 8:17 AM

Might be a good idea. For naming, what about "idFetchSize" and "entityFetchSize"?

Fixed

Created June 17, 2017 at 7:56 AM
Updated December 3, 2024 at 11:53 AM
Resolved July 25, 2017 at 3:48 PM