JSR-352: Each partition executes on the whole data set

Description

The integration test org.hibernate.search.test.integration.jsr352.massindexing.RestartIT currently reports incorrect indexing progress. The test aims to index 5,000 entities across two job executions: one normal start and one restart. The first execution is interrupted when the 2,500th entity is read; the interruption is controlled by the Byteman script JobInterruptor.btm. After the second job execution starts, the reported progress does not stop at 5,000 but keeps growing past the total:

11:38:29,710 INFO [org.hibernate.search.jsr352.massindexing.impl.steps.lucene.ProgressAggregator] (Batch Thread - 1) HSEARCH500010: org.hibernate.search.test.integration.jsr352.massindexing.test.common.Message: 3000/5000 works processed (60.00%).
11:38:29,710 INFO [org.hibernate.search.jsr352.massindexing.impl.steps.lucene.ProgressAggregator] (Batch Thread - 1) HSEARCH500010: org.hibernate.search.test.integration.jsr352.massindexing.test.common.Message: 5800/5000 works processed (116.00%).
...
11:38:43,258 INFO [org.hibernate.search.jsr352.massindexing.impl.steps.lucene.ProgressAggregator] (Batch Thread - 4) HSEARCH500010: org.hibernate.search.test.integration.jsr352.massindexing.test.common.Message: 105001/5000 works processed (2100.02%).
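For illustration only, the sketch below shows how the scenario named in the title would inflate such a counter: if every partition iterates over the whole data set instead of its own slice, the aggregated count becomes roughly partitions × total. This is a plain-Java simulation, not the Hibernate Search code; the class name, the partition count of 4, and the even slicing are assumptions chosen for the example.

```java
import java.util.Locale;

// Hypothetical simulation; none of these names exist in Hibernate Search.
final class PartitionOvercountSketch {

    /** Works counted when each partition handles only its own disjoint slice. */
    static long countWithDisjointSlices(int totalEntities, int partitions) {
        long processed = 0;
        int sliceSize = (totalEntities + partitions - 1) / partitions; // ceiling division
        for (int p = 0; p < partitions; p++) {
            int from = Math.min(p * sliceSize, totalEntities);
            int to = Math.min(from + sliceSize, totalEntities);
            processed += to - from;
        }
        return processed;
    }

    /** Works counted when every partition (wrongly) reads the whole data set. */
    static long countWithWholeDataSetPerPartition(int totalEntities, int partitions) {
        return (long) totalEntities * partitions;
    }

    /** Formats progress the way the log lines do: processed / total as a percentage. */
    static String percent(long processed, int total) {
        return String.format(Locale.ROOT, "%.2f%%", 100.0 * processed / total);
    }

    public static void main(String[] args) {
        int total = 5000;
        int partitions = 4; // assumed value, for illustration
        long ok = countWithDisjointSlices(total, partitions);
        long inflated = countWithWholeDataSetPerPartition(total, partitions);
        System.out.println("disjoint slices:   " + ok + "/" + total + " (" + percent(ok, total) + ")");
        System.out.println("whole set per partition: " + inflated + "/" + total + " (" + percent(inflated, total) + ")");
    }
}
```

Any reported value above 100% means more works were counted than there are entities; the simulation does not tell whether the extra works were really indexed or only counted.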

I'm not sure whether the progress monitor is wrong, whether the job has actually indexed 105,001 entities, or both. When fixing this issue, we need to:

  • Investigate the origin of the issue and fix it.

  • Check whether other JSR-352 tests that involve a restart are facing the same problem.

  • Provide more precise assertions for this test and any other related test.

  • Check whether this ticket is a duplicate of "JSR-352: Implement checkpoints so as to recover correctly from failures".
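The more precise assertions asked for above could be expressed as two invariants: the processed count never exceeds the total, and after the restart the count equals the total exactly. Below is a minimal plain-Java sketch. `ProgressAssertions` and its methods are hypothetical helpers, not part of Hibernate Search or the JSR-352 API, and how the test obtains the counts (for example by querying the index after each execution) is left open.

```java
import java.util.List;

// Hypothetical helper; the names are illustrative assumptions.
final class ProgressAssertions {

    /** Fails if any reported count is greater than the number of entities. */
    static void assertNeverExceedsTotal(List<Long> reportedCounts, long total) {
        for (long count : reportedCounts) {
            if (count > total) {
                throw new AssertionError(
                        "Processed " + count + " works, but only " + total + " entities exist");
            }
        }
    }

    /**
     * Fails unless the first execution stopped before completion
     * and the restarted execution ended at exactly the total.
     */
    static void assertRestartCompletes(long afterFirstExecution, long afterRestart, long total) {
        if (afterFirstExecution >= total) {
            throw new AssertionError(
                    "First execution was expected to be interrupted, but reached " + afterFirstExecution);
        }
        if (afterRestart != total) {
            throw new AssertionError(
                    "Expected exactly " + total + " indexed entities after restart, got " + afterRestart);
        }
    }
}
```

With the figures from the log, `assertNeverExceedsTotal` would already fail on the 5800/5000 report, which is the kind of early signal the current test lacks.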

Environment

None

Assignee

Yoann Rodière

Reporter

Mincong Huang

Labels

None

Suitable for new contributors

None

Feedback Requested

None

Priority

Critical