Uploaded image for project: 'Hibernate Search'
  1. HSEARCH-2761

Elasticsearch purges may fail due to concurrent (or non-refreshed) writes

    Details

      Description

      Since the beginning of the week (approximately) we're having transient failures in IndexingActionInterceptorTest. The logs contain the following error:

      11:01:50,429 (main) DEBUG request:104 - ES client issued a ResponseException - not necessarily a problem
      org.elasticsearch.client.ResponseException: POST http://localhost:9200/org.hibernate.search.test.interceptor.blog/org.hibernate.search.test.interceptor.Article,org.hibernate.search.test
      .interceptor.TotalArticle/_delete_by_query/: HTTP/1.1 409 Conflict
      {"took":3,"timed_out":false,"total":2,"deleted":1,"batches":1,"version_conflicts":1,"noops":0,"retries":{"bulk":0,"search":0},"throttled_millis":0,"requests_per_second":-1.0,"throttled_
      until_millis":0,"failures":[{"index":"org.hibernate.search.test.interceptor.blog","type":"org.hibernate.search.test.interceptor.TotalArticle","id":"3","cause":{"type":"version_conflict_
      engine_exception","reason":"[org.hibernate.search.test.interceptor.TotalArticle][3]: version conflict, current version [3] is different than the one provided [2]","index_uuid":"-oMMjrTs
      RHOY64GKiID4Fg","shard":"0","index":"org.hibernate.search.test.interceptor.blog"},"status":409}]}
              at org.elasticsearch.client.RestClient$1.completed(RestClient.java:354)
              at org.elasticsearch.client.RestClient$1.completed(RestClient.java:343)
              at org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:119)
              at org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:177)
              at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:436)
              at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:326)
              at org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
              at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
              at org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
              at org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
              at org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
              at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
              at org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
              at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
              at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
              at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:588)
              at java.lang.Thread.run(Thread.java:745)
      11:01:50,430 (main) ERROR LogErrorHandler:69 - HSEARCH000058: Exception occurred org.hibernate.search.exception.SearchException: HSEARCH400007: Elasticsearch request failed.
       Request:
      ========
      Method: POST
      Path: /org.hibernate.search.test.interceptor.blog/org.hibernate.search.test.interceptor.Article,org.hibernate.search.test.interceptor.TotalArticle/_delete_by_query/
      Data:
      {
        "query": {
          "match_all": {}
        }
      }
      
      Response:
      =========
      Status: 409 Conflict
      Error message: null
      Cluster name: null
      Cluster status: null
      
      
      Primary Failure:
              Entity org.hibernate.search.test.interceptor.Article  Id null  Work Type  org.hibernate.search.backend.PurgeAllLuceneWork
      Subsequent failures:
              Entity org.hibernate.search.test.interceptor.Article  Id null  Work Type  org.hibernate.search.backend.PurgeAllLuceneWork
              Entity org.hibernate.search.test.interceptor.Blog  Id null  Work Type  org.hibernate.search.backend.PurgeAllLuceneWork
      
      org.hibernate.search.exception.SearchException: HSEARCH400007: Elasticsearch request failed.
       Request:
      ========
      Method: POST
      Path: /org.hibernate.search.test.interceptor.blog/org.hibernate.search.test.interceptor.Article,org.hibernate.search.test.interceptor.TotalArticle/_delete_by_query/
      Data:
      {
        "query": {
          "match_all": {}
        }
      }
      Response:
      =========
      Status: 409 Conflict
      Error message: null
      Cluster name: null
      Cluster status: null
      
      
              at org.hibernate.search.elasticsearch.work.impl.DefaultElasticsearchRequestSuccessAssessor.checkSuccess(DefaultElasticsearchRequestSuccessAssessor.java:98)
              at org.hibernate.search.elasticsearch.work.impl.SimpleElasticsearchWork.execute(SimpleElasticsearchWork.java:64)
              at org.hibernate.search.elasticsearch.processor.impl.ElasticsearchWorkProcessor.executeUnsafe(ElasticsearchWorkProcessor.java:195)
              at org.hibernate.search.elasticsearch.processor.impl.ElasticsearchWorkProcessor.executeSafe(ElasticsearchWorkProcessor.java:159)
              at org.hibernate.search.elasticsearch.processor.impl.ElasticsearchWorkProcessor.executeSyncSafe(ElasticsearchWorkProcessor.java:105)
              at org.hibernate.search.elasticsearch.impl.ElasticsearchIndexManager.performOperations(ElasticsearchIndexManager.java:442)
              at org.hibernate.search.backend.impl.LocalBackendQueueProcessor.applyWork(LocalBackendQueueProcessor.java:58)
              at org.hibernate.search.backend.impl.WorkQueuePerIndexSplitter.commitOperations(WorkQueuePerIndexSplitter.java:66)
              at org.hibernate.search.backend.impl.TransactionalOperationDispatcher.dispatch(TransactionalOperationDispatcher.java:71)
              at org.hibernate.search.backend.impl.BatchedQueueingProcessor.performWorks(BatchedQueueingProcessor.java:72)
              at org.hibernate.search.backend.impl.PostTransactionWorkQueueSynchronization.afterCompletion(PostTransactionWorkQueueSynchronization.java:76)
              at org.hibernate.search.backend.impl.EventSourceTransactionContext$DelegateToSynchronizationOnAfterTx.doAfterTransactionCompletion(EventSourceTransactionContext.java:188)
              at org.hibernate.engine.spi.ActionQueue$AfterTransactionCompletionProcessQueue.afterTransactionCompletion(ActionQueue.java:964)
              at org.hibernate.engine.spi.ActionQueue.afterTransactionCompletion(ActionQueue.java:498)
              at org.hibernate.internal.SessionImpl.afterTransactionCompletion(SessionImpl.java:2432)
              at org.hibernate.engine.jdbc.internal.JdbcCoordinatorImpl.afterTransactionCompletion(JdbcCoordinatorImpl.java:473)
              at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl.afterCompletionCallback(JdbcResourceLocalTransactionCoordinatorImpl.java:166)
              at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl.access$200(JdbcResourceLocalTransactionCoordinatorImpl.java:38)
              at org.hibernate.resource.transaction.backend.jdbc.internal.JdbcResourceLocalTransactionCoordinatorImpl$TransactionDriverControlImpl.commit(JdbcResourceLocalTransactionCoordinatorImpl.java:222)
              at org.hibernate.engine.transaction.internal.TransactionImpl.commit(TransactionImpl.java:68)
              at org.hibernate.search.test.interceptor.IndexingActionInterceptorTest.testInterceptorWithMassIndexer(IndexingActionInterceptorTest.java:141)
              at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
              at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
              at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
              at java.lang.reflect.Method.invoke(Method.java:498)
              at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
              at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
              at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
              at org.hibernate.testing.junit4.ExtendedFrameworkMethod.invokeExplosively(ExtendedFrameworkMethod.java:45)
              at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
              at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
              at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
              at org.hibernate.testing.junit4.FailureExpectedHandler.evaluate(FailureExpectedHandler.java:41)
              at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
              at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
              at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
              at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
              at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
              at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
              at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
              at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
              at org.hibernate.testing.junit4.BeforeClassCallbackHandler.evaluate(BeforeClassCallbackHandler.java:26)
              at org.hibernate.testing.junit4.AfterClassCallbackHandler.evaluate(AfterClassCallbackHandler.java:25)
              at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
              at org.junit.runners.Suite.runChild(Suite.java:128)
              at org.junit.runners.Suite.runChild(Suite.java:27)
              at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
              at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
              at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
              at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
              at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
              at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
              at org.junit.runner.JUnitCore.run(JUnitCore.java:137)
              at org.junit.runner.JUnitCore.run(JUnitCore.java:115)
              at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.createRequestAndRun(JUnitCoreWrapper.java:108)
              at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.executeEager(JUnitCoreWrapper.java:78)
              at org.apache.maven.surefire.junitcore.JUnitCoreWrapper.execute(JUnitCoreWrapper.java:54)
              at org.apache.maven.surefire.junitcore.JUnitCoreProvider.invoke(JUnitCoreProvider.java:144)
              at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:203)
              at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:155)
              at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:103)
      

      See for instance http://ci.hibernate.org/job/hibernate-search-master/988/

      I suspect it's a pre-existing bug, but we only see it now because of https://github.com/hibernate/hibernate-search/pull/1432, which had the effect of not executing subsequent works in a changeset when one work fails (while previously we considered each work independent from the others). (Note that this was still the right choice, since it aligned the behavior on embedded Lucene).

      We should investigate why the delete by query fails in the first place.

        Attachments

          Issue links

            Activity

              People

              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: