Reproduce transaction timeouts during mass indexing

Description

If the mass indexer producer thread runs longer than the configured JTA transaction timeout, a timeout exception is raised. A reproducer shall be created as a basis for investigating workarounds.

Environment

None

Activity

Gunnar Morling
January 29, 2015, 9:30 PM

I can reproduce the timeout exception now (still a bit ad hoc, but it could be automated), but I came across something curious that I'd like to discuss with you.

I'm running with one producer and one consumer (performing several "takes" from the queue), and I meant to set it up so that the timeout exception would only occur on the producer side. But the transactions on both sides actually time out: on the producer side (as intended) and on the consumer side.

The latter is caused by the fact that the consumer (IdentifierConsumerDocumentProducer) runs all its takes in one single transaction. I would have expected it to use one transaction per take. Is this actually intended? Depending on the data size, that may be a very long-running transaction (btw. we also never clear the session), so using one TX per reasonably sized object batch (say 100 or 1000 items) seems more reasonable IMO. That'd be one step towards avoiding the timeout exceptions (as users can control this through the batch size), so we'd only have to deal with the exception on the producer side. WDYT?

Either way, with the current set-up I cannot enforce a timeout exception only on the consumer side; it will always occur on both the producer and the consumer side.
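
To illustrate the "one TX per take" idea from the comment above, here is a minimal sketch in plain Java. It is not the Hibernate Search code: `TxRunner`, `CountingTxRunner`, `consume` and `produce` are hypothetical names made up for this example, the "transaction" is only a counter rather than a real JTA transaction, and using an empty list as the end-of-stream marker is an assumption of the sketch. It only shows that the number of transactions, and hence the duration of each one, follows from the batch size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class BatchedConsumerSketch {

    /** Hypothetical stand-in for a transaction manager; NOT the JTA or Hibernate Search API. */
    public interface TxRunner {
        void inTransaction(Runnable work);
    }

    /** Counts how many transactions were opened, so the effect of batching is visible. */
    public static final class CountingTxRunner implements TxRunner {
        public int transactions = 0;

        @Override
        public void inTransaction(Runnable work) {
            transactions++; // a real implementation would begin a TX here
            work.run();     // ... and commit (or roll back) afterwards
        }
    }

    /**
     * Drains the queue, handling each batch of identifiers in its own transaction.
     * An empty list is the end-of-stream marker (an assumption of this sketch).
     * Returns the number of takes that carried data.
     */
    public static int consume(BlockingQueue<List<Long>> queue, TxRunner tx, List<Long> indexed) {
        int takes = 0;
        try {
            while (true) {
                List<Long> batch = queue.take();
                if (batch.isEmpty()) {
                    return takes;
                }
                takes++;
                // One transaction per take, instead of one transaction around the whole loop.
                tx.inTransaction(() -> indexed.addAll(batch));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return takes;
        }
    }

    /** Splits the ids 1..total into batches of at most batchSize, followed by the end marker. */
    public static BlockingQueue<List<Long>> produce(int total, int batchSize) {
        BlockingQueue<List<Long>> queue = new LinkedBlockingQueue<>();
        List<Long> current = new ArrayList<>();
        for (long id = 1; id <= total; id++) {
            current.add(id);
            if (current.size() == batchSize) {
                queue.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            queue.add(current);
        }
        queue.add(new ArrayList<>()); // end-of-stream marker
        return queue;
    }

    public static void main(String[] args) {
        CountingTxRunner tx = new CountingTxRunner();
        List<Long> indexed = new ArrayList<>();
        int takes = consume(produce(250, 100), tx, indexed);
        System.out.println(takes + " takes, " + tx.transactions + " transactions, "
                + indexed.size() + " ids indexed");
    }
}
```

With 250 ids and a batch size of 100, the sketch performs three takes in three separate transactions. A smaller batch size gives more but shorter transactions, which is the knob users would get for staying under the timeout.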

Sanne Grinovero
February 5, 2015, 10:41 PM
Edited

You're right, the transaction boundary in the consumer isn't correct either. I suspect that's because of HSEARCH-640; at that time we introduced OptionallyWrapInJTATransaction to actually connect to JTA.

Some background: the original MassIndexer, as I was using it, wasn't affected by any transaction timeouts. It was working fine in JBoss 4 (which is what I was using at the time).
It turns out I didn't have any timeout problems because I didn't "link" the transaction to JTA properly.
Then in JBoss 6 the container would detect that you were doing something suspicious, and its DataSource would prevent you from opening a JDBC connection, so at that point we "fixed" it by connecting to JTA. I guess that's when we started seeing people complain about timeouts. But indeed, as you say, the integration with JTA is using an unreasonable scope.

The Session is cleared periodically, though; the code doing it is just a bit hidden.

It would be great if you could automate such a test.

Gunnar Morling
June 4, 2015, 6:41 AM

As a heads-up: I've got it basically working (a timeout is raised on the producer side, whereas the TXs on the consumer side all succeed, which is possible now as of ). I'm still struggling with Arquillian atm. and with having it launch the container with the correct config. A PR will follow.

Assignee

Gunnar Morling

Reporter

Gunnar Morling

Labels

None

Suitable for new contributors

None

Feedback Requested

None

Components

Fix versions

Priority

Major