We're updating the issue view to help you get more done. 

Enabling caching with SNAPSHOT transactional isolation considered harmful

Description

This may just be something that would be worthwhile to point out in the documentation; or there may be a fix (possibility below).

When hibernate places items into the L2 cache, it uses a timestamp ('now') in the query cache to use to check whether the results are still valid.

Unfortunately, if one is using "SNAPSHOT" isolation level, this can lead to the cache becoming poisoned with incorrect results.

Consider 3 transactions (T1, T2,T3), and the following sequence of events, with nothing in table TBL:

t=0 : Transaction T1 begins (SNAPSHOT isolation)
t=1 : Transaction T1 does some unrelated query
t=2 : Transaction T2 begins
t=3 : Transaction T2 inserts a row into TBL (with name='fred')
t=4 : Transaction T2 completes
t=5 : Transaction T1 does some query on TBL (e.g where name='fred').

  • this returns no rows from the database (this is correct).

  • this is L2 cached, with a timestamp of '5'

t=6 : Transaction T1 completes.

...

t=7 : Transaction T3 starts
t=8 : Transaction T3 does some query on TBL (say, where name='fred').

  • this finds a hit in the query cache, timestamp = 5

  • the table TBL has a timestamp of 3

  • 3 < 5, therefore cache deemed valid. no rows returned

t=9 : Transaction T3 ends

The cache is now poisoned @t=6, since any re-run of the query will compare the entity cache for updates to TBL (and discover a timestamp of '3') and compare it to the query timestamp ('5'), and determine the cache is still valid - and will return no results. I.E: if you ran this sequence with the L2 cache disabled, you would return different results.

Side note: In some, but not all, databases, the instruction at t=1 is important for the database to 'start' the snapshot.

The 'no result' in t=5 is because the snapshot was started at t=1. If the database had been in SERIALIZABLE isolation, it would have returned some results (and so this would not be an issue).

One potential fix might be that where Hibernate populates the cache timestamp, if it were to use a timestamp (if in snapshot mode) from the beginning of the transaction, rather than the 'current' time, then this would have entered a result in the cache that would be correctly invalidated on the next read, which is probably correct.

This could probably be shoehorned into the behaviour of Region.nextTimestamp() - but the cache region would have to know about the transaction isolation; perhaps some kind decorator/wrapper might work, as arguably this is an interaction between the cache and the database, rather than a direct hibernate concern.

Environment

Tested on 3.6.10. Marked on 4.1.7 too as the API looks the same.

Status

Assignee

Unassigned

Reporter

NigelN

Fix versions

None

backPortable

None

Suitable for new contributors

None

Requires Release Note

None

Pull Request

None

backportDecision

None

Components

Affects versions

4.1.7
3.6.10

Priority

Major