Consider a stronger hashing function for cache keys

Description

The CacheKeyImplementation and BasicCacheKeyImplementation classes uses the underlying object’s hashCode, which is typically an accumulation of fields by using Java’s standard fast but weak hashing function of 31 * result + element.hashCode(). This hash is perfectly fine in most circumstances because data structures like hash tables will use a secondary hashing function or other techniques to avoid collision or clustering issues.

Unfortunately this is not the case in Ehcache 3.x, where all versions can degrade to O(n) eviction. This was observed in a Hibernate production workload in a 2018 conference talk. The author’s solution was to switch to Infinispan.

The details are described in this analysis, but in short Ehcache’s sampled eviction relies on the hash table having a uniform distribution of the entries. This is not the case in ConcurrentHashMap which relies instead on O(lg n) treebins, so clustering is not problematic. Unfortunately that leads to the eviction’s random start index performing long scans when trying to find entries. The Ehcache developers have decided to not fix this problem, leaving it up integrations to safeguard their usage.

To mitigate this or similar by other providers, you can strengthen the hashCode in your cache keys. That solution was adopted by Spring Cache which incorporates the Murmur3 finalizer into its calculation. This is very inexpensive and resolves the performance problem, reducing the runtime of an IBM SQL production workload from 18.25 minutes down to 40 seconds.

Activity

Show:

Ben Manes March 8, 2025 at 3:55 AM

Details

Assignee

Reporter

Labels

Original estimate

Time tracking

No time logged

Components

Priority

Created March 3, 2025 at 7:50 AM
Updated March 8, 2025 at 3:55 AM