I'm using HibernateValidator on my current project. In some cases I need to save very complex entities (trees of nodes associated with other sets of entities, etc.). Saving an entity like this takes a considerable amount of time if validation is turned on. To track down the cause, I've been doing some profiling lately.
The bottleneck is at ClassValidator.getClassValidator(). This method is executed around 20000 times. With validation on, the save takes about 10 seconds; without validation it takes about 1.5 seconds. So what's going on?
getClassValidator() searches a cache for a ClassValidator created in the constructor. The ClassValidator constructor examines all the relationships between its class and related classes and saves them in a cache. The point is that I'm getting a lot of misses for entities of type Collection (PersistentCollection, Collection.$Unmodifiable, Set, etc). Since getClassValidator() doesn't find these entities in the cache (which makes sense), a new ClassValidator is created on every miss, which is expensive in execution time. Given the number of calls, that is the reason for the bottleneck.
I noticed there's a comment in the code suggesting a second cache for storing new ClassValidators when a miss happens. My first approach was to code this extra cache, and things improved enormously (no difference in save time with and without validation).
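A minimal sketch of the "second cache" idea, for illustration only. It assumes Java 8+ and uses hypothetical names (ValidatorCacheSketch, ExpensiveValidator, validatorFor); it is not the ClassValidator source or the patch itself. It only shows that caching on a miss means the expensive construction happens once per class rather than once per call.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: none of these names come from Hibernate Validator.
final class ValidatorCacheSketch {

    // Stand-in for a validator that is expensive to construct.
    static final class ExpensiveValidator {
        final Class<?> target;

        ExpensiveValidator(Class<?> target) {
            this.target = target;
        }
    }

    // Second-level cache, filled lazily whenever a lookup misses.
    private final Map<Class<?>, ExpensiveValidator> missCache = new ConcurrentHashMap<>();

    // Counts how many validators were actually constructed.
    private final AtomicInteger constructions = new AtomicInteger();

    ExpensiveValidator validatorFor(Class<?> type) {
        // computeIfAbsent builds the validator only on the first miss for a type.
        return missCache.computeIfAbsent(type, t -> {
            constructions.incrementAndGet();
            return new ExpensiveValidator(t);
        });
    }

    int constructions() {
        return constructions.get();
    }
}
```

With a cache like this, 20000 lookups for the same collection class would construct a single validator instead of 20000.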
But there's still something I don't fully understand. In the method: protected InvalidValue getInvalidValues(T bean, Set<Object> circularityState), there's a loop that examines the class of an entity and does the actual validation. The body of this loop is coded as:
The point is that for entities of type Collection the validation is being done twice: once on the first branch (Validate for collections) and again on the "else" branch (Validate for anything else that is not an Array).
Imagine an entity PersistentCollection<Person>. The first branch validates all the people in the Collection, and the else branch creates a ClassValidator of type PersistentCollection and executes its validators (that doesn't make much sense to me). Most of the misses I got on the "else" branch are for entities of type Collection. I think I got some others for entities of type ValueObject; those are OK. So, why are these checks not coded in exclusive form, something like:
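A rough sketch of what mutually exclusive checks could look like, using hypothetical names (ExclusiveDispatchSketch, visit) rather than the actual ClassValidator code. Instead of running real validators, it records the simple class name of each object that would be validated, so the effect of the branching is easy to see.

```java
import java.lang.reflect.Array;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical sketch: not the Hibernate Validator implementation.
final class ExclusiveDispatchSketch {

    // Returns the simple class names of the objects that would be validated.
    static List<String> visit(Object value) {
        List<String> visited = new ArrayList<>();
        if (value == null) {
            return visited;
        }
        if (value instanceof Collection) {
            // Collections: validate the elements only, never the container.
            for (Object element : (Collection<?>) value) {
                if (element != null) {
                    visited.add(element.getClass().getSimpleName());
                }
            }
        } else if (value.getClass().isArray()) {
            // Arrays: same treatment as collections.
            for (int i = 0; i < Array.getLength(value); i++) {
                Object element = Array.get(value, i);
                if (element != null) {
                    visited.add(element.getClass().getSimpleName());
                }
            }
        } else {
            // Anything else: validate the object itself.
            visited.add(value.getClass().getSimpleName());
        }
        return visited;
    }
}
```

With this shape, a collection of Person objects leads to one validation per Person and no validator lookup for the collection class itself.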
Before sending a patch that adds a second cache to the getClassValidator() method, I'd like to know whether most of this could be fixed by validating in exclusive form in getInvalidValues(). In any case, the second cache patch is also worthwhile, but I guess that could be the subject of another thread.
Hibernate 3.0, PostgreSQL
Actually, the Hibernate Validator version I described this error for is 3.0.0.ga. I checked against the current repository code (trunk/hibernate-validator-legacy), and the error described also applies to the latest 3.X version (I think 3.1.0.ga).
Changing the fix version, since it is actually reported against 3.x.
FYI, the 3.x versions of Hibernate are not under active development. It is highly recommended to upgrade to the latest 4.x version which is based on the Bean Validation standard.