Envers can't start when some audited field has accented letters


This bug prevents some valid Java identifiers from being used as persisted field names, but only when Envers is used. I marked it as a CRITICAL bug because it may prevent someone from upgrading, or force them to rename lots of fields, especially for non-English users of Hibernate.

I am using Hibernate 5.2.0. This problem did NOT exist on 4.3.11.Final.

When I turn Envers on:

Then Hibernate bootstrap fails when the following line builds the metadata:

An example problem is an audited class (source code in UTF-8) containing a boolean field called "seÉfinal", since the character "É", an accented version of character "E", results in a parsing error: "Invalid byte 2 of 2-byte UTF-8 sequence".

Many characters will fail, for example: áéíóúãõñàèçÁÉÃÇ etc.

Most commonly this error comes from feeding ISO-8859-x bytes (such as Latin-1) to an XML parser that believes it is getting UTF-8 (or vice versa). For example, certain sequences of Latin-1 characters (such as two consecutive characters with accents or umlauts) form something that is invalid as UTF-8: the first byte announces a multi-byte sequence, but the second byte has unexpected high-order bits.
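To make the mismatch concrete, here is a minimal sketch that does not touch Hibernate or Envers at all. It encodes the field name from this report as Latin-1 and as UTF-8, then checks each byte array with a strict UTF-8 decoder. The class name `EncodingMismatchDemo` and the helper `isValidUtf8` are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodingMismatchDemo {

    // Hypothetical helper: returns true if the bytes decode cleanly as strict UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "\u00C9" is "É"; the escape keeps this file independent of the source encoding.
        String fieldName = "se\u00C9final";

        // In Latin-1, "É" is the single byte 0xC9. UTF-8 reads 0xC9 as the first
        // byte of a 2-byte sequence, but the next byte is 'f' (0x66), which is not
        // a valid continuation byte.
        byte[] latin1 = fieldName.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = fieldName.getBytes(StandardCharsets.UTF_8);

        System.out.println("Latin-1 bytes valid as UTF-8: " + isValidUtf8(latin1)); // false
        System.out.println("UTF-8 bytes valid as UTF-8:   " + isValidUtf8(utf8));   // true
    }
}
```

This is the same condition the parser describes as "Invalid byte 2 of 2-byte UTF-8 sequence".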

Maybe the fix is as simple as replacing a call that relies on the default charset, such as String.getBytes(), with one that names the charset explicitly, such as String.getBytes("utf-8"). Please note that forms like String.getBytes() should be avoided in any case, since they use the platform's default charset, which may result in code that works on some platforms and fails on others (it can even pass tests on platforms that happen to be configured the way the code expects).
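I have not located the exact call inside Envers, so the following is only a sketch of the suspected pattern, not the actual Envers code. It serializes a small in-memory XML string to bytes and parses it with StAX. The `XML` constant is a hypothetical stand-in for the generated mapping document, and passing `ISO_8859_1` simulates a platform whose default charset is Latin-1.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class ExplicitCharsetDemo {

    // Hypothetical stand-in for an in-memory mapping document; not the XML Envers generates.
    static final String XML = "<property name=\"se\u00C9final\"/>";

    // Encodes XML with the given charset, parses the bytes with StAX, and
    // returns the "name" attribute of the first element.
    static String parseName(Charset charset) throws XMLStreamException {
        byte[] bytes = XML.getBytes(charset);
        // No XML declaration is present, so the parser assumes UTF-8.
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new ByteArrayInputStream(bytes));
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getAttributeValue(null, "name");
                }
            }
            return null;
        } finally {
            reader.close();
        }
    }

    public static void main(String[] args) throws XMLStreamException {
        // Explicit UTF-8: the bytes match what the parser expects.
        System.out.println("Parsed name: " + parseName(StandardCharsets.UTF_8));

        // Latin-1 bytes read as UTF-8: expected to fail like the report above.
        try {
            parseName(StandardCharsets.ISO_8859_1);
        } catch (XMLStreamException e) {
            System.out.println("Parse failed: " + e.getMessage());
        }
    }
}
```

Naming the charset at the point where the String becomes bytes keeps the result independent of how the platform is configured.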

The following is an offending XML Document, containing name="seÉFinal", created internally, in memory, by Envers:

And this is the complete stacktrace:

I would also suggest that the bootstrapping of Envers should issue better error messages. In this case it could have warned something along the lines of "Envers bootstrapping failed when processing Users.class#seÉfinal. Caused by: javax.xml.stream.XMLStreamException: ParseError at row,col:67,50 Message: Invalid byte 2 of 2-byte UTF-8 sequence...".




Chris Cranford


Marcelo Glasberg
