This bug prevents some valid Java identifiers from being used as persisted field names, and only when Envers is used. I marked it as a CRITICAL bug because it may prevent someone from upgrading, or force them to rename lots of fields, especially for non-English users of Hibernate.
I am using Hibernate 5.2.0. This problem did NOT exist on 4.3.11.Final.
When I turn Envers on:
Then Hibernate bootstrap fails when the following line builds the metadata:
An example problem is an audited class (source code in UTF-8) containing a boolean field called "seÉfinal", since the character "É", an accented version of character "E", results in a parsing error: "Invalid byte 2 of 2-byte UTF-8 sequence".
Many characters will fail, for example: áéíóúãõñàèçÁÉÃÇ etc.
Most commonly this is due to feeding ISO-8859-x (like Latin-1) but the XML parser thinking it is getting UTF-8 (or vice-versa). For example, certain sequences of Latin-1 characters (two consecutive characters with accents or umlauts) form something that is invalid as UTF-8, and specifically such that based on first byte, second byte has unexpected high-order bits.
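To illustrate the byte-level mismatch described above, here is a minimal sketch (not Envers code; the class and method names are made up for the example). It encodes the field name as ISO-8859-1 and as UTF-8, then checks each byte sequence with a strict UTF-8 decoder. In Latin-1, "É" is the single byte 0xC9, which UTF-8 reads as the first byte of a 2-byte sequence; the following byte, "f" (0x66), is not a valid continuation byte.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Latin1AsUtf8Demo {

    /** Returns true if the bytes are well-formed UTF-8 (strict decoding, no replacement). */
    public static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // \u00C9 is "É"; the escape keeps this file independent of its own source encoding.
        String fieldName = "se\u00C9final";

        byte[] latin1 = fieldName.getBytes(StandardCharsets.ISO_8859_1);
        byte[] utf8 = fieldName.getBytes(StandardCharsets.UTF_8);

        System.out.println("Latin-1 bytes valid as UTF-8: " + isValidUtf8(latin1));
        System.out.println("UTF-8 bytes valid as UTF-8:   " + isValidUtf8(utf8));
    }
}
```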
Maybe the fix is as simple as replacing a call that relies on the default charset, such as String.getBytes(), with one that names the charset explicitly, such as String.getBytes("utf-8"). Please note that forms like String.getBytes() should be avoided, since they use the platform's default charset. That can produce code that works on some platforms and fails on others (it would even pass tests on platforms that happen to be configured the way the code expects).
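As a rough sketch of the suggestion above, the following example serializes a small XML fragment to bytes with an explicit charset and feeds it to the JDK's StAX parser. This is only an illustration, not the Envers code path: the `<property name="..."/>` fragment and the `parseName` helper are assumptions made up for the demo. Since the fragment has no XML declaration, the parser assumes UTF-8, so only the UTF-8 encoded bytes parse successfully.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class ExplicitCharsetDemo {

    /**
     * Encodes the XML text with the given charset, parses the bytes, and returns
     * the "name" attribute of the first element, or null if parsing fails.
     */
    public static String parseName(String xml, Charset charset) {
        InputStream in = new ByteArrayInputStream(xml.getBytes(charset));
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(in);
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                    return reader.getAttributeValue(null, "name");
                }
            }
            return null;
        } catch (XMLStreamException e) {
            // Malformed byte sequences surface here as a parse error.
            return null;
        }
    }

    public static void main(String[] args) {
        // Hypothetical fragment; \u00C9 is "É".
        String xml = "<property name=\"se\u00C9final\"/>";

        // Explicit UTF-8: independent of the platform default charset.
        System.out.println("UTF-8:      " + parseName(xml, StandardCharsets.UTF_8));

        // Simulates String.getBytes() on a platform whose default is Latin-1.
        System.out.println("ISO-8859-1: " + parseName(xml, StandardCharsets.ISO_8859_1));
    }
}
```

Passing a `Charset` constant such as `StandardCharsets.UTF_8` also avoids the checked `UnsupportedEncodingException` that the `getBytes(String)` overload declares.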
The following is an offending XML Document, containing name="seÉFinal", created internally, in memory, by Envers:
And this is the complete stacktrace:
I would also suggest that the bootstrapping of Envers should issue better error messages. In this case it could have warned something along the lines of "Envers bootstrapping failed when processing Users.class#seÉfinal. Caused by: javax.xml.stream.XMLStreamException: ParseError at row,col:67,50 Message: Invalid byte 2 of 2-byte UTF-8 sequence...".
Please provide a runnable test case. I was unable to reproduce this with the described steps on either Windows or Linux.
In any software using Hibernate with Envers, simply create a persisted field with this name: "aáçãéèíõÃÕÇñÑ", and then try to start it.
If you can't reproduce it, a test case will do no good. Your platform's default charset is probably set up in such a way that it won't fail. See what I wrote above: "...the platform's default charset, ... may result in code that works on some platforms and fails on others..."
So you most likely need to change the default charset of your JVM to see the bug. See here:
Some encodings you may try: windows-1252, UTF-16, ISO-8859-1
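Before changing anything, it may help to check whether a given JVM is even able to show the problem. The sketch below (class and method names are made up for the example) prints the default charset and reports, for that charset and for the encodings listed above, whether a name with accented characters encodes to the same bytes as UTF-8. If the default already encodes like UTF-8, the bug is unlikely to show up on that machine. It assumes the `windows-1252` charset is available in the JDK being used, which is common but not guaranteed.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class DefaultCharsetCheck {

    /** True if encoding the text with the given charset yields the same bytes as UTF-8. */
    public static boolean encodesLikeUtf8(String text, Charset charset) {
        return Arrays.equals(text.getBytes(charset), text.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // "aáçã" written with escapes so the source file encoding does not matter.
        String fieldName = "a\u00E1\u00E7\u00E3";

        Charset platform = Charset.defaultCharset();
        System.out.println("Default charset: " + platform);
        System.out.println("Default encodes like UTF-8: " + encodesLikeUtf8(fieldName, platform));

        // The encodings suggested in this thread.
        for (String name : new String[] {"windows-1252", "UTF-16", "ISO-8859-1"}) {
            System.out.println(name + " encodes like UTF-8: "
                    + encodesLikeUtf8(fieldName, Charset.forName(name)));
        }
    }
}
```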
Thanks, I'll give those a try.