Inefficient implementation of JarVisitorFactory.getBytesFromInputStream

Description

The implementation of JarVisitorFactory.getBytesFromInputStream is inefficient: data loaded from the stream is copied over and over again in memory as more data is fetched.

Because of this problem, creating a local EntityManagerFactory is slow for us in Hibernate 3.4.0 (~40 seconds). We profiled start-up and found this method to be the bottleneck (>90% of the time spent). We have not tested EntityManagerFactory creation with the latest version of Hibernate, but the method is still present in Hibernate 4.1.8 and should be updated.

Attached are an alternative implementation of the method and a performance test.
In our test we loaded 2491475 bytes:
Current implementation: 219 ms
Suggested implementation: 5 ms
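
For illustration only, here is a minimal sketch of one way to avoid the repeated copying described above. It is not the attached implementation: the class name StreamBytes, the method name readAll and the 8192-byte chunk size are assumptions made for this example. Full chunks are kept in a list and copied into the result array exactly once, after the stream has been drained.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not the method attached to this issue.
class StreamBytes {
    private static final int CHUNK_SIZE = 8192; // assumed chunk size

    static byte[] readAll(InputStream in) throws IOException {
        List<byte[]> fullChunks = new ArrayList<byte[]>();
        byte[] chunk = new byte[CHUNK_SIZE];
        int filled = 0; // bytes used in the current chunk
        int total = 0;  // bytes read so far
        int n;
        // Keep filling the current chunk; start a new one only when it is full.
        while ((n = in.read(chunk, filled, CHUNK_SIZE - filled)) != -1) {
            filled += n;
            total += n;
            if (filled == CHUNK_SIZE) {
                fullChunks.add(chunk);
                chunk = new byte[CHUNK_SIZE];
                filled = 0;
            }
        }
        // Single final copy: every byte is moved into the result exactly once.
        byte[] result = new byte[total];
        int pos = 0;
        for (byte[] full : fullChunks) {
            System.arraycopy(full, 0, result, pos, CHUNK_SIZE);
            pos += CHUNK_SIZE;
        }
        System.arraycopy(chunk, 0, result, pos, filled);
        return result;
    }
}
```

An empty stream yields a zero-length array, because no chunk is ever filled and the final copy moves zero bytes.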

Environment

Windows x64, Red Hat Linux x64, Hibernate 3.4.0 GA

Activity

Andriy Kharchuk
December 6, 2012, 9:33 PM

The ByteArrayOutputStream class still does redundant copying internally by calling Arrays.copyOf() in write(byte b[], int off, int len).

The default internal buffer size in ByteArrayOutputStream is just 32 bytes. Unless you make the initial buffer size large, everything will be similar to the existing implementation. However, if you set the buffer size to a larger number, say 128k, it might be comparable to the suggested implementation. This has to be tested.
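
The variant mentioned above could be sketched roughly as follows. This is only an assumption about what such a version might look like, not code from Hibernate or from the attachment: the class name BufferedStreamBytes, the method name readAll and the 8192-byte read buffer are made up for this example. The initial capacity is passed to the ByteArrayOutputStream(int size) constructor, so the internal buffer does not start at 32 bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper for illustration; names and sizes are assumptions.
class BufferedStreamBytes {
    static byte[] readAll(InputStream in, int initialCapacity) throws IOException {
        // Pre-size the internal buffer so that small and medium inputs never
        // force it to grow (each growth copies the buffer via Arrays.copyOf).
        ByteArrayOutputStream out = new ByteArrayOutputStream(initialCapacity);
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        // toByteArray() makes one more copy, trimmed to the number of bytes written.
        return out.toByteArray();
    }
}
```

A call such as BufferedStreamBytes.readAll(in, 128 * 1024) matches the 128k figure from the comment. Whether it is actually comparable to the suggested implementation would still need to be measured.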

Brett Meyer
December 6, 2012, 10:22 PM

Ah, good point. For now, I've pushed your version of the method.

Andriy Kharchuk
December 6, 2012, 10:25 PM

What about zero-size streams? Was that fixed?

Brett Meyer
December 6, 2012, 10:36 PM

I tested it with a zero-size file without issue.

Andriy Kharchuk
December 6, 2012, 10:41 PM

cool, thank you

Assignee

Brett Meyer

Reporter

Andriy Kharchuk

Fix versions

Labels

backPortable

None

Suitable for new contributors

Yes, likely

Requires Release Note

None

Pull Request

None

backportDecision

None

Components

Affects versions

Priority

Major