I've been optimising the implementation of TwoPhaseLoad for performance.
Some simple operations could be moved into the narrower scope of the branches that actually use them, skipping unnecessary work under certain conditions.
Also, checks for logging levels could be merged, with their scope extended to include the computation of parameters that were only being used in the log messages.
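To illustrate the logging change, here is a minimal sketch using java.util.logging rather than the logger the real code uses. The class GuardedLoggingSketch, the describe helper, and the describeCalls counter are hypothetical names made up for this example; they are not taken from TwoPhaseLoad. The idea is that a single level check guards both messages and the string that only they need.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

final class GuardedLoggingSketch {
    static final Logger LOG = Logger.getLogger(GuardedLoggingSketch.class.getName());

    // Hypothetical counter, only here to make the saved work observable.
    static int describeCalls = 0;

    // Hypothetical stand-in for a parameter that is only needed by log messages.
    static String describe(Object entity, Object id) {
        describeCalls++;
        return entity.getClass().getSimpleName() + "#" + id;
    }

    static void initialize(Object entity, Object id) {
        // One level check covers both messages and the value computed for them,
        // so nothing is built when FINE logging is disabled.
        if (LOG.isLoggable(Level.FINE)) {
            String info = describe(entity, id);
            LOG.fine("Resolving associations for " + info);
            LOG.fine("Done materializing entity " + info);
        }
        // ... the actual initialization work would follow here ...
    }
}
```

With the level at INFO, initialize never calls describe; with FINE enabled it calls it once per invocation instead of once per message.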
Most notably, the PRE_LOAD and POST_LOAD event listeners were being retrieved from the ServiceRegistry on every invocation of each TwoPhaseLoad method; however, these methods are called within hot loops, once for each managed object. When loading non-trivial numbers of entities, the cost of the service registry access becomes significant.
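The listener change amounts to hoisting a lookup out of the per-entity loop. The sketch below shows that pattern in isolation; Registry, PostLoadListener, listenersFor, and the lookups counter are hypothetical stand-ins invented for the example and do not mirror the actual ServiceRegistry or event listener API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ListenerHoistingSketch {
    // Hypothetical listener type, not the real event listener interface.
    interface PostLoadListener {
        void onPostLoad(Object entity);
    }

    // Hypothetical registry that counts lookups so the difference is visible.
    static final class Registry {
        private final Map<String, List<PostLoadListener>> listeners = new HashMap<>();
        int lookups = 0;

        void register(String eventType, PostLoadListener listener) {
            listeners.computeIfAbsent(eventType, k -> new ArrayList<>()).add(listener);
        }

        List<PostLoadListener> listenersFor(String eventType) {
            lookups++;
            List<PostLoadListener> found = listeners.get(eventType);
            return found != null ? found : Collections.<PostLoadListener>emptyList();
        }
    }

    // Before: the registry is consulted once per entity inside the hot loop.
    static void postLoadPerEntityLookup(Registry registry, List<?> entities) {
        for (Object entity : entities) {
            for (PostLoadListener listener : registry.listenersFor("POST_LOAD")) {
                listener.onPostLoad(entity);
            }
        }
    }

    // After: the listeners are resolved once and reused for every entity.
    static void postLoadHoistedLookup(Registry registry, List<?> entities) {
        List<PostLoadListener> postLoadListeners = registry.listenersFor("POST_LOAD");
        for (Object entity : entities) {
            for (PostLoadListener listener : postLoadListeners) {
                listener.onPostLoad(entity);
            }
        }
    }
}
```

Both variants notify the same listeners for the same entities; only the number of registry lookups changes, from one per entity to one per batch.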