Native query EntityManager.createNativeQuery(…, EntityType.class) returns different result when run via EntityManager.createNativeQuery(…)

Description

I have the following use case when using Spring Data JPA repositories with native queries:

@Repository public interface TestRepository extends JpaRepository<TestEntity, UUID> { String QUERY = """ SELECT client_id, date_trunc(:unitOfTime, hour AT TIME ZONE 'UTC' - (INTERVAL '1 hour') * :offset) AS hour, SUM(decimal_column) AS decimal_column, SUM(counter) AS counter FROM test_service.hourly_test WHERE client_id = :clientId AND hour >= :from AND hour < :until GROUP BY client_id, 2 ORDER BY 2 ASC; """; @Query(value = QUERY, nativeQuery = true) Stream<TestEntity> exampleQuery( @Param("clientId") UUID clientId, @Param("from") Instant from, @Param("until") Instant until, @Param("offset") int offset, @Param("unitOfTime") String unitOfTime); }

As far as I understand it, this essentially calls entityManager.createNativeQuery(QUERY, TestEntity.class).setParameter(...).getResultList().

The following test code showcases this query yielding wrong results using the entityManager directly:

private Query withParams(Query query) { return query .setParameter("clientId", clientId) .setParameter("from", from) .setParameter("until", until) .setParameter("offset", offset) .setParameter("unitOfTime", uot.name()); } @Test void testEntityResultTypeFails() { // Same native query using the TestEntity.class as the result class var entityManagerQuery = withParams(entityManager.createNativeQuery(TestRepository.QUERY, TestEntity.class)); List<TestEntity> typedEmResult = entityManagerQuery.getResultList(); TestEntity typedRow = typedEmResult.get(0); Assertions.assertThat(typedRow.getClientId()).isEqualTo(clientId); Assertions.assertThat(typedRow.getHour()).isEqualTo(expectedTruncatedDayBoundary); Assertions.assertThat(typedRow.getDecimalColumn()).isEqualTo(expectedDecimalColumn); Assertions.assertThat(typedRow.getCounter()).isEqualTo(expectedCounterColumn); }

Yields:

org.opentest4j.AssertionFailedError: expected: 2.000 but was: 1

I have included a sample project which showcases the following test cases:

  1. Using em.createNativeQuery(queryString) (using Object[] as a result class by default)→ works

  2. Using em.createNativeQuery(queryString, TestEntity.class) → fails

  3. Using Spring Data JPA proxies → fails (probably due to internally being equivalent to the second test, I assume)

This does seem like a bug of some sort for me due to the fact that running the same query on the console (via e.g. psql in the container) yields different results to what Hibernate is returning.

Some additional context:

I initially posted this on the Spring Data JPA tracker, where it was determined to be caused by Hibernate behaviour (specifically, entity caching).

More context is available over on GitHub. See also this previous comment for instructions on reproducing the issue using psql. Note that the sample project I uploaded to this Jira issue is slightly cleaned-up in comparison with the GH upload, but is semantically equivalent.

Attachments

3
  • 20 Mar 2025, 05:28 PM
  • 20 Mar 2025, 05:28 PM
  • 20 Mar 2025, 04:17 PM

Activity

Show:

FP March 21, 2025 at 7:47 AM

Understood. We do actually use @ConstructorResult in several places. For some reporting queries like the one in this issue, we thought we could save ourselves a few (seemingly unneeded) DTOs, but it turns out decision didn’t sit well with our understanding of the cache.

Again, thanks for taking the time to explain this.

Gavin King March 20, 2025 at 6:54 PM

Though, I should note that for report-type queries it’s more usually to use @ConstructorResultinstead of returning managed entities.

Gavin King March 20, 2025 at 6:13 PM

Hibernate does not know how to execute queries against the cache. The cache is used for lookup by id.

And yes, it’s normal to use a separate class for projecting or aggregating queries.

FP March 20, 2025 at 6:02 PM
Edited

with the same id as one of the ids being returned by the query,

Ah. I suppose this is the point that I was missing. The resulting hour value from my date_trunc function essentially is that same as one of the hour of the entities I previously inserted.

I guess my confusion stems from the fact that I would’ve have expected this first-level cache to be used before the query is run on the database, not after, seeing how only after the query runs does the hour and thus the entire cache key match. As I understand, this does potentially save some cpu cycles on deserializing other, in that case definitely-don’t-need-to-be-deserialized fields (which would in this case be already present in the cache).

To sum up, does this mean that either:

I should always use different objects than my “normal” entities for any grouping operations, even if these have the same fields

or:

use a stateless session, just like we do in Hibernate Data Repositories.

would be the correct way to go about these kinds of queries?

 

Thank you for your patience in explaining this.

Gavin King March 20, 2025 at 5:47 PM

The test indeed passes in that case.

Well then in that case it’s pretty clear what’s going on, right?

You already have an instance in the session, with the same id as one of the ids being returned by the query, and so the cached instance is being returned, as is absolutely 100% normal and correct for a stateful session.

Rejected

Details

Assignee

Reporter

Labels

Affects versions

Priority

Created March 20, 2025 at 4:24 PM
Updated March 21, 2025 at 7:47 AM
Resolved March 20, 2025 at 6:55 PM