Uploaded image for project: 'Hibernate ORM'
  1. HHH-8281

Paginated query produces non-deterministic result set ordering

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Cannot Reproduce
    • Affects Version/s: 3.6.9
    • Fix Version/s: None
    • Component/s: hibernate-core
    • Labels:
      None
    • Environment:
      3.6.9.Final, SQL Server 2008
    • Bug Testcase Reminder (view):

      Bug reports should generally be accompanied by a test case!

    • Last commented by a user?:
      true
    • Sprint:

      Description

      The query generated by Hibernate when using setMaxResults/setStartResults does not always return an order result set. As a simple example, assume I have a class Customer and I issue the following:

      Query query = session.createQuery("select c from Customer c order by c.customerId");
      query.setMaxResults(10);
      query.setFirstResult(5);
      

      The SQL Server dialect will generate a query that looks like:

      WITH query AS (
        select ROW_NUMBER() OVER (order by customer0_.cust_id) as __hibernate_row_nr__, customer0_.id as id0_, customer0_.cust_id as cust2_0_ 
        from homer.droberge.dev_test_customer customer0_ 
      )
      SELECT * FROM query WHERE __hibernate_row_nr__ BETWEEN ? AND ?
      

      Notice there is no order by clause on the outer query. This seems incorrect because there is no guarantee the query will be ordered. We actually ran into an issue in one of our production applications where the result set is coming back in random order.

      We did some investigative work on this and here is what we found:

      • The generated Hibernate query works when a serial execution plan is used by the query optimizer. In fact, appending an order by clause to the outer query results in an execution plan that is exactly the same as the plan that is generated without the order by clause.
      • It appears to fail when SQL Server query optimizer selects a parallel query execution plan. In this case, work is distributed over a number of threads. MSDN indicates that relative order is maintained among the threads, but there is no guarantee the final output will be ordered unless there is an order by predicate. Appending an order by clause will guarantee the final output is ordered correctly.

      Some helpful links that relate to parallel queries:

      It is somewhat difficult to force SQL Server to choose a parallel execution plan. I did find this article http://sqlblog.com/blogs/paul_white/archive/2011/12/23/forcing-a-parallel-query-execution-plan.aspx which does go into detail about forcing SQL to choose a parallel execution plan. If you force the parallel execution plan you can easily reproduce the non-deterministic ordering

        Attachments

          Activity

            People

            • Votes:
              0 Vote for this issue
              Watchers:
              3 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved: