Use sub-query in ORM when distinct and order_by columns do not match

102 views
Skip to first unread message

Miroslav Shubernetskiy

unread,
Jan 24, 2015, 10:20:53 PM1/24/15
to django-d...@googlegroups.com
Hi everyone,

Just created my first django new feature ticket and wanted to get some community opinion on the topic - https://code.djangoproject.com/ticket/24218#ticket.

For convenience below is the ticket text:

This ticket is to propose a slight change in ORM - use subqueries when querying a model where.distinct() and .order_by() (or .extra(order_by=())) leftmost columns do not match. For example:

Model.objects.all().distinct('foo').order_by('bar')

The above generates the following SQL:

SELECT DISTINCT ON ("app_model"."foo") <lots of columns here>
FROM "app_model"
ORDER BY "app_model"."bar" ASC;

I am not sure about all backends however the above syntax is not allowed in PostgreSQL which produces the following error:

ERROR:  SELECT DISTINCT ON expressions must match initial ORDER BY expressions

Here are PostgreSQL docs explaining why that is not allowed:

DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. [...] Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first. [...] The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s).

This ticket proposes to use subqueries in such situations which would use SQL:

SELECT *
FROM (
  SELECT DISTINCT ON ("app_model"."foo") <lots of columns here>
  FROM "app_model"
) result
ORDER BY "app_model"."bar" ASC;

The above is perfectly valid SQL and produces expected results (please note that ORDER_BY is in the outer query to guarantee that distinct results are correctly sorted).

I created a simple patch by overwriting few things in SQLCompiler.as_sql() which seems to work pretty well. The patch only creates subquery when the above dilema is encountered which should not have any negative side-effects on existing queries (since such queries were not allowed by SQL). The patch also works on the .count()queries since Django then strips any ordering hence the subquery is never created.

Carl Meyer

unread,
Jan 26, 2015, 12:14:47 PM1/26/15
to django-d...@googlegroups.com
Hi Miroslav,

On 01/24/2015 08:20 PM, Miroslav Shubernetskiy wrote:
> Hi everyone,
>
> Just created my first django new feature ticket and wanted to get some
> community opinion on the topic
> - https://code.djangoproject.com/ticket/24218#ticket.

Welcome, and thanks for the suggestion + patch!

In the future, it's not generally necessary to post to the list about a
new ticket, unless a core dev says on the ticket that the issue requires
a mailing-list discussion.

I'll leave my comments regarding the specific issue on the ticket.

Carl

signature.asc
Reply all
Reply to author
Forward
0 new messages