[Django] #20600: Subqueries should retain ORDER BY when using DISTINCT ON

37 views
Skip to first unread message

Django

unread,
Jun 14, 2013, 11:46:11 AM6/14/13
to django-...@googlegroups.com
#20600: Subqueries should retain ORDER BY when using DISTINCT ON
----------------------------------------------+--------------------
Reporter: brianglass | Owner: nobody
Type: Bug | Status: new
Component: Database layer (models, ORM) | Version: 1.5
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------------------+--------------------
When using subqueries in conjunction with DISTINCT ON, ordering should be
retained.

Case in point: if I have a table of chapters in a book with weightings for
popularity and I wish to obtain the top weighted chapter per book I might
do something like:
{{{
#!python
top_chapters = Chapter.objects.filter(topics__in=['css',
'javascript']).distinct('book_id').order_by('book_id', '-weight', 'id)
}}}
Now I want the chapters to be ordered by weight, so I use top_chapters as
a sub-query:
{{{
#!python
chapters = Chapter.objects.filter(id__in=top_chapters).order_by('weight')
}}}
However, the ORDER BY statement is stripped from the subquery so we are
not guaranteed to get the top chapters. If the subquery is sliced as in
the following, the ordering is retained:
{{{
#!python
count = top_chapters.count()
chapters =
Chapter.objects.filter(id__in=top_chapters[:count]).order_by('weight')
}}}

This issue seems related to ticket #12328. In that ticket, order by was
intentionally retained if the subquery is sliced.

--
Ticket URL: <https://code.djangoproject.com/ticket/20600>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Jun 14, 2013, 1:24:22 PM6/14/13
to django-...@googlegroups.com
#20600: Subqueries should retain ORDER BY when using DISTINCT ON
-------------------------------------+-------------------------------------

Reporter: brianglass | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.5
(models, ORM) | Resolution:
Severity: Normal | Triage Stage:
Keywords: | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by brianglass):

* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0


Comment:

Note that the workaround solution I posted above does product SQL with the
ORDER BY retained, but it also contains too many columns for the
{{{id__in}}} to work correctly and I get back:

{{{
*** DatabaseError: subquery has too many columns
LINE 1: ..." FROM "heron_section" WHERE "heron_section"."id" IN (SELECT...
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:1>

Django

unread,
Jun 14, 2013, 4:39:09 PM6/14/13
to django-...@googlegroups.com
#20600: Subqueries should retain ORDER BY when using DISTINCT ON
-------------------------------------+-------------------------------------

Reporter: brianglass | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 1.5
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 1 |
-------------------------------------+-------------------------------------
Changes (by akaariai):

* easy: 0 => 1
* stage: Unreviewed => Accepted


Comment:

The workaround might actually work if you add a .values() call in there.

But yes, this seems like a bug. I wonder if adding an ORDER BY into the
subquery always is the correct approach. But maybe that is costly on some
databases...

The easy solution is to just add ORDER BY if the query has distinct on
fields set.

--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:2>

Django

unread,
Jun 29, 2013, 5:38:50 AM6/29/13
to django-...@googlegroups.com
#20600: Subqueries should retain ORDER BY when using DISTINCT ON
-------------------------------------+-------------------------------------
Reporter: brianglass | Owner: anonymous
Type: Bug | Status: assigned

Component: Database layer | Version: 1.5
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 1 |
-------------------------------------+-------------------------------------
Changes (by koddsson):

* status: new => assigned
* owner: nobody => anonymous


--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:3>

Django

unread,
Jul 3, 2013, 12:03:53 PM7/3/13
to django-...@googlegroups.com
#20600: Subqueries should retain ORDER BY when using DISTINCT ON
-------------------------------------+-------------------------------------
Reporter: brianglass | Owner: anonymous
Type: Bug | Status: assigned
Component: Database layer | Version: 1.5
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 1 |
-------------------------------------+-------------------------------------

Comment (by koddsson@…):

I made unit tests! https://github.com/django/django/pull/1325

--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:4>

Django

unread,
Aug 14, 2013, 7:50:17 AM8/14/13
to django-...@googlegroups.com
#20600: Subqueries should retain ORDER BY when using DISTINCT ON
-------------------------------------+-------------------------------------
Reporter: brianglass | Owner: anonymous
Type: Bug | Status: assigned
Component: Database layer | Version: 1.5
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 1

Needs tests: 0 | UI/UX: 0
Easy pickings: 1 |
-------------------------------------+-------------------------------------
Changes (by mjtamlyn):

* needs_better_patch: 0 => 1


Comment:

(closed) PR has tests, but lacks fixing implementation.

--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:5>

Django

unread,
Sep 18, 2013, 4:13:19 PM9/18/13
to django-...@googlegroups.com
#20600: Subqueries should retain ORDER BY when using DISTINCT ON
-------------------------------------+-------------------------------------
Reporter: brianglass | Owner: anonymous
Type: Bug | Status: assigned
Component: Database layer | Version: 1.5
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 1
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by timo):

* easy: 1 => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:6>

Django

unread,
Oct 10, 2013, 1:50:18 PM10/10/13
to django-...@googlegroups.com
#20600: Subqueries should retain ORDER BY when using DISTINCT ON
-------------------------------------+-------------------------------------
Reporter: brianglass | Owner: anonymous
Type: Bug | Status: assigned
Component: Database layer | Version: 1.5
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 1
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------

Comment (by chekunkov):

I've got this issue as well. Posted workaround didn't work for me (even
with .values()). The only solution I've found is to evaluate query for ids
and then use result in "in" statement:

{{{
top_chapters_ids = Chapter.objects.filter(


topics__in=['css', 'javascript']
).distinct(
'book_id'
).order_by(
'book_id',
'-weight',

'id'
).values_list('id', flat=True)

# evaluate query
top_chapters_ids = list(top_chapters_ids)

chapters = Chapter.objects.filter(
id__in=top_chapters_ids
).order_by('weight')
}}}

It is working but resource-consuming solution. And possibly with large
number of ids it wouldn't work for SQLite databases, because of their
limitations.

--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:7>

Django

unread,
Oct 10, 2013, 5:30:03 PM10/10/13
to django-...@googlegroups.com
#20600: Subqueries should retain ORDER BY when using DISTINCT ON
-------------------------------------+-------------------------------------
Reporter: brianglass | Owner: anonymous
Type: Bug | Status: assigned
Component: Database layer | Version: 1.5
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0

Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by akaariai):

* needs_better_patch: 1 => 0
* has_patch: 0 => 1


Comment:

A patch fixing this is available from
https://github.com/django/django/pull/1722 - I have commented all changes
so that it should be easy to verify that the patch is actually doing the
right thing. Anybody care to double-check the pull request?

--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:8>

Django

unread,
Nov 7, 2013, 7:38:37 AM11/7/13
to django-...@googlegroups.com
#20600: Subqueries should retain ORDER BY when using DISTINCT ON
-------------------------------------+-------------------------------------
Reporter: brianglass | Owner: anonymous
Type: Bug | Status: closed

Component: Database layer | Version: 1.5
(models, ORM) | Resolution: fixed

Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 1 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by Anssi Kääriäinen <akaariai@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"b1b04df06522b092a9b4768d2d61c52956e00eca"]:
{{{
#!CommitTicketReference repository=""
revision="b1b04df06522b092a9b4768d2d61c52956e00eca"
Fixed #20600 -- ordered distinct(*fields) in subqueries
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:9>

Reply all
Reply to author
Forward
0 new messages