Case in point: if I have a table of chapters in a book with weightings for
popularity and I wish to obtain the top weighted chapter per book I might
do something like:
{{{
#!python
top_chapters = Chapter.objects.filter(topics__in=['css',
'javascript']).distinct('book_id').order_by('book_id', '-weight', 'id)
}}}
Now I want the chapters to be ordered by weight, so I use top_chapters as
a sub-query:
{{{
#!python
chapters = Chapter.objects.filter(id__in=top_chapters).order_by('weight')
}}}
However, the ORDER BY statement is stripped from the subquery so we are
not guaranteed to get the top chapters. If the subquery is sliced as in
the following, the ordering is retained:
{{{
#!python
count = top_chapters.count()
chapters =
Chapter.objects.filter(id__in=top_chapters[:count]).order_by('weight')
}}}
This issue seems related to ticket #12328. In that ticket, order by was
intentionally retained if the subquery is sliced.
--
Ticket URL: <https://code.djangoproject.com/ticket/20600>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0
Comment:
Note that the workaround solution I posted above does product SQL with the
ORDER BY retained, but it also contains too many columns for the
{{{id__in}}} to work correctly and I get back:
{{{
*** DatabaseError: subquery has too many columns
LINE 1: ..." FROM "heron_section" WHERE "heron_section"."id" IN (SELECT...
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:1>
* easy: 0 => 1
* stage: Unreviewed => Accepted
Comment:
The workaround might actually work if you add a .values() call in there.
But yes, this seems like a bug. I wonder if adding an ORDER BY into the
subquery always is the correct approach. But maybe that is costly on some
databases...
The easy solution is to just add ORDER BY if the query has distinct on
fields set.
--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:2>
* status: new => assigned
* owner: nobody => anonymous
--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:3>
Comment (by koddsson@…):
I made unit tests! https://github.com/django/django/pull/1325
--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:4>
* needs_better_patch: 0 => 1
Comment:
(closed) PR has tests, but lacks fixing implementation.
--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:5>
* easy: 1 => 0
--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:6>
Comment (by chekunkov):
I've got this issue as well. Posted workaround didn't work for me (even
with .values()). The only solution I've found is to evaluate query for ids
and then use result in "in" statement:
{{{
top_chapters_ids = Chapter.objects.filter(
topics__in=['css', 'javascript']
).distinct(
'book_id'
).order_by(
'book_id',
'-weight',
'id'
).values_list('id', flat=True)
# evaluate query
top_chapters_ids = list(top_chapters_ids)
chapters = Chapter.objects.filter(
id__in=top_chapters_ids
).order_by('weight')
}}}
It is working but resource-consuming solution. And possibly with large
number of ids it wouldn't work for SQLite databases, because of their
limitations.
--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:7>
* needs_better_patch: 1 => 0
* has_patch: 0 => 1
Comment:
A patch fixing this is available from
https://github.com/django/django/pull/1722 - I have commented all changes
so that it should be easy to verify that the patch is actually doing the
right thing. Anybody care to double-check the pull request?
--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:8>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"b1b04df06522b092a9b4768d2d61c52956e00eca"]:
{{{
#!CommitTicketReference repository=""
revision="b1b04df06522b092a9b4768d2d61c52956e00eca"
Fixed #20600 -- ordered distinct(*fields) in subqueries
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/20600#comment:9>