[Django] #32658: Subquery ignores query ordering

19 views
Skip to first unread message

Django

unread,
Apr 15, 2021, 8:26:41 AM4/15/21
to django-...@googlegroups.com
#32658: Subquery ignores query ordering
-------------------------------------+-------------------------------------
Reporter: Bálint | Owner: nobody
Balina |
Type: | Status: new
Uncategorized |
Component: Database | Version: 3.2
layer (models, ORM) | Keywords: Subquery,
Severity: Normal | annotations, ordering
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
I've started migrating a huge project from Django==2.2 to ==3.2, and many
tests started to fail, because of the problem below:

When using a query with ordering in an annotated Subquery, the query's
ordering is ignored (or removed).

In a bare new Django project, I have successfully reproduced the issue:

{{{
from django.db import models


class MyModel(models.Model):
parent = models.ForeignKey(to='self', on_delete=models.CASCADE,
null=True)
name = models.TextField()
}}}


{{{
MyModel.objects.create(name='parent')
MyModel.objects.create(name='b', parent=parent)
MyModel.objects.create(name='a', parent=parent)
}}}


{{{
children_query =
MyModel.objects.filter(parent=OuterRef('pk')).annotate(child_names=Func(F('name'),
function='group_concat')).values('child_names').order_by('name')
parent_query =
MyModel.objects.annotate(child_names=Subquery(children_query))
parent_query.query.__str__()
}}}
Outputs:

{{{
SELECT "app_mymodel"."id", "app_mymodel"."parent_id",
"app_mymodel"."name", (SELECT group_concat(U0."name") AS "child_names"
FROM "app_mymodel" U0 WHERE U0."parent_id" = "app_mymodel"."id") AS
"child_names" FROM "app_mymodel"
}}}

The only thing I do, is pip install Django==2.2, run the same thing in a
shell and:

Outputs:

{{{
SELECT "app_mymodel"."id", "app_mymodel"."parent_id",
"app_mymodel"."name", (SELECT group_concat(U0."name") AS "child_names"
FROM "app_mymodel" U0 WHERE U0."parent_id" = ("app_mymodel"."id") ORDER BY
U0."name" ASC) AS "child_names" FROM "app_mymodel"
}}}
Notice the ORDER BY U0."name" ASC is missing in Django 3.2, though I have
explicitly set the ordering on the query. This happens both on SQLite and
PostgreSQL backends.

--
Ticket URL: <https://code.djangoproject.com/ticket/32658>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Apr 15, 2021, 8:37:31 AM4/15/21
to django-...@googlegroups.com
#32658: Subquery ignores query ordering
-------------------------------------+-------------------------------------
Reporter: Bálint Balina | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: Subquery, | Triage Stage:
annotations, ordering | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Bálint Balina):

After digging deeper, I have found that this is the duplicate of
https://code.djangoproject.com/ticket/31687 , which was closed without any
real explanation about why this working behaviour changed. I am aware of
the solution provided, but it would require quite some code changes, which
is fine but seems unreasonable. At very least I would like to know why it
is necessary to remove the ordering from the subquery.

--
Ticket URL: <https://code.djangoproject.com/ticket/32658#comment:1>

Django

unread,
Apr 15, 2021, 11:02:49 AM4/15/21
to django-...@googlegroups.com
#32658: Subquery ignores query ordering
-------------------------------------+-------------------------------------
Reporter: Bálint Balina | Owner: nobody
Type: Uncategorized | Status: closed

Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid

Keywords: Subquery, | Triage Stage:
annotations, ordering | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* cc: Simon Charette (added)
* status: new => closed
* resolution: => invalid


Comment:

IIRC the ordering is cleared for optimization purposes when `LIMIT`,
`DISTINCT` and aggregation are not used.

I think that ordering wouldn't be disabled if your function was properly
marked as being an aggregate (as it's one)

{{{#!python
children_query = MyModel.objects.filter(
parent=OuterRef('pk'),
).annotate(
child_names=Aggregate('name', function='group_concat'),
).values('child_names').order_by('name')
}}}

If you use `Func` instead of `Aggregate` then the ORM has to way to know
your opaque expression is performing aggregation and that ordering might
be relevant.

If using `Aggregate` instead of `Func` still results in cleared ordering
then please re-open this ticket as it shouldn't be the case otherwise this
a clearly a misuse of the ORM and likely only work in your case because
MySQL is very lax about `GROUP BY` requirements.

--
Ticket URL: <https://code.djangoproject.com/ticket/32658#comment:2>

Django

unread,
Apr 16, 2021, 10:28:24 AM4/16/21
to django-...@googlegroups.com
#32658: Subquery ignores query ordering
-------------------------------------+-------------------------------------
Reporter: Bálint Balina | Owner: nobody
Type: Uncategorized | Status: new

Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: Subquery, | Triage Stage:
annotations, ordering | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Bálint Balina):

* status: closed => new
* resolution: invalid =>


Comment:

Okay, that sounds sane, but the solution you provided does not work:


{{{


children_query = MyModel.objects.filter(
parent=OuterRef('pk'),
).annotate(
child_names=Aggregate('name', function='group_concat'),
).values('child_names').order_by('name')

queryset = MyModel.objects.annotate(
other_names=Subquery(children_query)
).only('id')

print(str(queryset.query))

}}}

{{{
SELECT "app_mymodel"."id", (SELECT ARRAY_AGG(U0."name" ) AS "child_names"
FROM "app_mymodel" U0 WHERE U0."parent_id" = "app_mymodel"."id" GROUP BY
U0."id") AS "other_names" FROM "app_mymodel"
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/32658#comment:3>

Django

unread,
Apr 17, 2021, 8:00:24 PM4/17/21
to django-...@googlegroups.com
#32658: Subquery ignores query ordering
-------------------------------------+-------------------------------------
Reporter: Bálint Balina | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: Subquery, | Triage Stage:
annotations, ordering | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

I this case I think we should either make the logic in charge of clearing
ordering consider `contains_aggregate` as well.

An alternative which might more correct would be to move the ordering
clearing logic to `In(Lookup).process_rhs` after all
[https://github.com/django/django/blob/aa4acc164d1247c0de515c959f7b09648b57dc42/django/db/models/lookups.py#L399
that's what we do for clearing the select clause]. The devils lurks in the
details there though as the solution needs to handle `Subquery`,
`sql.Query`, and `QuerySet` (all resolves to `sql.Query` though) right-
hand-sides and the ordering can only be cleared
[https://github.com/django/django/blob/678c8dfee458cda77fce0d1c127f1939dc134584/django/db/models/sql/query.py#L1029-L1034
under some particular conditions].

Maybe adding a `sql.Query.as_haystack` (please suggest a better name)
method that takes care of all this logic and duck-typing against it in
`In(Lookup).process_rhs` would provide a decoupled enough API?

--
Ticket URL: <https://code.djangoproject.com/ticket/32658#comment:4>

Django

unread,
Apr 18, 2021, 4:41:23 AM4/18/21
to django-...@googlegroups.com
#32658: Subquery ignores query ordering
-------------------------------------+-------------------------------------
Reporter: Bálint Balina | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: Subquery, | Triage Stage:
annotations, ordering | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Hannes Ljungberg):

* cc: Hannes Ljungberg (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/32658#comment:5>

Django

unread,
Apr 19, 2021, 7:08:30 AM4/19/21
to django-...@googlegroups.com
#32658: Subquery ignores query ordering
-------------------------------------+-------------------------------------
Reporter: Bálint Balina | Owner: nobody
Type: Uncategorized | Status: closed

Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid

Keywords: Subquery, | Triage Stage:
annotations, ordering | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* status: new => closed
* resolution: => invalid


Comment:

`ORDER BY` clause should be included in the `GROUP_CONCAT()` not in the
main query, e.g. `GROUP_CONCAT(U0."name" ORDER BY U0."name" ASC) AS
"child_names")`. You can check
[https://github.com/django/django/blob/413c15ef2e3d3958fb641a023bc1e2d15fb2b228/django/contrib/postgres/aggregates/mixins.py#L4-L48
OrderableAggMixin] to see how it's implemented for `ArrayAgg`,
`StringAgg`, and `JSONBAgg` on PostgreSQL.

--
Ticket URL: <https://code.djangoproject.com/ticket/32658#comment:6>

Reply all
Reply to author
Forward
0 new messages