[Django] #26390: order_by('?') unexpectedly breaking queryset aggregation

71 views
Skip to first unread message

Django

unread,
Mar 21, 2016, 3:38:04 PM3/21/16
to django-...@googlegroups.com
#26390: order_by('?') unexpectedly breaking queryset aggregation
----------------------------------------------+--------------------
Reporter: uranusjr | Owner: nobody
Type: Bug | Status: new
Component: Database layer (models, ORM) | Version: master
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------------------+--------------------
Steps to reproduce:


{{{
class Thing(models.Model):
pass

class Related(models.Model):
models.ForeignKey(Thing)
}}}

With data

{{{
t = Thing.objects.create()
rs = [Related.objects.create(thing=t) for _ in range(2)]
}}}

The following query works as expected. The aggregation with Count produces
a GROUP BY clause on related.id.

{{{
>>>
Thing.objects.annotate(rc=Count('related')).order_by('rc').values('id',
'rc')
<QuerySet [{'id': 1, 'rc': 2}]>
}}}

This also works as expected (at least to me). Although there is an
aggregation, ordering by related means that the grouping will be broken
down.

{{{
>>>
Thing.objects.annotate(rc=Count('related')).order_by('related').values('id',
'rc')
<QuerySet [{'id': 1, 'rc': 1}, {'id': 1, 'rc': 1}]>
}}}

But the following seems wrong to me.

{{{
>>> Thing.objects.annotate(rc=Count('related')).order_by('?').values('id',
'rc')
<QuerySet [{'id': 1, 'rc': 1}, {'id': 1, 'rc': 1}]>
}}}

The random function call has nothing to do with the aggregation, and I see
no reason it should break it. Dumping the query seems that indeed the
random call breaks the group by call: (I simpilfied the table names a
little)

{{{
>>>
print(Thing.objects.annotate(rc=Count('related')).order_by('?').values('id',
'rc').query)
SELECT "thing"."id", COUNT("related"."id") AS "rc" FROM "thing" LEFT OUTER
JOIN "related" ON ("thing"."id" = "related"."thing_id") GROUP BY
"thing"."id", RANDOM() ORDER BY RANDOM() ASC
}}}

I dug into the SQL compiler, and it seems to me the problem is inside
`django.db.models.sql.compiler.get_group_by`, where the compiler combines
all non-aggregate, non-ref order_by expressions into group_by. I patched
it like this

{{{
for expr, (sql, params, is_ref) in order_by:
if expr.contains_aggregate:
continue
if is_ref:
continue
expressions.extend([
exp for exp in expr.get_source_expressions()
if not isinstance(exp, Random)
])
}}}

and things seem to work correctly. No failed tests against SQLite3 with
default settings.

--
Ticket URL: <https://code.djangoproject.com/ticket/26390>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Mar 21, 2016, 3:43:41 PM3/21/16
to django-...@googlegroups.com
#26390: order_by('?') unexpectedly breaking queryset aggregation
------------------------------------------+----------------------------

Reporter: uranusjr | Owner: nobody
Type: Bug | Status: new
Component: Database layer (models, ORM) | Version: master
Severity: Normal | Resolution:

Keywords: | Triage Stage: Unreviewed
Has patch: 0 | Easy pickings: 0
UI/UX: 0 |
------------------------------------------+----------------------------
Changes (by uranusjr):

* Attachment "order-by-random-no-group.patch" added.

Patch to SQLCompiler.get_group_by that excluds Random expressions

Django

unread,
Mar 21, 2016, 4:44:34 PM3/21/16
to django-...@googlegroups.com
#26390: order_by('?') unexpectedly breaking queryset aggregation
-------------------------------------+-------------------------------------

Reporter: uranusjr | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by charettes):

* needs_better_patch: => 0
* needs_docs: => 0
* needs_tests: => 0
* stage: Unreviewed => Accepted


--
Ticket URL: <https://code.djangoproject.com/ticket/26390#comment:1>

Django

unread,
Mar 21, 2016, 5:08:44 PM3/21/16
to django-...@googlegroups.com
#26390: order_by('?') unexpectedly breaking queryset aggregation
-------------------------------------+-------------------------------------

Reporter: uranusjr | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by charettes):

* needs_better_patch: 0 => 1
* has_patch: 0 => 1
* needs_tests: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/26390#comment:2>

Django

unread,
Mar 22, 2016, 4:55:18 AM3/22/16
to django-...@googlegroups.com
#26390: order_by('?') unexpectedly breaking queryset aggregation
-------------------------------------+-------------------------------------
Reporter: uranusjr | Owner: uranusjr
Type: Bug | Status: assigned

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by uranusjr):

* owner: nobody => uranusjr
* needs_better_patch: 1 => 0
* status: new => assigned
* needs_tests: 1 => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/26390#comment:3>

Django

unread,
Mar 22, 2016, 4:55:47 AM3/22/16
to django-...@googlegroups.com
#26390: order_by('?') unexpectedly breaking queryset aggregation
-------------------------------------+-------------------------------------
Reporter: uranusjr | Owner: uranusjr
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by uranusjr):

[https://github.com/django/django/pull/6322 PR]

--
Ticket URL: <https://code.djangoproject.com/ticket/26390#comment:4>

Django

unread,
Mar 22, 2016, 5:25:48 AM3/22/16
to django-...@googlegroups.com
#26390: order_by('?') unexpectedly breaking queryset aggregation
-------------------------------------+-------------------------------------
Reporter: uranusjr | Owner: uranusjr
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by akaariai):

I wonder what would happen if we skipped all expressions that have no cols
as source expressions (plus, we need to include any raw sql).

--
Ticket URL: <https://code.djangoproject.com/ticket/26390#comment:5>

Django

unread,
Mar 30, 2016, 4:11:38 AM3/30/16
to django-...@googlegroups.com
#26390: order_by('?') unexpectedly breaking queryset aggregation
-------------------------------------+-------------------------------------
Reporter: uranusjr | Owner: uranusjr
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by uranusjr):

> I wonder what would happen if we skipped all expressions that have no
cols as source expressions (plus, we need to include any raw sql).

This seems like a very good idea, and I can’t think of a scenario where
this will break things. I’ve updated the PR.

--
Ticket URL: <https://code.djangoproject.com/ticket/26390#comment:6>

Django

unread,
Apr 1, 2016, 9:37:40 AM4/1/16
to django-...@googlegroups.com
#26390: order_by('?') unexpectedly breaking queryset aggregation
-------------------------------------+-------------------------------------
Reporter: uranusjr | Owner: uranusjr
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* needs_better_patch: 0 => 1


Comment:

The new test isn't passing on MySQL/PostgreSQL.

--
Ticket URL: <https://code.djangoproject.com/ticket/26390#comment:7>

Django

unread,
Apr 1, 2016, 12:19:05 PM4/1/16
to django-...@googlegroups.com
#26390: order_by('?') unexpectedly breaking queryset aggregation
-------------------------------------+-------------------------------------
Reporter: uranusjr | Owner: uranusjr
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by uranusjr):

* needs_better_patch: 1 => 0


--
Ticket URL: <https://code.djangoproject.com/ticket/26390#comment:8>

Django

unread,
Apr 9, 2016, 1:07:07 PM4/9/16
to django-...@googlegroups.com
#26390: order_by('?') unexpectedly breaking queryset aggregation
-------------------------------------+-------------------------------------
Reporter: uranusjr | Owner: uranusjr
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by timgraham):

* needs_better_patch: 0 => 1


Comment:

Some test additions are still needed.

--
Ticket URL: <https://code.djangoproject.com/ticket/26390#comment:9>

Django

unread,
Nov 4, 2016, 10:21:57 AM11/4/16
to django-...@googlegroups.com
#26390: order_by('?') unexpectedly breaking queryset aggregation
-------------------------------------+-------------------------------------
Reporter: Tzu-ping Chung | Owner: Tzu-ping
| Chung

Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Matthias Kestenholz):

It's documented that ordering will be included in the grouping clause so I
wouldn't say that this behavior is unexpected. It seems to me that trying
to remove some (but not all) columns from the group by clause according to
new rules is less clear than the simple rule that is in place now.

--
Ticket URL: <https://code.djangoproject.com/ticket/26390#comment:10>

Django

unread,
Oct 10, 2018, 1:11:42 AM10/10/18
to django-...@googlegroups.com
#26390: order_by('?') unexpectedly breaking queryset aggregation
-------------------------------------+-------------------------------------
Reporter: Tzu-ping Chung | Owner: Tzu-ping
| Chung
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Zach):

* cc: Zach (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/26390#comment:11>

Django

unread,
Oct 10, 2018, 11:42:05 AM10/10/18
to django-...@googlegroups.com
#26390: order_by('?') unexpectedly breaking queryset aggregation
-------------------------------------+-------------------------------------
Reporter: Tzu-ping Chung | Owner: Tzu-ping
| Chung
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Zach):

If you need to filter on an annotated value while still using
`.order_by('?')`, this could work:

{{{#!python
Thing.objects.filter(pk__in=Thing.objects.annotate(rc=Count('related')).filter(rc__gte=2)).order_by('?')
}}}

This avoids the `GROUP BY RANDOM() ORDER BY RANDOM() ASC` issue while
still allowing `.annotate()` and `.order_by('?')` to be used together.

--
Ticket URL: <https://code.djangoproject.com/ticket/26390#comment:12>

Django

unread,
Oct 21, 2020, 6:16:12 AM10/21/20
to django-...@googlegroups.com
#26390: order_by('?') unexpectedly breaking queryset aggregation
-------------------------------------+-------------------------------------
Reporter: Tzu-ping Chung | Owner: Étienne
| Beaulé

Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* needs_better_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/26390#comment:15>

Django

unread,
Oct 21, 2020, 2:58:04 PM10/21/20
to django-...@googlegroups.com
#26390: order_by('?') unexpectedly breaking queryset aggregation
-------------------------------------+-------------------------------------
Reporter: Tzu-ping Chung | Owner: Étienne
| Beaulé
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for
| checkin

Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* needs_better_patch: 1 => 0
* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/26390#comment:16>

Reply all
Reply to author
Forward
0 new messages