[Django] #30771: Filtering on query result overrides GROUP BY of internal query

22 views
Skip to first unread message

Django

unread,
Sep 11, 2019, 9:13:21 AM9/11/19
to django-...@googlegroups.com
#30771: Filtering on query result overrides GROUP BY of internal query
-------------------------------------+-------------------------------------
Reporter: Aur Saraf | Owner: nobody
Type: Bug | Status: new
Component: Database | Version: 2.2
layer (models, ORM) | Keywords: annotate,
Severity: Normal | annotation, filter, groupby,
Triage Stage: | aggregate, aggregation
Unreviewed | Has patch: 0
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
{{{#!python
from django.contrib.auth import models

a =
models.User.objects.filter(email__isnull=True).values('email').annotate(m=Max('id')).values('m')
print(a.query) # good
# SELECT MAX("auth_user"."id") AS "m" FROM "auth_user" WHERE
"auth_user"."email" IS NULL GROUP BY "auth_user"."email"
print(a[:1].query) # good
# SELECT MAX("auth_user"."id") AS "m" FROM "auth_user" WHERE
"auth_user"."email" IS NULL GROUP BY "auth_user"."email" LIMIT 1
b = models.User.objects.filter(id=a[:1])
print(b.query) # GROUP BY U0."id" should be GROUP BY U0."email"
# SELECT ... FROM "auth_user" WHERE "auth_user"."id" = (SELECT U0."id"
FROM "auth_user" U0 WHERE U0."email" IS NULL GROUP BY U0."id" LIMIT 1)
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/30771>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Sep 11, 2019, 9:14:30 AM9/11/19
to django-...@googlegroups.com
#30771: Filtering on query result overrides GROUP BY of internal query
-------------------------------------+-------------------------------------
Reporter: Aur Saraf | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 2.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: annotate, | Triage Stage:
annotation, filter, groupby, | Unreviewed
aggregate, aggregation |

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Aur Saraf):

Workaround:

{{{
from django.contrib.auth import models

a =
models.User.objects.filter(email__isnull=True).values('email').aggregate(Max('id'))['id_max']
b = models.User.objects.filter(id=a)
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/30771#comment:1>

Django

unread,
Sep 12, 2019, 9:38:06 PM9/12/19
to django-...@googlegroups.com
#30771: Filtering on query result overrides GROUP BY of internal query
-------------------------------------+-------------------------------------
Reporter: Aur Saraf | Owner: James
| Timmins
Type: Bug | Status: assigned

Component: Database layer | Version: 2.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: annotate, | Triage Stage:
annotation, filter, groupby, | Unreviewed
aggregate, aggregation |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by James Timmins):

* owner: nobody => James Timmins
* status: new => assigned


--
Ticket URL: <https://code.djangoproject.com/ticket/30771#comment:2>

Django

unread,
Sep 13, 2019, 12:23:27 AM9/13/19
to django-...@googlegroups.com
#30771: Filtering on query result overrides GROUP BY of internal query
-------------------------------------+-------------------------------------
Reporter: Aur Saraf | Owner: James
| Timmins
Type: Bug | Status: assigned
Component: Database layer | Version: 2.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: annotate, | Triage Stage:
annotation, filter, groupby, | Unreviewed
aggregate, aggregation |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

Thanks for tackling that one James!

If I can provide you some guidance I'd suggest you have a look at
`lookups.Exact.process_rhs`

https://github.com/django/django/blob/ea25bdc2b94466bb1563000bf81628dea4d80612/django/db/models/lookups.py#L265-L267

We probably don't want to perform the `clear_select_clause` and
`add_fields(['pk'])` when the query is already only selecting a single
field.

--
Ticket URL: <https://code.djangoproject.com/ticket/30771#comment:3>

Django

unread,
Sep 13, 2019, 3:35:52 AM9/13/19
to django-...@googlegroups.com
#30771: Filtering on query result overrides GROUP BY of internal query
-------------------------------------+-------------------------------------
Reporter: Aur Saraf | Owner: James
| Timmins
Type: Bug | Status: assigned
Component: Database layer | Version: master

(models, ORM) |
Severity: Normal | Resolution:
Keywords: annotate, | Triage Stage: Accepted
annotation, filter, groupby, |
aggregate, aggregation |

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* version: 2.2 => master
* stage: Unreviewed => Accepted


--
Ticket URL: <https://code.djangoproject.com/ticket/30771#comment:4>

Django

unread,
Sep 19, 2019, 3:17:09 AM9/19/19
to django-...@googlegroups.com
#30771: Filtering on query result overrides GROUP BY of internal query
-------------------------------------+-------------------------------------
Reporter: Aur Saraf | Owner: James
| Timmins
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: annotate, | Triage Stage: Accepted
annotation, filter, groupby, |
aggregate, aggregation |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by James Timmins):

Thanks so much for the help Simon! This is a great jumping-off point.

There's something that I'm unclear about, which perhaps you can shed some
light on. While I was able to replicate the bug with 2.2, when I try to
create a test on Master to validate the bug, the group-by behavior seems
to have changed.

Here's the test that I created:

{{{
def test_exact_selected_field_rhs_subquery(self):
author_1 = Author.objects.create(name='one')
author_2 = Author.objects.create(name='two')
max_ids =
Author.objects.filter(alias__isnull=True).values('alias').annotate(m=Max('id')).values('m')
authors = Author.objects.filter(id=max_ids[:1])

self.assertFalse(str(max_ids.query)) # This was just to force the
test-runner to output the query.

self.assertEqual(authors[0], author_2)
}}}

And here's the resulting query:

{{{
SELECT MAX("lookup_author"."id") AS "m" FROM "lookup_author" WHERE
"lookup_author"."alias" IS NULL GROUP BY "lookup_author"."alias",
"lookup_author"."name"
}}}

It no longer appears to be grouping by the 'alias' field listed in the
initial `.values()` preceeding the `.annotate()`. I looked at the docs and
release notes to see if there was a behavior change, but didn't see
anything listed.

Do you know if I'm just misunderstanding what's happening here? Or does
this seem like a possible regression?

--
Ticket URL: <https://code.djangoproject.com/ticket/30771#comment:5>

Django

unread,
Sep 19, 2019, 12:18:15 PM9/19/19
to django-...@googlegroups.com
#30771: Filtering on query result overrides GROUP BY of internal query
-------------------------------------+-------------------------------------
Reporter: Aur Saraf | Owner: James
| Timmins
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: annotate, | Triage Stage: Accepted
annotation, filter, groupby, |
aggregate, aggregation |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

It's possible that a regression was introduced in between.

Could you try bisecting the commit that changed the behavior

https://docs.djangoproject.com/en/dev/internals/contributing/triaging-
tickets/#bisecting-a-regression

--
Ticket URL: <https://code.djangoproject.com/ticket/30771#comment:6>

Django

unread,
Sep 19, 2019, 8:06:07 PM9/19/19
to django-...@googlegroups.com
#30771: Filtering on query result overrides GROUP BY of internal query
-------------------------------------+-------------------------------------
Reporter: Aur Saraf | Owner: James
| Timmins
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: annotate, | Triage Stage: Accepted
annotation, filter, groupby, |
aggregate, aggregation |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by James Timmins):

Mmm actually disregard that. The second value in the `GROUP BY` is due to
the `ordering` value in the Author class's Meta class.

{{{
class Author(models.Model):
name = models.CharField(max_length=100)
alias = models.CharField(max_length=50, null=True, blank=True)

class Meta:
ordering = ('name',)
}}}


Regarding the bug in question in this ticket, what should the desired
behavior be if the inner query is returning multiple fields? With the fix,
which allows the inner query to define a field to return/group by, if
there are multiple fields used then it will throw a
`sqlite3.OperationalError: row value misused`.

Is this the desired behavior or should it avoid this problem by defaulting
back to `pk` if more than one field is selected?

--
Ticket URL: <https://code.djangoproject.com/ticket/30771#comment:7>

Django

unread,
Sep 19, 2019, 8:09:15 PM9/19/19
to django-...@googlegroups.com
#30771: Filtering on query result overrides GROUP BY of internal query
-------------------------------------+-------------------------------------
Reporter: Aur Saraf | Owner: James
| Timmins
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: annotate, | Triage Stage: Accepted
annotation, filter, groupby, |
aggregate, aggregation |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

I think that we should only default to `pk` if no fields are selected. The
ORM has preliminary support for multi-column lookups and other interface
dealing with subqueries doesn't prevent passing queries with multiple
fields so I'd stick to the current `__in` lookup behavior.

--
Ticket URL: <https://code.djangoproject.com/ticket/30771#comment:8>

Django

unread,
Sep 20, 2019, 1:35:18 AM9/20/19
to django-...@googlegroups.com
#30771: Filtering on query result overrides GROUP BY of internal query
-------------------------------------+-------------------------------------
Reporter: Aur Saraf | Owner: James
| Timmins
Type: Bug | Status: assigned
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: annotate, | Triage Stage: Accepted
annotation, filter, groupby, |
aggregate, aggregation |
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by James Timmins):

* has_patch: 0 => 1


Comment:

The pull request can be found at the following link:
https://github.com/django/django/pull/11797

--
Ticket URL: <https://code.djangoproject.com/ticket/30771#comment:9>

Django

unread,
Sep 20, 2019, 5:15:40 AM9/20/19
to django-...@googlegroups.com
#30771: Filtering on query result overrides GROUP BY of internal query
-------------------------------------+-------------------------------------
Reporter: Aur Saraf | Owner: James
| Timmins
Type: Bug | Status: closed

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed

Keywords: annotate, | Triage Stage: Accepted
annotation, filter, groupby, |
aggregate, aggregation |
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak <felisiak.mariusz@…>):

In [changeset:"0719edcd5fed56157ffb3323a8f634aa5e8f9a80" 0719edc]:
{{{
#!CommitTicketReference repository=""
revision="0719edcd5fed56157ffb3323a8f634aa5e8f9a80"
Fixed #30771 -- Fixed exact lookup against queries with selected columns.

Use pre-existing select fields (and thereby GROUP BY fields) from
subquery if they were specified, instead of always defaulting to pk.

Thanks Aur Saraf for the report and Simon Charette for guidance.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/30771#comment:11>

Django

unread,
Sep 20, 2019, 5:15:40 AM9/20/19
to django-...@googlegroups.com
#30771: Filtering on query result overrides GROUP BY of internal query
-------------------------------------+-------------------------------------
Reporter: Aur Saraf | Owner: James
| Timmins
Type: Bug | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: annotate, | Triage Stage: Accepted
annotation, filter, groupby, |
aggregate, aggregation |
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak <felisiak.mariusz@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"3697ddbf7526e85483067db5a2fde93f661da360" 3697ddb]:
{{{
#!CommitTicketReference repository=""
revision="3697ddbf7526e85483067db5a2fde93f661da360"
[3.0.x] Fixed #30771 -- Fixed exact lookup against queries with selected
columns.

Use pre-existing select fields (and thereby GROUP BY fields) from
subquery if they were specified, instead of always defaulting to pk.

Thanks Aur Saraf for the report and Simon Charette for guidance.

Backport of 0719edcd5fed56157ffb3323a8f634aa5e8f9a80 from master
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/30771#comment:10>

Django

unread,
Sep 20, 2019, 4:02:31 PM9/20/19
to django-...@googlegroups.com
#30771: Filtering on query result overrides GROUP BY of internal query
-------------------------------------+-------------------------------------
Reporter: Aur Saraf | Owner: James
| Timmins
Type: Bug | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: annotate, | Triage Stage: Accepted
annotation, filter, groupby, |
aggregate, aggregation |
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by GitHub <noreply@…>):

In [changeset:"1611906094f8583371167eb6184c0f6762e13c2e" 1611906]:
{{{
#!CommitTicketReference repository=""
revision="1611906094f8583371167eb6184c0f6762e13c2e"
[3.0.x] Refs #30771 -- Fixed RemovedInDjango31Warning in
test_exact_query_rhs_with_selected_columns.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/30771#comment:12>

Reply all
Reply to author
Forward
0 new messages