A breaking test is shown in:
https://github.com/igorpejic/django/pull/1/files
From investigation this is related to:
https://code.djangoproject.com/ticket/31094
and it seems the regression was introduced in:
https://github.com/django/django/commit/fb3f034f1c63160c0ff13c609acd01c18be12f80
It seems that the combination of the "double OuterRef" with the "Case"
logic is causing the problem.
Query looks like:
{{{
books_with_same_name_as_country = Book.objects.filter(
id__in=Subquery(
Book.objects.filter(
name=OuterRef(OuterRef('country__name')),
).values('id')
)
).values('id')[:1]
books_breakdown = Publisher.objects.annotate(total_books=Case(
When(
num_awards__gte=2,
then=Subquery(books_with_same_name_as_country,
IntegerField())
),
When(
num_awards__lt=0,
then=Count('country__publishers')
),
))
}}}
Stack trace:
{{{
======================================================================
ERROR: test_group_by_subquery_annotation_with_conditional
(aggregation.tests.AggregateTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/home/igor/django-repo/django/db/backends/utils.py", line 86, in
_execute
return self.cursor.execute(sql, params)
psycopg2.errors.GroupingError: subquery uses ungrouped column
"aggregation_country.name" from outer query
LINE 1: ..."id" FROM "aggregation_book" U0 WHERE U0."name" = "aggregati...
^
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/igor/django-repo/django/test/testcases.py", line 1229, in
skip_wrapper
return test_func(*args, **kwargs)
File "/home/igor/django-repo/tests/aggregation/tests.py", line 1299, in
test_group_by_subquery_annotation_with_conditional
self.assertEqual(books_breakdown.get(id=self.p1.id).total_books, 1)
File "/home/igor/django-repo/django/db/models/query.py", line 411, in
get
num = len(clone)
File "/home/igor/django-repo/django/db/models/query.py", line 258, in
__len__
self._fetch_all()
File "/home/igor/django-repo/django/db/models/query.py", line 1261, in
_fetch_all
self._result_cache = list(self._iterable_class(self))
File "/home/igor/django-repo/django/db/models/query.py", line 57, in
__iter__
results = compiler.execute_sql(chunked_fetch=self.chunked_fetch,
chunk_size=self.chunk_size)
File "/home/igor/django-repo/django/db/models/sql/compiler.py", line
1154, in execute_sql
cursor.execute(sql, params)
File "/home/igor/django-repo/django/db/backends/utils.py", line 68, in
execute
return self._execute_with_wrappers(sql, params, many=False,
executor=self._execute)
File "/home/igor/django-repo/django/db/backends/utils.py", line 77, in
_execute_with_wrappers
return executor(sql, params, many, context)
File "/home/igor/django-repo/django/db/backends/utils.py", line 86, in
_execute
return self.cursor.execute(sql, params)
File "/home/igor/django-repo/django/db/utils.py", line 90, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/home/igor/django-repo/django/db/backends/utils.py", line 86, in
_execute
return self.cursor.execute(sql, params)
django.db.utils.ProgrammingError: subquery uses ungrouped column
"aggregation_country.name" from outer query
LINE 1: ..."id" FROM "aggregation_book" U0 WHERE U0."name" = "aggregati...
^
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/32478>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* component: Uncategorized => Database layer (models, ORM)
--
Ticket URL: <https://code.djangoproject.com/ticket/32478#comment:1>
* cc: Igor Pejic (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/32478#comment:2>
* type: Uncategorized => Bug
--
Ticket URL: <https://code.djangoproject.com/ticket/32478#comment:3>
* owner: nobody => Simon Charette
* status: new => assigned
* stage: Unreviewed => Accepted
Comment:
That looks like another case of #31094 but generalized for expressions
that mix aggregates and subqueries.
`Case.get_group_by_cols()` defers to `BaseExpression.get_group_by_cols()`
when it has `cases` and the latter immediately returns `self`
[https://github.com/django/django/blob/19ce1d493ae7cbc1e704d338077b1f5f5e5769c9/django/db/models/expressions.py#L349-L355
when it contains aggregates].
Igor, could you include the SQL that uses to be generated and what the new
one looks like?
--
Ticket URL: <https://code.djangoproject.com/ticket/32478#comment:4>
Old description:
New description:
Query looks like:
Stack trace:
Full query 2.2:
{{{
SELECT "aggregation_regress_publisher"."id",
"aggregation_regress_publisher"."name",
"aggregation_regress_publisher"."num_awards",
"aggregation_regress_publisher"."country_id",
CASE
WHEN "aggregation_regress_publisher"."num_awards" >= 2 THEN
(SELECT V0."id"
FROM "aggregation_regress_book" V0
WHERE V0."id" IN
(SELECT U0."id"
FROM "aggregation_regress_book" U0
WHERE U0."name" =
("aggregation_regress_country"."name")
ORDER BY U0."name" ASC)
ORDER BY V0."name" ASC
LIMIT 1)
WHEN "aggregation_regress_publisher"."num_awards" < 0 THEN
COUNT(T3."id")
ELSE NULL
END AS "total_books"
FROM "aggregation_regress_publisher"
INNER JOIN "aggregation_regress_country" ON
("aggregation_regress_publisher"."country_id" =
"aggregation_regress_country"."id")
LEFT OUTER JOIN "aggregation_regress_publisher" T3 ON
("aggregation_regress_country"."id" = T3."country_id")
GROUP BY "aggregation_regress_publisher"."id",
(SELECT V0."id"
FROM "aggregation_regress_book" V0
WHERE V0."id" IN
(SELECT U0."id"
FROM "aggregation_regress_book" U0
WHERE U0."name" = ("aggregation_regress_country"."name")
ORDER BY U0."name" ASC)
ORDER BY V0."name" ASC
LIMIT 1)
}}}
Full query 3.0:
{{{
SELECT "aggregation_publisher"."id",
"aggregation_publisher"."name",
"aggregation_publisher"."num_awards",
"aggregation_publisher"."duration",
"aggregation_publisher"."country_id",
CASE
WHEN "aggregation_publisher"."num_awards" >= 2 THEN
(SELECT V0."id"
FROM "aggregation_book" V0
WHERE V0."id" IN
(SELECT U0."id"
FROM "aggregation_book" U0
WHERE U0."name" = "aggregation_country"."name")
LIMIT 1)
WHEN "aggregation_publisher"."num_awards" < 0 THEN
COUNT(T3."id")
ELSE NULL
END AS "total_books"
FROM "aggregation_publisher"
INNER JOIN "aggregation_country" ON ("aggregation_publisher"."country_id"
= "aggregation_country"."id")
LEFT OUTER JOIN "aggregation_publisher" T3 ON ("aggregation_country"."id"
= T3."country_id")
GROUP BY "aggregation_publisher"."id"
}}}
--
--
Ticket URL: <https://code.djangoproject.com/ticket/32478#comment:5>
Comment (by Igor Pejic):
Thanks. I edited the main description with the query in 3.0 and 2.2.
The main difference is in the last argument of the group by.
--
Ticket URL: <https://code.djangoproject.com/ticket/32478#comment:6>
Comment (by Mariusz Felisiak):
I confirmed that it's a regression in
fb3f034f1c63160c0ff13c609acd01c18be12f80.
--
Ticket URL: <https://code.djangoproject.com/ticket/32478#comment:7>
* has_patch: 0 => 1
Comment:
Igor, could you confirm your issue is resolved by
https://github.com/django/django/pull/14039.
The underlying problem was that nested external column references were not
recursively collected so the problem detailed in #31094 could manifest
itself when mixing aggregation with a nested subqueries because
`Query.get_external_cols()` would not account for possibly nested
subqueries with external column references.
--
Ticket URL: <https://code.djangoproject.com/ticket/32478#comment:8>
Comment (by Igor Pejic):
I can confirm the fix works for our repository when using it with 3.0.12
Thank you for this fix and all other work Simon.
--
Ticket URL: <https://code.djangoproject.com/ticket/32478#comment:9>
* stage: Accepted => Ready for checkin
--
Ticket URL: <https://code.djangoproject.com/ticket/32478#comment:10>
Comment (by Mariusz Felisiak):
Thanks Igor.
--
Ticket URL: <https://code.djangoproject.com/ticket/32478#comment:11>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"277eea8fcced7f04f3800617f189beb349a3212e" 277eea8f]:
{{{
#!CommitTicketReference repository=""
revision="277eea8fcced7f04f3800617f189beb349a3212e"
Fixed #32478 -- Included nested columns referenced by subqueries in GROUP
BY on aggregations.
Regression in fb3f034f1c63160c0ff13c609acd01c18be12f80.
Refs #31094, #31150.
Thanks Igor Pejic for the report.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/32478#comment:12>
Comment (by Mariusz Felisiak <felisiak.mariusz@…>):
In [changeset:"7a6ca01f4eb990772080e2978f19d83484184b04" 7a6ca01]:
{{{
#!CommitTicketReference repository=""
revision="7a6ca01f4eb990772080e2978f19d83484184b04"
[3.2.x] Fixed #32478 -- Included nested columns referenced by subqueries
in GROUP BY on aggregations.
Regression in fb3f034f1c63160c0ff13c609acd01c18be12f80.
Refs #31094, #31150.
Thanks Igor Pejic for the report.
Backport of 277eea8fcced7f04f3800617f189beb349a3212e from master
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/32478#comment:13>