Re: [Django] #34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on" on MySQL/MariaDB.

24 views
Skip to first unread message

Django

unread,
Nov 20, 2023, 4:05:41 AM11/20/23
to django-...@googlegroups.com
#34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on"
on MySQL/MariaDB.
-------------------------------------+-------------------------------------
Reporter: Matthew Somerville | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 4.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by David Sanders):

Failing test in case that helps:

{{{
--- a/tests/aggregation/tests.py
+++ b/tests/aggregation/tests.py
@@ -2135,6 +2135,39 @@ class AggregateTestCase(TestCase):
)
self.assertEqual(list(author_qs), [337])

+ def test_annotate_raw_expression(self):
+ qs = (
+ Book.objects.values("publisher")
+ .annotate(min_price=Min("price"))
+ .annotate(max_price=RawSQL("MAX(price)", params=[]))
+ .values("publisher__name", "min_price", "max_price")
+ )
+ self.assertEqual(
+ list(qs),
+ [
+ {
+ "max_price": Decimal("30.00"),
+ "min_price": Decimal("29.69"),
+ "publisher__name": "Apress",
+ },
+ {
+ "max_price": Decimal("23.09"),
+ "min_price": Decimal("23.09"),
+ "publisher__name": "Sams",
+ },
+ {
+ "max_price": Decimal("82.80"),
+ "min_price": Decimal("29.69"),
+ "publisher__name": "Prentice Hall",
+ },
+ {
+ "max_price": Decimal("75.00"),
+ "min_price": Decimal("75.00"),
+ "publisher__name": "Morgan Kaufmann",
+ },
+ ],
+ )
+

class AggregateAnnotationPruningTests(TestCase):
@classmethod
}}}

This one's interesting because we can't dig into RawSQL to determine
whether it's valid in a GROUP BY or not.

--
Ticket URL: <https://code.djangoproject.com/ticket/34978#comment:2>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Nov 20, 2023, 9:18:29 AM11/20/23
to django-...@googlegroups.com
#34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on"
on MySQL/MariaDB.
-------------------------------------+-------------------------------------
Reporter: Matthew Somerville | Owner: Simon
| Charette
Type: Bug | Status: assigned

Component: Database layer | Version: 4.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* owner: nobody => Simon Charette
* status: new => assigned


--
Ticket URL: <https://code.djangoproject.com/ticket/34978#comment:3>

Django

unread,
Nov 20, 2023, 10:23:03 AM11/20/23
to django-...@googlegroups.com
#34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on"
on MySQL/MariaDB.
-------------------------------------+-------------------------------------
Reporter: Matthew Somerville | Owner: Simon
| Charette
Type: Bug | Status: assigned
Component: Database layer | Version: 4.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

See #26602 ''Provide a way to manage grouping with RawSQL'' which was
closed because of a justified use case.

I would say that the same can be said about this ticket as the reported
problem can be fixed in multiple ways with the provided ORM interfaces.

The most obvious and non-invasive one is to use `Min(RawSQL(...))` instead

{{{#!python
annotate(
best_date=Min(
RawSQL(
'IFNULL(productions_place.press_date,
IF(productions_place.end_date!="", productions_place.end_date,
productions_place.start_date))',
(),
),
)
)
}}}

a second one is to use expressions entirely

{{{#!python
annotate(best_date=Min(Coalesce("press_date", Case(When(end_date="",
then=F("start_date")), default=F("end_date"))))
}}}

a third one, assuming the user wants to stick to `IF` and `IFNULL`

{{{#!python
from django.db.models import Func

class If(Func):
function = "IF"

class IfNull(Func):
function = "IFNULL"


annotate(best_date=Min(
IfNull("press_date", If(end_date="", "start_date", "end_date"))
))
}}}

-----

> This one's interesting because we can't dig into RawSQL to determine
whether it's valid in a GROUP BY or not.

That's right David, it's the crux of the issue.

The reason why 041551d716b69ee7c81199eee86a2d10a72e15ab broke the reported
use that is that prior to this change the ORM supported a non-standard
feature of MySQL [https://dev.mysql.com/doc/refman/8.0/en/sql-mode.html
disabled in recent versions] that allowed grouping solely by the primary
key of the first entry in `FROM`. It's important to note that using
`RawSQL` to aggregate was only working on MySQL due to this feature and
never worked on any of the other backends that follow the functional
dependency detection in `GROUP BY` clauses as specified by the `SQL:1999`
standard.

The nail in the coffin of this feature was went it was discovered that it
had a peculiarity when dealing with subqueries #31331 that would have
required a significant amount of work to get working.

I could see us go three ways about dealing with this issue

1. Revert the changes made in 041551d716b69ee7c81199eee86a2d10a72e15ab
while making `allows_group_by_pk` based on the absence of
`ONLY_FULL_GROUP_BY`. Note that this won't resolve the aggregation over
the annotation of a dependant subquery but will restore the usage of
`RawSQL` for aggregation on MySQL only when `ONLY_FULL_GROUP_BY` is
disabled.
2. 1 + adjustments to the `allows_group_by_pk` to special case dependant
subquery annotations
3. Adjust the 4.2 existing release notes about this change to let them
know that this version of Django removed support for doing `RawSQL`
aggregations on MySQL and that they should use proper expressions instead
going forward.

Due to lack of demonstration that some aggregates or window function
cannot be expressed using the advanced ORM primitives, that
non-`ONLY_FULL_GROUP_BY` model is a non standard MySQL feature that is not
enabled by default since 8.0, and that this change happen to standardize
the usage of `RawSQL` for aggregations on all backends I'd be inclined to
go with 3.

--
Ticket URL: <https://code.djangoproject.com/ticket/34978#comment:4>

Django

unread,
Nov 20, 2023, 12:39:39 PM11/20/23
to django-...@googlegroups.com
#34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on"
on MySQL/MariaDB.
-------------------------------------+-------------------------------------
Reporter: Matthew Somerville | Owner: Simon
| Charette
Type: Bug | Status: assigned
Component: Database layer | Version: 4.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Natalia Bidart):

Replying to [comment:4 Simon Charette]:


> I could see us go three ways about dealing with this issue
>
> 1. Revert the changes made in 041551d716b69ee7c81199eee86a2d10a72e15ab
while making `allows_group_by_pk` based on the absence of
`ONLY_FULL_GROUP_BY`. Note that this won't resolve the aggregation over
the annotation of a dependant subquery but will restore the usage of
`RawSQL` for aggregation on MySQL only when `ONLY_FULL_GROUP_BY` is
disabled.
> 2. 1 + adjustments to the `allows_group_by_pk` to special case dependant
subquery annotations

> 3. Adjust the 4.2 existing release notes about this change to better
communicate that this version of Django removed support for doing `RawSQL`


aggregations on MySQL and that they should use proper expressions instead
going forward.

Given your rationale and considering that there are multiple workarounds
as you proposed, I'm also in favor of option 3.

--
Ticket URL: <https://code.djangoproject.com/ticket/34978#comment:5>

Django

unread,
Nov 20, 2023, 1:18:37 PM11/20/23
to django-...@googlegroups.com
#34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on"
on MySQL/MariaDB.
-------------------------------------+-------------------------------------
Reporter: Matthew Somerville | Owner: Simon
| Charette
Type: Bug | Status: assigned
Component: Database layer | Version: 4.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Matthew Somerville):

Given the same query does already use `Min()`, as quoted, I am at a loss
as to why I wasn't using it in the other part! Thanks for the
investigation; an addition to the 4.2 release notes explaining the change
further (I did read those, but only saw the reference to changes in
"third-party database backends") would be welcome, thank you.

With the change to your first option¹, my code passes on Django 4.2 fine
with `ONLY_FULL_GROUP_BY` turned off, thank you; turning that option on I
get a lot of 1055 errors, even with a query that's only e.g.
`Play.objects.annotate(Count('authors'))`, without any RawSQL, I get
`(1055, "'theatricalia.plays_play.title' isn't in GROUP BY")`, but assume
that's my issue somehow.

¹ If you're interested, regarding your second/third code change options,
press_date is a DateField but start_date/end_date are
ApproximateDateFields from my https://pypi.org/project/django-date-
extensions/ so that becomes a bit more complex.

--
Ticket URL: <https://code.djangoproject.com/ticket/34978#comment:6>

Django

unread,
Nov 20, 2023, 3:33:24 PM11/20/23
to django-...@googlegroups.com
#34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on"
on MySQL/MariaDB.
-------------------------------------+-------------------------------------
Reporter: Matthew Somerville | Owner: Simon
| Charette
Type: Bug | Status: assigned
Component: Database layer | Version: 4.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

> I get a lot of 1055 errors, even with a query that's only e.g.

`Play.objects.annotate(Count('authors'))`, without any `RawSQL`, I get


`(1055, "'theatricalia.plays_play.title' isn't in GROUP BY")`, but assume
that's my issue somehow.

That's interesting. If it's happening for models of the form

{{{#!python
class Author(models.Model):
pass

class Play(models.Model):
title = models.CharField()
authors = models.ManyToManyField(Author)
}}}

I would expect `Play.objects.annotate(Count('authors'))` to generate

{{{#!sql
SELECT play.id, play.name, COUNT(author.id)
FROM play
LEFT JOIN play_authors ON (play_authors.play_id = play.id)
LEFT JOIN author ON (play_authors.author_id = author.id)
GROUP BY play.id
}}}

And by [https://dev.mysql.com/doc/refman/8.0/en/group-by-handling.html
MySQL docs]

> MySQL implements detection of functional dependence. If the
`ONLY_FULL_GROUP_BY` SQL mode is enabled (which it is by default), MySQL
rejects queries for which the select list, HAVING condition, or ORDER BY
list refer to nonaggregated columns that are neither named in the `GROUP
BY` clause '''nor are functionally dependent on them'''.

So in this case `play.name` ''is'' functionally dependant on `play.id` (as
it's the primary key of `play`) so if you're using a version of MySQL
supported on Django 4.2 we'd definitely like to learn more about it as
it's unexpected.

--
Ticket URL: <https://code.djangoproject.com/ticket/34978#comment:7>

Django

unread,
Nov 20, 2023, 7:12:56 PM11/20/23
to django-...@googlegroups.com
#34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on"
on MySQL/MariaDB.
-------------------------------------+-------------------------------------
Reporter: Matthew Somerville | Owner: Simon
| Charette
Type: Bug | Status: assigned
Component: Database layer | Version: 4.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by David Sanders):

Replying to [comment:4 Simon Charette]:

> 1. Revert the changes made in 041551d716b69ee7c81199eee86a2d10a72e15ab
while making `allows_group_by_pk` based on the absence of
`ONLY_FULL_GROUP_BY`. Note that this won't resolve the aggregation over
the annotation of a dependant subquery but will restore the usage of
`RawSQL` for aggregation on MySQL only when `ONLY_FULL_GROUP_BY` is
disabled.
> 2. 1 + adjustments to the `allows_group_by_pk` to special case dependant
subquery annotations
> 3. Adjust the 4.2 existing release notes about this change to better
communicate that this version of Django removed support for doing `RawSQL`
aggregations on MySQL and that they should use proper expressions instead
going forward.

We could also have a user-definable attribute `RawSQL.contains_aggregates`
though I think that's making things too complex.

Option 3 sounds good 👍

--
Ticket URL: <https://code.djangoproject.com/ticket/34978#comment:8>

Django

unread,
Nov 21, 2023, 2:05:34 AM11/21/23
to django-...@googlegroups.com
#34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on"
on MySQL/MariaDB.
-------------------------------------+-------------------------------------
Reporter: Matthew Somerville | Owner: Simon
| Charette
Type: Bug | Status: assigned
Component: Database layer | Version: 4.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak):

Agreed, let's document it.

--
Ticket URL: <https://code.djangoproject.com/ticket/34978#comment:9>

Django

unread,
Nov 22, 2023, 1:14:02 PM11/22/23
to django-...@googlegroups.com
#34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on"
on MySQL/MariaDB.
-------------------------------------+-------------------------------------
Reporter: Matthew Somerville | Owner: Simon
| Charette
Type: Bug | Status: assigned
Component: Database layer | Version: 4.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Matthew Somerville):

> So in this case play.name is functionally dependant on play.id (as it's

the primary key of play) so if you're using a version of MySQL supported


on Django 4.2 we'd definitely like to learn more about it as it's
unexpected.

So it turns out the database is MariaDB (11.1.2), not MySQL, and MariaDB
does not appear to include/have the functional dependency requirement that
this is based on. I don't know if you'd like me to raise that as a
separate ticket, if Django is supposed to support both entirely equally,
but yes, it looks like the code will not work at all with MariaDB with
ONLY_FULL_GROUP_BY turned on.

--
Ticket URL: <https://code.djangoproject.com/ticket/34978#comment:10>

Django

unread,
Nov 22, 2023, 3:18:09 PM11/22/23
to django-...@googlegroups.com
#34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on"
on MySQL/MariaDB.
-------------------------------------+-------------------------------------
Reporter: Matthew Somerville | Owner: Simon
| Charette
Type: Bug | Status: assigned
Component: Database layer | Version: 4.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

> it looks like the code will not work at all with MariaDB with
ONLY_FULL_GROUP_BY turned on.

That's good to know thanks for investigating further that's appreciated! I
think it's worth having a separate ticket for it yes. The solution will
likely be to turn off the `allows_group_by_selected_pks` feature on
MariaDB when `ONLY_FULL_GROUP_BY` mode is turned on. Note that the
`allows_group_by_selected_pks` feature is different from the
`allows_group_by_pk` feature removed in
041551d716b69ee7c81199eee86a2d10a72e15ab.

--
Ticket URL: <https://code.djangoproject.com/ticket/34978#comment:11>

Django

unread,
Nov 23, 2023, 4:57:42 AM11/23/23
to django-...@googlegroups.com
#34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on"
on MySQL/MariaDB.
-------------------------------------+-------------------------------------
Reporter: Matthew Somerville | Owner: Simon
| Charette
Type: Bug | Status: assigned
Component: Database layer | Version: 4.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Matthew Somerville):

Have opened #34992. Thanks :)

--
Ticket URL: <https://code.djangoproject.com/ticket/34978#comment:12>

Django

unread,
Nov 27, 2023, 5:52:46 AM11/27/23
to django-...@googlegroups.com
#34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on"
on MySQL/MariaDB.
-------------------------------------+-------------------------------------
Reporter: Matthew Somerville | Owner: Mariusz
| Felisiak

Type: Bug | Status: assigned
Component: Database layer | Version: 4.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* owner: Simon Charette => Mariusz Felisiak
* has_patch: 0 => 1


Comment:

[https://github.com/django/django/pull/17532 PR]

--
Ticket URL: <https://code.djangoproject.com/ticket/34978#comment:13>

Django

unread,
Nov 27, 2023, 6:51:35 AM11/27/23
to django-...@googlegroups.com
#34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on"
on MySQL/MariaDB.
-------------------------------------+-------------------------------------
Reporter: Matthew Somerville | Owner: Mariusz
| Felisiak
Type: Bug | Status: assigned
Component: Database layer | Version: 4.2
(models, ORM) |
Severity: Release blocker | Resolution:
Keywords: | Triage Stage: Ready for
| checkin

Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Natalia Bidart):

* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/34978#comment:14>

Django

unread,
Nov 27, 2023, 10:43:45 AM11/27/23
to django-...@googlegroups.com
#34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on"
on MySQL/MariaDB.
-------------------------------------+-------------------------------------
Reporter: Matthew Somerville | Owner: Mariusz
| Felisiak
Type: Bug | Status: closed

Component: Database layer | Version: 4.2
(models, ORM) |
Severity: Release blocker | Resolution: fixed

Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak <felisiak.mariusz@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"a652f0759651dd7103ed04336ef85dc410f680c1" a652f07]:
{{{
#!CommitTicketReference repository=""
revision="a652f0759651dd7103ed04336ef85dc410f680c1"
Fixed #34978, Refs #31331 -- Added backward incompatibility note about raw
aggregations on MySQL.

Thanks Matthew Somerville for the report.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/34978#comment:15>

Django

unread,
Nov 27, 2023, 10:45:12 AM11/27/23
to django-...@googlegroups.com
#34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on"
on MySQL/MariaDB.
-------------------------------------+-------------------------------------
Reporter: Matthew Somerville | Owner: Mariusz
| Felisiak
Type: Bug | Status: closed
Component: Database layer | Version: 4.2
(models, ORM) |
Severity: Release blocker | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Natalia <124304+nessita@…>):

In [changeset:"cdb14cc18bbfc7c32a6139137fe0151875e1c92e" cdb14cc1]:
{{{
#!CommitTicketReference repository=""
revision="cdb14cc18bbfc7c32a6139137fe0151875e1c92e"
[4.2.x] Fixed #34978, Refs #31331 -- Added backward incompatibility note


about raw aggregations on MySQL.

Thanks Matthew Somerville for the report.

Backport of a652f0759651dd7103ed04336ef85dc410f680c1 from main
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/34978#comment:16>

Django

unread,
Nov 27, 2023, 10:45:50 AM11/27/23
to django-...@googlegroups.com
#34978: Annotating through an aggregate with RawSQL() raises 1056 "Can't group on"
on MySQL/MariaDB.
-------------------------------------+-------------------------------------
Reporter: Matthew Somerville | Owner: Mariusz
| Felisiak
Type: Bug | Status: closed
Component: Database layer | Version: 4.2
(models, ORM) |
Severity: Release blocker | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Natalia <124304+nessita@…>):

In [changeset:"cbd1e913ef9fd9ef3a59a8607b2c9ccc34a64db3" cbd1e91]:
{{{
#!CommitTicketReference repository=""
revision="cbd1e913ef9fd9ef3a59a8607b2c9ccc34a64db3"
[5.0.x] Fixed #34978, Refs #31331 -- Added backward incompatibility note


about raw aggregations on MySQL.

Thanks Matthew Somerville for the report.

Backport of a652f0759651dd7103ed04336ef85dc410f680c1 from main
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/34978#comment:17>

Reply all
Reply to author
Forward
0 new messages