[Django] #34262: Queryset grouped by annotation with aggregates on another annotated expression crashes on MySQL with sql_mode=only_full_group_by.

96 views
Skip to first unread message

Django

unread,
Jan 16, 2023, 1:32:18 PM1/16/23
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz | Owner: nobody
Felisiak |
Type: Bug | Status: new
Component: Database | Version: 4.1
layer (models, ORM) | Keywords: mysql
Severity: Normal | only_full_group_by
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
Queryset grouped by annotation with aggregates on another annotated
expression crashed on MySQL with `sql_mode=only_full_group_by`, e.g.
{{{#!python
def test_group_by_nested_expression_with_params(self):
books_qs = (
Book.objects.annotate(greatest_pages=Greatest("pages",
Value(600)))
.values(
"greatest_pages",
)
.annotate(
min_pages=Min("pages"),
least=Least("min_pages", "greatest_pages"),
)
.values_list("least", flat=True)
)
self.assertCountEqual(books_qs, [300, 946, 1132])
}}}
crashes with:
{{{
django.db.utils.OperationalError: (1055, "Expression #1 of SELECT list is
not in GROUP BY clause and contains nonaggregated column
'test_django_2.aggregation_book.pages' which is not functionally dependent
on columns in GROUP BY clause; this is incompatible with
sql_mode=only_full_group_by")
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/34262>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Jan 17, 2023, 12:00:24 AM1/17/23
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage:
only_full_group_by | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* cc: Simon Charette (added)


--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:1>

Django

unread,
Jan 17, 2023, 2:27:51 AM1/17/23
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Carlton Gibson):

* stage: Unreviewed => Accepted


--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:2>

Django

unread,
Jan 24, 2023, 4:56:08 PM1/24/23
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by David Wobrock):

* cc: David Wobrock (added)


Comment:

Hey there,

Took a look at what is happening and why MySQL is failing with
`ONLY_FULL_GROUP_BY`.

In short and simplified, this statement works:
{{{
mysql> SELECT GREATEST(pages, 600), MIN(pages) FROM aggregation_book GROUP
BY GREATEST(pages, 600) ORDER BY NULL;
+----------------------+------------+
| GREATEST(pages, 600) | MIN(pages) |
+----------------------+------------+
| 600 | 300 |
| 1132 | 1132 |
| 946 | 946 |
+----------------------+------------+
3 rows in set (0,01 sec)
}}}

And when you try to add a third expression, that uses the two first:
{{{
mysql> SELECT GREATEST(pages, 600), MIN(pages), LEAST(MIN(pages),
GREATEST(pages, 600)) AS least FROM aggregation_book GROUP BY
GREATEST(pages, 600) ORDER BY NULL;
ERROR 1055 (42000): Expression #3 of SELECT list is not in GROUP BY clause
and contains nonaggregated column
'test_django_tests.aggregation_book2.pages' which is not functionally


dependent on columns in GROUP BY clause; this is incompatible with
sql_mode=only_full_group_by
}}}

MySQL is not happy, even though it seems rather straightforward that this
query should work.

The resources I could find on the topic are here:
- [https://bugs.mysql.com/bug.php?id=90792 A ticket from the MySQL bug
tracker]
- [https://stackoverflow.com/q/49447622 A related StackOverflow thread]
- [https://dev.mysql.com/blog-archive/when-only_full_group_by-wont-see-
the-query-is-deterministic/ A MySQL blog post]

And the blog post explains in more depth why it's not working.

-------

That leaves us with a choice to make for Django's behavior I reckon :)
Some options:

== 1. Add an `ANY_VALUE` around the problematic expression

That solves the issue here for instance:
{{{
mysql> SELECT GREATEST(pages, 600), MIN(pages),
ANY_VALUE(LEAST(MIN(pages), GREATEST(pages, 600))) AS least FROM
aggregation_book2 GROUP BY GREATEST(pages, 600) ORDER BY NULL;
+----------------------+------------+-------+
| GREATEST(pages, 600) | MIN(pages) | least |
+----------------------+------------+-------+
| 600 | 300 | 300 |
| 1132 | 1132 | 1132 |
| 946 | 946 | 946 |
+----------------------+------------+-------+
3 rows in set (0,00 sec)
}}}
However, I fear that detecting when to wrap the expression with an
`ANY_VALUE` is a rabbit hole we don't want to go down, as we might end up
implementing what the MySQL team didn't want to implement.


== 2. Raise awareness

We could, firstly, document the potential issue, and secondly raise a
warning when such an error occurs when executing a query a Django.
That way, users are at least aware that's not entirely their or Django's
fault.


== 3. Generally change query generation

Another type of workaround suggested by the [https://dev.mysql.com/blog-
archive/when-only_full_group_by-wont-see-the-query-is-deterministic/ MySQL
blog post] is to use a subquery/derived table:
{{{
mysql> SELECT greatest_pages,
MIN(pages),
LEAST(MIN(pages), greatest_pages) AS least
FROM (SELECT GREATEST(pages, 600) greatest_pages,
pages
FROM aggregation_book2) AS t
GROUP BY greatest_pages
ORDER BY NULL;
+----------------+------------+-------+
| greatest_pages | MIN(pages) | least |
+----------------+------------+-------+
| 600 | 300 | 300 |
| 1132 | 1132 | 1132 |
| 946 | 946 | 946 |
+----------------+------------+-------+
3 rows in set (0,00 sec)
}}}
So that we always try to group on a column, and not an expression.
Even though, it might be worse in terms of performances, depending the DB
implementation I guess.

This change would then affect all databases I reckon, which is a much
larger change, and therefore riskier.


== 4. Any other option! :D

I hope all of this makes sense, happy to read any thoughts on this :)
See ya!

--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:3>

Django

unread,
Jan 24, 2023, 5:15:14 PM1/24/23
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

Thanks for the analysis David, option 1. and 2. were also the conclusion
of [https://github.com/django/django/pull/16460#discussion_r1071497499 my
limited investigation on the subject] so it's great to have cross peer
validation on the subject.

I think there might be a way to implement 3. by reusing
[https://github.com/django/django/blob/d3c93cdc597e0efc2815111c04dd5a427432ed37/django/db/models/sql/compiler.py#L669-L680
some of the logic used to implement masking of columns when filtering
against window functions] which requires two level of subquery wrapping.

A different of approaching 3. is to think that any form of ''masking'' of
annotations/aliases used for grouping purposes would result in a subquery
pushdown. So to reuse your example, instead of performing a subquery
pushdown to compute expressions used with aggregate queries we'd do it the
other way around to have MySQL group against top level `SELECT`
expressions which it's good at inferring dependencies from

{{{#!sql
SELECT LEAST(min_pages, greatest_pages) AS `least` FROM (
SELECT
GREATEST(`aggregation_book`.`pages`, 600) greatest_pages,
MIN(`aggregation_book`.`pages`) min_pages
FROM `aggregation_book`
GROUP BY 1 ORDER BY NULL
) masked
}}}

This should reduce the area of impact of this change to aggregating
queries that group against a value that isn't explicitly selected and
would also happen to solve the last remaining known issue with using
server-side parameters binding for Postgres (#34255).

--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:4>

Django

unread,
Jul 14, 2023, 11:02:23 AM7/14/23
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Amir Karimi):

Replying to [comment:4 Simon Charette]:

I'm curios to know what happened with this issue. Any updates?

--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:5>

Django

unread,
Jul 14, 2023, 12:06:28 PM7/14/23
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak):

Replying to [comment:5 Amir Karimi]:


> I'm curios to know what happened with this issue. Any updates?

Feel-free to work on this issue. Please don't leave comments like `any
updates?` they don't help and cause unnecessary noise to the history.

--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:6>

Django

unread,
Jul 24, 2023, 2:59:03 AM7/24/23
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: Jonny
| Park
Type: Bug | Status: assigned

Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Jonny Park):

* owner: nobody => Jonny Park
* status: new => assigned


--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:7>

Django

unread,
Aug 12, 2023, 5:11:50 AM8/12/23
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: Jonny
| Park
Type: Bug | Status: assigned
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Jonny Park):

I think @David Wobrock's query is easier to implement and covers more
cases.

For example, if we have a followinfg queryset:

{{{


books_qs = (
Book.objects.annotate(greatest_pages=Greatest("pages",
Value(600)))
.values(
"greatest_pages",
)
.annotate(
min_pages=Min("pages"),
least=Least("min_pages", "greatest_pages"),
)

)
}}}

Creating the following query that @David Wobrock presented seems like more
sense to me and covers many other cases.

{{{


SELECT greatest_pages,
MIN(pages),
LEAST(MIN(pages), greatest_pages) AS least
FROM (SELECT GREATEST(pages, 600) greatest_pages,
pages
FROM aggregation_book2) AS t
GROUP BY greatest_pages
ORDER BY NULL;
}}}

If we were to take @Simon Charette's query, it could be like this:

{{{
SELECT geatest_pages, min_pages, LEAST(min_pages, greatest_pages) AS


`least` FROM (
SELECT
GREATEST(`aggregation_book`.`pages`, 600) greatest_pages,
MIN(`aggregation_book`.`pages`) min_pages
FROM `aggregation_book`
GROUP BY 1 ORDER BY NULL
) masked
}}}

I think the position of "MIN(`aggregation_book`.`pages`) min_pages" look
award for me.
with `.values_list("least", flat=True)` clause present, there was a
obvious reason for "MIN(`aggregation_book`.`pages`) min_pages" to be
pushed down because it is a dependency for `least`, but without
`.values_list("least", flat=True)` it loses it's reason to be pushed down.
I am a bit suspicious that choosing which additional item to be pushed
down by looking at `values_list` worth it's effort considering frequency
of this use case is thought to be small.

--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:8>

Django

unread,
Nov 28, 2023, 2:46:30 AM11/28/23
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: (none)
Type: Bug | Status: new

Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Jonny Park):

* owner: Jonny Park => (none)
* status: assigned => new


--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:9>

Django

unread,
Nov 28, 2023, 11:36:17 AM11/28/23
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: (none)
Type: Bug | Status: new
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

FWIW this relates to #34992 where we had to disable
`allows_group_by_selected_pks` on MariaDB entirely as it doesn't implement
any form of functional dependence resolition.

--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:10>

Django

unread,
Feb 28, 2024, 8:22:32 PM2/28/24
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: (none)
Type: Bug | Status: new
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Simon Charette):

[https://modern-sql.com/caniuse/any_value A recent article on any_value
and functional dependency] if it can be of help to anyone working on this
issue.
--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:11>

Django

unread,
Feb 7, 2025, 12:41:06 AM2/7/25
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: (none)
Type: Bug | Status: new
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by ontowhee):

Is there an expression in Django for ANY_VALUE()? I did a quick search for
"anyvalue” and “any.*value” but it did not come up with results. Would it
be useful to support such an expression?

My thought is, option 2 seems to be the lowest effort and risk to
implement (the other options can potentially be added in later if there is
a good solution). It can raise an error and suggest that the user apply
such an expression to the offending column. This way, django is not making
an arbitrary decision on wrapping the column with ANY_VALUE(). I haven’t
dived into what it would take to support such an expression, so this may
be naive. Any thoughts here?
--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:12>

Django

unread,
Feb 7, 2025, 2:42:36 AM2/7/25
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: (none)
Type: Bug | Status: new
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Simon Charette):

There's none but you can
[https://docs.djangoproject.com/en/5.1/ref/models/expressions/#creating-
your-own-aggregate-functions easily write your own] and circumvent the
problems MySQL exhibits

{{{#!python
from django.db.models.aggregate import Aggregate

class AnyValue(Aggregate):
function = "ANY_VALUE"
}}}

Given MySQL, Postgres 16+, and Oracle 19c+ [https://modern-
sql.com/caniuse/any_value support it] it might be worth considering adding
it to core and documenting that it must be used under some circumstances
on MySQL?
--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:13>

Django

unread,
Feb 7, 2025, 8:26:01 PM2/7/25
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: ontowhee
Type: Bug | Status: assigned
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by ontowhee):

* owner: (none) => ontowhee
* status: new => assigned

Comment:

Should a new ticket be created for adding support for ANY_VALUE()? For
MySQL, since ANY_VALUE() is not an aggregate function, would it be
implemented as an Aggregate or Func or a different expression type?

Just thinking out loud here. If option 2 is the path forward,
- Would the warning be raised before the query is evaluated? That means
the query needs to detect if there are nonaggregated expressions. I'll
need to dig more to understand how that might work.
- Or, raise the warning after the query is evaluated by catching the
OperationalError in the
[https://github.com/django/django/blob/main/django/db/backends/mysql/base.py#L80
CursorWrapper]?

{{{
codes_for_warnings = (
1055, # Expression not in GROUP BY and contains nonaggregated
colum
)
...

def execute(self, query, args=None):
try:
# args is None means no string interpolation
return self.cursor.execute(query, args)
except Database.OperationalError as e:
# Map some error codes to IntegrityError, since they seem to
be
# misclassified and Django would prefer the more logical
place.
if e.args[0] in self.codes_for_integrityerror:
raise IntegrityError(*tuple(e.args))
else if e.args[0] in self.codes_for_warnings:
warnings.warn(
"%s "
"Consider wrapping the nonaggregated expression using
AnyValue." %(e),
RuntimeWarning,
)
return
raise
}}}

I'm going to dig around and see if the warning can be raised before the
query is evaluated. I might start looking through the resolve_expression()
functions, since there seem to be patterns of raising errors there.
--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:14>

Django

unread,
Feb 9, 2025, 10:06:04 AM2/9/25
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: ontowhee
Type: Bug | Status: assigned
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Simon Charette):

> Should a new ticket be created for adding support for ANY_VALUE()?

I think this that this ticket can be re-purposed for the introduction of
`AnyValue(Aggregate)` given that the reported problem here is that MySQL
query planer isn't smart enough to transitively infer the presence of
members in the `GROUP BY` clause.

> For MySQL, since ANY_VALUE() is not an aggregate function, would it be
implemented as an Aggregate or Func or a different expression type?

It's effectively not an aggregate function per-se on MySQL but more of a
sentinel to tell the query planner to ignore the `only_full_group_by`
check only for a single expression but in the Django sense it must be
defined as an `Aggregate` subclass otherwise it will be included in the
`GROUP BY` clause. In other words, it should be implemented as an
`Aggregate` subclass even if MySQL doesn't follow the SQL spec by clearly
defining what kind of function `ANY_VALUE` is.

> Just thinking out loud here. If option 2 is the path forward,

I think that raising awareness is the way to go here but I don't think
that the proposed implementation of capturing errors and emitting warning
is the way to go here. First because it's brittle given we don't know the
extent of this problem and the MySQL implementation could change and
secondly because it's not an approach we've taken with this class of
problems.

IMO this problem is very similar to how we let inappropriate casting
errors bubble through and expect users to use `CAST` where necessary from
the error messages over trying to detect such errors and point them
directly at `django.db.models.functions.Cast`. In other words I think that
it's better to have the user go through the following chain of thoughts

- Encounter type / `only_full_group_by` error
- Do their research on the subject and learn about `CAST` / `ANY_VALUE`
- Search for Django `CAST` / `ANY_VALUE` and endup on the `Cast` /
`AnyValue` docs

than committing to building a bullet proof solution that hides away these
details from users. For all we know MySQL could finally implement
functional dependency detection properly and all of our efforts (and bugs
trying to get it right but failing) would be wasted efforts.

For these reasons I believe that ''raising awareness'' is effectively the
way to go but I think the way to do so is

1. Introduce an `AnyValue(Aggregate)`
2. Make sure the documentation mentions that its usage might be necessary
on MySQL when mixing aggregate and non-aggregate functions when
`sql_mode=only_full_group_by` in a `.. note`
3. Link to `AnyValue` documentation from the
[https://docs.djangoproject.com/en/5.1/topics/db/aggregation/#aggregating-
annotations aggregating annotations] section of the docs
--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:15>

Django

unread,
Feb 12, 2025, 9:08:04 PM2/12/25
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: ontowhee
Type: Bug | Status: assigned
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by ontowhee):

Thank you so much for the detailed clarifications! It helps me understand
the context. The chain of thoughts for the user that you outlined makes a
lot of sense, and it is what I would prefer in my own workflow, too.


----


I have WIP in this
[https://github.com/django/django/compare/main...ontowhee:django:ticket-34262
branch]. I’ve added AnyValue, along with some tests.


One thing I'm not sure about is, in MySQL, the expression
`ANY_VALUE(LEAST(MIN(pages), GREATEST(pages, 600)))` is valid, but
creating the expression in django with `AnyValue(Least("min_pages",
"greatest_page"))` leads to the following error during the resolving
phase,

{{{
Cannot compute AnyValue(’Least(F(min_pages), F(greatest_pages))’):
‘Least(F(min_pages), F(greatest_pages))’ is an aggregate
}}}

For Postgresql and Oracle, it makes sense to raise this error, because
ANY_VALUE is considered an aggregate function, and nested aggregates are
not allowed. However, for MySQL, would it make sense to not raise the
error? Would that mean changing how
[https://github.com/django/django/blob/main/django/db/models/aggregates.py#L81
resolve_expression] is checking the `contains_aggregate` attribute? The
resolving phase is backend agnostic. Trying to catch it in the compilation
phase in `as_mysql()` would be too late, because the error would have
already been raise. I don't have a clear path forward at this point, but
will dig around and think about this more. Any ideas and suggestions are
welcomed, of course!


----


I'm also a bit stuck on understanding how to write the SQL statement for
`ANY_VALUE(…) FILTER(…) OVER(…)`. I was running statements directly in
MySQL to better understand the different combinations of the FILTER, OVER,
GROUP BY clauses, trying add support according to the chart in the [https
://modern-sql.com/caniuse/any_value article], but I'm getting syntax
errors.

Here is a simple example I was running,

{{{
CREATE TABLE person(id int, first_name varchar(255), dob date);
INSERT INTO person(id, first_name, dob)
VALUES(1, 'alice', '2025-01-01'),
(2, 'alice', '2024-02-02'),
(3, 'alice', '2025-03-03'),
(4, 'bob', '2025-04-04'),
(5, 'charlie', '2025-05-05');
}}}

These statement worked in MySQL.

{{{
SELECT ANY_VALUE(id) FROM person;
SELECT ANY_VALUE(id) FROM person GROUP BY first_name;
}}}

These statements did not work. They returned a syntax error. I expected
the last one, `ANY_VALUE(..) FILTER(...) OVER(...)`, to be valid,
according to the chart, but was surprised to see the syntax error. All the
statements work in Postgresql.

{{{
SELECT ANY_VALUE(id) OVER() FROM person;
SELECT ANY_VALUE(id) FILTER(WHERE id > 1) FROM person;
SELECT ANY_VALUE(id) FILTER(WHERE id > 1) OVER() FROM person;
SELECT ANY_VALUE(id) FILTER(WHERE id > 1) OVER(partition by first_name)
FROM person;
}}}


MySQL does not support [https://modern-sql.com/feature/filter FILTER
clause], and django will emulate using
[https://github.com/django/django/blob/main/django/db/models/aggregates.py#L142
CASE] instead. I wonder if the chart in the article is accurate, or if the
FILTER clause in ANY_VALUE is a special case.

If anyone can see what I'm doing wrong here, and give an example of a
valid statement with `ANY_VALUE(..) FILTER(...) OVER(...)`, that would be
helpful!
--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:16>

Django

unread,
Feb 19, 2025, 2:56:56 PM2/19/25
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: ontowhee
Type: Bug | Status: assigned
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by ontowhee):

[https://github.com/django/django/pull/19196 Draft PR]

I opened a draft PR. This table shows what is implemented. It strays a bit
from the [https://modern-sql.com/caniuse/any_value chart]:

{{{
+-----------------------------+---------------+------------+------------+
| | MySQL | Oracle | PostgreSQL |
+-----------------------------+---------------+------------+------------+
| ANY_VALUE() | Yes | Yes | Yes |
| ANY_VALUE() FILTER() | No, uses CASE | No | Yes |
| ANY_VALUE() OVER() | No | Yes | Yes |
| ANY_VALUE() FILTER() OVER() | No, uses CASE | No | Yes |
+-----------------------------+---------------+------------+------------+
}}}

- The over clause has not been implemented for AnyValue(). The tests are
using the Window function and partition_by parameter to achieve this.
- **Question:** Is this a good approach? Or should AnyValue()
implement over clause?
- The filter clause is leveraging the existing filter implementation for
expressions. For MySQL, this means it creates CASE() instead of FILTER().
MySQL does not support [https://modern-sql.com/feature/filter filter
clause].
- The chart in the article shows a lightning bolt MySQL. I may be
misunderstanding what that means, but I have not been able to directly run
a “ANY_VALUE() FILTER() OVER()” expression. I tried to keep the scope of
this ticket simple by not addressing this one for now.
- For the specific case described in this ticket, instead of creating the
django expression `AnyValue(Least("min_pages", "greatest_page"))` , the
tests are creating `Least("min_pages", AnyValue("greatest_page"))`. It
wraps AnyValue() on "greatest_page" instead of the entire expression. This
avoids the error that is raised `Cannot compute ... is an aggregate` error
in `resolve_expression`.
--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:17>

Django

unread,
Mar 15, 2025, 4:43:40 PM3/15/25
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: ontowhee
Type: Bug | Status: assigned
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* has_patch: 0 => 1
* needs_better_patch: 0 => 1

Comment:

I left some comments on the PR but this is looking promising. I provided a
way to make it work for SQLite as well.

> Question: Is this a good approach? Or should AnyValue() implement over
clause?

Given SQLite doesn't support it and the `FirstValue` window-compatible
function I think it might be better to set `window_compatible = False` on
it for now and revisit in the future if truly needed.

> The filter clause is leveraging the existing filter implementation for
expressions. For MySQL, this means it creates CASE() instead of FILTER().

That's expected just like with any other aggregates. Is this causing
issues with `NULL` values generated by members that should be filtered
out?

> The chart in the article shows a lightning bolt MySQL. I may be
misunderstanding what that means, but I have not been able to directly run
a “ANY_VALUE() FILTER() OVER()” expression. I tried to keep the scope of
this ticket simple by not addressing this one for now.

The issue goes away if we don't initially support its usage in window
functions which I think is okay for now.

> This avoids the error that is raised Cannot compute ... is an aggregate
error in resolve_expression, and it seems to make sense now that all the
columns have an "aggregate" function operating on them.

That makes sense to me as well.
--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:18>

Django

unread,
Mar 18, 2025, 12:43:34 PM3/18/25
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: ontowhee
Type: Bug | Status: assigned
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by ontowhee):

* needs_better_patch: 1 => 0

Comment:

I've pushed changes to the PR that reflect this table, and updated the
table to reflect what is supported. Does this look good?

{{{
+-----------------------------+--------+---------------+------------+------------+
| | SQLite | MySQL | Oracle |
PostgreSQL |
+-----------------------------+--------+---------------+------------+------------+
| ANY_VALUE() | Yes | Yes | Yes | Yes
|
| ANY_VALUE() FILTER() | Yes | No [*] | No | Yes
|
| ANY_VALUE() OVER() | No | No | No | No
|
| ANY_VALUE() FILTER() OVER() | No | No | No | No
|
+-----------------------------+--------+---------------+------------+------------+

[*] MySQL does not seem to support this, but Django will use CASE if the
filter clause is present.
}}}


>> The filter clause is leveraging the existing filter implementation for
expressions. For MySQL, this means it creates CASE() instead of FILTER().
>
> That's expected just like with any other aggregates. Is this causing
issues with `NULL` values generated by members that should be filtered
out?

I think I initially wrote the comment to clarify to myself how Django is
handling FILTER, even though I was observing that running ANY_VALUE()
FILTER() directly in the db would raise a syntax error.

With CASE, it does return NULL values. I don't know if that is an issue,
because ANY_VALUE can return NULL for MySQL. For example:

{{{
CREATE TABLE person(id int, first_name varchar(255), dob date);

INSERT INTO person(id, first_name, dob)
VALUES(1, null, '2025-01-01'),
(2, 'alice', '2024-01-01'),
(3, 'alice', '2025-02-02'),
(4, 'bob', '2025-04-04'),
(5, 'charlie', '2025-05-05');

SELECT ANY_VALUE(first_name), EXTRACT(YEAR FROM dob) FROM person GROUP BY
EXTRACT(YEAR FROM dob);
}}}

For me, this returns `(NULL, 2025), ('alice', 2024)`.
--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:19>

Django

unread,
Mar 18, 2025, 5:23:59 PM3/18/25
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: ontowhee
Type: Bug | Status: assigned
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* needs_better_patch: 0 => 1

Comment:

> With CASE, it does return NULL values. I don't know if that is an issue,
because ANY_VALUE can return NULL for MySQL.

I think that's fine given an explicit `COALESCE` could be used if needs
be. I suspect `AnyValue` will end up mostly being useful to assist MySQL
functional dependency resolver and rarely with `filter` anyway.

Left some comments about the mention of support for SQLite in the release
notes as well as supported Tim's request for slight tweaks of docs in
MySQL specific parts. It also seems like `ANY_VALUE` can be used on MySQL
even when `ONLY_FULL_GROUP_BY` is disabled.
--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:20>

Django

unread,
Mar 20, 2025, 12:00:21 PM3/20/25
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: ontowhee
Type: Bug | Status: assigned
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by ontowhee):

* needs_better_patch: 1 => 0

--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:21>

Django

unread,
May 15, 2025, 4:04:38 AM5/15/25
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: ontowhee
Type: Bug | Status: assigned
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 1
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce):

* needs_better_patch: 0 => 1

--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:22>

Django

unread,
May 16, 2025, 10:23:09 AM5/16/25
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: ontowhee
Type: Bug | Status: assigned
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted
only_full_group_by |
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by ontowhee):

* needs_better_patch: 1 => 0

--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:23>

Django

unread,
May 19, 2025, 4:47:02 AM5/19/25
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: ontowhee
Type: Bug | Status: assigned
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Ready for
only_full_group_by | checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce):

* stage: Accepted => Ready for checkin

--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:24>

Django

unread,
May 20, 2025, 4:01:53 AM5/20/25
to django-...@googlegroups.com
#34262: Queryset grouped by annotation with aggregates on another annotated
expression crashes on MySQL with sql_mode=only_full_group_by.
-------------------------------------+-------------------------------------
Reporter: Mariusz Felisiak | Owner: ontowhee
Type: Bug | Status: closed
Component: Database layer | Version: 4.1
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: mysql | Triage Stage: Ready for
only_full_group_by | checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sarah Boyce <42296566+sarahboyce@…>):

* resolution: => fixed
* status: assigned => closed

Comment:

In [changeset:"ddb85294159185c5bd5cae34c9ef735ff8409bfe" ddb8529]:
{{{#!CommitTicketReference repository=""
revision="ddb85294159185c5bd5cae34c9ef735ff8409bfe"
Fixed #34262 -- Added support for AnyValue for SQLite, MySQL, Oracle, and
Postgresql 16+.

Thanks Simon Charette for the guidance and review. Thanks Tim Schilling
for the
documentation review. Thanks David Wobrock for investigation and solution
proposals.
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/34262#comment:25>
Reply all
Reply to author
Forward
0 new messages