[Django] #34771: order_by on annotated field that's not present in values/values_list causes SQL syntax error

26 views
Skip to first unread message

Django

unread,
Aug 11, 2023, 10:55:42 AM8/11/23
to django-...@googlegroups.com
#34771: order_by on annotated field that's not present in values/values_list causes
SQL syntax error
-------------------------------------+-------------------------------------
Reporter: tinyx | Owner: nobody
Type: Bug | Status: new
Component: Database | Version: 3.2
layer (models, ORM) | Keywords: SQL syntax order_by
Severity: Normal | annotate
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
Although this is an extremely rare case, it does seem like to be something
Django could've captured. Basically, when there's an annotated field,
there's a slight difference on how the `ORDER BY` SQL is constructed based
on whether the field is present or not in the `SELECT` statement, or in
Django's world, whether the fields are present in either `values` or
`values_lsit`. Here's an example:

This would work fine:

{{{
>>> User.objects.annotate(random_stuff=Value(False,
output_field=BooleanField())).values('id',
'random_stuff').order_by('random_stuff')
SELECT `auth_user`.`id`,
0 AS `random_stuff`
FROM `auth_user`
ORDER BY `random_stuff` ASC
LIMIT 21

Execution time: 0.000783s [Database: default]
<QuerySet [{'id': 1, 'random_stuff': False}, {'id': 2, 'random_stuff':
False}, '...(remaining elements truncated)...']>
>>>
}}}

But this would break:

{{{
>>> User.objects.annotate(random_stuff=Value(False,
output_field=BooleanField())).values('id').order_by('random_stuff')
None

Execution time: 0.000340s [Database: default]
Traceback (most recent call last):
File "<input>", line 1, in <module>
User.objects.annotate(random_stuff=Value(False,
output_field=BooleanField())).values('id').order_by('random_stuff')
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/django/db/models/query.py", line 256, in __repr__
data = list(self[:REPR_OUTPUT_SIZE + 1])
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/django/db/models/query.py", line 280, in __iter__
self._fetch_all()
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/django/db/models/query.py", line 1324, in _fetch_all
self._result_cache = list(self._iterable_class(self))
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/django/db/models/query.py", line 109, in __iter__
for row in compiler.results_iter(chunked_fetch=self.chunked_fetch,
chunk_size=self.chunk_size):
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/django/db/models/sql/compiler.py", line 1130, in results_iter
results = self.execute_sql(MULTI, chunked_fetch=chunked_fetch,
chunk_size=chunk_size)
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/django/db/models/sql/compiler.py", line 1175, in execute_sql
cursor.execute(sql, params)
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/django_extensions/management/debug_cursor.py", line 50, in
execute
return utils.CursorWrapper.execute(self, sql, params)
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/django/db/backends/utils.py", line 66, in execute
return self._execute_with_wrappers(sql, params, many=False,
executor=self._execute)
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
return executor(sql, params, many, context)
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/django_mysql/apps.py", line 75, in rewrite_hook
return execute(sql, params, many, context)
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/django/db/backends/utils.py", line 79, in _execute
with self.db.wrap_database_errors:
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/django/db/utils.py", line 90, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/django/db/backends/mysql/base.py", line 73, in execute
return self.cursor.execute(query, args)
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/MySQLdb/cursors.py", line 206, in execute
res = self._query(query)
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/MySQLdb/cursors.py", line 319, in _query
db.query(q)
File "/Users/tinyx/.pyenv/versions/portal/lib/python3.10/site-
packages/MySQLdb/connections.py", line 254, in query
_mysql.connection.query(self, query)
django.db.utils.ProgrammingError: (1064, "You have an error in your SQL
syntax; check the manual that corresponds to your MySQL server version for
the right syntax to use near 'bool) ASC LIMIT 21' at line 1")
>>>
}}}

If you look at the SQL query, it breaks because it didn't have an alias to
reference to in the `ORDER BY` statement, therefore it grabs the entire
annotation expression and throw it in there:

{{{
>>> print(User.objects.annotate(random_stuff=Value(False,
output_field=BooleanField())).values('id').order_by('random_stuff').query)
SELECT `auth_user`.`id` FROM `auth_user` ORDER BY CAST(False AS bool) ASC
>>>
}}}

This is under MySQL 8.0.33 by the way. Not sure if it's just syntax not
supported by MySQL.

Since Django doesn't seem to require an annotated field to be present in
`values` or `values_list` to be used in `order_by`, my humble opinion is
that it should be slightly smarter for this case by implicitly adding the
field into the `SELECT` statement.

Thanks for taking a look, and feel free to let me know if you need more
information.

--
Ticket URL: <https://code.djangoproject.com/ticket/34771>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Aug 11, 2023, 10:57:10 AM8/11/23
to django-...@googlegroups.com
#34771: order_by on annotated field that's not present in values/values_list causes
SQL syntax error
-------------------------------------+-------------------------------------
Reporter: tinyx | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: SQL syntax order_by | Triage Stage:
annotate | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Description changed by tinyx:

Old description:

New description:

This would work fine:

But this would break:

field into the `SELECT` statement, and use its alias in the `ORDER BY`.

Thanks for taking a look, and feel free to let me know if you need more
information.

--

--
Ticket URL: <https://code.djangoproject.com/ticket/34771#comment:1>

Django

unread,
Aug 11, 2023, 10:59:16 AM8/11/23
to django-...@googlegroups.com

Old description:

> field into the `SELECT` statement, and use its alias in the `ORDER BY`.
>
> Thanks for taking a look, and feel free to let me know if you need more
> information.

New description:

Although this is an extremely rare case, it does seem like to be something
Django could've captured. Basically, when there's an annotated field,
there's a slight difference on how the `ORDER BY` SQL is constructed based
on whether the field is present or not in the `SELECT` statement, or in

Django's world, whether the field is present in either `values` or
`values_list`. Here's an example:

This would work fine:

But this would break:

--

--
Ticket URL: <https://code.djangoproject.com/ticket/34771#comment:2>

Django

unread,
Aug 11, 2023, 12:53:21 PM8/11/23
to django-...@googlegroups.com
#34771: order_by on annotated field that's not present in values/values_list causes
SQL syntax error
-------------------------------------+-------------------------------------
Reporter: Yitao Xiong | Owner: nobody

Type: Bug | Status: new
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage:

| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Natalia Bidart):

* cc: Simon Charette (added)
* keywords: SQL syntax order_by annotate => mysql


Comment:

Hello tinyx, thank you for your ticket.

My first thought when reading the description is that "it makes no sense"
to order a query using a field that was not included in `values` list,
though I can't find docs that explicitly confirm my suspicion.

I tested this in Postgresql (15.3) and obtained a successful query run:
{{{

>>> print(User.objects.annotate(random_stuff=Value(False,
output_field=BooleanField())).values('id').order_by('random_stuff').query)

SELECT "auth_user"."id" FROM "auth_user" ORDER BY (False)::boolean ASC


>>> User.objects.annotate(random_stuff=Value(False,
output_field=BooleanField())).values('id').order_by('random_stuff')

<QuerySet [{'id': 2}, {'id': 3}, {'id': 1}, {'id': 4}, {'id': 5}, {'id':
6}, {'id': 7}]>
>>>
}}}

Nevertheless, I would like to understand your use case a little bit more.
I can imagine you defined `random_stuff` purposely silly to simplify the
example, but would you have a more concrete/real use case that you could
share with us?

Also, is there any chance that you can test this same example in Django
4.2 (or even better, `main`)? I don't have a handy mysql to test myself at
this time.

Lastly, I'm cc'ing Simon who may have more specific thoughts on the
matter.

--
Ticket URL: <https://code.djangoproject.com/ticket/34771#comment:3>

Django

unread,
Aug 11, 2023, 1:45:57 PM8/11/23
to django-...@googlegroups.com
#34771: order_by on annotated field that's not present in values/values_list causes
SQL syntax error
-------------------------------------+-------------------------------------
Reporter: Yitao Xiong | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Yitao Xiong):

Thanks for your reply Natalia!

I tried this with `main` and it behaved the same:

{{{
In [5]: User.objects.annotate(random_stuff=Value(False,


output_field=BooleanField())).values('id').order_by('random_stuff')

....


ProgrammingError: (1064, "You have an error in your SQL syntax; check the
manual that corresponds to your MySQL server version for the right syntax
to use near 'bool) ASC LIMIT 21' at line 1")

In [6]: print(User.objects.annotate(random_stuff=Value(False,


output_field=BooleanField())).values('id').order_by('random_stuff').query)

SELECT `auth_user`.`id` FROM `auth_user` ORDER BY CAST(False AS bool) ASC


In [7]: from django import VERSION

In [8]: VERSION
Out[8]: (5, 0, 0, 'alpha', 0)

In [9]:
}}}

For our specific scenario that caused this error, it wasn't really an
intentional skipping on the annotated fields, it's more like this (it also
wasn't the User model at all, using it to avoid exposing our business
logics, apologize if that added confusions):

{{{
# Some generic queryset pre-assembling with annotations
base_queryset =
User.objects.annotate(random_field...).order_by('random_field')

...
# Later in a specific path, needing some extra information from the
base_queryset
additional_lookup = base_queryset.filter(...).values('id', 'email')
}}}

So it's more like the field was annotated first, and developer just didn't
care about it when later reusing the queryset. Using my example, this
query with the `annotate` and `values` changed orders will produce the
same error:

{{{
User.objects.annotate(random_stuff=Value(False,
output_field=BooleanField())).order_by('random_stuff').values('id')
...

ProgrammingError: (1064, "You have an error in your SQL syntax; check the
manual that corresponds to your MySQL server version for the right syntax
to use near 'bool) ASC LIMIT 21' at line 1")
}}}

To real-life use cases, I can see some cases when you want the rows
returned in a specific order, but doesn't care about the value it's been
ordered by.

A user-login log table can be an example - you want the last 10 logged in
users, but you don't really care about when they logged in. If we specific
are looking for order with annotations, maybe an API usage log table would
be a good example? You wanna see the 10 users who used the API last so you
annotate with their max access time and order by it, without needing the
actual values.

--
Ticket URL: <https://code.djangoproject.com/ticket/34771#comment:4>

Django

unread,
Aug 11, 2023, 1:49:21 PM8/11/23
to django-...@googlegroups.com
#34771: order_by on annotated field that's not present in values/values_list causes
SQL syntax error
-------------------------------------+-------------------------------------
Reporter: Yitao Xiong | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Yitao Xiong):

Oh, an interesting discovery I just found when doing more experiments.

This doesn't seem to be an issue that applies to all fields. For instance,
I just found out that the same pattern worked for `IntegerField`:

{{{
In [20]: User.objects.annotate(random_stuff=Value(1,
output_field=IntegerField())).order_by('random_stuff').values('id')
Out[20]: <QuerySet [{'id': 1}, {'id': 2}, '...(remaining elements
truncated)...']>
}}}

So I'm guess it's something specific with MySQL, given your example also
showed the boolean query worked in Postgres.

--
Ticket URL: <https://code.djangoproject.com/ticket/34771#comment:5>

Django

unread,
Aug 11, 2023, 2:23:36 PM8/11/23
to django-...@googlegroups.com
#34771: order_by on annotated field that's not present in values/values_list causes
SQL syntax error
-------------------------------------+-------------------------------------
Reporter: Yitao Xiong | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Natalia Bidart):

Accepting since as [https://www.w3schools.com/sql/func_mysql_cast.asp per
docs], it seems that the only valid `datatypes` for a `CAST` call are:

{{{
DATE Converts value to DATE. Format: "YYYY-MM-DD"
DATETIME Converts value to DATETIME. Format: "YYYY-MM-DD HH:MM:SS"
DECIMAL Converts value to DECIMAL. Use the optional M and D
parameters to specify the maximum number of digits (M) and the number of
digits following the decimal point (D).
TIME Converts value to TIME. Format: "HH:MM:SS"
CHAR Converts value to CHAR (a fixed length string)
NCHAR Converts value to NCHAR (like CHAR, but produces a string with the
national character set)
SIGNED Converts value to SIGNED (a signed 64-bit integer)
UNSIGNED Converts value to UNSIGNED (an unsigned 64-bit integer)
BINARY Converts value to BINARY (a binary string)
}}}

Will write a regression test and bisect to confirm whether this would need
a backport or not.

--
Ticket URL: <https://code.djangoproject.com/ticket/34771#comment:6>

Django

unread,
Aug 11, 2023, 2:23:51 PM8/11/23
to django-...@googlegroups.com
#34771: order_by on annotated field that's not present in values/values_list causes
SQL syntax error
-------------------------------------+-------------------------------------
Reporter: Yitao Xiong | Owner: nobody
Type: Bug | Status: new
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: mysql | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Natalia Bidart):

* stage: Unreviewed => Accepted


--
Ticket URL: <https://code.djangoproject.com/ticket/34771#comment:7>

Reply all
Reply to author
Forward
0 new messages