[Django] #32691: Django 3.2 boolean field SQL output is causing a performance regression in mysql.

137 views
Skip to first unread message

Django

unread,
Apr 28, 2021, 4:37:21 AM4/28/21
to django-...@googlegroups.com
#32691: Django 3.2 boolean field SQL output is causing a performance regression in
mysql.
-------------------------------------+-------------------------------------
Reporter: Todor | Owner: nobody
Velichkov |
Type: | Status: new
Uncategorized |
Component: Database | Version: 3.2
layer (models, ORM) | Keywords: bool, booleanfield,
Severity: Normal | orm, sql, mysql
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
When filtering on boolean field in Django 2.2 the SQL output would look
like this:
{{{
WHERE `mytable`.`myfield` = 1
}}}

in Django 3.2 the SQL changed to:
{{{
WHERE `mytable`.`myfield`
}}}

The problem with the 2nd approach is that in mysql8 this field is not used
in indexes anymore, causing a great performance degradation.

--
Ticket URL: <https://code.djangoproject.com/ticket/32691>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Apr 28, 2021, 5:14:25 AM4/28/21
to django-...@googlegroups.com
#32691: Django 3.2 boolean field SQL output is causing a performance regression in
mysql.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: nobody
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: bool, booleanfield, | Triage Stage:
orm, sql, mysql | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* status: new => closed
* type: Uncategorized => Cleanup/optimization
* resolution: => invalid


Comment:

Thanks for this ticket, however both clauses are correct in MySQL. The way
how the ORM builds filters is an implementation detail not something
Django should guarantee (see #25367).

--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:1>

Django

unread,
Apr 28, 2021, 5:14:39 AM4/28/21
to django-...@googlegroups.com
#32691: Django 3.2 boolean field SQL output is causing a performance regression in
MySQL.

-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: nobody
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: bool, booleanfield, | Triage Stage:
orm, sql, mysql | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:2>

Django

unread,
Apr 28, 2021, 5:44:59 AM4/28/21
to django-...@googlegroups.com
#32691: Django 3.2 boolean field SQL output is causing a performance regression in
MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: nobody
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: bool, booleanfield, | Triage Stage:
orm, sql, mysql | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Todor Velichkov):

Hi Mariusz,
I don't quite agree, both clauses are syntactically correct yes, but they
are not the same.

MySQL does not have a boolean type of column, so Django is using
[https://docs.oracle.com/cd/E17952_01/mysql-8.0-en/integer-types.html
tinyint]
which stores value between `-128..127` or `0..255`

When you tell `mysql` you are looking for {{{`mytable`.`myfield`}}} this
basically translates to all values between `-128..127` except `0` which is
a range query equivalent to `myfield__in=[-128..-1]+[1..127]`

even `myfield__exact` lookup is acting the same way.

--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:3>

Django

unread,
Apr 28, 2021, 6:05:14 AM4/28/21
to django-...@googlegroups.com
#32691: Django 3.2 boolean field SQL output is causing a performance regression in
MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: nobody
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: bool, booleanfield, | Triage Stage:
orm, sql, mysql | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak):

> MySQL does not have a boolean type of column, so Django is using

​tinyint

Django is using `BOOL` which is an alias for `tinyint(1)`. As far as I'm
aware, storing values except `0` and `1` in `BOOL` columns has never been
supported by Django.

--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:4>

Django

unread,
Apr 28, 2021, 7:25:58 AM4/28/21
to django-...@googlegroups.com
#32691: Django 3.2 boolean field SQL output is causing a performance regression in
MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: nobody
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: bool, booleanfield, | Triage Stage:
orm, sql, mysql | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Todor Velichkov):

Django is not storing values except `0` and `1` in `BOOL`, but MySQL
`tinyint(1)` [https://www.mysqltutorial.org/mysql-boolean/ support it].

So my question is, why is Django asking MySQL to search for values other
than `0` and `1` in a `BOOL` column. Doesn't it make sense to only search
for `0` and `1` when Django is only storing `0` and `1`?

My second question is: Let's say Django is right about asking MySQL to
search for values other than `0` and `1` in a `BOOL` column, then
shouldn't it at least `myboofield__exact` act in a different way and ask
MySQL to look for a specific value ? Django is literally leaving us w/o an
option here and using a boolean field as index/ inside composite index is
impossible.

--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:5>

Django

unread,
Apr 28, 2021, 7:42:25 AM4/28/21
to django-...@googlegroups.com
#32691: Django 3.2 boolean field SQL output is causing a performance regression in
MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: nobody
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: bool, booleanfield, | Triage Stage:
orm, sql, mysql | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* cc: Adam Johnson (added)


Comment:

> Django is literally leaving us w/o an option here and using a boolean
field as index/ inside composite index is impossible.

I'm not a MySQL expert, can you provide any link to the MySQL docs which
will confirm that it's not possible. It's surprising that you cannot
create an index with a `BOOL` (`TINYINT`) column.

--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:6>

Django

unread,
Apr 28, 2021, 9:06:23 AM4/28/21
to django-...@googlegroups.com
#32691: Django 3.2 boolean field SQL output is causing a performance regression in
MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: nobody
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: bool, booleanfield, | Triage Stage:
orm, sql, mysql | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Todor Velichkov):

The issue is not that the index will not be created, the index will be
created, but it will not be used against expressions like:
{{{
WHERE `mytable`.`myfield`
}}}

because such an expression evaluates to `True` for a range of values (in
the case of signed `tinyint(1)` the range is `-128..127` except `0`) which
turns the query into a `range` query which are
[https://dev.mysql.com/doc/refman/8.0/en/range-optimization.html hard to
optimize]. MySQL needs a constant value like:


{{{
WHERE `mytable`.`myfield` = 1
}}}

in order for the index to be used.

An example: If we have a row with a value of `myfield = 2` (MySQL allows
this because the column is `tinyint(1)`) then:
- The first expression will include this row in the results because `2`
evaluates to `True`
- The second expression will not include this row because `2 != 1`
Django only stores `0` and `1` why its expression is including rows which
Django itself do not store.

--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:7>

Django

unread,
Apr 28, 2021, 9:50:32 AM4/28/21
to django-...@googlegroups.com
#32691: Django 3.2 boolean field SQL output is causing a performance regression in
MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: nobody
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: bool, booleanfield, | Triage Stage:
orm, sql, mysql | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Simon Charette):

If we want to move forward with this change the only possible resolution I
see is to add a `has_native_boolean_type` database feature and set it to
false for MySQL. Another more intrusive alternative would be to have
`BooleanField.get_db_prep_value` return an `int` instead of a `bool` on
MySQL so the lookup machinery is made aware of this quirk instead of
having it magically happen at the database connector layer (`mysqlclient`,
`pymysql`).

In the mean time this can be worked around by doing
`.filter(boolean_field=1)`. Entirely reverting this change would possibly
cause a breakage in all function index on boolean expressions created on
other backends.

In all cases I'd argue that creating an index on a boolean field is a bad
practice particularly when it's backed by a b-tree implementation and your
backend doesn't support partial indices to allow excluding parts of the
covered table as your underlying tree will get heavily unbalanced on way
or the other.

--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:8>

Django

unread,
Apr 28, 2021, 10:07:44 AM4/28/21
to django-...@googlegroups.com
#32691: Django 3.2 boolean field SQL output is causing a performance regression in
MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: nobody
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: bool, booleanfield, | Triage Stage:
orm, sql, mysql | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Todor Velichkov):

In the mean time this can be worked around by doing
`.filter(boolean_field=1)`. Entirely reverting this change would possibly
cause a breakage in all function index on boolean expressions created on
other backends.

I tried this before opening the ticket, but it didn't work, however it
looks like `.filter(boolean_field=models.Value(1))` is working!!
For now this should be sufficient for us to complete our migration to
`Django 3.2`! Thanks for the time spent on this!

--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:9>

Django

unread,
Sep 29, 2021, 6:11:29 AM9/29/21
to django-...@googlegroups.com
#32691: Django 3.2 boolean field SQL output is causing a performance regression in
MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: nobody
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: bool, booleanfield, | Triage Stage:
orm, sql, mysql | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Roman Miroshnychenko):

Why not add `as_mysql()` method to `Exact` lookup that simply calls
parent's `as_sql()` method instead of doing this boolean field magic as in
`Exact.as_sql()`? We had the same issue as in the original post and this
solution worked for us.

--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:10>

Django

unread,
Sep 29, 2021, 3:29:30 PM9/29/21
to django-...@googlegroups.com
#32691: Django 3.2 boolean field SQL output is causing a performance regression in
MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: nobody
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: bool, booleanfield, | Triage Stage:
orm, sql, mysql | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* cc: Simon Charette (added)


Comment:

Adding a dedicated `Exact.as_mysql` method would also work, feels like a
feature flag might be more appropriate though.

--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:11>

Django

unread,
Oct 12, 2021, 11:35:22 AM10/12/21
to django-...@googlegroups.com
#32691: Django 3.2 boolean field SQL output is causing a performance regression in
MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: nobody
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: bool, booleanfield, | Triage Stage:
orm, sql, mysql | Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Dan Strokirk):

* cc: Dan Strokirk (added)


Comment:

We've also seen this change cause a production outage when we've tried
upgrading to Django 3, so either a notice in the docs or some other way to
to avoid it changing back again in the future would be great.

Thanks Todor for your tip about `models.Value(1)`!

--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:12>

Django

unread,
Oct 24, 2021, 2:05:29 PM10/24/21
to django-...@googlegroups.com
#32691: Django 3.2 boolean field SQL output is causing a performance regression in
MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: nobody
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: bool, booleanfield, | Triage Stage:
orm, sql, mysql | Unreviewed
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Roman Miroshnychenko):

* has_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:13>

Django

unread,
Oct 24, 2021, 2:06:16 PM10/24/21
to django-...@googlegroups.com
#32691: Django 3.2 boolean field SQL output is causing a performance regression in
MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: nobody
Type: | Status: closed
Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: bool, booleanfield, | Triage Stage:
orm, sql, mysql | Unreviewed
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Roman Miroshnychenko):

I have submitted a PR, hopefully it will be accepted.

--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:14>

Django

unread,
Oct 26, 2021, 7:17:30 AM10/26/21
to django-...@googlegroups.com
#32691: Django 3.2 boolean field SQL output is causing a performance regression in
MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: nobody
Type: | Status: new

Cleanup/optimization |
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: bool, booleanfield, | Triage Stage: Accepted
orm, sql, mysql |
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* status: closed => new
* resolution: invalid =>
* stage: Unreviewed => Accepted


--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:15>

Django

unread,
Oct 26, 2021, 7:19:53 AM10/26/21
to django-...@googlegroups.com
#32691: Performance regression in Exact lookup on BooleanField on MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: Roman
Type: | Miroshnychenko
Cleanup/optimization | Status: assigned

Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution:
Keywords: bool, booleanfield, | Triage Stage: Ready for
orm, sql, mysql | checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak):

* owner: nobody => Roman Miroshnychenko
* status: new => assigned
* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:16>

Django

unread,
Nov 3, 2021, 2:40:58 AM11/3/21
to django-...@googlegroups.com
#32691: Performance regression in Exact lookup on BooleanField on MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: Roman
Type: | Miroshnychenko
Cleanup/optimization | Status: closed

Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: fixed

Keywords: bool, booleanfield, | Triage Stage: Ready for
orm, sql, mysql | checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak <felisiak.mariusz@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"407fe95cb116599adeb4b9ed01df5673aa5cb1db" 407fe95c]:
{{{
#!CommitTicketReference repository=""
revision="407fe95cb116599adeb4b9ed01df5673aa5cb1db"
Fixed #32691 -- Made Exact lookup on BooleanFields compare directly to a
boolean value on MySQL.

Performance regression in 37e6c5b79bd0529a3c85b8c478e4002fd33a2a1d.

Thanks Todor Velichkov for the report.

Co-authored-by: Mariusz Felisiak <felisiak...@gmail.com>
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:17>

Django

unread,
Jan 21, 2022, 3:06:24 AM1/21/22
to django-...@googlegroups.com
#32691: Performance regression in Exact lookup on BooleanField on MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: Roman
Type: | Miroshnychenko
Cleanup/optimization | Status: closed
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: bool, booleanfield, | Triage Stage: Ready for
orm, sql, mysql | checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by anze3db):

Any chance that we could cheery-pick the fix into Django 3.2? The issue
nearly caused a production outage for us when we upgraded to Django 3.2
and it would be great to make sure nobody else gets burned by this.

--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:18>

Django

unread,
Jan 21, 2022, 3:11:37 AM1/21/22
to django-...@googlegroups.com
#32691: Performance regression in Exact lookup on BooleanField on MySQL.
-------------------------------------+-------------------------------------
Reporter: Todor Velichkov | Owner: Roman
Type: | Miroshnychenko
Cleanup/optimization | Status: closed
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: bool, booleanfield, | Triage Stage: Ready for
orm, sql, mysql | checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak):

Replying to [comment:18 Anže Pečar]:


> Any chance that we could cheery-pick the fix into Django 3.2? The issue
nearly caused a production outage for us when we upgraded to Django 3.2
and it would be great to make sure nobody else gets burned by this.

Unfortunately not, this is a regression in Django 3.1 that is no longer
supported. Per our backporting policy this means it doesn't qualify for a
backport to 3.2.x or 4.0.x anymore.
See [https://docs.djangoproject.com/en/stable/internals/release-process/
Django’s release process] for more details.

--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:19>

Reply all
Reply to author
Forward
0 new messages