in Django 3.2 the SQL changed to:
{{{
WHERE `mytable`.`myfield`
}}}
The problem with the 2nd approach is that in mysql8 this field is not used
in indexes anymore, causing a great performance degradation.
--
Ticket URL: <https://code.djangoproject.com/ticket/32691>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* status: new => closed
* type: Uncategorized => Cleanup/optimization
* resolution: => invalid
Comment:
Thanks for this ticket, however both clauses are correct in MySQL. The way
how the ORM builds filters is an implementation detail not something
Django should guarantee (see #25367).
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:1>
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:2>
Comment (by Todor Velichkov):
Hi Mariusz,
I don't quite agree, both clauses are syntactically correct yes, but they
are not the same.
MySQL does not have a boolean type of column, so Django is using
[https://docs.oracle.com/cd/E17952_01/mysql-8.0-en/integer-types.html
tinyint]
which stores value between `-128..127` or `0..255`
When you tell `mysql` you are looking for {{{`mytable`.`myfield`}}} this
basically translates to all values between `-128..127` except `0` which is
a range query equivalent to `myfield__in=[-128..-1]+[1..127]`
even `myfield__exact` lookup is acting the same way.
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:3>
Comment (by Mariusz Felisiak):
> MySQL does not have a boolean type of column, so Django is using
tinyint
Django is using `BOOL` which is an alias for `tinyint(1)`. As far as I'm
aware, storing values except `0` and `1` in `BOOL` columns has never been
supported by Django.
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:4>
Comment (by Todor Velichkov):
Django is not storing values except `0` and `1` in `BOOL`, but MySQL
`tinyint(1)` [https://www.mysqltutorial.org/mysql-boolean/ support it].
So my question is, why is Django asking MySQL to search for values other
than `0` and `1` in a `BOOL` column. Doesn't it make sense to only search
for `0` and `1` when Django is only storing `0` and `1`?
My second question is: Let's say Django is right about asking MySQL to
search for values other than `0` and `1` in a `BOOL` column, then
shouldn't it at least `myboofield__exact` act in a different way and ask
MySQL to look for a specific value ? Django is literally leaving us w/o an
option here and using a boolean field as index/ inside composite index is
impossible.
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:5>
* cc: Adam Johnson (added)
Comment:
> Django is literally leaving us w/o an option here and using a boolean
field as index/ inside composite index is impossible.
I'm not a MySQL expert, can you provide any link to the MySQL docs which
will confirm that it's not possible. It's surprising that you cannot
create an index with a `BOOL` (`TINYINT`) column.
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:6>
Comment (by Todor Velichkov):
The issue is not that the index will not be created, the index will be
created, but it will not be used against expressions like:
{{{
WHERE `mytable`.`myfield`
}}}
because such an expression evaluates to `True` for a range of values (in
the case of signed `tinyint(1)` the range is `-128..127` except `0`) which
turns the query into a `range` query which are
[https://dev.mysql.com/doc/refman/8.0/en/range-optimization.html hard to
optimize]. MySQL needs a constant value like:
{{{
WHERE `mytable`.`myfield` = 1
}}}
in order for the index to be used.
An example: If we have a row with a value of `myfield = 2` (MySQL allows
this because the column is `tinyint(1)`) then:
- The first expression will include this row in the results because `2`
evaluates to `True`
- The second expression will not include this row because `2 != 1`
Django only stores `0` and `1` why its expression is including rows which
Django itself do not store.
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:7>
Comment (by Simon Charette):
If we want to move forward with this change the only possible resolution I
see is to add a `has_native_boolean_type` database feature and set it to
false for MySQL. Another more intrusive alternative would be to have
`BooleanField.get_db_prep_value` return an `int` instead of a `bool` on
MySQL so the lookup machinery is made aware of this quirk instead of
having it magically happen at the database connector layer (`mysqlclient`,
`pymysql`).
In the mean time this can be worked around by doing
`.filter(boolean_field=1)`. Entirely reverting this change would possibly
cause a breakage in all function index on boolean expressions created on
other backends.
In all cases I'd argue that creating an index on a boolean field is a bad
practice particularly when it's backed by a b-tree implementation and your
backend doesn't support partial indices to allow excluding parts of the
covered table as your underlying tree will get heavily unbalanced on way
or the other.
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:8>
Comment (by Todor Velichkov):
In the mean time this can be worked around by doing
`.filter(boolean_field=1)`. Entirely reverting this change would possibly
cause a breakage in all function index on boolean expressions created on
other backends.
I tried this before opening the ticket, but it didn't work, however it
looks like `.filter(boolean_field=models.Value(1))` is working!!
For now this should be sufficient for us to complete our migration to
`Django 3.2`! Thanks for the time spent on this!
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:9>
Comment (by Roman Miroshnychenko):
Why not add `as_mysql()` method to `Exact` lookup that simply calls
parent's `as_sql()` method instead of doing this boolean field magic as in
`Exact.as_sql()`? We had the same issue as in the original post and this
solution worked for us.
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:10>
* cc: Simon Charette (added)
Comment:
Adding a dedicated `Exact.as_mysql` method would also work, feels like a
feature flag might be more appropriate though.
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:11>
* cc: Dan Strokirk (added)
Comment:
We've also seen this change cause a production outage when we've tried
upgrading to Django 3, so either a notice in the docs or some other way to
to avoid it changing back again in the future would be great.
Thanks Todor for your tip about `models.Value(1)`!
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:12>
* has_patch: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:13>
Comment (by Roman Miroshnychenko):
I have submitted a PR, hopefully it will be accepted.
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:14>
* status: closed => new
* resolution: invalid =>
* stage: Unreviewed => Accepted
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:15>
* owner: nobody => Roman Miroshnychenko
* status: new => assigned
* stage: Accepted => Ready for checkin
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:16>
* status: assigned => closed
* resolution: => fixed
Comment:
In [changeset:"407fe95cb116599adeb4b9ed01df5673aa5cb1db" 407fe95c]:
{{{
#!CommitTicketReference repository=""
revision="407fe95cb116599adeb4b9ed01df5673aa5cb1db"
Fixed #32691 -- Made Exact lookup on BooleanFields compare directly to a
boolean value on MySQL.
Performance regression in 37e6c5b79bd0529a3c85b8c478e4002fd33a2a1d.
Thanks Todor Velichkov for the report.
Co-authored-by: Mariusz Felisiak <felisiak...@gmail.com>
}}}
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:17>
Comment (by anze3db):
Any chance that we could cheery-pick the fix into Django 3.2? The issue
nearly caused a production outage for us when we upgraded to Django 3.2
and it would be great to make sure nobody else gets burned by this.
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:18>
Comment (by Mariusz Felisiak):
Replying to [comment:18 Anže Pečar]:
> Any chance that we could cheery-pick the fix into Django 3.2? The issue
nearly caused a production outage for us when we upgraded to Django 3.2
and it would be great to make sure nobody else gets burned by this.
Unfortunately not, this is a regression in Django 3.1 that is no longer
supported. Per our backporting policy this means it doesn't qualify for a
backport to 3.2.x or 4.0.x anymore.
See [https://docs.djangoproject.com/en/stable/internals/release-process/
Django’s release process] for more details.
--
Ticket URL: <https://code.djangoproject.com/ticket/32691#comment:19>