[Django] #31275: Optimize MariaDB/MySQL sql_flush

80 views
Skip to first unread message

Django

unread,
Feb 15, 2020, 11:43:04 AM2/15/20
to django-...@googlegroups.com
#31275: Optimize MariaDB/MySQL sql_flush
-------------------------------------+-------------------------------------
Reporter: Adam | Owner: nobody
(Chainz) Johnson |
Type: | Status: new
Cleanup/optimization |
Component: Database | Version: master
layer (models, ORM) |
Severity: Normal | Keywords:
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
This was suggested as an extra feature for Django-MySQL by @jonatron in
https://github.com/adamchainz/django-mysql/pull/611/files , but it would
make more sense to make it in Django core.

Currently MySQL's sql_flush emits TRUNCATE TABLE statements. These are
relatively slow on small tables since they force recreation of the table
on disk. The alternative, `DELETE FROM`, is faster. I can see a diference
even for a single empty table locally (MariaDB 10.4):

{{{
chainz@localhost [21]> truncate t2;
Query OK, 0 rows affected (0.005 sec)

chainz@localhost [22]> delete from t2;
Query OK, 0 rows affected (0.000 sec)
}}}

`TransactionTestCase` uses sql_flush, via the flush management command, to
reset the database. Most test suites don't use very large tables, so
using `DELETE FROM` should normally be faster in this case. Optimizing it
MySQL to use `DELETE FROM` for flushing, at least for tables up to a
certain size (1000 rows or similar heuristic) could save lot of time when
using `TransactionTestCase`.

--
Ticket URL: <https://code.djangoproject.com/ticket/31275>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Feb 16, 2020, 11:17:49 AM2/16/20
to django-...@googlegroups.com
#31275: Optimize MariaDB/MySQL sql_flush
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) | Owner: nobody

Johnson |
Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Claude Paroz):

* stage: Unreviewed => Accepted


Comment:

I confirm that `TransactionTestCase` is horrrrribly slow on MySQL/MariaDB!
So a big +1 from me.

--
Ticket URL: <https://code.djangoproject.com/ticket/31275#comment:1>

Django

unread,
Feb 18, 2020, 1:40:28 PM2/18/20
to django-...@googlegroups.com
#31275: Optimize MariaDB/MySQL sql_flush
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) | Owner: jonatron
Johnson |
Type: | Status: assigned

Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by jonatron):

* owner: nobody => jonatron
* status: new => assigned


--
Ticket URL: <https://code.djangoproject.com/ticket/31275#comment:2>

Django

unread,
Feb 21, 2020, 9:01:26 AM2/21/20
to django-...@googlegroups.com
#31275: Optimize MariaDB/MySQL sql_flush
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) | Owner: (none)

Johnson |
Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by jonatron):

* owner: jonatron => (none)
* status: assigned => new


Comment:

If I recall correctly, TRUNCATE will reset AUTO_INCREMENT but DELETE FROM
won't.

--
Ticket URL: <https://code.djangoproject.com/ticket/31275#comment:3>

Django

unread,
Feb 21, 2020, 9:56:06 AM2/21/20
to django-...@googlegroups.com
#31275: Optimize MariaDB/MySQL sql_flush
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) | Owner: (none)

Johnson |
Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Adam (Chainz) Johnson):

If that's the case we'd want to `ALTER TABLE tablename AUTO_INCREMENT = 1`
for those tables with auto-increment keys.

--
Ticket URL: <https://code.djangoproject.com/ticket/31275#comment:4>

Django

unread,
Mar 26, 2020, 12:58:32 PM3/26/20
to django-...@googlegroups.com
#31275: Optimize MariaDB/MySQL sql_flush
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) | Owner: (none)

Johnson |
Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Masashi SHIBATA):

Hi! Can I work on this ticket?

--
Ticket URL: <https://code.djangoproject.com/ticket/31275#comment:5>

Django

unread,
Mar 26, 2020, 1:17:48 PM3/26/20
to django-...@googlegroups.com
#31275: Optimize MariaDB/MySQL sql_flush
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) | Owner: (none)

Johnson |
Type: | Status: new
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Claude Paroz):

No need to ask when there is no owner. Please do!

--
Ticket URL: <https://code.djangoproject.com/ticket/31275#comment:6>

Django

unread,
Mar 27, 2020, 4:18:29 AM3/27/20
to django-...@googlegroups.com
#31275: Optimize MariaDB/MySQL sql_flush
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) | Owner: Masashi
Johnson | SHIBATA
Type: | Status: assigned

Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Masashi SHIBATA):

* owner: (none) => Masashi SHIBATA


* status: new => assigned


Comment:

Thank you!

--
Ticket URL: <https://code.djangoproject.com/ticket/31275#comment:7>

Django

unread,
Mar 27, 2020, 5:57:06 AM3/27/20
to django-...@googlegroups.com
#31275: Optimize MariaDB/MySQL sql_flush
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) | Owner: Masashi
Johnson | SHIBATA
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Masashi SHIBATA):

* has_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/31275#comment:8>

Django

unread,
Mar 27, 2020, 5:57:30 AM3/27/20
to django-...@googlegroups.com
#31275: Optimize MariaDB/MySQL sql_flush
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) | Owner: Masashi
Johnson | SHIBATA
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Masashi SHIBATA):

I created a pull request.
https://github.com/django/django/pull/12634

--
Ticket URL: <https://code.djangoproject.com/ticket/31275#comment:9>

Django

unread,
Mar 28, 2020, 11:50:30 AM3/28/20
to django-...@googlegroups.com
#31275: Optimize MariaDB/MySQL sql_flush
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) | Owner: Masashi
Johnson | SHIBATA
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* cc: Simon Charette (added)


Comment:

> If that's the case we'd want to `ALTER TABLE tablename AUTO_INCREMENT =
1` for those tables with auto-increment keys.

I decided to guide the patch submitter towards avoiding these queries and
suggesting the use of `TransactionTestCase.reset_sequences` instead while
documenting the behaviour changed in Django 3.1 and that this feature
should be explicitly enabled to achieve this previously implicit reset.

It made more sense to me since the goal of the ticket is to make this
operation faster and these sequence alteration queries took longer than
plain `TRUNCATE` calls from my local testing which completely defeats the
purpose of this ticket. The fact we have an explicit feature to achieve
this behaviour only makes it slightly backward incompatible for users
relying on this undefined behaviour.

Based on the fact `ALTER TABLE tablename AUTO_INCREMENT = 1` is slower
than `TRUNCATE tablename` I guess MySQL's `sql_flush` could be optimized
further to only issue the latter when asked to flush both a table and its
sequence. Given `TransactionTestCase`
[https://github.com/django/django/blob/b9336b78cf2a9a29f4934041c9e221bc68daec80/django/test/testcases.py#L1037-L1040
calls the flush command] with `reset_sequences=False` that would allow us
to favour `DELETE` over `TRUNCATE` when no sequences are meant to be
flushed and grasp the benefit in the test suite without relying on the
1000 rows heuristics which is a bit arbitrary and require a schema
introspection query which isn't cheap either.

I'll add that the benefits for Django's test suite and any suite that
[https://docs.djangoproject.com/en/3.0/topics/testing/advanced/#django.test.TransactionTestCase.available_apps
heavily uses] `TransactionTestCase.available_apps` will likely be minimal
as this ms difference between `DELETE` and `TRUNCATE` is only apparent
when the number flushed tables after every test is large and the
`available_apps` feature greatly reduces it this number in large Django
project. In summary projects that will benefits from this patch would
benefit way more by defining `available_apps` on their tests.

--
Ticket URL: <https://code.djangoproject.com/ticket/31275#comment:10>

Django

unread,
Mar 29, 2020, 4:15:17 AM3/29/20
to django-...@googlegroups.com
#31275: Optimize MariaDB/MySQL sql_flush
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) | Owner: Masashi
Johnson | SHIBATA
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Adam (Chainz) Johnson):

> I decided to guide the patch submitter towards avoiding these queries


and suggesting the use of `TransactionTestCase.reset_sequences` instead
while documenting the behaviour changed in Django 3.1 and that this
feature should be explicitly enabled to achieve this previously implicit
reset.

Very sensible. I forget about this option!

> In summary projects that will benefits from this patch would benefit way
more by defining `available_apps` on their tests.

Yes indeed, although it can be tedious to define and keep correct.

--
Ticket URL: <https://code.djangoproject.com/ticket/31275#comment:11>

Django

unread,
Apr 2, 2020, 7:22:24 AM4/2/20
to django-...@googlegroups.com
#31275: Optimize MariaDB/MySQL sql_flush
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) | Owner: Masashi
Johnson | SHIBATA
Type: | Status: assigned
Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/31275#comment:12>

Django

unread,
Apr 2, 2020, 7:34:05 AM4/2/20
to django-...@googlegroups.com
#31275: Optimize MariaDB/MySQL sql_flush
-------------------------------------+-------------------------------------
Reporter: Adam (Chainz) | Owner: Masashi
Johnson | SHIBATA
Type: | Status: closed

Cleanup/optimization |
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed

Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak <felisiak.mariusz@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"89032876f427a77ab4de26493190280377567d1c" 89032876]:
{{{
#!CommitTicketReference repository=""
revision="89032876f427a77ab4de26493190280377567d1c"
Fixed #31275 -- Optimized sql_flush() without resetting sequences on
MySQL.

Co-Authored-By: Simon Charette <char...@users.noreply.github.com>
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/31275#comment:13>

Reply all
Reply to author
Forward
0 new messages