The `sql_flush` database operation, used by the 'flush' management command
and, thus through `TransactionTestCase`, creates one `TRUNCATE TABLE`
statement for every table - regardless of whether they are empty already.
This means unnecessary database operations are run, which will be most
noticeable in test runs.
Instead of flushing *every* table, we can flush only the non-empty ones.
A minimal, hopefully cross-database way of quickly finding out which
tables in a list are non-empty is to make existence queries on them
combined with union:
{{{
chainz@localhost [14]> (select 't' as has_rows from t limit 1) union
(select 't2' as has_rows from t2 limit 1);
+----------+
| has_rows |
+----------+
| t2 |
+----------+
1 row in set (0.001 sec)
}}}
We could use logic like this to batch the tables into such union queries
in `django.core.management.sql.sql_flush`. We could then reduce the list
of tables we wish to flush down to the non-empty ones for the `sql_flush`
operation. Perhaps it would be best to put this behind a flag that is set
only in tests, for backwards compatibility.
--
Ticket URL: <https://code.djangoproject.com/ticket/31276>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* stage: Unreviewed => Accepted
Comment:
+1. What sort of backwards compatibility issues would you imagine here?
--
Ticket URL: <https://code.djangoproject.com/ticket/31276#comment:1>
Comment (by Adam (Chainz) Johnson):
Not sure actually, just being cautious by suggesting the flag in case
someone else can come up with a case. As far as I can tell, as long as we
still reset sequences, there’s no need to truncate/delete from empty
tables.
--
Ticket URL: <https://code.djangoproject.com/ticket/31276#comment:2>
Comment (by felixxm):
I'm not sure, TBH. Checking no. of rows for each table will have a
performance impact, and creating a huge union for all tables (as proposed)
may not be better:
{{{
(select 't' as has_rows from t limit 1) union (select 't2' as has_rows
from t2 limit 1);
}}}
We need to take into account a few restrictions e.g. the database's limit
on the number of query parameters and 3rd-party database backends may not
support `UNION`.
--
Ticket URL: <https://code.djangoproject.com/ticket/31276#comment:3>
* owner: nobody => jonatron
* status: new => assigned
--
Ticket URL: <https://code.djangoproject.com/ticket/31276#comment:4>
Comment (by jonatron):
I've been benchmarking with 1000 tables and 5 tests. You'll notice sqlite
has a problem with too many tables. Also the query for sqlite needs to be:
{{{select * from ( SELECT '{}' AS has_rows FROM {} LIMIT 1 )}}}
instead of
{{{( SELECT '{}' AS has_rows FROM {} LIMIT 1 )}}}
postgres
current behaviour
Ran 5 tests in 1.538s
Ran 5 tests in 1.545s
Ran 5 tests in 1.573s
with patch
Ran 5 tests in 0.807s
Ran 5 tests in 0.880s
Ran 5 tests in 0.812s
mysql
current behaviour
Ran 5 tests in 17.101s
Ran 5 tests in 15.396s
Ran 5 tests in 15.487s
with patch
Ran 5 tests in 3.891s
Ran 5 tests in 3.545s
Ran 5 tests in 3.522s
sqlite - 400 tables
current behaviour
Ran 5 tests in 2.006s
Ran 5 tests in 2.039s
Ran 5 tests in 2.055s
with patch
Ran 5 tests in 0.106s
Ran 5 tests in 0.104s
Ran 5 tests in 0.107s
sqlite - 1000 tables
django.db.utils.OperationalError: too many terms in compound SELECT
--
Ticket URL: <https://code.djangoproject.com/ticket/31276#comment:5>
Comment (by felixxm):
About SQLite, I mentioned this in my
[https://code.djangoproject.com/ticket/31276#comment:3 comment] ''"... the
database's limit on the number of query parameters"'' (999 on SQLite).
--
Ticket URL: <https://code.djangoproject.com/ticket/31276#comment:6>
Comment (by jonatron):
The sqlite problem can be fixed using the same chunking as is in
_quote_params_for_last_executed_query.
I can't explain sqlite's performance.
sqlite
1000 tables with patch
Ran 5 tests in 0.237s
Ran 5 tests in 0.345s
Ran 5 tests in 0.288s
Ran 5 tests in 0.257s
Ran 5 tests in 0.240s
1000 tables without patch
Ran 5 tests in 55.617s
Ran 5 tests in 57.577s
Ran 5 tests in 55.242s
900 tables without patch
Ran 5 tests in 45.796s
800 tables without patch
Ran 5 tests in 35.795s
600 tables without patch
Ran 5 tests in 20.521s
550 tables without patch
Ran 5 tests in 19.025s
500 tables without patch
Ran 5 tests in 3.049s
400 tables without patch
Ran 5 tests in 2.068s
200 tables without patch
Ran 5 tests in 0.612s
--
Ticket URL: <https://code.djangoproject.com/ticket/31276#comment:7>
* has_patch: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/31276#comment:8>
* needs_docs: 0 => 1
--
Ticket URL: <https://code.djangoproject.com/ticket/31276#comment:9>
* status: assigned => closed
* resolution: => wontfix
--
Ticket URL: <https://code.djangoproject.com/ticket/31276#comment:12>