#36248: Bulk deletion of model referred to by a SET_NULL key can exceed parameter limit
-------------------------------------+-------------------------------------
     Reporter:  bobince              |           Type:  Uncategorized
       Status:  new                  |      Component:  Database layer
                                     |                  (models, ORM)
      Version:  5.1                  |       Severity:  Normal
     Keywords:                       |   Triage Stage:  Unreviewed
    Has patch:  1                    |  Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |          UI/UX:  0
-------------------------------------+-------------------------------------
To avoid hitting DBMS-specific limits on the number of parameters that can
be used in a parameterised query, Django does bulk operations in
limited-size batches. One such place is db.models.deletion.Collector.collect,
which calls get_del_batches to split the list of objects into manageable
chunks for deletion. This batching was added in #16426.
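For illustration, here's a minimal sketch of the chunking idea (not
Django's actual code: the hard-coded parameter budget and the helper name
are simplifying assumptions; the real get_del_batches sizes batches via
connection.ops.bulk_batch_size):
{{{
# Sketch only: roughly one parameter per object per referencing field,
# against an assumed flat per-query parameter budget.
def chunk_for_deletion(objs, fields, max_query_params=999):
    batch_size = max(max_query_params // max(len(fields), 1), 1)
    return [objs[i:i + batch_size] for i in range(0, len(objs), batch_size)]
}}}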
If there are keys on other models referring to the model being deleted,
and those keys have an on_delete setting that causes an update to the
referring field (such as SET_NULL), the Collector's field_updates list
gathers per-chunk updates to each such field. However, Collector.delete
then combines the QuerySets generated for those updates into a single
combined filter, so if the total number of rows exceeds the parameter
limit, the resulting query fails.
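To make the failure concrete, here's a hedged sketch using hypothetical
Referrer/Target models (Referrer.target being a nullable SET_NULL foreign
key) and batches produced by chunking as above:
{{{
from functools import reduce
from operator import or_

# Hypothetical models: each chunk yields a queryset carrying its own
# pk parameter list.
chunks = [Referrer.objects.filter(target__in=batch) for batch in batches]

# Collector.delete OR-combines the chunks, so every parameter lands in
# one UPDATE statement whose parameter count grows with the total row
# count rather than the batch size.
combined = reduce(or_, chunks)
combined.update(target=None)
}}}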
It seems straightforward to stop doing that:
{{{
@@ -481,9 +481,8 @@ class Collector:
                         updates.append(instances)
                     else:
                         objs.extend(instances)
-                if updates:
-                    combined_updates = reduce(or_, updates)
-                    combined_updates.update(**{field.name: value})
+                for update in updates:
+                    update.update(**{field.name: value})
                 if objs:
                     model = objs[0].__class__
                     query = sql.UpdateQuery(model)
}}}
However:
- I'm not certain what the original intent was behind combining the
updates here. Can there be multiple updates for some reason other than
batch chunking, for which we want to minimise the number of queries? The
combining was introduced in #33928.
- Testing this is tricky, since the SQLite parameter limit is now often
higher than the traditional 999, and Python's sqlite3 module doesn't let
us read or change it. On my Ubuntu box it's 250000, which is more rows
than is comfortable to create in a test. We can count queries the way
DeletionTests.test_large_delete does, but that doesn't directly check the
thing we actually care about: that the number of parameters per query
stays within bounds. (It's not currently possible to read the number of
parameters used from the queries_log.) A sketch of one possible
workaround follows below.
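One way to sidestep the huge real-world limits in a test is to shrink the
batch size artificially. A sketch, again assuming hypothetical
Referrer/Target models, that uses mock to force bulk_batch_size down so
multiple chunks are exercised with only a handful of rows:
{{{
from unittest import mock

from django.db import connection
from django.test import TestCase

from .models import Referrer, Target  # hypothetical test models


class SetNullBatchTests(TestCase):
    def test_set_null_updates_respect_batches(self):
        targets = Target.objects.bulk_create([Target() for _ in range(25)])
        Referrer.objects.bulk_create([Referrer(target=t) for t in targets])
        # Force get_del_batches to split the 25 targets into 3 chunks
        # without creating hundreds of thousands of rows.
        with mock.patch.object(
            connection.ops, "bulk_batch_size", return_value=10
        ):
            Target.objects.all().delete()
        # With per-chunk updates every referrer still gets nulled out;
        # as noted above, queries_log can't verify the per-query
        # parameter count directly.
        self.assertEqual(
            Referrer.objects.filter(target__isnull=True).count(), 25
        )
}}}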
(Diversion: in practice I'm personally affected by this issue more in
mssql-django, but there are a bunch of confounding factors around that
backend's earlier, somewhat wonky attempts to work around the parameter
limit on its own. https://github.com/microsoft/mssql-django/issues/156
has some background.)
--
Ticket URL: <https://code.djangoproject.com/ticket/36248>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.