[Django] #36248: Bulk deletion of model referred to by a SET_NULL key can exceed parameter limit

16 views
Skip to first unread message

Django

unread,
Mar 11, 2025, 8:26:01 PMMar 11
to django-...@googlegroups.com
#36248: Bulk deletion of model referred to by a SET_NULL key can exceed parameter
limit
-------------------------------------+-------------------------------------
Reporter: bobince | Type:
| Uncategorized
Status: new | Component: Database
| layer (models, ORM)
Version: 5.1 | Severity: Normal
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
To avoid hitting DBMS-specific limits on the number of parameters that can
be used in a parameterised query, Django tries to do bulk operations in
limited-size batches. This includes in
db.models.deletion.Collector.collect, which calls get_del_batches to split
a list of objects into manageable chunks for deletion. This was done in
#16426.

If there are keys on other models referring to the model being deleted,
and those keys have an on_delete setting that would cause an update to
that field (such as SET_NULL), the Collector's field_updates list will
gather per-chunk updates to each such field. However, Collector.delete
then combines the QuerySets generated for each update into a single
combined filter. If there are more rows in total than the parameter limit
then this will fail.

It seems straightforward to stop doing that:

{{{
@@ -481,9 +481,8 @@ class Collector:
updates.append(instances)
else:
objs.extend(instances)
- if updates:
- combined_updates = reduce(or_, updates)
- combined_updates.update(**{field.name: value})
+ for update in updates:
+ update.update(**{field.name: value})
if objs:
model = objs[0].__class__
query = sql.UpdateQuery(model)
}}}

However:

- I'm not certain what the original intent was behind combining the
updates here. Can there be multiple updates for some other reason than
batch chunking, that we want to minimise queries for? This happened in
#33928.

- Testing it is a bit tricky since the SQLite parameter limit is now often
higher than the traditional 999, and Python's sqlite3 doesn't let us read
or change it. On my Ubuntu box here it's 250000, which is a bit more than
is comfortable to be doing in a test. We can count queries like
DeletionTests.test_large_delete does but that's not _exactly_ checking the
thing we actually care about, that the number of parameters is within
bounds. (It's not possible at present to read the number of parameters
used from the queries_log.)

(Diversion: actually in practice I'm personally affected more by this
issue in mssql-django, but there are a bunch of confounding factors around
that backend's prior wonky attempts to work around the parameter limit on
its own. https://github.com/microsoft/mssql-django/issues/156 has some
background.)
--
Ticket URL: <https://code.djangoproject.com/ticket/36248>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Mar 11, 2025, 8:32:10 PMMar 11
to django-...@googlegroups.com
#36248: Bulk deletion of model referred to by a SET_NULL key can exceed parameter
limit
-------------------------------------+-------------------------------------
Reporter: bobince | Owner: (none)
Type: Uncategorized | Status: new
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage:
| Unreviewed
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by bobince):

> We can count queries like DeletionTests.test_large_delete does but [...]

Done this here FWIW:
https://github.com/django/django/compare/main...bobince:django:batch_delete_with_set_null?expand=1
--
Ticket URL: <https://code.djangoproject.com/ticket/36248#comment:1>

Django

unread,
Mar 12, 2025, 2:03:41 AMMar 12
to django-...@googlegroups.com
#36248: Bulk deletion of model referred to by a SET_NULL key can exceed parameter
limit
-------------------------------------+-------------------------------------
Reporter: bobince | Owner: bobince
Type: Bug | Status: assigned
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* owner: (none) => bobince
* stage: Unreviewed => Accepted
* status: new => assigned
* type: Uncategorized => Bug

Comment:

The analysis and the patch here are on point (thanks bobince!) so I'll
accept and mark with ''has_patch''.

The only thing I'd change is the test so `bulk_batch_size` is mocked to
return a small number (e.g. 10) and not only test the number of executed
queries but count the number of parameters for each queries (you can use
`self.assertNumQueries` as a context manager for that).

That would avoid having to create 200s objects unnecessarily but more
importantly test the chunking logic in a backend agnostic way.
--
Ticket URL: <https://code.djangoproject.com/ticket/36248#comment:2>

Django

unread,
Mar 12, 2025, 8:22:59 AMMar 12
to django-...@googlegroups.com
#36248: Bulk deletion of model referred to by a SET_NULL key can exceed parameter
limit
-------------------------------------+-------------------------------------
Reporter: bobince | Owner: bobince
Type: Bug | Status: assigned
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by bobince):

> The only thing I'd change is the test so bulk_batch_size is mocked to
return a small number

will do!

> count the number of parameters for each queries

Yeah that's what I wanted to do, but was frustrated:
connection.queries_log/assertNumQueries doesn't keep hold of the number of
parameters, and you can't infer them from `%s` being in the SQL string
because DebugCursorWrapper uses last_executed_query to replace them with
parameter values.

I guess I could scrape the SQL string for number of integers but this
feels pretty fragile. Or I could add a "params_count" entry to the dicts
in queries_log, but that's more intrusive than I'd like just for the
benefit of one test.
--
Ticket URL: <https://code.djangoproject.com/ticket/36248#comment:3>

Django

unread,
Mar 12, 2025, 8:55:57 AMMar 12
to django-...@googlegroups.com
#36248: Bulk deletion of model referred to by a SET_NULL key can exceed parameter
limit
-------------------------------------+-------------------------------------
Reporter: bobince | Owner: bobince
Type: Bug | Status: assigned
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Simon Charette):

> Yeah that's what I wanted to do, but was frustrated:
connection.queries_log/assertNumQueries doesn't keep hold of the number of
parameters, and you can't infer them from %s being in the SQL string
because DebugCursorWrapper uses last_executed_query to replace them with
parameter values.

Crap, I forgot that I ran into the exact same thing so many times and
wished it would capture uninterpolated SQL and params intead. Never mind
in this case.
--
Ticket URL: <https://code.djangoproject.com/ticket/36248#comment:4>

Django

unread,
Mar 12, 2025, 8:56:21 AMMar 12
to django-...@googlegroups.com
#36248: Bulk deletion of model referred to by a SET_NULL key can exceed parameter
limit
-------------------------------------+-------------------------------------
Reporter: bobince | Owner: bobince
Type: Bug | Status: assigned
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* cc: Simon Charette (added)

--
Ticket URL: <https://code.djangoproject.com/ticket/36248#comment:5>

Django

unread,
Mar 12, 2025, 5:52:04 PMMar 12
to django-...@googlegroups.com
#36248: Bulk deletion of model referred to by a SET_NULL key can exceed parameter
limit
-------------------------------------+-------------------------------------
Reporter: bobince | Owner: bobince
Type: Bug | Status: assigned
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by bobince):

> I guess I could scrape the SQL string for number of integers but this
feels pretty fragile.

Have tried something similar to this (counting commas in ‘IN’ clauses) and
actually maybe it's not thaaat bad. I've written worse.

Now I have a test that's only counting parameters not queries, I tried
using the existing test model ‘R’ instead of making a new one. This turned
up an delightful wrinkle in that the fast-delete path can also ‘OR’
together multiple querysets, this time when there are multiple references
on one model to rows being deleted. For example if Article.author and
Article.editor are ForeignKeys to Person with on_delete=CASCADE, and 500
Person rows are deleted, deletion will put together a query like `DELETE
FROM article WHERE author IN (...500 params...) OR editor IN (...another
500 params...)` and if you have a DBMS with a param limit of 999 you're
now sad.

So, er, back to simpler relations for now. I don't think there's a
straightforward fix to this — it would take a more wide-ranging rethink of
how param limits impact batch sizes and I don't really have a good idea
what that would look like. Ugh, I wish param limits would go away
--
Ticket URL: <https://code.djangoproject.com/ticket/36248#comment:6>

Django

unread,
Mar 23, 2025, 12:10:37 AMMar 23
to django-...@googlegroups.com
#36248: Bulk deletion of model referred to by a SET_NULL key can exceed parameter
limit
-------------------------------------+-------------------------------------
Reporter: bobince | Owner: bobince
Type: Bug | Status: assigned
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Simon Charette):

Thanks for reporting your findings bobince. We are aware that our current
params limiting is far from bullet proof.

The problem
[https://github.com/django/django/pull/19088#discussion_r1925636305 was
recently re-surfaced] when we added support for composite primary key as
they require that the chunk size be divided by the number of fields
included in the relationship.

Some related tickets are #36143, #36144.

> Ugh, I wish param limits would go away

FYI you might be interested to know that SQLite 3.32+
[https://www.sqlite.org/limits.html#max_variable_number seems to default
to a much larger number].
--
Ticket URL: <https://code.djangoproject.com/ticket/36248#comment:7>

Django

unread,
Jun 3, 2025, 12:29:09 PMJun 3
to django-...@googlegroups.com
#36248: Bulk deletion of model referred to by a SET_NULL key can exceed parameter
limit
-------------------------------------+-------------------------------------
Reporter: bobince | Owner: bobince
Type: Bug | Status: assigned
Component: Database layer | Version: 5.1
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Sage Abdullah):

* cc: Sage Abdullah (added)

Comment:

I also noticed this issue when I was testing https://github.com/django
/django-docker-box/pull/50#discussion_r1933066031 with a low enough value
for `SQLITE_MAX_VARIABLE_NUMBER`. Some tests that do
`ContentType.objects.all().delete()` would fail with a similar error, and
I think I tried hacking a similar solution to yours but I wasn't sure if
that was the right approach as I'm not familiar with the deletion code.
--
Ticket URL: <https://code.djangoproject.com/ticket/36248#comment:8>
Reply all
Reply to author
Forward
0 new messages