[Django] #23576: ORM delete strategy can lead to race conditions inside a transaction

96 views
Skip to first unread message

Django

unread,
Sep 30, 2014, 3:54:38 PM9/30/14
to django-...@googlegroups.com
#23576: ORM delete strategy can lead to race conditions inside a transaction
----------------------------------------------+--------------------
Reporter: jdufresne | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer (models, ORM) | Version: 1.7
Severity: Normal | Keywords:
Triage Stage: Unreviewed | Has patch: 0
Easy pickings: 0 | UI/UX: 0
----------------------------------------------+--------------------
Originally posted to #19544 due to similar error message, but I this is a
different ticket.

----

I have investigated this a bit more. For me the race occurs for the
following reason:

Suppose you have `MyModel` that has a unique constraint. Suppose you have
a view of the form:

{{{
@transaction.atomic
def my_view(request):
MyModel.objects.some_long_filter().delete()
objs = builds_list_of_unsaved_models_from_request(request)
MyModel.objects.bulk_create(objs)
}}}

That is, a view that builds a list of objects in bulk. The view first
deletes the objects that were created by any previous requests to ensure
the unique constraint is not violated. From my investigation, the
`delete()` will not actually call an SQL `DELETE` statement directly -- as
I would have expected -- but instead caches a list of ids of objects that
exist, then calls a new SQL query with a `DELETE .. WHERE id IN (list of
ids)`. This all happens in `DeleteQuery.delete_qs()` file
`django/db/models/sql/subqueries.py:52`.

Now suppose you have some original data. Then, two requests occur at the
same time. The first request and the second request will both cache the
same ids from the original data to delete. Suppose the first request
finishes slightly earlier and commits the *new objects with new ids*. Now,
the second request will try to delete the cached ids from the original
data and *not* the newly created ids the first request. Upon save there is
a unique constraint violation -- as the wrong ids were deleted -- causing
an `IntegrityError`.

Coming from an application with many raw SQL queries, this unique
constraint violation was a big surprise inside a transaction. The caching
of ids in the `delete()` function seems a sneaky source of race
conditions.

----

Traceback when the IntegrityError occurs:

{{{
Traceback (most recent call last):
File "manage.py", line 13, in <module>
execute_from_command_line(sys.argv)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/django/core/management/__init__.py", line 399, in
execute_from_command_line
utility.execute()
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/django/core/management/__init__.py", line 392, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/django/core/management/base.py", line 242, in run_from_argv
self.execute(*args, **options.__dict__)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/django/core/management/base.py", line 285, in execute
output = self.handle(*args, **options)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/django/core/management/base.py", line 415, in handle
return self.handle_noargs(**options)
File
"/home/jon/devel/myproj/development/myapp/management/commands/thetest.py",
line 36, in handle_noargs
MyModel.objects.bulk_create(objs)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/django/db/models/manager.py", line 160, in bulk_create
return self.get_queryset().bulk_create(*args, **kwargs)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/django/db/models/query.py", line 359, in bulk_create
self._batched_insert(objs_without_pk, fields, batch_size)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/django/db/models/query.py", line 838, in _batched_insert
using=self.db)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/django/db/models/manager.py", line 232, in _insert
return insert_query(self.model, objs, fields, **kwargs)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/django/db/models/query.py", line 1514, in insert_query
return query.get_compiler(using=using).execute_sql(return_id)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/django/db/models/sql/compiler.py", line 903, in execute_sql
cursor.execute(sql, params)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/django/db/backends/util.py", line 69, in execute
return super(CursorDebugWrapper, self).execute(sql, params)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/django/db/backends/util.py", line 53, in execute
return self.cursor.execute(sql, params)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/django/db/utils.py", line 99, in __exit__
six.reraise(dj_exc_type, dj_exc_value, traceback)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/django/db/backends/util.py", line 53, in execute
return self.cursor.execute(sql, params)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/django/db/backends/mysql/base.py", line 124, in execute
return self.cursor.execute(query, args)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/MySQLdb/cursors.py", line 205, in execute
self.errorhandler(self, exc, value)
File "/home/jon/devel/myproj/development/venv/lib/python2.7/site-
packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
raise errorclass, errorvalue
django.db.utils.IntegrityError: (1062, "Duplicate entry 'mytype-7-6' for
key 'type'")
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/23576>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Sep 30, 2014, 11:08:21 PM9/30/14
to django-...@googlegroups.com
#23576: ORM delete strategy can lead to race conditions inside a transaction
-------------------------------------+-------------------------------------

Reporter: jdufresne | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 1.7
(models, ORM) | Resolution:
Severity: Normal | Triage Stage:
Keywords: | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by jdufresne):

* needs_better_patch: => 0
* needs_tests: => 0
* needs_docs: => 0


Comment:

I have created a minimal test case:
https://github.com/jdufresne/djtest-23576 I'd be happy to answer any
questions about the test case if anything requires explanation. The
waiting on `stdin` is to simulate a view with a long running process
inside a transaction.

To run the test, run `$ python test.py` from the project root directory.
(As it is a race condition it may take more than one try.)

The test runs in a management command instead of a HTTP view in order to
easily trigger the race using multiple processes, but the principle is the
same. Imagine the command is the view and each process is an HTTP request.
Upon running the test and triggering the error I get the following
traceback:

{{{
Traceback (most recent call last):

File "manage.py", line 10, in <module>
execute_from_command_line(sys.argv)
File "/home/jon/djtest/venv/lib/python2.7/site-
packages/django/core/management/__init__.py", line 385, in
execute_from_command_line
utility.execute()
File "/home/jon/djtest/venv/lib/python2.7/site-
packages/django/core/management/__init__.py", line 377, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/home/jon/djtest/venv/lib/python2.7/site-
packages/django/core/management/base.py", line 288, in run_from_argv
self.execute(*args, **options.__dict__)
File "/home/jon/djtest/venv/lib/python2.7/site-
packages/django/core/management/base.py", line 338, in execute
output = self.handle(*args, **options)
File "/home/jon/djtest/venv/lib/python2.7/site-
packages/django/core/management/base.py", line 533, in handle
return self.handle_noargs(**options)
File "/home/jon/djtest/myproj/myapp/management/commands/mytest.py", line
13, in handle_noargs
C(a=A.objects.get(name='foo'), b=B.objects.get(name='bar'))
File "/home/jon/djtest/venv/lib/python2.7/site-
packages/django/db/models/manager.py", line 92, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/home/jon/djtest/venv/lib/python2.7/site-
packages/django/db/models/query.py", line 409, in bulk_create
self._batched_insert(objs_without_pk, fields, batch_size)
File "/home/jon/djtest/venv/lib/python2.7/site-
packages/django/db/models/query.py", line 938, in _batched_insert
using=self.db)
File "/home/jon/djtest/venv/lib/python2.7/site-
packages/django/db/models/manager.py", line 92, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/home/jon/djtest/venv/lib/python2.7/site-
packages/django/db/models/query.py", line 921, in _insert
return query.get_compiler(using=using).execute_sql(return_id)
File "/home/jon/djtest/venv/lib/python2.7/site-
packages/django/db/models/sql/compiler.py", line 920, in execute_sql
cursor.execute(sql, params)
File "/home/jon/djtest/venv/lib/python2.7/site-
packages/django/db/backends/utils.py", line 81, in execute
return super(CursorDebugWrapper, self).execute(sql, params)
File "/home/jon/djtest/venv/lib/python2.7/site-
packages/django/db/backends/utils.py", line 65, in execute
return self.cursor.execute(sql, params)
File "/home/jon/djtest/venv/lib/python2.7/site-
packages/django/db/utils.py", line 94, in __exit__
six.reraise(dj_exc_type, dj_exc_value, traceback)
File "/home/jon/djtest/venv/lib/python2.7/site-
packages/django/db/backends/utils.py", line 65, in execute
return self.cursor.execute(sql, params)
File "/home/jon/djtest/venv/lib/python2.7/site-
packages/django/db/backends/mysql/base.py", line 128, in execute
return self.cursor.execute(query, args)
File "/home/jon/djtest/venv/lib/python2.7/site-


packages/MySQLdb/cursors.py", line 205, in execute
self.errorhandler(self, exc, value)

File "/home/jon/djtest/venv/lib/python2.7/site-


packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
raise errorclass, errorvalue

django.db.utils.IntegrityError: (1062, "Duplicate entry '1-1' for key
'a_id'")
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/23576#comment:1>

Django

unread,
Oct 1, 2014, 12:45:38 AM10/1/14
to django-...@googlegroups.com
#23576: ORM delete strategy can lead to race conditions inside a transaction
-------------------------------------+-------------------------------------

Reporter: jdufresne | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 1.7
(models, ORM) | Resolution:
Severity: Normal | Triage Stage:
Keywords: | Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by jdufresne):

This looks specific to database backends that do not support
`update_can_self_select`. From the list of database backends shipped with
Django, this applies only to MySQL.

This is the race free query generated for backends supporting
`update_can_self_select`.

{{{
DELETE FROM `myapp_c` WHERE `myapp_c`.`id` IN (SELECT U0.`id` AS `id` FROM
`myapp_c` U0 INNER JOIN `myapp_a` U1 ON ( U0.`a_id` = U1.`id` ) INNER JOIN
`myapp_b` U2 ON ( U0.`b_id` = U2.`id` ) WHERE (U1.`name` = 'foo' AND
U2.`name` = 'bar'))
}}}

The problematic queries used by the MySQL:

{{{
SELECT `myapp_c`.`id` FROM `myapp_c` INNER JOIN `myapp_a` ON (
`myapp_c`.`a_id` = `myapp_a`.`id` ) INNER JOIN `myapp_b` ON (
`myapp_c`.`b_id` = `myapp_b`.`id` ) WHERE (`myapp_a`.`name` = 'foo' AND
`myapp_b`.`name` = 'bar')
DELETE FROM `myapp_c` WHERE `myapp_c`.`id` IN ({ RESULT FROM PREVIOUS
QUERY })
}}}

The fact that this occurs across two queries allows rows to be inserted
that would normally meet the criteria of the `WHERE` clause.

One way to handle this would be to use MySQL's multiple table syntax
http://dev.mysql.com/doc/refman/5.7/en/delete.html which allows joins. The
query would be as simple as taking the `SELECT` query, removing the
`SELECT` clause and replacing it with `DELETE myapp_c FROM myapp_c INNER
JOIN ...`.

I am interested in doing this, however, I'm not sure exactly where to
start in order to build a better query. I see inside
`DeleteQuery.delete_qs()` is where the check for feature
`update_can_self_select` occurs. So perhaps a new feature check and branch
should occur here. However, once the feature is checked, I'm not sure how
to start building the alternative query. Any guidance would be
appreciated.

--
Ticket URL: <https://code.djangoproject.com/ticket/23576#comment:2>

Django

unread,
Oct 1, 2014, 1:27:23 AM10/1/14
to django-...@googlegroups.com
#23576: Fast-path deletion for MySQL
-------------------------------------+-------------------------------------

Reporter: jdufresne | Owner: nobody
Type: Uncategorized | Status: new
Component: Database layer | Version: 1.7
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by akaariai):

* stage: Unreviewed => Accepted


Comment:

The best advice I can give is to start from
django/db/models/sql/subqueries.py and find update_can_self_select in
there. You might be able to use a different compiler for the original
query, one which does DELETE FROM <original query> instead of SELECT ...
FROM <original query>.

Note that even if you fix this, there are still cases where Django has to
fetch the IDs to memory, then delete only those IDs. The reason is that
Django needs to fire pre/post delete signals and cascade the deletion to
dependent models. Especially for pre-delete signal the model must still
exists in the database, so I don't see any other way than pre-fetch ids,
then delete those ids.

At least on PostgreSQL single query delete doesn't result in concurrency
safe behavior. For example, it is possible that the view deletes the
objects, another transaction commits an conflicting object, and then
finally the view commits its objects -> unique constraint violation.

I am marking this as accepted for allowing fast-path deletion for MySQL.

--
Ticket URL: <https://code.djangoproject.com/ticket/23576#comment:3>

Django

unread,
Oct 20, 2014, 9:39:13 AM10/20/14
to django-...@googlegroups.com
#23576: Fast-path deletion for MySQL
-------------------------------------+-------------------------------------
Reporter: jdufresne | Owner: nobody
Type: New feature | Status: new

Component: Database layer | Version: 1.7
(models, ORM) | Resolution:
Severity: Normal | Triage Stage: Accepted
Keywords: | Needs documentation: 0
Has patch: 0 | Patch needs improvement: 0
Needs tests: 0 | UI/UX: 0
Easy pickings: 0 |
-------------------------------------+-------------------------------------
Changes (by timgraham):

* type: Uncategorized => New feature


--
Ticket URL: <https://code.djangoproject.com/ticket/23576#comment:4>

Django

unread,
Oct 17, 2019, 1:57:31 AM10/17/19
to django-...@googlegroups.com
#23576: Fast-path deletion for MySQL
-------------------------------------+-------------------------------------
Reporter: Jon Dufresne | Owner: Simon
| Charette
Type: New feature | Status: assigned

Component: Database layer | Version: 1.7
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted

Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* status: new => assigned
* owner: nobody => Simon Charette


--
Ticket URL: <https://code.djangoproject.com/ticket/23576#comment:5>

Django

unread,
Oct 17, 2019, 2:31:31 AM10/17/19
to django-...@googlegroups.com
#23576: Fast-path deletion for MySQL
-------------------------------------+-------------------------------------
Reporter: Jon Dufresne | Owner: Simon
| Charette
Type: New feature | Status: assigned
Component: Database layer | Version: 1.7
(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* has_patch: 0 => 1


--
Ticket URL: <https://code.djangoproject.com/ticket/23576#comment:6>

Django

unread,
Oct 24, 2019, 6:40:43 AM10/24/19
to django-...@googlegroups.com
#23576: Fast-path deletion for MySQL
-------------------------------------+-------------------------------------
Reporter: Jon Dufresne | Owner: Simon
| Charette
Type: New feature | Status: assigned
Component: Database layer | Version: master

(models, ORM) |
Severity: Normal | Resolution:
Keywords: | Triage Stage: Ready for
| checkin

Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by felixxm):

* version: 1.7 => master
* stage: Accepted => Ready for checkin


--
Ticket URL: <https://code.djangoproject.com/ticket/23576#comment:7>

Django

unread,
Oct 24, 2019, 7:06:56 AM10/24/19
to django-...@googlegroups.com
#23576: Fast-path deletion for MySQL
-------------------------------------+-------------------------------------
Reporter: Jon Dufresne | Owner: Simon
| Charette
Type: New feature | Status: closed

Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed

Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Mariusz Felisiak <felisiak.mariusz@…>):

* status: assigned => closed
* resolution: => fixed


Comment:

In [changeset:"7acef095d73322f45dcceb99afa1a4e50b520479" 7acef09]:
{{{
#!CommitTicketReference repository=""
revision="7acef095d73322f45dcceb99afa1a4e50b520479"
Fixed #23576 -- Implemented multi-alias fast-path deletion in MySQL
backend.

This required moving the entirety of DELETE SQL generation to the
compiler where it should have been in the first place and implementing
a specialized compiler on MySQL/MariaDB.

The MySQL compiler relies on the "DELETE table FROM table JOIN" syntax
for queries spanning over multiple tables.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/23576#comment:8>

Django

unread,
Oct 31, 2019, 3:16:10 AM10/31/19
to django-...@googlegroups.com
#23576: Fast-path deletion for MySQL
-------------------------------------+-------------------------------------
Reporter: Jon Dufresne | Owner: Simon
| Charette
Type: New feature | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak <felisiak.mariusz@…>):

In [changeset:"377c7cb2f7d82dd5823d028ac9502284c14b32d4" 377c7cb2]:
{{{
#!CommitTicketReference repository=""
revision="377c7cb2f7d82dd5823d028ac9502284c14b32d4"
Refs #23576 -- Disabled MySQL multi-alias deletion path on MariaDB
10.3.2+.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/23576#comment:9>

Django

unread,
Aug 31, 2020, 3:23:02 AM8/31/20
to django-...@googlegroups.com
#23576: Fast-path deletion for MySQL
-------------------------------------+-------------------------------------
Reporter: Jon Dufresne | Owner: Simon
| Charette
Type: New feature | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak <felisiak.mariusz@…>):

In [changeset:"f6405c0b8ef7aff513b105c1da68407a881a3671" f6405c0]:
{{{
#!CommitTicketReference repository=""
revision="f6405c0b8ef7aff513b105c1da68407a881a3671"
Fixed #31965 -- Adjusted multi-table fast-deletion on MySQL/MariaDB.

The optimization introduced in 7acef095d73 did not properly handle
deletion involving filters against aggregate annotations.

It initially was surfaced by a MariaDB test failure but misattributed
to an undocumented change in behavior that resulted in the systemic
generation of poorly performing database queries in 5b83bae031.

Thanks Anton Plotkin for the report.

Refs #23576.
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/23576#comment:10>

Django

unread,
Aug 31, 2020, 3:23:32 AM8/31/20
to django-...@googlegroups.com
#23576: Fast-path deletion for MySQL
-------------------------------------+-------------------------------------
Reporter: Jon Dufresne | Owner: Simon
| Charette
Type: New feature | Status: closed
Component: Database layer | Version: master
(models, ORM) |
Severity: Normal | Resolution: fixed
Keywords: | Triage Stage: Ready for
| checkin
Has patch: 1 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0

Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Mariusz Felisiak <felisiak.mariusz@…>):

In [changeset:"2986ec031d311891af8c59c845ce3b5e46d17a76" 2986ec03]:
{{{
#!CommitTicketReference repository=""
revision="2986ec031d311891af8c59c845ce3b5e46d17a76"
[3.1.x] Fixed #31965 -- Adjusted multi-table fast-deletion on
MySQL/MariaDB.

The optimization introduced in 7acef095d73 did not properly handle
deletion involving filters against aggregate annotations.

It initially was surfaced by a MariaDB test failure but misattributed
to an undocumented change in behavior that resulted in the systemic
generation of poorly performing database queries in 5b83bae031.

Thanks Anton Plotkin for the report.

Refs #23576.

Backport of f6405c0b8ef7aff513b105c1da68407a881a3671 from master
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/23576#comment:11>

Reply all
Reply to author
Forward
0 new messages