[Django] #33516: QuerySet update() under postgres with select_for_update(skip_locked=True) results in neither FOR UPDATE nor SKIP LOCKED

19 views
Skip to first unread message

Django

unread,
Feb 15, 2022, 3:09:40 PM2/15/22
to django-...@googlegroups.com
#33516: QuerySet update() under postgres with select_for_update(skip_locked=True)
results in neither FOR UPDATE nor SKIP LOCKED
-------------------------------------+-------------------------------------
Reporter: Grant | Owner: nobody
Gainey |
Type: Bug | Status: new
Component: Database | Version: 3.2
layer (models, ORM) |
Severity: Normal | Keywords:
Triage Stage: | Has patch: 0
Unreviewed |
Needs documentation: 0 | Needs tests: 0
Patch needs improvement: 0 | Easy pickings: 0
UI/UX: 0 |
-------------------------------------+-------------------------------------
=== Versions:
{{{
$ cat /etc/fedora-release
Fedora release 34 (Thirty Four)

$ pip freeze | grep -i django
Django==3.2.12
django-crispy-forms==1.13.0
django-currentuser==0.5.3
django-extensions==3.1.5
django-filter==21.1
django-guardian==2.4.0
django-guid==3.2.1
django-import-export==2.7.1
django-lifecycle==0.9.3
django-readonly-field==1.1.1
django-storages==1.12.3
djangorestframework==3.13.1
djangorestframework-queryfields==1.0.0

$ psql --version
psql (PostgreSQL) 13.4
}}}

=== Problem Statement
pulpcore is a Django app that does A Lot Of Concurrent Stuff. One of those
Things includes "update the last-touched timestamp of large batches of
entities at the end of a workflow". We use
`queryset.update(timestamp_of_interest=now())` for this.

Batch updates don't guarantee row-update-order in Postgresql. In heavy-
concurrency contexts, when updating overlapping content in multiple
threads, this can result in deadlocks as postgres acquires row-locks in
arbitrary orders.

To get around this, in raw postgres, one uses `UPDATE...WHERE id IN
(SELECT id FROM table WHERE expression ORDER BY id FOR UPDATE)`. The
SELECT-FOR-UPDATE-ORDER-BY enforces row-lock-ordering prior to the UPDATE,
and everyone is happily synchronized and all the dining philosophers get
to eat.

If (as in our specific use-case) you don't actually care **who** updates
the row, just that it is being updated "now()-ish", you can regain some
concurrency by using Postgres' `SKIP LOCKED` functionality.

So - we attempted to use the following to accomplish that in our update
workflow:
{{{
TestModel.objects.filter(pk__in=content_q).order_by("pk").select_for_update(
skip_locked=True
).update(timestamp_of_interest=now())
}}}

By the time we get to Postgres, this **does not emit FOR UPDATE or SKIP
LOCKED**, and therefore doesn't solve our deadlock problem. Cue sad-
trombone.

Specifically, from the driver-example below, this code:
{{{
def update_timestamp(index):
content_q = (
TestModel.objects.filter()
.order_by("pk")
.values_list("pk", flat=True)
)
TestModel.objects.filter(pk__in=content_q).order_by("pk").select_for_update(
skip_locked=True
).update(timestamp_of_interest=now())
}}}

generates this SQL:
{{{
2022-02-15 19:12:47.276 UTC [337361] LOG: statement: UPDATE
"core_testmodel" SET "timestamp_of_interest" =
'2022-02-15T19:12:47.187475+00:00'::timestamptz WHERE
"core_testmodel"."pulp_id" IN (SELECT U0."pulp_id" FROM "core_testmodel"
U0)
}}}

Running the driver-example will generate deadlocks; the more TestModel
entries exist, the more often you hit a deadlock. With 5K entries, test
runs hit deadlocks ~5-10 times. With 20K entries, it was more like ~60-70%
of the 100 runs.

=== Workaround
Right now, we implemented a workaround in raw SQL. You can see it here:

https://github.com/pulp/pulpcore/pull/2159/files#diff-
81f6a78175bb93934b6beff952646d3ca1ef3731f1ff14492d4ec77bfc3fdf82R132-R141

We would **much much rather** not have to use raw-sql.

=== Full Example:

* Here's a really simple model, stored as `pulpcore/apps/models/test.py`
{{{
import uuid
from django.db import models
from django_lifecycle import LifecycleModel


class TestModel(LifecycleModel):
pulp_id = models.UUIDField(primary_key=True, default=uuid.uuid4,
editable=False)
timestamp_of_interest = models.DateTimeField(auto_now=True)
}}}

* Here's a migration that creates that model:
{{{
# Generated by Django 3.2.12 on 2022-02-15 18:53

from django.db import migrations, models
import django_lifecycle.mixins
import uuid


class Migration(migrations.Migration):

dependencies = [
]

operations = [
migrations.CreateModel(
name='TestModel',
fields=[
('pulp_id', models.UUIDField(default=uuid.uuid4,
editable=False, primary_key=True, serialize=False)),
('timestamp_of_interest',
models.DateTimeField(auto_now=True)),
],
options={
'abstract': False,
},
bases=(django_lifecycle.mixins.LifecycleModelMixin,
models.Model),
),
]
}}}

* Here's some python that adds 5000 rows to the resulting table (there is
probably a better/faster way to do this, but this is obvious):
{{{
from pulpcore.app.models import TestModel

for tm in range(5000):
TestModel.objects.create()
}}}

* Turning on statement-loggin in Postgres and restarting:
{{{
$ sudo vim /var/lib/pgsql/data/postgresql.conf
...
log_statement = 'all' # none, ddl, mod, all
$ sudo systemctl restart postgresql
}}}

* And here, at last, is driver-code that will generate deadlocks and let
us inspect the emitted SQL:
{{{
from threading import Thread
from pulpcore.app.models import TestModel
from django.utils.timezone import now


durations = []

def update_timestamp(index):
content_q = (
TestModel.objects.filter()
.order_by("pk")
.values_list("pk", flat=True)
)
TestModel.objects.filter(pk__in=content_q).order_by("pk").select_for_update(
skip_locked=True
).update(timestamp_of_interest=now())
print(">>>done")

for r in range(10):
threads = []
for i in range(10):
threads.append(Thread(target=update_timestamp, args=(i,)))
for t in threads:
t.start()
for t in threads:
t.join()
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/33516>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

Django

unread,
Feb 16, 2022, 12:33:46 AM2/16/22
to django-...@googlegroups.com
#33516: QuerySet update() under postgres with select_for_update(skip_locked=True)
results in neither FOR UPDATE nor SKIP LOCKED
-------------------------------------+-------------------------------------
Reporter: Grant Gainey | Owner: nobody
Type: Bug | Status: closed
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Changes (by Simon Charette):

* status: new => closed
* resolution: => invalid


Comment:

In the latest version of Django
[https://github.com/django/django/blob/7119f40c9881666b6f9b5cf7df09ee1d21cc8344/django/db/models/sql/query.py#L2161-L2171
the ordering should not be cleared] when `select_for_update` is used on
the subquery provided to a `__in` lookup.


{{{#!python


def update_timestamp(index):
content_q = (

TestModel.objects.order_by("pk")
.select_for_update(skip_locked=True)


.values_list("pk", flat=True)
)

TestModel.objects.filter(pk__in=content_q).update(timestamp_of_interest=now())
}}}

There was some recent changes in this area in 4.0 and I haven't confirmed
if this is an issue on 3.2 but in all cases `order_by.select_for_update`
has to be called on the subquery and not the outer one for this to work as
the resulting SQL hints at and that's likely why your solution isn't
working here.

--
Ticket URL: <https://code.djangoproject.com/ticket/33516#comment:1>

Django

unread,
Feb 16, 2022, 8:55:31 AM2/16/22
to django-...@googlegroups.com
#33516: QuerySet update() under postgres with select_for_update(skip_locked=True)
results in neither FOR UPDATE nor SKIP LOCKED
-------------------------------------+-------------------------------------
Reporter: Grant Gainey | Owner: nobody
Type: Bug | Status: closed
Component: Database layer | Version: 3.2
(models, ORM) |
Severity: Normal | Resolution: invalid
Keywords: | Triage Stage:
| Unreviewed
Has patch: 0 | Needs documentation: 0

Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------

Comment (by Grant Gainey):

Ah-ha! Perfect, that explains everything!

Two comments/suggestions:
* Could it be possible for select_for_update() to throw an exception when
it's not going to be emitted? Would have helped me target the problem way
sooner if the method yelled at me for using it incorrectly
* Similarly - when called 'correctly', select_for_update() throws an
exception if you're not in a transaction. When called incorrectly, it
doesn't. Would be great if the behavior was consistent.

For posterity's sake, below is my test-script using the suggested changes,
that works on a 20K table with no deadlocks. Thanks for the pointer!

{{{
import statistics
import time
from threading import Thread
from pulpcore.app.models import Content
from django.db import transaction
from django.utils.timezone import now

durations = []

def update_timestamp(index):

print(">>>in update_timedstamp index {}".format(index))
start = time.time()
with transaction.atomic():
content_q = (
Content.objects.filter()
.order_by("pk")
.select_for_update(skip_locked=True)


.values_list("pk", flat=True)
)

Content.objects.filter(pk__in=content_q).update(timestamp_of_interest=now())
end = time.time()
durations.append(end-start)
print(">>>done")

for r in range(10):
threads = []
for i in range(10):
threads.append(Thread(target=update_timestamp, args=(i,)))
for t in threads:
t.start()
for t in threads:
t.join()

print("Avg time : {}".format(sum(durations) / len(durations)))
print("Median time : {}".format(statistics.median(durations)))
print("StdDev : {}".format(statistics.stdev(durations)))
}}}

And the emitted SQL:
{{{
2022-02-16 13:46:02.517 UTC [368804] LOG: statement: UPDATE
"core_content" SET "timestamp_of_interest" =
'2022-02-16T13:46:02.516956+00:00'::timestamptz WHERE
"core_content"."pulp_id" IN (SELECT U0."pulp_id" FROM "core_content" U0
ORDER BY U0."pulp_id" ASC FOR UPDATE SKIP LOCKED)
}}}

--
Ticket URL: <https://code.djangoproject.com/ticket/33516#comment:2>

Reply all
Reply to author
Forward
0 new messages