How to atomically create and lock an object?


Carsten Fuchs

May 11, 2016, 9:44:12 AM
to Django users
Dear Django group,

please consider this code:


from datetime import date
from django.db import models


class TestModel(models.Model):
    jahr = models.SmallIntegerField()
    monat = models.SmallIntegerField()
    some_value = models.SmallIntegerField()

    class Meta:
        unique_together = ('jahr', 'monat')


def demonstrate_the_problem():
    d = date.today()

    try:
        t = TestModel.objects.get(jahr=d.year, monat=d.month)
        # t exists, no need to create or modify it.
        return t.some_value
    except TestModel.DoesNotExist:
        # t did not yet exist, so we have to create it anew.
        # Note that there is a "unique together" constraint in place
        # that makes sure that the tuple (jahr, monat) only exists once.
        # Thus we create a new instance, then lock it with
        # select_for_update() -- but this is still not atomic!
        TestModel(jahr=d.year, monat=d.month).save()
        # select_for_update() is a queryset method, so it must come
        # before get().
        t = TestModel.objects.select_for_update().get(
            jahr=d.year, monat=d.month)

    # A long computation, eventually setting fields in the new t,
    # then saving it for the next call to this function.
    t.some_value = 123
    t.save()
    return t.some_value


The problem is that another thread may also have created a TestModel with the
same (jahr, monat) in the time span between our "does not exist" and "lock",
triggering a violation of the "unique" constraint.

Thus, the question is how we can make the sequence "does not exist – create anew
– lock" atomic?
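
The interleaving described above can be replayed deterministically without
Django. The following is only a minimal sketch using Python's standard-library
sqlite3 module: the table layout mirrors TestModel, while the function name
simulate_race and the hand-written ordering of the two "workers" are
illustrative assumptions, not part of the original code.

```python
import sqlite3


def simulate_race():
    """Replay the problematic interleaving by hand on one in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE testmodel ("
        " jahr INTEGER NOT NULL,"
        " monat INTEGER NOT NULL,"
        " some_value INTEGER,"
        " UNIQUE (jahr, monat))"
    )

    def exists(jahr, monat):
        row = conn.execute(
            "SELECT 1 FROM testmodel WHERE jahr = ? AND monat = ?",
            (jahr, monat),
        ).fetchone()
        return row is not None

    # Both workers run their "does it exist?" check before either one inserts.
    a_saw_row = exists(2016, 5)
    b_saw_row = exists(2016, 5)

    # Worker A inserts first and succeeds.
    conn.execute("INSERT INTO testmodel (jahr, monat) VALUES (?, ?)", (2016, 5))

    # Worker B still believes the row is missing and inserts as well.
    try:
        conn.execute(
            "INSERT INTO testmodel (jahr, monat) VALUES (?, ?)", (2016, 5)
        )
        b_failed = False
    except sqlite3.IntegrityError:
        b_failed = True

    conn.close()
    return a_saw_row, b_saw_row, b_failed
```

Neither worker sees the row during its check, yet the second insert is
rejected by the unique constraint, which is exactly the window between
"does not exist" and "lock".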

Any feedback would very much be appreciated!

Many thanks and best regards,
Carsten

Simon Charette

May 11, 2016, 10:04:24 AM
to Django users
Hi Carsten,

Did you try using select_for_update() with get_or_create()[1] in an
atomic()[2] context?

from django.db import transaction

@transaction.atomic
def demonstrate_the_problem():
    d = date.today()
    # get_or_create() returns an (object, created) tuple.
    t, created = TestModel.objects.select_for_update().get_or_create(
        jahr=d.year, monat=d.month
    )
    # ... long `some_value` computation
    t.some_value = 123
    t.save(update_fields={'some_value'})
    return t

Note that in this case, if another thread tries to select_for_update(), it is
going to block at the get_or_create() until the first thread's transaction
commits.

If you'd like to prevent other threads from blocking you might want to use the
`nowait` option of select_for_update() and catch the `OperationalError` that
might be raised in order to return something else.
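
As a rough, non-Django illustration of the "do not wait for the lock" idea,
here is a sketch with Python's standard-library sqlite3 module. SQLite has no
SELECT ... FOR UPDATE, so this substitutes a whole-database write lock
(BEGIN IMMEDIATE) with timeout=0 for `nowait`; the function name
try_lock_without_waiting and the return values are made up for the example.
In Django, on a backend that supports row locks, the corresponding call would
be select_for_update(nowait=True).

```python
import os
import sqlite3
import tempfile


def try_lock_without_waiting():
    """Return "busy" if another connection already holds the write lock."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sqlite3")
        # isolation_level=None lets us issue BEGIN/ROLLBACK ourselves;
        # timeout=0 makes a lock conflict fail at once instead of waiting.
        holder = sqlite3.connect(path, timeout=0, isolation_level=None)
        other = sqlite3.connect(path, timeout=0, isolation_level=None)
        try:
            holder.execute("BEGIN IMMEDIATE")  # first worker takes the lock
            try:
                other.execute("BEGIN IMMEDIATE")  # second worker does not wait
            except sqlite3.OperationalError:
                return "busy"  # the caller can return something else here
            other.execute("ROLLBACK")
            return "acquired"
        finally:
            if holder.in_transaction:
                holder.execute("ROLLBACK")
            holder.close()
            other.close()
```

The second connection fails immediately with an OperationalError instead of
blocking until the first transaction ends.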

Cheers,
Simon

[1] https://docs.djangoproject.com/en/1.9/ref/models/querysets/#get-or-create
[2] https://docs.djangoproject.com/en/1.9/topics/db/transactions/#django.db.transaction.atomic

Carsten Fuchs

May 12, 2016, 10:48:30 AM
to django...@googlegroups.com
Hi Simon,

many thanks for your reply!
Please see below for some follow-up questions.

On 11.05.2016 at 16:04, Simon Charette wrote:
> Did you try using select_for_update() with get_or_create()[1] in an
> atomic()[2] context?
>
> @transaction.atomic
> def demonstrate_the_problem():
>     d = date.today()
>     t, created = TestModel.objects.select_for_update().get_or_create(
>         jahr=d.year, monat=d.month
>     )
>     # ... long `some_value` computation
>     t.some_value = 123
>     t.save(update_fields={'some_value'})
>     return t
>
> Note that in this case, if another thread tries to select_for_update(), it is
> going to block at the get_or_create() until the first thread's transaction
> commits.

Why will the other thread block?
(Both threads may enter the "create" case, so the select_for_update() may not
yet be effective for the other thread?)

I looked into get_or_create()'s source code and the _create_object_from_params()
method that it calls. Is this due to the "second"
        return self.get(**lookup), False
near its end?

Also, I understand the purpose of wrapping demonstrate_the_problem() in
atomic(), accounting for possibly unrelated exceptions in "long `some_value`
computation". But why does _create_object_from_params() wrap its single call to
`create()` in atomic(), too? Isn't create() inherently atomic?

Best regards,
Carsten

Simon Charette

May 12, 2016, 11:10:18 AM
to Django users
Hi Carsten,


> Why will the other thread block?
> (Both threads may enter the "create" case, so the select_for_update() may not
> yet be effective for the other thread?)

> I looked into get_or_create()'s source code and the _create_object_from_params()
> method that it calls. Is this due to the "second"
>         return self.get(**lookup), False
> near its end?

Exactly. Only one thread will succeed in creating the object. The other one will
get an `IntegrityError` and try to `.get()` the existing object, which is going
to use `select_for_update(nowait=False)`, a blocking call.


> Also, I understand the purpose of wrapping demonstrate_the_problem() in
> atomic(), accounting for possibly unrelated exceptions in "long `some_value`
> computation". But why does _create_object_from_params() wrap its single call to
> `create()` in atomic(), too? Isn't create() inherently atomic?

The create() method is atomic, but in order to recover from an integrity error
that it could raise on conflicting data, it must be wrapped in a transaction
(if autocommit is on) or, as in your case, use a savepoint, because a
transaction has already been started by the atomic() wrapper around
demonstrate_the_problem(). Otherwise the integrity error leaves the connection
in an unusable state.
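
The shape of that "create inside a savepoint, fall back to a lookup" pattern
can be sketched without Django, again with the standard-library sqlite3
module. This is not Django's implementation: create_or_fetch and demo are
hypothetical names, and the explicit BEGIN stands in for the outer atomic()
wrapper. Note also that SQLite, unlike PostgreSQL, does not abort the whole
transaction on a constraint error, so the rollback to the savepoint here only
shows where the recovery step goes.

```python
import sqlite3

SELECT_ID = "SELECT id FROM testmodel WHERE jahr = ? AND monat = ?"
INSERT_ROW = "INSERT INTO testmodel (jahr, monat) VALUES (?, ?)"


def create_or_fetch(conn, jahr, monat):
    """Try to create the row; on a unique violation, fetch the existing one.

    Returns an (id, created) tuple.
    """
    conn.execute("SAVEPOINT create_row")
    try:
        cursor = conn.execute(INSERT_ROW, (jahr, monat))
    except sqlite3.IntegrityError:
        # Undo only the failed insert; the outer transaction stays usable.
        conn.execute("ROLLBACK TO SAVEPOINT create_row")
        conn.execute("RELEASE SAVEPOINT create_row")
        row = conn.execute(SELECT_ID, (jahr, monat)).fetchone()
        return row[0], False
    conn.execute("RELEASE SAVEPOINT create_row")
    return cursor.lastrowid, True


def demo():
    # isolation_level=None: we control BEGIN/COMMIT explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE testmodel ("
        " id INTEGER PRIMARY KEY,"
        " jahr INTEGER NOT NULL,"
        " monat INTEGER NOT NULL,"
        " UNIQUE (jahr, monat))"
    )
    conn.execute("BEGIN")  # outer transaction, like the atomic() wrapper
    first = create_or_fetch(conn, 2016, 5)   # creates the row
    second = create_or_fetch(conn, 2016, 5)  # takes the IntegrityError branch
    conn.execute("COMMIT")
    conn.close()
    return first, second
```

The second call recovers from the integrity error and returns the existing
row, and the outer transaction can still be committed afterwards.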

Cheers,
Simon

Carsten Fuchs

May 12, 2016, 12:06:17 PM
to django...@googlegroups.com
Hi Simon,

that's awesome! This problem has been bothering me for a long time because I
never quite understood how to get the locking / blocking and the transactions right.

Your reply was a huge help, thank you very much for it!
:-)

Best regards,
Carsten