Locking / serializing access to one element in database


Joakim Hove

Oct 21, 2015, 6:01:50 PM
to Django users
Hello;

This arises in the context of a Django application, but it might be a more general Python/Postgres/... problem.

[ The Django application exists all right, but the problem I am describing here is so far only in my head; I am seeking advice on how to proceed. ] Assume I have a model with a large text field:


from django.db import models


class TextModel(models.Model):
    text = models.TextField( ... )

    @classmethod
    def update(cls, id, new_text):
        # Fetch the existing row for this ID, or alternatively create a new one.
        try:
            tm = TextModel.objects.get(pk=id)
        except TextModel.DoesNotExist:
            tm = TextModel()

        # Perform time-consuming calculation (strcat just for demonstration) and save again.
        tm.text += new_text
        tm.save()


Now, obviously the whole update() method is one big screaming race condition, but it is not clear to me how to ensure that only one thread/process accesses this DB element at a time. Suggestions on how to solve this race condition, or suggestions of an alternative race-free approach, would be highly appreciated.


Joakim


Simon Charette

Oct 21, 2015, 7:16:55 PM
to Django users
Hi Joakim,

I would suggest you use select_for_update() in a transaction.

It's hard to provide a full example without more details about
the kind of calculation required.

Does it need to be run on new_text even if no rows match
the provided id?

Simon
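
As a rough illustration of what "take the lock first, then read, modify and write" buys you, here is a minimal sketch that uses only the standard-library sqlite3 module rather than the Django ORM. It is a stand-in, not Django code: SQLite has no row-level SELECT ... FOR UPDATE, so BEGIN IMMEDIATE (which takes the write lock for the whole database) plays the role that select_for_update() inside transaction.atomic() would play on PostgreSQL. The table name textmodel and the helpers append_text and run_demo are made up for this example.

```python
import os
import sqlite3
import tempfile
import threading


def append_text(db_path, pk, new_text):
    """Append new_text to row pk, holding the write lock across read and write."""
    # isolation_level=None: we issue BEGIN/COMMIT ourselves.
    # timeout=30: wait up to 30 s for another writer to release the lock.
    conn = sqlite3.connect(db_path, timeout=30, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")  # take the lock BEFORE reading
        (text,) = conn.execute(
            "SELECT text FROM textmodel WHERE id = ?", (pk,)
        ).fetchone()
        conn.execute(
            "UPDATE textmodel SET text = ? WHERE id = ?", (text + new_text, pk)
        )
        conn.execute("COMMIT")
    finally:
        conn.close()  # an uncommitted transaction is rolled back on close


def run_demo(n_threads=8):
    """Let n_threads append one character each; return the final text."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "demo.sqlite3")
        conn = sqlite3.connect(db_path)
        conn.execute("CREATE TABLE textmodel (id INTEGER PRIMARY KEY, text TEXT)")
        conn.execute("INSERT INTO textmodel (id, text) VALUES (1, '')")
        conn.commit()
        conn.close()

        threads = [
            threading.Thread(target=append_text, args=(db_path, 1, "x"))
            for _ in range(n_threads)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        conn = sqlite3.connect(db_path)
        (text,) = conn.execute("SELECT text FROM textmodel WHERE id = 1").fetchone()
        conn.close()
        return text
```

Because every writer acquires the lock before it reads, no append is lost: run_demo() returns a string with one character per thread. Note that this only covers the case where the row already exists.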

Joakim Hove

Oct 22, 2015, 2:59:58 AM
to Django users
Thank you;

> I would suggest you use select_for_update() in a transaction.

That seems to be just what I want!


Carsten Fuchs

Oct 22, 2015, 9:56:36 AM
to django...@googlegroups.com
On 22.10.2015 at 01:16, Simon Charette wrote:
> I would suggest you use select_for_update()
> <https://docs.djangoproject.com/en/1.8/ref/models/querysets/#select-for-update> in a
> transaction.

This only covers the case where the object with the given ID already exists, doesn't it?

That is, if a new object is created in the except-clause, concurrent threads might
create a new object each, whereas it would presumably be expected that new_text is
(twice) added to a single common instance.

As I had almost exactly the same question a while ago (see
https://groups.google.com/forum/#!topic/django-users/SOX5Vjedy_s) but never got a reply,
I (too) am wondering how the new-object case in the except-clause could be handled.

Best regards,
Carsten

Collin Anderson

Oct 24, 2015, 1:47:17 PM
to Django users
Hi Carsten,

Something that might help: depending on how your unique keys are set up, if the thread thinks that the object is new, it could try using .save(force_insert=True). That way it won't overwrite a different object with the same PK.

Thanks,
Collin

Joakim Hove

Oct 24, 2015, 2:30:06 PM
to Django users

> This only covers the case where the object with the given ID already exists, doesn't it?

Yes, it must be so; thank you for pointing that out. In my particular case I can get around it by pre-creating an empty element, but that is not very nice. I will look at Collin's suggestion.

Joakim Hove

Oct 24, 2015, 2:36:15 PM
to Django users
Hi Collin;

thank you for your suggestion:

> [...] if the thread thinks that the object is new, it could try using .save(force_insert=True).

I have tried to read the force_insert documentation without understanding it 100%. Assume threads t1 and t2 are racing to create the first element with this primary key, and that t1 wins the race. Then when t2 issues:
    
    save( force_insert = True )

will an exception be raised?

 

Collin Anderson

Oct 27, 2015, 2:56:14 PM
to Django users
Yes, an exception will be raised. As an example, Django uses force_insert when creating sessions:

Carsten Fuchs

Oct 28, 2015, 3:50:09 PM
to django...@googlegroups.com
Hi Collin, hi all,

On 27.10.2015 at 19:56, Collin Anderson wrote:
> Yes, an exception will be raised.

Thinking further about this, all we need is a method that gives us an exception if we
accidentally create a second object when in fact only one is wanted. Your suggestion
with manually dealing with the PKs and using .save(force_insert=True) is one method to
achieve that, another one that works well for me is using an (application-specific)

unique_together = (year, month)

constraint, which achieves the desired result (guarantee uniqueness where required,
raise an exception otherwise) without having to manually deal with PKs.
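
To make the "raise an exception on the second insert" behaviour concrete, here is a small sketch using the standard-library sqlite3 module as a stand-in for the real database. The UNIQUE (jahr, monat) clause corresponds to what unique_together asks the database to enforce; the table name monthmodel and the helpers make_connection and insert_month are invented for this illustration. In Django the same failure surfaces as django.db.IntegrityError.

```python
import sqlite3


def make_connection():
    """In-memory database with a composite uniqueness constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE monthmodel ("
        " id INTEGER PRIMARY KEY,"
        " jahr INTEGER NOT NULL,"
        " monat INTEGER NOT NULL,"
        " value INTEGER,"
        " UNIQUE (jahr, monat))"
    )
    return conn


def insert_month(conn, jahr, monat):
    """Return True if a row was inserted, False if (jahr, monat) already existed."""
    try:
        conn.execute(
            "INSERT INTO monthmodel (jahr, monat) VALUES (?, ?)", (jahr, monat)
        )
    except sqlite3.IntegrityError:
        return False  # the constraint rejected the duplicate
    return True
```

Calling insert_month(conn, 2015, 10) twice inserts one row and refuses the second, while a different month is accepted as usual.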

Alas, I wonder how to proceed to complete the solution. As I find it simpler to deal
with the above mentioned unique_together rather than with coming up with a PK based
solution, I refer to my original code from [1], which was:


try:
    mm = TestMonthModel.objects.select_for_update().get(jahr=Jahr, monat=Monat)
except TestMonthModel.DoesNotExist:
    mm = TestMonthModel(jahr=Jahr, monat=Monat)

# A *long* computation, eventually setting fields in mm and save:

mm.value = 123
mm.save()


Combining everything from this thread, this could be changed into this code
(pseudo-code, not tested):


while True:
    try:
        mm = TestMonthModel.objects.select_for_update().get(jahr=Jahr, monat=Monat)
        break  # Got what we wanted!
    except TestMonthModel.DoesNotExist:
        try:
            # Create the expected but missing instance.
            # No matter if the following succeeds normally or throws
            # an IntegrityError, thereafter just restart the loop.
            TestMonthModel(jahr=Jahr, monat=Monat).save()
        except IntegrityError:
            pass

# A *long* computation, eventually setting fields in mm and save:

mm.value = 123
mm.save()


Afaics, this solves the problem, but it also feels quite awkward and I wonder if there
is a more elegant solution.

Comments? Does this sound reasonable at all?

Best regards,
Carsten

[1] https://groups.google.com/forum/#!topic/django-users/SOX5Vjedy_s
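
The retry loop from this message can be tried out in isolation. The following is only a sketch with the standard-library sqlite3 module standing in for the ORM; connect, get_or_create_id and the monthmodel table are hypothetical names. Two caveats for a real Django version, stated as assumptions about the surrounding code: select_for_update() has to run inside transaction.atomic(), and on PostgreSQL an IntegrityError caught inside an atomic block leaves that transaction unusable unless the failing save is wrapped in its own nested atomic block (a savepoint). Django's built-in get_or_create() follows a similar try, catch and re-fetch pattern.

```python
import sqlite3


def connect():
    """In-memory database in autocommit mode with a UNIQUE (jahr, monat) constraint."""
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE monthmodel ("
        " id INTEGER PRIMARY KEY, jahr INTEGER, monat INTEGER,"
        " value INTEGER, UNIQUE (jahr, monat))"
    )
    return conn


def get_or_create_id(conn, jahr, monat):
    """Return (row id, created) for (jahr, monat), inserting the row if needed."""
    created = False
    while True:
        row = conn.execute(
            "SELECT id FROM monthmodel WHERE jahr = ? AND monat = ?", (jahr, monat)
        ).fetchone()
        if row is not None:
            return row[0], created
        try:
            conn.execute(
                "INSERT INTO monthmodel (jahr, monat) VALUES (?, ?)", (jahr, monat)
            )
            created = True
        except sqlite3.IntegrityError:
            pass  # someone else inserted it first; the next SELECT finds it
```

Whether the insert succeeds or is rejected, the next pass through the loop finds exactly one row, so every caller ends up working with the same instance.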

Collin Anderson

Nov 5, 2015, 10:36:18 AM
to Django users
Hi Carsten,

If you're just updating one field, this _might_ work for you:

try:
    TestMonthModel.objects.create(jahr=Jahr, monat=Monat)  # create() uses force_insert.
except IntegrityError:
    pass  # It already exists. No Problem.
# ... calculate some things.
TestMonthModel.objects.filter(jahr=Jahr, monat=Monat).update(value=new_value)

Collin

Carsten Fuchs

Nov 5, 2015, 3:44:05 PM
to django...@googlegroups.com
Hi Collin,

On 05.11.2015 at 16:36, Collin Anderson wrote:
> If you're just updating one field, this _might_ work for you:

Why just one?


> try:
>     TestMonthModel.objects.create(jahr=Jahr, monat=Monat)  # create() uses force_insert.
> except IntegrityError:
>     pass  # It already exists. No Problem.
> # ... calculate some things.
> TestMonthModel.objects.filter(jahr=Jahr, monat=Monat).update(value=new_value)

It's a nice idea, and I'll have to think more about it, but I guess that while this
forces two parallel accesses to work with the same single instance, they can still both
enter the calculation step, whose side effects may be hard to foresee in this context.
Both will eventually also update the instance at the end, possibly with different
results… I'll have to think more about this. ;-)

Best regards,
Carsten
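
The concern in this last message, that both workers enter the calculation and then overwrite each other, can be reproduced without any threads. The sketch below, again plain sqlite3 rather than Django, interleaves two workers by hand so that both read the old value before either writes. The second function moves the arithmetic into the UPDATE statement itself, which is the idea behind Django's F() expressions; whether that applies here depends on whether the long computation can be expressed relative to the stored value, which the thread does not say. All names (make_db, read_value, absolute_updates, relative_updates) are made up for the example.

```python
import sqlite3

WHERE = "WHERE jahr = 2015 AND monat = 11"


def make_db():
    """In-memory table holding a single (2015, 11) row with value 0."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE monthmodel ("
        " jahr INTEGER, monat INTEGER, value INTEGER, UNIQUE (jahr, monat))"
    )
    conn.execute("INSERT INTO monthmodel (jahr, monat, value) VALUES (2015, 11, 0)")
    return conn


def read_value(conn):
    return conn.execute("SELECT value FROM monthmodel " + WHERE).fetchone()[0]


def absolute_updates():
    """Both workers read first, then each writes a value computed from its stale read."""
    conn = make_db()
    seen_by_a = read_value(conn)
    seen_by_b = read_value(conn)  # B reads before A has written
    conn.execute("UPDATE monthmodel SET value = ? " + WHERE, (seen_by_a + 1,))
    conn.execute("UPDATE monthmodel SET value = ? " + WHERE, (seen_by_b + 1,))
    return read_value(conn)  # A's increment is overwritten by B


def relative_updates():
    """Each worker lets the database compute the new value from the current one."""
    conn = make_db()
    for _ in range(2):
        conn.execute("UPDATE monthmodel SET value = value + 1 " + WHERE)
    return read_value(conn)  # both increments survive
```

With the hand-made interleaving, absolute_updates() ends at 1 although two increments were attempted, while relative_updates() ends at 2. If the computation cannot be written as a relative update, holding a row lock for its whole duration remains the alternative discussed earlier in the thread.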
