Understanding concurrent updates

dburns

Oct 27, 2009, 9:39:16 PM
to Google App Engine
I'm trying to make sure I understand issues surrounding multiple
concurrent users. Take the example from
http://code.google.com/appengine/docs/python/datastore/transactions.html#Uses_For_Transactions

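from google.appengine.ext import db

def increment_counter(key, amount):
    obj = db.get(key)          # read the current entity
    obj.counter += amount      # modify the local copy
    obj.put()                  # write it back (not atomic with the read)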

The text makes it clear that a naive implementation like that is
susceptible to a collision between two concurrent users (meaning two
separate instances of the Python script running at the same time).
The second put() clobbers the results of the first one.

I'm unclear on how memcache fits in. If those two concurrent
instances both instead call cached_obj=memcache.get("my_key") and get
a result back from some earlier memcache.add("my_key", my_object), are
they susceptible to the same sort of collision if each instance does
something like cached_obj.counter += amount?

I kind of thought memcache worked seamlessly across instances, but I
don't see how in this case.

Thanks for any clarification you can provide.

Martin Trummer

Oct 28, 2009, 8:35:48 AM
to Google App Engine
you cannot use memcache to reliably increment the counter - it's just
a cache, so there's no such thing as a transaction (AFAIK)
memcache only makes sure that a single put or a single get is atomic

the problem arises in this case:
2 users call the same servlet, but these 2 calls are not processed
on the same machine; they run on 2 different machines.
timeline:
* request A calls cache.getKey() and the value is e.g. 10
* request B calls cache.getKey() and the value is still e.g. 10
* request A the next instruction is to increment the counter to 11
(this now is the value of the local variable on machine A)
* request B the next instruction is to increment the counter to 11
(this now is the value of the local variable on machine B)
* after that both requests will call cache.put() with the value of 11
(the sequence does not matter)
and now you've lost one count
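
The timeline above can be replayed step by step in plain Python. This is only a sketch: the dict and the read/write helpers are hypothetical stand-ins for the shared cache, not the App Engine memcache API, and the interleaving is forced by hand rather than produced by real concurrent requests.

```python
# Toy stand-in for a shared cache; not the App Engine memcache API.
cache = {"counter": 10}

def read(key):
    return cache[key]

def write(key, value):
    cache[key] = value

# Replay the interleaving from the timeline, one step at a time.
local_a = read("counter")    # request A reads 10
local_b = read("counter")    # request B reads 10 as well
local_a += 1                 # A's local copy is now 11
local_b += 1                 # B's local copy is now 11
write("counter", local_a)    # A stores 11
write("counter", local_b)    # B stores 11, overwriting A's write

# Two increments were requested, so the correct result would be 12.
lost_updates = 12 - cache["counter"]
print(cache["counter"], lost_updates)
```

Swapping the order of the two write() calls gives the same final value, which is why the sequence of the puts does not matter.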

in fact it does not really matter if that happens on 2 machines or 2
processes or threads - the problem is always the same

if this is just a hit counter, then it may be ok to use this, because
it does not really matter if you sometimes lose a count.

on the other hand, if you use the datastore and make these changes in
a transaction, the datastore will make sure that all operations in
this transaction are executed atomically:
so: it does not matter, if A or B are processed first:
each transaction will get the value, increment it and save it
the datastore makes sure, that no other transaction can change the
value during this operation
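
One way such a guarantee can be provided is a version check at commit time plus a retry. The sketch below replays the same A/B interleaving against a toy store that works this way. ToyStore, ConflictError and increment are hypothetical names made up for illustration, and the version-check scheme is an assumption, not a description of how the datastore is implemented internally.

```python
class ConflictError(Exception):
    """Raised when a commit is based on a stale read."""

class ToyStore:
    # Hypothetical in-memory store, used only to illustrate the idea.
    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        return self.value, self.version

    def commit(self, new_value, read_version):
        # Reject the write if someone else committed since our read.
        if read_version != self.version:
            raise ConflictError()
        self.value = new_value
        self.version += 1

def increment(store, amount, retries=3):
    # Read, modify and commit; start over from a fresh read on conflict.
    for _ in range(retries):
        value, version = store.read()
        try:
            store.commit(value + amount, version)
            return
        except ConflictError:
            continue
    raise ConflictError()

store = ToyStore(10)
a_value, a_version = store.read()      # request A reads 10
b_value, b_version = store.read()      # request B reads 10
store.commit(a_value + 1, a_version)   # A commits 11
try:
    store.commit(b_value + 1, b_version)   # B's read is stale
    b_conflicted = False
except ConflictError:
    b_conflicted = True
    increment(store, 1)                # B retries from a fresh read
print(store.value)
```

Both increments survive here because B's stale write is refused instead of silently overwriting A's.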

hope this makes it clear (and I also hope it's correct)

On Oct 28, 2:39 am, dburns <drrnb...@gmail.com> wrote:
> I'm trying to make sure I understand issues surrounding multiple
> concurrent users.  Take the example from http://code.google.com/appengine/docs/python/datastore/transactions.h...

Stephen

Oct 29, 2009, 9:10:02 AM
to Google App Engine


On Oct 28, 1:39 am, dburns <drrnb...@gmail.com> wrote:
>
> If those two concurrent instances both instead call:
>
> cached_obj=memcache.get("my_key")
>
> and get a result back from some earlier:
>
> memcache.add("my_key", my_object)
>
> are they susceptible to the same sort of collision if each instance does
> something like:
>
> cached_obj.counter += amount?


Yes.

However, there is an atomic memcache.incr('my_key'). So in the
specific case of incrementing an integer, as in your example, memcache
is safe when you use the builtin 'incr' and 'decr' functions. But in
the general case - for example if you wanted to append a string to a
memcached value - then no, memcache has no transactional capability.
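
The difference an atomic increment makes can be sketched locally. FakeMemcache below is a hypothetical in-process stand-in whose method names mirror memcache; a lock plays the role of the cache server doing the read, add and store as one step. On App Engine the real call would be memcache.incr('my_key'), which is not exercised here.

```python
import threading

class FakeMemcache:
    # Hypothetical local stand-in; only the method names mirror memcache.
    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def get(self, key):
        with self._lock:
            return self._data.get(key)

    def incr(self, key, delta=1):
        # Read, add and store happen under one lock, so no update is lost.
        with self._lock:
            if key not in self._data:
                return None
            self._data[key] += delta
            return self._data[key]

cache = FakeMemcache()
cache.set("my_key", 0)

def worker():
    # Each simulated request increments the shared counter 1000 times.
    for _ in range(1000):
        cache.incr("my_key")

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = cache.get("my_key")
print(total)
```

Replacing cache.incr("my_key") in worker with a separate get followed by a set would reintroduce the lost-update window, since another thread could run between the two calls.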
