Lock the instance


Iap

Mar 15, 2010, 5:28:04 AM
to google-a...@googlegroups.com
Hi,

I think I do require "instance locking" in the Datastore to solve an integrity problem in my scenario.

It's a long story, but I will try to keep it brief:
1) Suppose there is a FIFO queue, Q.
2) To consume the queue, many clients (say, a Flash movie in the browser) connect to GAE.
3) For every consumption from the client side, the dispatching CGI has to
   3.1) query the Datastore for all entries,
   3.2) sort the entries to get the first item in the queue,
   3.3) mark the item as consumed; the item then moves to its next phase,
   3.4) update the queue (remove the item from the queue).

The problem happens if:
a) When client-A requests consumption, the dispatching CGI handles the request, but the Datastore occasionally gets stuck in step 3.2.
b) During that stuck period, client-B requests consumption; this time the Datastore runs very well, without any delay.
c) Client-A and client-B reach step 3.3 at the same time, and both get the same item from the queue. That is the problem:
    either client-A or client-B overrides the other's work on the same item in the queue.
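The race in (a)-(c) can be reproduced with a minimal pure-Python stand-in for steps 3.1-3.4 (an assumption for illustration: the real storage is the GAE Datastore, the in-memory list below only models the interleaving):

```python
# In-memory stand-in for the Datastore queue.
queue = [{'id': 1, 'ts': 100, 'consumed': False},
         {'id': 2, 'ts': 200, 'consumed': False}]

def pick_first():
    # 3.1 query all entries, 3.2 sort them and take the first unconsumed item
    pending = sorted((e for e in queue if not e['consumed']),
                     key=lambda e: e['ts'])
    return pending[0] if pending else None

def mark_consumed(item):
    # 3.3 / 3.4: mark the item consumed ("remove" it from the queue)
    item['consumed'] = True

# The interleaving from (a)-(c): both clients pick before either marks.
a = pick_first()   # client-A reads, then stalls before marking
b = pick_first()   # client-B reads in the meantime and sees the same item
mark_consumed(a)
mark_consumed(b)   # client-B silently overrides client-A's work
```

Without an atomic check-and-mark, both requests observe the item as unconsumed and both claim it.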

I am thinking about using memcache to do the locking, but I was told that memcache is not guaranteed.
I also assume that the "locking" should not depend on the Datastore, which is the origin of the uncertainty
(i.e. putting a BooleanProperty "LOCK" on the instance).
The other suggestion is the "transaction". I encountered many exceptions while putting steps 3.2 and 3.3 into a transaction,
such as "nested transaction is not allowed", "Cannot operate on different entity groups", "must be an ancestor...".
Because transactions have so many limitations, it seems unrealistic to use them,
whereas locking/releasing the entry (object) is simpler and more concise.
It would be nice if the item, or the queue, were lockable.
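For reference, the memcache-based lock mentioned above is usually built on the fact that memcache's add() is atomic and refuses to overwrite an existing key; the dict-backed add() below is a hypothetical stand-in with the same semantics (on GAE you would call google.appengine.api.memcache.add with an expiry instead). The caveat from the thread stands: memcache entries can be evicted at any time, so this lock is best-effort, not guaranteed.

```python
# Stand-in for memcache with add()-if-absent semantics.
_cache = {}

def add(key, value):
    """Mimics memcache.add: store only if the key is absent, report success."""
    if key in _cache:
        return False
    _cache[key] = value
    return True

def delete(key):
    _cache.pop(key, None)

def try_lock(item_id):
    # Only the first caller for a given item wins the lock.
    return add('lock:%s' % item_id, 1)

def unlock(item_id):
    delete('lock:%s' % item_id)
```

A request that fails try_lock() should skip the item or retry later rather than touch it.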

Iap

Gopal Patel

Mar 15, 2010, 7:58:00 AM
to Google App Engine
Is there any need for the queue to complete exactly in sequence? How
much speed do you really need? Maybe a simple sharded
counter for the fetch, save and count part is good enough. And can you
describe your PROBLEM instead of the PROBLEM IN YOUR SOLUTION OF THE PROBLEM;
there might be a different way to achieve the same thing in App Engine....
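The "sharded counter" alluded to here can be sketched as follows; it is a hedged illustration only (on GAE each shard would be its own entity, so concurrent increments hit different rows instead of contending on one):

```python
import random

# Writes are spread across N independent shards; a read sums them all.
NUM_SHARDS = 5
shards = [0] * NUM_SHARDS

def increment():
    # Pick a random shard so concurrent writers rarely touch the same one.
    shards[random.randrange(NUM_SHARDS)] += 1

def count():
    return sum(shards)
```

The trade-off is that count() is cheap to scale for writes but requires reading every shard.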

Iap

Mar 15, 2010, 10:28:49 AM
to google-a...@googlegroups.com
I wonder why there would be no need to complete in sequence?
FIFO = first in, first out.
If the resource is limited, people who make the request first should be serviced first,
just like people going to a restaurant, buying tickets, etc.
By the way, the sorting is not the point:
even if items are consumed randomly from the queue,
there is still a chance the overriding happens.

2010/3/15 gops <patel...@gmail.com>

Eli Jones

Mar 15, 2010, 11:08:56 AM
to google-a...@googlegroups.com
You might be better off figuring out if you can design your code so that it does not need locking.

You could make it so that whichever client submits processed work from the queue last gets priority; the other clients' work that was returned earlier is just discarded.

This way, work could simply be submitted with work entities having explicit key_names and just being .put() to the datastore.

Either way, you can easily set up a method so that either first-submitted or most recently-submitted work is given priority.

Granted, I'm not clear on exactly what sort of work is getting processed from your queue. It just seems like you'll expend more resources trying to be exact by not having more than one client grab the same workload, which will then slow down the app's performance and gobble up more resources.
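The "explicit key_names" idea works because a Datastore put() with an existing key_name overwrites the stored entity, so "most recently submitted wins" needs no lock at all. A dict keyed by key_name stands in for the Datastore in this sketch:

```python
# Stand-in for the Datastore, keyed by entity key_name.
store = {}

def put(key_name, work):
    store[key_name] = work  # last writer wins, no locking required

put('workload-1', {'result': 'from client A'})
put('workload-1', {'result': 'from client B'})  # silently replaces A's result
```

Duplicate work is wasted, but nothing is corrupted: exactly one result survives per key_name.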



Iap

Mar 15, 2010, 12:01:08 PM
to google-a...@googlegroups.com


2010/3/15 Eli Jones <eli....@gmail.com>

You might be better off figuring out if you can design your code so that it does not need locking.

If I could, I would. But isn't locking intrinsic to a multi-process environment for solving race conditions?
I don't insist on having locking, but I do need a solution for when an instance co-exists in two processes/requests.
If locking is a luxury in GAE, people should at least be warned. I suppose the "transaction" is for this purpose,
but the limitations of transactions make it hard for them to work as expected even in a simple scenario.
 
You could make it so that whichever client submits processed work from the queue last gets priority; the other clients' work that was returned earlier is just discarded.

This way, work could simply be submitted with work entities having explicit key_names and just being .put() to the datastore.

Either way, you can easily set up a method so that either, first-submitted or most recently-submitted work is given priority.


The ordering might happen to be any kind of weighting. In my scenario, the ordering factor is a timestamp;
in other scenarios, it might be something else. Whenever dealing with a queue,
consuming it in order is inevitable, no matter what the ordering is.
Whether the queue is consumed sequentially or randomly does not make a difference to the problem, I think.
If there is no management mechanism, "the same item might be dispatched to two different requests"; that is the problem.
Locking is one approach to managing the queue, although it might not be the only way.
So, I'd like to:
1) not dispatch the same item to different requests, or
2) apply some management so that they do not clobber each other.
 
Granted, I'm not clear on exactly what sort of work is getting processed from your queue. It just seems like you'll expend more resources trying to be exact by not having more than one client grab the same workload, which will then slow down the app's performance and gobble up more resources.

Agreed, but being overridden is worse.
 

Stephen

Mar 15, 2010, 3:00:17 PM
to Google App Engine

On Mar 15, 9:28 am, Iap <iap...@gmail.com> wrote:
>
> The other suggestion is the "transaction". I encountered many exceptions
> while putting the step 3.2, 3.3 to a transaction.
> Such as "nested transaction is not allowed","Cannot operate on different
> entity groups","must be an ancestor...".
> Because the transaction has so much limitation, it seems not realistic to
> use transaction.


Something like this?

from google.appengine.ext import db

class Queued(db.Model):
    done = db.BooleanProperty(required=True, default=False)
    order = db.DateTimeProperty(required=True, auto_now_add=True)

    work = db.TextProperty()  # whatever...

    @classmethod
    def get_next(cls):
        next = None
        while next is None:
            # .get() returns a single entity or None (fetch(1) would
            # return a list, which is never None even when empty)
            next = cls.all().filter('done =', False) \
                            .order('order').get()
            if next is None:
                return None  # queue is empty
            # If another request claimed it first, this returns None
            # and the loop retries with the next candidate.
            next = cls._mark_done_transactionally(next)
        return next

    @classmethod
    def _mark_done_transactionally(cls, queued):
        def txn(queued):
            # Re-read inside the transaction so the check is atomic.
            queued = db.get(queued.key())
            if queued.done:
                return None  # someone else got here first
            queued.done = True
            queued.put()
            return queued
        return db.run_in_transaction(txn, queued)


some_work_to_do = Queued.get_next()
if some_work_to_do:
    print 'Do once: %s' % some_work_to_do.work
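The claim logic above can be simulated in pure Python to see why two workers can never both get the same item; here the transactional check-and-set is modeled with a threading.Lock (an assumption for illustration only: on GAE the atomicity comes from db.run_in_transaction, not from an in-process lock):

```python
import threading

# In-memory stand-in for the Queued entities.
items = [{'id': i, 'done': False} for i in range(3)]
_txn = threading.Lock()

def get_next():
    while True:
        # "Query" for the first item not yet done.
        pending = [it for it in items if not it['done']]
        if not pending:
            return None
        candidate = pending[0]
        with _txn:  # stands in for the datastore transaction
            if not candidate['done']:
                candidate['done'] = True  # atomic check-and-mark
                return candidate
        # Lost the race for this item; loop and try the next one.

claimed = [get_next() for _ in range(4)]  # fourth call finds the queue empty
```

Each item is handed out exactly once, and callers past the end of the queue get None.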
