Pull task leasing question

Mahron

Dec 26, 2012, 1:26:35 AM
to Google App Engine
Hi,

I would like to know what happens if two lease_tasks calls are made
simultaneously. Is it possible that both are returned the same tasks, or
is there something to prevent that scenario?

John Patterson

Dec 27, 2012, 12:09:07 AM
to google-a...@googlegroups.com
The recommendation from Google about pull queues is "Tasks should be idempotent, so even if a task lease expires and another client leases the task, performing the same task twice should not cause an error."

With push queues (standard task queues) there is also no guarantee that a task will not be executed more than once, so the same caveat probably applies when you manage the queue yourself.

How likely is this?  The push queue docs say: "App Engine's Task Queue API is designed to only invoke a given task once; however, it is possible in exceptional circumstances that a task may execute multiple times (such as in the unlikely case of major system failure). Thus, your code must ensure that there are no harmful side-effects of repeated execution."

However, I have seen it happen more frequently than "major system failures" would suggest. I have a task chain that uses named tasks (which can only be added once), and roughly 1 in 500,000 tasks gets repeated.

If you truly need to ensure tasks execute only once, you can give each a serial number (in a task chain, increment the value for each new task) and, at the start of your task, do a transactional "check and increment" operation on the datastore.
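A minimal sketch of that "check and increment" idea, for illustration only. It uses Python's built-in sqlite3 as a stand-in for the datastore, and the table name `chain_state` and the functions `make_store`, `start_chain`, `claim_serial`, and `run_task` are hypothetical names, not part of any App Engine API. On App Engine the same check would run inside a datastore transaction.

```python
import sqlite3

# NOTE: sqlite3 is only a stand-in for the datastore. All table and
# function names here are hypothetical, chosen for illustration.


def make_store():
    """Create an in-memory store holding the next expected serial per chain."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE chain_state ("
        "chain_id TEXT PRIMARY KEY, next_serial INTEGER NOT NULL)"
    )
    return conn


def start_chain(conn, chain_id):
    """Register a task chain whose first task carries serial number 0."""
    with conn:
        conn.execute(
            "INSERT INTO chain_state (chain_id, next_serial) VALUES (?, 0)",
            (chain_id,),
        )


def claim_serial(conn, chain_id, serial):
    """Atomically check that `serial` is the expected one and increment it.

    Returns True for the first caller presenting the expected serial and
    False for any repeat (or out-of-order) delivery.
    """
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute(
            "UPDATE chain_state SET next_serial = next_serial + 1 "
            "WHERE chain_id = ? AND next_serial = ?",
            (chain_id, serial),
        )
        return cur.rowcount == 1


def run_task(conn, chain_id, serial, work):
    """Run `work` only if this serial has not been claimed before."""
    if not claim_serial(conn, chain_id, serial):
        return False  # duplicate delivery: skip the work
    work()
    return True
```

Note that this sketch claims the serial before doing the work, so a task that fails after the claim is not retried. Whether that trade-off is acceptable depends on the task.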

Mahron

Dec 27, 2012, 3:42:55 PM
to Google App Engine
Thanks for your insight and stats.

The tasks are either idempotent, or multiple execution is acceptable up
to a certain point.

But if two workers can be returned the same tasks when they call
lease_tasks simultaneously (under normal conditions), then I have a
problem, as multiple execution would happen far too often. According to
the docs, I assume this is not supposed to happen, but I would like to
have confirmation.

Jason Collins

Dec 28, 2012, 10:50:56 AM
to google-a...@googlegroups.com
Agree with John; we see spurious duplicate tasks at about the same rate. We also use datastore transactions as the ultimate semaphore to ensure that we don't execute a task more than once. In our case, we have a model with no attributes that uses the task name as the key name; in a transaction, we check that we haven't already seen the task and, if not, write the new key.
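The gate check described above might look roughly like the following sketch. It is not Fantasm's implementation: Python's built-in sqlite3 stands in for the datastore, and the table `seen_task` and the functions `make_seen_store` and `first_time_seen` are hypothetical names used only for illustration. A primary-key insert plays the role of "write the key if it has not been seen".

```python
import sqlite3

# NOTE: sqlite3 is only a stand-in for the datastore. The table and
# function names are hypothetical.


def make_seen_store():
    """Create an in-memory table with one attribute-less row per task name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE seen_task (task_name TEXT PRIMARY KEY)")
    return conn


def first_time_seen(conn, task_name):
    """Record `task_name` and report whether this is its first appearance.

    The primary-key constraint makes the check and the write a single
    atomic step: a second insert of the same name fails.
    """
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "INSERT INTO seen_task (task_name) VALUES (?)",
                (task_name,),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # already recorded: this is a duplicate delivery
```

A task handler would call `first_time_seen` at its start and return immediately when the result is False.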

BTW, we use our open source Fantasm (code.google.com/p/fantasm) to perform this gate check automatically.

j