a transaction committed _and_ errored?


Brian Olson

Sep 21, 2011, 2:31:48 PM
to google-a...@googlegroups.com
I'm trying to figure out and diagnose an odd problem, and my best guess so far is that a datastore transaction successfully stored data and also returned an error message.

(I happen to be using Python, but this might be a general datastore issue)

The code goes about like this:

import logging
import time

from google.appengine.ext import db

def txn(key):
    x = db.get(key)
    x.foo += 1
    x.put()
    logging.info('go!')
    # transactionally enqueue something to a task-queue
    return

def foo(key):
    while True:
        try:
            db.run_in_transaction(txn, key)
        except Exception:
            logging.exception('ouch')
            time.sleep(10.0)  # wait 10 seconds
            continue  # retry inline
        else:
            return


What I'm seeing is that there was one logged exception:
"The datastore operation timed out, or the data was temporarily unavailable."
The 'go!' line is logged twice, once for each pass through the transaction; x.foo is incremented twice, and the task-queue task runs once.


What really bugs me is that x.foo went up by 2 where I wanted to add 1.
What really bugs me after that is that the datastore part of the transaction seems to have run twice while the task-queue part seems to have run once.

Another thing that bugs me, but I'm kinda coming to terms with it, is that a transaction can complete successfully and the API still returns an error. I suppose this was always possible: all the actions happen on the datastore server machine, but then the connection to the App Engine instance dies and the instance is never notified that the transaction happened. The data may be safe and consistent, but it makes the logical flow of my code around the operation messier.

Jeff Schnitzer

Sep 21, 2011, 4:10:25 PM
to google-a...@googlegroups.com
My sense of this is that unless your transaction is idempotent, the
only kind of exception that is safe to catch & retry in a transaction
is the ConcurrentModificationException (or whatever the equivalent is
in Python). Any other exception represents some sort of "real" error
and should either be surfaced to the user or trigger some deep
evaluation of state.

I just surface it to the user and let them figure it out.

Jeff

> --
> You received this message because you are subscribed to the Google Groups
> "Google App Engine" group.
> To view this discussion on the web visit
> https://groups.google.com/d/msg/google-appengine/-/Ay0pmrE4720J.
> To post to this group, send email to google-a...@googlegroups.com.
> To unsubscribe from this group, send email to
> google-appengi...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/google-appengine?hl=en.
>

Jeff Schnitzer

Sep 21, 2011, 4:11:59 PM
to google-a...@googlegroups.com
I should also mention that you have this exact same problem with any
transactional datastore, including Oracle. The database guarantees
integrity of the commit, it doesn't guarantee your code will be
notified of it.
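
One way to cope with a commit whose acknowledgement is lost is to make a retry detectable. The following is a minimal sketch, not App Engine code: an in-memory dict stands in for the datastore, and `FlakyStore`, `LostAckError`, `increment_once`, and `safe_increment` are hypothetical names invented for illustration. Each logical increment carries a unique operation id that is stored with the entity in the same commit, so a retry after an ambiguous error can see that the work already happened.

```python
import uuid


class LostAckError(Exception):
    """Hypothetical error: the commit may or may not have been applied."""


class FlakyStore(object):
    """In-memory stand-in for a datastore that loses the first commit ack."""

    def __init__(self):
        self.entities = {}
        self.fail_next_ack = True

    def commit(self, key, entity):
        # The write is applied first; only the acknowledgement gets lost.
        self.entities[key] = dict(entity, applied_ops=set(entity['applied_ops']))
        if self.fail_next_ack:
            self.fail_next_ack = False
            raise LostAckError('commit applied, ack lost')


def increment_once(store, key, op_id):
    """Add 1 to foo unless op_id was already recorded on the entity."""
    entity = store.entities.get(key, {'foo': 0, 'applied_ops': set()})
    if op_id in entity['applied_ops']:
        return  # an earlier attempt already committed this operation
    updated = {'foo': entity['foo'] + 1,
               'applied_ops': entity['applied_ops'] | {op_id}}
    store.commit(key, updated)


def safe_increment(store, key, max_attempts=3):
    op_id = uuid.uuid4().hex  # one id per logical operation, reused on retry
    for _ in range(max_attempts):
        try:
            increment_once(store, key, op_id)
            return
        except LostAckError:
            continue  # retrying is safe: a repeated op id becomes a no-op
    raise RuntimeError('gave up after %d attempts' % max_attempts)
```

In this sketch the set of applied operation ids grows without bound; a real application would presumably need to expire old ids or store the marker some other way.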

Jeff

vlad

Sep 21, 2011, 6:08:34 PM
to google-a...@googlegroups.com
"that a transaction can complete successfully and the API still returns an error." - yes, it is a bitch. I suffered from this for a long time. Finally I understood that all tasks, and especially transactional tasks, MUST be completely idempotent. The GAE docs do not stress that enough, but it is mandatory. Otherwise you fall into the trap you just described.
How to achieve idempotency is app-specific. For example, I use the TaskId extensively as an entity key to create/update. That way I am guaranteed that a double execution will update the same entity.
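
The task-id-as-key idea can be sketched roughly as follows. This is an illustration under assumptions, not the poster's actual code: `DictStore` is an in-memory stand-in for the datastore, and `handle_task` and the `result-` key prefix are hypothetical. Because the entity key is derived from the task id, running the same task twice overwrites one entity instead of creating two.

```python
class DictStore(object):
    """In-memory stand-in for a datastore keyed by entity key name."""

    def __init__(self):
        self.rows = {}

    def put(self, key_name, value):
        self.rows[key_name] = value  # same key overwrites, never duplicates


def handle_task(store, task_id, payload):
    """Hypothetical task handler; may be invoked more than once per task."""
    key_name = 'result-%s' % task_id  # key derived from the task id
    store.put(key_name, {'payload': payload, 'status': 'done'})
```

Calling `handle_task(store, 'task-42', 'hello')` twice leaves a single row in `store.rows`. This only covers create/overwrite style writes; an increment would still need some record of which tasks have already been applied.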

Jeff Schnitzer

Sep 21, 2011, 8:06:45 PM
to google-a...@googlegroups.com
This isn't quite right. Tasks need to be idempotent because execution
is "at least once". Transactions only need to be idempotent if you
write code that retries them when you shouldn't (i.e., on unknown errors).

Retrying transactions on ConcurrentModificationException (or any other
known rollback scenario) is appropriate. Retrying other exceptions is
risky business.
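
A rough sketch of that policy, with the caveat that `ConflictError` and `run_with_retry` are hypothetical names standing in for whatever known-rollback exception and transaction runner a given datastore library provides: only the known-rollback exception is retried, and anything else propagates to the caller untouched.

```python
import time


class ConflictError(Exception):
    """Hypothetical: the transaction was rolled back due to contention."""


def run_with_retry(txn, max_attempts=3, delay=0.0):
    """Retry txn only on ConflictError; let every other exception propagate."""
    for attempt in range(max_attempts):
        try:
            return txn()
        except ConflictError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the conflict
            time.sleep(delay)
```

A timeout or other unknown error is deliberately not caught here, since the commit may already have been applied and a blind retry could apply it twice.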

Jeff


Vlad Troyanker

Sep 21, 2011, 8:49:30 PM
to google-a...@googlegroups.com
Jeff,

In my case I was not catching any exceptions. Exceptions are not a problem because the transaction is not committed in those cases. The problem is that tasks sometimes run twice. Since I ran transactional code in those tasks, that code must be idempotent... The task scheduler "should not" re-run tasks which have already executed, but it does do that on occasion.

Jeff Schnitzer

Sep 21, 2011, 10:31:04 PM
to google-a...@googlegroups.com
Sorry - I read your mail as "tasks and transactions" but you actually
wrote "tasks and transactional tasks". My fault.

Jeff
