Is it possible for a transactional db.get() to return a stale result if a recent transaction raised a “special” exception?


Albert

Jun 10, 2013, 6:06:07 AM
to google-a...@googlegroups.com

This is in the appengine transaction docs...

Note: If your application receives an exception when committing a transaction, it does not always mean that the transaction failed. You can receive Timeout, TransactionFailedError, or InternalError exceptions in cases where transactions have been committed and eventually will be applied successfully...

Consider the following scenario

  1. I update entity A within a transaction.
  2. The transaction results in the special "exception" described above, where the transaction has been committed and will eventually be applied.
  3. I run db.get(entity_a_key_goes_here) within a transaction right after, or almost at the same time as step 2.

My Question:

Is it ever possible for the db.get() operation at step 3 above to return a stale value (or not the updated value set on step 1)? Are transactional db.get() operations guaranteed to return the freshest result even if the "weird" transaction exception occurs right before it?


Thanks!

I also asked this on Stack Overflow, but I didn't get an answer.

Vinny P

Jun 11, 2013, 12:51:46 PM
to google-a...@googlegroups.com
Hello Albert,


On Monday, June 10, 2013 5:06:07 AM UTC-5, Albert wrote:

Is it ever possible for the db.get() operation at step 3 above to return a stale value (or not the updated value set on step 1)? Are transactional db.get() operations guaranteed to return the freshest result even if the "weird" transaction exception occurs right before it?


Short answer: Yes.

Long answer: It's complicated and depends on many factors. 

For now, let's ignore the transaction and exception details. Suppose you make a simple datastore put, then immediately query the datastore for that entity. There's a good chance that the entity that you just put in won't exist, because it takes time for the datastore to commit and apply the entity (fully write the entity, including all needed indexes, etc). The time to fully write an entity differs depending on how big the entity is, how many indexes are written, etc, but I usually ballpark it at around 100-200 ms. As a side note, this is why sharding exists: because on a high traffic app, a single entity simply cannot be written to fast enough to handle all the incoming writes and not lose data.

Within a transaction context, the same principles apply: depending on how soon your db.get executes after the transaction Exception, you may get stale data. The key here is the Exception: it's there to warn you that the apply phase has been delayed for some internal reason. The apply may have already occurred, it may be delayed for an unknown amount of time, or it may not be valid anymore. So just to repeat: an Exception from a transaction may cause stale data.

My suggestion is to use the Exception to rerun the transaction; for instance if an exception comes up, you could queue up a task to redo the transaction.
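
A minimal sketch of the retry idea, assuming a hypothetical `AmbiguousCommitError` that stands in for the Timeout / TransactionFailedError / InternalError family. `run_with_retry` and `flaky_txn` are illustrative names, not App Engine APIs, and the loop runs in-process rather than through a task queue. Blind retrying like this is only safe when the transaction body is idempotent, because the attempt that raised may in fact have been committed.

```python
class AmbiguousCommitError(Exception):
    """Hypothetical stand-in: the commit may or may not have gone through."""


def run_with_retry(txn_fn, max_attempts=3):
    """Call txn_fn until it returns without raising, up to max_attempts times.

    Only safe if txn_fn is idempotent: an attempt that raised may still
    have been committed.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return txn_fn(attempt)
        except AmbiguousCommitError as err:
            last_error = err  # outcome unknown; try again
    raise last_error


# Toy transaction body: raises on the first two attempts, then succeeds.
calls = []


def flaky_txn(attempt):
    calls.append(attempt)
    if attempt < 3:
        raise AmbiguousCommitError("commit outcome unknown")
    return "done"


result = run_with_retry(flaky_txn)
```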


On Monday, June 10, 2013 5:06:07 AM UTC-5, Albert wrote:

I also asked this in stackoverflow, but I didn't get an answer.

I don't know about others, but I prefer to lurk within this mailing list. It's more convenient for me to answer questions via email than through SO.


-----------------
-Vinny P
Technology & Media Advisor
Chicago, IL

My Go side project: http://invalidmail.com/

Alex Burgel

Jun 11, 2013, 1:57:48 PM
to google-a...@googlegroups.com
On Tuesday, June 11, 2013 12:51:46 PM UTC-4, Vinny P wrote:
For now, let's ignore the transaction and exception details. Suppose you make a simple datastore put, then immediately query the datastore for that entity. There's a good chance that the entity that you just put in won't exist, because it takes time for the datastore to commit and apply the entity (fully write the entity, including all needed indexes, etc). The time to fully write an entity differs depending on how big the entity is, how many indexes are written, etc, but I usually ballpark it at around 100-200 ms. As a side note, this is why sharding exists: because on a high traffic app, a single entity simply cannot be written to fast enough to handle all the incoming writes and not lose data.

Within a transaction context, the same principles apply: depending on how soon your db.get executes after the transaction Exception, you may get stale data. The key here is the Exception: it's there to warn you that the apply phase has been delayed for some internal reason. The apply may have already occurred, it may be delayed for an unknown amount of time, or it may not be valid anymore. So just to repeat: an Exception from a transaction may cause stale data.

I don't think this is correct. According to this article[1], if the transaction has been committed but not applied, then any subsequent reads, writes, or new transactions to that entity group will cause any unapplied transactions to be applied. So you should not see stale data in this case.

Vinny P

Jun 11, 2013, 2:04:17 PM
to google-a...@googlegroups.com
On Tuesday, June 11, 2013 12:57:48 PM UTC-5, Alex Burgel wrote:
I don't think this is correct. According to this article[1], if the transaction has been committed but not applied, then any subsequent reads, writes, or new transactions to that entity group will cause any unapplied transactions to be applied. So you should not see stale data in this case.


Yes, that is correct in normal situations.

If a transaction exception occurs though (as you asked in your original post) there is no guarantee that the transaction occurred. In that case, you may receive stale data because the transaction is delayed/never occurred.

aanaa...@gmail.com

Jun 11, 2013, 2:04:36 PM
to google-a...@googlegroups.com
Repeating what I wrote on SO:

From what I have read in the 'ndb' documentation, I understand that if you run both your get() and put() in transactions (@ndb.transactional), you will not get stale data. 'ndb' will either serve the updated data or fail both.

Transactions either fail or succeed. Also, like other DBMSs, ndb maintains a 'journal'.



--
You received this message because you are subscribed to the Google Groups "Google App Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to google-appengi...@googlegroups.com.
To post to this group, send email to google-a...@googlegroups.com.
Visit this group at http://groups.google.com/group/google-appengine?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Vinny P

Jun 11, 2013, 2:15:44 PM
to google-a...@googlegroups.com
On Tuesday, June 11, 2013 1:04:36 PM UTC-5, Ananda-GAE wrote:
Repeating what I wrote on SO:

From what I have read in the 'ndb' documentation, I understand that if you run both your get() and put() in transactions (@ndb.transactional), you will not get stale data. 'ndb' will either serve the updated data or fail both.



You're adding a condition that is not necessarily in the original poster's meaning. OP said the get is "within a transaction" not necessarily "within the same transaction as the put". 

We might be getting our wires crossed here. @OP, are you doing the get within the same transaction as the put?

Alex Burgel

Jun 11, 2013, 2:21:13 PM
to google-a...@googlegroups.com
On Tuesday, June 11, 2013 2:04:17 PM UTC-4, Vinny P wrote:
If a transaction exception occurs though (as you asked in your original post) there is no guarantee that the transaction occurred. In that case, you may receive stale data because the transaction is delayed/never occurred.

There is no guarantee that the transaction occurred, but if it did occur then you will see the latest data. That is because any read/write/new transaction to that entity group forces any unapplied transactions to apply before returning data. The 'delayed' thing you're referring to only applies to non-ancestor queries.

From another developer article: "However, even if it is not completely applied, subsequent reads, writes, and ancestor queries will always reflect the results of the commit, because these operations apply any outstanding modifications before executing." [1]
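
To illustrate the commit-versus-apply distinction that the quoted article describes, here is a toy in-memory model. It is not the datastore implementation: `ToyEntityGroup` and its methods are invented for illustration. A commit appends to a log; a get by key first applies anything outstanding, while the eventually consistent read does not.

```python
class ToyEntityGroup(object):
    """Toy model: committed writes sit in a log until they are applied."""

    def __init__(self):
        self.applied = {}      # state visible without rolling the log forward
        self.log = []          # committed but not yet applied writes

    def commit(self, key, value):
        # The write is durable once logged, even if the caller then
        # sees an exception and the apply step is delayed.
        self.log.append((key, value))

    def _roll_forward(self):
        for key, value in self.log:
            self.applied[key] = value
        self.log = []

    def get(self, key):
        # Models a get by key: apply outstanding commits, then read.
        self._roll_forward()
        return self.applied.get(key)

    def eventual_read(self, key):
        # Models a non-ancestor query: reads whatever has been applied so far.
        return self.applied.get(key)


group = ToyEntityGroup()
group.commit("A", "new value")          # committed, apply still pending

stale = group.eventual_read("A")        # None: the apply has not happened yet
fresh = group.get("A")                  # "new value": the get forces the apply
```

In this model a committed write can never be missed by `get`, which is the behaviour the article attributes to reads, writes, and ancestor queries; only the `eventual_read` path can lag.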

aanaa...@gmail.com

Jun 11, 2013, 2:26:48 PM
to google-a...@googlegroups.com
Hi Vinny.

My understanding is that even if the operations are in different transactions, as long as they share the same ancestor or the transactions are marked XG=true, the result will be the same.

But your point on handling Exception in your earlier mail is very valid.

Thanks.


Albert Padin

Jun 11, 2013, 8:19:04 PM
to google-a...@googlegroups.com

First, thanks so much for your replies.


To add clarity, this is my situation:


I'm performing a transaction operation that's not idempotent. Therefore, I need to know whether it succeeded or not, and I have to be completely certain about it (before retrying if it failed). Fortunately, I can generate an id unique for each operation. So what I do is I run the following transaction via db.transactional(xg=True):


1. db.get(the_key_of_the_transaction_marker)
2. If the transaction marker exists, just return. We don't have to continue because it's already been done.
3. If it doesn't exist, proceed…
4. Perform the non-idempotent modifications here…
5. Create the transaction marker for this transaction via db.put(), with its key set to the specific marker/id for this set of modifications.


I use step 1 and step 5 to ensure that the operation can never be run twice. It's ok for my operation to fail completely. However, it's completely unacceptable for it to run twice.
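
The five steps can be sketched with a plain dict standing in for the datastore. This is an assumption for illustration only: `store`, `apply_once`, and the marker key format are invented, and the real version would run inside db.transactional(xg=True) so that the marker and the modifications are committed atomically.

```python
store = {"balance": 100}          # toy stand-in for datastore entities


def apply_once(store, operation_id, amount):
    """Apply a non-idempotent change at most once per operation_id.

    In the real pattern all of this runs inside one transaction, so the
    marker and the change are written together or not at all.
    """
    marker_key = ("Marker", operation_id)
    if marker_key in store:            # steps 1-2: marker exists, already done
        return False
    store["balance"] += amount         # step 4: the non-idempotent change
    store[marker_key] = True           # step 5: record that it happened
    return True


first = apply_once(store, "abcdefg", 25)    # True: change applied
second = apply_once(store, "abcdefg", 25)   # False: repeat detected, no change
```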


Now, if it's possible for the db.get() in step 1 to return None even if a previous transaction had already created that entity (in its own step 5), then my "repeat transaction catcher" is useless. Hence, I'm looking for clarification regarding this…

Does this make sense?

Thanks so much!


Albert


Helge Tesdal

Jun 12, 2013, 5:57:31 AM
to google-a...@googlegroups.com
Step 1 will not return None if a previous transaction created the entity (even if that transaction appeared to fail).

Because the datastore uses optimistic concurrency control (not locking), another transaction might also write the_key_of_the_transaction_marker somewhere between steps 1 and 5. In that case, step 5 will raise an exception on write because it notices that the timestamp of the_key_of_the_transaction_marker does not match what was read in step 1.

It is important to let that exception abort the transaction. Open try/excepts are dangerous in this setting.
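
A toy model of this optimistic check, under loud assumptions: `ToyStore`, `ToyTxn`, and `ConflictError` are invented names, versions stand in for the timestamps mentioned above, and the real datastore tracks conflicts per entity group rather than per key. The sketch only shows the shape of the behaviour: a transaction that read a key fails at commit if that key was written after the transaction began.

```python
class ConflictError(Exception):
    """Hypothetical stand-in for the datastore's transaction collision error."""


class ToyStore(object):
    """Toy versioned store: every committed write bumps a global version."""

    def __init__(self):
        self.data = {}
        self.version = 0
        self.last_write = {}       # key -> version at which it was last written

    def begin(self):
        return ToyTxn(self)


class ToyTxn(object):
    def __init__(self, store):
        self.store = store
        self.start_version = store.version
        self.read_keys = set()
        self.writes = {}

    def get(self, key):
        self.read_keys.add(key)
        return self.store.data.get(key)

    def put(self, key, value):
        self.writes[key] = value   # buffered until commit

    def commit(self):
        # Fail if anything this transaction read was written after it began.
        for key in self.read_keys:
            if self.store.last_write.get(key, 0) > self.start_version:
                raise ConflictError("%r changed since it was read" % (key,))
        self.store.version += 1
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.last_write[key] = self.store.version


store = ToyStore()
txn_a = store.begin()
seen_by_a = txn_a.get("marker")        # step 1: None, marker not there yet

txn_b = store.begin()                  # another transaction sneaks in
txn_b.put("marker", "written by B")
txn_b.commit()

txn_a.put("marker", "written by A")    # step 5
try:
    txn_a.commit()
    a_committed = True
except ConflictError:
    a_committed = False
```

Here `a_committed` ends up False and the marker keeps B's value. Catching the error in this sketch is only for demonstration; inside a real transaction function the exception should be allowed to propagate so the transaction aborts.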





--
Helge Tesdal
Senior Developer - mCASH Norge AS
+47 815 10 150
http://mCA.SH

Albert Padin

Jun 12, 2013, 10:50:14 AM
to google-a...@googlegroups.com

Thank you Helge!


Please let me clarify…


Are you saying that even if the db.get(key_of_marker) returns None in step 1, step 5 will raise an exception if an entity with the same key was created in another transaction that happened between steps 2 and 4 of the original transaction?


Is the following Transaction A doomed to fail?



Transaction A

    # Check the marker to see whether this transaction has been done before
    if db.get(db.Key.from_path('Marker', 'abcdefg')):
        return

    # Perform other db operations here…

    # CONCURRENTLY, Transaction B happens and completes here...

    # Set the marker specific to this transaction to avoid repeats
    Marker(key_name='abcdefg').put()
    return


Transaction B

    Marker(key_name='abcdefg').put()
    return


I just want to be 100% certain that my transactions either completely fail or completely succeed AND NEVER EVER repeat.

Do you think the above algorithm will fit my requirements?

Thanks! I appreciate this so much.


Albert

Helge Tesdal

Jun 13, 2013, 2:20:04 AM
to google-a...@googlegroups.com
On Wed, Jun 12, 2013 at 4:50 PM, Albert Padin <alber...@gmail.com> wrote:


Are you saying that even if the db.get(key_of_marker) returns None in step 1, step 5 will raise an exception if an entity with the same key was created in another transaction that happened between steps 2 and 4 of the original transaction?


Yes.

It's optimistic concurrency control plus serializable isolation. Optimistic concurrency control means another txn can complete between steps 1 and 5. The alternative is locking, which would have prevented that (but has other limitations). Serializable isolation within a txn means that if the txn read data that was changed between its begin and end, it fails and has to retry.
 

Is the following Transaction A doomed to fail?


...


Yes, it will raise an exception.
 
I just want to be 100% certain that my transactions either completely fail or completely succeed AND NEVER EVER repeat.

In that case I suggest you test it for yourself as well. It should be instructive to see it in action, and would help you sleep better at night. ;)

Make a handler for txn A that starts a txn, reads, sleeps, and writes 'txn A', and a different handler B that writes 'txn B'. Point your browser to handler A, then to handler B before request A ends, and see whether A raises an exception. Note that if B has to start a new instance, A might complete before B; in that case, just try again.
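
If you want to rehearse the shape of that experiment before deploying anything, here is a rough local simulation using threads. Everything in it is an assumption for illustration: `ToyStore`, `handler_a`, `handler_b`, and `ConflictError` are invented, and events replace the sleep so the interleaving is deterministic. It does not exercise the real datastore, so it is no substitute for the test on App Engine itself.

```python
import threading


class ConflictError(Exception):
    """Hypothetical stand-in for a transaction collision error."""


class ToyStore(object):
    """Toy store with a per-key version, guarded by a lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.values = {}
        self.versions = {}

    def read(self, key):
        with self.lock:
            return self.values.get(key), self.versions.get(key, 0)

    def write_if_unchanged(self, key, value, expected_version):
        with self.lock:
            if self.versions.get(key, 0) != expected_version:
                raise ConflictError(key)
            self.values[key] = value
            self.versions[key] = expected_version + 1

    def write(self, key, value):
        with self.lock:
            self.values[key] = value
            self.versions[key] = self.versions.get(key, 0) + 1


store = ToyStore()
a_has_read = threading.Event()
b_has_written = threading.Event()
outcome = {}


def handler_a():
    _, version = store.read("marker")      # txn A reads
    a_has_read.set()
    b_has_written.wait()                   # stands in for the sleep
    try:
        store.write_if_unchanged("marker", "txn A", version)
        outcome["a"] = "committed"
    except ConflictError:
        outcome["a"] = "conflict"


def handler_b():
    a_has_read.wait()                      # make sure A has read first
    store.write("marker", "txn B")
    b_has_written.set()


threads = [threading.Thread(target=handler_a),
           threading.Thread(target=handler_b)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

After both threads finish, `outcome["a"]` is "conflict" and the stored value is B's, which mirrors what the real experiment is expected to show.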

Albert Padin

Jun 13, 2013, 4:45:55 AM
to google-a...@googlegroups.com

Hi Helge!


Thanks so much! I tested it, and you were right about all the points. Thanks for your time! :)


To everyone who also posted in this thread, thanks so much for your time as well. You guys really helped me, and I appreciate it.



Cheers!



Albert
