TransactionFailedError: Is the transaction committed or not?


Pol

Jul 25, 2011, 4:53:42 AM
to Google App Engine
Hi,

The exception says: TransactionFailedError: The transaction could not
be committed. Please try again.

The doc at http://code.google.com/appengine/docs/python/datastore/transactions.html
says: You can receive Timeout, TransactionFailedError, or
InternalError exceptions in cases where transactions have been
committed and eventually will be applied successfully.

So if you get TransactionFailedError, do you need to execute the
transaction again or will it automatically be applied later?

- Pol

Robert Kluin

Jul 26, 2011, 1:23:16 AM
to google-a...@googlegroups.com
Hi Pol,
Generally you will probably want to execute it again.


Robert

> --
> You received this message because you are subscribed to the Google Groups "Google App Engine" group.
> To post to this group, send email to google-a...@googlegroups.com.
> To unsubscribe from this group, send email to google-appengi...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/google-appengine?hl=en.
>
>

Jose Montes de Oca

Jul 26, 2011, 3:40:26 PM
to google-a...@googlegroups.com
Hi Pol,

What this means is that even if a transaction throws an exception, that does not mean the transaction failed. That's why you need to make your datastore transaction idempotent. So if you retry a transaction because it threw an exception, your transaction needs to "check" whether the previous attempt committed successfully or not.

Best,
Jose Montes de Oca

Joshua Smith

Jul 26, 2011, 4:03:17 PM
to google-a...@googlegroups.com
On this topic, nobody ever answered this question:

http://code.google.com/appengine/docs/python/datastore/transactions.html

First it says, "Make sure your transactions are idempotent" and then it gives an example which isn't.

I'm not sure it's possible to do the task in that example correctly if you cannot tell whether a transaction succeeded or failed when it throws an exception. I just tried sketching out a solution that stored a transaction ID in the model, but that won't work because there could be multiple writers. The whole thing seems rather intractable, and the idea that you cannot tell whether a transaction succeeded or failed violates the principle of least surprise for anyone who's ever used a database!

Can some googlers weigh in, and explain how, for example, the example in the documentation could be implemented correctly?


So, can you?

I think that the idea of making transactions idempotent is nonsense.  I don't think it is going to be possible in many cases.

Either you support transactions, or you don't.  By my reading, GAE doesn't.

-Joshua


Robert Kluin

Jul 26, 2011, 10:11:48 PM
to google-a...@googlegroups.com
Well, I generally put revision numbers on my entities. In my
transactions, before the update, I check that the revision number
currently on the entity matches what I expect. If it doesn't then I
know the update is out-of-date.

There are other solutions too, depending on the app and data model.


Robert

Joshua Smith

Jul 27, 2011, 7:51:35 AM
to google-a...@googlegroups.com
Can you give an example? Because that approach would seem to be unreliable in a two-writer scenario.

Tim Hoffman

Jul 27, 2011, 8:58:42 PM
to google-a...@googlegroups.com
If you always get, modify, and put within a transaction, how would it be unreliable?

Rgds

Tim
 

Joshua Smith

Jul 28, 2011, 10:02:53 AM
to google-a...@googlegroups.com
The problem is that google transactions can report an exception, and then go ahead and succeed anyway.

So the docs recommend that you only write idempotent transactions, which is a completely silly suggestion.  I've yet to see a single example of how one might write an idempotent transaction.  (Unless, I suppose, you create a separate child model in the database which is parented by the object you are transacting on, and then you query the list of children every time you retry your transaction to see if it's already in there, but that won't scale.)

I contend that a DB that cannot tell you reliably whether a transaction succeeded or failed does not support transactions.

GAE can essentially report 3 possible results from a transaction:
- Definitely succeeded
- Definitely failed
- Beats me

I contend that third possible result makes it impossible to write software that relies on transactions.

Therefore, GAE doesn't support transactions.


thecheatah

Jul 28, 2011, 10:43:49 AM
to Google App Engine
I agree with Joshua. This kind of behavior would not be acceptable if
I were to make an app that requires 100% consistency and is "easy" to
program. For stuff like this I use a relational db.

Ravneet


Pol

Jul 28, 2011, 10:45:21 AM
to Google App Engine
On Jul 26, 9:40 pm, Jose Montes de Oca <jfmontesde...@google.com>
wrote:
I don't understand how GAE could not report whether the transaction
succeeded or not. Clearly this stuff is deterministic.

You're saying I have to maintain my own journal of transactions to
check against. But then that journal itself would have to be
updated through transactions, so it needs to be journaled as well and
so on... it's completely unsolvable. How would you implement a bank on
such a system? How could you ever 100% guarantee a user's transactions
are all in there or that the account's balance is correct?

Let's even look at a simple example: if you were to implement a simple
counter, how do you make that idempotent?

def txn():
    counter = Counter.get(...)
    counter.value += 1
    counter.put()

Pol

Jul 28, 2011, 10:51:10 AM
to Google App Engine
I see two big problems here:

1) There appears to be absolutely no record, say in the dashboard, of
failed transactions that eventually succeeded or really failed. For
some types of apps, I could live with a 1 in a million transaction
failure that may or may not be eventually successful, but I need to
know which entity ended up in a degenerate state.

2) The benefit of the current approach of fake-real-transactions is
completely unknown: does it make GAE massively faster or more reliable
or something?

For instance, if you choose non-redundant storage in Amazon S3:
1) you get notifications for lost objects
2) it's cheaper


Stephen Johnson

Jul 28, 2011, 1:31:03 PM
to google-a...@googlegroups.com
This does seem to be a conundrum. I'll offer up an idea for those instances where this behavior is of concern. It's not a great solution but if you need to guarantee that a transaction definitely succeeds and your transaction is idempotent then maybe this or some variation will help.

First off, incrementing version numbers won't work due to the following scenario:

1.) open a transaction and find out the current version is 78
2.) perform update and increment the version to 79
3.) commit transaction
4.) get an exception. Now, did the transaction succeed or fail?? This is the conundrum
5.) so re-fetch the item and check the version number. It says 79, so the transaction must have succeeded. But what if some other transaction in the meantime was the one that updated the item and incremented the version number? So back to conundrum!!!

Okay, so instead of using a version number we use a random transaction stamp (this could be a random number plus some hashed in data that is being transacted, etc. to make it really unique). Then instead of a version number property, we add a transaction stamp list that holds, for example, the last 5 or 10 stamps, or however many you want. So it goes like this:

1.) open a transaction 
2.) perform update
3.) add unique transaction stamp to list, if list has reached maximum remove the oldest one (FIFO).
4.) commit transaction
5.) if everything is fine, go on our merry way
6.) if an exception is generated, then re-query the entity. If the entity has "our" unique transaction stamp in its list then the transaction really did succeed; if not then the transaction really did fail.
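
The six steps above can be sketched roughly as follows. This is only an illustration under loud assumptions: a plain dict stands in for the datastore entity, and `UnknownOutcome`, `apply_update`, `update_once` and the `fail` switch are hypothetical names invented for the simulation, not App Engine APIs. The stamp is generated outside the simulated transaction so that a re-run would carry the same stamp.

```python
import uuid

MAX_STAMPS = 10  # assumed cap on how many recent stamps are kept


class UnknownOutcome(Exception):
    """Stands in for an exception whose commit status is unknown."""


# Hypothetical in-memory stand-in for one datastore entity.
entity = {"value": 0, "stamps": []}


def apply_update(stamp, delta, fail=None):
    """Simulated transaction: change the value and record the stamp together."""
    if fail == "before_commit":
        raise UnknownOutcome()  # nothing was written
    entity["value"] += delta
    entity["stamps"].append(stamp)
    del entity["stamps"][:-MAX_STAMPS]  # FIFO: drop all but the newest stamps
    if fail == "after_commit":
        raise UnknownOutcome()  # written, but the caller is told it failed


def update_once(delta, fail=None):
    stamp = uuid.uuid4().hex  # generated once, outside the transaction
    try:
        apply_update(stamp, delta, fail)
    except UnknownOutcome:
        # Step 6: re-read the entity and look for our own stamp.
        if stamp not in entity["stamps"]:
            apply_update(stamp, delta)  # it really failed, so run it again


update_once(5, fail="after_commit")   # committed despite the exception
update_once(3, fail="before_commit")  # genuinely failed, then retried
```

Both calls end up applied exactly once. The sketch sidesteps one real-world issue: the dict is always current, whereas a real re-read would have to be one that is guaranteed to see every committed write.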

Well, that's my idea,
Stephen
CortexConnect

Stephen Johnson

Jul 28, 2011, 1:31:34 PM
to google-a...@googlegroups.com
Oops, I meant that "your transaction is NOT idempotent", duh.

Joshua Smith

Jul 28, 2011, 1:48:25 PM
to google-a...@googlegroups.com

> 6.) if exception is generated, then re-query the entity if the entity has "our" unique timestamp in its list then the transaction really did succeed, if not then the transaction really did fail.

Could work, except that you'll need to do #6 with a task queue task, because in the case of an exception, they only say it might "eventually" make it into the database. So if you just requery and it isn't there, then you don't know that it didn't succeed… you just know it didn't succeed *yet*. And since the exception which leads to this problem is "InternalError", and we know when those start coming up you can expect to see them continue for a long time, you're going to have a lot of stuff in that queue.

So that means you need to throw this check into the task queue, and keep checking it for a while. Which probably means you need to have a really big list of "recent" transactions (not just 10).

This hack is similar to what I suggested, which is to create a transaction record in the database as a child of the entity you are updating in the transaction. When you get the exception, you wait a while (how long???) and then see if the transaction record exists using an ancestor query. If it does, then the transaction succeeded, and if not, then it didn't. This avoids having to add goofy hacky attributes to the thing you are updating, and in principle, you can schedule a task to delete any transaction records for successful transactions (or periodically sweep up old ones with a cron job).

So, I suppose, it is possible to create a framework that can really do transactions using the building blocks GAE has provided us, but to say that GAE supports transactions on its own is a stretch.

-Joshua

Stephen Johnson

Jul 28, 2011, 2:24:18 PM
to google-a...@googlegroups.com
I should have mentioned that the re-fetch should be done in a transaction, which should take away the "eventually" part since, from what I've gathered, you're then reading directly from the master transaction log. As for continuous timeouts due to datastore issues, you'd have to fall back to the task queue as Joshua has stated. It's not a perfect solution but oh well.



Robert Kluin

Jul 28, 2011, 4:50:36 PM
to google-a...@googlegroups.com
On Thu, Jul 28, 2011 at 12:31, Stephen Johnson <onepag...@gmail.com> wrote:
> This does seem to be a conundrum. I'll offer up an idea for those instances
> where this behavior is of concern. It's not a great solution but if you need
> to guarantee that a transaction definitely succeeds and your transaction is
> idempotent then maybe this or some variation will help.
> First off, incrementing version numbers won't work due to the following
> scenario:
> 1.) open a transaction find out the current version is 78
> 2.) perform update and increment the version to 79
> 3.) commit transaction
> 4.) get an exception. Now, did the transaction succeed or fail?? This is the
> conundrum
> 5.) so re-fetch the item and check the version number, it says 79, so
> transaction must have succeed, but what if some other transaction in the
> meantime was the one that updated the item and incremented the version
> number. So back to conundrum!!!

This method doesn't really make sense if you read the version number
in the transaction, do stuff, increment the version, then reput within
the same transaction.

However, I do use something like this in some cases, except I have
read the current version prior to 'computing' my updated values. So I
pass the expected version number into the transaction. If I get a
mismatch then I know the update values might be wrong and the update
either needs to be skipped or recomputed, or the user needs to OK the
changes (and of course this happens based on the current version number).
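
A minimal sketch of passing the expected version into the transaction, assuming a dict stands in for the entity. The names `StaleUpdate`, `update_price` and the `price` field are hypothetical, chosen only for illustration.

```python
class StaleUpdate(Exception):
    """Raised when the entity changed since the caller last read it."""


# Hypothetical in-memory stand-in for one datastore entity.
entity = {"revision": 1, "price": 100}


def update_price(expected_revision, new_price):
    """Simulated transaction body: apply only if the revision still matches."""
    if entity["revision"] != expected_revision:
        raise StaleUpdate(
            "expected %d, found %d" % (expected_revision, entity["revision"]))
    entity["price"] = new_price
    entity["revision"] += 1


seen = entity["revision"]          # read before computing the new value
new_price = entity["price"] + 10   # computed outside the transaction
update_price(seen, new_price)      # first attempt is applied

try:
    update_price(seen, new_price)  # a blind retry is rejected, not re-applied
    applied_twice = True
except StaleUpdate:
    applied_twice = False
```

The check stops a stale or repeated update from being applied on top of newer data. As pointed out earlier in the thread, a mismatch alone does not say whether the revision was bumped by this writer's earlier attempt or by someone else.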

> Okay, so instead of using a version number we use a random transaction stamp
> (this could be a random number plus some hashed in data that is being
> transacted, etc. to make it really unique). Then instead of a version number
> property, we add a transaction stamp list that holds, for example, the last
> 5 or 10 stamps, or however many you want. So it goes like this:
> 1.) open a transaction
> 2.) perform update
> 3.) add unique transaction stamp to list, if list has reached maximum remove
> the oldest one (FIFO).
> 4.) commit transaction
> 5.) if everything is fine, go on our merry way
> 6.) if exception is generated, then re-query the entity if the entity has
> "our" unique timestamp in its list then the transaction really did succeed,
> if not then the transaction really did fail.

Again, the unique id needs to be generated outside the transaction so that
if the transaction were to run again the id would be the same. Also,
instead of the list property you could use a separate entity (in the
same entity group) with no properties as a marker. That avoids the
deserialization cost and is totally scalable. With some minor tweaks
you would even be able to clean up the old marker entities.
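
The marker-entity idea might look something like the sketch below. It assumes a dict keyed by tuples stands in for one entity group; `deposit`, the `Account` and `TxnMarker` kinds and the balance field are hypothetical names used only for this illustration.

```python
import uuid

# Hypothetical in-memory entity group: key tuple -> entity dict.
group = {("Account", "alice"): {"balance": 100}}


def deposit(account_key, amount, txn_id):
    """Simulated transaction body; safe to run any number of times."""
    marker_key = account_key + ("TxnMarker", txn_id)  # child of the account
    if marker_key in group:
        return False            # an earlier attempt already committed
    group[account_key]["balance"] += amount
    group[marker_key] = {}      # property-less marker, same entity group
    return True


txn_id = uuid.uuid4().hex       # generated once, outside the transaction
first = deposit(("Account", "alice"), 25, txn_id)
second = deposit(("Account", "alice"), 25, txn_id)  # retry, outcome unknown
```

Because the marker and the balance change are written in the same simulated transaction, the marker exists exactly when the update was applied, so the retry becomes a no-op. Cleaning up old markers is left out of the sketch.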

Also, as Stephen notes in a later comment, if the next read is in a
transaction it will force any outstanding writes to be applied -- so
it is always consistent. Note that a db.get(some_key) implicitly uses
a transaction.


Robert

Jose Montes de Oca

Jul 28, 2011, 5:30:39 PM
to google-a...@googlegroups.com
Hi Joshua,

As Robert said, making a process idempotent depends mostly on the app logic and data model. IMO, for the example case you need more context on how the counter is being used in order to make its use in a transaction idempotent. (Remember: the example just illustrates how to make use of a transaction; it does not handle any retry, whether the transaction failed or not.)

As you said, making a separate child model, which is parented by the object you are transacting on, and querying inside the transaction to see if it's already in there is a good solution to make your transaction idempotent, and I don't see why this approach won't scale. If you do an ancestor query on that kind it would be very fast.

With this approach, the code inside the transaction should do something like:
1) read the child entity
2) if the entity has not been processed, do the transaction logic;
else: do nothing.

FYI GAE supports transactions, because the whole transaction either succeeds or fails. Whether you know about the outcome or not is a different matter.

Although making a process idempotent takes somewhat complicated logic, the better solution here would be to figure out why the exception is occurring, mostly because these types of exceptions are not common.

You don't need to "only" write idempotent transactions. You should make them idempotent unless it does not matter whether they succeed or not. As I said before, these types of exceptions are not common, so there are many cases where 100% data retention is not as important.

Jose Montes de Oca

Jul 28, 2011, 5:33:16 PM
to google-a...@googlegroups.com
Hi Pol,

Answering your first concern in numbers:

1) I think there is a misunderstanding about getting an exception versus a failure, because a transaction will either succeed or fail. FYI we are aware this is unclear in the docs and we are working on it. But as I said before, if you can live with not knowing for sure the outcome (succeeded or failed) of 1 in a million transactions, you won't need to make your process idempotent.

2) There is no fake-real-transaction. For example, one of the other uncommon exceptions that someone could get in the commit() would be a Timeout exception. With any distributed system, unless you are willing to wait for an answer forever, there is always the possibility that you will get a timeout before the remote system responds with the result of the operation you requested (this is true in all databases). What GAE guarantees is that the transaction either commits successfully or it does not, so if you implement some retry logic in transactions, making it idempotent will guarantee you don't do the transaction twice.

For the counter example, to make something idempotent you need a way to distinguish each time the counter gets incremented, and that depends on the logic of the application. It's really hard to make it idempotent by itself. Let's say part of a transaction is incrementing a counter. If the process of incrementing is unique to users and a user can only increment once, you could maintain another kind, set inside the transaction, that stores the user who incremented the counter. Then your logic inside the transaction will first check whether the user already incremented the counter by doing a get on the kind that keeps track of that (we can be sure that if a user appears there the transaction was committed, because a transaction either fails or not). If he did not, you go ahead and process the logic again; else do nothing (it was already done). This makes sure the transaction is not redone if you retry it after getting an exception where the outcome was unknown.
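
As a rough sketch of the once-per-user counter described above, with a retry loop around it: a dict and a set stand in for the counter entity and the tracking kind, and `UnknownOutcome`, `increment_for`, `increment_with_retry` and the `lose_reply` switch are hypothetical names for the simulation, not datastore APIs.

```python
class UnknownOutcome(Exception):
    """Stands in for an exception raised after the commit was applied."""


# Hypothetical in-memory stand-ins for the counter entity and tracking kind.
counter = {"value": 0}
incremented_by = set()


def increment_for(user, lose_reply=False):
    """Simulated transaction body: at most one increment per user."""
    if user not in incremented_by:
        counter["value"] += 1
        incremented_by.add(user)
    if lose_reply:
        raise UnknownOutcome()  # the write happened, but the caller sees an error


def increment_with_retry(user, lose_reply_first=False, attempts=3):
    for attempt in range(attempts):
        try:
            increment_for(user, lose_reply=(lose_reply_first and attempt == 0))
            return
        except UnknownOutcome:
            continue  # safe: a repeat run finds the user and does nothing
    raise UnknownOutcome()


increment_with_retry("pol", lose_reply_first=True)
increment_with_retry("joshua")
```

Without the `incremented_by` check, the first call would count "pol" twice: once on the attempt that committed but reported an error, and once on the retry.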

But as I said before, these exceptions are uncommon and it would be good to know what you are doing inside the transaction that is causing those exceptions. Timeout exceptions could be related to master/slave issues (in an M/S app); this could go away by migrating to the High Replication datastore.

Hope this helps clarify some points.

Best,
Jose

Pol

Jul 29, 2011, 10:17:30 AM
to Google App Engine
Hi Jose,

> 2) There is no fake-real-transaction, for example, one of the other uncommon
> exception that someone could get in the commit() would be Timeout exception.
> With any distributed system, unless you are willing to wait for an answer
> forever, there is always the possibility that you will get a timeout before
> the remote system responds with the result of the operation you requested
> (this is true in all databases), what GAE warranty is that the transaction
> either commits successfully or it does not, so if you implement some retry
> logic in transactions, having it idempotent will warranty you don't do the
> transaction twice.

Just to be sure, are you saying that for a timeout exception the
transaction has actually been committed or not? In other words, should
a Timeout exception be dealt with the same way a
TransactionFailedError is?

I observe Timeout exceptions now and then where there is too much
contention on the same entity group. But I've only seen a
TransactionFailedError once in 2 months.

> But as I said before, this exceptions are uncommon and it would be good to
> know what are you doing inside the transaction that is causing those
> transactions. Timeout exception could be related to master slaves issues (in
> a M/S app) this could go away by migrating to High replication datastore.

I'm using the HR datastore already, but I'm not surprised by the
Timeouts as in my case there can sometimes, but rarely, be close to 10
transactions per second on the same entity group.

> Hope this helps clarifies some points.

Thanks much for your detailed answers indeed!

- Pol

Tim Hoffman

Jul 29, 2011, 12:36:03 PM
to google-a...@googlegroups.com


On Friday, July 29, 2011 10:17:30 PM UTC+8, Pol wrote:
> Hi Jose,
>
> > 2) There is no fake-real-transaction, for example, one of the other uncommon
> > exception that someone could get in the commit() would be Timeout exception.
> > With any distributed system, unless you are willing to wait for an answer
> > forever, there is always the possibility that you will get a timeout before
> > the remote system responds with the result of the operation you requested
> > (this is true in all databases), what GAE warranty is that the transaction
> > either commits successfully or it does not, so if you implement some retry
> > logic in transactions, having it idempotent will warranty you don't do the
> > transaction twice.
>
> Just to be sure, are you saying that for a timeout exception the
> transaction has actually been committed or not? In other words, should
> a Timeout exception be dealt with the same way a
> TransactionFailedError is?

No.  I believe he was saying if you get a timeout it just means you didn't get a response 
back in time.  So the transaction could have completed or failed. You don't know.
What you can be certain of is either everything in the transaction succeeded or failed.

 

> I observe Timeout exceptions now and then where there is too much
> contention on the same entity group. But I've only seen a
> TransactionFailedError once in 2 months.

I get TransactionFailedError quite a bit.

The transaction could not be committed. Please try again.
Traceback (most recent call last):
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 702, in __call__
    handler.post(*groups)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/deferred/deferred.py", line 269, in post
    run(self.request.body)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/deferred/deferred.py", line 131, in run
    return func(*args, **kwds)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/deferred/deferred.py", line 174, in invoke_member
    return getattr(obj, membername)(*args, **kwargs)
  File "/base/data/home/apps/q-tracker/0-9-9-9.352045897303914982/qtrack/models/contract.py", line 268, in defer_create_worksheet
    db.run_in_transaction(contract.create_worksheet,start_of_week(week))
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 2187, in RunInTransaction
    DEFAULT_TRANSACTION_RETRIES, function, *args, **kwargs)
  File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 2294, in RunInTransactionCustomRetries
    'The transaction could not be committed. Please try again.')
TransactionFailedError: The transaction could not be committed. Please try again.

But this is running in a deferred task, and the task is automatically put back into the queue, so I know it will eventually get there.
I am running on M/S.

Rgds

Tim

vlad

Jul 29, 2011, 8:20:19 PM
to google-a...@googlegroups.com
Joshua,

I was thinking of the child entities approach as well. The question is: does writing a new child entity block the whole entity group? If the answer is yes, then you get no benefit from it other than keeping meticulous track of what is happening. I would consider that approach if it let me avoid transaction collisions as well.

Robert Kluin

Jul 30, 2011, 12:02:50 AM
to google-a...@googlegroups.com

It won't help avoid the collision. It will let you keep track of previous transactions.


Heiko Roth

Jul 30, 2011, 1:39:46 PM
to Google App Engine


On 26 Jul., 07:23, Robert Kluin <robert.kl...@gmail.com> wrote:
> Hi Pol,
>   Generally you will probably want to execute it again.
>

No. I've got this problem with time registration. We get an exception
saying that the transaction failed, so we do it again, which can result in
duplicate (or more) datastore entries, which is bad, because now there
are duplicate bookings and so duplicate working time...

It's a disaster that the program gets a "transaction failed" when the
transaction didn't actually fail, so doing another insert results in a mess.
But ignoring the exception can result in no datastore entry.

So we have to catch the exception, do a select and repeat the insert if
it hasn't happened by now.
And we have to do that for every insert ......

I'm frustrated.

Robert Kluin

Jul 30, 2011, 3:43:27 PM
to google-a...@googlegroups.com
If you have a way to precompute something to use as a key-name this is
significantly less of an issue. The transaction repeating would not
result in duplicate records being created that way.
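
A small sketch of the key-name idea, applied to the time-booking case mentioned above. Everything here is an assumption for illustration: a dict stands in for the datastore kind, and `booking_key`, `book_time` and the choice of employee plus start time as the booking's identity are hypothetical.

```python
import hashlib

# Hypothetical in-memory stand-in for a datastore kind: key name -> entity.
bookings = {}


def booking_key(employee, start_iso):
    """Derive a deterministic key name from the booking's natural identity."""
    raw = "%s|%s" % (employee, start_iso)
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()


def book_time(employee, start_iso, minutes):
    """Simulated put: writing the same key again overwrites, never duplicates."""
    bookings[booking_key(employee, start_iso)] = {
        "employee": employee, "start": start_iso, "minutes": minutes}


book_time("heiko", "2011-07-30T09:00", 480)
book_time("heiko", "2011-07-30T09:00", 480)  # retry after a failed-looking commit
```

With auto-generated ids, the retry would have created a second record. With a precomputed key name, both attempts address the same record, so no select-then-insert step is needed after the exception.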


Robert
