Is "transaction.atomic" in 1.6 supposed to work this way?

2,848 views
Skip to first unread message

Richard Ward

unread,
Sep 19, 2013, 8:38:20 AM9/19/13
to django-d...@googlegroups.com
We're building a site that needs to use transactions, and have been doing so against the 1.6 beta as 1.6 is nearly out and we thought it would be easier to use the new transaction api, but we came across an unexpected problem.

Basically if you catch an IntegrityError and 'recover' from it, then your transaction is still not committed.

I'm posting this here rather than in django-users as its a feature in an unreleased version and it seems like it a bug to me (apologies if it is the wrong place).

Take this code:

def innocent_looking_function():
    "Makes sure that a MyModel with foo=1 exists"
    try:
        #imagine mymodel has a unique constraint on 'foo'
        MyModel(foo=1).save()
    except IntegrityError:
        pass

@transaction.atomic
def my_func():
    innocent_looking_function()
    MyOtherModel(bar="hi").save()

If you call 'my_func' when a MyModel with foo=1 exists, then the transaction.atomic decorator will roll back the transaction, even though it seems like everything is alright. 'my_func' doesn't know that 'innocent_looking_func' may cause this behaviour, and as far as 'innocent_looking_func' is concerned it has not done anything wrong - it caught and ignored a harmless error.

I had assumed that transaction.atomic would only roll back the transaction if an exception was propagated through it (eg if innocent_looking_function did not catch the error), that would seem like the natural way of doing things.

So what is the problem here? I assume it is one of:
  1. 'innocent_looking_function' is badly written: it should not be catching IntegrityErrors under any circumstances (though that seems like a valid thing to do), it should instead use something like get_or_create.
  2. 'innocent_looking_function' should have 'with transaction.atomic():' just inside the 'try' block, and presumably so should every other bit of code where an IntegrityError is caught. But what if 'innocent_looking_function' also gets used elsewhere that may not be inside an atomic block?
  3. transaction.atomic is in some way buggy / oddly designed.

IMO this behaviour makes code that looks valid (to me) not work in the expected way.

In case its relevant I'm using InnoDB tables on MySQL.

So Is there a bug? Or am I using this incorrectly, or misunderstanding something?

Thanks,

Richard

Gert Van Gool

unread,
Sep 19, 2013, 9:29:55 AM9/19/13
to django-developers
If the `innocent_looking_function` would use transactions. And thus handles the `IntegrityError by`
issuing a rollback, just like `get_or_create` does (https://github.com/django/django/blob/1.6b4/django/db/models/query.py#L360-L390).
Do you see the same behaviour in `my_func`?

-- Gert

Mobile: +32 498725202


--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
For more options, visit https://groups.google.com/groups/opt_out.

Aymeric Augustin

unread,
Sep 19, 2013, 9:33:53 AM9/19/13
to django-d...@googlegroups.com
2013/9/19 Richard Ward <ric...@richard.ward.name>
So what is the problem here? I assume it is one of:
  1. 'innocent_looking_function' is badly written: it should not be catching IntegrityErrors under any circumstances (though that seems like a valid thing to do), it should instead use something like get_or_create.

It's possible to catch IntegrityErrors. The correct pattern — regardless of function calls — is:

try:
    with transaction.atomic():
        do_stuff()
except IntegrityError:
    # transaction.atomic has already performed the rollback at this point
    handle_error()

  1. 'innocent_looking_function' should have 'with transaction.atomic():' just inside the 'try' block, and presumably so should every other bit of code where an IntegrityError is caught. But what if 'innocent_looking_function' also gets used elsewhere that may not be inside an atomic block?
Indeed that would implement the pattern shown above and resolve the problem.
  1. transaction.atomic is in some way buggy / oddly designed.
As the designer of this API, I say: suggestions welcome :) Read below before jumping to the drawing board, though.

IMO this behaviour makes code that looks valid (to me) not work in the expected way.

Catching IntegrityError inside transaction.atomic is wrong because it hides the exception from transaction.atomic.

As a consquence, transaction.atomic believe that everything went well and merrily commits. But the database has a broken transaction in progress; it can't commit.. transaction.atomic fails to commit, rolls back, and raises an exception — which isn't the original IntegrityError, adding to confusion.

In case its relevant I'm using InnoDB tables on MySQL.

You might have received more meaningful errors with another database engine, but overall you've described the intended, cross-database behavior of transaction.atomic. 

So Is there a bug? Or am I using this incorrectly, or misunderstanding something?

The code works as designed. It relies on exceptions to distinguish failure from success. This is the cleanest way to implement transactions as a context manager. If you swallow the exception, you make a failure look like a success, and things go wrong.

The docs talk a lot about exceptions but they don't warn specifically about this pitfall. It would be a worthy addition. Can you file a ticket?

Thank you,
 
--
Aymeric.

Anssi Kääriäinen

unread,
Sep 19, 2013, 9:45:53 AM9/19/13
to django-d...@googlegroups.com
There is a definite disparency about what the code does and what the docs say:
"""
If the block of code is successfully completed, the changes are committed to the database. If there is an exception, the changes are rolled back.
"""

If an exception is catched inside the block, then there should be no rollback. With PostgreSQL you can't continue to use the transaction, but in other databases you can. For these other databases allowing use of the transaction, yet still rolling back it later on seems to be a footgun - why are you allowed to continue to use the transaction if a rollback is going to happen anyways?

 - Anssi

Aymeric Augustin

unread,
Sep 19, 2013, 10:29:39 AM9/19/13
to django-d...@googlegroups.com
2013/9/19 Anssi Kääriäinen <anssi.ka...@thl.fi>
There is a definite disparency about what the code does and what the docs say:
"""
If the block of code is successfully completed, the changes are committed to the database. If there is an exception, the changes are rolled back.
"""

If an exception is catched inside the block, then there should be no rollback.

I'm not following you; as far as I can tell, that's what the docs say and what the code does.

With PostgreSQL you can't continue to use the transaction, but in other databases you can.

You... may... might... Truth be said, you shan't.
 
For these other databases allowing use of the transaction, yet still rolling back it later on seems to be a footgun - why are you allowed to continue to use the transaction if a rollback is going to happen anyways?

Allowing a transaction to proceed after a statement has failed is:
- stupid if the transaction is allowed to commit — you would end up with a successful transaction where not all statements have executed, which is exactly the situation transactions are designed to prevent!
- useless if it isn't allowed to commit.

That's why PostgreSQL forbids it.

That's also what the SQLite docs say:
"It is recommended that applications respond to the errors listed above by explicitly issuing a ROLLBACK command."

Extracting the same information from the MySQL and Oracle docs is left as an exercise for the reader.

--
Aymeric.

Anssi Kääriäinen

unread,
Sep 19, 2013, 10:52:25 AM9/19/13
to django-d...@googlegroups.com
After some investigation it turns out that this isn't about IntegrityErrors at all. Instead the problem is that inside @atomic block
Model.save() uses @atomic(savepoint=False) internally. And @atomic(savepoint=False) forces the outer atomic block to roll back if errors
happen.

If I recall correctly there is transaction.set_rollback(False) which can be used to remove forced rollback.

In general, there are three ways to respond to errors inside transactions:
  1. allow usage of the TX (MySQL, SQLite etc), allow user to decide commit/rollback
  2. forbid usage of the TX (PostgreSQL), force it to be rolled back.
  3. allow usage of the TX, but force rollback (Django)

To me it seems explicit error when using connection in "forced to rollback" state is better than allowing saving more data, then silently
rolling back the transaction. As you said this is useless.

To be clear, this is about exceptions (database or other) in atomic(savepoint=False) blocks, not about caught IntegrityErrors - Django
will allow you to continue and commit the TX in the caught IntgerityError case assuming the DB allows that.

 - Anssi

Aymeric Augustin

unread,
Sep 19, 2013, 4:33:00 PM9/19/13
to django-d...@googlegroups.com
On 19 sept. 2013, at 16:52, Anssi Kääriäinen <anssi.ka...@thl.fi> wrote:

> After some investigation it turns out that this isn't about IntegrityErrors at all. Instead the problem is that inside @atomic block
> Model.save() uses @atomic(savepoint=False) internally. And @atomic(savepoint=False) forces the outer atomic block to roll back if errors
> happen.

Yes, that's the actual code path — which I introduced after you requested it for performance reasons, in spite of my strong concerns that it could introduce weird edge cases ;-)

Nonetheless, in my opinion, the root cause of the perceived misbehavior is that user code (not Django code) catches IntegrityError inside an atomic block, preventing __exit__ from detecting that a database exception has happened.

DatabaseErrors must be caught outside the atomic block, as documented. Otherwise transaction.atomic will fail in various interesting and database dependent ways. It's harder to diagnose on databases that don't fully enforce transactional behavior ie. all but PostgreSQL.

> If I recall correctly there is transaction.set_rollback(False) which can be used to remove forced rollback.

This is a private API for a reason. Someone deeply familiar with the implementation details of transaction management in Django 1.6 and with an esoteric transaction management system might find this private API useful but it's very obviously not the answer to the original question. Please, everyone, don't do this.

> In general, there are three ways to respond to errors inside transactions:
> 1. allow usage of the TX (MySQL, SQLite etc), allow user to decide commit/rollback
> 2. forbid usage of the TX (PostgreSQL), force it to be rolled back.
> 3. allow usage of the TX, but force rollback (Django)

You're comparing apples to oranges here. If Django is backed by PostgreSQL, how can Django allow using the transaction if PostgreSQL forbids it?

> To me it seems explicit error when using connection in "forced to rollback" state is better than allowing saving more data, then silently rolling back the transaction. As you said this is useless.

Django doesn't allow saving more data in a broken transaction. If a DatabaseError occurs, execution jumps straight to __exit__ which sees an exception and triggers a rollback.

Of course, you can disregard the documentation and break this expectation by catching the exception. But there's no way around this in Python. A context manager cannot prevent catching exceptions.

> To be clear, this is about exceptions (database or other) in atomic(savepoint=False) blocks, not about caught IntegrityErrors - Django will allow you to continue and commit the TX in the caught IntgerityError case assuming the DB allows that.

I designed this part of Django to forbid continuing and committing the transaction. The reason is cross-database compatibility. I had to enforce the behavior of the most restrictive database, which is also the most correct, namely PostgreSQL.

Sure, you can do virtually anything by disregarding the docs and abusing private APIs, but I wouldn't describe that as "Django will allow you to…"

If I still haven't convinced you that I know what I'm talking about, here's an example with the exact same problem but without atomic(savepoint=False).

with transaction.atomic:
try:
do_stuff_in_db() # raises DatabaseError
except DatabaseError:
pass
do_more_stuff_in_db() # works only on lax databases eg. MySQL

In this example, it's totally possible that the transaction.atomic block will end up with a rollback even if do_more_stuff_in_db() succeed. Why? Because the commit is likely to fail if a statement failed during the transaction, and that will result in an implicit rollback!

At the highest level, my design is based on avoiding implicit rollbacks because explicit is better than implicit. But you can still force one if you try hard enough.

--
Aymeric.

Hannu Krosing

unread,
Sep 19, 2013, 4:47:48 PM9/19/13
to django-d...@googlegroups.com, Anssi Kääriäinen
On 09/19/2013 04:52 PM, Anssi K��ri�inen wrote:
> After some investigation it turns out that this isn't about
> IntegrityErrors at all. Instead the problem is that inside @atomic block
> Model.save() uses @atomic(savepoint=False) internally. And
> @atomic(savepoint=False) forces the outer atomic block to roll back if
> errors
> happen.
>
> If I recall correctly there is transaction.set_rollback(False) which
> can be used to remove forced rollback.
>
> In general, there are three ways to respond to errors inside transactions:
> 1. allow usage of the TX (MySQL, SQLite etc), allow user to decide
> commit/rollback
> 2. forbid usage of the TX (PostgreSQL), force it to be rolled back.
Actually you can use subtransactions in postgresql if you want to manage
failed queries yourself

BEGIN;
<do something>;
SAVEPOINT sp1;
<do something that fails>;
ROLLBACK TO SAVEPOINT sp1;
<do something else>;
COMMIT:

this will commit <do something>; and <do something else>; while rolling
back the failed <do something that fails>;
you can read more on this in
http://www.postgresql.org/docs/9.1/static/sql-rollback-to.html

Cheers

--
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic O�

Marc Tamlyn

unread,
Sep 19, 2013, 5:11:37 PM9/19/13
to django-d...@googlegroups.com

This is what nested atomics do. In effect savepoints are just creating another transaction inside the existing one (if my memories of Aymeric's talk are correct).

On 19 Sep 2013 21:48, "Hannu Krosing" <ha...@krosing.net> wrote:
2ndQuadrant Nordic OÜ

Anssi Kääriäinen

unread,
Sep 19, 2013, 5:38:17 PM9/19/13
to django-d...@googlegroups.com
I think there is a case of talking past each other going on here.


On Thursday, September 19, 2013 11:33:00 PM UTC+3, Aymeric Augustin wrote:
Nonetheless, in my opinion, the root cause of the perceived misbehavior is that user code (not Django code) catches IntegrityError inside an atomic block, preventing __exit__ from detecting that a database exception has happened.

This is *not* what is causing the problem in this case. The problem here is that if you do this inside atomic block:
try:
    obj.save()
except IntegrityError:
    pass

you can continue to use the connection, but after the outer atomic block is exited the transaction is rolled back. The combination of allowing usage of connection but rolling it back is the foot gun I have been talking about..

Note that this isn't about DatabaseErrors. *Any* exception bubbling from atomic(savepoint=False) block will cause this issue.

DatabaseErrors must be caught outside the atomic block, as documented. Otherwise transaction.atomic will fail in various interesting and database dependent ways. It's harder to diagnose on databases that don't fully enforce transactional behavior ie. all but PostgreSQL.

Again, this is not about DatabaseErrors at all. DatabaseErrors will mark "errors_occured". Errors bubbling from atomic(savepoint=False) will mark needs_rollback.

> If I recall correctly there is transaction.set_rollback(False) which can be used to remove forced rollback.

This is a private API for a reason. Someone deeply familiar with the implementation details of transaction management in Django 1.6 and with an esoteric transaction management system might find this private API useful but it's very obviously not the answer to the original question. Please, everyone, don't do this.

This is documented API...

> In general, there are three ways to respond to errors inside transactions:
>   1. allow usage of the TX (MySQL, SQLite etc), allow user to decide commit/rollback
>   2. forbid usage of the TX (PostgreSQL), force it to be rolled back.
>   3. allow usage of the TX, but force rollback (Django)

You're comparing apples to oranges here. If Django is backed by PostgreSQL, how can Django allow using the transaction if PostgreSQL forbids it?

Yes, you get #1 on PostgreSQL. But if you are using something other than PostgreSQL + Django you will get #3 I don't want that, and it seems you don't either. Yet this is what is provided. We can change this to #1 on all databases.

> To me it seems explicit error when using connection in "forced to rollback" state is better than allowing saving more data, then silently rolling back the transaction. As you said this is useless.

Django doesn't allow saving more data in a broken transaction. If a DatabaseError occurs, execution jumps straight to __exit__ which sees an exception and triggers a rollback.

No, this doesn't happen. When using savepoint=False, you get needs_rollback flagged for the connection. The connection is still usable. Anything you do will be silently rolled back.
 
Of course, you can disregard the documentation and break this expectation by catching the exception. But there's no way around this in Python. A context manager cannot prevent catching exceptions.

> To be clear, this is about exceptions (database or other) in atomic(savepoint=False) blocks, not about caught IntegrityErrors - Django will allow you to continue and commit the TX in the caught IntgerityError case assuming the DB allows that.

I designed this part of Django to forbid continuing and committing the transaction. The reason is cross-database compatibility. I had to enforce the behavior of the most restrictive database, which is also the most correct, namely PostgreSQL.


Sure, you can do virtually anything by disregarding the docs and abusing private APIs, but I wouldn't describe that as "Django will allow you to…"

If I still haven't convinced you that I know what I'm talking about, here's an example with the exact same problem but without atomic(savepoint=False).

with transaction.atomic:
    try:
        do_stuff_in_db()                # raises DatabaseError
    except DatabaseError:
        pass
    do_more_stuff_in_db()        # works only on lax databases eg. MySQL
 
In this example, it's totally possible that the transaction.atomic block will end up with a rollback even if do_more_stuff_in_db() succeed. Why? Because the commit is likely to fail if a statement failed during the transaction, and that will result in an implicit rollback!

So, why allow using the connection then?

But this case isn't as bad as:
with transaction.atomic():
    try:
        obj.save()
    except IntegrityError:
        pass
    another_obj.save()

Here, another_obj.save() will succeed. But the transaction will be *always* rolled back, silently. I don't see any good reason not to error out when using a connection which is going to be rolled back in always.

What would you think of a database that does:
> begin;
> insert into foobar values(1);
OUT: IntegrityError!
> insert into foobar values(2);
> commit;

you don't get error on commit - you just get a silent rollback. This is exactly what is happening in Django currently.

At the highest level, my design is based on avoiding implicit rollbacks because explicit is better than implicit. But you can still force one if you try hard enough.

Isn't there an implicit rollback going on here?

 - Anssi

Richard Ward

unread,
Sep 19, 2013, 7:32:26 PM9/19/13
to django-d...@googlegroups.com
Thanks for the explanations. I wasn't aware that particular difference between PostgreSQL and MySQL (I've not done much with PostgreSQL).

Perhaps I could have been clearer in my original post: Ansii is right about silence... when I call my_func the transaction.atomic decorator rolls back the transaction and then returns, rather than (eg) rolling back the transaction and then raising/propagating an exception (when using MySQL).

I don't think that what I was trying to do (continue using a transaction after an insert failed) was too outlandish. I'm not demanding that Django should allow me to do it (tho that would be nice), but I can see other people trying to do it - especially if they are mostly used to MySQL, which allows it. It may not be correct, but it is (IMO) intuitive - why should a transaction have to be considered completely ruined just because an insert failed, when I expected that it may happen? (I don't need an answer to that, just explaining my/others thought process). If Django must enforce PostgreSQL style behavior, then an exception at some point telling me off for my bad behavior would be useful. IIUC correctly I'd get that exception with Django+PostgreSQL but I don't get any with Django+MySQL.

Either way I think the docs could be improved: "Wrapping atomic in a try/except block allows for natural handling of integrity errors" is not the same as "DatabaseErrors must be caught outside the atomic block". Also "If the block of code is successfully completed, the changes are committed to the database. If there is an exception, the changes are rolled back" implies to me that the transaction will be rolled back iff an exception causes the stack to unwind past transaction.atomic, but the intent is "If there is any DatabaseError [or any other exception raised inside a transaction.atomic(savepoint=False)], even if it is caught by user code, the changes are rolled back."

Thanks,
Richard

Curtis Maloney

unread,
Sep 19, 2013, 8:05:35 PM9/19/13
to django-d...@googlegroups.com
I feel compelled to comment here...


On 20 September 2013 09:32, Richard Ward <ric...@richard.ward.name> wrote: 
I don't think that what I was trying to do (continue using a transaction after an insert failed) was too outlandish. I'm not demanding that Django should allow me to do it (tho that would be nice), but I can see other people trying to do it - especially if they are mostly used to MySQL, which allows it. It may not be correct, but it is (IMO) intuitive - why should a transaction have to be considered completely ruined just because an insert failed, when I expected that it may happen? (I don't need an answer to that, just explaining my/others thought process).

So you don't think that "one step of an atomic operation failing" is reason to fail the atomic operation?  Doesn't sound intuitive to me.

Django is trying to enforce the _logical_ behavior.  The concept of an "atomic" operation, even outside the context of DBMSs, is pretty much the consistent throughout CS.

If you are expecting the step could fail, then you should wrap it in a sub-transaction - the very reason they exist.  Again, you have an atomic operation [the insert] that may fail, so you want to gracefully handle it.
 
If Django must enforce PostgreSQL style behavior, then an exception at some point telling me off for my bad behavior would be useful. IIUC correctly I'd get that exception with Django+PostgreSQL but I don't get any with Django+MySQL.

I absolutely agree that Django should behave consistently in this, despite the quirks of the DBMS chosen.

Either way I think the docs could be improved: "Wrapping atomic in a try/except block allows for natural handling of integrity errors" is not the same as "DatabaseErrors must be caught outside the atomic block". Also "If the block of code is successfully completed, the changes are committed to the database. If there is an exception, the changes are rolled back" implies to me that the transaction will be rolled back iff an exception causes the stack to unwind past transaction.atomic, but the intent is "If there is any DatabaseError [or any other exception raised inside a transaction.atomic(savepoint=False)], even if it is caught by user code, the changes are rolled back."

Always a fan of improving the clarity of the docs :)
 
[and I blame your expectations of transactions on MySQL... who seem to delight in ignoring the sane behavior :)]

--
Curtis

Aymeric Augustin

unread,
Sep 20, 2013, 2:17:58 AM9/20/13
to django-d...@googlegroups.com
On 19 sept. 2013, at 22:47, Hannu Krosing <ha...@krosing.net> wrote:

> Actually you can use subtransactions in postgresql if you want to manage
> failed queries yourself

That's exactly what transaction.atomic does, on all supported databases.

I'm sorry if I sound like I need a transactions 101 course; I actually spent a few dozen hours working on this before rewriting Django's transaction management ;-)

--
Aymeric.

Aymeric Augustin

unread,
Sep 20, 2013, 2:37:38 AM9/20/13
to django-d...@googlegroups.com

On 19 sept. 2013, at 23:38, Anssi Kääriäinen <anssi.ka...@thl.fi> wrote:

> Here, another_obj.save() will succeed. But the transaction will be *always* rolled back, silently. I don't see any good reason not to error out when using a connection which is going to be rolled back in always.

As far as I can tell that's the actual proposal in your email, and I'm OK with trying to to raise exceptions when making a query and get_rollback returns True. I'm not sure it's doable; patch welcome ;-)

--
Aymeric.



Aymeric Augustin

unread,
Sep 20, 2013, 2:47:51 AM9/20/13
to django-d...@googlegroups.com, django-d...@googlegroups.com
Yes, I agree, that why I suggested filing a documentation ticket in my first answer.

--
Aymeric.

Richard Ward

unread,
Sep 20, 2013, 4:32:37 AM9/20/13
to django-d...@googlegroups.com
So you don't think that "one step of an atomic operation failing" is reason to fail the atomic operation?  Doesn't sound intuitive to me.   
...
The concept of an "atomic" operation, even outside the context of DBMSs, is pretty much the consistent throughout CS.

If you are expecting the step could fail, then you should wrap it in a sub-transaction - the very reason they exist.  Again, you have an atomic operation [the insert] that may fail, so you want to gracefully handle it.

In my defense... as far as I was concerned I did gracefully handle it - I caught the exception in python. If my desired outcome is: ((A or B) then C) or ((A unless IntegrityError) then C) and A and B are atomic, then I don't see the need for a sub-transaction (If the DB allows such behavior) - I'm not firing A, B and C at the database blindly, I'm reacting to what the DB tells me about each statement's execution before sending the next. Either my whole block of code is executed as I intended, or there is a rollback. If I'm sending a batch of commands to the DB I can see how MySQL's behavior will cause pain, but given that I'm always using another language to send commands one by one to the db with the ability for my code to react, I'd not previously noticed that MySQL's behavior was odd.

Anyway, my point was that I think others are not unlikely to try this. If they do, its not so easy to tell what has gone wrong. I appreciate that Django can't deal with people trying really hard to break things, I just don't reckon I was trying that hard :)

I'll open a ticket about the docs later today.

Aymeric Augustin

unread,
Sep 20, 2013, 4:02:58 PM9/20/13
to django-d...@googlegroups.com
On 20 sept. 2013, at 10:32, Richard Ward <ric...@richard.ward.name> wrote:

> I'll open a ticket about the docs later today.

I just took a few minutes to write a patch:
https://github.com/django/django/commit/4db2752e28668ed8826b770ef2ed26e8c1562db6

Let me know if you think it can be improved.

--
Aymeric.




Richard Ward

unread,
Sep 21, 2013, 9:53:50 AM9/21/13
to django-d...@googlegroups.com
I'll open a ticket about the docs later today. 
 
 
Let me know if you think it can be improved.

That looks nice and explicit to me.

I'm OK with trying to to raise exceptions when making a query and get_rollback returns True. I'm not sure it's doable; patch welcome ;-) 


It raises the same error that PostgreSQL would raise if you try and do any SQL when in a transaction with 'needs_rollback' is True. I've tested this with PostgreSQL, MySQL and SQLite3, and they all now behave in the same way that PostgreSQL behave currently (I'm running the test suite at the moment, its taking a while).

*Any* exception bubbling from atomic(savepoint=False) block will cause this issue

So for example:

with transaction.atomic():
    try:
        with transaction.atomic(savepoint=False):
            MyModel(foo=1).save()
            raise Exception("whatever")
    except Exception:
        pass
    MyModel(foo=2).save()
print MyModel.objects.all()

Prints out '[]' (rather than raising an exception) with any database.

With the above patch an exception is instead raised when you try to do the second save.

However...
You say in your docs patch that savepoints are cheap, so what is transaction.atomic(savepoint=False) for? is it just for performance, or is more like an assertion that we are definitely in a transaction (or both?).

At present the decision to rollback or commit is based on whether there is a current exeption and whether needs_rollback is True. If instead this were just based on whether there is a current exception (getting rid of needs_rollback), then exceptions bubbling from inside a transaction.atomic(savepoint=False) would still cause a rollback, and catching an exception (hiding it from the context manager) would lead to a commit (or at least an attempt to commit). This would leave Django+PostgreSQL's behaviour unchanged, allow Django+MySQL/SQLite3 to be used like I was trying to originally (for better or worse), and it would mean the snippet I posted above would commit the outer transaction instead of silently rolling it back, which is what a casual reading of the snippet would suggest might happen (Removing the option for savepoint=False would have the same effect).

Richard.
Message has been deleted

Aymeric Augustin

unread,
Sep 21, 2013, 11:58:33 AM9/21/13
to django-d...@googlegroups.com
Le 21 sept. 2013 à 15:53, Richard Ward <daedal...@gmail.com> a écrit :

You say in your docs patch that savepoints are cheap

Truth be said, I haven't run benchmarks.

so what is transaction.atomic(savepoint=False) for? is it just for performance, or is more like an assertion that we are definitely in a transaction (or both?).

It's mostly for performance. Ask Anssi for details.

There a second, more practical, reason; read below.

At present the decision to rollback or commit is based on whether there is a current exeption and whether needs_rollback is True. If instead this were just based on whether there is a current exception (getting rid of needs_rollback), then exceptions bubbling from inside a transaction.atomic(savepoint=False) would still cause a rollback, and catching an exception (hiding it from the context manager) would lead to a commit (or at least an attempt to commit). This would leave Django+PostgreSQL's behaviour unchanged

You may be right. I'm not sure. This code is tricky. Such assertions routinely take more than 10 hours of work to confirm.

Removing the option for savepoint=False would have the same effect

It would have the drawback of breaking everyone's assertNumQueries because of the extra savepoints introduced by Django.

This would be very hostile to people porting large, well-tested code bases.

-- 
Aymeric (mobile).

Kääriäinen Anssi

unread,
Sep 21, 2013, 12:49:24 PM9/21/13
to django-d...@googlegroups.com
For the performance part: a simple model.save() is about 50% more expensive with savepoints. This time is used in the database.

In addition there are 3 network trips instead of one. This could add latency in some usecases.


-------- Original message --------
Subject: Re: Is "transaction.atomic" in 1.6 supposed to work this way?
From: Aymeric Augustin <aymeric.au...@polytechnique.org>
To: "django-d...@googlegroups.com" <django-d...@googlegroups.com>
CC:

Aymeric Augustin

unread,
Sep 21, 2013, 1:07:19 PM9/21/13
to django-d...@googlegroups.com
On 21 sept. 2013, at 15:53, Richard Ward <daedal...@gmail.com> wrote:

I'm OK with trying to to raise exceptions when making a query and get_rollback returns True. I'm not sure it's doable; patch welcome ;-) 


It raises the same error that PostgreSQL would raise if you try and do any SQL when in a transaction with 'needs_rollback' is True. I've tested this with PostgreSQL, MySQL and SQLite3, and they all now behave in the same way that PostgreSQL behave currently (I'm running the test suite at the moment, its taking a while).

I wrote a patch based on this idea and unfortunately it doesn't work:

This patch triggers a large number of failures in the tests, because it prevents some read queries which are currently allowed. You didn't notice because your patch only works when DEBUG = True and the test suite runs with DEBUG = False.

If we step back from the implementation, the issue here is that MySQL and Oracle implicitly add a savepoint before each query, and under some circumstances implicitly rollback to that savepoint. This behavior allows transaction management schemes that don't map well to a context manager / decorator API like Django's.  See also https://code.djangoproject.com/ticket/11156#comment:14 which has the same root cause.

I made a significant effort trying to implement the suggestions in this thread and didn't obtain any results. If someone still wants to change something in this area, please write a patch with tests; I will review it.

-- 
Aymeric.




Anssi Kääriäinen

unread,
Sep 22, 2013, 1:38:32 PM9/22/13
to django-d...@googlegroups.com


I would go for an approach where all queries in marked-for-rollback block are prevented. See https://github.com/akaariai/django/compare/ticket_21134

Note that this is the test case I am complaining about:

r1 = Reporter.objects.create(first_name='foo', last_name='bar')
with transaction.atomic():
     r2 = Reporter(first_name='foo', last_name='bar2', id=r1.id)
     try:
         r2.save(force_insert=True)
     except IntegrityError:
        r2.save(force_update=True)     
self.assertEqual(Reporter.objects.get(pk=r1.pk).last_name, 'bar2')

The last assert currently fails. There is no explicit use of savepoint=False. It is reasonable to expect that the code succeeds on databases that allow this coding pattern. So, as is, this API is bad.

If running read-only queries in marked-for-rollback blocks is absolutely wanted, then lets just switch model.save() and other similar ORM operations to use real savepoints. This is expensive for some use cases, but adding a flag to avoid that cost can be done later on. The tx API is harder to change after 1.6 release.

 - Anssi

Aymeric Augustin

unread,
Sep 22, 2013, 2:30:06 PM9/22/13
to django-d...@googlegroups.com
On 22 sept. 2013, at 19:38, Anssi Kääriäinen <anssi.ka...@thl.fi> wrote:

> There is no explicit use of savepoint=False. It is reasonable to expect that the code succeeds on databases that allow this coding pattern.

Yes, you're right…

I looked at your version of the patch:

1) I believe the low-level APIs (rollback and savepoint_rollback) shouldn't be aware of the high level API (atomic), eg. they shouldn't manipulate the needs_rollback flag. The user should explicitly manipulate the flag with set_rollback if he starts messing with the low level APIs.

I know the low-level APIs are calling `validate_no_atomic_block`, but that's a bit different. It's making a check and possibly raising an exception, not changing the internal state of the high level API.

2) You had to do many changes to the tests; it looks like some weren't very well written and your changes are justified. I'd like to double check. Thanks your the additional test cases, too.

I'll write an iteration of the patch tonight, hopefully we'll converge to something we both consider mergeable.

> So, as is, this API is bad.

I appreciate the nuance in this assessment ;-)

> If running read-only queries in marked-for-rollback blocks is absolutely wanted, then lets just switch model.save() and other similar ORM operations to use real savepoints. This is expensive for some use cases, but adding a flag to avoid that cost can be done later on. The tx API is harder to change after 1.6 release.

I'm afraid DBAs will lynch us if we do that.

--
Aymeric.

Anssi Kääriäinen

unread,
Sep 22, 2013, 2:38:14 PM9/22/13
to django-d...@googlegroups.com
On Sunday, September 22, 2013 9:30:06 PM UTC+3, Aymeric Augustin wrote:
On 22 sept. 2013, at 19:38, Anssi Kääriäinen <anssi.ka...@thl.fi> wrote:

> There is no explicit use of savepoint=False. It is reasonable to expect that the code succeeds on databases that allow this coding pattern.

Yes, you're right…

I looked at your version of the patch:

1) I believe the low-level APIs (rollback and savepoint_rollback) shouldn't be aware of the high level API (atomic), eg. they shouldn't manipulate the needs_rollback flag. The user should explicitly manipulate the flag with set_rollback if he starts messing with the low level APIs.

I know the low-level APIs are calling `validate_no_atomic_block`, but that's a bit different. It's making a check and possibly raising an exception, not changing the internal state of the high level API.


Yes, this seems like a good idea to me. Atomic.__exit__ should be able to handle this.

2) You had to do many changes to the tests; it looks like some weren't very well written and your changes are justified. I'd like to double check. Thanks your the additional test cases, too.

It must be said that some of the changes were written in bit of a haste - so double check is definitely welcome.

 - Anssi

Shai Berger

unread,
Sep 22, 2013, 2:38:10 PM9/22/13
to django-d...@googlegroups.com
On Sunday 22 September 2013 10:38:32 Anssi Kääriäinen wrote:
>
> I would go for an approach where all queries in marked-for-rollback block
> are prevented. See https://github.com/akaariai/django/compare/ticket_21134
>
> Note that this is the test case I am complaining about:
>
> r1 = Reporter.objects.create(first_name='foo', last_name='bar')
> with transaction.atomic():
> r2 = Reporter(first_name='foo', last_name='bar2', id=r1.id)
> try:
> r2.save(force_insert=True)
> except IntegrityError:
> r2.save(force_update=True)
> self.assertEqual(Reporter.objects.get(pk=r1.pk).last_name, 'bar2')
>
> The last assert currently fails. There is no explicit use of
> savepoint=False. It is reasonable to expect that the code succeeds on
> databases that allow this coding pattern. So, as is, this API is bad.
>
> If running read-only queries in marked-for-rollback blocks is absolutely
> wanted, then lets just switch model.save() and other similar ORM operations
> to use real savepoints. This is expensive for some use cases, but adding a
> flag to avoid that cost can be done later on. The tx API is harder to
> change after 1.6 release.
>
Maybe I'm a little too late to this party, but for all non-postgresql users,
this behavior (both current and Anssi's suggestion) would be very surprising.

I would take Anssi's suggestion another step forward -- or backward, depends
where you're looking from :-) -- stop marking transactions for rollback. Make
save() and associates use savepoints, only on PostgreSQL, so that, everywhere,
one may recover from database errors within the transaction.

Then, as Anssi suggests, we can use a flag to remove these savepoints for
performance when correctness has been established.

(I am intimately familiar with a large code base that is full of such recovery
attempts, mainly because PostgreSQL has never been a primary target; on
Sqlite, Oracle and SQL Server, these things work fine. I'm sure the case I know
isn't the only one).

My 2 cents,

Shai.



Aymeric Augustin

unread,
Sep 22, 2013, 3:18:24 PM9/22/13
to django-d...@googlegroups.com
On 22 sept. 2013, at 20:38, Shai Berger <sh...@platonix.com> wrote:

> I would take Anssi's suggestion another step forward -- or backward, depends
> where you're looking from :-) -- stop marking transactions for rollback. Make
> save() and associates use savepoints, only on PostgreSQL, so that, everywhere,
> one may recover from database errors within the transaction.

The extra queries for the savepoints will break all tests that use assertNumQueries
— well, all except assertNumQueries(0). How do you propose to deal with that.

> (I am intimately familiar with a large code base that is full of such recovery
> attempts, mainly because PostgreSQL has never been a primary target; on
> Sqlite, Oracle and SQL Server, these things work fine. I'm sure the case I know
> isn't the only one).

Is this code base relying on Django's transaction management? In other words
is it using `commit_on_success` or `atomic` to manage transactions?

--
Aymeric.




Florian Apolloner

unread,
Sep 22, 2013, 3:59:02 PM9/22/13
to django-d...@googlegroups.com
On Sunday, September 22, 2013 8:38:10 PM UTC+2, Shai Berger wrote:
I would take Anssi's suggestion another step forward  -- or backward, depends
where you're looking from :-) -- stop marking transactions for rollback. Make
save() and associates use savepoints, only on PostgreSQL, so that, everywhere,
one may recover from database errors within the transaction.

I am not really in favor of that for a simple reason: I like to use third party apps and I'd like them to work with every database; in this case the strictest one is postgres, so that's the API Django should choose. I also don't think it's a good idea to just try/except statements and then continue if the database allows it; since database errors raised by Django (that is, those suggested by the dbapi) are not granular enough to be useful. This has bitten me and is still causing headaches in django-taggit, since a raised IntegrityError doesn't neccessarily affect the field you thought it would (ideas welcome)… All in all, whenever I see people catching database errors and continue afterwards it almost always has the tendency to blow up in some odd ways.

Michael Manfre

unread,
Sep 22, 2013, 4:48:22 PM9/22/13
to django-d...@googlegroups.com
On Sun, Sep 22, 2013 at 3:18 PM, Aymeric Augustin <aymeric....@polytechnique.org> wrote:
On 22 sept. 2013, at 20:38, Shai Berger <sh...@platonix.com> wrote:
The extra queries for the savepoints will break all tests that use assertNumQueries
— well, all except assertNumQueries(0). How do you propose to deal with that.

When changes are made to Django in such a way that existing tests no longer accurately test the expected behavior, I propose that the tests are fixed.

I'd argue that many of the tests that use assertNumQueries are poorly written and should be viewed as already broken. This opinion is based upon my efforts of making django-mssql pass the test suite.  assertNumQueries is backend specific, but often is not guarded with a vendor or DatabaseFeature check. The exception being tests that ensure no queries are executed because that behavior is not backend specific

When the savepoint changes break the fragile tests, that will be a good thing because they will need to be rewritten (hopefully with proper guards).

Regards,
Michael Manfre

Aymeric Augustin

unread,
Sep 22, 2013, 4:55:15 PM9/22/13
to django-d...@googlegroups.com
Hi Michael,

On 22 sept. 2013, at 22:48, Michael Manfre <mma...@gmail.com> wrote:

On Sun, Sep 22, 2013 at 3:18 PM, Aymeric Augustin <aymeric....@polytechnique.org> wrote:

The extra queries for the savepoints will break all tests that use assertNumQueries
— well, all except assertNumQueries(0). How do you propose to deal with that.

When changes are made to Django in such a way that existing tests no longer accurately test the expected behavior, I propose that the tests are fixed.

I'd argue that many of the tests that use assertNumQueries are poorly written and should be viewed as already broken. This opinion is based upon my efforts of making django-mssql pass the test suite.

Yes, I agree with that as far as Django's own test suite is concerned.

I was more worried about our end users' test suites.

Currently, if you don't use transactions, you can ignore the new transaction management entirely. If all your assertNumQueries start failing, it's another story.

-- 
Aymeric.

Shai Berger

unread,
Sep 22, 2013, 5:03:43 PM9/22/13
to django-d...@googlegroups.com
On Sunday 22 September 2013 12:59:02 Florian Apolloner wrote:
> On Sunday, September 22, 2013 8:38:10 PM UTC+2, Shai Berger wrote:
> > I would take Anssi's suggestion another step forward -- or backward,
> > depends
> > where you're looking from :-) -- stop marking transactions for rollback.
> > Make
> > save() and associates use savepoints, only on PostgreSQL, so that,
> > everywhere,
> > one may recover from database errors within the transaction.
>
> I am not really in favor of that for a simple reason: I like to use third
> party apps and I'd like them to work with every database; in this case the
> strictest one is postgres, so that's the API Django should choose.

The proposal will make things work consistently across backends -- PostgreSQL
included (that's the point of "use savepoints only on PostgreSQL")...

> I also don't think it's a good idea to just try/except statements and then
> continue if the database allows it; since database errors raised by Django
> (that is, those suggested by the dbapi) are not granular enough to be
> useful. This has bitten me and is still causing headaches in django-taggit,
> since a raised IntegrityError doesn't neccessarily affect the field you
> thought it would (ideas welcome)… All in all, whenever I see people
> catching database errors and continue afterwards it almost always has the
> tendency to blow up in some odd ways.

... but other arguments against it stand.

It is probably not a very good idea.

Shai.

Shai Berger

unread,
Sep 22, 2013, 5:09:06 PM9/22/13
to django-d...@googlegroups.com
On Sunday 22 September 2013 22:55:15 Aymeric Augustin wrote:
> Hi Michael,
>
> On 22 sept. 2013, at 22:48, Michael Manfre <mma...@gmail.com> wrote:
> > On Sun, Sep 22, 2013 at 3:18 PM, Aymeric Augustin
> > <aymeric....@polytechnique.org> wrote:
> >
> > The extra queries for the savepoints will break all tests that use
> > assertNumQueries — well, all except assertNumQueries(0). How do you
> > propose to deal with that.
> >
> > When changes are made to Django in such a way that existing tests no
> > longer accurately test the expected behavior, I propose that the tests
> > are fixed.
> >
> > I'd argue that many of the tests that use assertNumQueries are poorly
> > written and should be viewed as already broken. This opinion is based
> > upon my efforts of making django-mssql pass the test suite.
>
> Yes, I agree with that as far as Django's own test suite is concerned.
>
> I was more worried about our end users' test suites.
>

If this is agreed, then we should add an assertNoQueries, deprecate the
existing assertNumQueries, and add a new assertNumQueries that takes a backend
specification, shouldn't we? This is regardless of what we do with @atomic.

> Currently, if you don't use transactions, you can ignore the new transaction
> management entirely. If all your assertNumQueries start failing, it's
> another story.

The code base I mentioned earlier relies mostly on TransactionMiddleware, but
does have some explicit commit_on_success's etc. (it's Django 1.4 based, so no
@atomic or anything like it).

Shai.

Aymeric Augustin

unread,
Sep 22, 2013, 5:27:35 PM9/22/13
to django-d...@googlegroups.com
On 22 sept. 2013, at 23:09, Shai Berger <sh...@platonix.com> wrote:

> The code base I mentioned earlier relies mostly on TransactionMiddleware, but
> does have some explicit commit_on_success's etc. (it's Django 1.4 based, so no
> @atomic or anything like it).

I'm genuinely interested in making it as easy as possible to switch to the new
transaction management.

If you weren't using the transaction middleware, in 1.6 you would be in
autocommit mode and we wouldn't have this discussion at all.

Once you switch to ATOMIC_REQUESTS, assuming we go with the current
plan — and it still looks like the best compromise at this point — you'll have
to insert an extra atomic block to guarantee a clean rollback.

try:
do_stuff_in_db()
except DatabaseError:
do_more_stuff_in_db()

must be changed to:

try:
with transaction.atomic():
do_stuff_in_db()
except DatabaseError:
do_more_stuff_in_db()

As explained in the documentation, this pattern has the advantage of delimiting
clearly what gets rolled back when an exception happens, even if the code is
extended or refactored. It's explicit and I like it.

Of course the downside is that you'll have to go through all places where you
catch DatabaseError or a subclass and edit them as above. It sounds doable,
and a good tradeoff for eliminating the risk of nesting commit_on_succes
accidentally or otherwise getting burnt by the old transaction management.

--
Aymeric.




Michael Manfre

unread,
Sep 22, 2013, 7:53:26 PM9/22/13
to django-d...@googlegroups.com
On Sun, Sep 22, 2013 at 5:09 PM, Shai Berger <sh...@platonix.com> wrote:
If this is agreed, then we should add an assertNoQueries, deprecate the
existing assertNumQueries, and add a new assertNumQueries that takes a backend
specification, shouldn't we? This is regardless of what we do with @atomic.

assertNumQueries is fine as is, it's just been used poorly. I think passing something to identify the backend would lead to a more awkward usage and tests that assert nothing depending on the backend. It's better to have the test fail in those situations instead of assert nothing.


Anssi Kääriäinen

unread,
Sep 23, 2013, 1:31:45 AM9/23/13
to django-d...@googlegroups.com
And, if savepoints need to be avoided for some reason you can use:

try:
do_stuff_in_db()
except DatabaseError:
connection.set_rollback(False)
do_more_stuff_in_db(

This will lead to the code working as it has worked before. The
set_rollback() method is documented here:
https://docs.djangoproject.com/en/dev/topics/db/transactions/#django.db.transaction.set_rollback

And, please, do not say this is about DatabaseErrors. That is not true.
I added a couple of tests to demonstrate this:
https://github.com/akaariai/django/commit/530ba8ae599c9510f9dc6433658f88b1988754d5.
The added tests fail when patched, but they would fail even worse on
master - the calls to save() / create() in exception blocks will
succeed, but the operations are silently rolled back. I don't think
Django 1.6 should be released with this behaviour.

- Anssi
Reply all
Reply to author
Forward
0 new messages