IMO this behaviour makes code that looks valid (to me) not work in the expected way.
In case its relevant I'm using InnoDB tables on MySQL.
So Is there a bug? Or am I using this incorrectly, or misunderstanding something?
Thanks,
Richard
--
You received this message because you are subscribed to the Google Groups "Django developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-develop...@googlegroups.com.
To post to this group, send email to django-d...@googlegroups.com.
Visit this group at http://groups.google.com/group/django-developers.
For more options, visit https://groups.google.com/groups/opt_out.
So what is the problem here? I assume it is one of:
- 'innocent_looking_function' is badly written: it should not be catching IntegrityErrors under any circumstances (though that seems like a valid thing to do), it should instead use something like get_or_create.
- 'innocent_looking_function' should have 'with transaction.atomic():' just inside the 'try' block, and presumably so should every other bit of code where an IntegrityError is caught. But what if 'innocent_looking_function' also gets used elsewhere that may not be inside an atomic block?
- transaction.atomic is in some way buggy / oddly designed.
IMO this behaviour makes code that looks valid (to me) not work in the expected way.
In case its relevant I'm using InnoDB tables on MySQL.
So Is there a bug? Or am I using this incorrectly, or misunderstanding something?
There is a definite disparency about what the code does and what the docs say:
"""
If the block of code is successfully completed, the changes are committed to the database. If there is an exception, the changes are rolled back.
"""
If an exception is catched inside the block, then there should be no rollback.
With PostgreSQL you can't continue to use the transaction, but in other databases you can.
For these other databases allowing use of the transaction, yet still rolling back it later on seems to be a footgun - why are you allowed to continue to use the transaction if a rollback is going to happen anyways?
This is what nested atomics do. In effect savepoints are just creating another transaction inside the existing one (if my memories of Aymeric's talk are correct).
2ndQuadrant Nordic OÜ
Nonetheless, in my opinion, the root cause of the perceived misbehavior is that user code (not Django code) catches IntegrityError inside an atomic block, preventing __exit__ from detecting that a database exception has happened.
DatabaseErrors must be caught outside the atomic block, as documented. Otherwise transaction.atomic will fail in various interesting and database dependent ways. It's harder to diagnose on databases that don't fully enforce transactional behavior ie. all but PostgreSQL.
> If I recall correctly there is transaction.set_rollback(False) which can be used to remove forced rollback.
This is a private API for a reason. Someone deeply familiar with the implementation details of transaction management in Django 1.6 and with an esoteric transaction management system might find this private API useful but it's very obviously not the answer to the original question. Please, everyone, don't do this.
> In general, there are three ways to respond to errors inside transactions:
> 1. allow usage of the TX (MySQL, SQLite etc), allow user to decide commit/rollback
> 2. forbid usage of the TX (PostgreSQL), force it to be rolled back.
> 3. allow usage of the TX, but force rollback (Django)
You're comparing apples to oranges here. If Django is backed by PostgreSQL, how can Django allow using the transaction if PostgreSQL forbids it?
> To me it seems explicit error when using connection in "forced to rollback" state is better than allowing saving more data, then silently rolling back the transaction. As you said this is useless.
Django doesn't allow saving more data in a broken transaction. If a DatabaseError occurs, execution jumps straight to __exit__ which sees an exception and triggers a rollback.
Of course, you can disregard the documentation and break this expectation by catching the exception. But there's no way around this in Python. A context manager cannot prevent catching exceptions.
> To be clear, this is about exceptions (database or other) in atomic(savepoint=False) blocks, not about caught IntegrityErrors - Django will allow you to continue and commit the TX in the caught IntgerityError case assuming the DB allows that.
I designed this part of Django to forbid continuing and committing the transaction. The reason is cross-database compatibility. I had to enforce the behavior of the most restrictive database, which is also the most correct, namely PostgreSQL.
Sure, you can do virtually anything by disregarding the docs and abusing private APIs, but I wouldn't describe that as "Django will allow you to…"
If I still haven't convinced you that I know what I'm talking about, here's an example with the exact same problem but without atomic(savepoint=False).
with transaction.atomic:
try:
do_stuff_in_db() # raises DatabaseError
except DatabaseError:
pass
do_more_stuff_in_db() # works only on lax databases eg. MySQL
In this example, it's totally possible that the transaction.atomic block will end up with a rollback even if do_more_stuff_in_db() succeed. Why? Because the commit is likely to fail if a statement failed during the transaction, and that will result in an implicit rollback!
At the highest level, my design is based on avoiding implicit rollbacks because explicit is better than implicit. But you can still force one if you try hard enough.
I don't think that what I was trying to do (continue using a transaction after an insert failed) was too outlandish. I'm not demanding that Django should allow me to do it (tho that would be nice), but I can see other people trying to do it - especially if they are mostly used to MySQL, which allows it. It may not be correct, but it is (IMO) intuitive - why should a transaction have to be considered completely ruined just because an insert failed, when I expected that it may happen? (I don't need an answer to that, just explaining my/others thought process).
If Django must enforce PostgreSQL style behavior, then an exception at some point telling me off for my bad behavior would be useful. IIUC correctly I'd get that exception with Django+PostgreSQL but I don't get any with Django+MySQL.
Either way I think the docs could be improved: "Wrapping atomic in a try/except block allows for natural handling of integrity errors" is not the same as "DatabaseErrors must be caught outside the atomic block". Also "If the block of code is successfully completed, the changes are committed to the database. If there is an exception, the changes are rolled back" implies to me that the transaction will be rolled back iff an exception causes the stack to unwind past transaction.atomic, but the intent is "If there is any DatabaseError [or any other exception raised inside a transaction.atomic(savepoint=False)], even if it is caught by user code, the changes are rolled back."
So you don't think that "one step of an atomic operation failing" is reason to fail the atomic operation? Doesn't sound intuitive to me.
...
The concept of an "atomic" operation, even outside the context of DBMSs, is pretty much the consistent throughout CS.If you are expecting the step could fail, then you should wrap it in a sub-transaction - the very reason they exist. Again, you have an atomic operation [the insert] that may fail, so you want to gracefully handle it.
I'll open a ticket about the docs later today.
Let me know if you think it can be improved.
I'm OK with trying to to raise exceptions when making a query and get_rollback returns True. I'm not sure it's doable; patch welcome ;-)
*Any* exception bubbling from atomic(savepoint=False) block will cause this issue
You say in your docs patch that savepoints are cheap
so what is transaction.atomic(savepoint=False) for? is it just for performance, or is more like an assertion that we are definitely in a transaction (or both?).
At present the decision to rollback or commit is based on whether there is a current exeption and whether needs_rollback is True. If instead this were just based on whether there is a current exception (getting rid of needs_rollback), then exceptions bubbling from inside a transaction.atomic(savepoint=False) would still cause a rollback, and catching an exception (hiding it from the context manager) would lead to a commit (or at least an attempt to commit). This would leave Django+PostgreSQL's behaviour unchanged
Removing the option for savepoint=False would have the same effect
I'm OK with trying to to raise exceptions when making a query and get_rollback returns True. I'm not sure it's doable; patch welcome ;-)How about this: https://github.com/RichardOfWard/django/commit/cb46c75db275db59b54511c090286255bd9cc46dIt raises the same error that PostgreSQL would raise if you try and do any SQL when in a transaction with 'needs_rollback' is True. I've tested this with PostgreSQL, MySQL and SQLite3, and they all now behave in the same way that PostgreSQL behave currently (I'm running the test suite at the moment, its taking a while).
r1 = Reporter.objects.create(first_name='foo', last_name='bar') |
with transaction.atomic(): |
r2 = Reporter(first_name='foo', last_name='bar2', id=r1.id) |
try: |
r2.save(force_insert=True) |
except IntegrityError: |
r2.save(force_update=True) |
self.assertEqual(Reporter.objects.get(pk=r1.pk).last_name, 'bar2') The last assert currently fails. There is no explicit use of savepoint=False. It is reasonable to expect that the code succeeds on databases that allow this coding pattern. So, as is, this API is bad. |
If running read-only queries in marked-for-rollback blocks is absolutely wanted, then lets just switch model.save() and other similar ORM operations to use real savepoints. This is expensive for some use cases, but adding a flag to avoid that cost can be done later on. The tx API is harder to change after 1.6 release. - Anssi |
On 22 sept. 2013, at 19:38, Anssi Kääriäinen <anssi.ka...@thl.fi> wrote:
> There is no explicit use of savepoint=False. It is reasonable to expect that the code succeeds on databases that allow this coding pattern.
Yes, you're right…
I looked at your version of the patch:
1) I believe the low-level APIs (rollback and savepoint_rollback) shouldn't be aware of the high level API (atomic), eg. they shouldn't manipulate the needs_rollback flag. The user should explicitly manipulate the flag with set_rollback if he starts messing with the low level APIs.
I know the low-level APIs are calling `validate_no_atomic_block`, but that's a bit different. It's making a check and possibly raising an exception, not changing the internal state of the high level API.
2) You had to do many changes to the tests; it looks like some weren't very well written and your changes are justified. I'd like to double check. Thanks your the additional test cases, too.
I would take Anssi's suggestion another step forward -- or backward, depends
where you're looking from :-) -- stop marking transactions for rollback. Make
save() and associates use savepoints, only on PostgreSQL, so that, everywhere,
one may recover from database errors within the transaction.
On 22 sept. 2013, at 20:38, Shai Berger <sh...@platonix.com> wrote:— well, all except assertNumQueries(0). How do you propose to deal with that.
The extra queries for the savepoints will break all tests that use assertNumQueries
On Sun, Sep 22, 2013 at 3:18 PM, Aymeric Augustin <aymeric....@polytechnique.org> wrote:
The extra queries for the savepoints will break all tests that use assertNumQueries— well, all except assertNumQueries(0). How do you propose to deal with that.When changes are made to Django in such a way that existing tests no longer accurately test the expected behavior, I propose that the tests are fixed.I'd argue that many of the tests that use assertNumQueries are poorly written and should be viewed as already broken. This opinion is based upon my efforts of making django-mssql pass the test suite.
If this is agreed, then we should add an assertNoQueries, deprecate theexisting assertNumQueries, and add a new assertNumQueries that takes a backend
specification, shouldn't we? This is regardless of what we do with @atomic.