Possible bug with transaction.on_commit

691 views
Skip to first unread message

barth...@infobart.com

unread,
May 16, 2016, 4:13:58 PM5/16/16
to Django users
Hi,

I believe that I encountered a bug or at least a serious limitation with transaction.on_commit in Django 1.9.x. Essentially, the on_commit hooks seem to be tied to the database connection instead of the transaction. This means that unrelated transactions may trigger on_commit hooks, which results in undesired execution order. Here is an example:

from django.db import transaction
from foobar.models import Model1, Model2


@transaction.atomic
def outer():
    print("START - OUTER")
    for i in range(4):
        f1(i)
    print("END - OUTER")


@transaction.atomic
def f1(i):
    model = Model1(description="test {0}".format(i))
    model.save()
    transaction.on_commit(lambda: commit_hook(model, i))


def commit_hook(model, i):
    print("START - on_commit hook")
    f2(i)
    print("STOP - on_commit hook")


@transaction.atomic
def f2(i):
    print("CALLING F2")
    model = Model2(description="test {0}".format(i))
    model.save()


Some quick explanations: 

- outer is the outermost atomic block. f1 will only create a savepoint(). This works as expected, there is only one commit (trace erased for simplicity). 
- f2 is wrapped in an "outermost" atomic block and indeed, f2 will be called four times and there will be four commits (trace erased for simplicity).
- f1 register a commit_hook. I expect that at the end of outer(), all commit hooks will be called.


The expected trace is:
START - OUTER
END - OUTER
START - on_commit hook
CALLING F2
STOP - on_commit hook
START - on_commit hook
CALLING F2
STOP - on_commit hook
START - on_commit hook
CALLING F2
STOP - on_commit hook
START - on_commit hook
CALLING F2
STOP - on_commit hook

The actual trace (f2 is triggering the on_commit hooks registered in f1/outer):
START - OUTER
END - OUTER
START - on_commit hook
CALLING F2
START - on_commit hook
CALLING F2
START - on_commit hook
CALLING F2
START - on_commit hook
CALLING F2
STOP - on_commit hook
STOP - on_commit hook
STOP - on_commit hook
STOP - on_commit hook

I wanted to know if someone encountered this issue or if I am misunderstanding on_commit before opening a ticket.

Regards,
Bart

Stephen J. Butler

unread,
May 16, 2016, 4:27:45 PM5/16/16
to django...@googlegroups.com
I believe that's as designed. There's this part in the documentation of "Database transactions":

Callbacks are not run until autocommit is restored on the connection following the commit (because otherwise any queries done in a callback would open an implicit transaction, preventing the connection from going back into autocommit mode).

So on_commit is meant as a connection level hook, not a transaction level one.

--
You received this message because you are subscribed to the Google Groups "Django users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to django-users...@googlegroups.com.
To post to this group, send email to django...@googlegroups.com.
Visit this group at https://groups.google.com/group/django-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/django-users/5ddb1ee0-c27f-4cee-b28a-d8cca6d1cf5f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

barth...@infobart.com

unread,
May 16, 2016, 8:29:30 PM5/16/16
to Django users
The callbacks are indeed run when autocommit is restored, but they are not executed one after the other, i.e., they are not **all** executed when autocommit is restored. The relevant code in Django:

    # Called by set_autocommit(...)
    def run_and_clear_commit_hooks(self):
        self.validate_no_atomic_block()
        try:
            while self.run_on_commit:
                sids, func = self.run_on_commit.pop(0)
                func()
        finally:
            self.run_on_commit = []

Although this looks like all callbacks are executed in this loop, if a new transaction occurs in a callback, the next transaction will in turn call set_autocommit and the remaining callbacks will be executed from the latter transaction creating a stack of calls instead of a sequence of calls.

As a degenerate example, if both f1/outer() and f2() register commit hooks, hooks registered by f1/outer() will be executed by f2() and hooks registered by f2() will be executed by f1/outer(). That seems quite counter-intuitive.

Another example: if I replace the commit_hook function in the previous code example by the following code, a deadlock will occur (this is the reason we discovered this unintuitive behavior):

def commit_hook(model, i):
    lock.acquire()
    f2(i) # will call commit_hook even if it was registered by f1()
    lock.release()

I understand the implementation of commit hooks is not straightforward, but if this is really the intended behavior, the documentation should come with a more severe warning.

Bart

Carl Meyer

unread,
May 17, 2016, 1:52:58 AM5/17/16
to django...@googlegroups.com
Hi Bart,

On 05/16/2016 02:10 PM, barth...@infobart.com wrote:
> I believe that I encountered a bug or at least a serious limitation with
> transaction.on_commit in Django 1.9.x. Essentially, the on_commit hooks
> seem to be tied to the database connection instead of the transaction.
> This means that unrelated transactions may trigger on_commit hooks,
> which results in undesired execution order.
[...]
> I wanted to know if someone encountered this issue or if I am
> misunderstanding on_commit before opening a ticket.

I am the author of on_commit, and I think the behavior you have
encountered is a bug that should be fixed. Doing additional database
work in `on_commit` callbacks wasn't really a use case I had in mind in
the design; I made sure it basically worked, but clearly didn't explore
it sufficiently in cases of multiple registered hooks.

Storing the on_commit state on the connection is not really optional,
since that's where _all_ transaction-related state is stored, but I
think it should still be possible to fix this. Perhaps by having
`run_and_clear_commit_hooks` copy the list of hooks into a local
variable and immediately clear the state on the connection before it
calls any of the hooks? Not entirely sure, need to add a failing test
and play with it a bit.

Thanks for catching and reporting this. If you'd be willing to file a
ticket in Trac, that'd be great.

Carl

signature.asc

barth...@infobart.com

unread,
May 17, 2016, 11:01:41 AM5/17/16
to Django users
Hi Carl, thanks for your input.

I opened a ticket (https://code.djangoproject.com/ticket/26627) and submitted a pull request based on your suggestion: https://github.com/django/django/pull/6617

Bart
Reply all
Reply to author
Forward
0 new messages