Deadlock with synchronized Sagas + Quartz


Steven Grimm

Feb 25, 2016, 1:02:29 AM
to Axon Framework Users
When I throw a bunch of load at my system, I'm seeing deadlocks
involving the Quartz TRIGGER_ACCESS lock in the database and the lock on
a Saga instance that implements synchronized event handling. The actual
execution flow in my application is more complicated than this, but
here's what I think is happening. I'm using PostgreSQL and synchronous
event and command delivery on a single node. All the interaction with
Quartz is via an Axon EventScheduler instance.

Thread A: Request handler starts a UnitOfWork
Thread A: Event is published
Thread A: Request handler commits its UnitOfWork
Thread A: Event is handled by Saga 1
Thread A: Saga 1 schedules an event; the Quartz TRIGGER_ACCESS lock is
acquired
Thread A: Saga 1 sends a command which ends up causing another event to
be published

Thread B: Some other event is handled by Saga 2 in a different UnitOfWork
Thread B: Saga 2 schedules an event; Quartz attempts to lock
TRIGGER_ACCESS but blocks because it's held by thread A's transaction

Thread A: The event from the command handler needs to be handled by Saga
2; Axon tries to acquire Saga 2's lock but blocks because the lock is
held by thread B

Then additional threads block on the two locks in question and the
application grinds to a halt.
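
The sequence above is a lock-order inversion: thread A holds TRIGGER_ACCESS and wants Saga 2's lock, while thread B holds Saga 2's lock and wants TRIGGER_ACCESS. A minimal plain-Java sketch of that shape follows. It does not use Quartz or Axon; the two `ReentrantLock` objects and all class and method names are hypothetical stand-ins, and `tryLock` with a timeout is used only so the demo reports the stall instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderInversionDemo {

    /** Takes `first`, waits until the peer holds its own first lock, then tries `second`. */
    static boolean acquireBoth(ReentrantLock first, ReentrantLock second,
                               CountDownLatch bothHoldFirst, CountDownLatch bothTried) {
        first.lock();
        try {
            bothHoldFirst.countDown();
            bothHoldFirst.await();               // both threads now hold their first lock
            boolean got = second.tryLock(200, TimeUnit.MILLISECONDS);
            if (got) {
                second.unlock();
            }
            bothTried.countDown();
            bothTried.await();                   // keep `first` held until the peer has tried too
            return got;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            first.unlock();
        }
    }

    /** Returns true when neither thread could get its second lock. */
    static boolean wouldDeadlock() {
        // Stand-ins only: not the real Quartz row lock or the real Axon saga lock.
        ReentrantLock triggerAccess = new ReentrantLock();
        ReentrantLock saga2Lock = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        CountDownLatch bothTried = new CountDownLatch(2);
        boolean[] gotSecond = new boolean[2];

        // "Thread A": TRIGGER_ACCESS first, then Saga 2's lock.
        Thread a = new Thread(() ->
                gotSecond[0] = acquireBoth(triggerAccess, saga2Lock, bothHoldFirst, bothTried));
        // "Thread B": Saga 2's lock first, then TRIGGER_ACCESS.
        Thread b = new Thread(() ->
                gotSecond[1] = acquireBoth(saga2Lock, triggerAccess, bothHoldFirst, bothTried));
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return !gotSecond[0] && !gotSecond[1];
    }

    public static void main(String[] args) {
        System.out.println("would deadlock: " + wouldDeadlock());
    }
}
```

With untimed `lock()` calls in place of `tryLock`, both threads would wait forever, which matches the stall described above.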

When I was using asynchronous Saga event delivery this wasn't an issue,
possibly because event handlers don't share transactions. Once the next
Axon release comes out with the fix for connection management in async
Sagas, I can switch back to asynchronous mode, but it'd be nice to
figure out how to get the application to work properly in either mode.

Hopefully that analysis is correct. Like I mentioned, I simplified the
execution flow here but I believe I captured the essence of it. As
always, it's totally possible I'm just doing something dumb.

Thanks!

-Steve

Allard Buijze

Feb 25, 2016, 6:06:17 AM
to Axon Framework Users
Hi Steven,

Unfortunately, this is a risk with the SimpleCommandBus and SimpleEventBus that is hard to solve. Normally, Axon detects deadlocks when all of the threads involved are blocked on Axon locks. However, since one of the locks here belongs to Quartz, Axon doesn't detect it.

Instead of Asynchronous Saga delivery, you can also consider using the AsynchronousCommandBus. That should work around the issue as well.
In the meantime, we're working on the issue and a release.
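
One way to picture why asynchronous command dispatch can break the cycle: the thread holding TRIGGER_ACCESS hands the command to another thread and returns, so it can release the lock instead of waiting on the saga lock while still holding it. The sketch below is a plain-Java model of that idea, not Axon's implementation. The executors, locks, and every name in it are hypothetical stand-ins, and it assumes that in the real application the handoff lets thread A's transaction commit and release the Quartz lock.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

class AsyncDispatchDemo {

    /** Returns true when thread A's command work and thread B both finish. */
    static boolean bothComplete() {
        // Stand-ins only: not the real Quartz row lock or the real Axon saga lock.
        ReentrantLock triggerAccess = new ReentrantLock();
        ReentrantLock saga2Lock = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        ExecutorService commandExecutor = Executors.newSingleThreadExecutor();
        ExecutorService handlers = Executors.newFixedThreadPool(2);

        // The command's downstream event needs Saga 2's lock; it runs on the command thread.
        Callable<Boolean> commandWork = () -> {
            if (!saga2Lock.tryLock(2, TimeUnit.SECONDS)) {
                return false;
            }
            saga2Lock.unlock();
            return true;
        };

        // "Thread A": holds TRIGGER_ACCESS, hands the command off, then releases the lock.
        Callable<Future<Boolean>> threadA = () -> {
            triggerAccess.lock();
            try {
                bothHoldFirst.countDown();
                bothHoldFirst.await();
                return commandExecutor.submit(commandWork); // returns without waiting
            } finally {
                triggerAccess.unlock();
            }
        };

        // "Thread B": holds Saga 2's lock, then needs TRIGGER_ACCESS.
        Callable<Boolean> threadB = () -> {
            saga2Lock.lock();
            try {
                bothHoldFirst.countDown();
                bothHoldFirst.await();
                if (!triggerAccess.tryLock(2, TimeUnit.SECONDS)) {
                    return false;
                }
                triggerAccess.unlock();
                return true;
            } finally {
                saga2Lock.unlock();
            }
        };

        try {
            Future<Future<Boolean>> a = handlers.submit(threadA);
            Future<Boolean> b = handlers.submit(threadB);
            return a.get().get() && b.get();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            handlers.shutdown();
            commandExecutor.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("both complete: " + bothComplete());
    }
}
```

Each thread still takes the same two locks, but no thread waits for its second lock while the other side is stuck waiting on it, so the circular wait never forms.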

Cheers,

Allard


Steven Grimm

Feb 25, 2016, 6:06:49 PM
to axonfr...@googlegroups.com
AsynchronousCommandBus seems to have done the trick. I'd been looking at DisruptorCommandBus previously but it didn't play nicely with the non-aggregate-based command handlers in my app. Thanks!

Can I suggest adding a paragraph or two about AsynchronousCommandBus to the documentation? I admit I didn't know it existed before this conversation.

-Steve
