Sagas With Event Sourcing

Jonathan Matheus

Aug 22, 2010, 10:01:49 PM

to DDD/CQRS

I was wondering how others on this list implement sagas when using
event sourcing. I normally implement sagas using NServiceBus's Saga
implementation, but it's hard to test, hard to read, and feels heavy.
I was talking with a similarly named friend & he commented that he
uses the event store directly to implement sagas rather than
NServiceBus's implementation. Instead of going to the horse's mouth, I
thought I'd start up some good conversation on this list in case
anyone else was interested.

Some questions off the top of my head:

Does the workflow logic for sagas live in the domain or in the command
handler(s)?

Is the event store used to hold state (or context of state) between
steps in the saga?

How are timeouts handled?

What's normally used for correlation? AR id?

If using an example would help, let's use the following:
If a group member wants to send an invitation to join the group to a
non-member, it must be approved by the group's owner.

Commands:
InviteUserToGroup { InvitationId, RequestorId, InviteeId, GroupId }
ApproveInvitationToJoinGroup { InvitationId, OwnerId }

Events:
UserInvitedToGroup { InvitationId, RequestorId, InviteeId, GroupId }

It seems to me that one way you could implement a saga with DDD &
event sourcing is by adding a bounded context that models the saga's
workflow. You would need to add an event for each state transition
(InvitationApproved, InvitationDenied, InvitationExpired) so that its
state could be stored in the event store. Once the workflow
successfully ends, an event would be published which would modify state
in the domain(s) that the workflow was for. This approach seems like
it would add explicit events for transitions that are usually implicit
in the NServiceBus Saga class approach, but it comes with all the
benefits of event sourcing.
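
A minimal sketch of the idea above, in Python rather than the .NET stack discussed in this thread. The class name InvitationWorkflow, the status strings, and the uncommitted list are assumptions made for illustration; only the event names come from the example. The workflow keeps no state of its own beyond what it can rebuild by replaying its events:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserInvitedToGroup:
    invitation_id: str
    requestor_id: str
    invitee_id: str
    group_id: str


@dataclass(frozen=True)
class InvitationApproved:
    invitation_id: str
    owner_id: str


@dataclass(frozen=True)
class InvitationDenied:
    invitation_id: str
    owner_id: str


@dataclass(frozen=True)
class InvitationExpired:
    invitation_id: str


class InvitationWorkflow:
    """Hypothetical workflow whose state is rebuilt from its event stream."""

    def __init__(self, history=()):
        self.status = "new"
        self.uncommitted = []  # events raised since loading, not yet stored
        for event in history:
            self._apply(event)

    def _apply(self, event):
        # Every state transition corresponds to one explicit, stored event.
        if isinstance(event, UserInvitedToGroup):
            self.status = "pending"
        elif isinstance(event, InvitationApproved):
            self.status = "approved"
        elif isinstance(event, InvitationDenied):
            self.status = "denied"
        elif isinstance(event, InvitationExpired):
            self.status = "expired"

    def _raise(self, event):
        self._apply(event)
        self.uncommitted.append(event)

    def invite(self, invitation_id, requestor_id, invitee_id, group_id):
        if self.status != "new":
            raise ValueError("invitation already exists")
        self._raise(UserInvitedToGroup(invitation_id, requestor_id, invitee_id, group_id))

    def approve(self, invitation_id, owner_id):
        if self.status != "pending":
            raise ValueError(f"cannot approve an invitation that is {self.status}")
        self._raise(InvitationApproved(invitation_id, owner_id))

    def expire(self, invitation_id):
        # An expiry that arrives after a decision is ignored, not an error.
        if self.status == "pending":
            self._raise(InvitationExpired(invitation_id))
```

Loading the workflow is then just InvitationWorkflow(stored_events). Whether a late expiry should be ignored or rejected is a design choice; the sketch ignores it.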

How do you guys do it?

Rinat Abdullin

Aug 24, 2010, 12:48:07 AM

to DDD/CQRS

> Does the workflow logic for sagas live in the domain or in the command handler(s)?

Sagas and activities negotiate long-running interactions between the
entities. The logic should probably live outside of the domain, in the
handlers.

> Is the event store to hold state (or context of state) between steps in the saga?

It seems to be so.

> How are timeouts handled?

Infrastructure should be able to do this, scheduling a check for the
future and invoking some action when the check fails. Schedules
obviously need to be persisted for the critical operations (if they
are to survive restarts).
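
As a rough illustration of persisted schedules, here is a sketch that stores timeouts in SQLite so they survive a restart. The TimeoutStore class, the table layout, and the idea of polling with take_due are all assumptions, not something any particular bus provides:

```python
import sqlite3


class TimeoutStore:
    """Hypothetical store that persists requested timeouts across restarts."""

    def __init__(self, path):
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS timeouts ("
            "saga_id TEXT NOT NULL, due_at REAL NOT NULL, message TEXT NOT NULL)"
        )
        self.db.commit()

    def schedule(self, saga_id, due_at, message):
        """Record that `message` should be delivered to the saga at `due_at`."""
        self.db.execute(
            "INSERT INTO timeouts (saga_id, due_at, message) VALUES (?, ?, ?)",
            (saga_id, due_at, message),
        )
        self.db.commit()

    def take_due(self, now):
        """Return and remove every timeout whose due time has passed."""
        rows = self.db.execute(
            "SELECT rowid, saga_id, message FROM timeouts "
            "WHERE due_at <= ? ORDER BY due_at",
            (now,),
        ).fetchall()
        self.db.executemany(
            "DELETE FROM timeouts WHERE rowid = ?", [(row[0],) for row in rows]
        )
        self.db.commit()
        return [(saga_id, message) for _, saga_id, message in rows]
```

A background loop would call take_due(time.time()) periodically and dispatch the results. Note that this sketch deletes a timeout before it is dispatched, so a crash in between loses it; deleting only after a successful dispatch would trade that for possible duplicates.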

> What's normally used for correlation? AR id?

AR id is probably used for routing the message to the appropriate
partition, where the AR resides. Correlation might be more specific,
depending on the nature of the saga - sender id (to deduplicate or
order messages in the cloud environment), conversation or request id,
etc.

Please bear in mind that practically I'm not that familiar with the
subject (this requires at least implementing saga/activity
infrastructure yourself and seeing how it evolves in the distributed
app). I'd love to hear thoughts and corrections.

Best regards,
Rinat

Werner Clausen

Jun 7, 2012, 3:37:17 AM

to ddd...@googlegroups.com

@Jonathan,

Did you ever arrive at some sort of "best practice" here? I know you have since become a busy NSB contributor, so perhaps you have completely turned away from ES and are only using sagas now?

Or perhaps you weren't completely engulfed by sagas and actually had/have some experience where the ES setup is also working as sagas?

--
Werner

Greg Young

Jun 7, 2012, 6:36:27 AM

to ddd...@googlegroups.com

Event sourced sagas can be quite useful in situations where you may want to change the flow of existing items, e.g. change the flow while they are running.

--
Le doute n'est pas une condition agréable, mais la certitude est absurde.

@yreynhout

Jun 7, 2012, 6:43:30 AM

to ddd...@googlegroups.com

I guess it depends on how you're using the term "saga". I use it to span commands (tx) across multiple aggregates, compensating when things don't work out (e.g. an innate race condition in the domain - yes, I know, there are ways around that, but design is ultimately about trade-offs - I've explored quite a few alternatives for not running into race conditions (from pessimistic locking to serialization to selectively making information available) - but I'm digressing).

In a "closed environment" (meaning where I've got control over the message contracts and participants) I found that emitting a SagaId (doesn't have to be called that) as a header - where any intermediate bus tech propagates it - or in the body/payload works great. Due to the very nature of the sagas I have to support, I usually don't run into the problem of multiple messages starting a saga, nor of messages that are involved in multiple sagas or that kick off other sagas later in their lifecycle [aside: I'm well aware that under other circumstances this would be a freaking disaster waiting to happen].

Basically a saga starts and a saga id is tagged onto the commands that are emitted [aside: some find this somewhat impure and may want to include the saga id in the initiating message - fine by me]. Saga handling stores the saga and actually puts the saga's messages on "the bus". Command handling picks up the requests, and any produced events carry inbound headers forward into outbound headers. Due to subscription, the saga handling picks up those messages, correlates them back to the saga it stored, and invokes the saga with the message to decide what to do next.
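
The header propagation described above could look roughly like the following sketch. Envelope, the "saga-id" header name, and SagaRouter are assumptions for illustration only; a real bus would carry the headers for you:

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Envelope:
    """Hypothetical message wrapper: a payload plus transport headers."""
    body: dict
    headers: dict = field(default_factory=dict)


def start_saga(command_body):
    """Tag the first command emitted by a new saga with a fresh saga id."""
    return Envelope(body=command_body, headers={"saga-id": str(uuid.uuid4())})


def handle_command(inbound, make_events):
    """Command handling: copy inbound headers onto every produced event."""
    return [
        Envelope(body=event, headers=dict(inbound.headers))
        for event in make_events(inbound.body)
    ]


class SagaRouter:
    """Saga handling: correlate an incoming event back to its stored saga."""

    def __init__(self):
        self.sagas = {}  # saga id -> messages delivered to that saga so far

    def register(self, envelope):
        self.sagas[envelope.headers["saga-id"]] = []

    def dispatch(self, envelope):
        saga_id = envelope.headers.get("saga-id")
        if saga_id not in self.sagas:
            return False  # not part of any saga this router knows about
        self.sagas[saga_id].append(envelope.body)
        return True
```

Because the command handler copies headers blindly, it needs no knowledge of sagas at all, which is what makes this workable only when you control every participant.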

As for event storage for sagas: yes, I use it, but if you've got a lot of messages involved in a saga, you'd better start accepting that things might run a bit slower as time and the number of messages related to the saga progress (snapshotting can help for sure). State-based sagas (also achievable with sagas that snapshot each time a message has been handled) might be a bit faster. All in all this is an edge case though, so don't get too hung up on it.

HTH,

Yves.

@yreynhout

Jun 7, 2012, 6:56:43 AM

to ddd...@googlegroups.com

Forgot to answer your actual questions #smacksforehead.

So yes, these kind'o sagas live inside the BC (haven't had one that got started by another BC, but I'm sure it's in the pipeline). They get stored in a saga store, which is a bit different (at least it was at the time in my mind - but my ideas have changed considerably since) from what I've got in an eventstore. I'm storing interactions (much like in an eventstore you'd store the command alongside the event(s)): SagaInteractionRecord { Id, Version, InputMessage, OutputMessages }. Replay is done using the input message (aggregates typically do replay using the output messages).
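
A minimal sketch of replaying from input messages, assuming a record shaped like the SagaInteractionRecord above. ApprovalSaga and the command names RequestOwnerApproval and AddUserToGroup are hypothetical, and a plain list stands in for the saga store:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SagaInteractionRecord:
    id: str
    version: int
    input_message: dict
    output_messages: tuple


class ApprovalSaga:
    """Hypothetical saga: decides what to send next for each input message."""

    def __init__(self):
        self.state = "waiting"

    def handle(self, message):
        invitation_id = message.get("invitation_id")
        if message["type"] == "UserInvitedToGroup" and self.state == "waiting":
            self.state = "awaiting_approval"
            return ({"type": "RequestOwnerApproval", "invitation_id": invitation_id},)
        if message["type"] == "InvitationApproved" and self.state == "awaiting_approval":
            self.state = "done"
            return ({"type": "AddUserToGroup", "invitation_id": invitation_id},)
        return ()


def record(saga, saga_id, store, message):
    """Handle a live message and append the interaction to the store."""
    outputs = saga.handle(message)
    store.append(SagaInteractionRecord(saga_id, len(store) + 1, message, outputs))
    return outputs  # the caller puts these on the bus


def replay(records):
    """Rebuild saga state by re-feeding inputs; outputs are not re-sent."""
    saga = ApprovalSaga()
    for rec in sorted(records, key=lambda r: r.version):
        saga.handle(rec.input_message)
    return saga
```

Replaying inputs only works if handle is deterministic; the stored output messages are then useful for auditing or for rebuilding views rather than for rehydration.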

I have no need for timeouts at this point, but I would imagine this to be no different from what a state-based saga would do if it were coupled to a bus WITHOUT timeout management (Quartz, cron, whatever).

Werner Clausen

Jun 12, 2012, 7:13:10 AM

to ddd...@googlegroups.com

Would it be a poor design decision to take that thought to the next level, where you generally wouldn't distinguish between a saga and an aggregate? So that the "if-then-else" logic of a typical saga would be part of your aggregate logic instead?

--
Werner

Jonathan Matheus

Jun 13, 2012, 5:27:59 PM

to ddd...@googlegroups.com

Werner,

Sorry I'm late for the party. To answer your question, "Did you ever arrive at some sort of 'best practice' here?", the answer is not really. Mostly because I haven't had the time to implement the infrastructure needed to get ES sagas working as frictionlessly as the state-based sagas that are built into NSB. I came to the same conclusion about not distinguishing between aggregates and sagas for the purpose of hydrating an event sourced model. In most cases, the real differences between the two are the programming paradigm used (State Machine pattern vs traditional domain objects) and the resulting output messages (events vs commands).

One piece of infrastructure needed to get this up and running with the framework I use is correlating messages to saga instances. This is something that NSB does well. One of the things that you lose going with event sourced sagas is correlating messages to saga instances using metadata rather than a simple correlation id. In this case you need to rely on a separate consistent view model or rely on snapshots. Some of these options are lost if not using DTC. Another piece of infrastructure that needs to be changed in CommonDomain is coming up with a generic enough API for hydrating replayable objects regardless of whether they're a saga or an AR. This is actually very easy; however, I haven't had the bandwidth to work on it. Getting RavenHQ to production has taken up all of my time over the past few months.
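
To make the "generic enough API" idea concrete, here is one possible shape for it, sketched in Python. This is not CommonDomain's API; Replayable, hydrate, Group, InvitationSaga, and the UserJoinedGroup event are all hypothetical names chosen for illustration:

```python
from typing import Callable, Iterable, Protocol, TypeVar


class Replayable(Protocol):
    """Anything that can rebuild its state from stored messages."""

    def apply(self, message: dict) -> None: ...


T = TypeVar("T", bound=Replayable)


def hydrate(factory: Callable[[], T], history: Iterable[dict]) -> T:
    """Build a fresh instance and replay its stored messages into it."""
    instance = factory()
    for message in history:
        instance.apply(message)
    return instance


class Group:
    """Aggregate root: replays the events it produced."""

    def __init__(self):
        self.members = set()

    def apply(self, message):
        if message["type"] == "UserJoinedGroup":
            self.members.add(message["user_id"])


class InvitationSaga:
    """Saga: replays the events it reacted to."""

    def __init__(self):
        self.approved = False

    def apply(self, message):
        if message["type"] == "InvitationApproved":
            self.approved = True
```

With this shape the repository code does not care which kind of object it loads: hydrate(Group, events) and hydrate(InvitationSaga, events) go through the same path, and the difference stays in what each object emits afterwards (events vs commands).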

Right now I'm mostly using state-based sagas with what I've been calling event logging. I'm essentially storing the output messages for replay purposes, but not for rehydration of the saga. This comes with its own set of downsides and risks, but it works when you just need view replay.

If anyone would like to work with me in getting these changes into the CommonDomain framework, please hit me up.

Jonathan Matheus

Werner Clausen

Jun 15, 2012, 3:32:36 AM

to ddd...@googlegroups.com

Thanks for your answer Jonathan, I like the idea of mixing sagas and replay-options (CQRS-SAGA).

--
Werner
