Process manager depending on a read model

Aaron Muylaert

May 29, 2016, 9:05:41 AM
to DDD/CQRS, aa...@tactics.be
There are two bounded contexts. Registration and Billing. 
There is a process manager between these two contexts. 
This process manager listens to events from the Registration context and transforms them into commands that are sent to Billing context.

Registration

UsageWasRegistered
    articleThatWasUsed
    numberOfArticlesUsed
    userThatUsedTheArticle



A read model aggregates these events by user and article.



User | Article  | Quantity
---- | -------  | --------
John | Beer     | 2
John | Water    | 1
Lisa | Wine     | 5
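
A minimal sketch of how such a read model could be built from the events. The field names follow the event outline above; the class `AggregatedUsages` and its method names are hypothetical and only serve the illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageWasRegistered:
    # Field names follow the event outlined in the post.
    article_that_was_used: str
    number_of_articles_used: int
    user_that_used_the_article: str


class AggregatedUsages:
    """Hypothetical read model: total quantity per (user, article)."""

    def __init__(self) -> None:
        self._quantities = defaultdict(int)

    def when_usage_was_registered(self, event: UsageWasRegistered) -> None:
        key = (event.user_that_used_the_article, event.article_that_was_used)
        self._quantities[key] += event.number_of_articles_used

    def rows_for(self, user: str) -> list:
        # Rows are (user, article, quantity), sorted for stable output.
        return sorted(
            (u, article, qty)
            for (u, article), qty in self._quantities.items()
            if u == user
        )


read_model = AggregatedUsages()
for event in [
    UsageWasRegistered("Beer", 1, "John"),
    UsageWasRegistered("Beer", 1, "John"),
    UsageWasRegistered("Water", 1, "John"),
    UsageWasRegistered("Wine", 5, "Lisa"),
]:
    read_model.when_usage_was_registered(event)
```

Here `read_model.rows_for("John")` yields the two rows for John shown in the table.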



These aggregated usages can be sent to the billing context...


SendUsageForUserToBilling
    John



...which results in an event...



UsagesForUserWereSentToBilling
    John



...which manipulates the read model and marks those usages as "sent to billing".


User | Article | Quantity | Sent to billing
---- | ------- | -------- | ----------------
John | Beer    | 2        |      yes
John | Water   | 1        |      yes
Lisa | Wine    | 5        |      no



## Process Manager

The process manager listens to the UsagesForUserWereSentToBilling event and depends on the read model that contains the aggregated usages. When a UsagesForUserWereSentToBilling event is received, the read model is queried by userId and the resulting rows are used to create a command that is sent to the Billing context.

The problem I'm having is that both the process manager and the read model are subscribed to the same event, UsagesForUserWereSentToBilling. It is entirely possible that the process manager receives this event before the read model does. In that case, the process manager queries the read model for John's usages that were sent to billing, but the read model hasn't been updated yet, so it finds no usages for John that are ready to be sent to billing.
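
The ordering problem can be reproduced with a small sketch. Everything here is an assumption made for illustration: the `publish` function, the class names, and the `BillUsages` command do not come from the real system. The same event is delivered to the same two subscribers in both orders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsagesForUserWereSentToBilling:
    user: str


class UsagesReadModel:
    def __init__(self, rows):
        # Each row becomes [user, article, quantity, sent_to_billing].
        self.rows = [list(row) + [False] for row in rows]

    def handle(self, event):
        for row in self.rows:
            if row[0] == event.user:
                row[3] = True

    def sent_to_billing(self, user):
        return [tuple(r[:3]) for r in self.rows if r[0] == user and r[3]]


class BillingProcessManager:
    def __init__(self, read_model):
        self.read_model = read_model
        self.commands = []  # stands in for "send to the Billing context"

    def handle(self, event):
        rows = self.read_model.sent_to_billing(event.user)
        self.commands.append(("BillUsages", event.user, rows))


def publish(event, subscribers):
    # Delivers in list order; an async transport gives no such guarantee.
    for subscriber in subscribers:
        subscriber.handle(event)


def run(order):
    read_model = UsagesReadModel([("John", "Beer", 2), ("John", "Water", 1)])
    manager = BillingProcessManager(read_model)
    by_name = {"read_model": read_model, "manager": manager}
    publish(UsagesForUserWereSentToBilling("John"), [by_name[n] for n in order])
    return manager.commands[0]


read_model_first = run(["read_model", "manager"])
manager_first = run(["manager", "read_model"])
```

With the read model first, the command carries John's two rows. With the process manager first, the command is built from an empty result.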

There are a couple of things I have thought about:

1. Order the subscribers so that the read model gets the event first. This is currently running in production.
   I don't want to do this because I feel like it creates a complex system.
   At the moment, the entire process (send command, dispatch event, update read model, run process manager, send command) takes place synchronously in one big transaction. However, it is entirely possible that we'll be forced to go async in the future, and ordering the subscribers won't work async.

2. Have a process check the usages read model periodically and send commands.
   This would make the entire process async. As said before, I kind of like that everything happens in one big transaction right now, so I'd rather not go this route yet.

What are your thoughts?

Danil Suits

May 29, 2016, 11:07:32 AM
to DDD/CQRS, aa...@tactics.be
Warning: novice typing detected

Your description of what's going on seems to be focused on the read model; that seems to me the wrong place to look for answers.  The read model is just a projection/aggregation/re-imaging of data present in the book of record.


> manipulates the read model and marks those usages as "sent to billing".

This doesn't make sense to me.  The read model is just a reflection; if we want to mark things, we need to make changes to the book of record, not the read model.


Likewise, when you write:

> It is entirely possible that the process manager receives this event before the read model does. In this case, the process manager queries the read model for John's usages that were sent to billing. The read model hasn't been updated, which means that there are no usages for John that are ready to be sent to billing.

 You are getting badly tangled because you are looking for the information in the read model, when you should be looking for it in the book of record.


> When a UsagesForUserWereSentToBilling event is received, the read model is queried by userId and the resulting rows are used to create a command that will be sent to the Billing context.

Two thoughts - first, "...SentToBilling" implies that the command has already been dispatched; BillingScheduled?  What happens if you discover some subscriber other than billing also needs that information?  It seems to me that you need to dig into the ubiquitous language, and figure out what's happened in the Registration context that has all these consequences.  Is it an end of life event for some entity in the registration context? UserRegistrationClosed?

Second, I don't understand why the read model is involved in sending the command to the billing context; the read model is just a reflection of data in the book of record.  My suspicion is that the "end of life" entity I mentioned above is supposed to know the information in the UsageWasRegistered events; either because it generates them itself, or because it gets notified of each of them when a command is dispatched by some event handler.

As an additional thought -- would it help to think of the read model pulling data from the billing service, rather than the other way around?  It would make sense to me that the list of items would be read from the Registration context, and the sent to billing flag read from the Billing context?



Aaron Muylaert

May 29, 2016, 3:48:38 PM
to DDD/CQRS, aa...@tactics.be
Hi Danil,

Thanks for your reply. I'm a beginner myself, so don't take anything that I'm about to write as the truth. 

> This doesn't make sense to me. The read model is just a reflection; if we want to mark things, we need to make changes to the book of record, not the read model.

This is exactly what happens. A user reads the data from the read model, the aggregated usages.
Based on what he sees, he makes a selection and clicks "Send to billing".
One or more commands are sent to the write model. The write model guards invariants and publishes events.
The aggregated usages read model is subscribed to these events and certain usages are marked as "sent to billing".

> You are getting badly tangled because you are looking for the information in the read model, when you should be looking for it in the book of record.
  
Let's forget about the process manager for a second. How could this system work without this process manager? 

1. An employee responsible for registration marks certain usages as "sent to billing". I agree, this is bad naming. Let's say "ready for billing".
2. An employee responsible for billing, let's call him Jeff, periodically checks for usages that are "ready for billing"
3. Jeff manually bills the usages
4. Jeff manually marks the usages as "billed"

You see, the data that is inside of commands is always originating from the read model.

Now. Back to the process manager. I see a process manager as a form of automation. The process manager makes Jeff's life a bit easier. 
The way it'd work is exactly the same, it checks the read model and sends a command to billing.

The only difference with the manual process is that the process manager is not configured to periodically check the read model.
The process manager is subscribed to an event, so that when that event happens, it'll do the stuff.
My problem is that this particular event is also the event that the read model is subscribed to. In order for the system to work correctly, I have to ensure that the read model receives the event first so that it is in the correct state.

The more I write about this, the more I think that the simplest solution is to have the process manager periodically check the read model.
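
A rough sketch of what such a polling process could look like. The names `ReadyForBillingView`, `BillingPoller`, and `BillUsage` are hypothetical, and the in-memory `_already_sent` set is an assumption that would have to be durable in a real system.

```python
import time


class ReadyForBillingView:
    """Hypothetical read model exposing usages flagged 'ready for billing'."""

    def __init__(self):
        self._rows = {}  # usage_id -> (user, article, quantity)

    def mark_ready(self, usage_id, user, article, quantity):
        self._rows[usage_id] = (user, article, quantity)

    def ready_rows(self):
        return dict(self._rows)


class BillingPoller:
    def __init__(self, view, send_command):
        self._view = view
        self._send_command = send_command
        self._already_sent = set()  # assumption: in-memory only for the sketch

    def poll_once(self):
        """Send a command for every ready usage not sent before; return the count."""
        sent = 0
        for usage_id, row in self._view.ready_rows().items():
            if usage_id in self._already_sent:
                continue
            self._send_command(("BillUsage", usage_id) + row)
            self._already_sent.add(usage_id)
            sent += 1
        return sent

    def run_forever(self, interval_seconds=60.0):
        while True:
            self.poll_once()
            time.sleep(interval_seconds)


outbox = []
view = ReadyForBillingView()
poller = BillingPoller(view, outbox.append)
view.mark_ready("u-1", "John", "Beer", 2)
first = poller.poll_once()
second = poller.poll_once()
```

The second poll sends nothing because the usage was already handed over. The trade-off is that the poller only ever sees the state of the view at poll time, not what happened between polls.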

On Sunday, May 29, 2016 at 17:07:32 UTC+2, Danil Suits wrote:

David Ackerman

May 29, 2016, 4:13:41 PM
to ddd...@googlegroups.com
It sounds like you could put the relevant information in the event itself instead of querying the read model, since the read model could be in a completely different state by the time the event is handled (and you have the race conditions you mentioned).

Querying the read model periodically sounds like it might work too. However, since we're talking about "billing", I might be a little worried that I'll end up with inconsistencies or not be able to exactly tell if everything was accounted for (I am thinking about something showing up and then getting deleted all before the next query got scheduled. Is it desirable to pretend as if it never happened? Maybe. If you're dealing with events, you'd get all the notifications with perfect accuracy, and you could act on redactions or adjustments as necessary).

--
You received this message because you are subscribed to the Google Groups "DDD/CQRS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dddcqrs+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Aaron Muylaert

May 29, 2016, 4:48:31 PM
to DDD/CQRS
Hi David,

The aggregate responsible for producing this event does not contain the state required to put all the relevant information in the event.

On Sunday, May 29, 2016 at 22:13:41 UTC+2, David Ackerman wrote:

Danil Suits

May 30, 2016, 2:05:33 AM
to DDD/CQRS

> The aggregate responsible for producing this event does not contain the state required to put all the relevant information in the event.

OK, that's an interesting piece of information.

Is there another event, or series of events, that does have that information? Should the process manager be subscribed to those events?
Are you sure that the other information isn't part of the same aggregate?
Are you sure that you aren't missing an aggregate that has copies of the information that you are missing?

Greg's 10 year retrospective talked about people being in different stages; I'm currently in the state where I've gotten really suspicious of process managers that have more clever in them than a state machine.  Track events? yes.  Track acknowledgements of commands? sure.  Copy data into commands?  OK.  Query the read model....?  That spooks me a bit.

I've taken the same approach to process managers that you do: how would it work if people were doing the job? But I ended up in a different place. Like you, I could see that the human being would be running a query to decide what to do. But when I started automating the procedure, instead of the process manager running the query, it was run by an aggregate. The process manager's job was just to identify which aggregate was supposed to run the query, and which bits of state that the process manager already knew about should be included in the command.

I'm still kicking the tires on that idea, but so far it fits in really well with a number of quiet points from Udi Dahan's talk on reliable messaging.


Aaron Muylaert

May 30, 2016, 3:45:53 AM
to DDD/CQRS
> Is there another event, or series of events, that does have that information? should the process manager be subscribed to those events?

The aggregated usages need to be sent to billing. It needs to be exactly that. One of my first ideas was that the process manager could be subscribed to exactly the same events as the read model. This would result in my process manager containing the exact same state as the aggregated usages read model. When the final event comes in, the process manager would use its state to build the commands that need to be sent to billing. If I went that route, I'd be annoyed that the process manager and the read model would be mostly the same and would have to change for mostly the same reasons. The one point where they'd differ is the way they handle the "SentToBilling" event. The read model would mark the usages as "sent to billing" while the process manager would actually send them to billing.
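
A minimal sketch of that first idea, where the process manager folds the same events into its own state and never touches the read model. The shortened field names, the `BillUsages` command, and the choice to clear a user's pending usages after sending are assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageWasRegistered:
    article: str
    quantity: int
    user: str


@dataclass(frozen=True)
class UsagesForUserWereSentToBilling:
    user: str


class BillingProcessManager:
    """Builds its own per-user totals instead of querying the read model."""

    def __init__(self, send_command):
        self._send_command = send_command
        self._pending = defaultdict(lambda: defaultdict(int))

    def handle(self, event):
        if isinstance(event, UsageWasRegistered):
            self._pending[event.user][event.article] += event.quantity
        elif isinstance(event, UsagesForUserWereSentToBilling):
            # pop() so the same usages are not sent again on a later event.
            usages = dict(self._pending.pop(event.user, {}))
            if usages:
                self._send_command(("BillUsages", event.user, usages))


sent = []
manager = BillingProcessManager(sent.append)
for event in [
    UsageWasRegistered("Beer", 2, "John"),
    UsageWasRegistered("Water", 1, "John"),
    UsageWasRegistered("Wine", 5, "Lisa"),
    UsagesForUserWereSentToBilling("John"),
]:
    manager.handle(event)
```

The duplication mentioned above is visible here: the `UsageWasRegistered` branch is the same fold the read model performs.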

> Are you sure that the other information isn't part of the same aggregate?

Yep. The UsageWasRegistered events come from different aggregates. Given my example, this might sound a bit weird. The truth is that this example is not how the system works in reality. I changed and simplified it so that I would not have to write a book about all of the different events and invariants.

> Are you sure that you aren't missing an aggregate that has copies of the information that you are missing?

Not sure I understand your question.

> But when I started automating the procedure, instead of the process manager running the query, it was run by an aggregate. The process manager's job was just to identify which aggregate was supposed to run the query

Not sure I understand that correctly, aggregates running a query? 



On Monday, May 30, 2016 at 08:05:33 UTC+2, Danil Suits wrote:

Douglas Gugino

May 30, 2016, 6:44:13 AM
to DDD/CQRS


> Greg's 10 year retrospective talked about people being in different stages; ...

Is this a video online somewhere? If so, I would appreciate the link.


xiety

May 30, 2016, 7:28:59 AM
to DDD/CQRS

"Greg Young — A Decade of DDD, CQRS, Event Sourcing"


https://www.youtube.com/watch?v=LDW0QWie21s

Douglas Gugino

May 30, 2016, 7:59:13 AM
to DDD/CQRS
pre·scient
ˈpreSH(ē)ənt/
adjective
  1. having or showing knowledge of events before they take place.
    "a prescient warning"
    synonyms: prophetic, predictive, visionary

Danil Suits

May 30, 2016, 4:07:41 PM
to DDD/CQRS


> > Are you sure that you aren't missing an aggregate that has copies of the information that you are missing?
>
> Not sure I understand your question.

That's fair: I'm not sure I understand it either.

As an exercise this week, I was imagining that I was modeling a bank account.  Very simple model: ATMs scattered about, that record FundsDispensed and FundsDeposited events.  These events include amounts, and an accountId that allows us to reconcile the books.

Riddle: where does an AccountOverdrawn event come from?

It's an easy enough query to run, when you have the history of the ATM events to replay.  Putting together a projection in the read model that displays the summary of the account activity, with all of the contributing transactions; that's straightforward.  But that doesn't get an event into the book of record -- it's just a view.

So I started thinking about that event; if it arrives in the book of record, it must be part of the history of an entity writing into that book.  How about the process instance as the entity: it listens to the ATM events, and keeps a running balance going, and if the balance drops negative, then the event gets written into its own history.  And I sketched out a few ideas; but none of them were particularly comfortable.

I eventually decided that there were two problems that calculating the balance in the process manager wasn't solving for me.  First, I was coming to realize that I didn't seem to have an entity that the accountId belonged to.  That seemed rather suspicious.  Additionally, it began to sink in that the entire notion of an overdrawn account is business logic -- there's no particular reason that you should want to add up the balances from all of these ATM events, and do something useful in response, except as driven by the business.  Putting the calculation into the process manager was domain logic leaking out of the model.

When I introduced the idea of an Account aggregate instead, things began to feel right.  The event handler listening to the ATM events fires off Account.processTransaction commands, and the balances from the ATM events get copied into the aggregate, which also has the business intelligence to fire an AccountOverdrawn event.  Putting that logic into the Account also gives me a natural home for more complicated logic (if the account is overdrawn, there is an additional penalty for funds calculated according to the following schedule...).   The process managers revert back to being simple state machines.
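
A small sketch of that arrangement, under some assumptions: amounts are integers, the `on_atm_event` handler and the `uncommitted` list are hypothetical names, and the rule that `AccountOverdrawn` is raised only when the balance first crosses below zero is one possible reading of the business rule.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FundsDeposited:
    account_id: str
    amount: int


@dataclass(frozen=True)
class FundsDispensed:
    account_id: str
    amount: int


@dataclass(frozen=True)
class AccountOverdrawn:
    account_id: str
    balance: int


@dataclass
class Account:
    account_id: str
    balance: int = 0
    uncommitted: list = field(default_factory=list)

    def process_transaction(self, amount):
        """amount is positive for deposits and negative for withdrawals."""
        was_overdrawn = self.balance < 0
        self.balance += amount
        # The overdraft rule lives in the aggregate, not in the process manager.
        if self.balance < 0 and not was_overdrawn:
            self.uncommitted.append(AccountOverdrawn(self.account_id, self.balance))


def on_atm_event(event, accounts):
    # Event handler: translate an ATM event into a command on the right Account.
    account = accounts.setdefault(event.account_id, Account(event.account_id))
    signed = event.amount if isinstance(event, FundsDeposited) else -event.amount
    account.process_transaction(signed)


accounts = {}
for event in [
    FundsDeposited("acc-1", 100),
    FundsDispensed("acc-1", 80),
    FundsDispensed("acc-1", 50),
    FundsDispensed("acc-1", 10),
]:
    on_atm_event(event, accounts)
```

The handler stays free of business logic; it only routes and converts. The decision about what counts as overdrawn is made inside `Account`.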

Along the way, I also thought about process restarts.  How do we figure out which work was interrupted, and needs to be retried?  Like you, I'm thinking about Jeff, loading up a screen that represents some analog of an exception report; here are the process steps that have not yet been acknowledged.  We'll give Jeff a bit of agency; multiple buttons that he can click to advance the program.  What do those buttons do?  I suppose there are lots of possible answers, but the one that seems most robust in the face of changing processes is to copy an event into the event bus.  (Human being is not part of the book of record, so event is clearly the right message type).

You don't need to do anything fancy for a restart, though -- simply rebroadcasting the last message seen by the process manager should suffice.  "Do this again."  If no other events are already in flight, that should just put the same command back into the dispatch queue.

And that line of thought encourages me to think that, in our billing example, when Jeff is looking at a screen telling him to take something over to billing, the screen he should be looking at is one with a representation of an event, not one that is a representation of a projection.


> Not sure I understand that correctly, aggregates running a query?

Application dispatches a command to the aggregate; one of the arguments is a DomainService/ProjectionRepository that allows the aggregate to query a stale copy of the domain model. I seem to have passed through one of the stages where I was dogmatic about that sort of thing.
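
One way to picture this, as a sketch only: the `UsageProjection` protocol, the `BillingRequest` aggregate, and the event names `BillingPrepared` and `NothingToBill` are all invented for the illustration and are not part of the system discussed in the thread.

```python
from typing import Protocol


class UsageProjection(Protocol):
    def usages_for(self, user: str) -> dict: ...


class InMemoryUsageProjection:
    """Stand-in for a possibly stale projection of usages per user."""

    def __init__(self, rows):
        self._rows = rows

    def usages_for(self, user):
        return dict(self._rows.get(user, {}))


class BillingRequest:
    """Hypothetical aggregate that decides what to bill for one user."""

    def __init__(self, user):
        self.user = user
        self.events = []

    def prepare(self, projection: UsageProjection):
        # The data may be stale; the aggregate records exactly what it saw.
        usages = projection.usages_for(self.user)
        if not usages:
            self.events.append(("NothingToBill", self.user))
        else:
            self.events.append(("BillingPrepared", self.user, usages))


projection = InMemoryUsageProjection({"John": {"Beer": 2, "Water": 1}})
john = BillingRequest("John")
john.prepare(projection)
lisa = BillingRequest("Lisa")
lisa.prepare(projection)
```

Because the aggregate emits an event containing the usages it read, the decision ends up in the book of record rather than being implied by the state of a view.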


Greg Young

May 30, 2016, 4:20:20 PM
to ddd...@googlegroups.com
"
Along the way, I also thought about process restarts. How do we
figure out which work was interrupted, and needs to be retried? Like
you, I'm thinking about Jeff, loading up a screen that represents some
analog of an exception report; here are the process steps that have
not yet been acknowledged. We'll give Jeff a bit of agency; multiple
buttons that he can click to advance the program. What do those
buttons do? I suppose there are lots of possible answers, but the one
that seems most robust in the face of changing processes is to copy an
event into the event bus. (Human being is not part of the book of
record, so event is clearly the right message type)."


How would you do this with a human in the rare case it happens? They
would manually check each one they were doing to see if it was done
and, if not, do it ...

In computer systems there is an easier way: idempotency. Just do it
again, and if it happens twice, allow it to happen.
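
A minimal sketch of one common way to get that property, assuming every command carries a unique id that the receiver remembers. The `BillingCommandHandler` class and its fields are hypothetical, and deduplicating by id is only one of several ways to make a handler idempotent.

```python
class BillingCommandHandler:
    """Hypothetical receiver that tolerates the same command arriving twice."""

    def __init__(self):
        self._processed = set()  # ids of commands already applied
        self.invoices = []

    def handle(self, command_id, user, amount):
        if command_id in self._processed:
            return False  # duplicate: safely ignored
        self.invoices.append((user, amount))
        self._processed.add(command_id)
        return True


handler = BillingCommandHandler()
first = handler.handle("cmd-42", "John", 30)
retry = handler.handle("cmd-42", "John", 30)  # "just do it again"
```

After the retry there is still exactly one invoice, so the sender never needs to find out whether the first attempt got through.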



--
Studying for the Turing test

xiety

May 31, 2016, 10:12:12 AM
to DDD/CQRS
I made two separate event publishers working one after another synchronously. First is called Projectors (for denormalizers), and second is called Processors (for process managers). But I do not know whether it is a good decision.

Greg Young

May 31, 2016, 10:28:23 AM
to ddd...@googlegroups.com
Event publishers? How is a projection (denormalizer) an event
publisher? Why are they synchronous? Do they depend on each other?



xiety

May 31, 2016, 10:33:42 AM
to DDD/CQRS
One event publisher publishes events only to denormalizers, and the other publishes only to process managers. They are synchronous because process managers send commands to aggregates, and those aggregates may require data from the read model, which may not be updated yet.
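
A sketch of how such a two-phase, synchronous publisher might look. The class `TwoPhasePublisher` and its method names are assumptions; the post does not show the actual implementation.

```python
class TwoPhasePublisher:
    def __init__(self):
        self._projectors = []  # denormalizers
        self._processors = []  # process managers

    def add_projector(self, handler):
        self._projectors.append(handler)

    def add_processor(self, handler):
        self._processors.append(handler)

    def publish(self, event):
        # Phase 1: every denormalizer sees the event.
        for handler in self._projectors:
            handler(event)
        # Phase 2: process managers run only after phase 1 has finished.
        for handler in self._processors:
            handler(event)


calls = []
publisher = TwoPhasePublisher()
# Registration order does not matter; the phase does.
publisher.add_processor(lambda e: calls.append(("processor", e)))
publisher.add_projector(lambda e: calls.append(("projector", e)))
publisher.publish("UsagesForUserWereSentToBilling")
```

This keeps the ordering rule in one place instead of spreading it over individual subscribers, but it still relies on synchronous, in-process delivery.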