Out of order events in read models


Erik Johansson

Dec 22, 2016, 10:54:46 AM
to DDD/CQRS
Hi,

Is there some idiomatic way of handling events that are received out of order in a read model (i.e. a read model on the receiving end of a pubsub which guarantees delivery at least once but not in order)?

The only solution I've come across so far (which works for all but the most basic read models) is to store pending events in the read model and then apply them when the read model catches up.

Greg Young

Dec 22, 2016, 10:59:00 AM
to ddd...@googlegroups.com
Why are you using competing consumers and a pub/sub for a read model?
Usually it's far simpler to use something like a catch-up subscription
(client driven subscription) which assures ordering.

Alternatively, it is common to use sequence numbers to allow for
reordering, assuming a single source. With multiple sources, vector
clocks/Lamport sequences work reasonably well.
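For the single-source case, a minimal sketch of that reordering in Go (the type and field names are illustrative, not from any particular framework; a per-stream sequence number assigned by the producer is assumed):

package reorder

// Event is assumed to carry a sequence number assigned by the single
// producer (an event store position would work just as well).
type Event struct {
    Seq  uint64
    Data []byte
}

// Reorderer buffers events that arrive early and releases them to the
// projection strictly in sequence.
type Reorderer struct {
    next    uint64           // next sequence number we expect to apply
    pending map[uint64]Event // early arrivals, keyed by Seq
    apply   func(Event)
}

func NewReorderer(start uint64, apply func(Event)) *Reorderer {
    return &Reorderer{next: start, pending: make(map[uint64]Event), apply: apply}
}

// Receive is called for every delivery from the transport. Anything below
// the expected sequence number is a duplicate and is dropped, which also
// covers at-least-once redelivery.
func (r *Reorderer) Receive(e Event) {
    if e.Seq < r.next {
        return // already applied
    }
    r.pending[e.Seq] = e
    for {
        next, ok := r.pending[r.next]
        if !ok {
            return
        }
        r.apply(next)
        delete(r.pending, r.next)
        r.next++
    }
}

This is essentially the "store pending events and apply them when the read model catches up" idea from the original question, just keyed by sequence number.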

Greg



--
Studying for the Turing test

Kyle Cordes

Dec 22, 2016, 10:59:54 AM
to ddd...@googlegroups.com
On December 22, 2016 at 9:54:32 AM, Erik Johansson
(erik.gusta...@gmail.com) wrote:
> Hi,
>
> Is there some idiomatic way of handling events that are received out of order in a read model (i.e. a read model on the receiving end of a pubsub which guarantees delivery at least once but not in order)?
>
> The only solution I've come across so far (which works for all but the most basic read models) is to store pending events in the read model and then apply them when the read model catches up.




Yes. Here is the key bit: pubsub isn't quite the right way of thinking
about this. What you need is an event store of some kind; the semantics
are different: a linear sequence of all events, always delivered in
order, with the ability to reset the read marker (for a specific
reader) back to an earlier point.
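To make that contract concrete, the reader side is roughly the shape sketched below (Go, since that comes up later in the thread; the Store and Checkpoint interfaces are illustrative, not any particular event store's client API):

package catchup

import "context"

// Event as delivered by the store, tagged with its global position.
type Event struct {
    Position uint64
    Data     []byte
}

// Store is the minimal contract a projection needs: a linear, ordered
// read starting from any earlier position.
type Store interface {
    ReadFrom(ctx context.Context, position uint64) (<-chan Event, error)
}

// Checkpoint persists the read marker for one specific reader, so the
// projection can resume, or be reset to zero and rebuilt, on its own.
type Checkpoint interface {
    Load() (uint64, error)
    Save(position uint64) error
}

// Run replays everything after the checkpoint and then keeps following.
func Run(ctx context.Context, store Store, cp Checkpoint, apply func(Event) error) error {
    pos, err := cp.Load()
    if err != nil {
        return err
    }
    events, err := store.ReadFrom(ctx, pos)
    if err != nil {
        return err
    }
    for e := range events {
        if err := apply(e); err != nil {
            return err
        }
        if err := cp.Save(e.Position); err != nil {
            return err
        }
    }
    return ctx.Err()
}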

Many words have been spilled on this list already (maybe someone can
link to an old thread), and here are some more words that I am partial
to (it's our company blog…):

http://blog.oasisdigital.com/2015/event-sourced-cqrs-read-model/




--
Kyle Cordes
kyle....@oasisdigital.com

Antony Koch

Dec 22, 2016, 1:18:01 PM
to DDD/CQRS
For simplicity, Udi, I believe, talks about throwing the message back to the queue/db/whatever and using the retry mechanism to wait for it to come around again, hoping that the other message has been processed by then.

Greg Young

Dec 22, 2016, 1:30:30 PM
to ddd...@googlegroups.com
This sounds like a terrible idea for projections.

Thomas Schanko

Dec 22, 2016, 7:04:23 PM
to DDD/CQRS
This is a terrible solution not only for projections.

Greg Young

Dec 22, 2016, 7:19:24 PM
to ddd...@googlegroups.com
Competing consumers are quite useful for many things, but projections
don't really fit. What would be your issue with other use cases of
competing consumers?

On Fri, Dec 23, 2016 at 12:04 AM, Thomas Schanko
<thomas...@hotmail.com> wrote:
> This is a terrible solution not only for projections.
>

Thomas Schanko

Dec 22, 2016, 7:34:09 PM
to DDD/CQRS
My statement was related to the idea of using some retry mechanism to handle messages coming in out of order. I faced a similar problem and tried different approaches to cope with it, in particular in an NServiceBus-based system. Neither the HandleMessageLater (aka put it back at the end of the queue) nor the DeferMessage (resend it after a given timespan or at a certain point in time) capabilities provided an enjoyable solution. Competing consumers is a mature pattern; again, as you mentioned, just not in this case.

Thomas

Erik Johansson

Dec 23, 2016, 4:36:00 AM
to DDD/CQRS
The solution is not based on competing consumers; maybe this was not clear from my initial description. Each message will be delivered at least once to every subscriber (not competing), but it might arrive out of order. But in general, if I risk receiving out-of-order events in a read model, I should probably try to solve it with another type of pattern (i.e. a catch-up subscription) instead of trying to manage out-of-order events in the read model?

I will look into catch-up subscriptions.

Thanks

Erik Johansson

Dec 23, 2016, 4:43:33 AM
to DDD/CQRS
I do have an event store. The way it is hooked up right now is that there is a listener attached to the event store which picks up newly stored events and publishes them to a pubsub. In turn, remote read models can subscribe to the pubsub and build up their internal state. The pubsub guarantees that each event is delivered at least once to each subscriber, but not in the correct order. But maybe, as Greg suggests, I should be looking for a subscription that guarantees order.

On Thursday, December 22, 2016 at 16:59:54 UTC+1, Kyle Cordes wrote:
On December 22, 2016 at 9:54:32 AM, Erik Johansson

Kyle Cordes

Dec 23, 2016, 10:36:16 AM
to ddd...@googlegroups.com
On December 23, 2016 at 3:43:34 AM, Erik Johansson
(erik.gusta...@gmail.com) wrote:
> I do have an event store. The way it is hooked up right now is that there is a listener attached to the event store which picks up newly stored events and publishes them to a pubsub. In turn, remote read models can subscribe to the pubsub and build up their internal state. The pubsub guarantees that each event is delivered at least once to each subscriber, but not in the correct order. But maybe, as Greg suggests, I should be looking for a subscription that guarantees order.



It seems to me that the easy path to correct semantics is:

EventStore -> Projection1
EventStore -> Projection2
etc.

not:

EventStore -> PubSubMechanism -> Projections

… because the whole point of an event store is to have the correct
semantics for an event-sourced system. This is not the same as the
semantics of a pubsub bus.


--
Kyle Cordes
kyle....@oasisdigital.com

Erik Johansson

Dec 23, 2016, 11:14:31 AM
to DDD/CQRS
In this case the pubsub is just a transport for getting an event to a remote server. But if I understand you correctly, you suggest that the events should always be transmitted in order (which might not be supported by a pubsub, making it a poor choice).

Greg Young

Dec 23, 2016, 11:22:51 AM
to ddd...@googlegroups.com
Your event store is already a queue that can give an ordering
assurance. Why take events out of a queue with that assurance only to
put them into one without it?

Kyle Cordes

Dec 23, 2016, 11:24:10 AM
to ddd...@googlegroups.com
On December 23, 2016 at 10:14:32 AM, Erik Johansson
(erik.gusta...@gmail.com) wrote:
> In this case the pubsub is just a transport for getting an event to a remote server. But if I understand you correctly, you suggest that the events should always be transmitted in order (which might not be supported by a pubsub, making it a poor choice).


Actually, I suggest this: set aside the pubsub tool; likely there is
another part of your overall system architecture where it has the
right semantics, so use it over there.

Don't use a pubsub tool at all between an event store and
projections. Have the projections receive their events from an event
store, without the aid of a pubsub tool. An event store will already have
the right semantics, APIs, etc. suitable for this purpose.



--
Kyle Cordes
kyle....@oasisdigital.com

Erik Johansson

Dec 23, 2016, 1:03:44 PM
to DDD/CQRS
Point taken. The reason I considered it is that I am experimenting with my own event sourcing framework (golang), and using an existing pubsub solution seemed like an easy way to connect remote services.

Thanks for your help

Erik Johansson

Dec 23, 2016, 1:06:17 PM
to DDD/CQRS
Yes, I agree. It seemed like pubsub was a fitting choice (from where I was standing, anyway :), but now I see that it is not.

Thanks for helping me out on this.

Thomas Schanko

Dec 23, 2016, 7:49:05 PM
to DDD/CQRS
PubSub in general may or may not guarantee message order. If the underlying transport does not guarantee message order, it's likely cumbersome to add this feature in an upper layer. If you own the transport layer, it's your choice, right?

Erik Johansson

Jan 2, 2017, 5:39:32 AM
to DDD/CQRS
Yes, correct :) I will look into a transport that supports ordering.

Michael Yeaney

Jan 2, 2017, 9:05:35 AM
to ddd...@googlegroups.com
One thing to point out with ordered transports: they guarantee ordering of messages **once the transport has received them**. This may or may not match the actual order the messages should be in, so you may in fact still end up implementing Lamport or vector clocks to help with this (as Greg pointed out earlier).

For example, consider the time diagram below (wall time flows from top to bottom). Here, an ordered transport would (correctly) deliver the message from Producer B before the message from Producer A. However, this doesn't match the order they should be in from the application's point of view.

Producer A                          Producer B
---------------------------------------------------------------
Start send to transport          
          |
          |                         Start send to transport
(delay due to GC pause / etc)             |
          |                         Enters transport channel
          |                               |
          |                               V
          |                         Delivered to receiver
Enters transport channel
          |
          V
Delivered to receiver
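For reference, the Lamport clock itself is only a few lines; a sketch of the two rules (increment before sending, take the maximum on receipt) is below. Timestamps from different producers are only partially ordered, so a tie-breaker such as a producer id is still needed alongside the counter:

package lamport

// Clock is a Lamport logical clock.
type Clock struct {
    time uint64
}

// Tick is called before a producer sends an event; the returned value is
// stamped onto the event.
func (c *Clock) Tick() uint64 {
    c.time++
    return c.time
}

// Observe is called when an event stamped elsewhere arrives, so the local
// clock never falls behind anything it has already seen.
func (c *Clock) Observe(remote uint64) uint64 {
    if remote > c.time {
        c.time = remote
    }
    c.time++
    return c.time
}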
          



@yreynhout

Jan 5, 2017, 5:44:13 AM
to DDD/CQRS
Sometimes the answer lies in the model, not in the technical solutions we come up with. Always question why things are "out of order" to begin with (root cause analysis). If that path has been exhausted, sure, we can look at a more technical approach.

Michal Matula

Jan 17, 2017, 9:03:50 AM
to DDD/CQRS
Hi Erik.

The way we approached this was to use pub/sub (to avoid constantly polling the event store from the views), but we check the "order" information in the events, and if we believe we might have missed an event, we poll the event store to be sure. Then we continue listening on the subscription from the last event received from the store.
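A rough sketch of that hybrid, assuming the events carry a per-view sequence number and the event store can be read from an arbitrary position (the interfaces are illustrative, not a specific client library):

package hybrid

import "context"

type Event struct {
    Seq  uint64
    Data []byte
}

// Store lets the view read, in order, anything it may have missed.
type Store interface {
    ReadFrom(ctx context.Context, seq uint64) ([]Event, error)
}

// View applies pub/sub deliveries directly, but when it detects a gap in
// the sequence it backfills from the event store and resumes from there.
type View struct {
    lastSeq uint64
    store   Store
    apply   func(Event)
}

func (v *View) OnPubSub(ctx context.Context, e Event) error {
    switch {
    case e.Seq <= v.lastSeq:
        return nil // duplicate delivery, already applied
    case e.Seq == v.lastSeq+1:
        v.apply(e)
        v.lastSeq = e.Seq
        return nil
    default:
        // Gap detected: poll the store for everything after lastSeq,
        // which also covers the event that just arrived.
        missed, err := v.store.ReadFrom(ctx, v.lastSeq+1)
        if err != nil {
            return err
        }
        for _, m := range missed {
            v.apply(m)
            v.lastSeq = m.Seq
        }
        return nil
    }
}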

M.

On Monday, January 2, 2017 at 11:39:32 UTC+1, Erik Johansson wrote:

Stan Shillis

Jan 24, 2017, 4:53:02 PM
to DDD/CQRS
With catch-up subscriptions you can still get into ordering problems, I think. If you are projecting into durable storage like a SQL database, it might take time to complete that operation. Then, if you multi-thread the handling of each event to get around I/O performance, you are back into ordering issues.

Is there a way to avoid that situation easily? I'd be interested to find out common approaches to this problem.

kelly...@adaptechsolutions.ca

Jan 25, 2017, 4:30:24 AM
to DDD/CQRS
You shouldn't use competing consumers for read models. If you multithread the read model updater, you are essentially creating competing consumers. Additionally, I see no reasonable way (in general) to multithread the read model updates, unless you have some sort of natural sharding/partitioning you can do that guarantees the updates don't depend on ordering across read model instances.

Stan Shillis

Jan 25, 2017, 8:39:50 AM
to DDD/CQRS
You would gain performance if you could run a projection in parallel. And for some use cases, like projecting data for different customers, projections would never actually compete, as that data is naturally segregated by customer. But it would be difficult to set up something that allows parallel execution yet can still drop into serial execution mode when processing two consecutive events for one customer, for example.

Going with in-order, single-threaded processing of events would most likely work for a while from a performance point of view, and it is the easiest to implement.

kelly...@adaptechsolutions.ca

Jan 30, 2017, 7:32:32 PM
to DDD/CQRS
It's pretty easy if you know that, for instance, your read models can be partitioned by "customer id" or something. If that's the case, you just partition your read model updaters by that same partition key (aka sharding). Then those read model updaters update in a serial manner, but across updaters they run in parallel. If you partition in this way and your partition keys are self-levelling (e.g. random GUIDs, hashes of the customer ID sequence, etc.), then you should balance the work fairly well; a sketch follows below.

You never want to partition work across groups of events that need to be processed serially, so for this reason we say "don't use competing consumers" (notice that the partitioning scheme described above is NOT competing consumers).
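A minimal sketch of that partitioning in Go (worker count, buffer size, and hash choice are arbitrary; the partition key is whatever "customer id" means in your events):

package partitioned

import (
    "hash/fnv"
    "sync"
)

type Event struct {
    PartitionKey string // e.g. the customer id
    Data         []byte
}

// Dispatcher fans events out to n updaters. Events with the same key
// always land on the same updater, so per-key order is preserved, while
// different keys are processed in parallel.
type Dispatcher struct {
    workers []chan Event
    wg      sync.WaitGroup
}

func NewDispatcher(n int, apply func(Event)) *Dispatcher {
    d := &Dispatcher{workers: make([]chan Event, n)}
    for i := range d.workers {
        ch := make(chan Event, 64)
        d.workers[i] = ch
        d.wg.Add(1)
        go func() {
            defer d.wg.Done()
            for e := range ch { // serial within this partition
                apply(e)
            }
        }()
    }
    return d
}

func (d *Dispatcher) Dispatch(e Event) {
    h := fnv.New32a()
    h.Write([]byte(e.PartitionKey))
    d.workers[h.Sum32()%uint32(len(d.workers))] <- e
}

func (d *Dispatcher) Close() {
    for _, ch := range d.workers {
        close(ch)
    }
    d.wg.Wait()
}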

Kelly

Thomas Schanko

Feb 6, 2017, 5:18:23 PM
to DDD/CQRS
According to the geteventstore documentation, the consumer strategy "Pinned" should be what you are looking for. Google "consistent hashing" to find further explanation of the underlying idea, or look for ConsistentHashing in the Akka.net documentation to see how it is applied to message routing.
However, the geteventstore documentation claims that "Pinned" is not a guarantee and that "the usual ordering and concurrency issues must be handled". To me it sounds a bit vague, and I admit that currently I don't have a clue about the whats and whys.

Thomas

Greg Young

Feb 6, 2017, 5:28:55 PM
to ddd...@googlegroups.com
What if you get a retry?

Thomas Schanko

Feb 6, 2017, 5:38:19 PM
to DDD/CQRS
What kind of retry? The producer retries and therefore creates a duplicate message? Or a failing consumer?

Greg Young

Feb 6, 2017, 5:41:21 PM
to ddd...@googlegroups.com
either

On Mon, Feb 6, 2017 at 10:38 PM, Thomas Schanko
<thomas...@hotmail.com> wrote:
> What kind of retry? The producer retries and therefore creates a duplicate message? Or a failing consumer?
>

Thomas Schanko

Feb 6, 2017, 5:49:08 PM
to DDD/CQRS
If the consumer is single-threaded, it shouldn't matter, right? If the producer retries and it's only about ordering within each stream, we need to handle dedup? Am I missing something? To be honest, I haven't done anything substantial with geteventstore yet, so I'm just guessing how it works under the hood. Sending stuff in batches could complicate things, for instance...

Greg Young

Feb 6, 2017, 6:37:20 PM
to ddd...@googlegroups.com
Producer in competing consumers would mean the backend, and yes, it can
retry due to a server timeout. A client can also fail and say a
message should be retried. Any retry can be an out-of-order message,
even on a single client with a buffer size > 1.

On Mon, Feb 6, 2017 at 10:49 PM, Thomas Schanko
<thomas...@hotmail.com> wrote:
> If the consumer is single-threaded, it shouldn't matter, right? If the producer retries and it's only about ordering within each stream, we need to handle dedup? Am I missing something? To be honest, I haven't done anything substantial with geteventstore yet, so I'm just guessing how it works under the hood. Sending stuff in batches could complicate things, for instance...
>

Thomas Schanko

Feb 7, 2017, 11:10:55 AM
to DDD/CQRS
That makes sense. Thx for the clarification.

wilkins...@hotmail.com

May 31, 2018, 8:57:27 AM
to DDD/CQRS
Optimistic concurrency would solve this issue.

In other words, by utilizing some sort of versioning/sequencing on the events, the success of an insert or update query can be made contingent on the current version in the database being the previous version in the sequence.
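For example, a version-gated update against a SQL read model might look roughly like the sketch below (the table, columns, and Postgres-style placeholders are made up for illustration):

package optimistic

import (
    "context"
    "database/sql"
)

// ApplyCustomerRenamed updates the read model row only if it is currently
// at the version immediately before this event's version. If the event was
// already applied (or a later one got there first), no rows match.
func ApplyCustomerRenamed(ctx context.Context, db *sql.DB, id string, version int64, name string) (bool, error) {
    res, err := db.ExecContext(ctx,
        `UPDATE customer_view SET name = $1, version = $2
         WHERE id = $3 AND version = $4`,
        name, version, id, version-1)
    if err != nil {
        return false, err
    }
    n, err := res.RowsAffected()
    if err != nil {
        return false, err
    }
    return n == 1, nil
}

If the update affects no rows because the event is ahead of the stored version, the handler can park the event and retry it later, which is essentially the "store pending events" idea from the start of the thread.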

wilkins...@hotmail.com

May 31, 2018, 8:57:27 AM
to DDD/CQRS
Hi, I have a hypothetical situation, and I'd like to discuss how this applies to it:

Let's imagine I have an event store distributing versioned events, which I wish to project to 1 million different read models (silly numbers just for the sake of argument). Eventual consistency is fine, to a degree, but the users of the system would expect a relatively quick turnaround of the read models in response to the events. In other words, they are making a change in the UI, then navigating to some form of 'summary' view in which they expect their latest change to be visible.

- The number of physical servers / processing power is not a concern
- I do not want any single points of failure which would block this relatively quick turnaround (i.e. nothing should be on a single machine)
- Events are 'properly' designed such that they are descriptive and contain only the data necessary to effect that change (i.e. event ordering is important).

Isn't pub/sub with competing consumers a decent fit for this? 

For example, is a catch-up subscription a) distributable amongst multiple servers, b) sufficiently close to 'real time', and c) able to do this x 1 million without adding load onto the event store itself?

It feels to me that taking the events out of the event store itself and into a message bus for distribution (yes, from ordered into unordered) offers much more opportunity for scale, with competing consumers offering robustness against physical hardware issues and version numbers on the events themselves offering a realistic way to maintain order.

Am I missing some other option, or am I perhaps underestimating catch-up subscriptions?


On Thursday, December 22, 2016 at 3:59:00 PM UTC, Greg Young wrote:
Why are you using competing consumers and a pub/sub for a read model?
Usually it's far simpler to use something like a catch-up subscription
(client driven subscription) which assures ordering.

Alternatively, it is common to use sequence numbers to allow for
reordering, assuming a single source. With multiple sources, vector
clocks/Lamport sequences work reasonably well.

Greg

On Thu, Dec 22, 2016 at 2:30 PM, Erik Johansson
<erik.gusta...@gmail.com> wrote:
> Hi,
>
> Is there some idiomatic way of handling events that are received out of
> order in a read model (i.e. a read model on the receiving end of a pubsub
> which guarantees delivery at least once but not in order)?
>
> The only solution I've come across so far (which works for all but the most
> basic read models) is to store pending events in the read model and then
> apply them when the read model catches up.
>

Greg Young

May 31, 2018, 9:02:49 AM
to ddd...@googlegroups.com
Or just use a regular subscription with assured ordering, then build out caches on top (which is far simpler). I have to check whether it is still supported, but you used to be able to bring up ES nodes explicitly as clones, so they did not participate in the cluster but would asynchronously receive all events. Having many distributed read models was the *exact* use case for it.

As an example, we have our primary 3-node cluster in NYC (writes go here). We also have 2 clones + read models set up in Europe and 2 clones + read models set up in Asia. The EU/Asian read models are propagated off their local clones, not off the main cluster in NYC.

Does this make sense? 


wilkins...@hotmail.com

May 31, 2018, 4:51:37 PM
to DDD/CQRS
It does make sense - thank you.

Ben Kloosterman

Jun 1, 2018, 1:42:19 AM
to ddd...@googlegroups.com
We have a similar but different issue, where we have event stores at remote sites and we need the data uploaded (data needs to be collected when offline). This can be out of order, especially after a comms outage. The way we are tackling it is that when the upstream event store receives the message, I go back in time to the earliest message in the stream which is later than the events arriving, merge them (sorted by time) with the events arriving, and re-add the records. This forces projections to replay in the correct order. It does mean you have a message played twice. I considered deleting messages or rewriting the stream, but the negatives of both seem to be worse than having the duplicates.