Consume event stream without Pub/Sub

Raymond Rutjes

Feb 9, 2016, 6:26:05 AM
to DDD/CQRS
At the DDD Europe conference, I realized that the speakers I talked with were avoiding Pub/Sub whenever possible.

When using Pub/Sub, you have to take care of:
- Verifying the order of the events on the consuming side (if order matters)
- Implementing an error queue for all events that couldn't be handled
- Designing a replay strategy involving multiple BCs for rebuilding read models

Jef Claes and Yves Reynhout pointed me in another direction: consuming the event stream by pulling it instead of subscribing to it.
What I like in this approach is:
- You natively get the events in the right order
- There is no more coupling to the pub/sub bus
- You can easily rebuild your projections by clearing the consumer local history
- You can live in a world of eventually connected BCs
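
To make the pull-and-checkpoint idea concrete, here is a minimal sketch. It assumes an in-memory stand-in for the producer's stream; `InMemoryEventStore`, `PullingConsumer`, `read_from`, and `reset` are hypothetical names invented for illustration, not part of any real library. A real consumer would persist the checkpoint alongside its projection.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Event = Tuple[int, dict]  # (position in the stream, payload)


class InMemoryEventStore:
    """Hypothetical stand-in for the producer BC's event stream."""

    def __init__(self) -> None:
        self._events: List[dict] = []

    def append(self, payload: dict) -> None:
        self._events.append(payload)

    def read_from(self, position: int, limit: int = 100) -> List[Event]:
        # Events come back in stream order, starting at `position`.
        chunk = self._events[position:position + limit]
        return [(position + i, payload) for i, payload in enumerate(chunk)]


@dataclass
class PullingConsumer:
    store: InMemoryEventStore
    handle: Callable[[dict], None]
    checkpoint: int = 0  # next position to read; kept local to the consumer

    def poll(self) -> int:
        """Pull everything after the checkpoint; return how many events were handled."""
        handled = 0
        while True:
            batch = self.store.read_from(self.checkpoint)
            if not batch:
                return handled
            for position, payload in batch:
                self.handle(payload)
                self.checkpoint = position + 1  # advance only after handling
                handled += 1

    def reset(self) -> None:
        """Forget local history so the next poll rebuilds from the start."""
        self.checkpoint = 0
```

Calling `poll()` on a timer gives ordered consumption with no bus involved, and `reset()` followed by `poll()` is the whole rebuild strategy.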

The only situation where I see this not working is when you need some kind of real-time data projection. But in general, the business does not care about a delay of a few seconds.

I would love to have some feedback on implementations with the kind of pulling strategy I just described.
Is there a name for that pulling, local checkpointing strategy?

Greg Young also mentioned that he didn't care too much about duplicating event stores across BCs. Events are facts, and they will not change once they have happened.
I was wondering about creating a projection of the event store on the consumer side to reduce the actual pulling when rebuilding.

Does anyone have a piece of infra code to recommend for this? Would you serve the event streams over a REST API in the application layer of the producer BC?





Kijana Woodard

Feb 9, 2016, 10:29:47 AM
to ddd...@googlegroups.com
"The only situation where I see this not working is when you need some kind of real-time data projection. But in general, the business does not care about a delay of a few seconds."

It probably wouldn't even take that long.

See GES [1] for one example of how to do this.

Also note that there are cases where a fan-out is preferable to working the events "in order". These cases are typically at system boundaries: sending emails, charging credit cards, etc. You wouldn't want a failure to process one message to block the rest from being processed. You likely want retries, poison message handling, etc.

You could use a bus there [e.g. NSB] or use a competing consumers implementation [GES has one].
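
As a rough illustration of retries with poison message handling, here is a minimal sketch. `process_with_retries` and its parameters are hypothetical names for this example; it is not how NSB or GES implement this, only the general shape of the idea that one failing message must not block the others.

```python
from typing import Callable, List, Tuple


def process_with_retries(
    messages: List[dict],
    handler: Callable[[dict], None],
    max_attempts: int = 3,
) -> Tuple[List[dict], List[dict]]:
    """Handle each message independently; return (succeeded, poisoned)."""
    succeeded: List[dict] = []
    poisoned: List[dict] = []
    for message in messages:
        for attempt in range(1, max_attempts + 1):
            try:
                handler(message)
                succeeded.append(message)
                break
            except Exception:
                if attempt == max_attempts:
                    # Give up and park the message instead of blocking the others.
                    poisoned.append(message)
    return succeeded, poisoned
```

The poisoned list plays the role of an error queue: someone can inspect and resubmit those messages later without holding up the rest.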

[1] http://docs.geteventstore.com/introduction/subscriptions/ [notably catch-up subscriptions]

--
You received this message because you are subscribed to the Google Groups "DDD/CQRS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dddcqrs+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Kyle Cordes

Feb 9, 2016, 10:30:40 AM
to ddd...@googlegroups.com
On February 9, 2016 at 5:26:07 AM, Raymond Rutjes (raymond...@gmail.com) wrote:

> At the DDD Europe conference, I realized that the speakers I talked with were avoiding Pub/Sub whenever possible.
>
> When using Pub/Sub, you have to take care of:
> - Verifying the order of the events on the consuming side (if order matters)
> - Implementing an error queue for all events that couldn't be handled
> - Designing a replay strategy involving multiple BCs for rebuilding read models
>
> Jef Claes and Yves Reynhout pointed me in another direction: consuming the event stream by pulling it instead of subscribing to it.
> What I like in this approach is:


It took us a while to understand this also, implementing ES systems over the last few years. But here’s the heart of it: pub/sub is an “attractive nuisance” for anyone trying to implement an event sourced system. It seems like a relevant technology, it seems like something you should grab and shove into your software and try to use to implement ES. But, it is not actually useful, at least not in an obvious way, for getting the behaviors people often want out of an ES system.

Sorry to flog these blog posts… but several of them are the direct result of us (where I work) getting around the initial notion of using a queue or pub sub.

http://blog.oasisdigital.com/category/cqrs/

Greg Young’s event store has the features you would expect from an event store; it is similar to a queue but not the same. I think I’ve also seen some links floating around this mailing list to a community effort to build such a thing on top of PG. Regardless of whether you pick one of these or build your own, once you make the leap from pub/sub/queue to event store, it all gets much easier to talk about, reason about, implement, and administer!


--
Kyle Cordes
http://kylecordes.com


Raymond Rutjes

Feb 9, 2016, 11:43:21 AM
to DDD/CQRS
You reassure me in my thinking.
Thanks for sharing the GES catch-up subscription feature; this is just what we need when building read models.
When it comes to critical, "unreplayable" tasks like sending emails or charging credit cards, I still wonder whether Pub/Sub is the way to go. As Udi says, it really should be called SubPub, and in that regard I'm not comfortable integrating it, because it assumes that subscribers subscribe before the publisher publishes. When I model my BCs, I tend not to concern myself with how the messages will be used outside the BC, so IMO pairing SubPub with bounded contexts looks like an impedance mismatch.

Greg Young

Feb 9, 2016, 11:52:01 AM
to ddd...@googlegroups.com
Sending emails etc is why we also support competing consumers.
--
Studying for the Turing test

Raymond Rutjes

Feb 9, 2016, 11:56:16 AM
to DDD/CQRS
Thanks Kyle for the links,

Please correct me if I get this wrong, but I feel like GES is both an event store and a way to replace the messaging layer for your bounded contexts if you are using a pull-only strategy with catch-up subscriptions.

One thing I do not like, though, is the direct access to the event store, because now multiple BCs (microservices) will be subscribing to some piece of infra, for instance GES.
But I don't care too much, because we have the same problem with message brokers.

Can you think of any good reasons to keep sending eventually consumed messages over the wire?

I definitely have to dig a bit further into GES and the different kinds of subscriptions.

Kyle Cordes

Feb 9, 2016, 12:03:34 PM
to ddd...@googlegroups.com
On February 9, 2016 at 10:56:19 AM, Raymond Rutjes (raymond...@gmail.com) wrote:
> One thing I do not like, though, is the direct access to the event store, because now multiple BCs (microservices) will be subscribing to some piece of infra, for instance GES.
> But I don't care too much, because we have the same problem with message brokers.
>
> Can you think of any good reasons to keep sending eventually consumed messages over the wire?


I couldn’t parse exactly what you meant, but here are a couple of bits about our implementation.

It seems potentially wasteful to send out each new event from an event store to every subscriber, especially if there are a lot of subscribers. It is also frustrating to have subscribers poll an event store, and slightly complex to implement many subscribers keeping an open connection to be notified of new events. So we took a different approach: an optimization that uses JGroups to multicast events out. A monotonic sequence number enables every recipient to make sure that it is still strictly in order, and in case of a missed message, any subscriber can ask the event store for what it needs at any time.
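
The sequence-number check can be sketched roughly as follows. This is not our JGroups code; `SequencedSubscriber`, `on_multicast`, and `fetch_range` are hypothetical names, and the sketch assumes sequence numbers start at 1 and that the store can return any inclusive range on request.

```python
from typing import Callable, List


class SequencedSubscriber:
    """Applies events strictly in sequence order, repairing gaps from the store."""

    def __init__(self, fetch_range: Callable[[int, int], List[dict]]) -> None:
        # fetch_range(first, last) is a hypothetical call to the event store
        # returning the payloads for sequence numbers first..last inclusive.
        self._fetch_range = fetch_range
        self.next_seq = 1
        self.applied: List[dict] = []

    def on_multicast(self, seq: int, payload: dict) -> None:
        if seq < self.next_seq:
            return  # duplicate or late delivery: already applied
        if seq > self.next_seq:
            # One or more messages were missed: pull them from the store.
            for missed in self._fetch_range(self.next_seq, seq - 1):
                self.applied.append(missed)
        self.applied.append(payload)
        self.next_seq = seq + 1
```

The multicast is only a fast path here; the store remains the source of truth, so a lossy or reordering transport cannot corrupt the subscriber's view.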

It also seems potentially wasteful to transmit a potentially very large number of events to a subscriber, and then transmit them all again if that subscriber is a CQRS projection that is restarted from time to time. It is straightforward to implement local caching of such events. As Greg points out repeatedly here, in his videos, and everywhere else, events never change and therefore are utterly cacheable.
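
A minimal sketch of such a local cache, assuming a remote call that returns all events at or after a given position. `CachedEventReader` and `fetch_from` are hypothetical names for this example, and a real cache would live on disk rather than in memory.

```python
from typing import Callable, List


class CachedEventReader:
    """Keeps a local copy of events already fetched from the remote store."""

    def __init__(self, fetch_from: Callable[[int], List[dict]]) -> None:
        # fetch_from(position) is a hypothetical remote call returning all
        # events at or after `position`, in order.
        self._fetch_from = fetch_from
        self._cache: List[dict] = []

    def read_all(self) -> List[dict]:
        # Events never change, so cached entries are always valid; only the
        # tail beyond the cache has to come over the wire.
        self._cache.extend(self._fetch_from(len(self._cache)))
        return list(self._cache)
```

A projection that restarts calls `read_all()` again and pays only for the events appended since its last run.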

Stijn Volders

Feb 9, 2016, 12:10:27 PM
to ddd...@googlegroups.com

This might be an ignorant remark, but isn't exposing your event store to multiple BCs the same as sharing your database?

IMO, communication between BCs depends on the kind of relationship between them.

Stijn

Greg Young

Feb 9, 2016, 12:18:11 PM
to ddd...@googlegroups.com
GES and its subscribers use a push model if the subscribers are caught
up, not a pull model. If a subscriber falls behind, it resorts to the
pull model automatically (and switches back automatically once it has
caught up).



Raymond Rutjes

Feb 9, 2016, 1:57:39 PM
to DDD/CQRS
> Might be an ignorant remark but is exposing your event store to multiple BCs not the same as sharing your database?
That is exactly what I asked myself, and I came to the conclusion that it is not a problem in the case of event stores. Event stores are append-only, so we are sure the data remains consistent over time. Events are facts: they happened and won't change. In that regard, we don't mind sharing direct read access to the event store.
Anyway, if we need different BCs to cooperate, we will have some kind of coupling. I'd rather couple the BCs directly to GES than introduce an additional messaging layer.
Traditional messaging tools were not designed to address the DDD paradigms, and they need a lot of work to ensure that our application as a whole remains stable and fault tolerant.
My hope is to find out that GES simplifies the communication issues I have faced until now.


> GES and its subscribers use a push model if the subscribers are caught
> up, not a pull model. If a subscriber falls behind it resorts to the
> pull model automatically (and switches back if caught up
> automatically)
Thanks for the clarification.
Greg, may I ask whether you also use message buses, or do you rely purely on GES for the actual communication between your services?

@yreynhout

Feb 9, 2016, 4:00:46 PM
to DDD/CQRS
- Depends on the infrastructure you're using. If it is https://geteventstore.com/, I believe it has all you are looking for: exposing event feeds using AtomPub, competing consumers for fan-out, and catch-up subscriptions that dynamically switch between pull and push under the hood (you could easily spool all relevant events to storage local to the projection if need be).
- I would argue that even real time projections have latency to deal with - even though you want to keep the time between event production and projection down to a minimum, network, storage and compute costs are not 0. Obviously, low latency solutions are different beasts from the run of the mill read model projections, I acknowledge.
- While BCs are logical units, they most often manifest themselves as a bunch of EC services. You may want to watch out for what you're coupling yourself to intra-BC. Contracts will require a different pace of versioning in that area.

Raymond Rutjes

Feb 9, 2016, 4:19:31 PM
to DDD/CQRS
@yreynhout could you clarify what you mean by EC services? I'm afraid I don't get the acronym!
> You may want to watch out for what you're coupling yourself to intra-BC. Contracts will require a different pace of versioning in that area.
Could you elaborate a bit here?

João Bragança

Feb 9, 2016, 4:42:22 PM
to ddd...@googlegroups.com
EC = Eventually Consistent

> Could you elaborate a bit here?

Normally you'd have one development team per bounded context. Naturally each team will release their software at different rates. So, you need to have established contracts between all of them. TL;DR: Inside the boundary, do whatever you want. On the edge you need to be more careful, because others will have a dependency on you.


Danil Suits

Feb 9, 2016, 4:53:17 PM
to DDD/CQRS

> It seems potentially wasteful to send out each new event from an event store to every subscriber, especially if there are a lot of subscribers. It is also frustrating to have subscribers poll an event store, and slightly complex to implement many subscribers keeping an open connection to be notified of new events. So we took a different approach: an optimization that uses JGroups to multicast events out. A monotonic sequence number enables every recipient to make sure that it is still strictly in order, and in case of a missed message, any subscriber can ask the event store for what it needs at any time.


This pairs up neatly with something I had been puzzling over recently.

My language for it was that we aren't subscribing to the events as such, but rather subscribing to the changes in the cursors. Constrain the cursors to be monotonically increasing, and it's really easy for the subscribers to simply ignore any incorrect ordering of the cursor update stream. At any time, they can pull from the event cache all events up to the high-water mark.
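
In code, the cursor-subscription idea might look like this minimal sketch. `CursorFollower`, `on_cursor`, and `read_range` are hypothetical names; the sketch assumes the cursor is the count of events available and that the event cache can serve any half-open range of positions.

```python
from typing import Callable, List


class CursorFollower:
    """Consumes events by reacting to cursor updates rather than to events."""

    def __init__(self, read_range: Callable[[int, int], List[str]]) -> None:
        # read_range(start, end) is a hypothetical read of the event cache
        # returning the events at positions start..end-1.
        self._read_range = read_range
        self.position = 0  # number of events already consumed
        self.consumed: List[str] = []

    def on_cursor(self, high_water_mark: int) -> None:
        if high_water_mark <= self.position:
            return  # stale or reordered update: nothing new to pull
        self.consumed.extend(self._read_range(self.position, high_water_mark))
        self.position = high_water_mark
```

Because the notification carries no payload, a dropped or reordered update costs nothing: the next one that arrives covers everything in between.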

Same idea, or slightly different?

Danil

Greg Young

Feb 9, 2016, 5:15:38 PM
to ddd...@googlegroups.com
Same idea. Linearization.


