Concepts for event sourcing on Kubernetes


Fabian Schmied

Dec 23, 2020, 8:42:58 AM
to DDD/CQRS
Hi there,

I wonder if anyone here has advice to share on running an event sourced system on Kubernetes with rolling updates or another zero downtime strategy such as blue/green deployments.

In particular, I'd like to know how you're dealing with backwards and forwards compatibility in the event store and the read models and such.

If you do rolling updates, do you, for example, ensure that newer releases of your software don't produce new versions of events until all pods have been updated? How does that work in practice? Isn't it very cumbersome/complex in development to always work in two phases (consumers first, producers last) when implementing a feature?

Do you also make your read model databases backwards/forwards compatible? Or do you have two versions of the read model running in parallel? What does the process of deploying a new release look like in this case?

If you do blue/green, you probably can't use the standard Kubernetes deployment strategies, so what does the deployment process look like for you? Have you got two event stores (blue/green), too? How do you implement the blue/green switch?

Any input, experiences, or pointers to suggested reading are welcome.

Thank you, best regards, happy holidays,
Fabian

Lee Hambley

Dec 23, 2020, 8:52:12 AM
to ddd...@googlegroups.com
A very short reply from me, but you can look at a feature of Avro (often used with Kafka, though Kafka isn't strictly necessary).

Avro comes with the option to use a "schema registry", which can enforce backward or forward compatibility rules at two levels of strictness; you can build your rollout plans around your preferred set of guarantees.

For example, the "backwards" compatibility guarantee (the non-transitive one) allows producers to deploy before consumers, provided the new fields in the records have default values; deploying new record types, though, requires that consumers be updated first.
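
For illustration, here is a minimal sketch of the resolution behaviour Lee describes (an old consumer decoding a record written with a newer schema), using fastavro. The event shape is invented; in a real setup the registry supplies the writer schema via the message's schema ID:

    import io
    from fastavro import parse_schema, schemaless_writer, schemaless_reader

    # The old consumer's schema (hypothetical event, for illustration only).
    reader_schema = parse_schema({
        "type": "record", "name": "OrderPlaced",
        "fields": [{"name": "order_id", "type": "string"}],
    })

    # The newer producer's schema: one added field, with a default value.
    writer_schema = parse_schema({
        "type": "record", "name": "OrderPlaced",
        "fields": [
            {"name": "order_id", "type": "string"},
            {"name": "channel", "type": "string", "default": "web"},
        ],
    })

    buf = io.BytesIO()
    schemaless_writer(buf, writer_schema, {"order_id": "42", "channel": "mobile"})
    buf.seek(0)

    # The old consumer resolves the writer schema against its own reader
    # schema and simply skips the field it doesn't know about.
    event = schemaless_reader(buf, writer_schema, reader_schema)
    assert event == {"order_id": "42"}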

A client of mine has been on Avro for about 4 years. Most deploys are very painless, unless someone didn't follow the rules (e.g., used build-time schema code generation rather than the schema registry, or deployed an incompatible schema, ignoring warnings about rollout strategy from our CI/CD).


Rickard Öberg

Dec 24, 2020, 1:35:05 AM
to ddd...@googlegroups.com
Hi,

We're not using Kubernetes, but we are using event sourcing with blue/green deployments, so I figured some description of what we do might still be useful.

First off, the setup: an EC2 cluster of 3 EventStore servers, replicated, and 3 app servers subscribing to the EventStores, each writing to a local Neo4j Community (no replication) graph database. Eventually consistent, but we have mechanisms in place to handle "read your own writes" (write events to ES, get the tx log index in the commit receipt, wait for the subscriber to catch up to that index before rendering results).
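
A minimal sketch of that wait-for-catch-up step; the store and read-model APIs here are hypothetical, not Rickard's actual code:

    import time

    def wait_for_catchup(read_model, commit_position, timeout=5.0, poll=0.05):
        """Block until the subscriber has projected events up to commit_position."""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            if read_model.last_processed_position() >= commit_position:
                return
            time.sleep(poll)
        raise TimeoutError(f"read model still behind position {commit_position}")

    # Hypothetical usage: append returns the tx log index in its receipt.
    # receipt = event_store.append(stream_id, events)
    # wait_for_catchup(read_model, receipt.position)
    # return read_model.query(...)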

On a major release deploy we then spin up a green environment with the same setup and start a subscriber that fetches events from the blue environment's ES. These are potentially transformed/migrated and written to the green ES cluster, and from there used to update the Neo4j databases. Once the green cluster is in live sync with blue, we switch the load balancers from the blue to the green servers.
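
A sketch of what such an import loop could look like; the store API, checkpoint name, and migrate hook below are hypothetical, not Rickard's actual code:

    import time

    def pump_events(blue_store, green_store, migrate, batch_size=500):
        """Tail the blue event store and replicate into green, migrating on the way."""
        position = green_store.load_checkpoint("blue-import")  # resume point
        while True:
            batch = blue_store.read_all(from_position=position, limit=batch_size)
            if not batch:
                time.sleep(0.5)  # caught up: poll for stragglers (live sync)
                continue
            for event in batch:
                for migrated in migrate(event):  # a migration may emit 0..n events
                    green_store.append(migrated.stream_id, migrated)
                position = event.position
            green_store.save_checkpoint("blue-import", position)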

That's the setup and mechanisms in place. With this we can do the following. 

The smallest use case is schema changes in Neo4j. We simply change the subscriber rules for converting events into Neo4j updates and perform a blue/green deploy. Pretty straightforward. This allows us to never do database migrations; we simply rebuild the database from scratch with the new schema.
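
For illustration, a subscriber rule of that shape might look like this with the official Neo4j Python driver; the event type and Cypher statement are invented:

    from neo4j import GraphDatabase  # official Neo4j Python driver

    driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "secret"))

    def project(event):
        # One rule per event type; changing a rule and doing a blue/green
        # rebuild replaces any in-place schema migration.
        if event.type == "UserRegistered":
            with driver.session() as session:
                session.run(
                    "MERGE (u:User {id: $id}) SET u.name = $name",
                    id=event.data["user_id"],
                    name=event.data["name"],
                )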

The next use case is migrations on the events as they are pulled from blue into green. These can be minor changes to values and fields of events, aggregations, replacements, etc., and give us a pretty straightforward way to change the event models without too much hassle. Examples for us include: changing product models to support more complex pricing, changing from product subscriptions to tier subscriptions, and aggregating various events like "user watched video X to position Y".
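
A sketch of what such migration rules could look like; the event types and the replace helper are invented for illustration:

    def migrate(event):
        """Hypothetical per-event rules applied during the blue -> green import."""
        if event.type == "ProductPriced":
            # Upcast: flat price -> a richer pricing structure (invented shape).
            data = dict(event.data)
            data["pricing"] = {"model": "flat", "amount": data.pop("price")}
            return [event.replace(type="ProductPricedV2", data=data)]
        if event.type == "VideoPositionChanged":
            # Dropped here; assumed aggregated into a summary event elsewhere.
            return []
        return [event]  # default: copy through unchanged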

The final and most extreme use case, which we are going through now, is creating an entirely new content model (we run a content subscription service) which is a superset and data-driven variant of the old model. Existing events relating to this are projected into both the old read model and the corresponding new model, with new and old REST APIs (JSON:API based) serving them in parallel. Once clients have moved to the new APIs, we can upgrade the write model to produce new events that create the same new read model, while removing the old write model/events and the corresponding read model/API.

This covers most variations of what we have done with this setup and mechanism. 

So far, in five years, this has allowed us to have zero downtime even with some pretty substantial changes to the system and models as a whole. It's not for everyone, but it has worked great in our situation. What's interesting is that we can scale the number of Neo4j Community database instances up to basically any number, with no increase in license costs (i.e., zero). Pretty handy.

Hope this was helpful, Rickard

Fabian Schmied

Dec 30, 2020, 6:51:56 AM
to ddd...@googlegroups.com
Hi Lee,

Thanks for the pointer. It's certainly a good idea to enforce compatibility rules; I'll look at the concepts implemented by Avro.

I'm also very interested in the practical effects of using such a system on feature development. From your (client's) point of view, does it have any relevant effects on feature development and domain model evolution if the write model always needs to remain backwards compatible (for a certain amount of time)? What does "backwards compatible" even mean when the write model implements a new feature or, worse, modifies an existing one?

Concrete example (assuming an event-sourced system, which of course may not be what your client uses): Say I have an aggregate A and I modify an existing use case so that instead of event E1, event E2 is now raised. While deploying the update, there are instances of the old and the new write model in production. The old version will not understand the new event E2, so what do I do to remain backwards compatible? Keep the old implementation in the new version, adding a feature flag to switch to the new implementation only after all write model instances have been updated? (Including automated tests, configuration, etc.) In theory, this sounds complex to me, but what does it feel like in practice?

Best regards,
Fabian


Fabian Schmied

Dec 30, 2020, 7:15:16 AM
to ddd...@googlegroups.com
Hi Rickard,

Thanks a lot for the detailed description. In fact, this is extremely similar to what we do right now - also running a VM-based setup with blue/green deployments that include separate blue and green event stores and read model databases. We also use event subscriptions to build up the destination color's store/databases before switching colors. We're also quite satisfied with it, although I'm somewhat wary of how far this is going to scale - as the number of events grows, new deployments are going to take longer and longer, of course. (We do have some ideas about getting the number of live events down by archiving certain domain entities after some time, effectively removing them from the event store and moving them to a read-only archive representation of their final state. But it will still be an ever-growing system...)

I've posted my original questions because of how different this seems from what Kubernetes itself suggests, e.g., with its rolling update deployment strategy. Also, most writeups I've found on blue/green or rolling updates with Kubernetes focus only on the pods (services), ignoring the database (compatibility) aspects. And with event sourcing, a different kind of compatibility comes into the picture anyway.

We're probably going to just translate what we currently have on VMs into the Kubernetes world, i.e., not use rolling updates etc., script everything manually, and hope that switching colors on Kubernetes load balancers works as well as it does on ordinary load balancers. However, I'd still be very interested in hearing from people who've already tried this - our way or in a different way.

Best regards,
Fabian


Ben Kloosterman

Dec 31, 2020, 2:17:21 AM
to ddd...@googlegroups.com
I'd always have a separate Kubernetes cluster for stateful services with data on-cluster, unless you're tiny. The ability to just blow away a cluster and redeploy without a migration plan if there are issues is priceless, and you can't do that easily once you have data on the cluster.

Normally you'd do blue/green on the stateless services first. Stateless services have most of the logic and call the stateful services, which build events into state; the stateful services are fairly low-level, change much less frequently, and are harder to do blue/green on. I would not move to blue/green on the stateful side until the stateless services are working well - most systems don't progress past this.

Regards,

Ben

Lee Hambley

Jan 5, 2021, 9:10:59 AM
to ddd...@googlegroups.com

> Thanks for the pointer. It's certainly a good idea to enforce compatibility rules; I'll look at the concepts implemented by Avro.

My pleasure, Fabian. Of all the systems I've used, Avro really has something unique in this regard.
> I'm also very interested in the practical effects of using such a system on feature development. From your (client's) point of view, does it have any relevant effects on feature development and domain model evolution if the write model always needs to remain backwards compatible (for a certain amount of time)? What does "backwards compatible" even mean when the write model implements a new feature or, worse, modifies an existing one?

We're not exclusively a CQRS/ES platform, but we are heading in that direction. As is typical for most companies, we started with an RDBMS and then added message buses and "model logs" to have something resembling audit trails. By now I think it's ~70-80% possible for us to reconstruct the lifecycle of objects based on the events we observe, but the large engineering team is very reluctant to adopt CQRS wholeheartedly, which I can to some extent understand. We often ship new features considering only the RDBMS factors, and then get chased by the business operations teams and data scientists asking why the change is not reflected in the audit trail.

> Concrete example (assuming an event-sourced system, which of course may not be what your client uses): Say I have an aggregate A and I modify an existing use case so that instead of event E1, event E2 is now raised. While deploying the update, there are instances of the old and the new write model in production. The old version will not understand the new event E2, so what do I do to remain backwards compatible? Keep the old implementation in the new version, adding a feature flag to switch to the new implementation only after all write model instances have been updated? (Including automated tests, configuration, etc.) In theory, this sounds complex to me, but what does it feel like in practice?

So "schema compatibility" (`BACKWARDS` in Avro terms, the one I would recommend people use) simply means that already existing clients can decode messages sent by newer clients. Often times this means you can simply add a _new_ field to an existing record provided it has a default value, the existing deployed consumers simple "do not see" the field. Adding new messages (records), or changing the semantic meaning is not endorsed by Avro's schema registry, for this some element of coordination from your side is required. In practice we are usually not adding "new features" (read: new messages/records) to the app, but refining existing features, or passing along more metadata that we capture at the time of the action, or something else, so the Avro trade-off suits us well, consumers are on the 80% case not blocked by complicated coordinated deploys, and in cases when we need to deploy a "breaking change", we use feature flags in our application to wrap some combination of old/new/both behaviour, so we can get the code reviewed and through all our release procedures, and into all our different environments, and then simply toggle a switch when we are ready, which so far has worked great for us, sure we have a lot of old feature flag code hanging around, but we do clean that up periodically.

Ahoy, greetings from Hamburg,

Harrison Brown

Jan 10, 2021, 6:57:14 PM
to DDD/CQRS
Thanks for that detailed reply, Rickard. May I ask a couple of questions?
  1. You switch to green when "green cluster is in live sync with blue". Does this mean that you're copying events from blue's ES to green's ES? If so, how do you handle the potential problem that new events could be written by blue just as you're about to switch to green? It sounds like there's an issue making the switch in an atomic way to guarantee green has all the events before switching. 
  2. I can see that it's a good thing for green to be processing events so it can get its read models up to date before switching, but what about projections with side effects such as sending an email to a customer or writing data to an external system? How do you tell green that it should process the event stream into read projections but not process it to send emails (because blue will have already done those actions with side effects and we wouldn't want to do them again)?
  3. Do you do a full blue/green deploy for all deployments or only for major changes? I can think of a lot of changes which can be made as an addition rather than a change (and therefore don't benefit from blue/green). E.g., in my team we often write a new projection, deploy it and let it get up to date, then deploy a change to the UI that starts using it, then deploy a change that removes the old projection — this strategy avoids having to coordinate live changes. Do you do something similar or do you just do a blue/green for all deployments? If you do, does that not cause a bottleneck (as I presume a full blue/green is slower than just pushing new code to the app servers)?

Rickard Öberg

Jan 10, 2021, 7:22:17 PM
to ddd...@googlegroups.com

On 11 Jan 2021, at 07:57, Harrison Brown <harr...@surupartners.com> wrote:
> Thanks for that detailed reply, Rickard. May I ask a couple of questions?
> 1. You switch to green when "green cluster is in live sync with blue". Does this mean that you're copying events from blue's ES to green's ES? If so, how do you handle the potential problem that new events could be written by blue just as you're about to switch to green? It sounds like there's an issue making the switch in an atomic way to guarantee green has all the events before switching.
We switch the load balancer from blue to green on live sync, but the import keeps running, so any straggling events are copied over.
> 2. I can see that it's a good thing for green to be processing events so it can get its read models up to date before switching, but what about projections with side effects such as sending an email to a customer or writing data to an external system? How do you tell green that it should process the event stream into read projections but not process it to send emails (because blue will have already done those actions with side effects and we wouldn't want to do them again)?
We do all of those integrations (and we have lots of them) on event write (write the event to ES, do the side effects in the same thread), so we don't have that problem. If you do have it, I think the way to go would be for the processor to keep track of which timestamp or event id was last processed, so that upon seeing those events again it will just ignore them. But that's just theory, since I personally don't have that particular problem.
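
That theory, sketched with a hypothetical checkpoint API:

    def handle_side_effect(event, checkpoints, send_email):
        # Skip anything at or below the last processed position, so that green
        # replaying the stream does not repeat what blue already did.
        if event.position <= checkpoints.get("email-sender", -1):
            return
        send_email(event)
        checkpoints.set("email-sender", event.position)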

> 3. Do you do a full blue/green deploy for all deployments or only for major changes? I can think of a lot of changes which can be made as an addition rather than a change (and therefore don't benefit from blue/green). E.g., in my team we often write a new projection, deploy it and let it get up to date, then deploy a change to the UI that starts using it, then deploy a change that removes the old projection — this strategy avoids having to coordinate live changes. Do you do something similar or do you just do a blue/green for all deployments? If you do, does that not cause a bottleneck (as I presume a full blue/green is slower than just pushing new code to the app servers)?

So in previous emails I listed the most common cases where we do a b/g deploy. The smallest change would be a Neo4j schema change in the read model. If something is an addition, then we typically don't do b/g deploys, yeah.

Currently a full b/g takes about half a day, so yeah, slower than a regular deploy. But not to the point where it's a big thing.

Cheers, Rickard

Fabian Schmied

Jan 17, 2021, 5:18:39 AM
to ddd...@googlegroups.com
Hi Rickard,

I have a follow-up question on your reply to Harrison:

> We switch load balancer from blue to green on live sync, but the import keeps running so any straggling events are copied over.

In your system, is it possible that out-of-order events within a single event stream/aggregate could arise during the sync?

So, for example:
- B produces E1 in an aggregate stream A.
- Import copies E1 to G's A stream.
- G produces E2 in A.
- B produces E2' in A.
- Import copies E2' to G's A stream.

Result: G's A contains E1, E2, E2', although aggregate A's invariants might not allow E2' to come after E2.

If you have that situation, how do you deal with that?

In our system, we prevent this by ensuring the following order of steps when switching from B to G:
- Stop all services in B
- Switch load balancers from B to G
- Start all services in G

However, this means that the load balancer switch _must_ be synchronous, and it also means that there is a downtime whose length is determined by how fast these steps can be performed. For our switch to Kubernetes, I feel we'd align much better with its philosophy if we allowed B and G to run in parallel (for a short time), but that means we need a way to deal with the situation above. So, if you have one, I'd be very interested in it :).
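
One conceivable shape for such a mechanism, sketched against a hypothetical event-store API with optimistic concurrency: the import appends with an expected version, so a straggler that would land out of order fails loudly instead of silently violating the aggregate's invariants.

    class VersionConflict(Exception):
        """Raised by the (hypothetical) store when expected_version doesn't match."""

    def import_straggler(green_store, event):
        try:
            # Expect the event to land at exactly the position it held in blue.
            green_store.append(
                event.stream_id, event,
                expected_version=event.stream_position - 1,
            )
        except VersionConflict:
            # Green already wrote its own event into that slot (E2 vs. E2');
            # park the blue event for reconciliation instead of interleaving it.
            quarantine(event)  # placeholder for whatever reconciliation queue exists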

Best regards,
Fabian


Rickard Oberg

Jan 17, 2021, 6:11:29 AM
to ddd...@googlegroups.com
Hey,

Good question! For us, we simply don't have a case where that would be an issue (it's just last-one-wins anyway), and yes, technically it can happen.

So to a large extent it comes down to designing events such that these theoretical conflicts don’t cause problems in practice. Being able to issue reversals in case a corner case issue comes up is also important. 

Our customer support is great at dealing with stuff like that anyway, with real-life messiness, so I don't worry too much about technical perfection in general. YMMV as always.

/Rickard


Fabian Schmied

Jan 17, 2021, 6:24:04 AM
to ddd...@googlegroups.com
> So to a large extent it comes down to designing events such that these theoretical conflicts don’t cause problems in practice.

Yeah, I guess there won't be a general solution to that problem. In our domain, it's not that easy to deal with conflicts caused by the write model, so in the end, it probably boils down to: Is it worth the effort? (Vs. having a short downtime.) :)

Thanks, Rickard, for the insights from your experience.

If anyone else has thoughts on this as well, I'd like to hear them, too!

Best regards,
Fabian

Harrison Brown

Jan 18, 2021, 10:00:28 AM
to ddd...@googlegroups.com
+1 on Fabian's follow-up question. I was going to ask that too.

(My initial thought is that one should be able to model things happening slightly out of order, which is a good thing in general for handling edge cases. Over time, we've definitely moved towards looser validation (in the UI) and relied more on being able to recover from things happening out of order.)

Harrison



Fabian Schmied

Jan 18, 2021, 2:52:48 PM
to ddd...@googlegroups.com
> My initial thought is that one should be able to model things happening slightly out of order, which is a good thing in general for handling edge cases.

I don't know, really. Of course, living in a distributed world becomes a lot easier when aggregates in the write model allow out-of-order events. On the other hand, being able to rely on consistency and invariants within an aggregate makes reasoning about the business logic much easier, I think. It's the reason DDD has aggregates in the first place.

This probably depends a lot on the domain.

Regards,
Fabian
