Why even use event replay instead of snapshots all the time?

Benjamin Eberlei

Aug 18, 2012, 12:11:21 PM

to ddd...@googlegroups.com

Hello everyone,

I have been exposed to CQRS for roughly a month now, using it in a platform migration project. The project takes the pragmatic approach to CQRS: no event sourcing, and an ORM on the write layer, as described here: http://codeofrob.com/entries/cqrs-is-too-complicated.html

Now I wanted to try the event sourcing part in a small side project, and I am realizing the complexities of the event + apply cycle in aggregate roots.

1. It's a very new approach to me, which accounts for part of my difficulties.
2. You have to be very careful to modify state only in apply*() methods.
3. Storing events is difficult when you have references to other aggregate roots (even with just ids, you still need the version of that aggregate root at the moment the event was fired).
4. You have to test both the behavior and that replaying the events leads to the same state.
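A minimal sketch of the event + apply cycle from points 2 and 4, in Python for brevity. The `Cart` aggregate, its events, and every method name are hypothetical and not taken from any framework; the point is only that public methods validate and raise events, `_apply_*` methods are the sole place where state changes, and a replay can be checked against the live object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemAdded:
    sku: str
    quantity: int


@dataclass(frozen=True)
class ItemRemoved:
    sku: str


class Cart:
    """Hypothetical aggregate root; all names here are illustrative."""

    def __init__(self):
        self.items = {}        # state, only touched by _apply_* methods
        self.uncommitted = []  # events raised since the last save

    # Public behavior: validate, then raise an event.
    def add_item(self, sku, quantity):
        if quantity <= 0:
            raise ValueError("quantity must be positive")
        self._raise(ItemAdded(sku, quantity))

    def remove_item(self, sku):
        if sku not in self.items:
            raise ValueError("unknown sku")
        self._raise(ItemRemoved(sku))

    # Event + apply cycle.
    def _raise(self, event):
        self._apply(event)
        self.uncommitted.append(event)

    def _apply(self, event):
        # Dispatch by event class name, e.g. ItemAdded -> _apply_ItemAdded.
        handler = getattr(self, "_apply_" + type(event).__name__)
        handler(event)

    def _apply_ItemAdded(self, event):
        self.items[event.sku] = self.items.get(event.sku, 0) + event.quantity

    def _apply_ItemRemoved(self, event):
        del self.items[event.sku]

    @classmethod
    def from_history(cls, events):
        cart = cls()
        for event in events:
            cart._apply(event)  # replay: no validation, nothing re-raised
        return cart


cart = Cart()
cart.add_item("book", 2)
cart.add_item("pen", 1)
cart.remove_item("pen")

# Point 4: replaying the recorded events must lead to the same state.
replayed = Cart.from_history(cart.uncommitted)
assert replayed.items == cart.items
```

The final assertion is the kind of test point 4 asks for: behavior is exercised through the public methods, and the replayed object is compared with the live one.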

This got me wondering whether replaying events actually makes sense as the way to load an entity's state. Greg talks about snapshotting in his video lecture, and I think the complexity of event sourcing is considerably reduced if you snapshot all the time and store the events only as an audit log. With MongoDB or CouchDB as data storage and a JPA-like mapper on top, this isn't very difficult, and you instantly remove the complexity of the event + apply method cycle. You no longer need apply*() methods, nor the tests for loading entities from events.

Instead you just have a method "raise(DomainEvent)".
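A rough sketch of what this could look like, assuming an in-memory stand-in for the document store. `Cart`, `DocumentStore`, and `raise_event` are hypothetical names (`raise` itself is a reserved word in Python), and no real CouchDB or MongoDB API is used.

```python
import json


class Cart:
    """Hypothetical aggregate: state changes directly, events are only recorded."""

    def __init__(self, cart_id, items=None):
        self.cart_id = cart_id
        self.items = dict(items or {})
        self.recorded = []

    def add_item(self, sku, quantity):
        # State is modified in the public method itself; no apply*() needed.
        self.items[sku] = self.items.get(sku, 0) + quantity
        self.raise_event({"type": "ItemAdded", "sku": sku, "quantity": quantity})

    def raise_event(self, event):
        self.recorded.append(event)


class DocumentStore:
    """In-memory stand-in for a document database (an assumption of this sketch)."""

    def __init__(self):
        self.snapshots = {}  # one serialized document per aggregate id
        self.audit_log = []  # events kept for auditing, never replayed

    def save(self, cart):
        self.snapshots[cart.cart_id] = json.dumps({"items": cart.items})
        self.audit_log.extend(cart.recorded)
        cart.recorded = []

    def load(self, cart_id):
        data = json.loads(self.snapshots[cart_id])
        return Cart(cart_id, data["items"])


store = DocumentStore()
cart = Cart("c1")
cart.add_item("book", 2)
store.save(cart)
loaded = store.load("c1")  # state comes from the snapshot, not from events
```

Note that nothing here guarantees the audit log is complete enough to rebuild the state; the snapshot is the only source of truth.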

What do you think?

greetings,
Benjamin

Greg Young

Aug 18, 2012, 1:40:01 PM

to ddd...@googlegroups.com

A single question:

If I have apply methods and I want to change what events mean to me, what is required in order to make my change? And what is required with snapshots?

Greg

--
Le doute n'est pas une condition agréable, mais la certitude est absurde.

Benjamin Mapp

Aug 18, 2012, 2:01:39 PM

to ddd...@googlegroups.com

Well, of course you can do that, but then you are not really event sourcing. You have no guarantee that your events could rebuild your domain, and you lose many of the advantages that ES provides.

That said, if those things do not matter, then sure, use domain state storage. Personally I don't think having to change state in apply is that onerous, but every implementation and requirement is different.


Greg Young

Aug 18, 2012, 2:32:16 PM

to ddd...@googlegroups.com

Btw what is "complexity" of public behavior vs method for changing state? I do this normally even if not using event sourcing.


Benjamin Eberlei

Aug 18, 2012, 6:07:48 PM

to ddd...@googlegroups.com

Thanks for your feedback,

You are right in your first email that this makes rebuilding a model from events either impossible or much more complicated. But if you raise events for all changes, it is still possible by having an outside service that triggers the model changes. Think of an event replayer object that gets all events and rebuilds a domain object according to them, but within your new world, reaching a new snapshot at the end. This would only work if you don't "forget" to include some information in the events, though. The same applies to not forgetting something in apply*() methods, however.

On Sat, Aug 18, 2012 at 8:32 PM, Greg Young <gregor...@gmail.com> wrote:

> Btw what is "complexity" of public behavior vs method for changing state? I do this normally even if not using event sourcing.

I fully agree that changing state "internally" compared to through setters is the better approach, no disagreement here.

Benjamin Eberlei

Aug 18, 2012, 6:18:41 PM

to ddd...@googlegroups.com

On Sat, Aug 18, 2012 at 8:01 PM, Benjamin Mapp <ben....@gmail.com> wrote:

> Well, of course you can do that, but then you are not really event sourcing. You have no guarantee that your events could rebuild your domain, and you lose many of the advantages that ES provides.

Yes, I wouldn't claim I was doing event sourcing in this case, explicitly not. It is just the Domain Event pattern (http://martinfowler.com/eaaDev/DomainEvent.html) to decouple different use cases and dependencies of the application, but specifically not to rebuild the domain model. If I still raised events for all changes, I could rebuild the domain, but I would need a new set of objects for this, essentially moving the apply*() methods into a distinct object for replaying, if that was ever necessary. But in my case I derive the most value from the accountability that events give me.

> That said, if those things do not matter, then sure, use domain state storage. Personally I don't think having to change state in apply is that onerous, but every implementation and requirement is different.

Well, it is more complicated than changing state directly in your "main" entry point to the model, by the nature of requiring at least two methods instead of one. It would certainly be simpler if the language you are using could help you enforce this constraint, but that is not the case in my project. Where the language cannot enforce this, doing the event + apply thing is not KISS and requires some additional thought.

Greg Young

Aug 18, 2012, 10:42:06 PM

to ddd...@googlegroups.com

I think you misunderstand me. I change the structure of my model very often (move objects around, introduce value objects, etc.). When replaying events this is a trivial change (just change the apply method). If I have stored structure, then I need to migrate all of that structure.
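To illustrate the point with a toy example (a sketch, not anyone's production code): the stored events below never change, and moving from a plain string to a value object only requires a different apply method. The event shapes, `CustomerV1`, `CustomerV2`, and `PersonName` are all made up for this illustration.

```python
from dataclasses import dataclass

# The stored events stay exactly as they were written.
EVENTS = [
    {"type": "CustomerRegistered", "name": "Ada Lovelace"},
    {"type": "CustomerRenamed", "name": "Ada King"},
]


class CustomerV1:
    """First model: the name is a plain string."""

    def __init__(self):
        self.name = None

    def apply(self, event):
        if event["type"] in ("CustomerRegistered", "CustomerRenamed"):
            self.name = event["name"]


@dataclass(frozen=True)
class PersonName:
    """Value object introduced in a later refactoring."""
    first: str
    last: str


class CustomerV2:
    """Restructured model: only the apply method had to change."""

    def __init__(self):
        self.name = None

    def apply(self, event):
        if event["type"] in ("CustomerRegistered", "CustomerRenamed"):
            first, _, last = event["name"].partition(" ")
            self.name = PersonName(first, last)


def replay(model_class, events):
    model = model_class()
    for event in events:
        model.apply(event)
    return model


old = replay(CustomerV1, EVENTS)  # old.name is the string "Ada King"
new = replay(CustomerV2, EVENTS)  # new.name is a PersonName value object
```

With stored snapshots, every persisted `CustomerV1` document would instead need a data migration to reach the `CustomerV2` shape.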

> On Sat, Aug 18, 2012 at 8:32 PM, Greg Young <gregor...@gmail.com> wrote:
>
>> Btw what is "complexity" of public behavior vs method for changing state? I do this normally even if not using event sourcing.
>
> I fully agree that changing state "internally" compared to through setters is the better approach, no disagreement here.

Could you explain the point where you brought up complexity then? Maybe I am not understanding you properly.

Greg

Benjamin Eberlei

Aug 19, 2012, 3:21:25 AM

to ddd...@googlegroups.com

On Sun, Aug 19, 2012 at 4:42 AM, Greg Young <gregor...@gmail.com> wrote:

> I think you misunderstand me. I change the structure of my model very often (move objects around, introduce value objects, etc.). When replaying events this is a trivial change (just change the apply method). If I have stored structure, then I need to migrate all of that structure.

Yes, but this is still possible with snapshotting: say you save a "version" with each aggregate root. Then, by having migration methods to get from version X to version Y, you can migrate entities "eventually", whenever they are loaded from the snapshot storage.

This may not be equally powerful or handle every case, but with the help of all the events that you still have, it may be equivalent. It also does not require replaying events and can be handled by the snapshot storage on every load of an aggregate.
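A minimal sketch of such an "eventual migration", assuming snapshots are plain dicts carrying a `version` field. The field names, the two migration steps, and the dict-based `storage` are hypothetical.

```python
CURRENT_VERSION = 3


def migrate_1_to_2(doc):
    # v2 split the single "name" field into first and last name.
    first, _, last = doc.pop("name").partition(" ")
    return {**doc, "first_name": first, "last_name": last, "version": 2}


def migrate_2_to_3(doc):
    # v3 added a list of addresses; older documents start with none.
    return {**doc, "addresses": [], "version": 3}


MIGRATIONS = {1: migrate_1_to_2, 2: migrate_2_to_3}


def upgrade(doc):
    """Apply migrations step by step until the document is current."""
    doc = dict(doc)  # work on a copy, not on the caller's document
    while doc["version"] < CURRENT_VERSION:
        doc = MIGRATIONS[doc["version"]](doc)
    return doc


def load(storage, key):
    doc = upgrade(storage[key])
    storage[key] = doc  # write back so the migration runs only once
    return doc


# A snapshot written by the oldest version of the model.
storage = {"c1": {"version": 1, "name": "Ada Lovelace"}}
customer = load(storage, "c1")
```

Each migration step has to be kept for as long as documents of that version may still exist in storage, which is part of the maintenance cost of this approach.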

>> On Sat, Aug 18, 2012 at 8:32 PM, Greg Young <gregor...@gmail.com> wrote:
>>
>>> Btw what is "complexity" of public behavior vs method for changing state? I do this normally even if not using event sourcing.
>>
>> I fully agree that changing state "internally" compared to through setters is the better approach, no disagreement here.
>
> Could you explain the point where you brought up complexity then? Maybe I am not understanding you properly.

My point is that there are subtle things to get wrong in CQRS, plus additional requirements for automated testing. I created three examples; the problems are described in the code comments:

https://gist.github.com/3392926

Compared to a full snapshot solution, a system relying on Event Sourcing opens up the possibility of a new category of very subtle bugs ("temporal bugs", maybe?). Mitigating the risk of this kind of bug requires writing additional tests and taking additional care.

I guess my point is: when Event Sourcing is not a requirement of the application and changes in the model can be handled by "eventual migration" done by the snapshotting technology, isn't a full snapshot solution less risky and more KISS, because it eliminates one category of bugs and is less "different" from "usual" OOP business object code? And isn't it still a solution that is almost equally flexible?

Philip Jander

Aug 19, 2012, 4:59:07 AM

to ddd...@googlegroups.com

Hi Benjamin,

from my experience, the "complex" part of event sourcing was *only*
getting used to it as opposed to data-centric thinking.

For me, the advantages soon outbalanced the initial learning. Any kind
of snapshotting, I would strongly advise against unless you determine by
performance monitoring that you really need it. Usually it is extra
trouble and not worth the effort. Even on the read side, I often drop
any kind of persistence and go for in-memory read models, which are
regenerated from event store on server startup each time. This does not
work for each read model, but in particular for the more complex ones
(complex state derivation rather than big data) this works extremely
well. Even for a few million events this does not take longer than
starting up any kind of tomcat-based jvm server ;)

The event + state persistence idea never works out, as you have two
independent representations of your data to cope with.
It might look nice initially but becomes a maintenance nightmare soon
afterwards.

Regarding your "apply-method" problem, I initially used a pattern to
enforce correct implementation: each entity had only methods that could
raise events, plus one instance of a state co-class that only handled
events and offered its state as "get; private set" to its host class.

E.g. pseudo-code:

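public class Customer {
    private readonly CustomerState State = new CustomerState();

    // Public behavior: checks state and emits events, never mutates state.
    public void DoSomething() {
        if (!State.Done) EmitDidSomething();
    }

    // State co-class: the only place where state is changed.
    class CustomerState {
        public bool Done { get; private set; }
        internal void OnDidSomething(... e) {
            Done = true;
        }
    }
}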

+ a bit of connecting code.
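For readers who want something runnable, here is a rough Python analogue of the same idea, including one possible version of the connecting code. The routing inside `_emit` and the dict-shaped event are assumptions of this sketch, not part of the pseudo-code above.

```python
class CustomerState:
    """Only handles events; exposes its state read-only to the host class."""

    def __init__(self):
        self._done = False

    @property
    def done(self):  # no setter: the analogue of "get; private set"
        return self._done

    def on_did_something(self, event):
        self._done = True


class Customer:
    """Only decides which events to emit; never assigns state itself."""

    def __init__(self):
        self._state = CustomerState()
        self.emitted = []

    def do_something(self):
        if not self._state.done:
            self._emit({"type": "DidSomething"})

    # The "connecting code": route each emitted event to the state object.
    def _emit(self, event):
        self.emitted.append(event)
        if event["type"] == "DidSomething":
            self._state.on_did_something(event)


customer = Customer()
customer.do_something()
customer.do_something()  # emits nothing: the state already says "done"
```

Python cannot enforce this as strictly as C# access modifiers do, but assigning to `customer._state.done` from the host class raises an `AttributeError`, which catches the accidental case.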

While this pattern is a bit of extra code, it solves the
mutate-state-by-accident problem.
Additionally, it soon led to the logic of "OnDidSomething" to be
refactored into an external method which is now shared by projections
and the domain - I call this a "concept" on the event stream. IMO a good
application of DRY, avoiding out-of-sync bugs between domain projection
logic and readmodel projection logic. Wherever concepts are shared,
there is only one piece of code responsible for defining the concept.

The most important rules I learned were to start out with defining the
events. Only the events without implementation considerations and based
on the business only (thereby also defining contexts). Then the commands
and read models (coarse) as defined by use cases and only then the
actual model with defining components, entities and finally sagas and
aggregates and fine grained data layout.
Note that aggregate and data are the last things on the list ;)

YMMV of course, but after one year dev and another year in production
with two projects I begin to feel quite confident about this.

Cheers
Phil

Benjamin Eberlei

Aug 19, 2012, 5:46:57 AM

to ddd...@googlegroups.com

On Sun, Aug 19, 2012 at 10:59 AM, Philip Jander <jan...@janso.de> wrote:

> Hi Benjamin,
>
> from my experience, the "complex" part of event sourcing was *only* getting used to it as opposed to data-centric thinking.
>
> For me, the advantages soon outbalanced the initial learning. Any kind of snapshotting, I would strongly advise against unless you determine by performance monitoring that you really need it. Usually it is extra trouble and not worth the effort. Even on the read side, I often drop any kind of persistence and go for in-memory read models, which are regenerated from event store on server startup each time. This does not work for each read model, but in particular for the more complex ones (complex state derivation rather than big data) this works extremely well. Even for a few million events this does not take longer than starting up any kind of tomcat-based jvm server ;)

Our difference is probably also due to the difference in technology used. With PHP (or other scripting languages) you cannot have "in-memory models", as HTTP requests are processed shared-nothing. That means you would have to rebuild your models from events on every request, and not just once at server startup (or on first access).

> The event + state persistence idea never works out, as you have two independent representations of your data to cope with.
> It might look nice initially but becomes a maintenance nightmare soon afterwards.

I don't use the events for anything persistence-related other than building aggregate view models (statistics, ...) or logging tables. What I use them for primarily is decoupling business logic; domain events happen to be a very good way to end up with SOLID code, independent of the persistence strategy.

> Regarding your "apply-method" problem, I initially used a pattern to enforce correct implementation: each entity had only methods that could raise events, plus one instance of a state co-class that only handled events and offered its state as "get; private set" to its host class.
>
> E.g. pseudo-code:
>
> public class Customer {
>     private readonly CustomerState State = new CustomerState();
>
>     // Public behavior: checks state and emits events, never mutates state.
>     public void DoSomething() {
>         if (!State.Done) EmitDidSomething();
>     }
>
>     // State co-class: the only place where state is changed.
>     class CustomerState {
>         public bool Done { get; private set; }
>         internal void OnDidSomething(... e) {
>             Done = true;
>         }
>     }
> }
>
> + a bit of connecting code.
>
> While this pattern is a bit of extra code, it solves the mutate-state-by-accident problem.
> Additionally, it soon led to the logic of "OnDidSomething" to be refactored into an external method which is now shared by projections and the domain - I call this a "concept" on the event stream. IMO a good application of DRY, avoiding out-of-sync bugs between domain projection logic and readmodel projection logic. Wherever concepts are shared, there is only one piece of code responsible for defining the concept.

This is a nice solution indeed.

@yreynhout

Aug 19, 2012, 6:24:57 AM

to ddd...@googlegroups.com

1. Sure, been there too.
2. IL analysis if you have discipline problems. Fail the build if detected. Spank dev at fault.
3. Bullshit
4. Easy, discussed here before (search the archive of this forum).

Hth,
Yves.

Greg Young

Aug 19, 2012, 7:33:00 AM

to ddd...@googlegroups.com

Have you actually benchmarked perf difference? Something tells me your network hit etc will be much larger than the time to replay 10 events.

Snapshotting introduces its own issues (like now needing two ways of everything). Snapshotting should be avoided unless needed for performance.


Søren Trudsø Mahon

Aug 19, 2012, 3:01:00 PM

to ddd...@googlegroups.com

@philip:
Where do you put this shared code?

I thought there would be a strict separation between the write and read sides. Different assemblies, with only events shared?

Also, aren't the domain and read side different classes? Do you then define an interface for the state for the method to operate on, or how? Could you give an example?

I don't want to hijack the thread. Should I start a new one?


Philip Jander

Aug 19, 2012, 4:29:41 PM

to ddd...@googlegroups.com

On 19.08.2012 21:01, Søren Trudsø Mahon wrote:
> @philip:
> Where do you put this shared code?

Either shared assemblies or just symlink the source files across
different projects. The latter has many advantages but you need to look
out for visibility/namespace clashes.

The read/write separation is about having different models. Surely,
those models may share code, either by reference or by copy-n-paste (of
which symlinking is a DRY'd variation).
Since both sides operate on event streams, it is not really surprising
to find common concepts.

> Also, aren't the domain and read side different classes? Do you then define an interface for the state for the method to operate on, or how? Could you give an example?

Actually, this is on a function (read: method) level. The production
implementation is not nice but performant. I'd rather share a non
production but extremely readable proof-of-concept based on the Rx
framework (events, after all...).

Apologies for the German identifiers. Short dictionary:
Kunde == Customer, OffenesVolumen == OpenContractedVolume, Waehrung ==
Currency, auftraege == contracts, AuftragWurdeGenehmigt ==
ContractWasAuthorized, AuftragWurdeAbgeschlossen == ContractWasFulfilled

This is the "concept" "OffenesVolumen" shared by domain and read models,
just a pure function of the event stream based on Rx. It is parametrized
by the customerId, which can either be given or not. In the latter case,
the summed value over all customers is projected. The result is again an
observable stream of the desired state (here: a currency value).

public static class Kunde
{
    // Projects the open contracted volume as a stream of values.
    // Without a kundeId, the sum over all customers is projected.
    public static IObservable<Waehrung> OffenesVolumen(
        IObservable<Event> eventStream, Waehrung startwert, Guid? kundeId = null)
    {
        Waehrung[] offenesVolumen = { startwert };
        var auftraege = new Dictionary<Guid, Waehrung>();

        var observable = new BehaviorSubject<Waehrung>(0.Euro());

        // An authorized contract adds its volume to the open volume.
        eventStream.OfType<Events.AuftragWurdeGenehmigt>()
            .Where(_ => (!kundeId.HasValue) || _.Kunde == kundeId.Value)
            .Subscribe(e =>
            {
                auftraege.Add(e.Auftrag, e.Auftragsvolumen);
                offenesVolumen[0] += e.Auftragsvolumen;
                observable.OnNext(offenesVolumen[0]);
            });

        // A fulfilled contract removes its volume again.
        eventStream.OfType<Events.AuftragWurdeAbgeschlossen>()
            .Where(_ => (!kundeId.HasValue) || _.Kunde == kundeId.Value)
            .Subscribe(e =>
            {
                offenesVolumen[0] -= auftraege[e.Auftrag];
                auftraege.Remove(e.Auftrag);
                observable.OnNext(offenesVolumen[0]);
            });

        return observable;
    }
}

Within the domain model, the state is derived by using the concept on a
concrete event stream:

    public Waehrung OffenesVolumen { get; private set; }

    public void RegisterEventHandlers(IObservable<Event> events)
    {
        Projektionen.Kunde.OffenesVolumen(events, 0.Euro(), _id)
            .Subscribe(e => OffenesVolumen = e);
    }
}

and in any read model you can do the same. It doesn't matter in what way
the code is shared. The main point is that there is a single place in code
defining how "OffenesVolumen" should be derived from the event stream.
If this ever changes, it will change in all projections and the domain
model together. Obviously there might be other projections, for which
such a coupling is not desired - these are not "concepts" then.
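For readers without Rx at hand, the same "concept" can be approximated with a plain Python generator. This is only a sketch of the idea: it uses the English names from the dictionary above, invents the event tuple shapes, and, unlike the `BehaviorSubject` version, does not emit an initial value.

```python
from collections import namedtuple

# Hypothetical event shapes, named after the dictionary in the post.
ContractWasAuthorized = namedtuple("ContractWasAuthorized", "customer contract volume")
ContractWasFulfilled = namedtuple("ContractWasFulfilled", "customer contract")


def open_contracted_volume(events, start=0, customer_id=None):
    """Yield the open volume after every relevant event.

    With customer_id=None the value is summed over all customers.
    """
    volume = start
    contracts = {}
    for e in events:
        if customer_id is not None and e.customer != customer_id:
            continue
        if isinstance(e, ContractWasAuthorized):
            contracts[e.contract] = e.volume
            volume += e.volume
            yield volume
        elif isinstance(e, ContractWasFulfilled):
            volume -= contracts.pop(e.contract)
            yield volume


events = [
    ContractWasAuthorized("alice", "c1", 100),
    ContractWasAuthorized("bob", "c2", 50),
    ContractWasFulfilled("alice", "c1"),
]

# Domain model: latest value for a single customer.
alice_open = list(open_contracted_volume(events, customer_id="alice"))[-1]
# Read model: running total across all customers.
totals = list(open_contracted_volume(events))
```

Both the domain model and the read model call the same function, so the definition of "open contracted volume" lives in exactly one place.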

Cheers
Phil

Benjamin Eberlei

Aug 20, 2012, 3:50:02 PM

to ddd...@googlegroups.com

On Sun, Aug 19, 2012 at 1:33 PM, Greg Young <gregor...@gmail.com> wrote:

> Have you actually benchmarked perf difference? Something tells me your network hit etc will be much larger than the time to replay 10 events.

With a shared-nothing language architecture I have to hit the network anyway, since my specific HTTP request does not have access to shared application memory or a shared object cache. That means I can query either for all events or for the snapshot. With CouchDB or MongoDB I can save a full representation in one database document, so one query by aggregate id for a snapshot is definitely faster than a range query for all events. Then we can argue about whether reconstructing an object from events is faster than a deserialization algorithm.

> Snapshotting introduces its own issues (like now needing two ways of everything). Snapshotting should be avoided unless needed for performance.

What do you mean by "needing two ways"?

My approach would now be, for every unit of work:

1. save a snapshot of the state into couchdb
2. save all events into a commit document into couchdb

Dan Normington

Aug 20, 2012, 4:40:03 PM

to ddd...@googlegroups.com

I've skimmed through this thread, and it sounds to me like the reason you're arguing for a "snapshot" over storing events is that you are using snapshots for your query/read side. If this is the case, then I think you're either making the wrong argument or the argument hasn't been clear. For the read side, what you are saying would make sense, but it would also make sense to have a differently shaped snapshot for each view within your read side. These snapshots/projections are completely different concepts from your storage mechanism for your writes. I've never heard of anybody arguing for using an event store for a read model, so what you're saying can apply to your read side but not the write side.

Nils Kilden-Pedersen

Aug 20, 2012, 4:42:22 PM

to ddd...@googlegroups.com

On Mon, Aug 20, 2012 at 3:50 PM, Benjamin Eberlei <kon...@beberlei.de> wrote:

> My approach would now be, for every unit of work:
>
> 1. save a snapshot of the state into couchdb
> 2. save all events into a commit document into couchdb

How do you deal with partial failure?

Benjamin Eberlei

Aug 20, 2012, 4:50:23 PM

to ddd...@googlegroups.com

On Mon, Aug 20, 2012 at 10:40 PM, Dan Normington <dnor...@gmail.com> wrote:

> I've skimmed through this thread, and it sounds to me like the reason you're arguing for a "snapshot" over storing events is that you are using snapshots for your query/read side. If this is the case, then I think you're either making the wrong argument or the argument hasn't been clear. For the read side, what you are saying would make sense, but it would also make sense to have a differently shaped snapshot for each view within your read side. These snapshots/projections are completely different concepts from your storage mechanism for your writes. I've never heard of anybody arguing for using an event store for a read model, so what you're saying can apply to your read side but not the write side.

No, I don't want to use it for the read model; the read model's data is in SQL. The reason I argue for snapshotting is that on every HTTP request I need to fetch the data from an I/O source (CouchDB in my case), and since CouchDB allows schemaless storage, I could serialize the whole aggregate into one document on every write operation, in this case using the events only as a means of decoupling the application.

Benjamin Eberlei

Aug 20, 2012, 4:54:33 PM

to ddd...@googlegroups.com

Three approaches with different "problems and drawbacks" are possible:

1. Make the snapshot/aggregate the primary source of data, explicitly allowing for "missing" events. Drawback: event replay for view rebuilding might be inconsistent.
2. Save snapshot AND events into one document. Drawback: storage space.
3. Save the events first, then retrieve snapshot + missing events with a single query and apply them. Drawback: complexity and performance (requires a View and not a find-by-key operation).
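A sketch of how the third approach could behave when the snapshot write fails after the events were saved. The storage is simulated with plain Python data; the event shapes, the `version` field, and the `apply`/`load` helpers are assumptions for illustration, and no CouchDB view is involved.

```python
def apply(state, event):
    """Fold one event into a plain dict state (hypothetical event shapes)."""
    state = dict(state)
    if event["type"] == "Deposited":
        state["balance"] += event["amount"]
    elif event["type"] == "Withdrawn":
        state["balance"] -= event["amount"]
    state["version"] = event["version"]
    return state


def load(snapshot, events):
    """Start from the snapshot and apply only the events newer than it."""
    state = dict(snapshot) if snapshot else {"balance": 0, "version": 0}
    for event in sorted(events, key=lambda e: e["version"]):
        if event["version"] > state["version"]:
            state = apply(state, event)
    return state


events = [
    {"version": 1, "type": "Deposited", "amount": 100},
    {"version": 2, "type": "Withdrawn", "amount": 30},
    {"version": 3, "type": "Deposited", "amount": 5},
]
# The snapshot write for version 3 failed, so the stored snapshot is stale.
stale_snapshot = {"balance": 70, "version": 2}

state = load(stale_snapshot, events)
from_scratch = load(None, events)
```

Because the events are written first and remain authoritative, a stale or missing snapshot costs only extra replay work, not correctness. The price is that the apply logic has to exist after all.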

Dan Normington

Aug 20, 2012, 4:55:23 PM

to ddd...@googlegroups.com

How do you update your projections? Do you fire an event that looks like something that would be stored in an event store, react to it by updating read side database but not store the event? If that is the case I'd have to ask the same question that Nils did. If something fails while updating your projections how do you handle that?

Dan Normington

Aug 20, 2012, 5:00:19 PM

to ddd...@googlegroups.com

> Three approaches with different "problems and drawbacks" are possible:
> 1. Make the snapshot/aggregate the primary source of data, explicitly allowing for "missing" events. Drawback: event replay for view rebuilding might be inconsistent.
> 2. Save snapshot AND events into one document. Drawback: storage space.
> 3. Save the events first, then retrieve snapshot + missing events with a single query and apply them. Drawback: complexity and performance (requires a View and not a find-by-key operation).

It looks like you're adding infrastructure and complexity that really isn't needed for the possibility that the hydration of your aggregates from an event store might take long. I guess I'd advise testing your theory. I've hydrated aggregates with hundreds of events and the performance was just fine.
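One way to start testing the theory is a micro-benchmark like the sketch below. It is only a rough starting point under stated assumptions: it compares deserializing and replaying 500 made-up events against deserializing one snapshot document, entirely in process, so it says nothing about the network or database query cost discussed earlier in the thread.

```python
import json
import timeit

# 500 hypothetical events for a single aggregate.
EVENTS = [{"type": "Deposited", "amount": 1} for _ in range(500)]


def replay(events):
    """Rebuild the state by folding over all events."""
    balance = 0
    for event in events:
        if event["type"] == "Deposited":
            balance += event["amount"]
    return {"balance": balance}


# What each strategy would read from storage, already serialized.
EVENT_DOCS = json.dumps(EVENTS)
SNAPSHOT = json.dumps(replay(EVENTS))


def load_from_events():
    return replay(json.loads(EVENT_DOCS))


def load_from_snapshot():
    return json.loads(SNAPSHOT)


if __name__ == "__main__":
    runs = 1000
    for fn in (load_from_events, load_from_snapshot):
        seconds = timeit.timeit(fn, number=runs)
        print(f"{fn.__name__}: {seconds / runs * 1e6:.1f} microseconds per load")
```

The absolute numbers depend on the machine and runtime. The useful comparison is between these per-load figures and the round-trip time of a single database query.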

Benjamin Eberlei

Aug 20, 2012, 5:03:05 PM

to ddd...@googlegroups.com

Ok, I guess I have to test this.

Thanks everyone for whacking me enough on this issue ;-)

王林波

Jan 28, 2013, 4:45:39 AM

to ddd...@googlegroups.com

Hi Philip,

Although it was an old post, I am very interested in your "public immutable getters of the aggregate root".

I created a post named "can we update read model through access aggregate root immutable public getter instead of events" in this group. Any input will be appreciated.