--
Doubt is not a pleasant condition, but certainty is absurd.
You can't find this written anywhere because it shouldn't be that way.
The aggregate ID is the partition point. You should have one table for all
event types.
--
Why would you need to select on aggregate type to get the events for a
given aggregate?
Unnecessary, as long as your aggregate IDs are globally unique across all aggregate types.
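A minimal sketch of this single-table layout, using SQLite; the column names and schema details are my own assumptions, not anything prescribed in the thread:

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        aggregate_id TEXT NOT NULL,     -- globally unique, e.g. a UUID
        version      INTEGER NOT NULL,  -- per-aggregate sequence number
        event_type   TEXT NOT NULL,
        payload      TEXT NOT NULL,     -- serialized event data
        PRIMARY KEY (aggregate_id, version)
    )
""")

# Events for aggregates of *different* types live in the same table; because
# the IDs are globally unique, loading never needs the aggregate type.
order_id, customer_id = str(uuid.uuid4()), str(uuid.uuid4())
conn.execute("INSERT INTO events VALUES (?, ?, ?, ?)",
             (order_id, 1, "OrderPlaced", "{}"))
conn.execute("INSERT INTO events VALUES (?, ?, ?, ?)",
             (customer_id, 1, "CustomerRegistered", "{}"))

# Rebuilding an aggregate = select by aggregate_id alone, ordered by version.
rows = conn.execute(
    "SELECT event_type FROM events WHERE aggregate_id = ? ORDER BY version",
    (order_id,)).fetchall()
print([r[0] for r in rows])  # → ['OrderPlaced']
```

The primary key on `(aggregate_id, version)` also gives you an optimistic-concurrency check for free: appending a duplicate version for the same aggregate fails.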
--
It seems natural to split the events per aggregate type, if nothing else to alleviate network traffic.
Please differentiate between event stores that are used to "power" the write side of AR+ES and the stores that are used for other purposes. AR+ES stores can be (and generally are) partitioned by aggregate ID. They are used only for rebuilding these aggregates from their history. For rebuilding views (replaying events towards projections) I tend to use completely different event stores, with different partitioning schemas. For instance, consider this scenario:

System A: account management app with a web UI. It will have:
* stream-1 ... stream-N stores - used to persist AR+ES entities 1...N (of different aggregate types)
* stream-domain-log - used to persist a copy of all events from stream-1 ... stream-N PLUS all commands that went through. It is used for audit, ad-hoc reporting, and replays for projections.

System X: global dashboard for the entire company (aggregating multiple systems A, B, C, etc.):
* stream-from-A - stream of events that have been delivered from system A
* stream-from-B - stream of events from B

In short, not all types of replays have to happen upon the same store with the same partitioning schema (that would just make things more complicated at a larger scale).
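One way to picture the dual-store idea above is a writer that appends to both the per-aggregate stream and the domain log. This is a sketch under my own assumptions: in-memory lists stand in for the real stores, and the names are hypothetical:

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for the stores described above.
aggregate_streams = defaultdict(list)  # stream-1 ... stream-N: one per aggregate
domain_log = []                        # stream-domain-log: everything, in order

def append_event(aggregate_id, event):
    # Write side: partitioned by aggregate ID, used only to rebuild that aggregate.
    aggregate_streams[aggregate_id].append(event)
    # Audit/projection side: a copy of every event in one sequential log.
    domain_log.append((aggregate_id, event))

append_event("account-1", "AccountOpened")
append_event("account-2", "AccountOpened")
append_event("account-1", "MoneyDeposited")

# Rebuilding account-1 touches only its own stream...
print(aggregate_streams["account-1"])  # → ['AccountOpened', 'MoneyDeposited']
# ...while projections replay the whole log, with a different partitioning.
print(len(domain_log))                 # → 3
```

The data is deliberately duplicated: each store is shaped for exactly one kind of replay.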
"It seems natural to split the events per aggregate type, if nothing else to alleviate network traffic."

I suggest delaying this optimization until you actually hit performance problems here.
Actually, I'm using this approach even for small projects that are supposed to be isolated from the rest of the world. Store "specialization" really pays off in the simplicity of the building blocks and the overall solution (plus the maintenance story). Although I've seen developers shrug at the notion of duplicating data between different streams ("as if CQRS duplication was not enough").
01/12/2011 17:40 321.352.624 GLOBAL.msgs
01/12/2011 17:40 77.779.616 HEADER_TaskId_Tasks.1.msgs
01/12/2011 17:40 86.990.360 HEADER_TaskId_Tasks.2.msgs
01/12/2011 17:40 81.873.280 HEADER_TaskId_Tasks.3.msgs
01/12/2011 17:40 74.709.368 HEADER_TaskId_Tasks.4.msgs
5 file(s) 642.705.248 bytes

(example gist here: https://gist.github.com/1418094)
Deserializing all events for a single AR could not be simpler:
- determine the file name as "HEADER_" + [AR type] + "Id" + "_" + [AR id] + ".msgs"
- open stream
- deserialize using protobuf and apply them to the AR.
This is, IMO, the simplest thing that could possibly work. It adds some overhead on the storage side, but it allows you to attach events to new ARs as long as the relevant ID is in there: just move the handler from one AR type to another AR type, and you are good to go.
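The replay loop described above can be sketched as follows. This is an assumption-laden stand-in: length-prefixed JSON frames replace the protobuf serialization, an in-memory buffer replaces the real `.msgs` file, and the helper names are mine:

```python
import io
import json
import struct

def write_events(stream, events):
    # Each event is a length-prefixed frame (4-byte big-endian length + body),
    # a common framing when storing repeated messages in one file.
    for event in events:
        body = json.dumps(event).encode("utf-8")  # stand-in for protobuf
        stream.write(struct.pack(">I", len(body)))
        stream.write(body)

def read_events(stream):
    # Deserialize every frame in the file, yielding the events in order
    # so they can be applied to the AR one by one.
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return
        (length,) = struct.unpack(">I", header)
        yield json.loads(stream.read(length).decode("utf-8"))

# File name convention from the comment above:
# "HEADER_" + [AR type] + "Id" + "_" + [AR id] + ".msgs"
buf = io.BytesIO()  # stands in for open("HEADER_TaskId_Tasks.1.msgs", "rb")
write_events(buf, [{"type": "TaskCreated"}, {"type": "TaskCompleted"}])
buf.seek(0)
events = list(read_events(buf))
print([e["type"] for e in events])  # → ['TaskCreated', 'TaskCompleted']
```

With real protobuf you would parse each frame's bytes into the event message type instead of calling `json.loads`; the framing and replay structure stay the same.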
There is a long document and discussion about this on dddcqrs.com
What join? On reading?
I think the document specifies that the type is only used for admin purposes. I went back and forth on whether or not to include it, to avoid such confusion.
I currently have my events stored per aggregate type (one MongoDB collection per type) because it maps well to the repositories.