The Ever-Growing Event Store

JAmes Atwill

Sep 1, 2010, 4:31:11 PM
to dddcqrs
I'm pitching event sourcing for an upcoming project that lends itself
well to the benefits it brings. I can cover off most of the common
questions, but I don't have a good story for "If the event store just
keeps on growing, how do you manage that data?"

Assume for historical reasons the event store is likely 1 table in one
database. What experiences (or ideas) do people have on:

- backup and restore (10 years of events is a lot of events)
- size of event store over time (what's your load)
- keeping inserts performant on an ever-growing database

Also, has anyone ever deleted old events, and if so, what was the outcome?

Of course we can split up event stores by AR, but that just divides up
the same amount of data.

Thanks in advance!

JAmes

Greg Young

Sep 1, 2010, 5:50:33 PM
to ddd...@googlegroups.com
How much does a hard drive cost?

What is the business value of having that data?

Can we archive old data?

Cheers,

Greg
--
Grammar and syntax errors have been included to make sure I have your attention

Neil Robbins

Sep 1, 2010, 5:58:05 PM
to ddd...@googlegroups.com
Hmmm, some thoughts; clearly, not many will have 10 years' experience of this style ;)

Backup & Restore

Maybe - use a Dynamo-style node ring which, like Dynamo, could be configured to be any of:
  • machines on a single rack
  • machines on multiple racks
  • machines across datacentres
with a quorum setup for writes & reads. This would obviate the need to worry about restores. It would of course trade that for latency guarantees, and the need to maintain a quorum could cause availability issues were a majority no longer addressable. Clearly, if this option were looked at, then choices like Cassandra, or maybe CouchDB with Cloudant's new scale-fu, would be worth looking at, if only to see how others have done it.
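
To sketch the quorum arithmetic (the N/W/R values are illustrative, and
the replica append call is an assumed interface, not any real API):

    # Dynamo-style quorum write: any N/W/R with R + W > N makes the read
    # and write quorums overlap, so a quorum read sees the latest write.
    N, W, R = 3, 2, 2
    assert R + W > N

    def append_events(replicas, stream_id, events):
        acks = 0
        for replica in replicas:  # the N nodes in the ring
            try:
                replica.append(stream_id, events)  # assumed interface
                acks += 1
            except ConnectionError:
                continue  # node unreachable; quorum may still be met
            if acks >= W:
                return True  # durable: W replicas hold the events
        # fewer than W nodes reachable - the availability issue noted above
        raise RuntimeError("write quorum not reached")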

Maybe - use the pubsub replication built in to the CQRS system (write to eventstore & to a bus), and accept that a write being placed on the queue is as good as it arriving at a replica. With master affinity, in the face of the master being unavailable, the affinity could be switched to the replica. This is similar to Y! PNUTS. The assumption that the write to the bus is as good as a write to the disk of the replica introduces the potential for the consistency boundary around an AR to be violated, at which point the system might need to introduce ways of managing these inconsistencies. Sometimes, especially if dealing with large geographies (as Y! were when they developed PNUTS), this can be a very effective way to do things, I think.
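
In code the trade-off looks roughly like this (store/bus are assumed
interfaces, not any particular product's API):

    # The local append is the durable write; the bus publish stands in
    # for the replica write.
    def commit(store, bus, stream_id, expected_version, events):
        store.append(stream_id, expected_version, events)
        # Fire-and-forget: if this message is lost the replica silently
        # diverges - the AR consistency-boundary violation above - and
        # some reconciliation process has to detect and repair it.
        bus.publish({"stream": stream_id,
                     "from_version": expected_version,
                     "events": events})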

Maybe - just use what's built into Oracle/SqlServer/MySQL/etc - the standard master-slave for hot failover, backups to external media, log shipping, etc... For most scenarios this is probably fine, and for many it will present the path of least resistance & probably the lowest cost to develop. The issue, though, is that the moment its 'typical' scenarios are not sufficient, it becomes very expensive.

Size of event store

Use commodity hardware, replicate for durability guarantees (this beats relying on expensive h/w), and shard to distribute load. Hard drives are as cheap as chips (the edible sort).

Keep inserts performant

Shard (horizontally partition in a data-wise fashion) over time. Be aware of load-distribution strategies that don't provide ways of controlling data affinity, such as through keys (e.g. the Amazon model doesn't work so well here, but the Cassandra variant can, I believe) - though this does really increase the need to manage sharding carefully.
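
For illustration, key-based affinity can be as simple as hashing the AR
id to a shard (hash choice and shard count are arbitrary here):

    import hashlib

    # All events for one AR land on one shard, so its stream can always
    # be read back in order from a single node.
    def shard_for(aggregate_id, shard_count):
        digest = hashlib.md5(aggregate_id.encode("utf-8")).hexdigest()
        return int(digest, 16) % shard_count

The caveat is that plain modulo reshuffles nearly every key when the
shard count changes; consistent hashing limits the reshuffle to roughly
1/shard_count of the keys, which is part of managing sharding carefully.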

I don't get what you mean by "Of course we can split up event stores by AR, but it just divides up the same amount of data". Surely sharding in this way is a very valid strategy for managing the load/size/speed issues?

Be interested to see what people think.

Neil

Neil Robbins

Sep 1, 2010, 6:01:07 PM
to ddd...@googlegroups.com
I suppose as well, I didn't really get the:

"Assume for historical reasons the event store is likely 1 table in one database."

Why would those constraints apply when you're pitching for an upcoming project? I get that the organisational bias against these approaches makes it hard work to get almost any part of it adopted (I've been there & got the scars), but I can't see why/how 1 table in 1 db could be justified as the only strategy, even if funkier things like NoSQL & friends are off the table.

JAmes Atwill

Sep 1, 2010, 7:56:30 PM
to ddd...@googlegroups.com
On Wed, Sep 1, 2010 at 2:50 PM, Greg Young <gregor...@gmail.com> wrote:
> How much does a hard drive cost?

Big enough to hold 10 years of data? Not as expensive as backing up
and restoring 10 years of data.

> What is the business value of having that data?

Events? Is it safe to delete old events? I'll lose my ability to
recreate my read side, unless I intelligently roll up similar events
to keep only the latest for a particular AR. I'm curious if anyone
has done this.
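
To make the roll-up concrete - a purely hypothetical sketch, and only
safe where the latest event of a type fully supersedes the earlier ones:

    # Keep only the newest event of each type within one AR's stream.
    # Lossy by design: the history of superseded events is gone.
    def roll_up(stream):
        latest = {}
        for event in stream:               # stream is in commit order
            latest[event["type"]] = event  # later supersedes earlier
        return sorted(latest.values(), key=lambda e: e["sequence"])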

> Can we archive old data?

How do I know if data is old? "StoreCreatedEvent" may be the first
event in the eventstore, but it's still extremely relevant.

I'm looking for some rules of thumb; I'll definitely summarize once we're done.

JAmes

Neil Robbins

Sep 1, 2010, 8:06:45 PM
to ddd...@googlegroups.com
My guess is that old data in this context might mean old ARs rather than old events as such.

So say you had a CatalogItem that was discontinued: could all of its events be archived after, say, 12 months? Or even just identify ARs which haven't been used in some time period t and then archive them automatically as a background process, like the snapshotting background process.

A version of archival might be pushing to something like S3 (or some other cheap, big store) so that it could always be pulled back if required - but at the cost of higher latency for that initial pull back.

Sequence might look like

UI > Query Side (pulls back records that include a reference to an archived AR)
UI > Command Side (user attempts operation on archived AR)
Command Side > Event Store Manager (reload AR)
Event Store Manager > Normal DB (load AR)
Normal DB > Event Store Manager (AR not found)
Event Store Manager > Archive Store (load AR)
Archive Store > Event Store Manager (here's your AR)
Event Store Manager > Normal DB (restore AR events)
Event Store Manager > Command Side (here are your events, rebuild your AR, do your thing)
...

Or some other similar sequence.
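
Sketched as code (hot_store/archive_store and their method names are
made up):

    def load_stream(hot_store, archive_store, aggregate_id):
        events = hot_store.read(aggregate_id)
        if events:
            return events
        # Not in the normal DB: pay the one-off latency of the archive pull
        events = archive_store.read(aggregate_id)
        if events:
            hot_store.restore(aggregate_id, events)  # re-warm for next time
            return events
        raise KeyError("unknown aggregate: %s" % aggregate_id)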

Chris Nicola

Sep 2, 2010, 12:59:20 AM
to ddd...@googlegroups.com
I'm pretty much going to second Greg's thoughts exactly: "how much is a hard drive worth?"

Do you have any ballpark estimates? Rate of events/commands? Size of the data the events contain? Growth? Do you know what your storage needs are likely to be over time? Many transactional databases grow over time, so this isn't exactly a new problem invented by CQRS.

The cost of a GB of storage has dropped from over $9 to $0.07 in the past 10 years (http://ns1758.ca/winch/winchest.html). If you are getting pushback on the issue of storage capacity, feel free to show them a chart.
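
For a back-of-envelope growth estimate (all the inputs below are made
up - plug in your own rates):

    events_per_second = 50    # illustrative sustained rate
    avg_event_bytes = 1024    # serialized event incl. metadata
    years = 10

    total = events_per_second * avg_event_bytes * 60 * 60 * 24 * 365 * years
    print("%.1f TB over %d years" % (total / 1e12, years))  # ~16.1 TB

Even rates well above that still land in "buy another disk" territory.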

Chris

Richard Dingwall

Sep 2, 2010, 4:57:50 AM
to ddd...@googlegroups.com

In my experience, the cost of consumer hard drives is one factor you
can quite safely ignore when trying to get more space for your app.

Getting the IT department to add 2TB to your production DB server's
fibre-channel SAN (and its offsite backup/disaster recovery
environment of course) will still be near impossible.

--
Richard Dingwall
http://richarddingwall.name

Chris Nicola

Sep 2, 2010, 12:09:41 PM
to ddd...@googlegroups.com
If the IT department that supports your systems is unable (or unwilling) to adapt to changes in the needs of the business over time, then that doesn't sound like a software or a hardware problem and it may be a good argument to outsource those needs.  There are plenty of top quality hosts that can serve the need for future growth.

A big problem is that, without actually having estimates of the requirements for storage growth, it is hard to suggest any specific solution to the problem.  The point still stands though, there are simple options available, but choosing one requires an understanding of both the requirements and constraints of the problem.  It isn't as if standard RDBMS databases don't grow over time too.

Chris

Suirtimed

Sep 3, 2010, 9:18:51 AM
to DDD/CQRS
Not only do OLTP databases tend to grow over time, but data warehouses
grow as well. I'm currently working on a project where storage growth
over time has created operational concerns. The thought process here is
that we can address storage boundaries by snapshotting in some cases:
create a snapshot and delete the old events. The challenge, however,
seems to be ensuring that the business understands the value of the
events themselves and what the consequences are of losing/deleting
them. In another case, compliance may require purging old (past 7
years) financial transactions individually (think
General Ledger). The only way to maintain a GL aggregate root is to
have a snapshot in place before the old transactions are deleted.
There is also the possibility of implementing the event store as one
physical event store per Business Component/Domain. In that case each
one's challenges and concerns can be resolved as needed.
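
A sketch of that snapshot-then-purge ordering (the store methods are
assumed interfaces, and the retention window is illustrative):

    from datetime import datetime, timedelta

    RETENTION = timedelta(days=7 * 365)  # e.g. 7-year financial retention

    def purge_old_events(store, aggregate_id):
        cutoff = datetime.utcnow() - RETENTION
        # Snapshot FIRST: once events are purged, the AR can only be
        # rebuilt from a snapshot taken at or after the cutoff.
        state = store.rebuild(aggregate_id)
        store.save_snapshot(aggregate_id, state, as_of=cutoff)
        store.delete_events_before(aggregate_id, cutoff)  # irreversible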

JAmes Atwill

Sep 7, 2010, 12:08:12 PM
to ddd...@googlegroups.com

Hey Neil,

Thanks muchly for an amazing summary. If you're okay with this, I'd
like to summarize these posts somewhere.

> I suppose as well, I didn't really get the:
> "Assume for historical reasons the event store is likely 1 table in
> one database."

I guess I really only said that to skip the "split ARs into separate
tables" discussion (since I feel that's just delaying the ultimate
concern).

JAmes

Adam Dymitruk

Sep 7, 2010, 2:28:09 PM
to ddd...@googlegroups.com
If you go to file-based ES, consider this idea:

To satisfy storage concerns and off-load events to a backup server, have a
time-based snapshot minimum along with an event-count snapshot maximum.

I'm moving to a 100K-events-per-file segregation policy per aggregate (but
it could be anything - even convention-based). Files containing only events
older than the last snapshot can be kept on backup servers.
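
Roughly this (thresholds illustrative, not a recommendation):

    from datetime import datetime, timedelta

    MAX_AGE = timedelta(days=30)  # time-based snapshot minimum
    MAX_EVENTS = 100000           # event-count snapshot maximum

    def should_snapshot(last_snapshot_at, events_since_snapshot):
        # Snapshot at least every MAX_AGE, or as soon as MAX_EVENTS have
        # accumulated; file segments wholly older than the latest snapshot
        # can then move to the backup servers.
        age = datetime.utcnow() - last_snapshot_at
        return age >= MAX_AGE or events_since_snapshot >= MAX_EVENTS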

What this DOES buy you is cheaper and faster addition of machines to
load-balance your transactional side.
for a new node if you want complete redundancy everywhere.

Should you need to correct an error and re-read all the events, you
still can without too much additional work.

My $0.02,

Adam

--
Adam

http://adventuresinagile.blogspot.com/
http://twitter.com/adymitruk/
http://www.agilevancouver.net/
http://altnetvancouver.ning.com/
