I've been thinking about this problem for a few months.
The "transaction logs" are really the history events of an entity.
Given an initial state, the current state of an entity can be
calculated by replaying its history events in order.
So if you want to know the current value of a property of an entity,
this calculation is unavoidable.
But we can reduce the amount of calculation by taking snapshots of a
given entity.
A snapshot is the saved state of an entity at a specific time. You can
calculate the current state of an entity by applying only the history
events recorded after the snapshot to the snapshot.
For example, suppose we have a task T that starts as an empty map.
(def T {})
We can perform add/remove operations on it, each of which generates a
history event. T itself doesn't change at all; only the events are
recorded.
(assoc T :title "A task")
(assoc T :time "today")
(dissoc T :time)
(assoc T :status "done")
So what is the current state of T?
(calculate-current T)
=> {:title "A task" :status "done"}
This state is generated by applying all the operations, in order, to
the initial state of T.
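As a rough sketch of what calculate-current could look like: I assume
here that the events are kept in a vector of [op & args] entries. The
names events and apply-event, and the two-argument form of
calculate-current, are my own assumptions for illustration, not from
any library.

```clojure
;; Hypothetical event log for T: each event is [op & args].
(def events
  [[:assoc  :title "A task"]
   [:assoc  :time "today"]
   [:dissoc :time]
   [:assoc  :status "done"]])

(defn apply-event
  "Applies one recorded event to a state map and returns the new state."
  [state [op & args]]
  (case op
    :assoc  (apply assoc state args)
    :dissoc (apply dissoc state args)))

(defn calculate-current
  "Replays every event, in order, on top of the initial state."
  [initial events]
  (reduce apply-event initial events))

(calculate-current {} events)
;; => {:title "A task", :status "done"}
```

The cost is one apply-event call per recorded event, which is why a
long history makes this slow.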
However, if there are a large number of history events, it will take a
long time to get the current state of T.
To improve this, we need a snapshot.
(assoc T :title "A task")
(assoc T :time "today")
(def T1 (calculate-current T))
(dissoc T :time)
(assoc T :status "done")
So what's the current state of T?
Since the last snapshot is T1, we can just apply (dissoc T1 :time) and
then (assoc ... :status "done") to the result to get the current state
of T.
With the help of the snapshot we replay only two of the four events,
so we save 50% of the calculation.
Of course, we need to use some more space to store the snapshots.
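Replaying from a snapshot might look roughly like this. It is only a
sketch under my own assumptions: events are [op & args] vectors, and a
snapshot is a map holding the saved state plus the number of events
already folded into it. The :state and :applied keys and the name
current-from-snapshot are hypothetical.

```clojure
;; Hypothetical event format: [op & args].
(defn apply-event
  "Applies one recorded event to a state map and returns the new state."
  [state [op & args]]
  (case op
    :assoc  (apply assoc state args)
    :dissoc (apply dissoc state args)))

(def events
  [[:assoc  :title "A task"]
   [:assoc  :time "today"]
   [:dissoc :time]
   [:assoc  :status "done"]])

;; Snapshot taken after the first two events: the saved state plus
;; how many events it already includes.
(def T1 {:state {:title "A task" :time "today"} :applied 2})

(defn current-from-snapshot
  "Replays only the events recorded after the snapshot was taken."
  [{:keys [state applied]} events]
  (reduce apply-event state (subvec events applied)))

(current-from-snapshot T1 events)
;; => {:title "A task", :status "done"}
```

Storing the count of applied events next to the state is what lets us
skip the earlier part of the log; that extra bookkeeping is part of the
space cost.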
More consideration is needed if you want to query the state of an
entity at a specific time in history.
But I think it's enough to solve your problem, because you always need
the current state of a task. Thus you can just store one snapshot.
--
Sincerely,
江海龙
Hoiloong Kong