Hi, Andrew
I have no experience modeling append-only logs, but this scaling problem comes up when adding history variables to a spec, right?
Even though you’re modeling an append-only log, the solution for scalability, while retaining some log in the state space, might be removing event sequences that leave the state unchanged from the log. One could achieve this “solution” by optimizing the state space inside the spec itself.
Since we’re dealing with a sequence of events, symmetry sets wouldn’t really help, but memoization and data compression might (?).
The “solution” (in many quotes) from a spec:
Some spec has a log variable that represents the append-only log. Every action that updates log would check whether there’s an equivalent, but smaller, event sequence, thus compressing the log and the state space. This optimization would reduce the state space, reduce overall memory consumption but, even in parallel (thanks to TLC), might add substantial processing cost. It’s not a lossless compression, either.
Let’s say we had states <<s, log1>>, <<s, log2>>, <<s, log3>>. We know log1, log2 and log3 are equivalent to some log log4, and log4 is shorter than the other 3.
A spec that optimizes for log size would only have one state: <<s, log4>>.
A spec that applies a view <<s>> would also have only one state <<s>>, but with complete information loss.
One way to achieve this compression is by checking equivalent suffixes.
Let’s say some spec has variables log and x (an integer). Possible events are {+1, -1, div2}:
We know a few suffixes leave the system as if nothing had happened before (except time passing): { <<+1, -1>>, <<-1, +1>>, <<+1, +1, div2, -1>> }.
Suppose we’re in a state where log = <<+1>>. Event -1 happens, so we add it to the log: <<+1, -1>>. But this suffix is marked as redundant: the log would be transformed into <<>>.
Suppose we’re in a state where log = <<-1, -1, -1>>'. Event+1’ happens, so we add it to the log: <<-1, -1, -1, +1>>. But this suffix is marked as redundant: the log would be transformed into <<-1, -1>>.
The community module SequencesExt contains a IsSuffix(_) operator. Every action that changes the log would be written like AddOne:
EquivalentSuffixes == { <<"+1", "-1">>, <<"-1", "+1">>, <<"+1", "-1", "div2", "-1">> } AppendToLog(log, event) == LET newLog == log \o <<event>> IN IF \E suffix \in EquivalentSuffixes: IsSuffix(suffix, newLog) THEN log ELSE newLog AddOne == /\ x' = x + 1 /\ log' = AppendToLog(log, "+1")By compressing log, the state space would contain a subset of possible logs, which might be a problem…
I haven’t tested any of this, by the way, but I’m curious to know if there are any other solutions.
Best,
Jones
I decided to model check the extremely contrived example in my previous message and compare the state space and model check time compared to no log compression.
In case anyone’s interested, here are the results:
No log compressionLen(log) < ? represents the state constraint.
Len(log) < ? Distinct States Time (4 workers & 10 runs) 11 19,233 1.305 s ± 0.222 s 12 48,526 1.793 s ± 0.319 s 13 121,217 2.516 s ± 0.382 s 14 305,152 2.830 s ± 0.658 s 15 763,353 4.365 s ± 0.772 s 16 1,918,966 7.745 s ± 0.664 s 17 4,804,797 17.935 s ± 1.278 s 18 12,067,896 44.357 s ± 3.356 s 19 30,233,757 114.005 s ± 7.446 s Compressing the log by removing the longest “redundant” log suffix: Len(log) < ? Distinct States Time (4 workers & 10 runs) 11 16,319 2.877 s ± 0.425 s 12 38,356 3.571 s ± 0.571 s 13 89,795 4.971 s ± 0.837 s 14 209,188 9.074 s ± 1.174 s 15 487,065 17.080 s ± 0.918 s 16 1,132,250 35.743 s ± 1.882 s 17 2,633,623 63.383 s ± 32.838 s 18 6,122,816 144s (4 workers & 1 run) 19 14,234,901 6m46s (4 workers & 1 run) Differences by compressing Len(log) < ? State space difference (1 - (states_compressed/states_not_compressed)) Time difference (time_compressed_avg/time_not_compressed_avg) 11 15.2% smaller 2.205x slower 12 21% smaller 1.992x slower 13 25.9% smaller 1.976x slower 14 31.4% smaller 3.206x slower 15 36.2% smaller 3.913x slower 16 41.0% smaller 4.615x slower 17 45.2% smaller 3.535x slower 18 49.3% smaller 3.246x slower 19 52.9% smaller 3.561x slowerThis very contrived example with compression removed, on average, 35.3% of the state space, but it took, on average, 3.13 times longer to model check.
Removing log from the view and model checking with Constraint == Len(log) < 19 took less than a second, which is over 114x faster than including a log (maybe 1140x faster?)
(I didn’t test adding snapshots, though… I guess I’d need a different contrived example for that, hahah)
Best,
Jones
--
You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tlaplus/82b7aa83-6aee-4125-a1bd-da0b4513c3d5n%40googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tlaplus/4230f5cc-2f1f-4131-b9b9-4d655b29d04bn%40googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tlaplus/4230f5cc-2f1f-4131-b9b9-4d655b29d04bn%40googlegroups.com.