Hi, Lamport reading your post made me think about some specifications that a friend and I wrote some months ago.
Our goal was to write a specification for a state machine replication (RSM) protocol that uses multiple instances of
the "Paxos-family" protocols to decide on each entry of a replicated log, and we would like to have a way to test it
on various failures scenarios (because we were learning about these protocols).
In that time we also wanted to learn more about TLA+, so we decided to start by getting your specification of Paxos
from the Github (
Paxos.tla) and do a refinement mapping of it in our specification.
We noted that your Paxos specification implements a Voting specification which in turn implements the Consensus
spec. So we thought this was the natural way to proceed.
Then we copy all of these three specs, and we apply minor modifications to your original Paxos specification in
order to make it more modular for our needs, but trying to not change its logic. We didn't modify the Voting or the
Consensus spec.
We basically split the formula definition of some steps (like the phase1b bellow) in two different operators:
"condition" and "action", which allowed us to reuse these operators in the RSM spec and also overwriting it using
the configuration files (to tests slight modifications of some operators and simulate failures, for example).
----------------------------Paxos----------------------------------------
VARIABLE state
...
Init == state = [maxBal |-> [a \in Acceptor |-> -1],
maxVBal |-> [a \in Acceptor |-> -1],
maxVVal |-> [a \in Acceptor |-> None],
msgs |-> {}]
...
Phase1bCondition(a, m) == /\ m.type = "1a"
/\ m.bal > state.maxBal[a]
Phase1bAction(a, m) ==
[state EXCEPT !.maxBal[a] = m.bal,
!.msgs = Send([type |-> "1b",
acc |-> a,
bal |-> m.bal,
mbal |-> state.maxVBal[a],
mval |-> state.maxVVal[a]])]
Phase1b(a) == \E m \in state.msgs :
/\ Phase1bCondition(a, m)
/\ state' = Phase1bAction(a, m)
------------------------------------------------------------------------
---------------------------RSM-Paxos---------------------------
VARIABLE states, \* Array of paxos states
log, \* Array os decided values
proposed \* All instances knows about the proposed values
...
P(i) == INSTANCE Paxos WITH Acceptor <- Replica,
state <- states[i]
...
Phase1b(i,r) == /\ \E m \in states[i].msgs : /\ P(i)!Phase1bCondition(r, m)
/\ states' = [states EXCEPT ![i] = P(i)!Phase1bAction(r, m)]
/\ UNCHANGED <<log, proposed>>
------------------------------------------------------------------------
After played with crash-stop models, testing it using different configuration files and making assertions about the
invariances, we decided to move to the other failure models, crash-recovery and also Byzantine. We ended up
implementing a spec that extends our RSM and works as a "test-suite framework", declaring our test models that
are "instantiated" using the TLC configuration file.
For example, we use your byzantine paxos spec:
Byzantine Paxos Refinement, and apply minor modifications to
our RSM spec to have a Byzantine version of it.
---------------------------RSM-BPConProof------------------------------------
BP(i) == INSTANCE BPConProof WITH Acceptor <- CorrectReplica,
FakeAcceptor <- FakeReplica,
state <- states[i]
...
Phase1b(i,r,b) == /\ BP(i)!Phase1bCondition(r,b)
/\ states' = [states EXCEPT ![i] = BP(i)!Phase1bAction(r,b)]
/\ UNCHANGED <<log, proposed, bal>>
Phase1c(i,b) == /\ \E S \in Possible1cMsgsIn(b):
states' = [states EXCEPT ![i] = BP(i)!Phase1cAction(S)]
/\ UNCHANGED <<log, proposed, bal>>
...
-----------------------------------------------------------------------------
So reuse existing specification in this way shown very valuable for us in that time, and together with the papers,
helped us to understand better the protocols. But now, I'm wondering if our approach was completely wrong and
we misuse TLA+, TLC, and the toolbox. Because we kind of copy/paste/modify the original Paxos and ByzPaxos
specs, but we didn't modify the Voting and Consensus that still used internally by them.
Rodrigo