How do you do functions of structures?

212 views
Skip to first unread message

Hillel Wayne

unread,
Jun 6, 2021, 4:34:58 PM6/6/21
to tlaplus
Say you have a spec like this:

---- MODULE foo ----

EXTENDS Sequences
CONSTANT Worker, Msg
ASSUME Worker # {} /\ Msg # {}

VARIABLE state

WorkerState == [queue: Seq(Msg), online: BOOLEAN]
TypeOK ==
  state \in [Worker -> WorkerState]

\* ...

====

I've noticed that a lot of my students start by writing things that way,
where we have a single state variable that is a function from Worker to
WorkerState. I generally encourage them to instead write it as

VARIABLES worker_queue, worker_online

TypeOK ==
  /\ worker_queue \in [Worker -> Seq(Msg)]
  /\ worker_online \in [Worker -> BOOLEAN]

\* ...

====

The reason why I prefer this is because you can define worker_queue' and
worker_online' in separate actions, while you can only (easily) define
state' in a single action. Also, it's easier to do things like
worker_queue' = [w \in Worker |-> (* ... *)]. The downside is your
UNCHANGED statements get a little more cluttered, but IMO the benefits
are worth it.

What do you all do? Do "destructure" your state or prefer to keep it all
in one variable?

H

Karolis Petrauskas

unread,
Jun 7, 2021, 7:44:36 AM6/7/21
to tla...@googlegroups.com
I was used to using the structured version, until I started to play with TLAPS.
The things are easier to prove in the case of the structure split to multiple variables.
Not sure why. Maybe the prover can easier resolve the actions that do not change a variable used in a property to prove.

Karolis

--
You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tlaplus/f71490a1-015b-c6f0-a24f-efe11a7757e4%40gmail.com.

Andrew Helwer

unread,
Jun 7, 2021, 9:52:08 AM6/7/21
to tlaplus
Yeah destructuring is the way to go. Knowing which variables are changed vs unchanged in a given step is a very important component of documentation, in my opinion. Greatly eases translation into actual code. It also reads a lot nicer having primed variable expressions in different conjunction bullets. Plus the state traces are much easier to read.

Andrew

Stephan Merz

unread,
Jun 7, 2021, 10:34:01 AM6/7/21
to tla...@googlegroups.com
TLAPS relies on type information for optimizing the encoding for back-end reasoners, and a typing invariant such as

  /\ x \in Nat
  /\ y \in {"foo", "bar"}
  /\ z \in [ Client -> Int ]

is easier to handle than

  state \in [ x : Nat, y : {"foo", "bar"}, z : [Client -> Int] ]

Stephan

Aaron Elligsen

unread,
Jun 7, 2021, 12:30:55 PM6/7/21
to tlaplus
I was one of the student's making this mistake. Destructuring did resolve some issues and improved my specification performance. 

However true, I did find it unfortunate. The primary purpose of the record is to group associated data together. You can point out the isomorphism all day but that doesn't change the fact that it is easier to reason about related data when it is in the same place. Neither object oriented or functional programming languages really encourage destructuring data in this way. It goes against the grain of most programmers training for modularity and encapsulation. It also likely makes the TLA+ specification diverge from the code that it is trying to describe so verifying equivalent states between a production system and the tla spec will now need another translation step to either destructure or restructure the data for comparison. 

I suppose the problem is actually that from the perspective of TLA+/TLC is that they are not isomorphic They handle the two cases differently. 

Perhaps PlusCal could provide a key word to handle the conversion automatically. I can see either nested records or set unions creating complications, but maybe this would be good enough for most situations. 

```
WorkerState == DESTRUCTURE [queue: Seq(Msg), online: BOOLEAN]
state = DESTRUCTURE [worker \in Worker |-> [queue: <<>>, online: TRUE]]

TypeOK ==
  state \in [Worker -> WorkerState]

Transforms into something like
WorkerState#queue == Seq(Msg)
WorkerState#online == BOOLEAN

state#queue == [worker \in Worker |-> <<>>]
state#online == [worker \in Worker |-> TRUE]

TypeOK == 
   /\ state#queue \in [Worker -> WorkerState#queue]
   /\ state#online \in [Worker -> WorkerState#online]
```

Isaac DeFrain

unread,
Jun 7, 2021, 12:31:00 PM6/7/21
to tla...@googlegroups.com
Hey Hillel,

I've been burned too many times by trying to write variables as records of multiple things that could rightfully be variables (like you showed in the first module). The main issue is that you lose the ability to compose actions that only change a few variables each when the actions try to change different fields of the same variable. E.g.

---- MODULE ex ----

EXTENDS Naturals

VARIABLE x

Init == x = [ "fst" |-> 0, "snd" |-> 0 ]

TypeOK == x \in [ "fst" : Nat, "snd" : Nat ]

Incr_fst == x' = [ x EXCEPT !.fst = @ + 1 ]

Incr_snd == x' = [ x EXCEPT !.snd = @ + 1 ]

\* Unfortunately, this doesn't make sense here as both (sub)actions attempt to set the successor value of x... bummer :(
Incr_both ==
  /\ Incr_fst
  /\ Incr_snd

\* ...

Next ==
  \/ Incr_both
  \/ ...

\* ...

====

So I'll always choose to have more variables + composability of (sub)actions + control over less variables.

To manage the amount of names I have to write following UNCHANGED, I usually write several inclusive and exclusive variable tuples; e.g. say we have the variables x1, x2, ..., xN and I have a few actions which change only x1 and x2 and leave x3, ..., xN unchanged (this would suggest an exclusive variable tuple to me). I'll define a tuple, call it non_x1_x2_vars == <<x3, ..., xN>>, and then just write UNCHANGED non_x1_x2_vars in the aforementioned actions. As long as you write all your actions in one module, this seems to work really well. Not to mention when you have 10+ variables in a spec, it becomes next to impossible to visually inspect whether you have included all the unchanged variables.

I first saw this little trick in the raft spec where they use several (what I would call) inclusive variable tuples.


Best,

Isaac DeFrain


Felipe Oliveira Carvalho

unread,
Jun 7, 2021, 12:31:08 PM6/7/21
to tla...@googlegroups.com
I like to group all related state variables in the same struct and
then I define pure functions that return a copy with the state with
the intended changes.

This makes my action formulas look like

/\ workerState' = RecvMessage(workerState, ...)

Markus Kuppe

unread,
Jun 7, 2021, 1:31:13 PM6/7/21
to tla...@googlegroups.com
On 07.06.21 09:24, Aaron Elligsen wrote:
> However true, I did find it unfortunate. The primary purpose of the
> record is to group associated data together. You can point out the
> isomorphism all day but that doesn't change the fact that it is easier
> to reason about related data when it is in the same place. Neither
> object oriented or functional programming languages really encourage
> destructuring data in this way. It goes against the grain of most
> programmers training for modularity and encapsulation. It also likely
> makes the TLA+ specification diverge from the code that it is trying to
> describe so verifying equivalent states between a production system and
> the tla spec will now need another translation step to either
> destructure or restructure the data for comparison.

Hi,

specifications are not code. Concepts that organize programs that span
dozens or hundreds of files are usually overhead for specs that consist
of a handful of files. I certainly don't organize my bookshelf the same
way librarians do to keep thousands of books in order.

Many concepts in TLA+ go against the (my) training of programmers, which
is best summarized by: "we witnessed first hand the brain washing done
by years of [...] programming" [1]. The concepts indeed appear strange
to us, but strangeness doesn't invalidate them. If all the strange
concepts were replaced in favor of familiar ones, TLA+ would be another
programming language.

The conundrum that makes it challenging to keep specs and code in sync
are the abstractions that we want to find with TLA+.


> Perhaps PlusCal could provide a key word to handle the conversion automatically. I can see either nested records or set unions creating complications, but maybe this would be good enough for most situations.

There are other proposals to extend or fork PlusCal. If there is a
group of people who write PlusCal specs that are too large, the parties
could team up and propose and implement an extension of PlusCal with
additional modularity concepts (I believe the term "modular pluscal" is
already taken). The recently created RFC [1] repository would be a good
place to organize and discuss the work.

Cheers,
Markus

[1]
https://lamport.azurewebsites.net/tla/industrial-use.html?unhideBut=hide-rtos&unhideDiv=rtos&back-link=high-level-view.html
[2] https://github.com/tlaplus/rfcs

Markus Kuppe

unread,
Jun 7, 2021, 1:38:28 PM6/7/21
to tla...@googlegroups.com
On 07.06.21 06:52, Andrew Helwer wrote:
> Plus the state traces are much easier to read.


The new-ish ALIAS feature of TLC [1] addresses this problem (one less
reason to write specs that work well with the various tools).

Markus

[1] https://github.com/tlaplus/tlaplus/issues/485


Aaron Elligsen

unread,
Jun 7, 2021, 6:11:38 PM6/7/21
to tlaplus
The ALIAS feature looks pretty great. I think that could ease a lot of pain related to destructuring. 

I'm not really happy with the explanations so far though. I see a lot of experienced TLA+ user chiming in that destructuring works better currently, and from experience I believe it, but I don't see an explanation of why it should be this way. It seems to me an arbitrary fact that new spec writers must learn and then code around to improve spec performance and aesthetics.

TLA+ has both the syntax for records, and it can handle multiple updates to the record using the || operator, so it's not like the spec must be written without using them. In the end it just seems like record update syntax is sort of second class and so specifications work better when omitting it. I feel like it betrays the principle that TLA+ is math, like why is one representation not as good as another?

Andrew Helwer

unread,
Jun 7, 2021, 6:43:15 PM6/7/21
to tlaplus
What is the || operator?

With regard to the mathiness of it all, I'd say that's actually more like the source of this problem: the EXCEPT syntax is clunky, but necessary since you have to fully define the value of each variable in a next-state relation. Still, I guess you could do it a different way; consider:

VARIABLE me

Person == [name : STRING, age : Nat]

HaveBirthday ==
    /\ me' \in Person
    /\ me'.name = me.name
    /\ me'.age = me.age + 1

I think that would uniquely define the value of me' right? Kind of strange to think about how that could be implemented in the tools though.

Andrew

Aaron Elligsen

unread,
Jun 7, 2021, 7:12:38 PM6/7/21
to tlaplus
Ah, it looks like the || is just the EXCEPT syntax in PlusCal. 

Stephan Merz

unread,
Jun 8, 2021, 2:28:19 AM6/8/21
to tla...@googlegroups.com
Hi Andrew,

On 8 Jun 2021, at 00:43, Andrew Helwer <andrew...@gmail.com> wrote:

What is the || operator?

With regard to the mathiness of it all, I'd say that's actually more like the source of this problem: the EXCEPT syntax is clunky, but necessary since you have to fully define the value of each variable in a next-state relation. Still, I guess you could do it a different way; consider:

VARIABLE me

Person == [name : STRING, age : Nat]

HaveBirthday ==
    /\ me' \in Person
    /\ me'.name = me.name
    /\ me'.age = me.age + 1

I think that would uniquely define the value of me' right? Kind of strange to think about how that could be implemented in the tools though.

that definition indeed uniquely defines the value of me'. TLAPS has no problem with it (assuming that me \in Person). TLC can handle it provided you override the definition of Person by a finite set that TLC can enumerate, but is slow due to the explicit enumeration that TLC implements. I believe that Apalache and ProB would handle it just fine because they rely on constraint solving.

Stephan


Andrew

On Monday, June 7, 2021 at 6:11:38 PM UTC-4 Aaron Elligsen wrote:
The ALIAS feature looks pretty great. I think that could ease a lot of pain related to destructuring. 

I'm not really happy with the explanations so far though. I see a lot of experienced TLA+ user chiming in that destructuring works better currently, and from experience I believe it, but I don't see an explanation of why it should be this way. It seems to me an arbitrary fact that new spec writers must learn and then code around to improve spec performance and aesthetics.

TLA+ has both the syntax for records, and it can handle multiple updates to the record using the || operator, so it's not like the spec must be written without using them. In the end it just seems like record update syntax is sort of second class and so specifications work better when omitting it. I feel like it betrays the principle that TLA+ is math, like why is one representation not as good as another?

On Monday, June 7, 2021 at 10:38:28 AM UTC-7 Markus Alexander Kuppe wrote:
On 07.06.21 06:52, Andrew Helwer wrote: 
> Plus the state traces are much easier to read. 


The new-ish ALIAS feature of TLC [1] addresses this problem (one less 
reason to write specs that work well with the various tools). 

Markus 

[1] https://github.com/tlaplus/tlaplus/issues/485 



-- 
You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+u...@googlegroups.com.

Igor Konnov

unread,
Jun 8, 2021, 4:32:52 AM6/8/21
to tla...@googlegroups.com, Igor Konnov
Hi all,

To join the discussion, I would also vote for de-structuring state into worker_queue and worker_online. Here are my reasons from the tooling perspective:

1. The variable state is a function from Worker to WorkerState, the latter is essentially a Cartesian product of Seq(Msg) and BOOLEAN. For complex data structures, Apalache would sometimes give up and expand the product into plenty of symbolic constants. You have better chances, if it expands individual sets rather than products. I am not sure about TLAPS, but I would conjecture that quantification in SMT works better over simpler data structures than over complex data structures.

2. I think this applies both to TLAPS and Apalache: When you wrap something into a data structure, you introduce additional symbolic constraints, which makes solving harder. This is probably counterintuitive, as in programming it is often more convenient to wrap multiple variables into a single data structure. (As long as you don’t have concurrency, which may easily introduce inconsistency to your data structures.)

3. The definition of INSTANCE is built around individual state variables. So you can easily replace a state variable with an expression (e.g., another variable). If you carry all of your state in a single variable, it will be harder to use substitution in INSTANCE.


@Andrew, Stephan, Your example works almost out of the box in Apalache. However, you have to annotate the variable `me` with a type and replace `STRING` with a concrete finite set, e.g., `{“Alice”, “Bob”}`.

Apalache is not trying to invent new constants, in order to populate the set `STRING`; so we we do not translate `STRING`. Actually, it should be fairly easy to support quantification over strings, so I have added a feature request [1]. In practice, I don’t know how useful that feature would be, as the SMT solver will just produce some names like c!1, c!2, c!3. It will not do fuzzing for you.

Here is the complete example that works in Apalache:

----------------------- MODULE test_rec_assign -----------------------------
EXTENDS Integers

VARIABLE
\* @type: [ name: Str, age: Int ];
me

Person == [name : {"Alice", "Bob"}, age : Nat]

HaveBirthday ==
/\ me' \in Person
/\ me'.name = me.name
/\ me'.age = me.age + 1

Init ==
/\ me \in Person
/\ me.age = 0

Next ==
HaveBirthday
============================================================================

After all preprocessing steps, Next looks like follows in Apalache (this can be seen in `x/**/OutAnalysis.tla`):

(*
@type: (() => Bool);
*)
Next_si_0 ==
Skolem((\E t_4$1 \in Nat:
Skolem((\E t_3$1 \in { "Alice", "Bob" }:
me' := [name |-> t_3$1, age |-> t_4$1]))))
/\ me'["name"] = me["name"]
/\ me'["age"] = me["age"] + 1

In normal words, Apalache rewrites the expression me' \in Person into a record constructor that is decorated with two existential quantifiers. Apalache tries to rewrite sets, especially the record sets and Cartesian products, as smaller sets are usually easier to reason about. Moreover, the analysis pass labels the existential quantifiers as Skolemizable, so they will be replaced with a constant in the SMT encoding.

Additionally, it labels the equality as an assignment with `:=`, so the variable me’ will be bound to the symbolic value of the expression in the right-hand side, when `Next_si_0` is fired. In general, Apalache would decompose `Next` into a disjunction of predicates Next_si_0, Next_si_1, …, Next_si_n, each of them having a distinct set of symbolic assignments.


[1] https://github.com/informalsystems/apalache/issues/844


Hope this helps,
Igor

> On 08.06.2021, at 08:28, Stephan Merz <stepha...@gmail.com> wrote:
>
>> On 8 Jun 2021, at 00:43, Andrew Helwer <andrew...@gmail.com> wrote:
>>
>> […]

Markus Kuppe

unread,
Jun 8, 2021, 3:13:42 PM6/8/21
to tla...@googlegroups.com
On 06.06.21 15:57, Isaac DeFrain wrote:
>
> To manage the amount of names I have to write following UNCHANGED, I
> usually write several /inclusive/ and /exclusive/ variable tuples; e.g.
> say we have the variables x1, x2, ..., xN and I have a few actions which
> change only x1 and x2 and leave x3, ..., xN unchanged (this would
> suggest an exclusive variable tuple to me). I'll define a tuple, call it
> non_x1_x2_vars == <<x3, ..., xN>>, and then just write UNCHANGED
> non_x1_x2_vars in the aforementioned actions. As long as you write all
> your actions in one module, this seems to work really well. Not to
> mention when you have 10+ variables in a spec, it becomes next to
> impossible to visually inspect whether you have included all the
> unchanged variables.
>
> I first saw this little trick in the raft spec
> <https://github.com/ongardie/raft.tla/blob/master/raft.tla> where they
> use several (what I would call) inclusive variable tuples.


Hi,

have you tried *simulating* the specs with TLC for a few seconds to find
any missing variables in UNCHANGED statements? I haven't yet, but I
would expect the trick to catch most of them.

Markus

Leslie Lamport

unread,
Jun 8, 2021, 3:40:15 PM6/8/21
to tlaplus
I don't know if people realize that TLC can handle:

   VARIABLES x, y, a, b, c
   foo == <<x, y>>
   bar == <<a, foo>>
   boo == <<b, c>>
     ...
 
   /\ UNCHANGED <<bar, d,  boo>>
Reply all
Reply to author
Forward
0 new messages