Where is the code for the state space traversal algorithm in TLC?

zhi niu

unread,

Sep 14, 2022, 11:09:41 PM9/14/22

to tlaplus

When I use TLA+ for verification projects, I always encounter the problem of state space explosion, which causes the machine to crash or take a long time. Therefore, I want to look at the TLC state space traversal algorithm to see if the state traversal algorithm can be optimized.
question:
1. Where is the detailed location of the code for state space traversal in TLC?
2. Is there a good solution to the state space explosion problem in TLC? thanks

Andrew Helwer

unread,

Sep 15, 2022, 8:03:27 AM9/15/22

to tlaplus

It's breadth-first search that works pretty much the way you would expect: first generating all successor states to the current state, using those states' hashes to check whether they've already been visited, and pushing them onto the state queue if not. There is some extra logic when symmetry is involved but that's about it; there really isn't much to optimize. The best solution to state space explosion is probably to use small model sizes (3, at most 5 elements) and avoid using operations that add permutations (like recording order of operations) or subset-selection to your spec. Also avoid defining extremely large sets, like powersets of records and stuff like that.

If all that fails you can also try Apalache, which uses symbolic instead of finite state model checking.

Andrew

Calvin Loncaric

unread,

Sep 16, 2022, 5:00:32 PM9/16/22

to tla...@googlegroups.com

> Where is the detailed location of the code for state space traversal in TLC?

TLC's source code is on GitHub:

https://github.com/tlaplus/tlaplus/tree/master/tlatools/org.lamport.tlatools/src/tlc2

As Andrew said, the algorithm is breadth-first search. Today, the main components of the search are:

tlc2.tool.impl.Tool (enumerates initial and successor states)
tlc2.tool.queue.IStateQueue and its subclasses (stores the queue of not-yet-visited states)
tlc2.tool.fp.FPSet and its subclasses (stores the set of already-visited states; or rather, the 64-bit hashes of already-visited states)
tlc2.tool.Worker (in a loop: dequeues a state, checks properties, enumerates successors, and enqueues not-yet-visited successors)

Andrew wrote that "there really isn't much to optimize". This is only half true. There are many different ways to optimize state exploration: make successor-state enumeration faster, make the queue have lower thread contention, make the visited-state-set use less (or more) memory, etc.

However, many people much smarter than me have spent enormous effort making TLC as fast as possible. The truth in Andrew's statement is that "there really isn't much left to optimize, and what's left is very difficult to implement correctly". I don't want to discourage you---many people would be very grateful if you improved TLC's performance---but I don't know of any low-hanging fruit you can pick.

> Is there a good solution to the state space explosion problem in TLC?

It depends. Andrew alluded to my favorite solution: symmetry reduction. Symmetry reduction can exponentially decrease the number of states that TLC has to explore. You can check invariants and action properties under symmetry reduction, but unfortunately you cannot use it to check liveness properties faster.

If you are able to use symmetry reduction, I highly recommend it as the first line of defense against state space explosion. You can find information about it in Specifying Systems, section 14.3.4:

https://lamport.azurewebsites.net/tla/book-02-08-08.pdf

--

Calvin

--
You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tlaplus/1fbed6b5-7955-4f0c-af44-a16b1ca79237n%40googlegroups.com.

Gareth Smith

unread,

Sep 29, 2022, 9:18:09 AM9/29/22

to tlaplus

I haven't worked in TLA+ (yet) but designed a lot of state machine architectures for industrial controls and safety systems and the real art is system and sub-system decomposition, but getting this right is the key to avoiding state explosions.

In the context of Moore machines, which are more prone to state explosion than Mealy machines, in my practical uses, but Moore machines are easier to design for creating expected known behaviours, I suggest the following.

Generally, if you have a state explosion in a machine then you often 'fix' this by creating a group of hierarchal machines.

First thing to look for is do you have modes you can break out? In my world that is often a piece of equipment (subsystem) can have conditioning modes of Auto, Manual, Off, Remote. You then use the state of the mode machine as discrete boolean inputs into your main machine.

I use a nomlaclature whereby the state machine name prefixes the variable, seperated by a dot or underscore eg "stackermode.auto" as a boolean etc etc, this works out well for generating class based language equivelents when define your variables as class variables, eg if you want to generate python or java for simulation or demo purposes, because maybe you want to bolt an OPC server onto it, or similar.

I find also that carefully chosen hierarchies can enable creating "encapsulation by architecture" and also reduce your possible credibly needed test cases by making a rule that, for example, in the heirarchy you can only talk to/from one level up or down, and no sideways communication, thus vastly reducing number of test cases, assuming implementation prevents any possible side effects to other machines on the same layer.

As I said, the system/subsystem decomposition is where the black art is and it seems surprisngly difficult to teach/artitculate a general methodology, it seems to something you learn like German, one day you are all at sea and then one day all of a sudden the word order just sort of clicks,

Hope this is helpful.

anything...@gmail.com

unread,

Sep 29, 2022, 9:32:41 AM9/29/22

to tlaplus

I'm still working thru the composing specifications book, but I believe the formalism of communication between state machines is covered under "shared states" starting around page 144.

Clifford Heath

unread,

Sep 29, 2022, 6:18:56 PM9/29/22

to 'Nicholas Fiorentini' via tlaplus

On 29 Sep 2022, at 11:16 pm, Gareth Smith <anything...@gmail.com> wrote:
> I haven't worked in TLA+ (yet) but designed a lot of state machine architectures for industrial controls and safety systems and the real art is system and sub-system decomposition, but getting this right is the key to avoiding state explosions.

> ...

> As I said, the system/subsystem decomposition is where the black art is and it seems surprisngly difficult to teach/artitculate a general methodology, it seems to something you learn like German, one day you are all at sea and then one day all of a sudden the word order just sort of clicks,
>
> Hope this is helpful.

Excellent insights Gareth. I wish you would write an article series on the design patterns you've used.

Clifford Heath.

Reply all

Reply to author

Forward