You can get distinct state counts with a JVM system property and a small amount of TLA+.
Step 1: Enable extended statistics
Pass `-Dtlc2.tool.Simulator.extendedStatistics=true` to the JVM.
`java -Dtlc2.tool.Simulator.extendedStatistics=true -jar tla2tools.jar -simulate num=10000 MySpec`
This uses a HyperLogLog (constant space, approximate count) per worker to track distinct states. If you want exact counts and don't mind even more overhead, also pass `-Dtlc2.tool.Simulator.extendedStatistics.naive=true`
Step 2: Read the count via TLCGet("stats").distinct
With extended statistics enabled, TLCGet("stats").distinct returns the number of distinct states seen (instead of -1). You can use _PERIODIC to monitor it during simulation and POSTCONDITION to report it at the end.
Add to your spec:
```tla
SimStats ==
LET s == TLCGet("stats")
IN PrintT(<<"traces", s.traces,
"generated", s.generated,
"distinct", s.distinct>>)
```
Both need to be declared in the .cfg file:
```
_PERIODIC SimStats
POSTCONDITION SimStats
```
This will print something like the following at every progress interval and once at the end.
<<"traces", 10000, "generated", 2067418, "distinct", 8374>>
Caveats: The distinct count is per worker thread, not globally deduplicated across workers. With -workers 1 (the default) this doesn't matter. With multiple workers, each maintains its own counter. The HyperLogLog approximation is usually within a few percent of the true count. Use the naive flag if you need exactness.
M.
PS: TLCGet("stats”) also includes per-action counts, per-variable distinct values, and trace length statistics. I've used this (
https://github.com/microsoft/CCF/issues/6537) to serialize the full record to ndjson and plot coverage over time.