When TLC error trace send with stuttering

david streader

unread,

Mar 30, 2019, 2:44:22 AM3/30/19

to tlaplus

Hi I am trying to debug liveness properties and realised I lack some understanding:

My module includes:

Next == sData \/ sendD \/ sendA \/ rData \/ receiveD \/

receiveC \/ receiveA \/ dataChannel \/ ackChannel

\/ (* Disjunct to prevent deadlock on completion *)

(( messOut=Message) /\ UNCHANGED vars)

(* with out Fin the model never deadlocks *)

Fin == <>(messOut = Message) (* NOT a property of every behaviour *)

(* I interpreted these two experiments as meaning the system was busy looping between more that one state and not stuttering

But TLC error-trace is

<stuttering> State num=8

<sendD > State num7 *)

Without property Fin but with deadlock checked TLC reports no errors.

With Fin TLC reports that the temporal property fails.

Am I correct that if no deadlock is found then no infinite sequence of stuttering steps are possible ??

My problem is that when a termoral property fails and an error trace ends with <stuttering> together meant that an infinite sequence of stuttering steps was the cause of the filure ??

thanks in advance

Stephan Merz

unread,

Mar 30, 2019, 3:41:31 AM3/30/19

to tla...@googlegroups.com

Hello David,

you are not showing us the full specification. A TLA+ specifications of the form

Init /\ [][Next]_v

allows for infinite stuttering. Adding fairness conditions WF_v(A) or SF_v(A) enforces action <<A>>_v to occur infinitely often if it is "often enough" enabled and therefore in particular rules out infinite stuttering in those cases. But infinite stuttering is still possible when A is disabled at the state where stuttering occurs.

The absence of deadlocks does not rule out infinite stuttering. For example, consider the specification

Init == x=0

Inc == x<5 /\ x'=x+1

Reset == x=5 /\ x'=0

Spec == Init /\ [][Inc \/ Reset]_x /\ WF_x(Inc)

The spec has no deadlock, since at least one action is possible at any reachable state. However, it fails to satisfy the property

[]<>(x=0)

because there is a behavior that stutters infinitely at the state where x=5: since Inc is disabled at that state, the fairness property is trivially satisfied.

More to the point, your next-state relation contains a disjunct that prevents deadlock (as indicated by the comment), but it may very well allow for infinite stuttering.

Hope this helps,

Stephan

--
You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+u...@googlegroups.com.
To post to this group, send email to tla...@googlegroups.com.
Visit this group at https://groups.google.com/group/tlaplus.
For more options, visit https://groups.google.com/d/optout.

david streader

unread,

Mar 30, 2019, 8:43:56 PM3/30/19

to tlaplus

Many thanks Stephan.

So with

Init == x=0

Inc == x<5 /\ x'=x+1

Reset == x=5 /\ x'=0

Spec == Init /\ [][Inc \/ Reset]_x /\ WF_x(Inc)

Is it correct to say that when the the guard (x<5) is false it does not prevent the step Inc from being chosen. But if Inc is chosen and the guard is false then the system stutters.

Is it reasonable to say that a particular step, Inc in this example, is the stuttering step??

The error trace clearly indicates that stuttering occurs form state (x=5) but there is nothing on the State Graph to indicate that stuttering is possible from state (x=5) and not from other states. There are dashed loop arrows from every state. These loops do not seem to be related to any particular step. Do you know what they indicate?

You say

However, it fails to satisfy the property []<>(x=0) because there is a behavior that stutters infinitely at the state where x=5: since Inc is disabled at that state,

But Reset is disabled when x=1 does this lead to stuttering?

I experimented with []<>(x=3) e.t.c. The results surprised me. I think of stuttering as a property of the State Machine but where stuttering is reported depends upon the property checked not upon the State machine itself. Am I mistaken to think of stuttering as a property of the state machine? If stuttering is a property of the State Machine then I would have thought that stuttering from the State (x=2) would also falsify []<>(x=0) ??

So do the dashed loops on the State Graph indicate that it can stutter from any state

My intuition was that

Spec == Init /\ [][Inc \/ Reset]_x /\ SF_x(Inc) /\ SF_x(Reset)

would not stutter and experimentation confirmed this.

Hope my questions make sense in TLA+ terms because i feel that I am approaching TLA+ with a lot of intellectual baggage.

kind regards david

Jay Parlar

unread,

Mar 31, 2019, 9:41:11 AM3/31/19

to tla...@googlegroups.com

On Saturday, 30 March 2019 20:43:56 UTC-4, david streader wrote:

Many thanks Stephan.

So with
Init == x=0
Inc == x<5 /\ x'=x+1
Reset == x=5 /\ x'=0
Spec == Init /\ [][Inc \/ Reset]_x /\ WF_x(Inc)
Is it correct to say that when the the guard (x<5) is false it does not prevent the step Inc from being chosen. But if Inc is chosen and the guard is false then the system stutters.

`Inc` can only be "chosen" when its guard is true (the typical parlance is to say that `Inc` is "enabled" or "not enabled/disabled"). When its guard is false (i.e., whenever x is not less than 5), an Inc step cannot take place.

So what can take place when `Inc` is not enabled? When x=5, a `Reset` step _could_ take place. But there's no guarantee that it will take place, and thus stuttering is possible. In other words, when x=5, the `Reset` step is enabled, but because there's no fairness associated with that step (either weak fairness or strong fairness), we aren't requiring that the step eventually be taken

Stuttering steps are steps in which no variables in the formula change. The rest of the universe happily moves along during those steps, but _your_ particular variables remain unchanged. In a real system, this might represent a process having received a message (and thus its guard/enabling condition being true), but it never responds because the operating system scheduler doesn't give the process time on the CPU.

Or it could even represent a process receiving a message, but never responding, because the process crashes! If `Reset` represents a process in your system, and it's supposed to "do its thing" whenever x=5, what happens if `Reset` process crashes? Nothing! It sits there stuttering. But this is NOT deadlock, because _mathematically_ there's a way for the system to proceed, for the system to move to a different step. And all deadlock checking cares about is whether or not there's a transition out of a state.

This system WILL deadlock, because once it reaches x=5, there's no transition out of the state.

```

Init == x=0

Inc == x<5 /\ x'=x+1

Spec == Init /\ [][Inc]_x /\ WF_x(Inc)

```

If you disable deadlock checking, and run it with a check for temporal property []<>(x=1), then it too will report stuttering at the state x=5.

Is it reasonable to say that a particular step, Inc in this example, is the stuttering step??

Nope. `Inc` may stutter for some period of time (for a million years even), but by saying `WF_x(Inc)`, we're saying that _eventually_ the `Inc` step will take place when `Inc` is enabled.

The error trace clearly indicates that stuttering occurs form state (x=5) but there is nothing on the State Graph to indicate that stuttering is possible from state (x=5) and not from other states. There are dashed loop arrows from every state. These loops do not seem to be related to any particular step. Do you know what they indicate?

The dashed loop arrows are (I believe) indicating stuttering steps. Any one of those states can stutter, but for the states where x is one of 0..4, your system can't stutter _infinitely_.

Your weak fairness condition of WF_x(Inc) states that if `Inc` ever becomes enabled forever, then the `Inc` step will eventually take place. When, for example, x=3, your process could sit for a LONG time doing nothing (i.e., stuttering), but WF means that _eventually_ the step will take place and you'll go to x=4. And TLC only throws an error when a behaviour can _infinitely_ stutter.

But you have no such fairness condition for the time at which x=5. In that case, your process may be enabled infinitely, and never proceed, and thus TLC gives the stuttering error.

You say
However, it fails to satisfy the property []<>(x=0) because there is a behavior that stutters infinitely at the state where x=5: since Inc is disabled at that state,

But Reset is disabled when x=1 does this lead to stuttering?

The behaviour that stutters infinitely is

x=0 -> x=1 -> x=2 -> x=3 -> x=4 -> x=5 [stutters here forever]

That's not the ONLY possible behaviour your system can experience. A _possible_ behaviour is that after some time with x=5, the `Reset` action is allowed to take place, setting x back to 0. But because you haven't specified fairness on `Reset`, it's _possible_ for it to sit there stuttering forever, and thus the temporal property is violated.

I experimented with []<>(x=3) e.t.c. The results surprised me. I think of stuttering as a property of the State Machine but where stuttering is reported depends upon the property checked not upon the State machine itself. Am I mistaken to think of stuttering as a property of the state machine? If stuttering is a property of the State Machine then I would have thought that stuttering from the State (x=2) would also falsify []<>(x=0) ??

Any system can stutter at any point in time, for potentially an infinite amount of time. But if you don't specify any temporal properties in the model checker, then TLC won't check for stuttering.

When you enable a temporal property for checking, then TLC finds the state x=5 allows for infinite stuttering.

The reason that x=2 does NOT falsify []<>(x=0) is because the system won't infinitely stutter at that state, thanks to your WF_x(Inc). You are guaranteeing that your system will always progress forward from that point. But once it hits x=5, there's no more guarantee, as it can stutter infinitely there.

So do the dashed loops on the State Graph indicate that it can stutter from any state

I believe they indicate it can stutter from any state, but NOT that it will _infinitely_ stutter from that state.

My intuition was that
Spec == Init /\ [][Inc \/ Reset]_x /\ SF_x(Inc) /\ SF_x(Reset)
would not stutter and experimentation confirmed this.

You could also have done

Spec == Init /\ [][Inc \/ Reset]_x /\ WF_x(Inc) /\ WF_x(Reset)

Strong fairness is harder to implement in practice, so it's usually best to start with weak fairness.

The notions of stuttering and fairness are not intuitively obvious, in my experience. You'll likely have to read through the relevant parts of "Specifying Systems" (https://lamport.azurewebsites.net/tla/book-02-08-08.pdf) to really get a handle on them.

In particular, chapter 2 introduces stuttering very early on, and chapter 8 gives much more elaborate and formal definitions. These concepts personally took me a decent chunk of time to understand, with lots of staring at the page and re-reading the same paragraphs over and over.

Jay P.

david streader

unread,

Mar 31, 2019, 10:42:00 PM3/31/19

to tlaplus

Thanks Jay

My use of terminology alas is from the refinement of Abstract data types, ADT, and process algebra so I struggle with TLA terminology but I have read the texts in quite some detail.

So a step is disabled when its guard (or precondition) is false and although we don't say that it is the particular step that stutters it is when a guard evaluates to false that stuttering occurs.

With no fairness condition Spec == Init /\ [][Inc \/ Reset]_x then the state machine we considered

Init == x=0

Inc == x<5 /\ x'=x+1

Reset == x=5 /\ x'=0

could stutter immediately and falsify any liveness condition. But the error trace displayed depends on the temporal condition, it may first completes several steps before stuttering even though it could stutter immediately and falsify the condition. I misinterpreted the error trace as implying that this was the first point that stuttering could occur where as I gather that this is the first point where stuttering is checked.

TLA+ allows fairness to be defined for very specific situations but because of my background in process algebra and abstract data types all I want is to have strong fairness every where. In ADT and process algebras CSP and CCS nothing stutters by default you have to add some thing to enable stuttering.

I gather /\ WF_vars(Next) is will stop all stuttering but not sure what the effect of restricting the vars to a subset of the variables is. Is /\ SF_vars(Next) thought enforce strong fairness throughout the state machine?

Put another way: would SF_vars(Next) imply any other fairness condition?

It certainly is not having the effect I imagined but this may be caused by the translation from PlusCal to TLA+

regards david

Jay Parlar

unread,

Mar 31, 2019, 11:45:52 PM3/31/19

to tlaplus

David,

So a step is disabled when its guard (or precondition) is false

Correct

and although we don't say that it is the particular step that stutters it is when a guard evaluates to false that stuttering occurs.

Incorrect. Stuttering is when a step is enabled (i.e., its guard/precondition/enabling-conditions are true), but instead of the step being taken, _nothing at all happens_. And by "nothing at all happens", very strictly, I mean that _all_ the variables defined with VARIABLES remain unchanged.

Stuttering in and of itself is completely fine. Everything stutters all the time. Stuttering means that the infinite numbers of other variables in the universe may or may not be changing in between two distinct points in time, but _our_ variables defined with VARIABLES remain unchanged. And this is expected. As one example, if you're running a process on a single core CPU, the operating system is putting your process on and off the core all the time. Whenever it's off the core, you could think of your software as stuttering. The universe continues to move forward, but the state of your process remains the same. The question is, does your spec guarantee that your process will _eventually_ run again. (Naturally it's more subtle than that, but hopefully that helps with understanding/terminology).

Every spec we write, every program we write, it all is going through stuttering all the time. That's fine and expected.

But the problem is when a particular spec might _infinitely stutter_. From a terminology point of view, we must understand that stuttering is fine and normal, but infinite stuttering is what temporal properties are generally concerned with. And TLC will _only_ check for infinite stuttering if you explicitly instruct it to check a temporal property. TLC calls the error "stuttering", but really it means "infinite stuttering".

With no fairness condition Spec == Init /\ [][Inc \/ Reset]_x then the state machine we considered
Init == x=0
Inc == x<5 /\ x'=x+1
Reset == x=5 /\ x'=0
could stutter immediately and falsify any liveness condition.

Correct. When x=0, the system can stutter (just like it can at x=1, x=2, ...). Since we have no fairness specified, we have no guarantee that the stuttering will _eventually_ cease, and thus TLC raises the error.

But the error trace displayed depends on the temporal condition, it may first completes several steps before stuttering even though it could stutter immediately and falsify the condition. I misinterpreted the error trace as implying that this was the first point that stuttering could occur where as I gather that this is the first point where stuttering is checked.

For the original spec you posted (with fairness on Inc but not on Reset), then x=5 _is_ the first place that "infinite stuttering" could occur.

But with no fairness specified, the point at which the stuttering error is raised is an implementation detail. With no fairness condition specified, then the following could _also_ be raised as a stuttering error

x = 0 -> x = 1 [stuttering]

But the nature of TLC means that it "discovers" the stuttering problem at x=0, and stops right there.

TLA+ allows fairness to be defined for very specific situations but because of my background in process algebra and abstract data types all I want is to have strong fairness every where. In ADT and process algebras CSP and CCS nothing stutters by default you have to add some thing to enable stuttering.

The easiest way to do this is to not check for temporal properties :) But once you start checking for temporal properties, then TLC wants to find if any otherwise valid behaviours might infinitely stutter.

I gather /\ WF_vars(Next) is will stop all stuttering but not sure what the effect of restricting the vars to a subset of the variables is.

(I think it's an error to restrict WF to a subset of variables, but someone more knowledgable can correct me.)

Well, it won't necessarily stop all stuttering. `WF_vars(Next)` means that if a condition is "enabled forever", then the step will eventually take place. But you can have behaviours where a condition is enabled and then disabled and then enabled and then disabled, ad nauseam, which would result in a violation of some temporal properties.

In either part 1 or part 2 (http://lamport.azurewebsites.net/video/video9a.html or http://lamport.azurewebsites.net/video/video9b.html) of Lamport's "Alternating Bit Protocol" video lecture, he has a great example showing this taking place. I recommend watching both videos (and the entire series, in fact http://lamport.azurewebsites.net/video/videos.html)

Is /\ SF_vars(Next) thought enforce strong fairness throughout the state machine?

SF_vars(Next) means that if a condition is infinitely enabled and disabled, then the step will eventually take place.

Take this modified spec as an example

```

EXTENDS Integers

VARIABLES x

Init == x=0

Inc == x<5 /\ x'=x+1

Reset == x \in {5, 6} /\ x' = 0

JumpToSix == x = 5 /\ x' = 6

Next == Inc \/ Reset \/ JumpToSix

Spec == Init /\ [][Next]_x /\ SF_x(Next)

```

Set a temporaral property of `[]<>(x=6)`. You will have a stuttering violation. Even though we have SF_x(Next), we're not guaranteed to ever take the JumpToSix step.

If we change Spec to:

Spec == Init /\ [][Next]_x /\ SF_x(Inc \/ Reset) /\ SF_x(JumpToSix)

Then the stuttering violation is eliminated.

But try changing those SF_x to WF_x and see what happens!

Put another way: would SF_vars(Next) imply any other fairness condition?

If something is satisfied by weak fairness, then it is also satisfied by strong fairness.

It certainly is not having the effect I imagined but this may be caused by the translation from PlusCal to TLA+

There are definitely places in the way PlusCal is translated in which fairness can get especially tricky. You can mark the whole algorithm as fair (which might not do exactly what you expect), you can mark an individual process as fair (which also might not do what you expect), and you can even mark individual labels as fair. See page 38 of https://lamport.azurewebsites.net/tla/p-manual.pdf

Jay P.

Stephan Merz

unread,

Apr 1, 2019, 2:47:28 AM4/1/19

to tla...@googlegroups.com

Hi,

Jay answered your post, just a quick addition from my side:

I misinterpreted the error trace as implying that this was the first point that stuttering could occur where as I gather that this is the first point where stuttering is checked.

In your running example (with a weak fairness condition on the Inc action), infinite stuttering is in fact only allowed at the state where x=5. In general, TLC is not guaranteed to find the shortest counter-example for temporal properties, but it finds a shortest counter-example when checking invariants.

In ADT and process algebras CSP and CCS nothing stutters by default you have to add some thing to enable stuttering.

Note that stuttering is in fact very explicit in the next-state relations used in TLA+ specifications: [Next]_v is shorthand for "Next \/ v'=v", and this disjunct is responsible for stuttering. It allows any step to happen that leaves the expression v unchanged, and a little thought will convince you that v'=v is bound to hold if all the variables that appear in v (and perhaps more) remain unchanged.

Jay writes:

(I think it's an error to restrict WF to a subset of variables, but someone more knowledgable can correct me.)

It is syntactically legal to write WF_e(A) for any expression e (possibly different from the tuple of all variables) and any action A (even one that does not occur as a disjunct of the next-state relation), and sometimes it makes sense to write such fairness conditions. The problem with these is that they may result in specifications that are not "machine closed". Leslie recently pointed this out in his answer to another thread [1]. Unless you know exactly what you are doing, it is better to restrict yourself to fairness conditions WF_v(A) where v is the same subscript as the one used in [Next]_v, and A is a disjunct of Next.

Regards,

Stephan

[1] https://groups.google.com/forum/#!topic/tlaplus/1FvqnGFhxTE

david streader

unread,

Apr 23, 2019, 5:49:33 PM4/23/19

to tlaplus

Many thanks I never thought of stuttering as only occurring when a guard was satisfied that explains a lot.

If I seem to nitpicking it is just because I am not finding debugging livenes easy.

With spec

EXTENDS Naturals

VARIABLES x

Init == x=0

Inc == x<5 /\ x'=x+1

Reset == x=5 /\ x'=0

Spec == Init /\ [][Inc \/ Reset]_x

(* /\ SF_<<x>>(Inc) *)

/\ SF_<<x>>(Reset)

where stuttering is reported depends upon what property you ask as well as the spec itself.

with <> (x=2) it is after x = 1 but with <> (x=4) its is after x = 3.

Consequently it was important that I stopped interpreting it to be an improvement that the spec was not stuttering so soon.

Does the addition of liveness conditions change the State Graph to indicate where infinite stuttering might occur?

Jay Parlar

unread,

Apr 23, 2019, 7:19:40 PM4/23/19

to tlaplus

> Does the addition of liveness conditions change the State Graph to indicate where infinite stuttering might occur?

Can you clarify what you mean here? If by “liveness conditions”, you mean adding temporal property checks, then nothing should change in the state space simply by adding those checks.

Stuttering is only checked for if you add a liveness property, because stuttering can only affect the “correct behaviour” of a system if a temporal property is part of the correct behaviour.

Stephan Merz

unread,

Apr 24, 2019, 7:43:42 AM4/24/19

to tla...@googlegroups.com

When TLC reports a counter-example to an invariant, it is guaranteed to be of smallest length because the state space is explored using breadth-first search [1]. Checking liveness properties relies on exploring strongly connected components of the state graph and the above guarantee does not hold. The state graph is still the same, independently of the liveness property that you try to verify, but different counter-examples may be reported.

Stephan

[1] Using default settings, and ignoring effects due to parallel exploration.

david streader

unread,

Apr 30, 2019, 12:53:47 AM4/30/19

to tlaplus

Thanks

The error trace for safety properties has proven very helpful. But for liveness properties I had been reading to much into the trace. But won't be doing that in future. kind regards david

On Wednesday, 24 April 2019 23:43:42 UTC+12, Stephan Merz wrote:

When TLC reports a counter-example to an invariant, it is guaranteed to be of smallest length because the state space is explored using breadth-first search [1]. Checking liveness properties relies on exploring strongly connected components of the state graph and the above guarantee does not hold. The state graph is still the same, independently of the liveness property that you try to verify, but different counter-examples may be reported.

Stephan

[1] Using default settings, and ignoring effects due to parallel exploration.

On 23 Apr 2019, at 23:49, david streader <davidis...@gmail.com> wrote:

Many thanks I never thought of stuttering as only occurring when a guard was satisfied that explains a lot.

If I seem to nitpicking it is just because I am not finding debugging livenes easy.

With spec
EXTENDS  Naturals
VARIABLES x
Init == x=0
Inc == x<5 /\ x'=x+1
Reset == x=5 /\ x'=0
Spec == Init /\ [][Inc \/ Reset]_x
  (*   /\ SF_<<x>>(Inc) *)
     /\ SF_<<x>>(Reset)

where stuttering is reported depends upon what property you ask as well as the spec itself.

with <> (x=2) it is after x = 1 but with  <> (x=4) its is after x = 3.

Consequently it was important that I stopped interpreting it to be an improvement that the spec was not stuttering so soon.

Does the addition of liveness conditions change the State Graph to indicate where infinite stuttering might occur?

--
You received this message because you are subscribed to the Google Groups "tlaplus" group.

To unsubscribe from this group and stop receiving emails from it, send an email to tla...@googlegroups.com.

Reply all

Reply to author

Forward

When TLC error trace send with stuttering - what does this mean?

david streader

Stephan Merz

david streader

Jay Parlar

david streader

Jay Parlar

Stephan Merz

david streader

Jay Parlar

Stephan Merz

david streader