TLC not handling disjunction correctly in special case?

112 views
Skip to first unread message

Andrew Helwer

unread,
May 22, 2017, 7:15:01 PM5/22/17
to tla...@googlegroups.com
I've created a minimal spec to demonstrate what might be a case of TLC not properly handling disjunction. Running TLC on this spec produces a runtime exception. The case occurs when you have a variable which can either hold a record value or a null value; this formula will throw an exception when the record is null, when we expect it to just evaluate to true:

Formula ==
    \/ record = NoRecord
    \/ record.key


Of course TLC isn't expected to exactly correspond to mathematical semantics (it cares about the order of elements in a conjunction when defining/referencing primed variable values, I believe) but might this be a case where TLC can "short-circuit" the evaluation of the formula once it sees the first disjunct is satisfied? Or even wrap every disjunct in a try/catch to see whether it can find one which both fails to throw an exception and evaluates to true? Maybe this has all been considered before and rejected because it produces surprising behaviour elsewhere.

Here's the full spec:

-------------------------------- MODULE Bug --------------------------------
EXTENDS     Naturals

VARIABLE    clock,
            record

RecordType == [key : BOOLEAN]

NoRecord == CHOOSE r : r \notin RecordType

TypeInvariant ==
    /\ clock \in Nat
    /\ record \in RecordType \cup {NoRecord}

Tick ==
    /\  \/ record = NoRecord
        \/ record.key
    /\ clock' = clock + 1
    /\ UNCHANGED <<record>>

CreateRecord ==
    /\ record = NoRecord
    /\ record' = [key |-> FALSE]
    /\ UNCHANGED <<clock>>

SetRecordFlag(value) ==
    /\ record /= NoRecord
    /\ record' = [record EXCEPT !.key = value]
    /\ UNCHANGED <<clock>>

DestroyRecord ==
    /\ record /= NoRecord
    /\ record' = NoRecord
    /\ UNCHANGED <<clock>>

Init ==
    /\ clock = 0
    /\ record = NoRecord

Next ==
    \/ Tick
    \/ CreateRecord
    \/ \E t \in BOOLEAN : SetRecordFlag(t)
    \/ DestroyRecord
=============================================================================

Robert Beers

unread,
May 23, 2017, 11:52:28 AM5/23/17
to tla...@googlegroups.com

Hi Andrew,

From the perspective of using TLA+/TLC to establish the correctness of architects' and designers' (something I did for many years in industry), what you are seeing is a desired behavior.

(Allow me to apologize for such a long reply to a simple question. Our team debated ideas surrounding what you touched on many times over multiple projects with lots of customers. I have much to say as a result. Let me know if any of the following doesn't make sense. Hopefully the length won't bore you to death.)

First, I want to say something we had to remind ourselves regularly: We write specifications in TLA+, not programs or informal descriptions.

With that said, let me generalize your Formula action and add a next state relation to it.

  Formula ==
    /\ \/ A
       \/ B
    /\ N

where N is the next state relation and A and B are preconditions for N to occur. In this form, what Formula says to the architects and designers I worked with is

  If the system is in a state where A is true, then N can occur.
    and
  If the system is in a state where B is true, then N can occur.

I used "and" there to clarify that A and B are separate conditions even though in our TLA+ specification we would combine them disjunctively. In a specification and in their minds, there is no relationship between A and B. Formula is really two specifications in one because of the disjunction.

Thus, if my customers saw your specific version of Formula, they would interpret that as

  If record has a NoRecord value, then the system can act
    and
  If record.key is true, then the system can act

They would immediately balk at the second specification because there is no check that record has a valid key. And they would be right. So, even before running TLC I would have to rewrite the TLA+ as

  Formula ==
    /\ \/ record = NoRecord
       \/ /\ record # NoRecord
          /\ record.key
    /\ NextStateStuff

We would write the spec this way instead of using an if-then or implication because the specification is clearer and fully spelled out. When the architects and designers read the specifications, they want to see the clear, specific rules. If the rules require them to think about language semantics or take data construction implications into mind, then they are not good rules and they will rightly push back. They would push back on your second disjunct because it implies record has to be checked first. They don't want to have to figure that out. Their jobs are stressful enough as it is. Plus, leaving the implication for them to figure out provides an effective path for a bug to get into the design, which runs counter to why we write specifications in the first place: To prevent bugs in designs.

For us folks using TLA+/TLC, we want TLC to handle the disjuncts separately. TLC parallelizes disjunction analysis, thank goodness. In our designs, we had so many actions with so many disjuncts that if TLC evaluated them all in order to check for side effects, performance would have tanked. In a fully mature specification, the next state relation is an enormous disjunction (of nested disjunctions of nested disjunctions of...). Evaluating all that in order would be prohibitive.

In short, for the sake of writing clear specifications, the TLC exception you are seeing is a good thing, and for TLC performance, we want it to evaluate disjuncts in parallel.

Sorry again for a complex answer to a simple question.


Robert

On Mon, May 22, 2017 at 4:15 PM, Andrew Helwer <andrew...@gmail.com> wrote:
I've created a minimal spec to demonstrate what might be a case of TLC not properly handling disjunction. Running TLC on this spec produces a runtime exception. The case occurs when you have a variable which can either hold a record value or a null value; this formula will throw an exception:


Formula ==
    \/ record = NoRecord
    \/ record.key


Of course TLC isn't expected to exactly correspond to mathematical semantics (it cares about the order of elements in a conjunction when defining primed variable values, I believe) but might this be a case where TLC can "short-circuit" the evaluation of the formula once it sees the first disjunct is satisfied? Or even wrap every disjunct in a try/catch to see whether it can find one which both fails to throw an exception and evaluates to true? Maybe this has all been considered before and rejected because it produces surprising behaviour elsewhere.

--
You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+unsubscribe@googlegroups.com.
To post to this group, send email to tla...@googlegroups.com.
Visit this group at https://groups.google.com/group/tlaplus.
For more options, visit https://groups.google.com/d/optout.

Leslie Lamport

unread,
May 23, 2017, 1:02:04 PM5/23/17
to tlaplus
There is a very simple explanation of why TLC is evaluating both
clauses of the disjunction even when the first is true.  Consider the
spec with this initial predicate and next state relation:

  Init == x = 22

  Next == \/ /\ x > 7
             /\ x' = 42
          \/ /\ x > 9
             /\ x' = 24   

Would you expect TLC never to find the state with x = 24 because when
evaluating Next, it always finds the first disjunct true and therefore
never evaluates the second disjunct?

When evaluating the next-state action, TLC keeps evaluating any
disjunct that might evaluate to true in order to find all possible
next states.  There are some cases in which TLC reports an error when,
if it were more clever, it would know that a subexpression it can't
evaluate does not have to be evaluated to determine the set of
possible next states--for example,

   Init == x = 0
   Next == IF x = "abc" THEN x'= x+1 ELSE x'=x+1

This is not the case in your example, because the value of the first
conjunct of Tick is unspecified if record equals NoRecord, since that
subexpression is then of the form TRUE \/ e where the value of e is
not necessarily a Boolean.  Hence, the set of possible next states in
that case is not specified by the semantics of TLA+.

Leslie

Markus Alexander Kuppe

unread,
May 24, 2017, 10:06:54 AM5/24/17
to tla...@googlegroups.com
On 23.05.2017 17:52, Robert Beers wrote:
> For us folks using TLA+/TLC, we want TLC to handle the disjuncts
> separately. TLC parallelizes disjunction analysis, thank goodness. In
> our designs, we had so many actions with so many disjuncts that if TLC
> evaluated them all in order to check for side effects, performance
> would have tanked. In a fully mature specification, the next state
> relation is an enormous disjunction (of nested disjunctions of nested
> disjunctions of...). Evaluating all that in order would be prohibitive.
>
> In short, for the sake of writing clear specifications, the TLC
> exception you are seeing is a good thing, and for TLC performance, we
> want it to evaluate disjuncts in parallel.

Hi Robert,

actually, TLC parallelizes state space exploration by concurrently
generating successor states. Each worker picks a state from the set of
unexplored states and sequentially generates its set of successor
states. Generating a state's successors is done by sequentially
evaluating all actions.
TLC does not parallelize the evaluation of a list of disjuncts though.
No matter how many times you run a spec with the following next-state
relation, TLC will always report the first assertion as error:

Next == /\ \/ Assert(FALSE, "1. Disjunct")
\/ Assert(FALSE, "2. Disjunct")
\/ Assert(FALSE, "3. Disjunct")
\/ Assert(FALSE, "4. Disjunct")
/\ x' = 42

What you describe could be an interesting concept to improve scalability
with unfavourably shaped state graphs.

Cheers
Markus

Robert Beers

unread,
May 24, 2017, 10:27:31 AM5/24/17
to tla...@googlegroups.com

Hi Markus,

Thank you for correcting my misinformation! Not sure what I was thinking, but it must have been wishful. I have been meaning to get back into the TLC code and try some ideas out. This would be a good (and challenging!) impetus.


Robert

Reply all
Reply to author
Forward
0 new messages