Time-outs

Theorem

unread,

Sep 21, 2019, 12:16:07 PM9/21/19

to tlaplus

Hello

Is there a general way/guideline to model Timeouts in TLA+. e.g. in a message-passing network model, how can time-outs be modelled. In other words, should I consider time-out a fault? in fact, a more general question is that are Time-outs considered types of faults in a distributed system.

Stephan Merz

unread,

Sep 21, 2019, 12:31:05 PM9/21/19

to tla...@googlegroups.com

I'm assuming that you are not trying to verify any real-time properties of your protocol but that you are only interested in qualitative properties of a protocol that includes timeout actions intended to recover possible faults of the underlying network, such as by resending messages that may have been lost.

In that case, simply include the timeout action as a disjunct in the definition of the next-state relation. This means that the action can occur at any point in the protocol: timeout is modeled by non-determinism. You will probably want to include a precondition that implies that some action whose effect has been "lost" has indeed occurred previously. As a concrete example, have a look at the standard specification of the alternating bit protocol (which has timeout actions for resending lost messages).

The resulting specification is of course an over-approximation of the behavior of the actual protocol: if you can verify the properties that you are interested in, then the actual protocol (where the timeout action is more restricted) will be correct as well. If you find that the over-approximation is too coarse, you may have to add more preconditions. (Remember that as the writer of a specification you have access to the entire system state even if the implementation of a node in a distributed system only sees the local state of that node.) But I recommend to start simple and see how far you get.

Stephan

On 21 Sep 2019, at 18:16, Theorem <ibas...@gmail.com> wrote:

Hello
Is there a general way/guideline to model Timeouts in TLA+. e.g. in a message-passing network model, how can time-outs be modelled. In other words, should I consider time-out a fault? in fact, a more general question is that are Time-outs considered types of faults in a distributed system.

--
You received this message because you are subscribed to the Google Groups "tlaplus" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tlaplus+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/tlaplus/36fe8213-63fc-47fc-86ea-bf15ba19a6f7%40googlegroups.com.

Andrew Helwer

unread,

Sep 24, 2019, 5:22:32 PM9/24/19

to tla...@googlegroups.com

To expand on Stephan's point with an example, the spec might look as follows:

Next ==

\/ ReceiveMessage

\/ Timeout

\/ ...

Where ReceiveMessage is enabled if a message has been delivered, while Timeout is enabled if no message has yet been delivered. Then your spec will cover the following behaviours:

(Message sent) -> (Message delivered) -> (Message received & processed)

(Message sent) -> (Receiver timeout) -> (Message delivered)

(Message sent) -> (Receiver timeout)

(In the final behaviour, the message is never delivered)

Which are probably all the behaviors you want to model in your system, if you have special timeout-handling logic which safety you want to assess.

Andrew

Theorem

unread,

Feb 4, 2020, 11:50:35 AM2/4/20

to tlaplus

thanks very much, this helps a lot

Theorem

unread,

Feb 4, 2020, 11:51:03 AM2/4/20

to tlaplus

many thanks, this is what I was looking for. very helpful indeed. thanks

Stephan

To unsubscribe from this group and stop receiving emails from it, send an email to tla...@googlegroups.com.

Reply all

Reply to author

Forward