[cc'ing friam and cap-talk because e-lang has been so idle lately. But e-lang is the right place for this, so further discussion on this topic should continue on e-lang.]
In E, if code in a turn went into an infinite loop, that vat was hosed.
Separately, within our model of persistence, if a vat crashed, then it was restarted from the last checkpointed state, which was ideally the beginning of the current turn. When it restarted, it might retry the same event. If it always retried that event, and that event always caused a crash, then the vat was also hosed.
Previously we've always said that an ocap vat defends only integrity at object granularity, but that a vat is the minimal unit for defending availability. If Alice is to protect her availability from Bob going into an infinite loop, then Bob must be in a separate vat, and Alice and Bob may interact only asynchronously.
We are now rebuilding the Communicating Event Loops model to run on blockchains. On blockchains, all computation is resource constrained. It must be paid for in a finite amount of some unit, now universally called "gas". Thus, infinite loops are reliably turned into transaction abort. For blockchains, the bookkeeping needed to rewind to the previous turn boundary is not optional; nor is it expensive when compared to the rest of the platform. For non-blockchains where we still do this bookkeeping, we can use a non-deterministic watchdog timer instead of gas.
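To make the gas-plus-rewind idea concrete, here is a minimal sketch in TypeScript. Every name in it (OutOfGas, Meter, VatState, runTurn) is made up for illustration and is not any existing API. It assumes the vat state is a flat record that can be snapshotted by copying, which a real vat could not rely on.

```typescript
// Hypothetical sketch; none of these names come from an existing library.

// A plain class rather than an Error subclass, so that `instanceof`
// behaves the same under every compilation target.
class OutOfGas {
  constructor(readonly message: string) {}
}

class Meter {
  constructor(private remaining: number) {}

  // Charge for one unit of work; throw once the budget is used up.
  charge(units: number = 1): void {
    if (this.remaining < units) {
      throw new OutOfGas("gas exhausted");
    }
    this.remaining -= units;
  }
}

type VatState = { counter: number };

// Run one turn against a copy of the state and commit only if it finishes.
function runTurn(
  state: VatState,
  gas: number,
  body: (draft: VatState, meter: Meter) => void,
): VatState {
  const draft: VatState = { ...state }; // assumption: shallow copy suffices
  try {
    body(draft, new Meter(gas));
    return draft; // commit: the draft becomes the new state
  } catch (e) {
    if (e instanceof OutOfGas) {
      return state; // abort: back to the previous turn boundary
    }
    throw e;
  }
}

const before: VatState = { counter: 0 };

// Unmetered, this loop would never terminate. Metered, it aborts.
const afterLoop = runTurn(before, 1000, (draft, meter) => {
  for (;;) {
    meter.charge();
    draft.counter += 1;
  }
});

// A well-behaved turn commits its change.
const afterOk = runTurn(before, 1000, (draft, meter) => {
  meter.charge();
  draft.counter += 1;
});
```

In this sketch the looping turn leaves the state exactly as it was, and only the well-behaved turn's increment survives. A watchdog timer would replace the Meter with a wall-clock check, at the cost of determinism.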
Such transactional rewind gives us new degrees of freedom.
Imagine that Alice wishes to use Bob within the same vat, invoking Bob only asynchronously with some kind of enforced budget, to protect Alice's availability. But Alice passes some of her own objects to Bob so that Bob can invoke Alice's objects synchronously during such a turn. The constraint is that Bob's stack frames may only appear when the frame at the bottom of the call stack is a Bob frame, and that everything that happens during that turn draws on the limited budget Alice set up. Within such a turn, Alice and Bob can call back and forth synchronously, since there's still a Bob frame at the bottom of that stack.
If that turn exhausts the budget Alice allocated for that turn, then the turn aborts and the vat goes back to the state right before Bob's turn started.
Bob's turn might also send asynchronous messages, but we adopt the Waterken invariant that no messages are ever released from uncommitted turns. If the turn aborts, then it also did not send any messages. If the turn commits, then we need to think about what budget the turns caused by those messages draw on. But let's worry about that later.
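A rough sketch of that invariant, again in TypeScript with made-up names (OutMessage, TurnContext, runTurnWithOutbox, delivered): sends made during a turn only land in a per-turn outbox, and the outbox is released on commit and dropped on abort. For simplicity the sketch treats any throw as an abort, which is an assumption of the example and not part of the proposal.

```typescript
// Hypothetical sketch; none of these names come from an existing library.
type OutMessage = { target: string; verb: string };

class TurnContext {
  private outbox: OutMessage[] = [];

  // Sending during a turn only buffers the message.
  send(msg: OutMessage): void {
    this.outbox.push(msg);
  }

  // Called by the vat at commit time only.
  release(): OutMessage[] {
    return this.outbox;
  }
}

// Stand-in for the network: messages that actually left the vat.
const delivered: OutMessage[] = [];

function runTurnWithOutbox(body: (turn: TurnContext) => void): boolean {
  const turn = new TurnContext();
  try {
    body(turn);
  } catch (_e) {
    // Abort: the buffered messages vanish along with the turn.
    return false;
  }
  // Commit: only now do the messages become visible to anyone else.
  for (const msg of turn.release()) {
    delivered.push(msg);
  }
  return true;
}

const committed = runTurnWithOutbox((turn) => {
  turn.send({ target: "carol", verb: "notify" });
});

const aborted = runTurnWithOutbox((turn) => {
  turn.send({ target: "carol", verb: "notify-twice" });
  throw new Error("budget exhausted");
});
```

Only the message from the committed turn ends up in delivered. The question of which budget the resulting turns draw on is not addressed by this sketch.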
Once Bob's turn itself aborts, say, from exhausting its budget, what should happen next? I propose that Bob's turn acts as if this first invocation throws an exception instead of doing anything else. IOW, rather than starting Bob's turn, the promise for the result of the turn becomes a rejected (broken) promise, reducing this case to the normal asynchronous exception handling coordination.
With all that mechanism, we could support transactional abort of a turn for other purposes. We could provide an abort(error) operation that, in general, aborts the turn, reverts to the state before the turn started, and immediately rejects the promise for the turn's result with that error. Alice can let Bob call her objects synchronously while still protecting her availability from Bob's profligacy.
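Here is one way abort(error) might look, as a hedged TypeScript sketch. Vat, TurnAbort, Ledger, and Settlement are all invented for the example. The returned settlement record stands in for how the turn's result promise would be settled; a real vat would resolve or reject an actual promise. The sketch also assumes that an ordinary throw keeps its usual meaning (the result is rejected but side effects stay), which the proposal above does not specify.

```typescript
// Hypothetical sketch; none of these names come from an existing library.

// Thrown internally by abort(); a plain class so `instanceof` is reliable.
class TurnAbort {
  constructor(readonly error: unknown) {}
}

type Ledger = { balance: number };

// Stand-in for the settlement of the turn's result promise.
type Settlement<T> =
  | { status: "fulfilled"; value: T }
  | { status: "rejected"; reason: unknown };

class Vat {
  constructor(public state: Ledger) {}

  runTurn<T>(
    body: (draft: Ledger, abort: (error: unknown) => never) => T,
  ): Settlement<T> {
    const draft: Ledger = { ...this.state }; // assumption: shallow copy suffices
    const abort = (error: unknown): never => {
      throw new TurnAbort(error);
    };
    try {
      const value = body(draft, abort);
      this.state = draft; // commit
      return { status: "fulfilled", value };
    } catch (e) {
      if (e instanceof TurnAbort) {
        // Abort: drop the draft, so the vat is as it was before the turn.
        return { status: "rejected", reason: e.error };
      }
      // Assumption: an ordinary exception rejects but does not rewind.
      this.state = draft;
      return { status: "rejected", reason: e };
    }
  }
}

const vat = new Vat({ balance: 10 });

const ok = vat.runTurn((draft) => {
  draft.balance += 5;
  return draft.balance;
});

const overBudget = new Error("over budget");
const bad = vat.runTurn((draft, abort) => {
  draft.balance = 0; // this write is discarded by the abort below
  return abort(overBudget);
});
```

After both turns the vat's balance reflects only the first turn, and the second turn's result is rejected with the error passed to abort. Budget exhaustion could be reduced to the same path by having the metering machinery call abort itself.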
Does this seem like a good idea?