Greetings,
Claire is designing a new Controller component API that will allow components to directly control the lifecycle of their children, and we have run into a subtle design issue.
Basically, Controller is a two-level API. There is a Controller protocol that represents a component, and an ExecutionController nested inside it that represents a component's execution state. When the component starts, it should spawn an ExecutionController, and when it stops, the ExecutionController should close.
Two operations are relevant to this topic: the ability to stop a component, and the ability to be notified that a component stopped (including its return code, if it had one). This notification might arrive in response to a Stop(), or spontaneously if the component exited on its own.
protocol Controller {
    /// Start the component, optionally providing additional handles to be given
    /// to the component. Returns INSTANCE_ALREADY_RUNNING if the instance is
    /// currently running.
    Start(resource struct {
        args StartChildArgs;
        execution_controller server_end:ExecutionController;
    }) -> () error Error;
};

protocol ExecutionController {
    /// Stops the component. This function blocks until the stop action is
    /// complete.
    ///
    /// Note that a component may stop running on its own at any time, so it is
    /// not safe to assume that this error code will not be returned if this
    /// function has not been called yet.
    Stop() -> () error Error;

    /// When the child is stopped due to `Stop` being called, the child exiting
    /// on its own, or for any other reason, `OnStop` is called and then this
    /// channel is closed.
    -> OnStop(struct {
        stopped_payload StoppedPayload;
    });
};
Call this Option 1. Here Stop() is a "destructive" method: one of its side effects is to close the channel of the protocol it's attached to. The problem is the following. When Stop() returns, we would like the server to have closed the ExecutionController, since the close is what authoritatively signals to the client that the component has stopped. But the server cannot deliver the response after closing the channel, or atomically with closing it, so there is always some delay between the Stop() response and the close. During that window the client has been told the stop is complete while the channel still looks open.
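To make the window concrete, here is a toy simulation. It is only a sketch and uses no real FIDL bindings: `FakeChannel`, `server_stop_option1`, and `closed_when_response_arrives` are hypothetical names, and the channel is modeled as an ordered message queue plus a closed flag.

```python
from collections import deque


class FakeChannel:
    """Hypothetical stand-in for a channel: ordered messages plus a closed flag."""

    def __init__(self):
        self.messages = deque()
        self.peer_closed = False

    def write(self, msg):
        # A server cannot send anything once it has closed its end.
        if self.peer_closed:
            raise RuntimeError("write after close")
        self.messages.append(msg)

    def close(self):
        self.peer_closed = True


def server_stop_option1(chan):
    """Option 1 server; each yield marks a point where the client may run."""
    chan.write("Stop response")
    yield
    chan.write("OnStop")
    yield
    chan.close()
    yield


def closed_when_response_arrives(server_steps):
    """Let the server take `server_steps` steps, then have the client read the
    Stop() response and report whether the channel already looks closed."""
    chan = FakeChannel()
    server = server_stop_option1(chan)
    for _ in range(server_steps):
        next(server)
    assert chan.messages.popleft() == "Stop response"
    return chan.peer_closed


print(closed_when_response_arrives(1))  # False: client ran right after the response
print(closed_when_response_arrives(3))  # True: client ran after the close
```

Depending on scheduling, the same Stop() response can be observed with the channel open or closed, so the client cannot treat the response alone as the authoritative signal.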
Option 2 is to make Stop() a one-way method with no response. That way, there is no risk of giving inconsistent signals about when the stop finished. Clients have to listen for PEER_CLOSED (or the OnStop event) to wait for the stop to complete.
protocol ExecutionController {
    /// Causes this component to stop. When stop completes, OnStop will be
    /// delivered on the channel and it will be closed.
    Stop();

    /// When the child is stopped due to `Stop` being called, the child exiting
    /// on its own, or for any other reason, `OnStop` is called and then this
    /// channel is closed.
    -> OnStop(struct {
        stopped_payload StoppedPayload;
    });
};
There are some obvious drawbacks to this option, though. In general, pipelined APIs are less user-friendly. Also, if we want Stop() to return an error, it's not clear how to do that. Maybe the error could become part of the StoppedPayload, but that's a lot less friendly than it coming straight from the method.
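As a rough illustration of what Option 2 asks of the client, here is a sketch of the wait-and-inspect logic. Everything in it is an assumption made for illustration: the fields of `StoppedPayload` (including `stop_error`) are hypothetical since the real type is not shown here, and the event stream is modeled as a plain iterable that ends when the channel closes.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class StoppedPayload:
    # Hypothetical fields; the real StoppedPayload is not shown in this thread.
    return_code: Optional[int] = None
    stop_error: Optional[str] = None


def wait_for_stop(
    events: Iterable[Tuple[str, StoppedPayload]],
) -> Optional[StoppedPayload]:
    """Consume events until the channel closes (the iterable ends).

    Returns the OnStop payload, or None if the channel closed without one.
    """
    payload = None
    for name, body in events:
        if name == "OnStop":
            payload = body
    return payload


def stop_succeeded(payload: Optional[StoppedPayload]) -> bool:
    # The caller has to dig the Stop() outcome out of the event payload.
    return payload is not None and payload.stop_error is None


clean = wait_for_stop([("OnStop", StoppedPayload(return_code=0))])
failed = wait_for_stop([("OnStop", StoppedPayload(stop_error="INTERNAL"))])
print(stop_succeeded(clean), stop_succeeded(failed))  # True False
```

Under these assumptions the payload carries two unrelated things, the component's own result and the outcome of the Stop() request, and the second is meaningless when the component exited on its own.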
Finally, there's an Option 3 where we lift Stop() to the top-level Controller:
protocol Controller {
    /// Start the component, optionally providing additional handles to be given
    /// to the component. Returns INSTANCE_ALREADY_RUNNING if the instance is
    /// currently running.
    Start(resource struct {
        args StartChildArgs;
        execution_controller server_end:ExecutionController;
    }) -> () error Error;

    /// Stops the component. This function blocks until the stop action is
    /// complete.
    ///
    /// When this returns, the ExecutionController will be closed and contain
    /// the OnStop event.
    Stop() -> () error Error;
};
This would make it possible to close the ExecutionController before returning from Stop(). However, it decouples the method from the entity it's acting on.
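The ordering guarantee in Option 3 can be checked with a toy model. This is a sketch under stated assumptions, not real bindings: `FakeChannel`, `server_stop_option3`, and `response_implies_closed` are hypothetical names, and each channel is modeled as an ordered message queue plus a closed flag.

```python
from collections import deque


class FakeChannel:
    """Hypothetical stand-in for a channel: ordered messages plus a closed flag."""

    def __init__(self):
        self.messages = deque()
        self.peer_closed = False

    def write(self, msg):
        # A server cannot send anything once it has closed its end.
        if self.peer_closed:
            raise RuntimeError("write after close")
        self.messages.append(msg)

    def close(self):
        self.peer_closed = True


def server_stop_option3(controller, execution):
    """Option 3 server; each yield marks a point where the client may run."""
    execution.write("OnStop")
    yield
    execution.close()
    yield
    # The reply travels on a different channel, so it can follow the close.
    controller.write("Stop response")
    yield


def response_implies_closed():
    """Check, at every point the client could run, that a visible Stop()
    response means the ExecutionController is already closed."""
    controller, execution = FakeChannel(), FakeChannel()
    for _ in server_stop_option3(controller, execution):
        if "Stop response" in controller.messages and not execution.peer_closed:
            return False
    return True


print(response_implies_closed())  # True
```

Because the response and the close live on separate channels, the server is free to order them, which is exactly what a single channel does not allow.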
Is there guidance for which pattern we should prefer? Or prior art we can refer to?
P.S.
@Yifei Teng and I were discussing another hypothetical option, if terminal events were available, which would let us have our cake and eat it too:
protocol ExecutionController {
    /// Stops the component. This function blocks until the stop action is
    /// complete.
    ///
    /// Atomically with sending the response, the server will put this channel
    /// into the terminal state, containing the OnStop terminal event.
    @terminal Stop() -> (StoppedPayload) error Error;

    @terminal -> OnStop(StoppedPayload);
};
The idea here is that the server would have a way to put the channel into the "terminal state", with the terminal event, simultaneously with the Stop response. This would give us everything we want, and there is no race between delivering the response and closing the channel.