So, I have some Cloud Haskell code that looks a bit like this:
    import Control.Monad.Catch (bracket, etc....)

    someInterestingCode = do
      setProcessState s'
      bracket (processApply msg) (const . dequeue) return
My question is: at which points can an asynchronous exception interrupt this code? Now, setProcessState is masked (but not uninterruptible), like so:
    modifyState f =
      ST.get >>= \s -> liftIO $ mask_ $ do
        atomicModifyIORef' s $ \s' -> (f s', ())
And dequeue works similarly... I do not care if we setProcessState and then crash, as there are no guarantees made to library users that such a thing won't happen. And I don't care if we crash before setProcessState either.
Now obviously that processApply call can be interrupted by an async exception, and that's fine. The library documentation explains to users that their message handling functions can fail in that way. The purpose of this bit of code is to only remove the message (from an internal priority queue held in the process' private state) if the handler succeeds. However, I want to understand if an async exception can arrive between executing the first and second arguments to bracket...
If we can successfully evaluate (processApply msg) but then crash, we might still recover from the exception thanks to code situated elsewhere, because the library allows users to supply handlers for async exceptions (in a configuration-oriented fashion, rather than inline in the executable code). But (!) if we don't dequeue the message then we'll see it again next time we hit this code. Because this code executes in a "server loop" as it were, that next time will be immediately! I will need to document this, so that users understand that their handler code needs to be idempotent in the face of duplicate requests/messages.

Worse still, there is a possibility that the user's handler might have interacted with another (green) thread (i.e. Cloud Haskell process) and the interaction, due to a bug or erroneous data, will always trigger an exception back to the client. In that edge case, we will have created an infinite loop: running code that leads to an async exception, catching and swallowing it, but failing to update our internal state, and therefore constantly trying to process the same input again.

Worse even than that, this situation is highly susceptible to races (in the timing between the async exception arriving and the second argument to bracket being evaluated, which depends on innumerable factors external to the code and even to the RTS), such that the invalid reprocessing behaviour will be opaque to the client portion of the server code. Hmph - not nice at all.
So am I right in assuming that this edge case can crop up? And if so, should I simply document this, or is there a way around it that does not involve making the whole thing uninterruptible (or masking at all tbh)? Because processApply calls client code which I know nothing about, and which runs in the Process monad (on top of IO), I cannot arbitrarily mask the whole thing, and even `mask (bracket (restore $ pApply msg) (const . dequeue) return)` doesn't seem as though it will do the trick... Would that really guarantee that dequeue runs if processApply succeeds?
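To make that last variant concrete, here is a minimal sketch of what I have in mind, written in plain IO rather than the Process monad. The names `applyThenDequeue` and `dequeueHead` and the list-in-an-`IORef` queue are hypothetical stand-ins, not the library's real code. As far as I understand, once `restore` returns we are masked again, so an async exception could then only be delivered at an interruptible (blocking) operation, and `atomicModifyIORef'` does not block. Whether that reasoning carries over to the real `dequeue` is exactly what I'm unsure about.

```haskell
import Control.Exception (IOException, mask, throwIO, try)
import Data.IORef (IORef, atomicModifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for the internal queue: drop the head element.
-- atomicModifyIORef' does not block, so it is not an interruptible operation.
dequeueHead :: IORef [a] -> IO ()
dequeueHead ref = atomicModifyIORef' ref $ \q -> (drop 1 q, ())

-- Run the handler on the head of the queue with async exceptions unmasked,
-- and remove the message only if the handler returns normally.
applyThenDequeue :: IORef [a] -> (a -> IO b) -> IO (Maybe b)
applyThenDequeue ref handler = mask $ \restore -> do
  q <- readIORef ref
  case q of
    []      -> return Nothing
    (m : _) -> do
      r <- restore (handler m) -- async exceptions may be delivered in here
      dequeueHead ref          -- masked again at this point
      return (Just r)

main :: IO ()
main = do
  ref <- newIORef [1, 2, 3 :: Int]
  -- Successful handler: the head is consumed.
  r <- applyThenDequeue ref (return . (* 10))
  rest <- readIORef ref
  print (r, rest)
  -- Failing handler: the exception propagates and the queue is left untouched.
  e <- try (applyThenDequeue ref (\_ -> throwIO (userError "boom")))
         :: IO (Either IOException (Maybe ()))
  putStrLn (either (const "handler failed") (const "handler ok") e)
  print =<< readIORef ref
```

The failing case in `main` uses a synchronous exception purely to show the "message stays queued" path; an async exception arriving inside `restore` would take the same route.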
Thanks in advance for any pointers.
Relevant code for those interested in background reading: