Pipes/Producers vs IO

Martyn J Pearce

Apr 7, 2017, 8:14:05 AM
Dear Pipers (Pipe Smokers?),

Please accept my apologies for a lengthy example for what is undoubtedly
a simple problem, but I'm struggling to know how best to simplify.

I'm writing a little command-execution package. It's primarily for my
own learning, so don't worry about "couldn't I use package X...".

I have a 'systemx' function, that creates a command that will be run
with stdout/stderr inherited from the process:

systemx :: (MonadError ExecError μ, MonadReader (ProcExecCtxt3 μ) ρ)
        => CmdSpec -> ρ (Producer [CmdSpec] μ ())

A CmdSpec is a command specification - that is, an executable name &
some arguments.
ProcExecCtxt3 holds an 'executor' - a function to 'execute' the command;
in normal circumstances, this uses System.Process et al, but it's
replaceable so that I could mock the command. Hence the free typevar μ,
which is MonadIO in normal circumstances but could be a non-IO monad for
mock purposes. The ExecError is thrown if the command returns non-zero.
The producer is to "log" the commands run - when mocking, I can use a
MonadWriter to collect the list, and use that for testing.
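
To make the mocking idea concrete, here is a minimal sketch, not the actual package code. `Cmd`, `logAndRun`, `mockExec` and `collectCmds` are hypothetical names, and `Cmd` is simplified to a plain `String` rather than the real CmdSpec. The mock executor runs nothing, and the yielded command lists are gathered in a `Writer`:

```haskell
import Control.Monad.Writer (Writer, execWriter, tell)
import Pipes (Producer, for, lift, runEffect, yield)

-- Simplified stand-in for the real CmdSpec (assumption).
type Cmd = String

-- Hypothetical helper: "log" the command by yielding it, then run it
-- with whatever executor is supplied.
logAndRun :: Monad m => (Cmd -> m ()) -> Cmd -> Producer [Cmd] m ()
logAndRun exec cmd = do
  yield [cmd]
  lift (exec cmd)

-- Mock executor: runs nothing at all.
mockExec :: Cmd -> Writer [Cmd] ()
mockExec _ = return ()

-- Drain the producer, appending every yielded list to the Writer log.
collectCmds :: Producer [Cmd] (Writer [Cmd]) () -> [Cmd]
collectCmds p = execWriter (runEffect (for p (lift . tell)))

main :: IO ()
main = print (collectCmds (logAndRun mockExec "grep root /etc/passwd"
                           >> logAndRun mockExec "grep foo /etc/motd"))
```

Because the executor is an ordinary argument here, a test can compare the collected list against the commands it expects without touching System.Process.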

Now, this all works fine in simple form, where runCtxt3 is mechanics for
executing via System.Process:

cmds1 :: (MonadError ExecError μ, MonadReader (ProcExecCtxt3 μ) ρ)
      => ρ (Producer [CmdSpec] μ ())
cmds1 = systemx (CmdSpec grep ["root", "/etc/passwd"])

cmds1' :: (MonadError ExecError μ, MonadReader (ProcExecCtxt3 μ) ρ)
       => ρ (Producer [CmdSpec] μ ())
cmds1' = systemx (CmdSpec grep ["foo", "/etc/motd"])

λ> :t runCtxt3
runCtxt3 :: MonadIO μ => ProcExecCtxt3 μ

λ> runExceptT $ runEffect $ for (runReader (cmds1') runCtxt3) (mapM_ warnCmd)
CMD: /bin/grep foo /etc/motd
/bin/grep: /etc/motd: No such file or directory
Left (ExecError (ExitVal 2))

λ> runExceptT $ runEffect $ for (runReader cmds1 runCtxt3) (mapM_ warnCmd)
CMD: /bin/grep root /etc/passwd
Right ()

BUT! If I attempt to execute two commands, only the last one gets run:

cmds2 :: (MonadError ExecError μ, MonadReader (ProcExecCtxt3 μ) ρ)
      => ρ (Producer [CmdSpec] μ ())
cmds2 = do
  systemx grepIt
  systemx grepIt2

This is baffling me. I would suspect laziness, but given that we're
clearly running within MonadIO, that doesn't apply, right?

I'm sure I'm being a bit dim, but any pointers would be gratefully received.


Martyn J Pearce

Apr 7, 2017, 11:20:10 AM
I have a slightly simpler version (and evidence that I am trying to
solve this myself, or at least, am stumbling around in the dark) that
stands alone (with mtl, pipes & process).
Still, somewhere, I am losing the first grep.
The second example shows that if I lose the MonadReader, all is well.
So I considered re-ordering the monad evaluation, as you will see - but
to no avail.

{-# LANGUAGE FlexibleContexts #-}

-- base --------------------------------
import Control.Monad ( Monad, (>>), (>>=), mapM_, return )
import Data.Function ( (.), ($) )
import Data.String ( String )
import System.IO ( IO, putStrLn )

-- mtl ---------------------------------
import Control.Monad.Reader ( MonadReader, ask, runReader )

-- pipes -------------------------------
import Pipes ( Producer, for, lift, runEffect, yield )

-- process -----------------------------
import System.Process ( callCommand )


type CmdSpec = String

-- | "log" cmd, then "run" cmd with exec
sys :: Monad m => CmdSpec -> (CmdSpec -> m b) -> Producer [CmdSpec] m b
sys cmd exec = do
  yield [cmd]
  lift $ exec cmd

-- | log & run cmd, in the context of an executor
sys' :: (MonadReader (CmdSpec -> m b) n, Monad m) =>
        CmdSpec -> n (Producer [CmdSpec] m b)
sys' c = ask >>= return . sys c

g1 :: CmdSpec
g1 = "/bin/grep root /etc/passwd"

g2 :: CmdSpec
g2 = "/bin/grep Ubuntu /etc/lsb-release"

-- what happens to the first grep?
main :: IO ()
main = do
  runEffect $ for (runReader (sys' g1 >> sys' g2) callCommand)
                  (lift . mapM_ putStrLn)

  putStrLn "----"

  runEffect $ for (sys g1 callCommand >> sys g2 callCommand)
                  (lift . mapM_ putStrLn)

  putStrLn "----"

  runEffect $ runReader (fmap ( \ f -> for f (lift . mapM_ putStrLn))
                              (sys' g1 >> sys' g2))
                        callCommand

  putStrLn "----"

  runReader (runEffect <$> fmap ( \ f -> for f (lift . mapM_ putStrLn))
                                (sys' g1 >> sys' g2))
            callCommand

Gabriel Gonzalez

Apr 8, 2017, 12:28:03 AM
When in doubt, you can use equational reasoning to figure out what is going on.

In this case, let's use the first example, which is:

    runEffect $ for (runReader (sys' g1 >> sys' g2) callCommand)
                    (lift . mapM_ putStrLn)

If we inline the definition of `sys'` we get:

    runEffect $ for (runReader ((ask >>= return . sys g1) >> (ask >>= return . sys g2)) callCommand)
                    (lift . mapM_ putStrLn)

Now let's focus on this subexpression:

    (ask >>= return . sys g1) >> (ask >>= return . sys g2)

... which is equivalent to:

    (ask >>= return . sys g1) >>= \_ -> (ask >>= return . sys g2)

The associativity monad law says that:

    (m >>= f) >>= g = m >>= \x -> (f x >>= g)

... where in this case:

    m = ask
    f = return . sys g1
    g = \_ -> (ask >>= return . sys g2)

... which means that we can transform it to:

    ask >>= \x -> ((return . sys g1) x >>= \_ -> (ask >>= return . sys g2))

... and according to the definition of function composition that is the same as:

    ask >>= \x -> (return (sys g1 x) >>= \_ -> (ask >>= return . sys g2))

Now we can use the left identity monad law which states that:

    return a >>= f = f a

... where in this case:

    a = sys g1 x
    f = \_ -> (ask >>= return . sys g2)

... which means that we can transform it into:

    ask >>= \x -> (\_ -> (ask >>= return . sys g2)) (sys g1 x)

... which further simplifies to:

    ask >>= \x -> ask >>= return . sys g2

... which further simplifies to:

    ask >> ask >>= return . sys g2

... which shows that the `g1` is completely discarded.

Notice that this has nothing to do with `pipes`.  This is confined entirely to the `Reader`-related code.
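
The same behaviour can be reproduced with plain `Reader` and no `pipes` at all. In the sketch below, `describe` is a hypothetical stand-in for `sys'` that returns a `String` instead of a `Producer`, and `both` and `simplified` are illustrative names for the two ends of the derivation above:

```haskell
import Control.Monad.Reader (Reader, ask, runReader)

-- Hypothetical stand-in for sys': read the environment and *return* a
-- description of what would be run, without running anything.
describe :: String -> Reader String String
describe cmd = ask >>= return . (\exec -> exec ++ " " ++ cmd)

-- Mirrors `sys' g1 >> sys' g2`: the value returned by the first
-- computation is thrown away by (>>).
both :: Reader String String
both = describe "g1" >> describe "g2"

-- Mirrors the simplified form `ask >> ask >>= return . sys g2`.
simplified :: Reader String String
simplified = ask >> ask >>= return . (\exec -> exec ++ " g2")

main :: IO ()
main = do
  putStrLn (runReader both "run")        -- prints "run g2"
  putStrLn (runReader simplified "run")  -- prints "run g2"
```

Neither result mentions `g1`, which matches the conclusion of the equational reasoning.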

It might be more clear if we use do notation.  The original Reader expression in do notation would be:

    do _ <- do x <- ask
               return (sys g1 x)
       y <- ask
       return (sys g2 y)

... and the monad laws say that it is equivalent to:

    do x <- ask
       _ <- return (sys g1 x)
       y <- ask
       return (sys g2 y)

The `sys g1 x` that you are `return`ing is being discarded by the empty assignment: `_ <-`.

What I think you probably meant to do was to run the `sys g1 x` command instead of `return`ing it as an inert value and then discarding it.  In other words, I think you wanted something like this:

    do x <- ask
       _ <- lift (sys g1 x)
       y <- ask
       lift (sys g2 y)

... or this:

    do x <- lift ask
       _ <- sys g1 x
       y <- lift ask
       sys g2 y

... depending on which order you nest your monad transformers.

Martyn J Pearce

Apr 9, 2017, 10:25:52 AM
Thanks very much for the pointers, Gabriel.  I think I'm following it, and the nub of the matter seems to be that within sys' I am 'return'ing the IO action rather than 'lift'ing it.

However, I am struggling to get either proposed solution (or any near variant that I can see) to work.

  λ> :t \c -> lift ask >>= sys c
  \c -> lift ask >>= sys c :: MonadReader (CmdSpec -> m b) m => CmdSpec -> Pipes.Internal.Proxy Pipes.Internal.X () () [CmdSpec] m b

In this formulation, the Monad m is both the reader monad and the result of the function read.  This makes it impossible to instantiate (I think) because of the recursive type.
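
One possible way around the recursive type, sketched here under the assumption that the executor runs in the monad beneath the reader, is to fix a concrete stack with `transformers` and lift the executor's result twice. `Exec`, `sysP`, `logCmds`, `mockExec` and `events` are hypothetical names, and the mock executor writes to a `Writer` so the ordering of events is visible:

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)
import Control.Monad.Trans.Writer (Writer, execWriter, tell)
import Pipes (Producer, for, runEffect, yield)

type CmdSpec = String

-- The executor runs in m, the monad *beneath* the reader, so the
-- environment type never mentions the reader itself.
type Exec m = CmdSpec -> m ()

-- Producer layered over ReaderT: fetch the executor, log, then run.
sysP :: Monad m => CmdSpec -> Producer [CmdSpec] (ReaderT (Exec m) m) ()
sysP cmd = do
  exec <- lift ask        -- ask in ReaderT, lifted into the Producer
  yield [cmd]             -- "log" the command
  lift (lift (exec cmd))  -- m -> ReaderT -> Producer

-- Run two commands against a mock executor, recording what happens.
events :: [String]
events = execWriter (runReaderT (runEffect (for (sysP "g1" >> sysP "g2") logCmds)) mockExec)
  where
    logCmds cs   = lift (lift (tell (map ("log: " ++) cs)))
    mockExec cmd = tell ["run: " ++ cmd]

main :: IO ()
main = mapM_ putStrLn events
```

With a real executor the same shape would presumably be run as `runReaderT (runEffect (for (sysP g1 >> sysP g2) (lift . lift . mapM_ putStrLn))) callCommand`.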

  λ> :t \c -> ask >>= lift . sys c
  \c -> ask >>= lift . sys c
    :: (MonadReader (CmdSpec -> m b) (t (Pipes.Internal.Proxy Pipes.Internal.X () () [CmdSpec] m)), MonadTrans t, Monad m) =>
       CmdSpec -> t (Pipes.Internal.Proxy Pipes.Internal.X () () [CmdSpec] m) b

That Just Looks Wrong(TM), and still has m both within and without the reader.

I feel that the 'solution' is very near by, but I'm not seeing it (despite experimentation).

I appreciate that this isn't really a Pipes problem.  If there's a better place I should ask this, that's cool.  Of course, if you (or anybody else here) are able/willing to point me in the right direction, that would be appreciated.



Martyn J Pearce

Apr 9, 2017, 11:25:43 AM
to, Nicolas Dudebout

I'm pleased to say that I spoke too soon.  The second stanza does work, though I now need to work out why.

  sys'' :: (MonadReader (CmdSpec -> m b) (t (Producer [CmdSpec] m)),
            MonadTrans t, Monad m) =>
           CmdSpec -> t (Producer [CmdSpec] m) b
  sys'' c = ask >>= lift . sys c

  runEffect (for (runReaderT (sys'' g2 >> sys'' g1) callCommand)
                 (lift . mapM_ putStrLn))

I think I need to think carefully about how MonadTrans works, and maybe even try a little equational reasoning to get my head around that one.
But, I have something that works, which obviously means that I can work from there.  And for that I am grateful.

Many thanks,

Gabriel Gonzalez

Apr 10, 2017, 12:34:16 PM
to, Nicolas Dudebout
One tip I would give is to not use `mtl` type classes when trying to reason about the code's behavior.  Just use the `transformers` library directly and specify a concrete monad transformer stack.  That will make it easier to understand what is going on.
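
As a rough illustration of that tip, here is a sketch of `sys''` with the transformer fixed to `ReaderT` from `transformers`. `sysR`, `mockExec`, `logCmds` and `events` are hypothetical names, and the mock executor in `Writer` is only there to make the result observable without running real commands:

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.Reader (ReaderT, ask, runReaderT)
import Control.Monad.Trans.Writer (Writer, execWriter, tell)
import Pipes (Producer, for, runEffect, yield)

type CmdSpec = String

-- As in the thread: "log" cmd, then "run" cmd with exec.
sys :: Monad m => CmdSpec -> (CmdSpec -> m b) -> Producer [CmdSpec] m b
sys cmd exec = do
  yield [cmd]
  lift (exec cmd)

-- sys'' with a concrete stack: ReaderT on top of Producer. The Producer
-- is lifted into the reader, so (>>) sequences it instead of dropping it.
sysR :: Monad m => CmdSpec -> ReaderT (CmdSpec -> m b) (Producer [CmdSpec] m) b
sysR cmd = do
  exec <- ask
  lift (sys cmd exec)

-- Run two commands against a mock executor, recording what happens.
events :: [String]
events = execWriter (runEffect (for (runReaderT (sysR "g1" >> sysR "g2") mockExec) logCmds))
  where
    mockExec cmd = tell ["run: " ++ cmd]
    logCmds cs   = lift (tell (map ("log: " ++) cs))

main :: IO ()
main = mapM_ putStrLn events
```

With the stack spelled out, each `lift` has exactly one possible meaning, which makes it easier to see that both commands are run.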
