RandomIO


GS

Nov 23, 2014, 1:10:07 PM
to haskel...@googlegroups.com
Hi,

my application needs to parse a binary file and will have to do random
I/O within the file, i.e. jump all over the place.

I was encouraged by the paper from Oleg Kiselyov and am evaluating the
available libraries: iteratee, enumerator, conduit and pipes. The latter
is currently leading the list, but I'd like to know upfront whether I
can do random I/O with it.

GS

Gabriel Gonzalez

Nov 23, 2014, 6:52:34 PM
to haskel...@googlegroups.com, g...@kmmd.de
Yes, you can do this using `Pipes.Core`. The `request` function from
`Pipes.Core` generalizes `await` by letting you parametrize each request
for input with an argument. You can use this argument to seek within
the `Handle`.

To illustrate this, I'll begin from `hGet` and `hSeek`, which have the
following types:

-- Read n bytes from the given Handle
hGet :: Handle -> Int -> IO ByteString

-- Seek to the given absolute offset (simplified; the real
-- System.IO.hSeek also takes a SeekMode and an Integer offset)
hSeek :: Handle -> Int -> IO ()

We can wrap these into a composite function that optionally seeks and
then reads the requested number of bytes:

hGetSeek :: Handle -> (Maybe Int, Int) -> IO ByteString
hGetSeek handle (mSeek, size) = do
    case mSeek of
        Nothing   -> return ()
        Just seek -> hSeek handle seek
    hGet handle size

Now, I can build a pipe where each `request` decides whether to seek and
the number of bytes to read:

example :: Client (Maybe Int, Int) ByteString IO ()
example = do
    -- `request` is like `await`, except you can give it an argument
    bs1 <- request (Nothing, 4)
    lift (print bs1)
    bs2 <- request (Just 12, 4)
    lift (print bs2)

I can then satisfy those `request`s using `hGetSeek` by using the
`(>\\)` operator:

-- The parentheses are optional, but I include them here for clarity
(lift . hGetSeek handle) >\\ example :: Effect IO ()

... and then run that, using `runEffect`:

runEffect (lift . hGetSeek handle >\\ example) :: IO ()

What does that do? Well, `f >\\ p` just substitutes every `request` in
`p` with `f`, so it's equal to:

(lift . hGetSeek handle) >\\ example

= do
    bs1 <- (lift . hGetSeek handle) (Nothing, 4)
    lift (print bs1)
    bs2 <- (lift . hGetSeek handle) (Just 12, 4)
    lift (print bs2)

= do
    bs1 <- lift (hGetSeek handle (Nothing, 4))
    lift (print bs1)
    bs2 <- lift (hGetSeek handle (Just 12, 4))
    lift (print bs2)

... and all that `runEffect` does is remove the `lift`s:

runEffect (lift . hGetSeek handle >\\ example)

= do
    bs1 <- hGetSeek handle (Nothing, 4)
    print bs1
    bs2 <- hGetSeek handle (Just 12, 4)
    print bs2
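
For reference, here is a minimal self-contained sketch of the pieces so far, assuming the `pipes` and `bytestring` packages. Unlike the simplified signature above, it calls the real `System.IO.hSeek`, which takes a `SeekMode` and an `Integer` offset. The file name `random-io-demo.bin` and its contents are made up for the demo.

```haskell
import qualified Data.ByteString as BS
import           Pipes (lift, runEffect)
import           Pipes.Core (Client, request, (>\\))
import           System.IO (Handle, IOMode (ReadMode), SeekMode (AbsoluteSeek),
                            hSeek, withBinaryFile)

-- Optionally seek to an absolute offset, then read `size` bytes
hGetSeek :: Handle -> (Maybe Int, Int) -> IO BS.ByteString
hGetSeek handle (mSeek, size) = do
    case mSeek of
        Nothing     -> return ()
        Just offset -> hSeek handle AbsoluteSeek (fromIntegral offset)
    BS.hGet handle size

-- Same client as above: read 4 bytes, then jump to offset 12 and read 4 more
example :: Client (Maybe Int, Int) BS.ByteString IO ()
example = do
    bs1 <- request (Nothing, 4)
    lift (print bs1)
    bs2 <- request (Just 12, 4)
    lift (print bs2)

main :: IO ()
main = do
    -- Hypothetical demo file holding the bytes 0..31
    let path = "random-io-demo.bin"
    BS.writeFile path (BS.pack [0 .. 31])
    withBinaryFile path ReadMode $ \handle ->
        runEffect (lift . hGetSeek handle >\\ example)
```

Running it should print the bytes at offsets 0 to 3, followed by the bytes at offsets 12 to 15.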

Since we've decoupled the read action from the `example` function, we
can easily inject a mock reader if we want to:

runEffect ((\_ -> return "") >\\ example)

That injects a bogus handler that always returns the empty bytestring
(the `""` literal needs the `OverloadedStrings` extension to be read as
a `ByteString`). We can then reason that this will be equivalent to:

= runEffect (do
    bs1 <- (\_ -> return "") (Nothing, 4)
    lift (print bs1)
    bs2 <- (\_ -> return "") (Just 12, 4)
    lift (print bs2))

= runEffect (do
    bs1 <- return ""
    lift (print bs1)
    bs2 <- return ""
    lift (print bs2))

= runEffect (do
    lift (print "")
    lift (print ""))

= do
    print ""
    print ""
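
A slightly richer mock than the constant handler can serve the requests from an in-memory `ByteString`, which is handy for testing parsing logic without touching the file system. The sketch below is only one way to do that: `memGetSeek` is a hypothetical helper (not part of `pipes`) that keeps the current position in an `IORef`.

```haskell
import qualified Data.ByteString as BS
import           Data.IORef (IORef, newIORef, readIORef, writeIORef)
import           Pipes (lift, runEffect)
import           Pipes.Core (Client, request, (>\\))

-- Hypothetical stand-in for hGetSeek: `contents` plays the role of the file
-- and `posRef` plays the role of the Handle's current position
memGetSeek :: BS.ByteString -> IORef Int -> (Maybe Int, Int) -> IO BS.ByteString
memGetSeek contents posRef (mSeek, size) = do
    -- Use the requested offset if there is one, otherwise the stored position
    pos <- maybe (readIORef posRef) return mSeek
    let chunk = BS.take size (BS.drop pos contents)
    -- Advance the position past the bytes just read
    writeIORef posRef (pos + BS.length chunk)
    return chunk

-- Same client as above, unchanged
example :: Client (Maybe Int, Int) BS.ByteString IO ()
example = do
    bs1 <- request (Nothing, 4)
    lift (print bs1)
    bs2 <- request (Just 12, 4)
    lift (print bs2)

main :: IO ()
main = do
    posRef <- newIORef 0
    runEffect (lift . memGetSeek (BS.pack [0 .. 31]) posRef >\\ example)
```

The client code is identical in both cases; only the function on the left of `(>\\)` changes.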

In the context of the above examples, the `(>\\)` operator had this
shape (the real type is much more general):

(>\\) :: (a -> Effect IO b) -> Client a b IO r -> Effect IO r

-- where:
-- a = (Maybe Int, Int)
-- b = ByteString
-- r = ()

In other words, `lift . hGetSeek handle` was a function that returns an
`Effect`, and `example` was a `Client`. A `Client a b m r` is like a
`Consumer b m r` except that it sends out arguments of type `a` and
receives back results of type `b`. In fact, `Consumer` is just a special
case of a `Client`, where `a = ()`:

type Consumer = Client ()

... and `await` is just `request` with `()` as the argument:

await = request ()
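
Because `await` is `request ()`, the same `(>\\)` trick also works on an ordinary `Consumer`: the handler simply receives `()` as its argument. A small sketch, where `sumTwo` is a made-up consumer used only for illustration:

```haskell
import Pipes (Consumer, await, runEffect)
import Pipes.Core ((>\\))

-- Hypothetical consumer: awaits two Ints and returns their sum
sumTwo :: Consumer Int IO Int
sumTwo = do
    x <- await
    y <- await
    return (x + y)

main :: IO ()
main = do
    -- Every `await` is answered by the handler, which ignores its () argument
    total <- runEffect ((\() -> return 21) >\\ sumTwo)
    print total  -- prints 42
```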

Note that `(>\\)` is not the only way to satisfy the upstream interface
of `example`, but it sounds like it's good enough for your specific use
case. If you're interested in more advanced features, just let me know.
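
As one illustration of an alternative, `Pipes.Core` also provides `respond` and `(+>>)`, which let you write the upstream side as a `Server` loop instead of a plain function. The sketch below assumes the `pipes` and `bytestring` packages; `seekServer` and the demo file name are hypothetical names chosen for this example.

```haskell
import qualified Data.ByteString as BS
import           Pipes (lift, runEffect)
import           Pipes.Core (Client, Server, request, respond, (+>>))
import           System.IO (Handle, IOMode (ReadMode), SeekMode (AbsoluteSeek),
                            hSeek, withBinaryFile)

-- Optionally seek to an absolute offset, then read `size` bytes
hGetSeek :: Handle -> (Maybe Int, Int) -> IO BS.ByteString
hGetSeek handle (mSeek, size) = do
    case mSeek of
        Nothing     -> return ()
        Just offset -> hSeek handle AbsoluteSeek (fromIntegral offset)
    BS.hGet handle size

-- Hypothetical server loop: answer a request, then wait for the next one
seekServer :: Handle -> (Maybe Int, Int) -> Server (Maybe Int, Int) BS.ByteString IO r
seekServer handle = go
  where
    go req = do
        bs   <- lift (hGetSeek handle req)
        next <- respond bs   -- send the bytes down, receive the next request
        go next

-- Same client as above, unchanged
example :: Client (Maybe Int, Int) BS.ByteString IO ()
example = do
    bs1 <- request (Nothing, 4)
    lift (print bs1)
    bs2 <- request (Just 12, 4)
    lift (print bs2)

main :: IO ()
main = do
    let path = "random-io-demo.bin"
    BS.writeFile path (BS.pack [0 .. 31])
    withBinaryFile path ReadMode $ \handle ->
        runEffect (seekServer handle +>> example)
```

The practical difference is that a `Server` loop can carry its own state from one request to the next, for example a cache of blocks it has already read.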