Forwarded conversation
Subject: Another Pipe datatype
------------------------
From: Michael Snoyman <mic...@snoyman.com>
Date: Sat, Jun 2, 2012 at 11:11 PM
To: Chris Smith <cds...@gmail.com>, Gabriel Gonzalez <Gabri...@gmail.com>, Paolo Capriotti <p.cap...@gmail.com>
Hi guys,
Two ideas that have come up recently have been sitting in the back of
my mind for the past few days:
* Chris's idea of a fifth type parameter for upstream result.
* Gabriel's recommendation that conduit do away with the second field in PipeM.
Putting this together with conduit, I came up with the following:
https://github.com/snoyberg/conduit/blob/7c167419b14daa69f84c0228eda7afdc9558ae27/conduit/Data/Conduit/Internal.hs#L62
    data Pipe i o u m r =
        HaveOutput (Pipe i o u m r) (m ()) o
      | NeedInput (i -> Pipe i o u m r) (u -> Pipe Void o () m r)
      | Done r
      | PipeM (m (Pipe i o u m r))
      | Leftover (Pipe i o u m r) i
I can go into more details about intuition here if you guys are
interested, but I have a feeling I'd just be boring you with details
you can guess on your own. (If you want more elaboration, just let me
know, I'll be happy to write up some more tomorrow.) I haven't checked
either the Monad or Category laws (yet). I'm fairly certain the Monad
laws hold, and, barring the issue of Leftover (which needs a bit more
research), I think the Category laws likely hold too.
One obvious (minor) point of contention is whether to use Void or ()
for the first parameter of the second field in NeedInput. I have a
feeling that () is actually more correct, which if I'm not mistaken is
what both pipes and pipes-core are already using.
Anyway, if you guys have any thoughts on this, let me know.
Michael
----------
From: Gabriel Gonzalez <gabri...@gmail.com>
Date: Sat, Jun 2, 2012 at 11:39 PM
To: Michael Snoyman <mic...@snoyman.com>
Cc: Chris Smith <cds...@gmail.com>, Paolo Capriotti <p.cap...@gmail.com>
This type only differs from Frames in three ways:
- Left-over input
- Fifth type parameter
- Non-obligatory monad
I can't comment on left-over input yet because I haven't had time to
address parsing, and everybody here already knows the advantages and
disadvantages of making the monad bind optional, so I won't go over
that.
The fifth-type parameter is already something I came up with as part
of the comonoid half of the frame implementation. I deliberately
left it out (i.e. by using Nothing) because I was specializing it to
finalization and didn't want to complicate the Pipe type constructor
(yet). You can use this generalized parameter to communicate any
kind of exception and it doesn't have to be tied to termination at
all. I typically use 'e' to denote this type parameter (for
exception), and Paolo's implementation is an example where 'e' would
be SomeException, although it's still not quite the way I would have
implemented it.
The way my generalized frames would use this 'e' parameter is that
they have an additional constructor (perhaps named "Throw")
dedicated just to throwing 'e' and then the Await constructor looks
just like the one you have, with an input of (Either e a). If the
pipe that received the 'e' value awaits a second time after
receiving the 'e' or terminates, that pipe gets temporarily
suspended and the pipe rethrows the "e" downstream first, giving
downstream a chance to handle it before it continues. The exception
is the most downstream pipe, which will not automatically rethrow
the exception if it awaits again or terminates. On the contrary, if
none of the downstream pipes terminated, this would then cause the
chain of suspended awaits to collapse back to the point where the
exception was thrown, allowing it to resume where it left off as if
nothing happened (i.e. the exception was completely handled).
However, if one of them terminates, then this terminates the entire
pipe (and, of course, you can have termination itself throw a
distinct exception of its own too, giving pipes downstream a chance
to handle that).
So the simple semantic explanation of the behavior is that if a pipe
throws an exception, every pipe downstream of it gets a chance to
handle it, and then if they all await again control returns back to
the pipe that threw the exception.
There is actually an even more general solution with a 6th type
parameter that replaces the (m ()) field of HaveOutput with a
generalized monoid, in case you are curious.
So I think the 5th type parameter is on the right track and it
mirrors what I encountered when working with frames. However,
implementing composition correctly with this is not trivial,
especially since I think you still have the specific case of just
termination exceptions implemented incorrectly.
----------
From: Michael Snoyman <mic...@snoyman.com>
Date: Sun, Jun 3, 2012 at 7:30 AM
To: Gabriel Gonzalez <gabri...@gmail.com>
Cc: Chris Smith <cds...@gmail.com>, Paolo Capriotti <p.cap...@gmail.com>
I think we're looking at the purpose of this fifth parameter
completely differently. I'm basing it off of Chris's idea discussed
here:
http://www.reddit.com/r/haskell/comments/uav9d/pipes_20_vs_pipescore/c4u4uz5
You're describing a type of parameter that would affect the entire
remainder of the pipeline. What I've implemented (and what I think
Chris meant) would just affect the next pipe downstream. The
intuition I would use to describe this is that whenever a pipe
produces a stream of values, it produces a stream of output values,
followed by a single result value to "seal" the stream. Previously in
conduit, only the most downstream pipe had the option of returning a
meaningful result value; all other pipes implicitly returned unit. The
modification to conduit now allows upstream pipes to have arbitrary
return types.
The reason I was able to make the modifications to finalization was by
reassessing how closing a pipe would work, based on my experience with
implementing this upstream result type. Essentially, a pipe can be in
a few states:
1. Not yet run
2. Done
3. Awaiting input from upstream
4. Yielding output downstream
The change in approach is realizing that, if yielding downstream fails
(because downstream already shut down), there's no need to send a
final result value downstream. Instead, all we need to do is finalize.
That means that yield can auto-terminate, and finalization only needs
to occur during yield. Also, finalization no longer has to return a
result value.
It might be easier to look at the simplified `pipe` function which
doesn't take resume into account:
    pipe :: Monad m => Pipe a b r0 m r1 -> Pipe b c r1 m r2 -> Pipe a c r0 m r2
    pipe =
        pipe' (return ())
      where
        pipe' :: Monad m => m () -> Pipe a b r0 m r1 -> Pipe b c r1 m r2
              -> Pipe a c r0 m r2
        pipe' final left right =
            case right of
                Done r2 -> PipeM (final >> return (Done r2))
                HaveOutput p c o -> HaveOutput (pipe' final left p) c o
                PipeM mp -> PipeM (liftM (pipe' final left) mp)
                Leftover p i -> pipe' final (HaveOutput left final i) p
                NeedInput rp rc ->
                    case left of
                        Done r1 -> noInput () (rc r1)
                        HaveOutput left' final' o -> pipe' final' left' (rp o)
                        PipeM mp -> PipeM (liftM (\left' -> pipe' final
                            left' right) mp)
                        Leftover left' i -> Leftover (pipe' final left' right) i
                        NeedInput lp lc -> NeedInput
                            (\a -> pipe' final (lp a) right)
                            (\r0 -> pipe' final (lc r0) right)
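
To make the "result value seals the stream" intuition concrete, here is a minimal sketch that wraps the simplified `pipe` above in a loadable module. The `noInput` helper is not shown in this thread, so the definition below is only a guess at its behavior (hand the given upstream result to any remaining `NeedInput` and drop leftovers). `source`, `sumSink`, and this `runPipe` are hypothetical names introduced just for the illustration.

```haskell
import Control.Monad (liftM)
import Data.Functor.Identity (runIdentity)
import Data.Void (Void, absurd)

data Pipe i o u m r
  = HaveOutput (Pipe i o u m r) (m ()) o
  | NeedInput (i -> Pipe i o u m r) (u -> Pipe Void o () m r)
  | Done r
  | PipeM (m (Pipe i o u m r))
  | Leftover (Pipe i o u m r) i

-- ASSUMPTION: a guess at the helper used by pipe'. It closes off the input
-- side by answering every NeedInput with the given upstream result.
noInput :: Monad m => u' -> Pipe i' o u' m r -> Pipe i o u m r
noInput _ (Done r)           = Done r
noInput u (NeedInput _ c)    = noInput () (c u)
noInput u (HaveOutput p c o) = HaveOutput (noInput u p) c o
noInput u (PipeM mp)         = PipeM (liftM (noInput u) mp)
noInput u (Leftover p _)     = noInput u p

-- The simplified composition from the message above (no resumption).
pipe :: Monad m => Pipe a b r0 m r1 -> Pipe b c r1 m r2 -> Pipe a c r0 m r2
pipe = pipe' (return ())
  where
    pipe' :: Monad m => m () -> Pipe a b r0 m r1 -> Pipe b c r1 m r2
          -> Pipe a c r0 m r2
    pipe' final left right =
      case right of
        Done r2          -> PipeM (final >> return (Done r2))
        HaveOutput p c o -> HaveOutput (pipe' final left p) c o
        PipeM mp         -> PipeM (liftM (pipe' final left) mp)
        Leftover p i     -> pipe' final (HaveOutput left final i) p
        NeedInput rp rc  ->
          case left of
            Done r1                   -> noInput () (rc r1)
            HaveOutput left' final' o -> pipe' final' left' (rp o)
            PipeM mp                  -> PipeM (liftM (\left' -> pipe' final left' right) mp)
            Leftover left' i          -> Leftover (pipe' final left' right) i
            NeedInput lp lc           -> NeedInput
              (\a -> pipe' final (lp a) right)
              (\r0 -> pipe' final (lc r0) right)

-- Hypothetical runner: a closed pipeline never outputs and has no upstream.
runPipe :: Monad m => Pipe Void Void () m r -> m r
runPipe (HaveOutput _ _ o) = absurd o
runPipe (NeedInput _ c)    = runPipe (c ())
runPipe (Done r)           = return r
runPipe (PipeM mp)         = mp >>= runPipe
runPipe (Leftover p _)     = runPipe p

-- Hypothetical upstream pipe: yields 1, 2, 3, then "seals" with a String.
source :: Monad m => Pipe i Int u m String
source = go [1, 2, 3]
  where
    go []       = Done "eof"
    go (x : xs) = HaveOutput (go xs) (return ()) x

-- Hypothetical downstream pipe: sums its inputs and returns the sum
-- together with whatever result upstream finished with.
sumSink :: Monad m => Pipe Int o u m (Int, u)
sumSink = go 0
  where
    go acc = NeedInput (\i -> go (acc + i)) (\u -> Done (acc, u))

example :: (Int, String)
example = runIdentity (runPipe (source `pipe` sumSink))
```

Here `sumSink` only learns the upstream result through the second field of `NeedInput`, which is the sense in which the fifth parameter affects just the next pipe downstream.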
Can you clarify what you mean by termination not being handled correctly?
Michael
----------
From: Chris Smith <cds...@gmail.com>
Date: Sun, Jun 3, 2012 at 5:14 PM
To: Michael Snoyman <mic...@snoyman.com>
Cc: Gabriel Gonzalez <gabri...@gmail.com>, Paolo Capriotti <p.cap...@gmail.com>
Michael Snoyman <mic...@snoyman.com> wrote:
> I think we're looking at the purpose of this fifth parameter
> completely differently.
I think so as well. I'm sure there are many ways that a fifth type
parameter could be added to Pipe, but Michael's code looks consistent
with what I had in mind, though I haven't looked in detail at the
finalization part.
The only comment I'd give (and this is admittedly tough in conduit due
to backward compatibility) is that it sure would be nice to separate
leftovers from the base pipe, just because leftovers are a different
concern, and don't fit cleanly into the abstraction. Including them
makes the abstraction ugly everywhere, as opposed to only in
situations where it's needed. This comes out in the "downstream
leftovers will be discarded" rule -- which is of course necessary, but
would lead me to want to build a different type for pipes with
leftovers. That would essentially amount to:
    newtype LeftoverPipe a b u m r = LeftoverPipe
        { fromLeftoverPipe :: Pipe a b u m (r, Maybe a) }
with an appropriate monad instance, and a specialized
pseudo-composition where the LeftoverPipe can only be used on the
upstream side. If you want to explicitly discard leftovers, then you
could unwrap it back to a regular pipe, and (liftM fst) on the inner
pipe.
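
A minimal sketch of the "unwrap and discard" step, assuming a leftover-free base `Pipe` with `Yield`/`Await`/`M`/`Done` constructors (an assumption, not conduit's actual type). `discardLeftovers`, `firstOdd`, and `feed` are hypothetical names for illustration; the monad instance and the pseudo-composition for `LeftoverPipe` are deliberately left out, since nothing in this thread pins them down.

```haskell
import Control.Monad (liftM)
import Data.Functor.Identity (runIdentity)

-- ASSUMPTION: a base pipe with no leftover constructor.
data Pipe a b u m r
  = Yield b (Pipe a b u m r)
  | Await (a -> Pipe a b u m r) (u -> Pipe a b u m r)
  | M (m (Pipe a b u m r))
  | Done r

instance Monad m => Functor (Pipe a b u m) where
  fmap f (Yield b p) = Yield b (fmap f p)
  fmap f (Await g h) = Await (fmap f . g) (fmap f . h)
  fmap f (M mp)      = M (liftM (fmap f) mp)
  fmap f (Done r)    = Done (f r)

-- The wrapper from the message: the result carries at most one unused input.
newtype LeftoverPipe a b u m r = LeftoverPipe
  { fromLeftoverPipe :: Pipe a b u m (r, Maybe a) }

-- Explicitly throw the leftover away and return to a regular pipe.
discardLeftovers :: Monad m => LeftoverPipe a b u m r -> Pipe a b u m r
discardLeftovers = fmap fst . fromLeftoverPipe

-- Hypothetical example: accept the next input only if it is odd,
-- otherwise hand it back as a leftover.
firstOdd :: LeftoverPipe Int b u m (Maybe Int)
firstOdd = LeftoverPipe $ Await
  (\i -> Done (if odd i then (Just i, Nothing) else (Nothing, Just i)))
  (\_ -> Done (Nothing, Nothing))

-- Hypothetical driver: feed a list of inputs, then the upstream result;
-- any output the pipe yields is ignored.
feed :: Monad m => [a] -> u -> Pipe a b u m r -> m r
feed xs       u (Yield _ p) = feed xs u p
feed (x : xs) u (Await g _) = feed xs u (g x)
feed []       u (Await _ h) = feed [] u (h u)
feed xs       u (M mp)      = mp >>= feed xs u
feed _        _ (Done r)    = return r

keptLeftover :: (Maybe Int, Maybe Int)
keptLeftover = runIdentity (feed [4] () (fromLeftoverPipe firstOdd))

droppedLeftover :: Maybe Int
droppedLeftover = runIdentity (feed [4] () (discardLeftovers firstOdd))
```

The point of the wrapper is visible in the types: the leftover is part of the result until someone discards it on purpose.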
--
Chris
----------
From: Paolo Capriotti <p.cap...@gmail.com>
Date: Sun, Jun 3, 2012 at 5:52 PM
To: Chris Smith <cds...@gmail.com>
Cc: Michael Snoyman <mic...@snoyman.com>, Gabriel Gonzalez <gabri...@gmail.com>
On Sun, Jun 3, 2012 at 3:14 PM, Chris Smith <cds...@gmail.com> wrote:
> Michael Snoyman <mic...@snoyman.com> wrote:
>> I think we're looking at the purpose of this fifth parameter
>> completely differently.
>
> I think so as well. I'm sure there are many ways that a fifth type
> parameter could be added to Pipe, but Michael's code looks consistent
> with what I had in mind, though I haven't looked in detail at the
> finalization part.
I actually think Gabriel is right (although I'm still unsure what this
"comonoid" business is all about). The `u` parameter can be seen as the
upstream return value (I guess this is Michael's intuition as well, given the
name he's chosen).
Of course, in the end, `u` and `r` will be unified, but for
individual portions of a pipe (composed using monadic bind), they are not
necessarily going to be the same type.
So `u` is playing the role of my `BrokenPipe` exception, but it actually
carries the upstream return value with it. More generally, you could use
`Either u SomeException` to support catching general exceptions instead of just
`BrokenPipe`.
What I don't like in this approach is that it makes it impossible to define a
simple `await` primitive returning `a`. The reason you can do that in
pipes-core is that, in addition to the `SomeException -> Pipe a b m r` field in
`Await`, you also have a `Throw` primitive, which lets you express the idea of
"ignoring" the exception (by essentially catching it then rethrowing it).
Extending this idea to the 5-parameter solution, it would probably make sense
to add another constructor (analogous to my `Throw`):
    data Pipe a b u m r =
        ...
      | Defer u -- no continuation

    defer :: u -> Pipe a b u m r
    defer = Defer
which works just like `Done`, but with the upstream return value. Now you can
implement `await`:
    await :: Monad m => Pipe a b u m a
    await = Await return defer
I'm not sure if this works, but if it does, it looks like a very nice
improvement.
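
A minimal sketch suggesting that the idea at least type-checks, assuming a base `Pipe` with `Yield`/`Await`/`M`/`Done` constructors plus the proposed `Defer` (the constructor set and the bind clause for `Defer` are assumptions). Composition is not covered here, so this says nothing about the Category laws. `addTwo` and `feed` are hypothetical names for illustration.

```haskell
import Control.Monad (ap, liftM)
import Data.Functor.Identity (runIdentity)

-- ASSUMPTION: constructor set chosen for the sketch.
data Pipe a b u m r
  = Yield b (Pipe a b u m r)
  | Await (a -> Pipe a b u m r) (u -> Pipe a b u m r)
  | M (m (Pipe a b u m r))
  | Done r
  | Defer u -- no continuation

instance Monad m => Functor (Pipe a b u m) where
  fmap = liftM

instance Monad m => Applicative (Pipe a b u m) where
  pure  = Done
  (<*>) = ap

instance Monad m => Monad (Pipe a b u m) where
  Yield b p >>= f = Yield b (p >>= f)
  Await g h >>= f = Await (\a -> g a >>= f) (\u -> h u >>= f)
  M mp      >>= f = M (liftM (>>= f) mp)
  Done r    >>= f = f r
  Defer u   >>= _ = Defer u -- like Done, but skips the rest of the pipe

defer :: u -> Pipe a b u m r
defer = Defer

-- The simple primitive: returns the next input, or defers to the
-- upstream result if upstream has finished.
await :: Monad m => Pipe a b u m a
await = Await return defer

-- Hypothetical consumer written with the simple await.
addTwo :: Monad m => Pipe Int b u m Int
addTwo = do
  x <- await
  y <- await
  return (x + y)

-- Hypothetical driver: Left is a deferred upstream result,
-- Right is the pipe's own result.
feed :: Monad m => [a] -> u -> Pipe a b u m r -> m (Either u r)
feed xs       u (Yield _ p) = feed xs u p
feed (x : xs) u (Await g _) = feed xs u (g x)
feed []       u (Await _ h) = feed [] u (h u)
feed xs       u (M mp)      = mp >>= feed xs u
feed _        _ (Done r)    = return (Right r)
feed _        _ (Defer u')  = return (Left u')

enoughInput :: Either String Int
enoughInput = runIdentity (feed [1, 2] "eof" addTwo)

shortInput :: Either String Int
shortInput = runIdentity (feed [1] "eof" addTwo)
```

With two inputs `addTwo` finishes normally; with one input the second `await` defers, and the upstream result comes out instead.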
Without `await`, I don't think the fifth parameter is a good tradeoff: you
would end up with a cruftier version of `await`, in exchange for a (sometimes)
cleaner `runPipe`. And `await` usually appears much more frequently than
`runPipe`, so that doesn't seem very advantageous to me.
> This comes out in the "downstream
> leftovers will be discarded" rule -- which is of course necessary, but
> would lead me to want to build a different type for pipes with
> leftovers. That would essentially amount to:
>
>     newtype LeftoverPipe a b u m r = LeftoverPipe
>         { fromLeftoverPipe :: Pipe a b u m (r, Maybe a) }
>
> with an appropriate monad instance, and a specialized
> pseudo-composition where the LeftoverPipe can only be used on the
> upstream side. If you want to explicitly discard leftovers, then you
> could unwrap it back to a regular pipe, and (liftM fst) on the inner
> pipe.
I agree 100% on the fact that "pipe-with-leftovers" should be different from
"pipe", because neither is strictly more general than the other (one has
leftovers, but no sensible composition).
This `LeftoverPipe` looks a lot like my old `ChunkPipe` in the previous release
of pipes-extra. Maybe it can actually be made to work, but I originally messed
that up and it didn't satisfy the `Monad` laws.
My current solution is `PutbackPipe` (also in pipes-extra), which has the
slightly more powerful `unawait` primitive.
By the way, would it make sense to move this conversation (and possible future
ones about the design of pipes/conduits) to some public medium? How about
setting up a mailing list?
BR,
Paolo
----------
From: Michael Snoyman <mic...@snoyman.com>
Date: Sun, Jun 3, 2012 at 6:32 PM
To: Paolo Capriotti <p.cap...@gmail.com>
Cc: Gabriel Gonzalez <gabri...@gmail.com>, Chris Smith <cds...@gmail.com>
Just responding to the point of public mailing list: I think that's a great idea. Anyone opposed to a Google Group named "streaming-haskell"? And is it OK with everyone if I forward the contents of this thread to the list?
----------
From: Chris Smith <cds...@gmail.com>
Date: Sun, Jun 3, 2012 at 6:46 PM
To: Paolo Capriotti <p.cap...@gmail.com>
Cc: Michael Snoyman <mic...@snoyman.com>, Gabriel Gonzalez <gabri...@gmail.com>
Paolo Capriotti <p.cap...@gmail.com> wrote:
> I actually think Gabriel is right (although I'm still unsure what this
> "comonoid" business is all about). The `u` parameter can be seen as the
> upstream return value (I guess this is Michael's intuition as well, given the
> name he's chosen).
Yes, that's exactly what u is...
> Of course, in the end, `u` and `r` will be unified, but for
> individual portions of a pipe (composed using monadic bind), they are not
> necessarily going to be the same type.
Right. I'm actually backing off on the "eventually u and r will be
unified" part, too. I initially thought of the u parameter as a
kludge, a trick to fix up return types for pipes that worked like
consume, and that it should eventually be stuffed back under the
covers. But it's become clear that the u type parameter is meaningful
in its own right. In my sample implementation, the type of runPipe is
    runPipe :: Monad m => Pipe () Void u m r -> m r
So there is no requirement that the u type parameter unify with the
result anywhere.
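
One way a runner with that type could look, as a sketch only: it assumes a `Yield`/`Await`/`M`/`Done` pipe and answers every await with `()`, so the upstream-result continuation is never called. That is a guess at the sample implementation, not a copy of it. `countAwaits` is a hypothetical example pipe.

```haskell
import Data.Functor.Identity (Identity, runIdentity)
import Data.Void (Void, absurd)

-- ASSUMPTION: constructor set chosen for the sketch.
data Pipe a b u m r
  = Yield b (Pipe a b u m r)
  | Await (a -> Pipe a b u m r) (u -> Pipe a b u m r)
  | M (m (Pipe a b u m r))
  | Done r

-- Every await is answered with (), so no value of type u is ever needed
-- and u stays completely free.
runPipe :: Monad m => Pipe () Void u m r -> m r
runPipe (Yield b _) = absurd b
runPipe (Await f _) = runPipe (f ())
runPipe (M mp)      = mp >>= runPipe
runPipe (Done r)    = return r

-- Hypothetical pipe: awaits n times and reports how many inputs it saw.
countAwaits :: Int -> Pipe () b u m Int
countAwaits n = go n 0
  where
    go :: Int -> Int -> Pipe () b u m Int
    go 0 acc = Done acc
    go k acc = Await (\() -> go (k - 1) (acc + 1)) (\_ -> Done acc)

-- The same pipe run at two unrelated choices of u.
withStringU :: Int
withStringU = runIdentity (runPipe (countAwaits 3 :: Pipe () Void String Identity Int))

withBoolU :: Int
withBoolU = runIdentity (runPipe (countAwaits 3 :: Pipe () Void Bool Identity Int))
```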
> So `u` is playing the role of my `BrokenPipe` exception, but it actually
> carries the upstream return value with it. More generally, you could use
> `Either u SomeException` to support catching general exceptions instead of just
> `BrokenPipe`.
Yes, this would be a welcome change. I completely punted on questions
of exceptions and finalization, but it's true that the upstream pipe
could terminate with an exception. In that case, you'd probably want
await to rethrow the exception, so changing the type of the Await
constructor seems reasonable.
> What I don't like in this approach is that it makes it impossible to define a
> simple `await` primitive returning `a`. The reason you can do that in
> pipes-core is that, in addition to the `SomeException -> Pipe a b m r` field in
> `Await`, you also have a `Throw` primitive, which lets you express the idea of
> "ignoring" the exception (by essentially catching it then rethrowing it).
>
> Extending this idea to the 5-parameter solution, it would probably make sense
> to add another constructor (analogous to my `Throw`):
Yes, I agree with this as well. Again, I completely punted on
exception handling. If there is a way to build a pipes-style await
primitive on top of tryAwait (being the version we have now) and
exception handling, that is ideal. What I did not understand about
Gabriel's post with exception handling was the strange wiring where
downstream pipes get to handle exceptions first; it sounded like some
magical behavior was being attached to using await a second time. My
intuition says that a pipe should handle its own exceptions, and if it
terminates with an exception, then you should pass that downstream
(and a second await after catching the exception should rethrow the
same exception again). I'm willing to be convinced otherwise, though.
> Without `await`, I don't think the fifth parameter is a good tradeoff: you
> would end up with a cruftier version of `await`, in exchange for a (sometimes)
> cleaner `runPipe`. And `await` usually appears much more frequently than
> `runPipe`, so that doesn't seem very advantageous to me.
On the other hand, you could say that it's preferable because it
allows things to be done that just can't be done (without undefined
and such) in the current pipes-core. But hopefully we can recover the
more convenient await primitive through something like exception
handling, and we won't have to worry about it.
> This `LeftoverPipe` looks a lot like my old `ChunkPipe` in the previous release
> of pipes-extra. Maybe it can actually be made to work, but I originally messed
> that up and it didn't satisfy the `Monad` laws.
Okay, I'd have to look at the details. I don't have proofs of the
monad laws for what I've written, but it looks right intuitively. In
any case, if there's a better option there, fine. My main point was
that including it in Pipe is probably a mistake in the long term,
since it fails to guard against misuse that can lead to just dropping data
(I'd say something about the category laws, but really it's the
dangerous behavior that matters, and the category laws are the
symptom).
> By the way, would it make sense to move this conversation (and possible future
> ones about the design of pipes/conduits) to some public medium?
Fine with me.
--
Chris
----------
From: Paolo Capriotti <p.cap...@gmail.com>
Date: Sun, Jun 3, 2012 at 7:00 PM
To: Michael Snoyman <mic...@snoyman.com>
Cc: Gabriel Gonzalez <gabri...@gmail.com>, Chris Smith <cds...@gmail.com>
On Sun, Jun 3, 2012 at 4:32 PM, Michael Snoyman <mic...@snoyman.com> wrote:
> Just responding to the point of public mailing list: I think that's a great
> idea. Anyone opposed to a Google Group named "streaming-haskell"? And is it
> OK with everyone if I forward the contents of this thread to the list?
Yep! Please go ahead!
BR,
Paolo