On Mon, Jul 30, 2012 at 5:51 AM, Bruno Jouhier <bjouh...@gmail.com> wrote:
> @tim
> The API that I used in this blog post is a simplified version of the API I
> implemented in streamline. I simplified it in the blog post because I just
> wanted to demo the equivalence between the two styles of API.
> The streams module that I am using
> (https://github.com/Sage/streamlinejs/blob/master/lib/streams/server/s...)
> has most of the features that you saw missing:
> * an optional "len" parameter in the read call.
> * low and high water mark options in the ReadableStream constructor.
> The "len" parameter has your "bytes" semantics and I use it exactly the way
> you describe (typically to read 4 bytes to get a frame length and then read
> N bytes for a frame). I did not implement "maxBytes" semantics because I did
> not need it (which does not mean it would not be useful). The thing is that
> all the additional bells and whistles can be implemented around the basic
> read(cb) call (called readChunk in my module).
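To make the point above concrete, here is a minimal sketch of how a read(len, cb) with "bytes" semantics could be layered on a basic readChunk(cb). It is not the streamline implementation: makeReader, readFrame and chunkSource are hypothetical names, and readChunk is assumed to call cb(err, chunk) with chunk === null at end of stream. It uses the current Node Buffer API.

```javascript
// Hypothetical helper: builds read(len, cb) on top of a basic readChunk(cb).
// Assumption: readChunk calls cb(err, chunk), with chunk === null at end of stream.
function makeReader(readChunk) {
  let pending = Buffer.alloc(0); // bytes received but not yet consumed

  return function read(len, cb) {
    if (pending.length >= len) {
      // Enough data buffered: hand out exactly len bytes, keep the rest.
      const out = pending.subarray(0, len);
      pending = pending.subarray(len);
      return cb(null, out);
    }
    readChunk(function (err, chunk) {
      if (err) return cb(err);
      if (chunk === null) {
        return cb(new Error('stream ended before ' + len + ' bytes were available'));
      }
      pending = Buffer.concat([pending, chunk]);
      read(len, cb); // try again with the larger buffer
    });
  };
}

// Read one frame: a 4-byte big-endian length header, then that many body bytes.
function readFrame(read, cb) {
  read(4, function (err, header) {
    if (err) return cb(err);
    read(header.readUInt32BE(0), cb);
  });
}

// Demo source (hypothetical): hands out pre-cut chunks, then null.
function chunkSource(chunks) {
  const queue = chunks.slice();
  return function readChunk(cb) {
    cb(null, queue.length ? queue.shift() : null);
  };
}
```

The caller never has to push unused bytes back into the stream: whatever is left over after a frame stays in the reader's private buffer until the next read call.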
> I introduced low and high mark options because I wanted to avoid a
> pause/resume dance around every data event when the data arrives faster than
> it is consumed. My assumption was that a little queue with high and low
> marks would reduce the number of pause/resume calls and improve performance.
> Basically trading a bit of space for speed. But I have to admit that I did
> not bench it. So, if the pause/resume dance costs very little this may be
> overkill.
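The queue with low and high marks described above could look roughly like the sketch below. It is an illustration under assumptions, not the streamline code: QueuedReader, lowMark and highMark are hypothetical names, the marks are counted in chunks rather than bytes to keep it short, end-of-stream and errors are left out, and the source is assumed to emit 'data' events and expose pause() and resume().

```javascript
const { EventEmitter } = require('events');

// Hypothetical wrapper: queues 'data' events and only calls pause()/resume()
// on the source when the queue crosses the high/low marks.
class QueuedReader {
  constructor(source, { lowMark = 1, highMark = 4 } = {}) {
    this.source = source;
    this.lowMark = lowMark;
    this.highMark = highMark;
    this.queue = [];     // chunks received but not yet read
    this.waiting = null; // callback of a read() that found the queue empty
    this.paused = false;
    source.on('data', (chunk) => this._onData(chunk));
  }

  _onData(chunk) {
    if (this.waiting) {
      // A reader is already waiting: deliver directly, no queuing.
      const cb = this.waiting;
      this.waiting = null;
      return cb(null, chunk);
    }
    this.queue.push(chunk);
    if (!this.paused && this.queue.length >= this.highMark) {
      this.paused = true;
      this.source.pause(); // queue is full enough, stop the producer
    }
  }

  read(cb) {
    if (this.queue.length === 0) {
      this.waiting = cb; // only one outstanding read is supported in this sketch
      return;
    }
    const chunk = this.queue.shift();
    if (this.paused && this.queue.length <= this.lowMark) {
      this.paused = false;
      this.source.resume(); // queue has drained enough, restart the producer
    }
    cb(null, chunk);
  }
}
```

With the marks far enough apart, a fast producer triggers one pause/resume pair per burst instead of one per chunk. Whether that saves anything measurable is exactly the open question raised above.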
> @isaac and mikeal,
> This callback proposal may sound very "anti-eventish" and it may give the
> impression that I'm sorta trying to eradicate events from node's APIs
> (nobody said it but I can see how it could be perceived this way). This is
> not the case. I like node's event API and I find it very elegant. But node
> gives us two API styles (callbacks and events) and it is not always easy to
> choose between the two. Here is the rationale that I use to decide between
> them:
> My main criterion is CORRELATION. Basically, I start with the assumption that
> the API is event-oriented and then I analyze the degree of correlation
> between the various events. If the events are highly correlated, I choose
> the callback style. If they are loosely correlated, I keep the event style.
> Some examples:
> * User events (browser side) are very loosely correlated => event style
> * Incoming HTTP requests (server side) are also very loosely correlated =>
> event style
> * Data streams vary. If each data chunk is a complete message which is more
> or less independent from other messages, the event style is best. If, on the
> other hand, the chunks are correlated (because the whole stream has a strong
> internal structure, or because it has been chunked on arbitrary boundaries
> that don't match its internal structure), then the callback style is best.
> * Confirmation events (like "connect/error" events that follow a connection
> attempt, or a "drain" event that follows a write returning false) are fully
> correlated => callback style.
> Also, the event style API is more powerful than the callback style API as it
> supports multiple listeners.
> BUT:
> * It is very easy to wrap a callback API with an event listener.
> * Very often, in the correlated case, there is a "main" consumer which needs
> to correlate the events, and auxiliary consumers that don't care that much
> about the correlations (log them, feed statistics, etc). A dual API with
> callbacks for the main consumer and events for the auxiliary ones works
> great.
> * Wrapping an event style API with a callback style API is a lot more
> difficult.
> * Callback style APIs are easier to use when the events are correlated
> because you don't need to set up state machines to re-correlate the events.
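The dual API from the list above (callbacks for the main consumer, events for the auxiliary ones) can be sketched in a few lines. This is an assumption-laden illustration: withEvents is a hypothetical name, and read(cb) is assumed to call cb(err, chunk) with chunk === null at end of stream. Errors go only to the callback here, because emitting 'error' on an EventEmitter with no listener throws.

```javascript
const { EventEmitter } = require('events');

// Hypothetical adapter: wraps a callback-style read(cb) so that every chunk
// is also published as an event for auxiliary consumers (logging, stats...).
function withEvents(read) {
  const events = new EventEmitter();

  function tappedRead(cb) {
    read(function (err, chunk) {
      if (!err) {
        if (chunk === null) events.emit('end');
        else events.emit('data', chunk);
      }
      cb(err, chunk); // the main consumer still drives the stream
    });
  }

  return { read: tappedRead, events: events };
}
```

The main consumer keeps the sequential read calls it needs to correlate chunks, while any number of listeners can observe the same chunks without affecting the flow.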
> Given this, I probably favor the callback style a lot more than most node
> developers. But this is not a systematic "anti-event" attitude; there is a
> rationale behind it and I wanted to share it with you.
> Bruno
> On Saturday, July 28, 2012 9:14:11 PM UTC+2, Mikeal Rogers wrote:
>> On Jul 28, 2012, at 12:05 PM, Tim Caswell
>> <t...@creationix.com> wrote:
>> > FWIW, I actually like Bruno's proposal. It doesn't cover all the use
>> > cases, but it makes backpressure-enabled pumps really easy.
>> > One use case missing that's easy to add is when consuming a binary
>> > protocol, I often only want part of the input. For example, I might
>> > want to get the first 4 bytes, decode that as a uint32 length header
>> > and then read n more bytes for the body. Without being able to
>> > request how many bytes I want, I have to handle putting data back in
>> > the stream that I don't need. That's very error prone and tedious.
>> > So on the read function, add an optional "maxBytes" or "bytes"
>> > parameter. The difference is in the maxBytes case, I want the data as
>> > soon as there is anything, even if it's less than the number of bytes
>> > I want. In the "bytes" case I want to wait till that many bytes are
>> > available. Both are valid for different use cases.
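The difference between the two semantics described above can be shown on an already buffered run of bytes. This is only a sketch: takeBytes and takeMaxBytes are hypothetical helpers, not part of any proposed API, and a null result stands for "not ready yet, wait for more data".

```javascript
// Both helpers take the bytes currently buffered and return { data, rest }.
// data === null means the request cannot be satisfied yet.

// "bytes" semantics: wait until n bytes are available, then return exactly n.
function takeBytes(pending, n) {
  if (pending.length < n) return { data: null, rest: pending };
  return { data: pending.subarray(0, n), rest: pending.subarray(n) };
}

// "maxBytes" semantics: return whatever is available right away, capped at n.
function takeMaxBytes(pending, n) {
  if (pending.length === 0) return { data: null, rest: pending };
  const size = Math.min(n, pending.length);
  return { data: pending.subarray(0, size), rest: pending.subarray(size) };
}
```

With 3 bytes buffered and n = 4, takeBytes keeps waiting while takeMaxBytes returns the 3 bytes immediately. The first suits fixed-size headers and frames, the second suits pumps that just want to move data along.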
>> The early stuff I saw included a "length" option.
>> > Also streams (both readable and writable) need a configurable
>> > low-water mark. I don't want to wait till the pipe is empty before I
>> > start piping data again. This mark would control how soon writable
>> > streams called my write callback and how much readable streams would
>> > readahead from their data source before waiting for me to call read.
>> > I want to keep it always full. It would be great if this was handled
>> > internally in the stream and consumers of the stream simply configured
>> > what the mark should be.
>> I think you're missing how this works. Nobody automatically asks for data
>> so watermarks aren't strictly necessary. You ask for data if it's available
>> and you read as much as you can handle.
>> There is no "readahead". If someone stops calling read() then the buffer
>> fills and, if it's a TCP stream, it's asked to stop sending data.
>> Remember that when the "readable" event goes off it's expected that the
>> pending data is read in the same event loop cycle.