How to write stream chunk by chunk with callbacks
Alexey Petrushin  
From: Alexey Petrushin <alexey.petrus...@gmail.com>
Date: Sat, 13 Oct 2012 02:17:13 -0700 (PDT)
Local: Sat, Oct 13 2012 5:17 am
Subject: How to write stream chunk by chunk with callbacks

I don't quite understand how stream pause/resume works, or more exactly,
how to use it in a simple manner. It's needed in situations where
the read stream produces data faster than the write stream can consume it.

I need to write a custom stream implementation, and writing it with proper
handling of `pause/resume` functionality does not seem like an easy task.

Plain callbacks seem simpler to me. Can streams somehow be wrapped into
code like this (code with highlighting: https://gist.github.com/3883920 )?

    var copy = function(inputStream, outputStream, callback){
      var copyNextChunk = function(){
        inputStream.read(function(err, chunk){
          if(err) return callback(err)
          // When chunk == null there's no data, copying is finished.
          if(!chunk) return callback()
          outputStream.write(chunk, function(err){
            // Callback called only when chunk of data
            // delivered to the recipient and
            // we can send another one.
            if(err) return callback(err)
            copyNextChunk()
          })
        })
      }
      // Start copying with the first chunk.
      copyNextChunk()
    }


 
Bruno Jouhier
From: Bruno Jouhier <bjouh...@gmail.com>
Date: Sat, 13 Oct 2012 10:23:00 -0700 (PDT)
Subject: Re: How to write stream chunk by chunk with callbacks

I wrote a post about plain callback APIs for streams:
http://bjouhier.wordpress.com/2012/07/04/node-js-stream-api-events-or...


 
Mikeal Rogers
From: Mikeal Rogers <mikeal.rog...@gmail.com>
Date: Sat, 13 Oct 2012 19:31:09 +0200
Local: Sat, Oct 13 2012 1:31 pm
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks

all of these are wrong.

inputStream.pipe(outputStream)
outputStream.on('close', callback)

Bruno Jouhier
From: Bruno Jouhier <bjouh...@gmail.com>
Date: Sat, 13 Oct 2012 11:04:25 -0700 (PDT)
Local: Sat, Oct 13 2012 2:04 pm
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks

What's wrong?

You'll find links to gists at the end of my post. The code works!
And Alexey's pumping function is equivalent to the pumping loop I gave in
my post.


 
Mikeal Rogers
From: Mikeal Rogers <mikeal.rog...@gmail.com>
Date: Sat, 13 Oct 2012 22:33:00 +0200
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks

i'm not engaging with your strawman Bruno.

i showed how we *actually* move data in node. this is not a debate, that's how it works. if anyone wants to use node, or write a module that has a stream that moves data, that's how they do it.

this was a question, not an open invitation for bikeshedding. please let the list answer questions.

Isaac Schlueter
From: Isaac Schlueter <i...@izs.me>
Date: Sat, 13 Oct 2012 15:19:07 -0700
Local: Sat, Oct 13 2012 6:19 pm
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks
This is a good time to mention streams2, I think.

Caveat: There've been some discussions about this API.  If you missed
those discussions, I'm sorry, but we're not going to have them again.
The base implementation is finished, and the integration with
node-core is close to being released.  This is a "telling" message,
not an "asking" message :)

In 0.10, the Readable Stream interface is getting a massive overhaul.
Other streams (Duplexes, Writable streams, etc) are also getting
revamped considerably, so that they use the same base classes and
provide much more consistent events.

All the changes are being done such that the previous API continues to
work.  However, if you attach a 'data' event handler, or call pause()
or resume(), then the stream switches into "old-mode", and will behave
like old streams.  (However, it'll behave appropriately, with pause()
buffering like you almost always want it to, and so on.)

There is no pause/resume. There is no 'data' event.  There is a read()
method, and a 'readable' event.  You call read() until it returns
null, and then wait for a readable event to tell you it's time to
read() more.  This is very symmetric to calling write() until it
returns false, and then waiting for the 'drain' event to tell you to
write() more.

In order to control how much you read, you can call read(n) with a
number of bytes (or characters if you've done a setEncoding() call in
the past) to return.  If that many bytes aren't available, then it'll
return null, and emit 'readable' when they are.
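
As a rough illustration of `read(n)`, the sketch below pulls fixed-size records out of a byte stream. It is written against the `stream` module as shipped in current Node.js releases, which may differ in detail from the pre-release API described in this message. `readRecords` and the 4-byte record size are made-up names and values for the example.

```javascript
const { PassThrough } = require('stream');

// Collect fixed-size records from a byte stream using read(size).
// `readRecords` is a hypothetical helper, not part of Node.
function readRecords(input, size, callback) {
  const records = [];
  input.on('readable', function () {
    let record;
    // read(size) returns null until `size` bytes are buffered;
    // a later 'readable' event signals that it is worth trying again.
    while ((record = input.read(size)) !== null) {
      records.push(record.toString());
    }
  });
  input.on('end', function () { callback(null, records); });
  input.on('error', callback);
}

// Usage: the writes do not line up with record boundaries.
const source = new PassThrough();
readRecords(source, 4, function (err, records) {
  if (err) throw err;
  console.log(records);
});
source.write('abcd');
source.write('ef');
source.end('gh');
```

Note that once the stream has ended, `read(size)` hands back whatever is left even if it is shorter than `size`, so a final partial record would need its own check.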

So, how do you tell the underlying system to pause?  Simple.  Just
don't call read().  If you're not consuming the bytes, then the buffer
will fill up to a certain level, and stop pulling bytes in from the
underlying systems, and TCP will do its job, or it'll stop reading the
file, or whatever it is that this stream of data refers to.  It's a
pull-style stream, so it doesn't spray chunks at you.

Every readable and writable class that uses the base classes will
automatically have high and low water mark control over their
buffering, and be completely consistent in how they emit events.  If
you want to extend the classes, you can simply implement the
asynchronous MyStream.prototype._read(n, callback) method, or the
asynchronous MyStream.prototype._write(chunk, callback) method.
(There are also base classes for Duplex, which does both, and for
Transform, which turns the input into the output via a
_transform(chunk, outputFunction, callback) method.)
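
For readers extending the base classes today, here is a minimal sketch. One caveat: the signatures that eventually shipped in Node differ from the draft ones quoted above. A Readable implements `_read(size)` and supplies data with `this.push()`, and a Writable implements `_write(chunk, encoding, callback)`. `Counter` and `Collector` are made-up classes for illustration only.

```javascript
const { Readable, Writable } = require('stream');

// A readable source that emits the numbers 1..limit, one line each.
class Counter extends Readable {
  constructor(limit) {
    super();
    this.limit = limit;
    this.current = 0;
  }
  _read() {
    this.current += 1;
    // push(null) signals the end of the data.
    this.push(this.current > this.limit ? null : this.current + '\n');
  }
}

// A writable sink that stores every chunk it receives.
class Collector extends Writable {
  constructor() {
    super();
    this.chunks = [];
  }
  _write(chunk, encoding, callback) {
    this.chunks.push(chunk.toString());
    // Calling back later simulates a slow consumer; the base class
    // holds further writes until this callback fires.
    setImmediate(callback);
  }
}

const sink = new Collector();
new Counter(3).pipe(sink).on('finish', function () {
  console.log(sink.chunks.join(''));
});
```

Buffering, water marks and event ordering all come from the base classes; the subclasses only say where data comes from and where it goes.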

So, regarding the OP here:

1. Your code is wrong.  Mikeal's input.pipe(output) is the way to do
it.  If you want to listen for an event when everything is done, then
it's tricky in node 0.8 and before.  In 0.10 and later, you'd do
`output.on('finish', callback)`, and all Writable objects will emit a
'finish' event when you've called end() and all the write buffer is
cleared. If you really wanted to not use pipe, here's how you'd do it
with new streams:

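function flow(input, output, callback) {
  input.on('end', output.end.bind(output))
  output.on('finish', callback);
  f()
  function f() {
    var chunk
    while (null !== (chunk = input.read()))
      if (false === output.write(chunk))
        // Output buffer is full: continue once it has drained.
        return output.once('drain', f)
    // Nothing left to read for now: wait for more data.
    input.once('readable', f)
  }
}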

With old streams, it's much trickier and harder to get right.  Look at
`util.pump` or Stream.prototype.pipe.

There's also error handling and a few interesting edge cases, and the
backwards compatibility requirement makes it a bit trickier.
Basically, just use the input.pipe(output) method, and relax :)

2. If you decide to use some other streaming data abstraction, then
that's fine, but you are off the reservation as far as Node.js is
concerned.  If you ask node users for help, don't be surprised if they
say "Just use streams and pipe()" and have no idea what your code is
doing.  I actually think wheels *should* be reinvented from time to
time, but you should probably not reinvent the wheel until you've at
least tried the one everyone else is using, so you can make an
informed decision about it.
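
A minimal sketch of the `input.pipe(output)` plus `'finish'` approach recommended in point 1, with basic error handling added. It assumes a Node version where Writable streams emit `'finish'` (0.10 or later). `copy` is a made-up helper name, and `Readable.from` is only used to build a small test input.

```javascript
const { Readable, Writable } = require('stream');

// Copy input to output and call back exactly once, on completion or error.
function copy(input, output, callback) {
  let done = false;
  function finish(err) {
    if (done) return;
    done = true;
    callback(err);
  }
  // pipe() does not forward errors, so listen on both sides.
  input.on('error', finish);
  output.on('error', finish);
  output.on('finish', function () { finish(); });
  input.pipe(output);
}

// Usage with an in-memory source and sink.
const received = [];
const sink = new Writable({
  write(chunk, encoding, next) {
    received.push(chunk.toString());
    next();
  }
});
copy(Readable.from(['a', 'b', 'c']), sink, function (err) {
  if (err) throw err;
  console.log(received.join(''));
});
```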


 
Alexey Petrushin
From: Alexey Petrushin <alexey.petrus...@gmail.com>
Date: Sat, 13 Oct 2012 17:08:54 -0700 (PDT)
Local: Sat, Oct 13 2012 8:08 pm
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks

Thanks for the help, especially you, Isaac, for such a detailed answer.

As far as I understand, it's possible to wrap the existing evented stream API
in a callback interface (with in-memory data buffers to handle the mismatch
between explicit and implicit control flow).
But it probably won't be worth it; it will be easier to just use it as it's
supposed to be used (with pipes) and wait until those changes land in 0.10.
The new API seems to be very similar to what I asked for.

P.S.

As for the question and why I need it: I'm working on an application that
uses custom streams and thought that maybe I could cheat and simplify my work
a little by not implementing the complex evented interface :).

I once used such an abstraction for working with streams in Ruby:

    to.write do |writer|
      from.read{|buff| writer.write buff}
    end

Files are opened and closed properly, and the buffer has a default size, so
the code is very simple to use (more details:
http://alexeypetrushin.github.com/vfs ).
Basically, by implementing just those two methods you get the ability to stream
from any stream into any stream (fs, s3, sftp, ...).

I tried to do something similar with asynchronous streams.


 
Mark Hahn
From: Mark Hahn <m...@hahnca.com>
Date: Sat, 13 Oct 2012 17:16:44 -0700
Local: Sat, Oct 13 2012 8:16 pm
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks

> There is no 'data' event.  There is a read() method, and a 'readable'
> event.  You call read() until it returns null, and then wait for a readable
> event to tell you it's time to read() more.

So, if we want to pump it at max rate we would run a tight loop to read and
write in the beginning and then on every readable event?   It seems like
more work and a lot messier compared to the old data event scheme.

Nathan Rajlich
From: Nathan Rajlich <nat...@tootallnate.net>
Date: Sat, 13 Oct 2012 17:19:57 -0700
Local: Sat, Oct 13 2012 8:19 pm
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks
Mark, to pump at max rate you'd use .pipe().


 
Mark Hahn
From: Mark Hahn <m...@hahnca.com>
Date: Sat, 13 Oct 2012 17:25:50 -0700
Local: Sat, Oct 13 2012 8:25 pm
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks

But pipe only works if the writes are to another stream.  If they are to a
db driver or something without pipe support then I have to do my own reads.
 Or am I missing something here?

Isaac Schlueter
From: Isaac Schlueter <i...@izs.me>
Date: Sat, 13 Oct 2012 17:31:51 -0700
Local: Sat, Oct 13 2012 8:31 pm
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks
Mark,

Well... yes.  If you want to siphon out the data as fast as possible,
and it's not going to a writable stream interface of some sort, then
you have to read() in a tight loop on every readable event.  That's
actually not much different than the 'data' event scheme.

Note that if you attach a 'data' event handler, then it'll do this for
you.  The backwards-compatible API is exactly the one you're used to.
The major difference is that, in 0.10, if you're using 'data' events,
then pause and resume actually work in a non-surprising way (ie, you
won't get 'data' events happening while it's in a paused state), and
all streams in core will have the same set of events and methods
(instead of each of them implementing 90% of the API in subtly
different ways).
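
To make the backwards-compatible route concrete, here is a sketch that feeds a stream into a non-stream consumer using `'data'` events with `pause()` and `resume()`. It assumes the 0.10-or-later behaviour described above, where no `'data'` events arrive while the stream is paused. `consume` and the `save(chunk, cb)` callback are hypothetical names standing in for something like a database insert.

```javascript
const { Readable } = require('stream');

// Hand each chunk to an async `save` function, one at a time.
function consume(input, save, callback) {
  let pending = 0;
  let ended = false;
  let failed = false;

  function maybeDone() {
    if (ended && pending === 0 && !failed) callback();
  }
  function fail(err) {
    if (failed) return;
    failed = true;
    callback(err);
  }

  input.on('data', function (chunk) {
    // Stop the flow until this chunk has been stored.
    input.pause();
    pending += 1;
    save(chunk, function (err) {
      pending -= 1;
      if (err) return fail(err);
      input.resume();
      maybeDone();
    });
  });
  // 'end' can arrive while the last save is still running.
  input.on('end', function () { ended = true; maybeDone(); });
  input.on('error', fail);
}

// Usage with a stand-in for a slow store.
const stored = [];
consume(Readable.from(['a', 'b', 'c']), function (chunk, cb) {
  setTimeout(function () { stored.push(String(chunk)); cb(); }, 5);
}, function (err) {
  if (err) throw err;
  console.log(stored.join(','));
});
```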


 
Mark Hahn
From: Mark Hahn <m...@hahnca.com>
Date: Sat, 13 Oct 2012 17:52:39 -0700
Local: Sat, Oct 13 2012 8:52 pm
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks

So using it in the backwards-compatible way doesn't cause any performance
loss?  If so I can choose which to use in every situation.


 
Isaac Schlueter
From: Isaac Schlueter <i...@izs.me>
Date: Sat, 13 Oct 2012 19:24:03 -0700
Local: Sat, Oct 13 2012 10:24 pm
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks
Mark,

The overall performance impact hasn't been fully established yet.  I
suspect that it'll be slight, but it's unclear whether it'll be an
improvement (owing most likely to greater hidden class optimization
from having all streams share more of the same code), or a regression
(owing to the fact that there's just more stuff being done).

The syscall/IO footprint is identical, though.  Past experience in
this area has shown that the total time spent in JS is usually a
pretty small part of the overall latency unless the code is very hot
or doing something very stupid.  We'll see soon.


 
Dominic Tarr
From: Dominic Tarr <dominic.t...@gmail.com>
Date: Mon, 15 Oct 2012 00:58:32 +0200
Local: Sun, Oct 14 2012 6:58 pm
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks

mark, just implement a stream shaped wrapper for your db client thing, (and
make sure you publish it to npm!)
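
One way such a stream-shaped wrapper might look, sketched with the `Writable` class from current Node.js. The `fakeDb` object and its `insert(row, callback)` method are invented for the example and do not refer to any real driver.

```javascript
const { Readable, Writable } = require('stream');

// Hypothetical stand-in for a database client with a callback-style insert().
const fakeDb = {
  rows: [],
  insert(row, callback) {
    setImmediate(() => {
      this.rows.push(row);
      callback(null);
    });
  }
};

// Wrap the client in a Writable so that any readable can be piped into it.
function createInsertStream(db) {
  return new Writable({
    objectMode: true,
    write(row, encoding, callback) {
      // The next row is not delivered until the insert has called back,
      // which is what gives pipe() its backpressure.
      db.insert(row, callback);
    }
  });
}

Readable.from([{ id: 1 }, { id: 2 }])
  .pipe(createInsertStream(fakeDb))
  .on('finish', function () {
    console.log('inserted', fakeDb.rows.length, 'rows');
  });
```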


 
Jake Verbaten
From: Jake Verbaten <rayn...@gmail.com>
Date: Sun, 14 Oct 2012 22:26:47 -0700
Local: Mon, Oct 15 2012 1:26 am
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks

https://github.com/Raynos/for-each-stream

```
forEach(stream, function (chunk) {
    /* insert chunk into database or wait for them all */
})
```

https://github.com/Raynos/write-stream#example-array

```
stream
    .pipe(toArray(function (chunks) {
        /* manipulate chunks then do database thing */
    }))
```

Both of those functions use pipe internally.


 
Alexey Petrushin
From: Alexey Petrushin <alexey.petrus...@gmail.com>
Date: Sun, 14 Oct 2012 22:46:45 -0700 (PDT)
Local: Mon, Oct 15 2012 1:46 am
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks

Not sure about this code:

    forEach(stream, function (chunk) {
        /* insert chunk into database or wait for them all */
    })

The classical `each` concept doesn't work in the async world. This loop, for
example, won't wait until the operation of inserting a chunk into the database
has finished (successfully or not).

It will consume memory if, for example, the stream is big and fast and the
database is slow to write to.

It should be something like:

    each(stream, function (chunk, next) {
        /* insert chunk into database or wait for them all */
        /* and call next() or next(err) when you finish and want the next chunk */
    })

Same with toArray - it will load all data into memory.
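
A possible shape for such an `each(stream, fn(chunk, next))` helper, sketched on top of `read()` and the `'readable'` event as they exist in current Node.js. This is an illustration of the idea rather than an existing module; the name `each` and its callback convention are assumptions taken from the snippet above.

```javascript
const { Readable } = require('stream');

// Pull one chunk at a time and wait for next() before reading another.
function each(input, worker, callback) {
  let busy = false;
  let ended = false;
  let finished = false;

  function done(err) {
    if (finished) return;
    finished = true;
    callback(err);
  }

  function pull() {
    if (busy || finished) return;
    const chunk = input.read();
    if (chunk === null) {
      // Nothing buffered: either the stream is over or more will come.
      if (ended) done();
      return;
    }
    busy = true;
    worker(chunk, function next(err) {
      busy = false;
      if (err) return done(err);
      pull();
    });
  }

  input.on('readable', pull);
  input.on('end', function () { ended = true; if (!busy) done(); });
  input.on('error', done);
}

// Usage: each chunk is handled only after the previous one has finished.
const seen = [];
each(Readable.from(['a', 'b', 'c']), function (chunk, next) {
  setTimeout(function () { seen.push(chunk); next(); }, 5);
}, function (err) {
  if (err) throw err;
  console.log(seen.join(','));
});
```

Because nothing calls `read()` while a chunk is being processed, the stream's internal buffer fills only up to its high water mark and the source is then left alone, which addresses the memory concern.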


 
Jake Verbaten
From: Jake Verbaten <rayn...@gmail.com>
Date: Sun, 14 Oct 2012 22:52:56 -0700
Local: Mon, Oct 15 2012 1:52 am
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks

> Classical `each` concept doesn't works in async world, this loop for
> example won't
> wait when operation of inserting chunk into database will be successfully
> (or not) finished.

> It will consume memory if for example stream is big and fast and database
> it written to slow.

If you want to apply backpressure simply return false from the forEach
iterator function.

If you want to catch errors simply `this.emit("error")` as `this` in the
callback is the writable stream.

forEach is the same as `stream.pipe(WriteStream(iterator))`

> Same with toArray - it will load all data into memory.

When you call toArray you want to load the entire thing into memory. That's
a choice.

 
khs4473
From: khs4473 <khs4...@gmail.com>
Date: Tue, 16 Oct 2012 10:47:12 -0700 (PDT)
Local: Tues, Oct 16 2012 1:47 pm
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks

> i'm not engaging with your strawman Bruno.

> i showed how we *actually* move data in node. this is not a debate, that's
> how it works. if anyone wants to use node, or write a module that has a
> stream that moves data, that's how they do it.

Wow - that's a really scary response.  Bruno's callback approach to
streaming (and unix's, BTW) actually provides a much more solid foundation
than your over-engineered streams.  Boom!  ; P

Kevin


 
Jeff Barczewski
From: Jeff Barczewski <jeff.barczew...@gmail.com>
Date: Wed, 17 Oct 2012 09:09:45 -0700 (PDT)
Local: Wed, Oct 17 2012 12:09 pm
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks

@isaacs I assume that your https://github.com/isaacs/readable-stream will
be the module to watch as this evolves.

Will this module be the way to use new interface with older versions of
node?

Thanks!

Jeff


 
Marco Rogers
From: Marco Rogers <marco.rog...@gmail.com>
Date: Wed, 17 Oct 2012 21:25:14 -0700 (PDT)
Local: Thurs, Oct 18 2012 12:25 am
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks

My understanding is that the readable-stream module is a good way to try
out these experimental api changes. It's a reference implementation so to
speak. But the plan is to integrate this new api into core and support it
fully. And it will be fully backwards compatible with the old api. So
eventually this module will be obsoleted for the newest version of node.
But yes you could continue to use it on older versions.

:Marco


 
Isaac Schlueter
From: Isaac Schlueter <i...@izs.me>
Date: Thu, 18 Oct 2012 12:34:07 +0100
Local: Thurs, Oct 18 2012 7:34 am
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks

> On Wednesday, October 17, 2012 9:09:46 AM UTC-7, Jeff Barczewski wrote:

>> @isaacs I assume that your https://github.com/isaacs/readable-stream will
>> be the module to watch as this evolves.

Marco is correct.  Also:

You can also watch the progress on the "streams2" branch in git.

>> Will this module be the way to use new interface with older versions of
>> node?

Yes, the idea would be that you can use the readable-stream module if
you want to use the new interface with older versions of node.

There's also a readable.wrap(oldStyleStream) method to use a
'data'-event style stream as the data source for a read()-style
stream.
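
A small sketch of `wrap()` in use, assuming the `Readable` class from current Node.js. The old-style source here is faked with a bare `EventEmitter` that emits `'data'` and `'end'`; `oldStream` and `collect` are made-up names for the example.

```javascript
const { EventEmitter } = require('events');
const { Readable } = require('stream');

// Gather everything from a read()-style stream into one string.
function collect(readable, callback) {
  const parts = [];
  readable.on('readable', function () {
    let chunk;
    while ((chunk = readable.read()) !== null) parts.push(chunk.toString());
  });
  readable.on('end', function () { callback(null, parts.join('')); });
  readable.on('error', callback);
}

// A fake old-style stream: it only knows 'data', 'end', pause() and resume().
const oldStream = new EventEmitter();
oldStream.pause = function () {};
oldStream.resume = function () {};

// wrap() turns it into a data source for a read()-style stream.
const wrapped = new Readable().wrap(oldStream);
collect(wrapped, function (err, text) {
  if (err) throw err;
  console.log(text);
});

oldStream.emit('data', Buffer.from('hello '));
oldStream.emit('data', Buffer.from('world'));
oldStream.emit('end');
```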


 