From: Isaac Schlueter
To: nodejs@googlegroups.com
Date: Sat, 13 Oct 2012 15:19:07 -0700
Subject: Re: [nodejs] Re: How to write stream chunk by chunk with callbacks

This is a good time to mention streams2, I think.

Caveat: There've been some discussions about this API. If you missed those discussions, I'm sorry, but we're not going to have them again. The base implementation is finished, and the integration with node-core is close to being released. This is a "telling" message, not an "asking" message :)

In 0.10, the Readable Stream interface is getting a massive overhaul. Other streams (Duplexes, Writable streams, etc.) are also getting revamped considerably, so that they use the same base classes and provide much more consistent events.

All the changes are being done such that the previous API continues to work. However, if you attach a 'data' event handler, or call pause() or resume(), then the stream switches into "old-mode", and will behave like old streams. (However, it'll behave appropriately, with pause() buffering like you almost always want it to, and so on.)

In the new interface, there is no pause/resume. There is no 'data' event. There is a read() method, and a 'readable' event. You call read() until it returns null, and then wait for a 'readable' event to tell you it's time to read() more. This is very symmetric to calling write() until it returns false, and then waiting for the 'drain' event to tell you to write() more.

In order to control how much you read, you can call read(n) with a number of bytes (or characters if you've done a setEncoding() call in the past) to return. If that many bytes aren't available, then it'll return null, and emit 'readable' when they are.

So, how do you tell the underlying system to pause? Simple. Just don't call read().
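
To make the read()/'readable' loop described above concrete, here is a minimal sketch written against the stream module as it exists in current Node.js releases. The helper name readAvailable, the 4-byte chunk size, and the use of PassThrough as a stand-in source are assumptions for illustration only, not part of the original message.

```javascript
const { PassThrough } = require('stream');

// Hypothetical helper: pull every complete `size`-byte chunk that is
// currently buffered. read(size) returns null once fewer than `size`
// bytes remain and the stream has not ended yet.
function readAvailable(input, size) {
  const chunks = [];
  let chunk;
  while ((chunk = input.read(size)) !== null) {
    chunks.push(chunk);
  }
  return chunks;
}

// PassThrough stands in for any readable source (a socket, a file, ...).
const input = new PassThrough();

// 'readable' only says that data is waiting; the consumer decides how
// much to take, and stops the flow simply by not calling read().
input.on('readable', () => {
  for (const chunk of readAvailable(input, 4)) {
    console.log('got %d bytes: %s', chunk.length, chunk.toString());
  }
});
input.on('end', () => console.log('no more data'));

input.write('abcdefghij');
input.end();
```

Nothing is consumed outside the 'readable' handler, so removing that handler is all it takes to "pause" the source.
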
If you're not consuming the bytes, then the buffer will fill up to a certain level, and stop pulling bytes in from the underlying systems, and TCP will do its job, or it'll stop reading the file, or whatever it is that this stream of data refers to. It's a pull-style stream, so it doesn't spray chunks at you.

Every readable and writable class that uses the base classes will automatically have high and low water mark control over their buffering, and be completely consistent in how they emit events. If you want to extend the classes, you can simply implement the asynchronous MyStream.prototype._read(n, callback) method, or the asynchronous MyStream.prototype._write(chunk, callback) method. (There are also base classes for Duplex, which does both, and for Transform, which turns the input into the output via a _transform(chunk, outputFunction, callback) method.)

So, regarding the OP here:

1. Your code is wrong. Mikeal's input.pipe(output) is the way to do it.

If you want to listen for an event when everything is done, then it's tricky in node 0.8 and before. In 0.10 and later, you'd do `output.on('finish', callback)`, and all Writable objects will emit a 'finish' event when you've called end() and the whole write buffer is cleared.

If you really wanted to not use pipe, here's how you'd do it with new streams:

    function flow(input, output, callback) {
      input.on('end', output.end.bind(output))
      output.on('finish', callback)
      f()
      function f() {
        var chunk
        while (null !== (chunk = input.read())) {
          if (false === output.write(chunk)) {
            // output is full: wait for 'drain' before reading any more
            output.once('drain', f)
            return
          }
        }
        // input is empty for now: wait until there is more to read
        input.once('readable', f)
      }
    }

With old streams, it's much trickier and harder to get right. Look at `util.pump` or Stream.prototype.pipe. There's also error handling and a few interesting edge cases, and the backwards compatibility requirement makes it a bit trickier.

Basically, just use the input.pipe(output) method, and relax :)

2. If you decide to use some other streaming data abstraction, then that's fine, but you are off the reservation as far as Node.js is concerned. If you ask node users for help, don't be surprised if they say "Just use streams and pipe()" and have no idea what your code is doing.

I actually think wheels *should* be reinvented from time to time, but you should probably not reinvent the wheel until you've at least tried the one everyone else is using, so you can make an informed decision about it.
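
As an illustration of the "just use pipe()" advice in point 1, together with the Transform base class and the 'finish' event mentioned earlier, here is a minimal sketch for current Node.js releases. Two things in it are assumptions rather than anything stated in this message. First, the UpperCase class, the received array, and the output sink are hypothetical names. Second, the hook signature used is _transform(chunk, encoding, callback), which is what released versions of Node.js expect and which differs from the draft _transform(chunk, outputFunction, callback) signature quoted above. Readable.from() also requires a reasonably recent Node.js.

```javascript
const { Readable, Transform, Writable } = require('stream');

// Hypothetical example class: upper-cases whatever text passes through it.
class UpperCase extends Transform {
  // Signature as shipped in released Node.js: (chunk, encoding, callback).
  _transform(chunk, encoding, callback) {
    // A second argument to callback is pushed to the readable side.
    callback(null, chunk.toString().toUpperCase());
  }
}

// Hypothetical sink that collects everything written to it.
const received = [];
const output = new Writable({
  write(chunk, encoding, callback) {
    received.push(chunk.toString());
    callback();
  },
});

// 'finish' fires once end() has been called and the write buffer is flushed.
output.on('finish', () => {
  console.log('done:', received.join(''));
});

// pipe() deals with backpressure and calls output.end() when the input ends.
Readable.from(['streams', '2']).pipe(new UpperCase()).pipe(output);
```

Only the transformation itself is custom here. Buffering, backpressure, and end-of-stream handling all come from the base classes and pipe().
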