Should stream.pipe forward errors?

4.659 visualitzacions
Ves al primer missatge no llegit

Jeff Barczewski

no llegida,
28 de nov. 2012, 12:10:5128/11/12
a nod...@googlegroups.com
I was recently testing some stream piping like

rstream
  .pipe(foo)
  .pipe(bar)
  .on('error', function (err) {
    // handle the error
  });

and I found out that currently stream.pipe does not forward
the 'error' events, so if you wanted to handle the
errors you would need to do something like

rstream
  .on('error', handleError)
  .pipe(foo)
  .on('error', handleError)
  .pipe(bar)
  .on('error', handleError);

or possibly rely on domains to deal with the error, but 
I would prefer to leave domains for other more 
catastrophic errors and just deal with errors at the 
end of the pipe chain.

The current node.js code checks to see if the source has any error listeners and if there are none then it throws an Error. If there was an error listener on the source then it will be given the error to deal with.


So to all the experienced streams users out there, 

1. How do you believe it should work? (or is there a better way)
2. Is there any edge cases that we need to consider if we were to implement as I described above?


Thanks in advance for your input!

Jeff

Marco Rogers

no llegida,
28 de nov. 2012, 16:24:1128/11/12
a nod...@googlegroups.com
Yeah I wish there was a better story for this. It's easy to forward errors. It could become difficult to determine in what context the error was thrown.

:Marco

Jeff Barczewski

no llegida,
28 de nov. 2012, 17:05:5028/11/12
a nod...@googlegroups.com
Agreed. Especially if the error happens after the initial connection (which is probably when most would occur), then it won't be caught by any try/catch so it would have to rely on domains. 

If you have setup an error handler then it is just nice to deal with it there in one spot. 

Mikeal Rogers

no llegida,
28 de nov. 2012, 19:47:5528/11/12
a nod...@googlegroups.com
with domains, these will all get trapped in the same domain because they are all event emitters, if they throw.

i tried forwarding errors through pipe chains but it ended up being problematic and i've since backed off.

--
Job Board: http://jobs.nodejs.org/
Posting guidelines: https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
You received this message because you are subscribed to the Google
Groups "nodejs" group.
To post to this group, send email to nod...@googlegroups.com
To unsubscribe from this group, send email to
nodejs+un...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/nodejs?hl=en?hl=en

Marco Rogers

no llegida,
28 de nov. 2012, 19:49:5328/11/12
a nod...@googlegroups.com
What was problematic about it? Also I though event emitters had to be explicitly added to domains.
--
Marco Rogers
marco....@gmail.com | https://twitter.com/polotek

Life is ten percent what happens to you and ninety percent how you respond to it.
- Lou Holtz

Dan Milon

no llegida,
28 de nov. 2012, 19:48:4928/11/12
a nod...@googlegroups.com
Technically you could explicitly add each stream to its own domain.

danmilon.

On 11/29/2012 02:47 AM, Mikeal Rogers wrote:
> with domains, these will all get trapped in the same domain because
> they are all event emitters, if they throw.
>
> i tried forwarding errors through pipe chains but it ended up
> being problematic and i've since backed off.
>
> On Nov 28, 2012, at November 28, 20121:24 PM, Marco Rogers
>> nod...@googlegroups.com <mailto:nod...@googlegroups.com> To
>> unsubscribe from this group, send email to
>> nodejs+un...@googlegroups.com
>> <mailto:nodejs+un...@googlegroups.com> For more options,

Dominic Tarr

no llegida,
28 de nov. 2012, 23:52:4028/11/12
a nod...@googlegroups.com
Thinking about this, usually each stream produces errors in a special way,
normal javascript errors, (like property of null, etc) throw instead of emitting,
so there is no way to catch the error. domains fit this case.

Usually, only streams that actually do IO emit errors, which is mostly the first and/or last stream in the pipeline. maybe parsing errors in the middle, though.

Of course, there are many ways use streams, can you give some concrete examples of a particular problem you are solving with streams where passing the errors on is desirable?
or how you would like to catch the errors and why it's best?

Martin Cooper

no llegida,
29 de nov. 2012, 0:54:2929/11/12
a nod...@googlegroups.com
On Wed, Nov 28, 2012 at 2:05 PM, Jeff Barczewski <jeff.ba...@gmail.com> wrote:
Agreed. Especially if the error happens after the initial connection (which is probably when most would occur), then it won't be caught by any try/catch so it would have to rely on domains. 

If you have setup an error handler then it is just nice to deal with it there in one spot. 

That depends on what you're going to do with it. As a simple example, if I create a read stream on a tarball, pipe it through Gunzip, and pipe that through Untar, I may well be interested in reporting to the user which one of those things messed up if something went wrong. What I *don't* want to have to do is muddle through inconsistently constructed Error objects to try to figure that out for myself, if there's only one place to put my error handler.
 
In other words, I *like* the fact that the error handlers are interspersed within my pipe setup, because it gives me flexibility in both determining, and reporting, what went wrong. If I don't want to make use of that flexibility, then I can reuse the same handler, as in your example. But I have that choice.

--
Martin Cooper

Jeff Barczewski

no llegida,
29 de nov. 2012, 11:51:1829/11/12
a nod...@googlegroups.com
@Dominic Yes, IO errors would be primary need, but also some pass-through streams that parse data could also emit errors (if the data wasn't as expected) which we would want to handle at the final error handler.

So a concrete example using some simple pass-through streams:

rstream                            // input stream coming in to http server
  .pipe(digestStream)              // calculates the sha1
  .pipe(zlib.createGzip())         // compress the data
  .pipe(redisWStream)              // store it in db
  .on('error', handleError)        // will respond back to user with appropriate message for error
       .on('end', storeDigestMetaInDB); // record the digest in meta info in db  
 
Since I did not add any other error handlers at the individual levels, but since I do have one at the end of the chain, it would seem logical, that any error emitted in the chain would end up in the final error handler. The pipe is composing these streams together, so it would seem that errors should propagate through as well.

If I had not added a final error handler at all, then you have no choice but to just throw and let the domain catch it.

Isaac Schlueter

no llegida,
29 de nov. 2012, 11:52:3729/11/12
a nodejs
The biggest problem with forwarding errors happens when you have
duplex streams that forward to one another, and as Martin points out,
streams that emit errors in different circumstances.

For example, let's say that you added something like this in the
pipe() function:

src.on('error', function(er) {
dest.emit('error', er);
});

What happens when you do this?

// encryptor service
net.createServer(function (socket) {
socket.pipe(new Encryptor()).pipe(socket);
});

If the socket emits error, it forwards to the encryptor, which
forwards it to the socket, and it's an infinite loop. Now every
encryption error is also a RangeError!

You *could* make this work by keeping track of whether a stream's
error events are already forwarded, but that's just more state
tracking and complexity, and it ends up being somewhat
indeterministic. For example, these two servers would be
error-forwarded differently, though they should be semantically
equivalent:

net.createServer(function (socket) {
// all errors emit on socket
var e = new Encryptor();
e.pipe(socket).pipe(e);
});

net.createServer(function (socket) {
// all errors emit on encryptor
var e = new Encryptor();
socket.pipe(e).pipe(socket);
});



To Martin's point, this also obscures the source of the error. Though
it's more verbose, stuff like this is actually quite important:

fs.createReadStream(tarball)
.on('error', handleFileReadError)
.pipe(gzip.Gunzip())
.on('error', handleUnzipError)
.pipe(tar.Extract({ path: targetPath }))
.on('error', handleTarExtractError)
.on('close', cb)

All those different errors have different information, metadata, etc.,
and in many cases, if you don't know the object that emitted them,
then you don't have a lot of insight into what error message you
should print out. ("Your program says 'illegal data', what's going
wrong?" Good luck debugging that!)


The moral of the story is that you must either attach error listeners
to all streams you touch, or accept that they will throw (and perhaps
gather all throws into a domain). For example, if you really *wanted*
to only have a single error handler you could do this:

var d = domain.create();
d.on('error', handleAllErrors);
d.run(function() {
fs.createReadStream(tarball)
.pipe(gzip.Gunzip())
.pipe(tar.Extract({ path: targetPath }))
.on('close', cb);
});

This is actually not so bad. Domains add a bit of metadata to the
error object, so it's not too hard to figure out whether it was an
"organic" throw, or an error event that was emitted by some object.

Jeff Barczewski

no llegida,
29 de nov. 2012, 11:56:5729/11/12
a nod...@googlegroups.com
@Martin That is a good point about if I have source error listener then I shouldn't need to forward the error.

So maybe the logic could be the following for errors in a piped stream:

 1. if any error listener on source, emit there only
 2. else if any error listener on dest, forward the error to the dest (where that error handler can use it)
 3. else throw err


Currently stream.pipe does 1 and 3, so we would be adding item 2.

Jeff Barczewski

no llegida,
29 de nov. 2012, 12:50:3929/11/12
a nod...@googlegroups.com,i...@izs.me
@Isaac You bring up a good point about the duplex streams, I can see how that could get hairy real quick.

Your domains example is nice, so maybe that is the best way to handle error in a single place.


For completeness and finishing the discussion around this, I can give one last idea. Since pipe takes an options object, what if we added a new flag like 'forwardError' which defaults to false, but if it is set true then would propagate the error if there is a destination error handler? A user would only want to do this on one directional streams.

The resulting code would look like:

rstream                                            // input stream coming in to http server
  .pipe(digestStream, { forwardError: true })      // calculates the sha1
  .pipe(zlib.createGzip(), { forwardError: true }) // compress the data
  .pipe(redisWStream, { forwardError: true })      // store it in db
  .on('error', handleError)        // will respond back to user with appropriate message for error
  .on('end', storeDigestMetaInDB); // record the digest in meta info in db  
 

It is not nearly as clean with the extra option to each pipe.


Is this of any use, being able to selectively choose to forward for the cases when you know you are using single direction streams? OR should we just steer always everyone towards using domains (or individual error handlers)?


I'm thinking that maybe the domains approach is better since it is more foolproof, and you don't have to worry about whether duplex and you catch all errors that are thrown (not just the emitted ones).


Any last thoughts before we lay this to rest?


Thanks to everyone for all your input!

Jeff

Isaac Schlueter

no llegida,
29 de nov. 2012, 16:41:1129/11/12
a Jeff Barczewski,nodejs
Since pipe is used in a lot of very hot areas, and is a royal pain to
make changes to, I'd suggest just leaving it as-is for now. That pipe
option object is kind of a wart in my opinion, I'd rather not make it
bigger.

Jeff Barczewski

no llegida,
29 de nov. 2012, 17:44:1229/11/12
a nod...@googlegroups.com,Jeff Barczewski,i...@izs.me
Understood. Thanks for all the discussion. 

Using domains appears to be a good solution in addition to using individual listeners.

Marco Rogers

no llegida,
30 de nov. 2012, 0:49:2130/11/12
a nod...@googlegroups.com
This is all really good info and I'm not gonna argue. I still think it'd be nice if there was a nicer shorthand to tell one emitter to forward certain events to another emitter. So at least it's less of a pain to do it manually. I kinda don't think there is a good shorthand though. Even adding it to the options hash every time isn't much less onerous than just doing multiple on('error', ...).

:Marco

PS - I'm purposely not commenting on the example using domains. I don't have anything constructive to say.

Isaac Schlueter

no llegida,
30 de nov. 2012, 11:55:2130/11/12
a nodejs
Marco,

Easily monkeypatched/userlanded:

Stream.prototype.pipeErr = function(dest, opt) {
var fw = dest.emit.bind(dest, 'error')
this.on('error', fw)
var self = this
dest.on('unpipe', function (src) {
if (src === self)
dest.removeListener('error', fw)
})
return this.pipe(dest, opt)
}

Of course, that's not very flexible, since it only forwards errors,
not anything else.

In a general way, you can also do:

emitter.emit = otherEmitter.emit.bind(otherEmitter)

to steal ALL events. (Not very helpful with streams, but there it is.)


If you want a shorter way to do it, I mean, there's also this kind of stuff:

src.on('error', dest.emit.bind(dest, 'error'))
.pipe(dest)
.on('blerg', src.emit.bind(src, 'destBlerg'))

If an options flag isn't terse enough for you, maybe you could sketch
what kind of API you're thinking of, or what problem you're actually
trying to solve.

> PS - I'm purposely not commenting on the example using domains. I don't have
> anything constructive to say.

This guy said it well:
https://twitter.com/afoolswisdom/status/273098921532878848

If you have something destructive to say about domains, go for it.
It's an experimental feature. Destruction of incorrect features is
part of the construction of correct features. We can't make it better
if you keep destructive comments to yourself :)

George Snelling

no llegida,
8 de febr. 2014, 1:01:228/2/14
a nod...@googlegroups.com
Tickling this thread since I just wasted a few hours scratching my head expecting code that looked a lot like what Jeff wrote below to work. It doesn't.

Kind Node Keepers: the streams docs could use some attention and examples on error handling in a multi-step stream pipeline.  

Thanks for your good work,

George 
Respon a tots
Respon a l'autor
Reenvia
0 missatges nous