Re: [nodejs] Node.js documentation about domains: what makes "throw" special in JS compared to Python/Ruby?


Forrest L Norvell

unread,
Apr 4, 2013, 11:40:54 PM
to nod...@googlegroups.com
On Thu, Apr 4, 2013 at 5:39 PM, Nicolas Grilly <nic...@garden-paris.com> wrote:
As for the way exception handling works, V8 is not that different from Python and Ruby. In Python and Ruby, you can throw an exception anywhere and catch it somewhere else, without "leaking references" or "creating some other sort of undefined brittle state", as long as your code correctly releases allocated resources in finally blocks. What makes JavaScript different?

JavaScript itself is no safer or more dangerous than Ruby or Python. However, putting everything in try/catch/finally clauses is expensive (i.e. they can't be fully optimized by V8). Also, wrapping asynchronous calls in try/catch is often exactly what you don't want to do if getting meaningful error messages is your goal. Domains are intended to be a relatively inexpensive mechanism to capture information about the cause of a crash across the length of an asynchronous call chain. They're not a general-purpose error recovery mechanism.

Realistically speaking, there's no way to write a general "async finally" in Node. Side effects are just too pervasive, and while careful code can ensure that resources are properly cleaned up, it's not a problem that can be automated. Node 0.8 and 0.10 domains included a domain.dispose() method that tries to clean up the EventEmitters bound to a domain, but it's heuristic and will be deprecated in 0.12.

F

Mark Hahn

unread,
Apr 4, 2013, 11:46:08 PM
to nodejs
Short version: Python and Ruby are usually not run asynchronously. Exceptions and async don't play well together.



--
--
Job Board: http://jobs.nodejs.org/
Posting guidelines: https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
You received this message because you are subscribed to the Google
Groups "nodejs" group.
To post to this group, send email to nod...@googlegroups.com
To unsubscribe from this group, send email to
nodejs+un...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/nodejs?hl=en
 
---
You received this message because you are subscribed to the Google Groups "nodejs" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nodejs+un...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Domenic Denicola

unread,
Apr 5, 2013, 12:36:43 AM
to nod...@googlegroups.com

On Thursday, April 4, 2013 11:40:54 PM UTC-4, Forrest L Norvell wrote:
However, putting everything in try/catch/finally clauses is expensive (i.e. they can't be fully optimized by V8).

This is slightly misleading. In the following code:

```js
try {
  doStuff();
} catch (e) {
  // handle somehow
}

function doStuff() {
  // do some stuff
}
```

the code inside `doStuff` is still optimized by V8. The only code that is not optimized is the code directly inside the `try` block, i.e. the call to `doStuff()` itself. I'm not sure optimizations matter, or are even possible, for code that simple, so if you follow a pattern like this, try/catch should have no performance impact.

Isaac Schlueter

unread,
Apr 5, 2013, 12:49:31 AM
to nodejs
I'd argue that throws in Ruby and Python are not safe, either!

Here's a simple example:

  function doSomething(array) {
    for (var i = 0; i < array.length; i++) {
      mightThrow(array[i]);
    }
  }

If you call `doSomething(list)` and `mightThrow` throws, there's no way
for you to know how many items in the list were processed. Precisely
*because* the exception jumps up the call stack, there's no way to
handle this without catching at *every* level, not just at the level
that can correct the error.

At best, it's expensive. At worst, it leads to indeterminate state,
and cases where resources are not properly cleaned up. In real-world
examples, I've seen fd leaks, sockets hung open until they timed out,
event listeners never removed, streams that never stop downloading,
etc. (And that was *before* coming to Node!)

PHP avoids this problem by tearing down the VM after each request, so
leaks aren't a real issue (there are obviously other issues with this
approach). Pure functional languages avoid the problem by having
isolated crash-only architectures.

The correct way to handle this in a stateful imperative language like
JavaScript is to kill the process and clean up all the resources.

Nicolas Grilly

unread,
Apr 5, 2013, 4:50:04 AM
to nod...@googlegroups.com
On Friday, April 5, 2013 5:40:54 AM UTC+2, Forrest L Norvell wrote:
However, putting everything in try/catch/finally clauses is expensive (i.e. they can't be fully optimized by V8).

I'm not sure I get your point about try/catch/finally being expensive. It looks like, in the current Node.js codebase, this is how domains catch exceptions and convert them to "error" events. If you need to match an exception to its current HTTP request in order to send a response, you have to use try/catch at some point. Correct?

Nicolas Grilly

unread,
Apr 5, 2013, 4:56:03 AM
to nod...@googlegroups.com
On Friday, April 5, 2013 5:40:54 AM UTC+2, Forrest L Norvell wrote:
Realistically speaking, there's no way to write a general "async finally" in Node. Side effects are just too pervasive, and while if you're careful you can ensure that resources are properly cleaned up, it's not a problem that can be automated.

I read everywhere that it's very difficult to write a kind of "async finally" in Node because side effects are too pervasive, resources will be leaked, etc.

In that case, why is Tornado's error handling strategy so radically different [1]? It is also an async server. It is also based on a dynamic language, Python, which is very comparable to JavaScript. But instead of crashing and restarting the process after every unexpected error, it sends an error page, closes the request/response, and goes on with the next loop iteration.


I would be very interested to understand why this strategy is okay for Tornado but not okay for Node.

Angel Java Lopez

unread,
Apr 5, 2013, 6:50:07 AM
to nod...@googlegroups.com
Hi!

Umm... I'm a total newbie to Tornado, but after browsing the code, they are "dancing the conga" ;-) to get that behavior.

See

and how its wrap function is used at:

It's an "ad-hoc" solution for the web request. And I think Python has programmatic access to the "current context", to be confirmed.

Angel "Java" Lopez
@ajlopez





Ben Noordhuis

unread,
Apr 5, 2013, 8:17:53 AM
to nod...@googlegroups.com
Isaac already explained it a few posts up. I'll replicate his example
here for posterity:

  function doSomething(array) {
    for (var i = 0; i < array.length; i++) {
      mightThrow(array[i]);
    }
  }

In Python:

  def do_something(items):
    for item in items:
      might_throw(item)

How many items have been processed after an exception? You don't know
unless you add *a lot* of error handling everywhere. That kind of
error handling is very easy to screw up and very difficult to debug.
(And you will screw up. I don't believe in infallible programmers.)

Tornado (and Python in general, async or not) are just as susceptible
to this issue as node.js is.
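To make the bookkeeping burden concrete, here is a hedged sketch of what `doSomething` has to grow into just to report progress; `mightThrow` is a stand-in that fails on negative items:

```javascript
// `mightThrow` is a hypothetical stand-in: it fails on negative items.
function mightThrow(item) {
  if (item < 0) throw new Error('bad item: ' + item);
}

function doSomething(array) {
  var processed = 0;
  try {
    for (var i = 0; i < array.length; i++) {
      mightThrow(array[i]);
      processed++;
    }
  } catch (err) {
    // Without a catch like this at every level that cares,
    // nobody knows how far the loop got before the throw.
    return { processed: processed, error: err };
  }
  return { processed: processed, error: null };
}
```

Multiply this by every call site that needs to know its own progress, and the "lot of error handling everywhere" becomes apparent.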

Nicolas Grilly

unread,
Apr 5, 2013, 2:09:46 PM
to nod...@googlegroups.com, i...@izs.me
On Friday, April 5, 2013 6:49:31 AM UTC+2, Isaac Schlueter wrote:
I'd argue that throws in Ruby and Python are not safe, either!

It looks like we agree on the fact that there is nothing fundamentally different in the way `throw` works in JavaScript as a language compared to Python and Ruby.

Does that mean that this paragraph of the Node.js documentation about domains:

"By the very nature of how throw works in JavaScript, there is almost never any way to safely "pick up where you left off", without leaking references, or creating some other sort of undefined brittle state."

could be rewritten like this (without "in JavaScript"):

"By the very nature of how throw works, there is almost never any way to safely "pick up where you left off", without leaking references, or creating some other sort of undefined brittle state."

This is important because the current phrasing makes a beginner like me think there is something specific to Node.js that justifies the "restart the process on unexpected error" strategy. But in fact it could be general advice, applicable to other technologies like Tornado as well. Correct?

Nicolas Grilly

unread,
Apr 5, 2013, 2:50:19 PM
to nod...@googlegroups.com
On Friday, April 5, 2013 2:17:53 PM UTC+2, Ben Noordhuis wrote:
Isaac already explained it a few posts up.  I'll replicate his example
here for posterity:

  function doSomething(array) {
    for (var i = 0; i < array.length; i++) {
      mightThrow(array[i]);
    }
  }

In Python:

  def do_something(items):
    for item in items:
      might_throw(item)

How many items have been processed after an exception?  You don't know
unless you add *a lot* of error handling everywhere.  That kind of
error handling is very easy to screw up and very difficult to debug.
(And you will screw up.  I don't believe in infallible programmers.)

Tornado (and Python in general, async or not) are just as susceptible
to this issue as node.js is.

I agree that, in your examples, in Node and in Python, if an exception is thrown, your items are in an undetermined state.

But I have two objections/questions:

1/ If the list of items is shared global state, I agree that this state is corrupted and the best strategy is probably to restart the process. But most applications don't have a lot of shared global state, and the code managing it is usually well reviewed. On the contrary, if the corrupted data is attached to the current request, then there is no problem with catching the error in a domain, returning an HTTP 500 response, and going back to the event loop. Do you agree? This is what Tornado does by default. Let's say the code managing global state is very short and easy to review. In that case, most bugs will happen in the code manipulating data attached to the current request. In that context, if an error happens, it's perfectly okay to go on serving the next request.

2/ Your reasoning, as I understand it, is: if there is an unexpected error, then the application state may be corrupted, so we have to restart the process. It tends to suggest that "unexpected error" equals "corrupted state", which justifies the "restart the process" strategy. But it's perfectly possible for badly behaving code to silently corrupt the application state without raising any exception. By that, I mean that restarting the process by no means guarantees a clean state. We can distinguish three kinds of errors:

a) Unexpected errors that corrupt only the request state -> They are caught by the domain, which can safely clean up the request and response data and return to the event loop.
b) Unexpected errors that corrupt the global state and raise an exception -> They are caught by the domain, and I agree that, in this situation, restarting the process is the best option.
c) Unexpected errors that *silently* corrupt global state without raising anything -> They cannot be caught by the domain error handler.

It is very difficult, maybe almost impossible, to distinguish (a) and (b) in the domain error handler. Because of this, the current official advice is to restart the Node.js process, which is the best error handling strategy for case (b). But the best strategy for case (a) is to just clean up the request and response data and go on.

I would agree that restarting the process is the best strategy if it removed all problems with global state. But it doesn't. I think that many bugs involving global state, maybe most, are silent and do not raise any exception (this is my case (c)). Because of this, restarting the process is just a bandage covering a small part of global state issues.

We are making the most important cause of errors, case (a), very difficult to recover from, just to incompletely fix some issues with global state in case (b).

Do you agree with some part of the above reasoning?

Cheers,

Nicolas 

Forrest L Norvell

unread,
Apr 5, 2013, 8:02:04 PM
to nod...@googlegroups.com
Ultimately this is a design question, and a matter of what you as a developer are comfortable with. Almost none of this discussion is particularly tied to Node, above and beyond the fact that try-catch and throw don't work well in Node's style of async, callback-driven programming, where a function is invoked on a different call stack than the one where it was created. If you're confident that you can construct your domains in such a way that you can clean up after failed requests, then sure, use domains as a way to recover and continue.

In general, though, a conservative design is going to look like what Isaac describes, where you capture information about an error, shut down the process, and use something like mon or supervisord to ensure that a new process will be spawned. Environments that do support that kind of robust error recovery typically impose design decisions, like Erlang's immutability and OTP's crash-only design, that reduce or eliminate the risk of leaving the process in an undefined state. It's a hard problem, and not one that Node core tries to solve.

F


Isaac Schlueter

unread,
Apr 6, 2013, 1:53:11 PM
to Nicolas Grilly, nodejs
On Fri, Apr 5, 2013 at 11:09 AM, Nicolas Grilly
<nic...@garden-paris.com> wrote:
> Does it mean that this paragraph of Node.js documentation about domains:
>
> "By the very nature of how throw works in JavaScript, there is almost never
> any way to safely "pick up where you left off", without leaking references,
> or creating some other sort of undefined brittle state."
>
> could be rewritten as is (without "in JavaScript"):
>
> "By the very nature of how throw works, there is almost never any way to
> safely "pick up where you left off", without leaking references, or creating
> some other sort of undefined brittle state."

Yeah, probably, but I don't want to get into a whole big thing about
how throws in Erlang or Haskell or Lisp are actually perfectly safe,
or can be made safe in Clojure or Scala, or if only JavaScript had
typed catches, we could do this or that. Ugh. Boring tedious useless
discussions, those are.

Node is for running JavaScript, so JavaScript's limitations are Node
program limitations. *JavaScript* throws are fundamentally unsafe if
they jump over a branch in the execution-flow graph. To the extent
that "language X" is like JavaScript, it will be similarly unsafe.
(Ruby and Python are both very similar to JavaScript - stateful,
imperative, syntaxy, semi-functional, garbage collected.)
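A minimal illustration of a throw jumping over a branch of the execution-flow graph: the cleanup call below is simply never reached (`openResource`/`closeResource` are hypothetical stand-ins for any real resource):

```javascript
var open = 0; // count of live resources

function openResource() { open++; return {}; }
function closeResource(r) { open--; }

function work(shouldFail) {
  var r = openResource();
  if (shouldFail) {
    throw new Error('boom'); // jumps straight past the close below
  }
  closeResource(r); // only reached on the happy path
}

try {
  work(true);
} catch (e) {
  // The error is "handled", but `open` is now 1: the resource leaked,
  // and only the code inside `work` could have known to release it.
}
```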

Jason Shinn

unread,
Jan 3, 2014, 12:02:18 PM
to nod...@googlegroups.com, Nicolas Grilly, i...@izs.me
I'm resurrecting this discussion because it's the most useful deep exploration at the top of Google's search results for why try-catch is unsafe per the warning in node's documentation.

I've been bothered by a seeming disconnect between that warning, stated in absolute terms, and simple constructs like the following, which I see in examples all over the place:

try {
  var parsed = JSON.parse(mightNotBeValidJson);
}
catch (e) {
  // whoops, it wasn't valid, but this is the only way I could know that
  // without using an entirely different parser that fails a little more
  // gracefully. Further, I can handle the fact that it wasn't valid JSON
  // without nuking the current process from orbit; it's not the only way
  // to be sure.
}

... try/catch *can* be used in such situations without creating havoc, can it not? I'm about six weeks into Node, and I've mostly worked under that assumption, but I've continued to be bothered by the docs' prohibition on trying to recover from such events.

If my assumption is in fact correct, I think the statement that "it's almost never safe" is a little far-reaching. It would be much more accurate to say that try/catch is very easy to misuse, and is mostly appropriate when it contains code that is strictly synchronous and whose resources and dependencies are well known and simply managed. That is to say, a limited but extensive subset of the work done in Node. Beyond that, let people dig their own holes; this is programming, and if we couldn't do anything dangerous it wouldn't be any fun :)
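One common way to keep that kind of try/catch contained is a small synchronous wrapper (a sketch, not an official API):

```javascript
// Returns { ok, value } on success or { ok, error } on failure,
// so callers never touch try/catch themselves.
function tryParseJson(text) {
  try {
    return { ok: true, value: JSON.parse(text) };
  } catch (err) {
    // JSON.parse is synchronous, so this catch can't swallow
    // anything but the parse failure itself.
    return { ok: false, error: err };
  }
}
```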

Forrest L Norvell

unread,
Jan 3, 2014, 12:15:15 PM
to nod...@googlegroups.com
The distinction is that JSON.parse is synchronous, and thus try-catch works perfectly well with it. Also, as Isaac said months ago, try-catch starts to get iffy when code in the try block has multiple control-flow branches that could get skipped over by a throw. This way of using try-catch is simple and therefore safe.

It's unfortunate that a built-in JavaScript function that performs I/O (deserialization / unmarshalling) doesn't follow node's I/O convention of using a callback, but more because it's inconsistent (and therefore confusing) than because it's dangerous. If you use JSON.parse the way you describe, it's the best way I know of to deal with that potential error.
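For consistency with node's error-first convention, some codebases wrap JSON.parse in a callback-style helper; a hypothetical sketch:

```javascript
// Error-first callback wrapper around the synchronous JSON.parse.
function parseJson(text, callback) {
  var value;
  try {
    value = JSON.parse(text);
  } catch (err) {
    return callback(err);
  }
  // Call back outside the try, so a throw from the callback
  // itself isn't mistaken for a parse error.
  callback(null, value);
}
```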

F

Jason Shinn

unread,
Jan 3, 2014, 12:44:43 PM
to nod...@googlegroups.com
Thank you.  My main point is that I've been wandering extensively around the node-verse for the past 6 weeks and this is the first time I've ever seen a definitive statement that try/catch is actually safe in that (or a similarly simple) context.  What you say always made intuitive sense to me, but it's a little daunting to run into the dire-sounding warning early on in my excursions into node.

Sam Roberts

unread,
Jan 3, 2014, 1:29:44 PM
to nod...@googlegroups.com
On Fri, Jan 3, 2014 at 9:44 AM, Jason Shinn <jms...@gmail.com> wrote:
> Thank you. My main point is that I've been wandering extensively around the
> node-verse for the past 6 weeks and this is the first time I've ever seen a
> definitive statement that try/catch is actually safe in that (or a similarly
> simple) context.

Where did you hear it suggested that try/catch isn't safe?

It's perfectly acceptable, and recommended, to handle errors in node
when you know enough about the context to be able to. Node even
includes a feature to do so, domains; see the API docs on nodejs.org.

Blithely ignoring errors is a bad idea in node, or in any other dynamic
language where fundamental problems like invalid syntax, non-existent
modules during require, attempts to call undefined methods or
functions, servers failing to listen because a port is in use, etc.,
are all errors that can be caught.

Cheers,
Sam

Jason Shinn

unread,
Jan 3, 2014, 2:16:55 PM
to nod...@googlegroups.com
I read it in the node documentation, as did the OP, where the exact quote is reproduced.  It specifically is on the docs page for domains: http://nodejs.org/api/domain.html#domain_warning_don_t_ignore_errors.

It's the most prominent instruction we have regarding how to handle thrown errors, and it says to kill the process completely (in the context of that page: use domains and cluster to kill the worker process and spin up a new one). That's great, and it's the right approach in many situations, but it's not universal. Perhaps it's not much publicized when you might want to use a simple try/catch and continue on without worry because, according to nodejitsu (http://docs.nodejitsu.com/articles/errors/what-is-try-catch), the only place in node core where it's really used is the example I gave, JSON.parse(). But inherent in that example are the reasons why you generally want to take more care and why, in some simpler situations, you may not need to.

As I said, my main point is that the warning, as stated, doesn't really provide a basis for learning why, and subsequently for intelligently using and handling errors.

Sam Roberts

unread,
Jan 3, 2014, 2:50:19 PM
to nod...@googlegroups.com
On Fri, Jan 3, 2014 at 11:16 AM, Jason Shinn <jms...@gmail.com> wrote:
> I read it in the node documentation, as did the OP, where the exact quote is
> reproduced. It specifically is on the docs page for domains:
> http://nodejs.org/api/domain.html#domain_warning_don_t_ignore_errors.

Fair enough, that's talking about errors in a fairly global scope,
where you don't know what the cause is, and you can't sensibly
recover.

> nodejitsu (http://docs.nodejitsu.com/articles/errors/what-is-try-catch) the
> only place in node core we really use it is the example I gave,
> JSON.parse().

And with the fs APIs, it's pretty common to try to do something with a
file if it exists, and something else if it doesn't. And require:
it's pretty common to require an optional module, use it if it's
there, and otherwise do without.

And robust clients reconnect .on('error',..) when connections are reset.

Etc.

I don't see node as any different from any other language in all this.
Though I think the node core team has been frustrated by naive users
with crashing apps deciding "hey, it will be more robust if I just
ignore all errors!", which is a terrible idea. Thus the docs. Since
you are actually thinking about this, you clearly aren't that kind of
user, though.

> As I said, my main point is that the warning, as stated, doesn't really
> provided a basis for learning why and, subsequently, intelligently using and
> handling errors.

I'm not going to claim the node docs are great. But they do accept PRs!

Sam

Jason Shinn

unread,
Jan 3, 2014, 5:07:14 PM
to nod...@googlegroups.com
Sure, and perhaps this is a bigger deal for people like myself, coming from PHP, where the short-running process hides plenty of sins and robust error handling is cool and useful and oftentimes ignored.