Domains as thread locals

Alvaro Mouriño

Aug 29, 2013, 5:20:04 AM
to nod...@googlegroups.com
Hi list,

I started working with NodeJS quite recently. Even though I was a
little skeptical about it at first, after using it intensively for the
last couple of months I realized it's an amazing piece of technology.
The only thing that I miss from other frameworks is the concept of
thread-locals.

After some time researching this, a friend told me that this
behavior could be achieved using domains [0]. I built a small
framework, Kolba [1], as a proof of concept of how I see this being
implemented. I also wrote an article [2] explaining this idea and
presenting some questions that I have.

The main question that I want to ask here is: What am I missing? Why
is this undocumented? Are there any edge cases that I don't see, which
make this idea non-viable?

Actually, the biggest problem I'm facing with the use of domains is the
suppression of exceptions. I am probably using them wrong, which
leads to errors being silenced, but that has nothing to do with the
emulation of thread-locals.

Thanks,

[0] http://nodejs.org/api/domain.html
[1] https://github.com/tooxie/kolba
[2] http://alvaro.mourino.net/2013/08/29/lets-build-a-framework.html

--
Alvaro Mouriño

Alvaro Mouriño

Aug 29, 2013, 5:39:56 AM
to nod...@googlegroups.com
Just for the record, I am fully aware that NodeJS is single-threaded;
that's basically why there is no such thing as thread-locals. Maybe I
should refer to them as "request locals". I just decided to use a
terminology we are all familiar with.

--
Alvaro Mouriño

Juraj Kirchheim

Aug 29, 2013, 7:24:43 AM
to nod...@googlegroups.com
I'm not sure this is the best way of tackling things. Thinking of the
process (in the broader sense of the term) that leads to creating the
response to a request is a very limiting perspective. And I fail to
see a critical advantage. Being able to do
`require('kolba').getCurrentRequest();` ain't it. All other problems
that domains may introduce aside, it smells like a recipe for high
coupling.

Rather than throwing ideas from multi-threaded synchronous programming
at the problem, I suggest using something that requires no fancy
tricks: basic OOP.

1. The code that handles a single request is bundled in an object
instantiated only for the purpose of handling that very request (you
could potentially pool those for reuse, but that's more of a
performance optimization). The request handler code accesses the
request as `this.request` and can store all other state specific to
handling that request in the `this` scope.
Such request handlers can exist in the same domain and communicate
with one another if necessary (prioritization, synchronization, direct
communication between two long polls etc.).

2. Follow `Tell, Don't Ask`. Rather than querying the global namespace
for some state, why is the request not passed in as a parameter? If
the whole thing needs to be passed in at all. You should build code
that consumes a minimal interface (you rarely need the full request,
often just the query information or just the upstream) and does
something useful with it. You can build the awesomest ever
functionality - if it's built against
`require('kolba').getCurrentRequest();` and I cannot or don't want to
use Kolba for some reason or another, I cannot use it.

This is a sketch of how things could look:
https://gist.github.com/back2dos/6376645
Such an approach can run on top of an arbitrary router.
And theoretically you can build all sorts of fancy stuff that spits
out complex handlers with error handling and what not from very little
input (say mongoose schema object + some ACL rule). That can be
plugged into such a router. But it needn't. People can still build
handlers from scratch or register raw handlers with the underlying
router. Composition FTW!!!
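
Independently of the gist linked above, points 1 and 2 can be sketched in a few lines of plain JavaScript. This is only an illustration: the class name `GreetingHandler`, the `greetingFor` helper, and the shape of the request object are hypothetical and are not taken from the gist or from Kolba.

```javascript
// Hypothetical handler: one instance per request, all state lives on `this`.
class GreetingHandler {
  constructor(request) {
    this.request = request;      // the request being handled
    this.startedAt = Date.now(); // any other per-request state
  }

  // "Tell, Don't Ask": the helper receives only the minimal interface it
  // needs (the query), instead of looking up a global request.
  static greetingFor(query) {
    return 'Hello, ' + (query.name || 'stranger');
  }

  handle() {
    return GreetingHandler.greetingFor(this.request.query);
  }
}

// Two requests in flight never share state: each has its own handler object.
const a = new GreetingHandler({ query: { name: 'Alvaro' } });
const b = new GreetingHandler({ query: {} });

console.log(a.handle()); // prints "Hello, Alvaro"
console.log(b.handle()); // prints "Hello, stranger"
```

Because `greetingFor` depends only on a query object, it could be reused under any router or framework, which is the decoupling argument made in point 2.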

Regards,
Juraj

Kevin Swiber

Aug 29, 2013, 7:31:30 AM
to nod...@googlegroups.com
Hey Alvaro,

I think the big difference here is that variables are not necessarily domain-local and can be referenced outside a domain or even in multiple domains.  This means that bad state can leak and cause an exception domino effect.  In the Web server scenario, this could mean killing all currently executing requests in a cluster.

You could certainly structure your application/module to mimic "request-local" behavior to an extent, but this isn't a pattern or guarantee Node is pushing out of the box with domains.

The big benefit of domains is capturing execution state on uncaught exceptions, potentially allowing for a graceful shutdown and more helpful logging.  That's how I'm using domains in Pipeworks[1] (an execution pipeline tool) and more recently, Argo[2] (an HTTP reverse proxy and origin server for APIs).  So far, so good.

In short, the idea of "request-local" variables isn't non-viable, just nearly impossible to guarantee; you have to rely on module dependents using the right patterns for your framework.
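
The error-capturing use of domains described above might look roughly like the sketch below. It assumes the core `domain` module is available (it is deprecated in later Node releases but still ships). The `runGuarded` helper, the `caught` array, and the label are hypothetical names for illustration, and this is not how Pipeworks or Argo are actually implemented.

```javascript
const domain = require('domain');

const caught = []; // errors seen by a domain handler, kept for inspection

// Run `work` inside a fresh domain so that errors thrown later, from
// asynchronous callbacks started by `work`, all reach one handler.
function runGuarded(label, work) {
  const d = domain.create();
  d.on('error', (err) => {
    // A real server would log here, stop accepting new connections,
    // and shut down gracefully.
    caught.push({ label: label, message: err.message });
    console.error('[' + label + '] failed: ' + err.message);
  });
  d.run(work);
  return d;
}

runGuarded('request-1', () => {
  setTimeout(() => {
    // Thrown after runGuarded() has already returned.
    throw new Error('boom');
  }, 10);
});
```

Note that the handler only learns about the failure. Any shared state the failing code touched is not rolled back, which is the leakage risk mentioned above.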


--
Kevin Swiber
Projects: https://github.com/kevinswiber
Twitter: @kevinswiber

Alvaro Mouriño

Aug 29, 2013, 8:55:20 AM
to nod...@googlegroups.com
Hi all,

Thanks for your answers. I'll try to address all your questions:

* What are the advantages of using domains?
The killer feature, for me, was that all the new events created inside
a domain get propagated only inside that domain. This allowed me to
use events without worrying about polluting other requests.

Having the request globally accessible is a (nice) side effect. Even
if that were not possible, I think domains would still be worth using.
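
One concrete form of this isolation is the implicit binding of event emitters: an EventEmitter created while a domain is active is bound to that domain, and an unhandled 'error' event on it is routed to that domain's handler. The sketch below assumes the core `domain` module; `makeRequestScope` and `handled` are hypothetical names. It only covers error routing, since ordinary events still go to whatever listeners are attached to the emitter.

```javascript
const domain = require('domain');
const { EventEmitter } = require('events');

const handled = []; // [scopeName, errorMessage] pairs

// Create a domain standing in for one request, plus an emitter born inside it.
function makeRequestScope(name) {
  const d = domain.create();
  d.on('error', (err) => handled.push([name, err.message]));
  let emitter;
  // Emitters created while the domain is active are implicitly bound to it.
  d.run(() => {
    emitter = new EventEmitter();
  });
  return { domain: d, emitter: emitter };
}

const first = makeRequestScope('first');
const second = makeRequestScope('second');

// No 'error' listener is attached to the emitter itself, so the error is
// handed to the domain that owns it; the other domain never sees it.
first.emitter.emit('error', new Error('only first sees this'));

console.log(handled);
```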

* What's so bad with passing the request object around?
Nothing, actually. What I do find wrong is passing it "just in case",
adding one more parameter to every call because some method deep in
the call chain may need to make use of it. This is typically the case
with template helpers.

It's not unusual to have a helper function somewhere that checks, for
example, if the user is logged in. For these specific cases I find it
very useful to have the request always available.
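
A minimal sketch of that "request-local" idea, assuming the core `domain` module and its `process.domain` pointer to the active domain. The functions `getCurrentRequest`, `isLoggedIn`, and `handleRequest` here are hypothetical illustrations; this is not Kolba's actual implementation.

```javascript
const domain = require('domain');

// Hypothetical accessor: read the request stored on the active domain.
function getCurrentRequest() {
  return process.domain ? process.domain.request : undefined;
}

// Hypothetical template helper, deep in the call chain, with no
// request parameter threaded through to it.
function isLoggedIn() {
  const request = getCurrentRequest();
  return Boolean(request && request.user);
}

// Attach the request to a fresh domain and run the handler inside it.
function handleRequest(request, handler) {
  const d = domain.create();
  d.request = request; // plain property on the domain object
  d.run(() => handler());
}

const seen = [];
handleRequest({ user: 'alvaro' }, () => {
  seen.push(isLoggedIn());                      // synchronous access
  setTimeout(() => seen.push(isLoggedIn()), 5); // still bound after a timer
});
handleRequest({ user: null }, () => seen.push(isLoggedIn()));
seen.push(isLoggedIn()); // outside any domain: no request available
```

The helper stays read-only with respect to the request, in line with the two rules listed below.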

Global variables have very well defined dangers [0], which can be controlled by:
1. Making use of it in as few places as possible.
2. Treating it as read-only.

If these two rules are followed, there are some cases where they are
just the cleanest solution. Think about the typical use of environment
variables, which is to tell a program which configuration file to
use. It's very handy: it gives us a mechanism to modify a program's
behavior without having access to its code. This works because that
variable is read in only one place and is never modified. You have to
be very strict about it.

I even advise against globals in the documentation [1].

Juraj, the `this.request` approach is a good idea. It would be useful
for getting rid of the parameter injection, which I'm not very happy
with. But it doesn't make the request globally accessible, right? As I
said, I like the idea of having both options and trusting that the
user will use them responsibly.

Kevin, if I understood correctly, you say that variables defined
globally may leak outside the domain? I guess that's already a
problem, given that Node runs in a single thread. Global
variables are global to every request, and I don't attempt to "fix"
that with domains, only to provide global access to the request.

Cheers!

[0] http://c2.com/cgi/wiki?GlobalVariablesAreBad
[1] https://github.com/tooxie/kolba#cons


--
Alvaro Mouriño

Forrest L Norvell

Aug 29, 2013, 11:26:07 AM
to nod...@googlegroups.com
Hey Alvaro,

I needed the exact behavior you describe, and for a while I was also following an approach based on domains. Ultimately, I encountered a few issues that made it too difficult for me to proceed. One was that I was writing a module, not an app, and I needed the state I was setting up to persist through arbitrary user code. If others are using domains for their intended purpose, it can be very difficult to guarantee that domains-as-state will be available at some arbitrary point down the road. Another issue, which was my personal breaking point, was that my module also needs to be able to capture information about errors raised in user code without altering how those errors are handled. The conflation of concerns that resulted made for a lot of very complicated and fragile code, as well as a more or less intractable thicket of edge cases.

So instead I (with some important help from Tim Caswell) made what I'm calling "continuation-local storage" (the name's verbose, which is a(n over)reaction to how confusing "domain" is as a name for what the module does). The API is described at https://npmjs.org/package/continuation-local-storage, and a glue module that monkeypatches support for it into 0.10 and 0.8 is at https://npmjs.org/package/continuation-local-storage-glue. It supports multiple independent applications of CLS via namespaces, and doesn't interfere with domains at all.

Because the glue has to monkeypatch everything that crosses asynchronous boundaries in node, it's not as efficient as it could be, so there's been some discussion between some of the Node developers about adding support in 0.12 or 1.0 to provide an asynchronous listener that would allow both domains and other applications, like CLS, to do what they need to do. Trevor Norris has been working on this for the last couple weeks, and his latest sketch of an API is http://gist.github.com/trevnorris/6372405. It's pretty simple, can be made fast (no closure creation!), and has the cool side effect of disentangling domains from the core of Node.

continuation-local-storage-glue isn't quite finished yet (which is why I haven't announced it), but it is in use in real code right now, and so far all indications are that it does exactly what it's supposed to without interfering with domains. Check it out and tell me what you think.

Forrest

Forrest L Norvell

Aug 29, 2013, 11:33:38 AM
to nod...@googlegroups.com
Sorry for the horrible formatting. That's what I get for using Gmail on an iPad.

F

Dan Peddle

Aug 30, 2013, 12:25:22 PM
to nod...@googlegroups.com
Hi Forrest,

Could you explain a bit more about what you mean by continuation local, and when the domain based technique failed..? I understand that calling domain.exit() closed all current domains, for example..?

Full disclosure, I work with Alvaro, and this has been one of our big stumbling points both on an internal product, and for personal projects. Been meaning to post about this, but Alvaro beat me to it.. ;)

@izs was kind enough to point us at your project when we tweeted him about this too: https://twitter.com/danpeddle/statuses/369136870896062465

It's a really interesting area, and a clean API to get this behaviour would be super cool. 

Final one - while we're talking about it - under the hood, how do domains work..? Reading the code, it seems to be hijacking execution contexts..?

Thanks!


Alvaro Mouriño

Aug 30, 2013, 1:07:56 PM
to nod...@googlegroups.com
Hi Forrest,

Thanks for your feedback. I must say that it's the first time someone
has told me that they tried using domains and found actual problems.

CLS looks really interesting! Now I understand a little better what
Isaac meant in the quite popular mail "The Future of Programming in
Node.js". I failed to see how domains could be implemented in
userland, at least with an unpatched version of Node, and it seems
that it's just not possible. My question now is, wouldn't it be better
to have a standard (as in "inside of node") way of doing this? There
are some modules that I think would make sense to have in node, like
this one or logging.

Anyway. I think that we are both trying to achieve the same thing but
have different requirements. The restrictions that you say led you to
write CLS don't apply to Kolba, I think. Maybe I'm missing something?

Cheers!

Forrest L Norvell

Aug 30, 2013, 2:18:06 PM
to nod...@googlegroups.com
On Fri, Aug 30, 2013 at 10:07 AM, Alvaro Mouriño <alv...@mourino.net> wrote:
> I failed to see how domains could be implemented in
> userland, at least with an unpatched version of Node, and it seems
> that it's just not possible.

It's completely possible to do the kind of error handling that domains do in userland, and in fact Adam Crabtree has dealt with the same problem domains are meant to solve with trycatch (https://github.com/CrabDude/trycatch), which is in some ways a conceptually cleaner approach to error handling. If you look at how continuation-local-storage-glue is implemented, you'll see that the two take a similar approach: wrap all of the functions that can lead to asynchronous execution (process.nextTick, setImmediate, setTimeout, I/O that touches ReqWrap / MakeCallback in the C++ side of Node) so that you have an opportunity to do whatever you need with the callback to be executed asynchronously.
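
The wrapping technique can be illustrated with a toy version. This is a deliberately simplified sketch, not the code of trycatch or continuation-local-storage-glue: every name in it (`runWith`, `getContext`, `wrap`, `setTimeoutWithContext`) is hypothetical, and it wraps a single scheduling function instead of monkeypatching Node's own.

```javascript
let current = null; // the context active right now, or null

// Make `context` active for the synchronous duration of `fn`.
function runWith(context, fn) {
  const previous = current;
  current = context;
  try {
    return fn();
  } finally {
    current = previous;
  }
}

function getContext() {
  return current;
}

// Capture the context at scheduling time; restore it when the callback runs.
function wrap(callback) {
  const captured = current;
  return function wrapped(...args) {
    return runWith(captured, () => callback.apply(this, args));
  };
}

// Stand-in for a patched setTimeout: the callback is wrapped before
// it crosses the asynchronous boundary.
function setTimeoutWithContext(callback, ms) {
  return setTimeout(wrap(callback), ms);
}

const log = [];
runWith({ requestId: 1 }, () => {
  setTimeoutWithContext(() => log.push(getContext().requestId), 5);
});
runWith({ requestId: 2 }, () => {
  setTimeoutWithContext(() => log.push(getContext().requestId), 1);
});
```

Each callback sees the context that was active when it was scheduled, even though both run after the `runWith` calls have returned. A real implementation has to apply the same wrapping at every asynchronous entry point, which is where the overhead comes from.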
 
> My question now is, wouldn't it be better
> to have a standard (as in "inside of node") way of doing this? There
> are some modules that I think would make sense to have in node, like
> this one or logging.

One of the complaints about domains in Node 0.8 is that they reduced performance of everybody's code, regardless of whether they were using domains or not. Trevor Norris did a lot of work to mitigate this in 0.10, but it's led to some pretty exotic and weird code, which means added complexity in the Node code base. Not everybody wants to handle errors the same way, not everybody needs a concept as general and powerful (and esoteric) as continuation-local storage, and nobody wants logging to work the exact same between applications.

However, we all want these things to be fast, which is what the work that Trevor's trying to land in 0.12 is meant to provide. The API he's working on (https://gist.github.com/trevnorris/6372405) makes it nearly trivial to build domains, trycatch, CLS, AOP-style loggers, long stacktrace modules, and a host of other functions that need to be able to work across async boundaries. It's much more in keeping with the Node philosophy to have one simple, general API in core and keep the rest in userland. To be clear: domains aren't going anywhere. But having them disentangled from the rest of Node core (in 0.8 and 0.10, domains code is everywhere) is a big win for keeping the core code small and simple.

continuation-local-storage-glue works in 0.10 and 0.8 (and could probably be made to work with 0.6 without too much work, but c'mon), and the performance hit is noticeable but not tremendous. It will be considerably cheaper once something like Trevor's addAsyncListener lands.
 
> Anyway. I think that we are both trying to achieve the same thing but
> have different requirements. The restrictions that you say led you to
> write CLS don't apply to Kolba, I think. Maybe I'm missing something?

If you're writing applications and have a clear understanding of when you're using domains for error-handling and when you're using them for state propagation, you're fine. I'm writing an agent that runs inside applications, and those applications need to run the same whether or not my code is included. I have to be very careful about not altering existing error-handling behavior, and I need to be able to pass my state around without worrying about how developers are solving their own control flow, error-handling, or logging problems.

Ultimately, all I can say is that domains were designed for error handling, and they do that pretty well. Everything else people use them for is just a byproduct. CLS is meant to do exactly what you want.

F

Forrest L Norvell

Aug 30, 2013, 2:46:31 PM
to nod...@googlegroups.com
On Fri, Aug 30, 2013 at 9:25 AM, Dan Peddle <dan.p...@gmail.com> wrote:
> Could you explain a bit more about what you mean by continuation local

Sure. When you're handling a request in e.g. Express, the technique of wrapping up future computation in a callback and handing that to a function to be called when the function is done and computation is ready to proceed is known as continuation-passing style (CPS). The callback and its closure are the continuations in CPS -- they represent a context encapsulating future computation. CLS is meant to allow you to transparently set and get values that are specific to the execution of a specific chain of function invocations, or continuation evaluations. To cut the fancy CS talk, that means that you can have a whole bunch of requests in flight concurrently, all with their own state, and this is all done transparently to user code, across asynchronous call boundaries. It's exactly what you're doing with domains today, only decoupled from error handling.

> when the domain based technique failed..?

See my response to Alvaro for some of the context, but it helps to know that I'm writing an agent that traces the asynchronous execution of programs I didn't write, and in general know nothing about. The agent also is trying to capture information about as many errors that occur inside the program as possible, so it can gather and display information about them back to the developer. Add to that the possibility that the program in which the agent is running may itself be using domains.

So you're going to have nested domains there, almost as a certainty: for error gathering, for state propagation, and for whatever the users of the agent are doing with domains themselves. If one of the enclosing domains is attached to, say, an event emitter, and it gets and handles an error, it's going to exit all of the domains it encloses. (Domains are essentially asynchronous try/catch, and throwing in try/catch is a form of nonlocal exit or computed goto, so domains have a stack which allows them to unwind state in an analogous way.) That will completely nuke the ability to continue execution tracing, even if the error, for whatever reason, is treated as recoverable.

That can be dealt with (through yet more monkeypatching of domains and / or event emitters), but the underlying conflation of concerns (error handling and state propagation) remains. In my case, having execution tracing and error handling decoupled is very close to a necessity, because the agent runs very close to Node core and also needs to work with multiple versions of Node, with their very different implementations of things like domains and process.nextTick. Edge cases are pretty much my entire world.
 
> I understand that calling domain.exit() closed all current domains, for example..?

Domain.prototype.exit() exits the current domain and all domains above it on the domain stack, which in effect means any domains that were entered after the domain currently being exited.
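
The stack behavior can be observed directly with explicit enter() and exit() calls. The sketch below assumes the core `domain` module and uses `process.domain` to see which domain is active; the variable names are arbitrary.

```javascript
const domain = require('domain');

const outer = domain.create();
const inner = domain.create();

outer.enter();
inner.enter(); // stack is now: outer, inner

inner.exit();  // exits only inner; outer becomes the active domain again
const afterInnerExit = process.domain === outer;

inner.enter(); // stack is again: outer, inner
outer.exit();  // exits outer AND inner, which was entered after it
const afterOuterExit = process.domain; // no active domain is left

console.log(afterInnerExit, Boolean(afterOuterExit));
```

In the second case `inner` is exited without anyone calling `inner.exit()`, which is how state carried on an inner domain can disappear when an enclosing domain exits.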

> Final one - while we're talking about it - under the hood, how do domains work..? Reading the code, it seems to be hijacking execution contexts..?

A complete breakdown of how domains work in Node 0.10 is probably beyond the scope of an email message, but there are two main components of domains:

1. A small piece of C++ that captures any JavaScript exceptions that bubble up to the V8 toplevel and then hands them off to a function exposed globally on the JavaScript side of node (process._fatalException()).
2. A bunch of small bits of code scattered throughout the JavaScript source of Node that ensure that asynchronous things (timers, I/O listeners) are given the context necessary for the exception handler to find the correct function to invoke when there's an error. Sometimes this means wrapping closures around callbacks, sometimes this means attaching domains to EventEmitters or timers. It's all pretty pragmatic, except for the pieces that were added to make domains fast (which are a whole other story).

F