Node Internals

mob

Dec 6, 2009, 12:35:10 PM
to nodejs
Two quick questions about the internals of node.

When servicing incoming HTTP requests, does node create one JS
interpreter (global space) for each request, or is one shared between
all requests? If shared, how do you protect each global space? If
per-request, have any benchmarks been done to measure the memory
footprint of each JS engine for each open request? If I have a very
large number of requests, that will translate into quite a bit of
memory for all the VMs, will it not?

Second question pertains to async-dns. There seems to be some code in
node_net.cc about doing DNS lookups on the host name. This is for
client connections; I'm presuming that no DNS lookups are done for
server-side logic. Is that right?

--mob

Bluebie

Dec 6, 2009, 3:51:25 PM
to nodejs
No, node uses one JS interpreter for all requests. This is very unlike
the way Ruby, Python, PHP, and friends usually operate, and it is one
reason node is able to achieve wonderful performance and handle many
requests without consuming huge quantities of system memory. The
memory cost of an HTTP connection amounts to little more than a
couple of JS objects and one TCP connection.

You ask how the global space is protected if it is shared: we don't
protect it, that is your job. Node isn't so much a webapp platform as
it is a tool for building _your own_ web servers, be they HTTP based,
web socket based, IRC protocol, or some other protocol.
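
A minimal sketch of what the shared context means in practice. The handler name and the stand-in request/response objects are illustrative assumptions, not anything from node itself; the point is only that module-level state is visible to every request, while variables inside the handler are not.

```javascript
// Module-level state lives in the single shared context,
// so every request handled by this process sees the same variable.
let hits = 0;

// Hypothetical request handler; in a real server it could be the
// callback passed to http.createServer().
function handleRequest(req, res) {
  hits += 1; // shared across all requests
  const perRequest = { url: req.url, seen: hits }; // local to this call
  res.end(JSON.stringify(perRequest));
}

// Simulate two requests with stand-in objects instead of real sockets.
const bodies = [];
const fakeRes = { end: (body) => bodies.push(body) };
handleRequest({ url: '/a' }, fakeRes);
handleRequest({ url: '/b' }, fakeRes);
console.log(bodies);
```

Anything that must not leak between requests has to be kept in locals or closures like `perRequest`; nothing stops a handler from touching `hits`.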

I can't answer your question regarding DNS, but my guess is it does
the sane thing, only retrieving DNS when you explicitly ask for it, or
provide a hostname to something instructed to connect to it. Node.js
is way too low level to assume you'd want something done for you
automatically.

mob

Dec 6, 2009, 4:10:25 PM
to nodejs
Thanks for the quick reply.

Regarding shared global, since Node is using the CommonJS module
loader to load modules, does this work if multiple apps are
"require"ing different modules from different load paths? If each app
defines a different load path on require, will that interfere with
other apps?

I suppose what I'm asking is can I effectively isolate different apps
under node or is it designed to support cooperating applications?

Some feedback: I'm getting > 9K simple requests per second on a MacBook
Pro. Very nice performance.
Also, node is using ~45MB on the MacBook Pro. It seems to grow from 5 to
45 MB pretty quickly and then stays constant.

--mob

Ryan Dahl

Dec 6, 2009, 4:34:19 PM
to nod...@googlegroups.com
On Sun, Dec 6, 2009 at 6:35 PM, mob <m...@embedthis.com> wrote:
> Two quick questions about the internals of node.
>
> When servicing incoming HTTP requests, does node create one JS
> interpreter (global space) for each request, or is one shared between
> all requests? If shared, how do you protect each global space? If
> per-request, have any benchmarks been done to measure the memory
> footprint of each JS engine for each open request? If I have a very
> large number of requests, that will translate into quite a bit of
> memory for all the VMs, will it not?

As Bluebie said, all requests belong to the same context.

> Second question pertains to async-dns. There seems to be some code in
> node_net.cc about doing DNS lookups on the host name. This is for
> client connections; I'm presuming that no DNS lookups are done for
> server-side logic. Is that right?

If you do server.listen(8000, "somedomain.com"), a lookup will be done
to get the address. The code in node_net.cc will eventually be using
the node_dns.cc module - it isn't yet due to a bug.
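
A rough sketch of that lookup-then-listen sequence made explicit, assuming a current Node release (the `dns` module API used here postdates this thread). `listenOnHostname` is a hypothetical helper for illustration and is not how node_net.cc does it internally.

```javascript
const dns = require('dns');

// Hypothetical helper: resolve the hostname first, then bind the server
// to the resulting address. Resolves with the bound address info.
function listenOnHostname(server, port, hostname) {
  return new Promise((resolve, reject) => {
    dns.lookup(hostname, (err, address) => {
      if (err) return reject(err); // the name could not be resolved
      server.once('error', reject); // e.g. the port is already in use
      server.listen(port, address, () => resolve(server.address()));
    });
  });
}
```

With a server from `http.createServer()`, calling `listenOnHostname(server, 8000, 'somedomain.com')` would bind only after the name resolves; no lookup happens for plain `server.listen(8000)`.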

Thomas Lockney

Dec 6, 2009, 9:00:03 PM
to nod...@googlegroups.com
On 12/06/2009 01:10 PM, mob wrote:
> Regarding shared global, since Node is using the CommonJS module
> loader to load modules, does this work if multiple apps are
> "require"ing different modules from different load paths? If each app
> defines a different load path on require, will that interfere with
> other apps?
>
> I suppose what I'm asking is can I effectively isolate different apps
> under node or is it designed to support cooperating applications?
>
This depends on how you are defining your "apps". When you run node, you
tell it what code you want to execute (excuse me for being Mr. Obvious
here). In my mind, that is your "app". If you concurrently run another
instance of node, whether pointing at the same code or some other code
(which may or may not share modules), that is another instance of an app
running in its own process space. Nothing is shared between them.
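
A small sketch of that separation, assuming a current Node release with `child_process.spawnSync`. The global name `appName` is made up for the example: a global set in one node process is simply not there in a second one.

```javascript
const { spawnSync } = require('child_process');

// A global set in this process...
globalThis.appName = 'parent-app';

// ...is not visible in a second node process started from here.
const child = spawnSync(
  process.execPath, // path of the node binary running this script
  ['-e', 'console.log(typeof globalThis.appName)'],
  { encoding: 'utf8' }
);

const childSees = child.stdout.trim();
console.log('parent:', typeof globalThis.appName, 'child:', childSees);
```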

Bluebie

Dec 7, 2009, 7:37:31 AM
to nodejs
If you want to run several distinct and unrelated applications, I
would recommend you give each application its own node instance,
running on its own port, and set up some sort of modern,
high-performance proxy server (or write one in node) to route requests
from port 80 to the processes that should handle them. You could use
something like nginx for such a task, and have it handle static file
serving too. :)

Node's implementation of the 'secure modules' CommonJS specification
is syntax-compatible, but is not actually secure. Loaded modules can
pollute the global namespace with all sorts of junk.
Regardless, node is an event loop, and you get just one per node
process, so if you run multiple apps in one node instance, one badly
coded app could adversely affect the performance of the rest.
Additionally, many computer systems nowadays contain multiple cores or
CPUs, and the event loop runs on just one of them, so you may be able
to get a lot more performance out of your system when using multiple
processes.
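
To make the namespace point concrete, here is a small sketch assuming a current Node release and its CommonJS loader. The temp-file module and all names in it are invented for the example. An assignment without `var`/`let`/`const` inside a non-strict module ends up on the shared global object, while properly declared variables stay private to the module.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Write a throwaway module to a temp directory (names are arbitrary).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'leaky-module-'));
const file = path.join(dir, 'leaky.js');
fs.writeFileSync(file, [
  "var privateToModule = 'hidden'; // stays inside the module's own scope",
  "leakedByModule = 'junk';        // undeclared: lands on the global object",
  "exports.name = 'leaky';",
].join('\n'));

const leaky = require(file);

const leaked = globalThis.leakedByModule;   // set as a side effect of require()
const privateSeen = typeof privateToModule; // module-local names do not escape
console.log(leaky.name, leaked, privateSeen);
```

In strict mode the undeclared assignment would throw a ReferenceError instead, but a module can still write to the global object explicitly.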

It'd probably be safe to do something really hacky, like buffer in the
first 5 KB of text from a new connection (or till \r\n\r\n, whichever
comes first), looking for a 'Host:' header, matching it to an
application port on localhost (or wherever your particular app is
running...), and opening a connection to that port, sending in the
buffer, and wiring up the two TCP sockets (or TCP and unix sockets?)
to talk directly to each other from that point on. Your subapps would
lose the ability to see the client's IP address as a side effect, but
you wouldn't have to be parsing and constructing a ton of HTTP cruft
in the proxy end, and it's pretty lightweight, so the performance
cost should be relatively low.

This assumes browsers always make a new connection when contacting a
different hostname, even if that hostname resolves to the same IP
address as an existing connection, but this is just a guess, and
reality might not agree with me. :)
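
The hack described above could be sketched roughly like this, assuming a current Node release and its `net` module. The hostnames, ports, and the names `routes`, `findHost`, and `createHostProxy` are all made up for illustration, and the sketch only inspects the first request on each connection, exactly as the caveat above warns.

```javascript
const net = require('net');

const MAX_HEAD = 5 * 1024; // stop buffering after 5 KB

// Hypothetical routing table: hostname -> local port of that app's process.
const routes = new Map([
  ['app-one.example', 8001],
  ['app-two.example', 8002],
]);

// Pull the hostname out of the buffered request head.
// Returns null if the head is incomplete or has no Host header.
function findHost(buffered) {
  const end = buffered.indexOf('\r\n\r\n');
  if (end === -1) return null;
  const head = buffered.subarray(0, end).toString('latin1');
  const match = /^host:[ \t]*([^\r\n:]+)/im.exec(head);
  return match ? match[1].trim().toLowerCase() : null;
}

function createHostProxy() {
  return net.createServer((client) => {
    let buffered = Buffer.alloc(0);
    const onData = (chunk) => {
      buffered = Buffer.concat([buffered, chunk]);
      const complete = buffered.includes('\r\n\r\n');
      if (!complete && buffered.length < MAX_HEAD) return; // keep buffering
      client.removeListener('data', onData);
      client.pause(); // hold further data until the upstream is connected
      const port = routes.get(findHost(buffered));
      if (port === undefined) return client.destroy(); // unknown or missing Host
      const upstream = net.connect(port, '127.0.0.1', () => {
        upstream.write(buffered); // replay what was buffered so far
        client.pipe(upstream);    // then wire the two sockets together
        upstream.pipe(client);
      });
      upstream.on('error', () => client.destroy());
    };
    client.on('data', onData);
    client.on('error', () => {}); // ignore client resets in this sketch
  });
}
```

Something like `createHostProxy().listen(80)` would start it. As noted above, the apps behind it only ever see connections coming from the proxy's address.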

It is important to understand that node is very low level. It is much
more like a programming language than a web server. It does come with
a nice little HTTP server library for free, but it is not set up by
default to do any HTTP work, or to have any specific behaviours
regarding it. It's your job to figure out how you want your web server
to work, and build it that way, with node. You are no longer limited
by other people's architectural whims. :)

Erik Garrison

Dec 7, 2009, 9:46:38 AM
to nod...@googlegroups.com
On Mon, Dec 7, 2009 at 7:37 AM, Bluebie <bl...@creativepony.com> wrote:
>
> Node's implementation of the 'secure modules' CommonJS specification
> is syntax-compatible, but is not actually secure. Loaded modules can
> pollute the global namespace with all sorts of junk.
> Regardless, node is an event loop, and you get just one per node
> process, so if you run multiple apps in one node instance, one badly
> coded app could adversely affect the performance of the rest.
> Additionally, many computer systems nowadays contain multiple cores or
> CPUs, and the event loop runs on just one of them, so you may be able
> to get a lot more performance out of your system when using multiple
> processes.
>

In fact, nothing in the CommonJS securable modules specification
asserts that an implementation should secure the modules in unique
execution contexts. The only mention of sandboxing is related to the
module search path attribute (it may not be sandboxed). Security of
context is implied by the spec's opening description:

"These modules are offered privacy of their top scope, facility for
importing singleton objects from other modules, and exporting their
own API." (http://wiki.commonjs.org/wiki/Modules/1.1)

However, it is not asserted by the contract of the specification,
IIRC. I'll try to figure out why this omission occurred; perhaps it
was for reasons of compatibility.

I've been toying around with the idea of context-isolated modules in
the browser. They have the advantage of being conceptually easy to
manage in terms of code isolation. They can be used to securely
modularize foreign javascript code without code-level modification or
the addition of boilerplate to that code. Unfortunately, they have
the disadvantage of breaking potentially important type inference
features (see below). My tests have been in the browser, using
iframes to provide 'clean' window elements as an execution context for
potentially invasive code. I am following the work of Dean Edwards:
http://dean.edwards.name/weblog/2006/11/hooray/ and Andrea Giammarchi:
http://webreflection.blogspot.com/2009/07/elsewhere-sandboxes-have-never-been.html,
and attempting to link it to James Burke's run.js asynchronous
javascript loader http://code.google.com/p/runjs/.

This same pattern could be adapted to node. V8 uses contexts to
provide code isolation, and caches these contexts to eliminate the
performance bottleneck incurred by generating them. See
http://code.google.com/apis/v8/embed.html#contexts.

If the proper hooks were provided, applications written in node could
choose to use context isolation for each request. Is this something
that could be useful? I can see that it would also be limiting to
enforce, and would prefer that it not become standard.
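
For what it's worth, current Node releases expose V8 contexts through the built-in `vm` module, so the idea can be sketched as below. The script and the `runIsolated` helper are invented for the example, and the `vm` module is documented as not being a security mechanism, so treat this as namespace isolation only.

```javascript
const vm = require('vm');

// Hypothetical per-request script; it deliberately leaks a global.
const script = new vm.Script(
  'counter = (typeof counter === "number" ? counter : 0) + 1; counter;'
);

// Run the script in a fresh V8 context for each "request".
function runIsolated() {
  const context = vm.createContext({}); // new global object every time
  return script.runInContext(context);
}

const first = runIsolated();
const second = runIsolated(); // does not see the first run's `counter`
const leakedToMain = typeof globalThis.counter; // main context stays clean
console.log(first, second, leakedToMain);
```

Creating a context per request has a cost, which is one reason to keep it opt-in rather than standard.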

Above I alluded to frustrations about the breakage of introspective
capabilities: In other languages you often can't modify primordial
objects at runtime, so they skirt some of the fundamental issues that
arise from using multiple execution contexts in the same environment.
In JavaScript, what happens when you compare references from two
distinct contexts can be quite frustrating, e.g.:

(new context1.Array()) instanceof context2.Array

will invariably be false. The primordial objects from which each
array is descended are not the same. This behavior is technically
correct, but I can imagine it being frustrating to deal with. More
frustratingly, this issue crosses javascript's fuzzy border between
dynamic and static. In a fully dynamic language with multiple
contexts and the ability to share references between them it would be
possible to dynamically subclass the primordials of the 'global' top
level scope and resolve this issue, but it is simply not possible to
modify the prototype chains of primordials in this way (at least in
the browser context I have tested). Perhaps this is not true in v8, I
am not sure.
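
The cross-context comparison can be reproduced with the `vm` module in a current Node release (an assumption; the thread predates that API). The variable names are illustrative. `Array.isArray` is the usual workaround, since it checks what kind of object it is given rather than the prototype chain.

```javascript
const vm = require('vm');

// One separate context, standing in for "context1".
const context = vm.createContext({});
const foreignArray = vm.runInContext('[1, 2, 3]', context);
const ForeignArray = vm.runInContext('Array', context);

// The main context's Array is a different primordial object.
const crossContext = foreignArray instanceof Array;       // false
const sameContext = foreignArray instanceof ForeignArray; // true
const brandCheck = Array.isArray(foreignArray);           // true
console.log(crossContext, sameContext, brandCheck);
```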

Erik