> On Nov 17, 11:57 am, Ryan Dahl <r...@tinyclouds.org> wrote:
> > On Thu, Nov 17, 2011 at 11:55 AM, Liam <networkimp...@gmail.com> wrote:
> > > On Nov 17, 11:40 am, Tomasz Janczuk <tom...@janczuk.org> wrote:
> > >> The particular numbers I show on the slide were obtained by running 50
> > >> instances of an HTTP long polling web application
> > >> (https://github.com/tjanczuk/denser/blob/master/src/denser/samples/chat2.js) either as 50
> > >> separate processes or 50 isolates each on its own thread in a single
> > >> process, averaged over the number of apps. The memory metric I used is
> > >> "private working set", which on Windows is the amount of physical
> > >> memory utilized by a process that cannot be shared with other
> > >> processes (so shared libs are excluded; in this case the v8.dll on
> > >> windows was excluded).
> > > It would be interesting to see the numbers across a range of threads.
> > > (It'd also be nice to have a script to automate the tests so ppl can
> > > run it in different contexts.)
> > > It would also be helpful to see an application that has far higher
> > > memory demands, and uses some common third party modules...
> > There is no memory sharing between isolates.
> Exactly, so the 2x benefit described seems unlikely for real-world
> apps.
I don't think Tomasz' experiments used memory sharing. He was just
measuring the 'overhead' of running a separate process.
I agree the benefit would greatly diminish for memory-bound node
applications since the shared global state of the process would be a
very small fraction of the private non-shared state across isolates. I
am curious if you have seen a lot of node apps with large memory
consumption in the "real world"? I chose an HTTP long polling web chat
app as a benchmark based on the assumption that node works
particularly well for the class of apps that need to orchestrate IO while
doing a lot of non-CPU and non-memory intensive waiting in between. I
think such apps would actually show reduced average memory footprint
when hosted in isolates as opposed to individual processes.
On Thu, Nov 17, 2011 at 1:01 PM, Liam <networkimp...@gmail.com> wrote:
> On Nov 17, 12:40 pm, Bert Belder <bertbel...@gmail.com> wrote:
>> On Nov 17, 9:04 pm, Liam <networkimp...@gmail.com> wrote:
>> > On Nov 17, 11:57 am, Ryan Dahl <r...@tinyclouds.org> wrote:
>> > > ...
>> > > There is no memory sharing between isolates.
>> > Exactly, so the 2x benefit described seems unlikely for real-world
>> > apps.
>> I don't think Tomasz' experiments used memory sharing. He was just
>> measuring the 'overhead' of running a separate process.
> Right, so shouldn't that be given as KB per process (on various OSes)
> not a ratio?
> And then the question is whether that cost is worth the benefits of a
> process, e.g. it can't crash other processes.
> Maybe it's not; I don't have an opinion. Ryan's on the record saying
> "processes are the tool for concurrency," so this seems like a
> significant shift?
We're giving people the option of starting a new Node either as a standalone process or as a new thread/isolate. Most people will want to continue using processes as their method of parallelizing because it's most robust. However some people will want to use threads because with an addon they can do fancy synchronization between the isolates. The isolates feature is to enable advanced parallelizing techniques in Node addons.
On Nov 17, 12:53 pm, Tomasz Janczuk <tom...@janczuk.org> wrote:
> I agree the benefit would greatly diminish for memory-bound node
> applications since the shared global state of the process would be a
> very small fraction of the private non-shared state across isolates. I
> am curious if you have seen a lot of node apps with large memory
> consumption in the "real world"?
I think it's safe to say that the appeal of JavaScript will draw in
developers who push Node into every conceivable role, so "Node is for
x & y sorts of apps" probably isn't a durable assertion. I think we'll
see Node displace Java to some extent, as its ecosystem matures.
It's not hard to eat memory in JS. For instance, it's natural to use
an Object as a key:value cache, and an Array as a queue. I suspect
those aren't the most efficient structures for those needs, but it
will be common practice.
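To illustrate the concern, here is a minimal sketch contrasting a plain Object used as a cache, which keeps every key until something deletes it, with a size-capped cache. `BoundedCache` is a hypothetical helper, not part of Node core, and it relies on `Map`, which postdates this thread; treat it as one possible approach rather than a recommendation from the posters.

```javascript
// Hypothetical helper: a cache that never holds more than `limit` entries.
class BoundedCache {
  constructor(limit) {
    this.limit = limit;
    this.map = new Map(); // Map iterates keys in insertion order
  }

  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    // Re-insert so the key becomes the most recently used entry.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.limit) {
      // Evict the least recently used entry (first key in iteration order).
      const oldest = this.map.keys().next().value;
      this.map.delete(oldest);
    }
  }

  get size() {
    return this.map.size;
  }
}

// The pattern described above: a plain object that grows with every new key.
const unbounded = {};
const bounded = new BoundedCache(1000);

for (let i = 0; i < 5000; i++) {
  unbounded['key' + i] = { id: i };
  bounded.set('key' + i, { id: i });
}

console.log(Object.keys(unbounded).length); // 5000
console.log(bounded.size); // 1000
```

The cap trades hit rate for a predictable upper bound on retained entries; whether that is the right trade depends on the application.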
> We're giving people the option of starting a new Node either as a
> standalone process or as a new thread/isolate. Most people will want
> to continue using processes as their method of parallelizing because
> it's most robust. However some people will want to use threads because
> with an addon they can do fancy synchronization between the isolates.
> The isolates feature is to enable advanced parallelizing techniques in
> Node addons.
Aha. Got it. Good idea.
Synchronization like... sharing data via a Buffer?
> He asked a question in a silly way and some of us were pretty rude to
> him.
We did not get rude until the skit got really silly. He kept repeating the same mantra over and over even though he got a lot of replies explaining the same thing to him over and over. We went for an extended period of time being very polite.
On Thu, Nov 17, 2011 at 2:07 PM, Liam <networkimp...@gmail.com> wrote:
> Mark I think you missed your intended thread target :-)
> On Nov 17, 2:00 pm, Mark Hahn <m...@hahnca.com> wrote:
> > > He asked a question in a silly way and some of us were pretty rude to
> > > him.
> > We did not get rude until the skit got really silly. He kept repeating the
> > same mantra over and over even though he got a lot of replies explaining
> > the same thing to him over and over. We went for an extended period of
> > time being very polite.
On Thu, 17 Nov 2011 13:34:02 -0800 (PST), Liam wrote:
> It's not hard to eat memory in JS. For instance, it's natural to use
> an Object as a key:value cache, and an Array as a queue. I suspect
> those aren't the most efficient structures for those needs, but it
> will be common practice.
I agree, but that could change if the core had a specific struct to handle those two kinds of "bad" usage (something similar to Buffer), or perhaps a C module.
On Thu, Nov 17, 2011 at 4:45 PM, Liam <networkimp...@gmail.com> wrote:
> On Nov 17, 1:31 pm, Ryan Dahl <r...@tinyclouds.org> wrote:
> > We're giving people the option of starting a new Node either as a
> > standalone process or as a new thread/isolate. Most people will want
> > to continue using processes as their method of parallelizing because
> > it's most robust. However some people will want to use threads because
> > with an addon they can do fancy synchronization between the isolates.
> > The isolates feature is to enable advanced parallelizing techniques in
> > Node addons.
<3
> Aha. Got it. Good idea.
> Synchronization like... sharing data via a Buffer?
Synchronization like sharing deeply frozen data structures, or some kind of pass-and-forget reference semantics, or some other interesting model.
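As a rough illustration of what "deeply frozen" could mean on the JavaScript side, the sketch below freezes a plain object graph recursively. `deepFreeze` is a hypothetical helper and handles only plain objects and arrays. Freezing by itself does not share anything between isolates; per the discussion above, that part would have to come from an addon.

```javascript
// Hypothetical helper: recursively freeze a graph of plain objects and arrays.
// The `seen` set guards against cycles in the graph.
function deepFreeze(value, seen = new WeakSet()) {
  if (value === null || typeof value !== 'object') return value;
  if (seen.has(value)) return value;
  seen.add(value);
  for (const key of Reflect.ownKeys(value)) {
    deepFreeze(value[key], seen);
  }
  return Object.freeze(value);
}

const config = deepFreeze({
  name: 'chat',
  limits: { maxClients: 50 },
  tags: ['a', 'b'],
});

console.log(Object.isFrozen(config)); // true
console.log(Object.isFrozen(config.limits)); // true
console.log(Object.isFrozen(config.tags)); // true
```

A structure frozen this way can be read from several places without any reader being able to mutate it, which is the property that would make sharing it safe.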
Regarding isolates and domains: is it the intent that when code running in an isolate fails (e.g. unhandled exception), domains will allow node to clean up global state associated with that isolate only, and as a result leave remaining isolates running? Or is the entire process going down if code in a single isolate fails?
On Nov 16, 1:36 pm, Ryan Dahl <r...@tinyclouds.org> wrote:
> We're just starting development on the new branch of Node. New
> development continues in the master branch - bug fixes go into the
> v0.6 branch. The current target for a v0.8 release is early January.
> We will continue weekly v0.6 releases throughout v0.8 development.
> - Isolates. Node will allow users to spawn "child processes" that actually
>   run in a thread. We have to get rid of all the global variables in node.
>   Compiled extensions need to know which isolate they're targeting, and we need
>   to decide if we want to load the extension multiple times or just once.
>   Also, some changes to libuv are necessary, since we will have to completely
>   clean up a loop. Finally we'll have to deal with debugging a multi-threaded
>   node process. (Ben, Ryan)
>   https://github.com/joyent/node/issues/2133
> - Domains. Domains provide a lightweight isolation mechanism for all i/o
>   related to a particular network connection (e.g. an incoming http request).
>   If an unhandled error is encountered, all i/o local to that particular
>   domain is canceled and all handles are cleaned up. An experimental
>   implementation can be found at https://github.com/joyent/node/commits/domains.
>   Some of the early work for domains will be used in Isolate support (e.g.
>   cleaning up handles when an Isolate is killed) (Bert, Ryan)
>   https://github.com/joyent/node/issues/2134
> - Better allocator. Currently node uses a memory pool allocator to provide
>   read buffers for incoming data streams. This allocator is only
>   reasonably efficient for network connections and local pipes on Unix, and on
>   Windows when zero reads are used. For file i/o and buffered reads on Windows
>   we need a better story. (Igor)
>   https://github.com/joyent/node/issues/2135
> - Addons. We need to define an easy and suggested way of building
>   extensions, which should be similar across all supported platforms. (Everyone)
>   https://github.com/joyent/node/issues/2136
On Nov 17, 2011 8:43 PM, "Tomasz Janczuk" <tom...@janczuk.org> wrote:
> Regarding isolates and domains: is it the intent that when code
> running in an isolate fails (e.g. unhandled exception), domains will
> allow node to clean up global state associated with that isolate only,
> and as a result leave remaining isolates running? Or is the entire
> process going down if code in a single isolate fails?
If one isolate fails we will continue execution. We need the ability to clean up any open handles and I/O requests. The current plan is this (subject to change): each isolate has many domains, and each domain has many handles. When an isolate goes down it kills all of its domains, which in turn kill all of their handles.
The intent with domains is simply to give the user more insight and better control over the I/O they are performing by grouping related handles together. For example, if timers are set up during the process of handling an HTTP request and the HTTP connection is terminated because a peer hit the stop button on their browser, it would be nice to be able to automatically cancel those timers.
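The grouping idea can be sketched in plain JavaScript. This is not the domains implementation or its API, which was still experimental at the time; `HandleGroup` and `GroupOwner` are hypothetical names, and the sketch tracks only timers. It shows related timers being cancelled together, and an owner tearing down all of its groups, mirroring the "isolate kills its domains, domains kill their handles" plan.

```javascript
// Hypothetical stand-in for a "domain": it only tracks timers.
class HandleGroup {
  constructor() {
    this.timers = new Set();
    this.disposed = false;
  }

  // Schedule a timer that belongs to this group.
  setTimeout(fn, ms) {
    if (this.disposed) throw new Error('group already disposed');
    const timer = setTimeout(() => {
      this.timers.delete(timer); // a timer that fired is no longer pending
      fn();
    }, ms);
    this.timers.add(timer);
    return timer;
  }

  // Cancel everything still pending in this group.
  dispose() {
    for (const timer of this.timers) clearTimeout(timer);
    this.timers.clear();
    this.disposed = true;
  }

  get size() {
    return this.timers.size;
  }
}

// Hypothetical owner of many groups, mirroring "each isolate has many domains".
class GroupOwner {
  constructor() {
    this.groups = new Set();
  }

  createGroup() {
    const group = new HandleGroup();
    this.groups.add(group);
    return group;
  }

  // Going down disposes every group, which cancels every pending timer.
  shutdown() {
    for (const group of this.groups) group.dispose();
    this.groups.clear();
  }
}

const owner = new GroupOwner();
const requestGroup = owner.createGroup();
let fired = false;
requestGroup.setTimeout(() => { fired = true; }, 60000);
requestGroup.setTimeout(() => { fired = true; }, 120000);

console.log(requestGroup.size); // 2
owner.shutdown();
console.log(requestGroup.size); // 0
```

In an HTTP server, one plausible wiring would be to create a group per request and call `dispose()` from the response's `'close'` event, so that timers set up for a request go away when the peer disconnects.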
You should consider using [Trello](http://trello.com) for the Node roadmap. That would be a nice way of letting people know what's coming and what's being worked on, and devs can vote on the things in the backlog they'd love to see implemented.
On Nov 17, 9:36 pm, Ryan Dahl <r...@tinyclouds.org> wrote:
> If one isolate fails we will continue execution. We need the ability
> to clean up any open handles and I/O requests. The current plan is
> this (subject to change): each isolate has many domains, and each domain
> has many handles. When an isolate goes down it kills all of its
> domains, which in turn kill all of their handles.
What constitutes failure of an isolate? Uncaught exceptions, and...?
> What constitutes failure of an isolate? Uncaught exceptions, and...?
I think it would be any exit of an isolate, under either normal conditions (process.exit(), or no remaining watchers) or abnormal conditions (i.e. an uncaught or fatal exception). The isolate lifecycle should be indistinguishable from that of any normal node process (with the exception of handling of external signals, because it would not be possible to direct a signal at a single isolate only).
Besides IO and open handles, is there anything else that gets cleaned up when an isolate goes away? For instance, what happens with other isolates or processes spawned by that isolate?
There is a certain level of resemblance between isolates and processes; I wonder to what extent the behavioral model of an isolate and a process will be reconciled in node? Specifically, when I write a node app, do I need to be explicit about whether it is to run in a process or an isolate, or will the node runtime provide an environment that abstracts away the differences?
On Nov 17, 9:39 pm, Mikeal Rogers <mikeal.rog...@gmail.com> wrote:
> > If one isolate fails we will continue execution. We need the ability
> > to clean up any open handles and I/O requests. The current plan is
> > this (subject to change): each isolate has many domains, and each domain
> > has many handles. When an isolate goes down it kills all of its
> > domains, which in turn kill all of their handles.
> > The intent with domains is simply to give the user more insight and
> > better control over the I/O they are performing by grouping related
> > handles together. For example, if timers are set up during the process
> > of handling an HTTP request and the HTTP connection is terminated
> > because a peer hit the stop button on their browser, it would be nice
> > to be able to automatically cancel those timers.
> Besides IO and open handles, is there anything else that gets cleaned up
> when an isolate goes away? For instance, what happens with other
> isolates or processes spawned by that isolate?
> There is a certain level of resemblance between isolates and
> processes; I wonder to what extent the behavioral model of an isolate
> and a process will be reconciled in node? Specifically, when I write a
> node app, do I need to be explicit about whether it is to run in a process
> or an isolate, or will the node runtime provide an environment that
> abstracts away the differences?
I can't speak for the core team, and I suppose some of the details are still under discussion. But for my application I interpreted the need to be:
- behave identically to processes wherever possible;
- be explicit about the differences in behaviour when you can't.
So, if an isolate spawns a child process, the child continues to run when the isolate exits. Similarly, if an isolate forks another isolate, the child will continue to run if the parent exits. For most applications, there will be no difference seen by the application code between the isolate and process case.
Those things that are process-wide - such as process.setuid() - cannot be faithfully emulated in isolates, so the action would apply to all isolates.
env and cwd are process-wide, but you could make some kind of effort to fake these as being independent between multiple isolates in a process. In the case of cwd, I've made no effort to do that - a call to chdir() will take effect for all isolates in a process - but for my use-case this was irrelevant. In the case of env I have the usual process-wide env, and local isolate overrides that are initialised by the env passed in to fork(). There is a well-defined behaviour for these, but they are different in some situations from the conventional case; for example, they are not transitive (i.e. a child of a child doesn't inherit the local overrides). Again, that's enough for my use-case, but I don't know if that's a useful behaviour in the general case.
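One way to read the env behaviour described above is sketched below. It is a guess at the semantics, not the actual implementation: `processEnv`, `createIsolateEnv` and the `fork` method on the returned object are all hypothetical names. Lookups check the isolate's local overrides first and then fall back to the process-wide env, and a forked child starts with only the overrides passed to it, so a parent's overrides are not inherited.

```javascript
// Stand-in for the real process-wide environment (hypothetical values).
const processEnv = { HOME: '/home/app', MODE: 'production' };

// Hypothetical: build the view of env seen inside one isolate.
function createIsolateEnv(overrides = {}) {
  // Copy so later changes to the caller's object do not leak in.
  const local = Object.assign({}, overrides);
  return {
    // Local overrides win; otherwise fall back to the process-wide env.
    get(name) {
      return Object.prototype.hasOwnProperty.call(local, name)
        ? local[name]
        : processEnv[name];
    },
    // The child sees only what is passed here, not this isolate's overrides.
    fork(childOverrides = {}) {
      return createIsolateEnv(childOverrides);
    },
  };
}

const parent = createIsolateEnv({ MODE: 'test' });
const child = parent.fork({ PORT: '8080' });

console.log(parent.get('MODE')); // test
console.log(child.get('MODE')); // production (parent override not inherited)
console.log(child.get('PORT')); // 8080
console.log(child.get('HOME')); // /home/app
```

With conventional processes the child would normally inherit the parent's modified environment, which is the difference the non-transitive behaviour introduces.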
On Fri, Nov 18, 2011 at 1:26 PM, Paddy Byers <paddy.by...@gmail.com> wrote:
> Hi,
> So, if an isolate spawns a child process, the child continues to run when
> the isolate exits. Similarly, if an isolate forks another isolate, the
> child will continue to run if the parent exits.
What are you implying here? I'm under the impression that regardless of whether you're in an isolates environment or not, the child_process module will continue to spawn real OS processes. At least, I sure hope so. The whole isolates pseudo-process worker thing is great and all, but it should get its own API. The child_process API (other than being unfortunately named) is just fine -- there's no need to remove this feature (and I don't believe the core team intends to).
> For most applications, there will be no difference seen by the application
> code between the isolate and process case.
> Those things that are process-wide - such as process.setuid() - cannot be
> faithfully emulated in isolates, so the action would apply to all isolates.
I don't believe the intent of isolates is to emulate processes. It's an abstraction over processes. Maybe it'll be easier if we just called them "workers" -- and we can say there are thread-backed workers and process-backed workers -- both would have a shared API surface. Which one you choose just depends on your application's needs (thread-backed workers could enable efficient memory sharing tricks, process workers on *nix could open the door to CoW fork tricks).
Those in the know: is this a fair characterization?
> What are you implying here? I'm under the impression that regardless of
> whether you're in an isolates environment or not, the child_process module
> will continue to spawn real OS processes. At least, I sure hope so.
Yes, you're right - if I implied otherwise, it wasn't intended.
> The whole isolates pseudo-process worker thing is great and all, but it
> should get its own API. The child_process API (other than being
> unfortunately named) is just fine -- there's no need to remove this feature
> (and I don't believe the core team intends to).
I agree. (But child_process.fork() is already a different API from spawn(), exec() etc., and it has an options argument that could be used to specify one behaviour or another. The core team might still choose to expose the functionality in a different way.)
> I don't believe the intent of isolates is to emulate processes. It's an
> abstraction over processes. Maybe it'll be easier if we just called them
> "workers" -- and we can say there are thread-backed workers and
> process-backed workers -- both would have a shared API surface. Which one
> you choose just depends on your application's needs (thread-backed workers
> could enable efficient memory sharing tricks, process workers on *nix could
> open the door to CoW fork tricks).
I agree here as well. I wasn't saying that the idea is to emulate processes; only to emulate already-well-understood behaviour that we get today with process-backed workers wherever it is possible and makes sense, so existing modules and applications are broadly agnostic to the environment they're run in.
But as I said at the start, don't take my word for it; it was just a personal take on it.
On Fri, Nov 18, 2011 at 20:32, Dean Landolt <d...@deanlandolt.com> wrote:
> What are you implying here? I'm under the impression that regardless of
> whether you're in an isolates environment or not, the child_process module
> will continue to spawn real OS processes. At least, I sure hope so. The whole
> isolates pseudo-process worker thing is great and all, but it should get its
> own API. The child_process API (other than being unfortunately named) is
> just fine -- there's no need to remove this feature (and I don't believe the
> core team intends to).
> I don't believe the intent of isolates is to emulate processes. It's an
> abstraction over processes. Maybe it'll be easier if we just called them
> "workers" -- and we can say there are thread-backed workers and
> process-backed workers -- both would have a shared API surface. Which one
> you choose just depends on your application's needs (thread-backed workers
> could enable efficient memory sharing tricks, process workers on *nix could
> open the door to CoW fork tricks).
> Those in the know: is this a fair characterization?