Concurrency standard module


Kris Zyp

Sep 21, 2009, 12:12:32 PM
to comm...@googlegroups.com
I was wondering if we could consider a concurrency module to help with
some basic concurrency constructs, in particular for working in
shared-memory threaded environments. While writing middleware for Jack, I
realized it can be almost impossible to write certain functionality in a
thread-safe manner without some concurrency support, like
locks/mutexes, which is lacking from raw JavaScript. I realize that
there are certainly alternate concurrency models that do not require
locks, but I didn't think that CommonJS was attempting to force a
particular concurrency model (and I am not sure that it should either).
With the possibility that modules could be run in a threaded
environment, it seems important to have something to safeguard against
race conditions. For example, Jack runs on Jetty and Simple using
multiple threads with shared memory. Such a topic often sparks
responses suggesting that something like Erlang's messaging would be
nice, but to really seriously consider that I think we would need a real
proposal, and even then I am afraid it would be quite onerous to require
that something like Jack and Persevere be rewritten to follow a
message-passing model, especially while preserving performance and
scalability. I think for where we are right now, we have to consider
that the concurrency model is an unknown, and provide a way for modules
to safeguard themselves from race conditions.

Anyway, here is a real simple, rough start to get going, based on some
of Java's stuff. I don't really care if it changes, I just want
something to be able to write cross engine middleware that is
thread-safe (or have someone explain how I can do it with what we have
right now).

require("concurrency").Lock -> constructor for a reentrant mutual
exclusion lock.
require("concurrency").SharedLock -> constructor for a reentrant shared
lock.
require("concurrency").ThreadLocal -> constructor for a thread-local
variable.

var lock = new (require("concurrency").Lock)();
lock.lock() -> acquires the lock
lock.unlock() -> releases the lock
lock.tryLock() -> tries to acquire the lock without blocking

var threadLocal = new (require("concurrency").ThreadLocal)();
threadLocal.get() -> get the value for the current thread
threadLocal.get(value) -> set the value for the current thread

And we could consider something like Rhino's "sync" function.
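For concreteness, here's a rough sketch of how a Rhino-style sync() could be layered on the proposed Lock. The Lock here is a single-threaded stub (nothing actually blocks); it only illustrates the intended semantics, and all names are hypothetical:

```javascript
// Stub exclusive lock; a real implementation would block in lock().
function Lock() { this.locked = false; }
Lock.prototype.lock = function () { this.locked = true; };
Lock.prototype.unlock = function () { this.locked = false; };
Lock.prototype.tryLock = function () {
  if (this.locked) return false;
  this.locked = true;
  return true;
};

// sync(fn) returns a function that runs fn while holding a private lock,
// releasing it even if fn throws.
function sync(fn) {
  var lock = new Lock();
  return function () {
    lock.lock();
    try {
      return fn.apply(this, arguments);
    } finally {
      lock.unlock();
    }
  };
}
```

A synchronized function would then be used just like the unwrapped one: `var safeIncrement = sync(increment);`.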
Thanks,
Kris

Wes Garland

Sep 21, 2009, 1:20:27 PM
to comm...@googlegroups.com
Hi, Kris;

Writing programs using multiple threads of JavaScript, what I call "MT JS", and JS in a multi-threaded environment (thread-safe JS) are two quite different things.  I know you know this; I'm just setting the stage for further discussions.

What you're talking about is "MT JS", where two separate threads of JavaScript share access to the same JavaScript objects.

This is possible in SpiderMonkey (caveat lector; bugs in 1.8+), and I guess it must also be possible in Rhino?  I'm not sure it is in JSCore or V8; would love to hear from the peanut gallery.

Do you have a syntax for creating Threads from JS, or are you creating new threads of JS from Java and then allowing access to the objects "from the back"?  If you're planning to create Threads from JS, I think you should consider the API that GPSEE uses: http://www.page.ca/~wes/opensource/gpsee/docs/modules/symbols/Thread.Thread.html

I don't yet have a locking API -- haven't needed one for my simple cases; jsapi serializes property accesses, which is enough.  But your mutex API looks basic, and good. lock, unlock, tryLock. Perfecto.

I think the litmus test for "is the low-level lock API good enough" is -- can we write code like this out of it?

var myLock = new (require("concurrency").Lock)();
withLock(myLock, function() { doWork(); });

I believe the answer is yes:

function withLock(lock, fn)
{
   try
   {
     lock.lock();
     fn();
   }
   catch(e)
   {
     throw(e);
   }
   finally
   {
     lock.unlock();
   }
}


require("concurrency").Lock -> constructor for a reentrant mutual
exclusion lock.
require("concurrency").SharedLock -> constructor for a reentrant shared
lock.

Why do we need shared locks, and how do you see those working?  These are just member read locks?  If so, how are updates handled?  Also, I believe SharedLocks could be implemented in pure JS from exclusive locks, meaning it may not be necessary to spec these now.
 
require("concurrency").ThreadLocal -> constructor for a thread-local
variable.

Instantiating real thread-local variables requires modification of the JavaScript interpreter.   I do not see that happening.  I think what you're going after should be called Thread Local Storage, which is totally doable.

var threadLocal  = new (require("concurrency").ThreadLocal)();
threadLocal.get() -> get the value for the current thread
threadLocal.get(value) -> set the value for the current thread

I take it you meant "set" in that last line.
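A sketch of thread-local *storage* in that spirit. Here currentThreadId() is a hypothetical hook the platform would have to supply, and the values map itself would need serialized property access; in this single-threaded sketch the thread id is just injected:

```javascript
// ThreadLocal backed by a plain map keyed on a thread id.
// currentThreadId: a function returning the id of the calling thread
// (hypothetical; supplied by the host platform).
function ThreadLocal(currentThreadId) {
  this.values = {};          // threadId -> value
  this.id = currentThreadId;
}
ThreadLocal.prototype.get = function () {
  return this.values[this.id()];
};
ThreadLocal.prototype.set = function (value) {
  this.values[this.id()] = value;
};
```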

I was just about to propose a CAS addition to this spec, but I just realized that it's not possible with generic JS types. The syntax I considered was

v = require("concurrency").compareAndSwap(v, old, new);

The problem is that there is no lockless mechanism for assigning to v, as JS is pass-by-value. Unless..

if (!require("concurrency").compareAndSwap(scope, "v", old, new))
  print("failure");

I still believe that my CAS idea is a bad one: just throwing it out there to keep others from going down that same path.
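For what it's worth, the property-based form is at least expressible. A single-threaded sketch of the intended semantics follows; real atomicity would depend on the engine serializing the whole read-compare-write, which is exactly the open question above:

```javascript
// compareAndSwap(obj, name, oldVal, newVal):
// if obj[name] is currently oldVal, replace it with newVal and report
// success; otherwise leave it alone and report failure.
// (Only atomic if the host engine makes it so -- not guaranteed here.)
function compareAndSwap(obj, name, oldVal, newVal) {
  if (obj[name] !== oldVal) return false;
  obj[name] = newVal;
  return true;
}
```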

Wes

--
Wesley W. Garland
Director, Product Development
PageMail, Inc.
+1 613 542 2787 x 102

Kris Zyp

Sep 21, 2009, 1:34:01 PM
to comm...@googlegroups.com

Wes Garland wrote:
> Hi, Kris;
>
> Writing programs using multiple threads of JavaScript, what I call "MT
> JS" and JS in a multi-threaded environment (thread-safe JS) are two
> quite different things. I know you know this, I'm just setting the
> stage for further discussions.
>
> What you're talking about is "MT JS", where two separate threads of
> JavaScript share access to the same JavaScript objects.
>
> This is possible in SpiderMonkey (caveat lector; bugs in 1.8+), I
> guess it must also be possible in Rhino? I'm not sure it is in JSCore
> or V8, would love to hear from the peanut gallery.
>
> Do you have a syntax for creating Threads from JS, or are you creating
> new threads of JS from Java and then allowing access to the objects
> "from the back"? If you're planning to create Threads from JS, I
> think you should consider the API that GPSEE uses:
> http://www.page.ca/~wes/opensource/gpsee/docs/modules/symbols/Thread.Thread.html

The immediate situation where I was thinking about this was in Jack and
Persevere, where the threads are created by the servlet engine. The JS
isn't actually creating the threads, but must deal with the fact that it
may be executed in multiple threads. Of course, being able to actually
create the threads in JS would be good as well. Your API looks good to
me. I know Rhino's shell also has a "spawn" function (it simply executes
the callback parameter in a separate thread).


>
> I don't yet have a locking API -- haven't needed one for my simple
> cases; jsapi serializes property accesses, which is enough. But your
> mutex API looks basic, and good. lock, unlock, tryLock. Perfecto.
>
> I think the litmus test for "is the low-level lock API good enough"
> is -- can we write code like this out of it?
>
> var myLock = new (require("concurrency").Lock)();
> withLock(myLock, function() { doWork(); });
>
> I believe the answer is yes:
>
> function withLock(lock, fn)
> {
>    try
>    {
>      lock.lock();
>      fn();
>    }
>    catch(e)
>    {
>      throw(e);
>    }
>    finally
>    {
>      lock.unlock();
>    }
> }
>

Yup.


>
> require("concurrency").Lock -> constructor for a reentrant mutual
> exclusion lock.
> require("concurrency").SharedLock -> constructor for a reentrant
> shared
> lock.
>
>
> Why do we need shared locks, and how do you see those working? These
> are just member read locks? If so, how are updates handled? Also, I
> believe SharedLocks could be implemented in pure JS from exclusive
> locks, meaning it may not be necessary to spec these now.

There are times when shared locks (which can be acquired in shared/read
or exclusive/write mode) are helpful, but you are right, they can be
implemented with simple locks. Just thought I would throw it out there.
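To make that concrete, here's a non-blocking sketch of a shared lock built from an exclusive lock's tryLock(). The Lock below is a single-threaded stand-in, and a blocking variant would wait rather than return false; names are illustrative only:

```javascript
// Stub exclusive lock (single-threaded, so tryLock never contends).
function Lock() { this.held = false; }
Lock.prototype.tryLock = function () {
  if (this.held) return false;
  this.held = true;
  return true;
};
Lock.prototype.unlock = function () { this.held = false; };

// SharedLock: many readers OR one writer, tracked under a state lock.
function SharedLock() {
  this.state = new Lock();   // guards the counters below
  this.readers = 0;
  this.writing = false;
}
SharedLock.prototype.tryLockShared = function () {
  if (!this.state.tryLock()) return false;
  var ok = !this.writing;
  if (ok) this.readers++;
  this.state.unlock();
  return ok;
};
SharedLock.prototype.unlockShared = function () {
  this.state.tryLock();
  this.readers--;
  this.state.unlock();
};
SharedLock.prototype.tryLockExclusive = function () {
  if (!this.state.tryLock()) return false;
  var ok = !this.writing && this.readers === 0;
  if (ok) this.writing = true;
  this.state.unlock();
  return ok;
};
SharedLock.prototype.unlockExclusive = function () {
  this.state.tryLock();
  this.writing = false;
  this.state.unlock();
};
```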

>
>
> require("concurrency").ThreadLocal -> constructor for a thread-local
> variable.
>
>
> Instantiating real thread-local variables requires modification of the
> JavaScript interpreter. I do not see that happening. I think what
> you're going after should be called Thread Local Storage, which is
> totally doable.
>
> var threadLocal = new (require("concurrency").ThreadLocal)();
> threadLocal.get() -> get the value for the current thread
> threadLocal.get(value) -> set the value for the current thread
>
>
> I take it you meant "set" in that last line.
>

yeah.
Kris

Neville Burnell

Sep 21, 2009, 5:39:30 PM
to CommonJS
I'd like to also support implicit unlocking, something like:

require("concurrency").Lock(function() {
    // executes within lock/unlock boundaries
});

Neville Burnell

Sep 21, 2009, 5:56:15 PM
to CommonJS
ugh, that would be more like:

var lock = new (require("concurrency").Lock)();

lock.do(function(){
}); -> acquires the lock, calls func, unlocks

Neville Burnell

Sep 21, 2009, 6:03:52 PM
to CommonJS
which is what Wes described also .... I need coffee 8-)

ihab...@gmail.com

Sep 21, 2009, 6:07:56 PM
to comm...@googlegroups.com
Hi folks,

On Mon, Sep 21, 2009 at 9:12 AM, Kris Zyp <kri...@gmail.com> wrote:
> I was wondering if we could consider a concurrency module to help with
> some basic concurrency constructs, in particular for working in a shared

> memory threaded environments. ... I didn't think that CommonJS was


> attempting to force a particular concurrency model (and I am not sure that
> it should either).

To the extent that individual modules need to defend the consistency
of their state, they must assume *some* concurrency model. If a given
CommonJS module can be used in both threaded and non-threaded systems,
it must have a "worst case" to fall back onto. This worst case might
on the face of it be to build an implementation, and expose an API,
that includes explicit locking and thread-aware usage patterns. Hence
leaving the question open has forced everyone into coding for threads.
Now, how will the threaded primitives be supported on a non-threaded
system?

Does this mean we end up with 3 "flavors" of modules: (a)
non-threaded; (b) threaded; and (c) purely functional?

Perhaps I should ask this from another angle: JS programming has thus
far proceeded via event loop concurrency, and it seems like a
direction that is far less error-prone than _ad hoc_ threads and
locking. What is wrong with maintaining that?

Ihab

--
Ihab A.B. Awad, Palo Alto, CA

ihab...@gmail.com

Sep 21, 2009, 6:21:50 PM
to comm...@googlegroups.com
By way of a concrete proposal here --

What I *do* think CommonJS should standardize is an API for building
and living within "workers", each of which is single threaded, and
which pass data to one another asynchronously. This should be similar
to Gears and HTML5 workers, and should be informed by those efforts.

Ihab

Wes Garland

Sep 21, 2009, 11:07:30 PM
to comm...@googlegroups.com
Ihab:

I, too, think that JS workers are a good idea. There is another use-case to consider, however.

If an existing MT application wants to run multiple threads of JavaScript: how do we handle this?

Individual property access is not difficult; SpiderMonkey (and I assume Rhino) offer serialized property access.

The tricky part is maintaining atomic updates across multiple properties of the same object. This really is tricky, and we'd need to "stamp" modules MT-safe if we wanted to support this in CommonJS. This means that we need a theory of thread-safety for modules which purport to be MT safe, and it should be consistent everywhere.

I think the following rules may be sufficient for this:
 - Underlying JavaScript engine provides serialized property access
 - require() is guarded by a mutual exclusion lock across the entire JS runtime (all threads)
 - modules are responsible for their own theory of thread safety for non-exported functions and variables, and can assume the presence of a level-0 concurrency module
 - exported functions which accept non-Atom arguments must not modify those arguments
 - exported functions which return non-Atoms must return new non-Atoms
 - Boolean, Number and String are considered Atoms.

Note also that I believe very, very strongly that absolute minimization of the theory of thread safety is paramount.

Wes Garland

Sep 21, 2009, 11:10:44 PM
to comm...@googlegroups.com
> If an existing MT application wants to run multiple threads of JavaScript: how do we handle this?

Forgot to clarify this -- recall that CommonJS isn't necessarily about writing programs in JS, it's also about augmenting existing applications with JavaScript.  One such example is the Apache 2 web server running the threaded MPM. What if we .... wanted to write a JavaScript database API which pools connections?

Kris Zyp

Sep 22, 2009, 1:16:39 AM
to comm...@googlegroups.com
I think this is indeed a reasonable alternative (and I appreciate the
concrete proposal, I most wanted to avoid indefinitely deferring a
specification because some other model might be better without any real
solid way to move forward). A few questions/thoughts (not necessarily in
any coherent argument):

First, this is a real uphill battle. CommonJS has barely started and we
have already been herded into shared-memory threading with Jack and
others, due to the underlying concurrency models of the JVM and C and
the path of least resistance of building on Jetty, Simple, etc. Keeping
shared-nothing concurrency (with event loops) wouldn't be a bad path,
but it'll be non-trivial to stay on that path.

Second, it seems like CommonJS has had a fairly low-level focus so far.
Naturally the low-level aspect of concurrency is shared-memory threading
(that which is exposed by the hardware and what the engines must deal
with in concurrent situations). Even if we want everyone to be doing
shared-nothing architecture, should we still define concurrency
constructs like thread constructors, thread-locals, and locks in order
to be able to compose shared-nothing event loops in JavaScript by using
the lower-level constructs? I suppose if we expose these things, we
can't guarantee that everyone will always build on the higher-level
event loops.

Also, I wanted to actually think through what event looping would look
like in one of our primary target applications, that is, the web server.
Of course a web server needs to be able to concurrently process
requests. If we are using event loops with nothing shared between
threads, we need separate environments for each thread (and a unique
JSGI app created for each one). This introduces some funky
non-determinism. With just a few requests coming into something like
Jetty, you could very well have every request fulfilled by the same JSGI
app, since every request might finish before the next came in. A
programmer may (perhaps inadvertently) be storing some state in a
per-app variable (in a closure inside the middleware, outside the app),
and expect it to be available to all requests. It is not until requests
come in at a fast enough pace to force multiple threads to handle
requests that another JSGI app becomes utilized, and that variable is no
longer shared across all requests.

Also, if the inter-worker communication is asynchronous (which I presume
is preferable to avoid deadlocks), it means that any type of information
sharing between workers will force us into the added complexity of
callbacks or promises (preferably promises). Something as simple as a
server-side session management would seem to entangle us into fairly
complex asynchronous code.

That being said, this is appealing to me if others think we can really
make it go. The fact that there is an existing API for this in HTML5
could perhaps allow us to sidestep a lot of bike shedding. If we think
it is desirable, maybe we could make an event-loop based fork of Jack to
try it out. I might do it sometime, unless someone else jumps on it.
Thanks,
Kris

Ryan Dahl

Sep 22, 2009, 6:07:46 AM
to comm...@googlegroups.com
I agree with Ihab - it would be nice if CommonJS defined APIs for a
single process/worker and did not enter the realm of threads. JavaScript
does not have threads - there is not any proposal to have threads.
While some/most implementations can handle threads - it seems rather
unnecessary to leave behind ES5 and HTML5 just for lack of
imagination. Concurrency should either be done with an event loop
(which requires much more devotion to asynchronous APIs) or with
workers.

Node.js achieves high concurrency on a single event loop. Here is a
"hello world" web server benchmark I ran yesterday for a talk:
http://s3.amazonaws.com/four.livejournal/20090921/bench.png
There is no need to define extensions to javascript (threads) to write
servers - indeed, evidence suggests that doing so damns the server to
suck.

Hannes Wallnoefer

Sep 22, 2009, 6:53:37 AM
to comm...@googlegroups.com
2009/9/22 Ryan Dahl <coldre...@gmail.com>:
>
> I agree with Ihab - it would be nice if CommonJS defined APIs for a
> single process/worker and did not enter the realm of threads. JavaScript
> does not have threads - there is not any proposal to have threads.
> While some/most implementations can handle threads - it seems rather
> unnecessary to leave behind ES5 and HTML5 just for lack of
> imagination. Concurrency should either be done with an event loop
> (which requires much more devotion to asynchronous APIs) or with
> workers.

To me the threads model feels natural, but I know that's just because
I've banged my head on it for long enough. So I'm all for considering
fresh (or, as is the case for the event loop, refurbished) concepts.

> Node.js achieves high concurrency on a single event loop. Here is a
> "hello world" web server benchmark I ran yesterday for a talk:
> http://s3.amazonaws.com/four.livejournal/20090921/bench.png

Out of curiosity, is that Helma 1 or NG? Is there a blog post or
presentation for that talk?

Hannes

Neville Burnell

Sep 22, 2009, 7:03:46 AM
to CommonJS
> Out of curiosity, is that Helma 1 or NG? Is there a blog post or
> presentation for that talk?

I'm interested in the detail also.

Ryan Dahl

Sep 22, 2009, 7:08:02 AM
to comm...@googlegroups.com
On Tue, Sep 22, 2009 at 12:53 PM, Hannes Wallnoefer <han...@gmail.com> wrote:
>
>> Node.js achieves high concurrency on a single event loop. Here is a
>> "hello world" web server benchmark I ran yesterday for a talk:
>> http://s3.amazonaws.com/four.livejournal/20090921/bench.png
>
> Out of curiosity, is that Helma 1 or NG? Is there a blog post or
> presentation for that talk?

That's helma-1.6.3 using this http://helma.pastebin.com/f1d786c9 to
serve the request. (v8cgi 0.6, narwhal d147c160, node 0.1.11.) I know
it isn't fair to present benchmarks this way, without scripts,
methods, or data. I'm just being lazy. I did a more detailed benchmark
a few weeks ago (without helma)
http://four.livejournal.com/1019177.html and the setup is similar.

The talk was just a short lightning presentation:
http://s3.amazonaws.com/four.livejournal/20090922/webmontag.pdf

Kris Zyp

Sep 22, 2009, 9:06:13 AM
to comm...@googlegroups.com
Do you plan on making Node be JSGI compliant? It does not look like
there is any conflict between the JSGI API and your fundamental
processing model. Or would one need to run something like Jack on top of
Node to utilize cross-server applications? I was curious what you meant
by "Narwhal". Narwhal's http module is an HTTP client, not a server
(maybe that's why it scored low!). Or did you mean Jack? And if so,
which engine was it running on? Having such low numbers with Simple
would seem suspect, since Simple has such high performance
characteristics with fairly little overhead added by Jack (or maybe
there is something expensive in there that I missed).
Kris

Ryan Dahl

Sep 22, 2009, 9:40:19 AM
to comm...@googlegroups.com
On Tue, Sep 22, 2009 at 3:06 PM, Kris Zyp <kri...@gmail.com> wrote:
>
> Do you plan on making Node be JSGI compliant?

JSGI compliant - probably not. At least partially CommonJS compliant -
probably. At the moment I'm letting node evolve on its own.

> It does not look like
> there is any conflict between the JSGI API and your fundamental
> processing model. Or would one need to run something like Jack on top of
> Node to utilize cross-server applications?

I don't like that JSGI forces one to determine the response code and
response headers in a single function. One cannot, for example, access
a database before returning the status code. Maybe I want to 404 if a
record does not exist. Well, one can do this if your database call
blocks execution inside the Jack handler, but blocking implies
allocating a new execution stack with, if not threads, then
coroutines. Allocating a 2MB execution stack for each request is
rather an unacceptable sacrifice to make at the API level, even
before an implementation is around to further slow down the server.

So, perhaps Node will have proper coroutines at some point (the
current implementation is rather bad). And perhaps someone will write
a JSGI interface on top of Node's more general HTTP interface, and
then people can access databases and wait() for the promise to
complete, and return 404 from the same function. This is not ruled
out.

> I was curious what you meant
> by "Narwhal". Narwhal's http module is a HTTP client, not a server
> (maybe thats why it scored low!). Or did you mean Jack? And if so, which
> engine was it running on? Having such low numbers with Simple would seem
> suspect since Simple has such a high performance characteristics with
> fairly little overhead added by Jack (or maybe there is something
> expensive in there that I missed).

The exact code being used was the same as described in
http://four.livejournal.com/1018026.html (which is a different link
than the one I gave above).

Kris Zyp

Sep 22, 2009, 10:06:33 AM
to comm...@googlegroups.com

Ryan Dahl wrote:
> On Tue, Sep 22, 2009 at 3:06 PM, Kris Zyp <kri...@gmail.com> wrote:
>
>> Do you plan on making Node be JSGI compliant?
>>
>
> JSGI compliant - probably not. At least partially CommonJS compliant -
> probably. At the moment I'm letting node evolve on its own.
>

Oh, you aren't the one responsible for node?


>
>> It does not look like
>> there is any conflict between the JSGI API and your fundamental
>> processing model. Or would one need to run something like Jack on top of
>> Node to utilize cross-server applications?
>>
>
> I don't like that JSGI forces one to determine the response code and
> response headers in a single function. One cannot for example, access
> a database before returning the status code. Maybe I want to 404 if a
> record does not exist. Well, one can do this if your database call
> blocks execution inside the jack handler, but blocking implies
> allocating a new execution stack with, if not threads, then
> coroutines. Allocating a 2mb execution stack for each request is
> rather an unacceptable sacrifice to be had at the API level, even
> before an implementation is around to further slow down the server.
>

This is perfectly doable with the JSGI model plus promises that we have
discussed, as long as the JSGI server does not grab the status from the
response object until it needs to. Here is an example JSGI app with
async database interaction, completely in line with your event loop
model, no threading or coroutines needed (although I wouldn't mind some
coroutine sugar a la JS 1.7 generators, but it's not necessary):

app = function databaseApp(env){
    // the JSGI server should not write the status until the first body write
    var resultPromise = doAsyncDatabase("select * from table where id = 3");
    var response = {
        status: 200,
        headers: {},
        body: {forEach: function(write){
            return resultPromise.then(function(resultSet){
                if(resultSet.length == 0){
                    // change the status before the forEach promise is fulfilled
                    response.status = 404;
                }
                else{
                    write(JSON.stringify(resultSet[0]));
                }
            });
        }}
    };
    return response;
};

> So, perhaps Node will have proper coroutines at some point (the
> current implementation is rather bad). And perhaps someone will write
> a JSGI interface on top of Node's more general HTTP interface, and
> then people can access databases and wait() for the promise to
> complete, and return 404 from the same function. This is not ruled
> out.
>

Yeah, I guess we could do that, although hopefully it is apparent that
JSGI + promises API would not actually hinder Node (and it would be cool
to be able to run JSGI apps directly on it).
Kris

Wes Garland

Sep 22, 2009, 10:09:08 AM
to comm...@googlegroups.com
> Allocating a 2mb execution stack for each request is
> rather an unacceptable sacrifice to be had at the API level, even
> before an implementation is around to further slow down the server.

It also seems about 1.8MB too heavy, but I suppose that's neither here nor there. Are Rhino contexts REALLY that big, though?

Ryan Dahl

Sep 22, 2009, 10:52:54 AM
to comm...@googlegroups.com
On Tue, Sep 22, 2009 at 4:06 PM, Kris Zyp <kri...@gmail.com> wrote:
>
> This is perfectly doable with the JSGI model plus promises that we have
> discussed, as long as the JSGI server does not grab the status from the
> response object until it needs to.

Okay - I stand corrected. That's an important feature!

> app = function databaseApp(env){
>   var resultPromise = doAsyncDatabase("select * from table where id = 3");
>   var response = {
>     status: 200, // the JSGI server should not write the status until
> the first body write
>     headers:{},
>     body: {forEach: function(write){
>         return resultPromise.then(function(resultSet){
>            if(resultSet.length == 0){
>               response.status = 404; // change the status before the
> forEach promise is fulfilled
>            }
>            else{
>                write(JSON.stringify(resultSet[0]));
>            }
>         });
>     }}
>   };
>   return response;
> };

Does seem rather convoluted, doesn't it? I would still rather see
something like:

server.addListener("request", function (request, response) {
    var resultPromise = doAsyncDatabase("select * from table where id = 3");
    resultPromise.addCallback(function (resultSet) {
        if (resultSet.length == 0) {
            response.sendHeader(404, {});
        } else {
            response.sendHeader(200, {});
            response.sendBody(JSON.stringify(resultSet));
        }
        response.finish();
    });
});

Kris Zyp

Sep 22, 2009, 10:58:38 AM
to comm...@googlegroups.com

I agree that the JSGI/promise approach is more awkward than your
example, but I think the philosophy is that JSGI allows the simple case
(a basic sync function) to be very simple, and you can deal with extra
complexity as needed. In that sense, I think JSGI strikes a great
balance, making the common use cases very easy to code while still
keeping it possible to do more advanced async processing.
Kris

Kris Zyp

Sep 22, 2009, 11:16:11 AM
to comm...@googlegroups.com

Ryan Dahl wrote:
>
> Does seem rather convoluted, doesn't it? I would still rather see
> something like:
>
> server.addListener("request", function (request, response) {
>     var resultPromise = doAsyncDatabase("select * from table where id = 3");
>     resultPromise.addCallback(function (resultSet) {
>         if (resultSet.length == 0) {
>             response.sendHeader(404, {});
>         } else {
>             response.sendHeader(200, {});
>             response.sendBody(JSON.stringify(resultSet));
>         }
>         response.finish();
>     });
> });
>

Another thing we could consider is allowing promises to be returned
directly from the JSGI app, rather than just from the forEach call as I
demonstrated. Returning a promise from a JSGI app would look like:

app = function databaseApp(env){
    return doAsyncDatabase("select * from table where id = 3").
        then(function(resultSet){
            if(resultSet.length == 0){
                return {status: 404, headers: {}, body: ""};
            }
            else{
                return {
                    status: 200,
                    headers: {},
                    body: JSON.stringify(resultSet[0])
                };
            }
        });
};

That reads a lot nicer, but it would put an extra burden on servers and
middleware to have to look for promises in two different places (you
still need promises from forEach in order to facilitate streamed responses).
Kris

Kevin Dangoor

Sep 22, 2009, 11:19:57 AM
to comm...@googlegroups.com
On Tue, Sep 22, 2009 at 10:52 AM, Ryan Dahl <coldre...@gmail.com> wrote:
> Does seem rather convoluted, doesn't it? I would still rather see
> something like:
>
>  server.addListener("request", function (request, response) {
>      var resultPromise = doAsyncDatabase("select * from table where id = 3");
>      resultPromise.addCallback(function (resultSet) {
>           if (resultSet.length == 0) {
>               response.sendHeader(404, {});
>           } else {
>               response.sendHeader(200, {});
>               response.sendBody(JSON.stringify(resultSet));
>           }
>           response.finish();
>      });
>  });


It's worth noting that most app developers will not be coding to "raw JSGI". You could make an adapter to JSGI+Promises that looks like that, right?
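A sketch of what such an adapter might look like: it wraps a Node-style (request, response) handler into a JSGI-shaped app that returns a promise for the response object. The `defer` helper is a minimal stand-in for whatever promise library gets specced, and all the names here are hypothetical:

```javascript
// Minimal deferred: resolve() fulfills; then() registers a callback,
// firing immediately if already resolved.
function defer() {
  var callbacks = [], result, resolved = false;
  return {
    resolve: function (value) {
      resolved = true;
      result = value;
      for (var i = 0; i < callbacks.length; i++) callbacks[i](value);
    },
    promise: {
      then: function (callback) {
        if (resolved) callback(result);
        else callbacks.push(callback);
      }
    }
  };
}

// adapt(handler): turn a (request, response)-style handler into a
// JSGI-ish app returning a promise for {status, headers, body}.
function adapt(handler) {
  return function (env) {
    var deferred = defer();
    var response = {
      sendHeader: function (status, headers) {
        this.status = status;
        this.headers = headers;
        this.body = "";
      },
      sendBody: function (chunk) { this.body += chunk; },
      finish: function () {
        deferred.resolve({
          status: this.status,
          headers: this.headers,
          body: this.body
        });
      }
    };
    handler(env, response);
    return deferred.promise;
  };
}
```

The handler stays in the imperative sendHeader/sendBody/finish style, while servers and middleware see only a promise-returning app.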

Kevin

--
Kevin Dangoor

work: http://labs.mozilla.com/
email: k...@blazingthings.com
blog: http://www.BlueSkyOnMars.com

Kevin Dangoor

Sep 22, 2009, 11:21:24 AM
to comm...@googlegroups.com
On Tue, Sep 22, 2009 at 11:16 AM, Kris Zyp <kri...@gmail.com> wrote:
> That reads a lot nicer, but it would put extra burden on servers and
> middleware to have to look for promises at two different places (you
> still need promises from forEach in order to facilitate streamed responses).


Yeah, that is a heavy burden. The Python folks apparently had a lot of discussion about async WSGI, and they were talking about "async-aware" middleware. A solution that doesn't require every piece of the stack to inspect the return value is a big win.

Kevin

Tom Robinson

Sep 22, 2009, 2:50:08 PM
to comm...@googlegroups.com
I suspect doing this would break a lot of middleware that looks at the
response. But then again, the whole idea of JSGI middleware won't work
at all with a purely async interface like Node's, AFAIK.

The only thing I'd worry about is tricky unexpected behaviors, but I
suppose that can mostly be solved through documentation (listing which
middleware is partially or fully async compliant).

Mike Wilson

Sep 23, 2009, 7:19:43 AM
to comm...@googlegroups.com
Kris Zyp wrote:
> Something as simple as a
> server-side session management would seem to entangle us into fairly
> complex asynchronous code.

Yes, this is what I was also thinking about in a discussion a
couple of weeks ago when we touched on threading/runtimes.

Theoretically, I'm all for the shared-nothing approach and
would like to see that materialized in CommonJS. Though, I
haven't had any personal revelation on how this would be
used in a web server that wants simultaneous requests to
share data "from RAM", i.e. application or session data, in
an unobtrusive way...
Other script languages often use the database for this data
sharing, but it sure would be nice to have a solution that
scales to "live" objects as well.

Best regards
Mike Wilson

ihab...@gmail.com

Sep 23, 2009, 11:04:07 AM
to comm...@googlegroups.com
On Wed, Sep 23, 2009 at 4:19 AM, Mike Wilson <mik...@hotmail.com> wrote:
> ... I

> haven't had any personal revelation on how this would be
> used in a web server that wants simultaneous requests to
> share data "from RAM", ie application or session data, in
> an unobtrusive way...
> Other script languages often use the database for this data
> sharing, but it sure would be nice to have a solution that
> scales to "live" objects as well.

The "live" objects could live in a shared worker, and each worker
servicing an independent HTTP request could send that worker
asynchronous messages. That shared worker now behaves exactly like the
"database" you describe.
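A minimal sketch of what that shared worker's message handler might look like (the action names and message shape here are invented for illustration; other workers would reach this state only via asynchronous postMessage calls):

```javascript
// Hypothetical handler running inside the single shared worker that owns
// the "live" session data. Because only this worker touches `sessions`,
// no locking is needed.
var sessions = {};
function handleSessionMessage(message) {
    switch (message.action) {
        case "set":
            sessions[message.key] = message.value;
            return { ok: true };
        case "get":
            return { value: sessions[message.key] };
    }
}
```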

Mike Wilson

unread,
Sep 23, 2009, 12:23:33 PM9/23/09
to comm...@googlegroups.com
Yes, I realize it can be done that way. The key was
"unobtrusive", and the async patterns required by a
worker solution can get a bit verbose. It all depends
on how well you can factor out the session access I
guess.

Anyway, if there is general agreement that the typical
stateful webapp can be nicely designed with this
pattern, then let's go for it!

Best regards
Mike


Donny Viszneki

unread,
Sep 23, 2009, 12:40:52 PM9/23/09
to comm...@googlegroups.com
On Mon, Sep 21, 2009 at 12:12 PM, Kris Zyp <kri...@gmail.com> wrote:
> require("concurrency").ThreadLocal -> constructor for a thread-local
> variable.

Assuming that a newly created thread has some kind of entry point at
which it starts executing, and has its own lexical scope... what
is the purpose of thread-local memory? Thread-local memory is a
technology for systems programming languages.

http://en.wikipedia.org/wiki/Thread-local_storage

--
http://codebad.com/

ihab...@gmail.com

unread,
Sep 23, 2009, 2:39:41 PM9/23/09
to comm...@googlegroups.com
Hi Wes,

On Mon, Sep 21, 2009 at 8:07 PM, Wes Garland <w...@page.ca> wrote:
> The tricky part is maintaining atomic updates across multiple properties of
> the same object.

Not just the *same* object -- *across* multiple objects as well. In
other words, for some subgraph S of objects, we need to be concerned
about whether a temporarily inconsistent state of this subgraph is (a)
observable; and (b) usable by other objects as a way to subvert that
subgraph's future attempts at regaining consistency.

The "worker" model offers a pretty good compromise. Each worker
processes one event at a time. While a worker is processing an event,
its internal state may be inconsistent. However, it does not serve any
other events, so this inconsistency is neither (a) observable nor (b)
subvertible. The programmer in charge of the worker should arrange to
reestablish consistency at the end of each event (even if by a Boolean
flag that says "inconsistent" and causes service requests to queue up
or be rejected).
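The queue-while-inconsistent pattern described above can be sketched in a few lines (the names here are illustrative, not from any proposed API; the handler runs one event to completion before the next is dequeued):

```javascript
// Sketch: a worker-like object that processes one event at a time.
// While the handler runs, internal state may be inconsistent, so any
// event sent during that window is queued rather than handled reentrantly.
function ConsistentWorker(handler) {
    this.handler = handler;
    this.inconsistent = false; // true while an event is being processed
    this.pending = [];
}
ConsistentWorker.prototype.send = function (event) {
    this.pending.push(event);
    if (this.inconsistent) return; // mid-update: leave the event queued
    while (this.pending.length) {
        this.inconsistent = true;  // state may now be inconsistent
        this.handler(this.pending.shift());
        this.inconsistent = false; // consistency reestablished per event
    }
};
```

An event sent from inside the handler is deferred until the current event finishes, so the inconsistent state is never observable.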

> I think the following rules may be sufficient for this:
>  - Underlying JavaScript engine provides serialized property access
>  - require() is guarded by a mutual exclusion lock across the entire JS
> runtime (all threads)
>  - modules are responsible for their own theory of thread safety for
> non-exported functions and variables, and can assume the presence of a
> level-0 concurrency module
>  - exported functions which accept non-Atom arguments must not modify those
> arguments
>  - exported functions which return non-Atoms must return new non-Atoms
>  - Boolean, Number and String are considered Atoms.

I don't know if these rules would work, but they do seem to suggest
that each *module* lives in its own threading realm of sorts, exposing
a simplified interface to other modules. The worker model keeps the
two concerns largely independent: multiple modules may live within one
worker and communicate in a simple, synchronous manner, or they may
spawn other workers and communicate asynchronously with these workers.

Donny Viszneki

unread,
Sep 23, 2009, 3:16:52 PM9/23/09
to comm...@googlegroups.com
On Wed, Sep 23, 2009 at 2:39 PM, <ihab...@gmail.com> wrote:
> On Mon, Sep 21, 2009 at 8:07 PM, Wes Garland <w...@page.ca> wrote:
>> The tricky part is maintaining atomic updates across multiple properties of
>> the same object.
>
> Not just the *same* object -- *across* multiple objects as well. In
> other words, for some subgraph S of objects, we need to be concerned
> about whether a temporarily inconsistent state of this subgraph is (a)
> observable; and (b) usable by other objects as a way to subvert that
> subgraph's future attempts at regaining consistency.

Do you think your concern cannot reasonably be addressed if Wes's
requirements are granted? For clarity:

Ihab seems concerned with collisions of Javascript logic.

Wes seems concerned with collisions of interpreter logic.

If you can be sure that accessing a particular variable concurrently
is safe, can't you then use that as a primitive for locking other
subsections of the object graph? I don't think the goal is to make it
impossible for a programmer to write poor code.

> The "worker" model offers a pretty good compromise. Each worker
> processes one event at a time. While a worker is processing an event,
> its internal state may be inconsistent. However, it does not serve any
> other events, so this inconsistency is neither (a) observable nor (b)
> subvertible. The programmer in charge of the worker should arrange to
> reestablish consistency at the end of each event (even if by a Boolean
> flag that says "inconsistent" and causes service requests to queue up
> or be rejected).

Sure, but can't Workers be *easily* implemented on top of the other
concurrency features others are asking for?

The Worker API is just a multi-threading API that chops up the object
graph for each Worker, such that the only things shared are basically
synchronized message queues.
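As a rough illustration of Donny's point that locks can be built on top of a queue primitive, here is a sketch of a mutex whose lock requests are granted one at a time from a waiter queue (names and API are invented; in a real worker setting the queue operations would be message sends):

```javascript
// Sketch: a mutual-exclusion lock built on a FIFO waiter queue.
// lock(callback) runs the callback once the lock is acquired;
// unlock() hands the lock directly to the next queued waiter.
function QueueLock() {
    this.locked = false;
    this.waiters = [];
}
QueueLock.prototype.lock = function (callback) {
    if (this.locked) {
        this.waiters.push(callback); // lock held: queue the request
    } else {
        this.locked = true;
        callback();
    }
};
QueueLock.prototype.unlock = function () {
    if (this.waiters.length) {
        this.waiters.shift()(); // pass the lock to the next waiter
    } else {
        this.locked = false;
    }
};
```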

p.s. if anyone wants to tell me what the mothership that launches the
first Workers is called in Worker vernacular, you win cookies

--
http://codebad.com/

Kris Zyp

unread,
Sep 23, 2009, 3:23:10 PM9/23/09
to comm...@googlegroups.com

Mike Wilson wrote:
> Yes, I realize it can be done that way. The key was
> "unobtrusive", and the async patterns required by a
> worker solution can get a bit verbose. It all depends
> on how well you can factor out the session access I
> guess.
>
> Anyway, if there is general agreement that the typical
> stateful webapp can be nicely designed with this
> pattern, then let's go for it!
>
You can't wrap an asynchronous operation in something to make it look
like a simple sync operation. Any time you accessed a session (if it
were a shared server-side data structure) you would be forced to put the
continuation in a callback (via promises or otherwise, with the
exception of generator-style async code). In that sense I am not sure it
would be classified as "unobtrusive".

I am actually in favor of the event loop, but I am not sure we really
want, or even can provide, a total shared-nothing architecture. The most
popular platform (at least ATM) is Rhino with access to the Java
runtime. If a platform attempts to create multiple isolated
vats/mini-processes, each with their own event loop, they can still
ultimately access shared data in the JVM from different threads. The
only way to prevent that would be to block access to Java packages
(access to Java libraries is one of the main reasons people use Rhino,
to fill in the gaps that CommonJS won't be able to cover within the next
5 years) or run multiple Java processes (which would be horrible).

Because we can't realistically enforce a true shared-nothing
architecture in a JS environment, I don't think we should try to
eliminate every possible form of access to shared data between threads.
I believe we would be in a better position to provide a good event-loop
based environment where developers don't really need to share references
to mutable objects across threads (they can pass async messages between
vats), but can if they really want to. The default behavior is that
modules should never need to deal with thread safety (unless they are
specifically designed for that purpose). Guidelines should insist that
mutable data that is referenced between threads should not escape
modules. In particular, JSGI needs to allow middleware and apps to
communicate with another worker process through the HTML5 worker API,
such that they can communicate across concurrency boundaries without
accessing shared mutable data. Also, promises that enqueue tasks must
only be fulfilled on the initiating process.

Consequently, I believe we should still provide a *low-level*
concurrency API that defines some basic concurrency constructs. We can
then build an event loop system on top of this which could essentially
be implemented in JS using the concurrency module's functions (which
makes it easier to span engines). Other modules may also use this API (with
great caution) if they need to implement something which fits better
with the threaded shared mutable data model. This allows module writers
to choose the best tool for the job.

Thoughts?
Thanks,
Kris

Tom Robinson

unread,
Sep 23, 2009, 3:51:08 PM9/23/09
to comm...@googlegroups.com

I've been trying to figure out how one would efficiently implement
(for example) a JSGI server that uses multiple web workers. Workers
only allow you to send messages back and forth. Would you need to
buffer and serialize the request and response, or is there some way
you could hand off the io streams to a worker?


Donny Viszneki

unread,
Sep 23, 2009, 4:08:23 PM9/23/09
to comm...@googlegroups.com

Doesn't that depend on the Workers implementation?

--
http://codebad.com/

Kris Zyp

unread,
Sep 23, 2009, 4:18:00 PM9/23/09
to comm...@googlegroups.com

I've been thinking through this as well. This is one of the reasons why
I suggested that we don't want a completely shared-nothing architecture.
Basically, I think we want Jack to create a pool of workers, and then as
requests come in, they should be delegated to whatever worker is idle
and available (or have the request go into a pool waiting for idle
workers if none is available). Each worker has its own event loop, and
so when a request is delegated to the worker, it goes into its event
loop to be processed. For the sake of efficiency, it's important that
one can actually pass a reference to the
request/response/inputstream/outputstream to the workers, not just a
string.

I started toying with how this would work, and this is what I was thinking:
reactor.js - This would be built up to be the event loop handler (I
think that was Kris Kowal's intention with this module). It would have a
thread-safe enqueue function that could be called from any thread, and
would add a task to the event queue (in Rhino, we would probably use a
LinkedBlockingQueue). reactor would also expose an enterEventLoop()
function that would start the processing of the queue (should only be
called by the thread for the event loop).
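A rough single-threaded sketch of the reactor module described above (in Rhino the queue would be a LinkedBlockingQueue and enqueue would be callable from any thread; here a plain array stands in, which is enough to show the shape of the API):

```javascript
// Sketch of reactor.js: enqueue() adds a task to the event queue,
// enterEventLoop() drains the queue, running tasks in FIFO order.
var reactor = (function () {
    var queue = [];
    return {
        enqueue: function (task) {
            queue.push(task); // thread-safe in the real, blocking-queue version
        },
        enterEventLoop: function () {
            // should only be called by the event loop's own thread
            while (queue.length) {
                queue.shift()(); // run the next pending task
            }
        }
    };
})();
```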

worker.js - This would expose a Worker constructor that would start a
new thread and a new global environment for the worker, and put a
postMessage function there that would delegate to the caller's reactor
enqueue. It would start up a new reactor in the worker thread, execute
the script provided by the argument to the worker, and then trigger
enterEventLoop().

The way I was thinking was that there might also be a "postEvent" that
could post any event to the worker with any data (not just a string).
The main JSGI worker (the mothership) would then be able to call
postEvent methods on each worker to send it the request object. The
worker would define an onrequest event to receive the events (this could
be implemented by a script that was passed to the worker from the Jack
main thread), which would be given all the necessary references to
carry out the response to the request.

Does this seem like a reasonable approach?

Kris

Ryan Dahl

unread,
Sep 23, 2009, 5:49:28 PM9/23/09
to comm...@googlegroups.com
On Wed, Sep 23, 2009 at 9:51 PM, Tom Robinson <tlrob...@gmail.com> wrote:
>
> I've been trying to figure out how one would efficiently implement
> (for example) a JSGI server that uses multiple web workers. Workers
> only allow you to send messages back and forth. Would you need to
> buffer and serialize the request and response, or is there some way
> you could hand off the io streams to a worker?

Sending a "port" through a message channel is part of html5's spec
http://www.whatwg.org/specs/web-apps/current-work/multipage/comms.html#posting-messages
it's the analogue of sending an fd with sendmsg(). This would be a way
of distributing requests across multiple workers.

Kris Zyp

unread,
Sep 23, 2009, 11:02:46 PM9/23/09
to comm...@googlegroups.com
I thought an example might be helpful in looking at the implications of
different concurrency approaches. Here is a page counter (with atomic
counting) in two different styles. Something like a server session
object would probably be fairly similar:

Using shared memory with locks:

var hits = 0;
var lock = new (require("concurrency")).Lock();
function incrementAndGetHits(){
    lock.lock();
    var result = ++hits; // capture inside the lock, so the returned
    lock.unlock();       // value matches the increment we performed
    return result;
}
exports.app = function(env){
    return {
        status: 200,
        headers: {"content-type": "text/plain"},
        body: "This page has been hit " + incrementAndGetHits() + " times"
    };
};


And here is the same thing using shared nothing with HTML5 workers:

jsgi file:
var hitTrackerWorker = new (require("worker").SharedWorker)("hit-tracker");
var Future = require("promise").Future; // get the constructor
var waitingFutures = []; // all the requests that are waiting to be finished

// note that we can't just do this in a closure inside the app because there
// may be multiple unfulfilled promises at once, but there is only one
// onmessage handler
hitTrackerWorker.port.onmessage = function(event){
    // probably easiest to use JSON for structured messages
    var message = JSON.parse(event.data);
    switch(message.action){
        case "incrementResults":
            waitingFutures.shift().fulfill(message.hits);
            break;
        // ... all other types of response would need to register here
    }
};

// this app will be created in each worker
exports.app = function(env){
    // send a message to the shared worker to increment the hits and
    // send back the result
    hitTrackerWorker.port.postMessage(JSON.stringify({
        action: "incrementAndGetHits"
    }));

    var future = new Future();
    // add it to our queue of futures that need to be fulfilled
    waitingFutures.push(future);
    return {
        status: 200,
        headers: {"content-type": "text/plain"},
        body: {
            forEach: function(write){
                // delay the response with a promise
                return future.promise.then(function(hits){
                    write("This page has been hit " + hits + " times");
                });
            }
        }
    };
};

hit-tracker.js:
var hits = 0;
onmessage = function(event){
    var message = JSON.parse(event.data);
    switch(message.action){
        case "incrementAndGetHits":
            hits++; // this is thread-safe since it is in a single worker
            // send back the results
            event.ports[0].postMessage(JSON.stringify({
                action: "incrementResults",
                hits: hits
            }));
            break;
        // ... any other actions handled by this worker
    }
};


Is there an easier way to do this with workers? This exercise makes me a
little skeptical that a shared-nothing architecture is really the best
tool for every need. I like having event loops available, and would
certainly use them whenever possible, but I am dubious that we should
(or could) try to eliminate shared mutable data across threads. What if
we simply associated an event loop with every thread (allowing them to
share everything), had JSGI servers distribute requests across event
loops, and had enqueuing operations (deferred futures, setTimeouts, etc.)
always enqueue onto the current thread's event loop, so as to make it
easier for lexically private data to always remain on the same thread
and avoid race conditions?

Kris

ihab...@gmail.com

unread,
Sep 24, 2009, 12:33:49 AM9/24/09
to comm...@googlegroups.com
Hi Kris,

Thanks for writing this up ... this is a really good example.

My hunch is that the HTML5 workers solution is made verbose by the
fact that the HTML5 worker API is very low-level (albeit *perhaps*
adequate to implement the sugar we need).

I've fired off a thread to e-l...@eros-os.org, and it's already shown
up in the archives, so see:

http://www.eros-os.org/mailman/listinfo/e-lang

The gist of it is to ask: what would this look like in E? E was built,
with a lot of design freedom from legacy, around this specific
architecture, so it should be easy there, right? Let's see what they
come up with; then we can figure out what missing parts we would need
to be able to do similar stuff, and whether those missing parts are
fundamental to JS or just helper libs that someone needs to up and
write.

Again, thanks for the problem and for taking the time to write out the
solution. To my knowledge, this is the first time we've asked
ourselves this question in any detail for pure JS. Kevin Reid, a
longtime E developer, built an implementation of CapTP (the RPC
protocol that E uses), but that was on top of Caja. I am not yet well
enough versed in his work to determine how much of it could (with
sacrifice of its security properties) be adapted to pure JS; Caja
gives JS programmers some virtualization abilities that pure JS
doesn't have.

Kris Zyp

unread,
Sep 24, 2009, 12:59:23 AM9/24/09
to comm...@googlegroups.com, e-l...@mail.eros-os.org, ihab...@gmail.com

ihab...@gmail.com wrote:
> Hi Kris,
>
> Thanks for writing this up ... this is a really good example.
>
> My hunch is that the HTML5 workers solution is made verbose by the
> fact that the HTML5 worker API is very low-level (albeit *perhaps*
> adequate to implement the sugar we need).
>

I am curious what it would look like in E, but you are right, I think we
can do much better than direct interaction with a low level API. I
believe it would be pretty straightforward to write a JSON-RPC library
on top of workers that would handle posting a message, waiting for
response, returning a promise, and fulfilling the promise. Then we could
actually write (with shared nothing event loop):

var hitTrackerWorker = new (require("worker").SharedWorker)("hit-tracker", "tracker");
var hitTrackerRPC = require("rpc-worker").wrapWorker(hitTrackerWorker);

exports.app = function(env){
    return {
        status: 200,
        headers: {"content-type": "text/plain"},
        body: {
            forEach: function(write){
                // hitTrackerRPC.call will return a promise
                return hitTrackerRPC.call("incrementAndGet", []).then(function(hits){
                    write("This page has been hit " + hits + " times");
                });
            }
        }
    };
};


hit-tracker.js:

var exporter = require("rpc-exporter").exporter;

var hits = 0;
exporter({
    incrementAndGet: function(){
        return ++hits;
    }
});

And that doesn't look too bad at all.
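To make the sketch above concrete, here is one way the hypothetical rpc-worker module's wrapWorker could be implemented. It assumes a Future with fulfill() and a promise property (as in Kris's earlier example), and, for simplicity, that replies arrive in call order; the module and method names are this thread's inventions, not an agreed spec:

```javascript
// Sketch: wrap a worker's message port in a call(method, params) API
// that returns a promise. Each call pushes a future onto a queue; each
// reply fulfills the oldest outstanding future.
function wrapWorker(worker, Future) {
    var pending = []; // futures awaiting replies, in call order
    worker.port.onmessage = function (event) {
        var message = JSON.parse(event.data);
        pending.shift().fulfill(message.result); // assumes in-order replies
    };
    return {
        call: function (method, params) {
            var future = new Future();
            pending.push(future);
            worker.port.postMessage(JSON.stringify({
                method: method,
                params: params
            }));
            return future.promise;
        }
    };
}
```

A real implementation would tag each message with a call id rather than rely on ordering.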

Still, I wonder if we should rather be stating the question as where on the gradient of sharing do we want to be (since blocking any shared-mutables may not be realistic, at least with Rhino):

shared-difficult - Every worker has a separate global, postMessage only carries strings, block everything possible, but concede that code may access shared data in the JVM for Rhino users.

shared-easy - Create threads that share everything. But provide event loops to allow users to have a nice share-nothing style mechanism for avoiding sharing if desired.

shared-something-in-between - Perhaps separate globals, but allow posting non-string objects across workers?


Kris

Wes Garland

unread,
Sep 24, 2009, 8:33:54 AM9/24/09
to comm...@googlegroups.com
Kris;

This message is a little off-topic for this list, as I'm going to talk about feasibility of implementation, rather than the merits or the API.  But I'm hoping it will provide food for thought for CommonJS.onTopic. :)

On Thu, Sep 24, 2009 at 12:59 AM, Kris Zyp <kri...@gmail.com> wrote:

Still, I wonder if we should rather be stating the question as where on the gradient of sharing do we want to be (since blocking any shared-mutables may not be realistic, at least with Rhino):

By "blocking shared-mutables", do you mean blocking on read of an object when another thread has some kind of lock on it?  If so, I think this IS implementable -- at least in SpiderMonkey -- but not for plain objects; only objects of a specific (custom) class which is implemented in C. This custom class would need to use a class-wide getter and setter, and use the private storage slot.  The private slot would contain a mutex (e.g. a pointer to a pthread_mutex_t).  The getters and setters would need to hold the mutex before accessing the object, and JS programmers wanting to do multi-property updates would need access to lock()/unlock() methods which work on that private mutex.  I'll bet you can do something equivalent in Rhino.
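The intended usage pattern of such a guarded class can be mocked up in pure JS (without real mutexes, which would need the C implementation Wes describes); here the "mutex" is just a flag that makes unguarded access throw, and all names are illustrative:

```javascript
// Sketch: an object whose properties are only reachable through
// getters/setters that require the lock to be held, mimicking the
// class-wide getter/setter + private mutex slot design.
function GuardedObject() {
    var data = {};
    var held = false; // stands in for the private-slot mutex
    this.lock = function () { held = true; };
    this.unlock = function () { held = false; };
    this.get = function (key) {
        if (!held) throw new Error("lock not held");
        return data[key];
    };
    this.set = function (key, value) {
        if (!held) throw new Error("lock not held");
        data[key] = value;
    };
}
```

Multi-property atomic updates then become lock(); set(...); set(...); unlock().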

shared-difficult - Every worker has a separate global, postMessage only carries strings, block everything possible, but concede that code may access shared data in the JVM for Rhino users.

Totally implementable; don't know why you need to block anything. Strings are immutable upon emergence from the constructor, which is where they are first exposed to thread races.
 
shared-easy - Create threads that share everything. But provide event loops to allow users to have a nice share-nothing style mechanism for avoiding sharing if desired.

This is (relatively) easy to implement. I have implemented this in SpiderMonkey, although I have not written any fancy event loops.

Incidentally -- the state of the SpiderMonkey Threading Union -- bugs were introduced after 1.7 which made threads that shared more than Atoms buggy.  That is mostly fixed now, although Arrays still can't be shared between threads; also, E4X and generators/iterators have thread-safety issues.

Something worthy of notice with my Thread implementation -- the thread handle can be used for communication between the thread creator and the thread itself, e.g.

function myfunc()
{
  print("hello");
  this.myStatus = "world";
}

var t = new Thread(myfunc);
t.start();
t.join();
print(t.myStatus);

(program output is hello<newline>world)
 
shared-something-in-between - Perhaps separate globals, but allow posting non-string objects across workers?

This is something like what my Thread class implements as well -- only the primordial thread has the "real" global object: the other threads run with their scopes set to a child of the real global. Exactly how modules run.

Completely separate globals might not be possible for non-Atom data interchange. The reason I make this claim is that prototypes for those objects would be on the "real" global. I've also tried to implement this in Spidermonkey without success. :)

As one "solution", have we considered the co-routine-ish solution employed by the browser?  Basically, when an event happens (e.g. setTimeout()), the JS engine executes an asynchronous callback  (I think of them as non-maskable interrupts) in other JS code, and jumps back when that other piece of code has run.

While THAT solution has obvious drawbacks -- it's not multithreaded! -- it may be worth considering because it allows some async stuff, has no thread-safety issues, and can be implemented on all the JS engines, not just SpiderMonkey and Rhino.  If you were careful, you might be able to get performance similar to WIN16-style multitasking.

Wes

Kris Zyp

unread,
Sep 24, 2009, 8:43:08 AM9/24/09
to comm...@googlegroups.com

Wes Garland wrote:
> Kris;
>
> This message is a little off-topic for this list, as I'm going to talk
> about feasibility of implementation, rather than the merits or the
> API. But I'm hoping it will provide food for thought for
> CommonJS.onTopic. :)
>
> On Thu, Sep 24, 2009 at 12:59 AM, Kris Zyp <kri...@gmail.com
> <mailto:kri...@gmail.com>> wrote:
>
>
> Still, I wonder if we should rather be stating the question as
> where on the gradient of sharing do we want to be (since blocking
> any shared-mutables may not be realistic, at least with Rhino):
>
>
> By "blocking shared-mutables", do mean blocking on read of an object,
> when another thread has some kind of lock on it?

No, I meant preventing shared access to mutables between threads. Sorry,
poor choice of words since "blocking" has a different meaning in the
concurrency context.

Yeah, having objects that originate from multiple globals is indeed a
pain (we experience that in the browser with frames).

>
> As one "solution", have we considered the co-routine-ish solution
> employed by the browser? Basically, when an event happens (e.g.
> setTimeout()), the JS engine executes an asynchronous callback (I
> think of them as non-maskable interrupts) in other JS code, and jumps
> back when that other piece of code has run.

That's not coroutines; setTimeout uses the event loop that we are talking
about (it places the provided callback on the event queue). The only form
of coroutines in JS is generators.

>
> While THAT solution has obvious drawbacks -- it's not multithreaded!
> -- it may be worth considering because it allows some async stuff, has
> no thread-safety issues, and can be implemented on all the JS engines,
> not just SpiderMonkey and Rhino. If you were careful, you might be
> able to get performance similar to WIN16-style multitasking.

Right, the event loop allows for asynchronicity within a thread/process.
But there could be multiple threads/processes, each with their own event
loop (that's what happens with workers).
Kris

Wes Garland

unread,
Sep 24, 2009, 8:57:04 AM9/24/09
to comm...@googlegroups.com
Ihab;

On Wed, Sep 23, 2009 at 2:39 PM, <ihab...@gmail.com> wrote:
Not just the *same* object -- *across* multiple objects as well. In
other words, for some subgraph S of objects, we need to be concerned
about whether a temporarily inconsistent state of this subgraph is (a)
observable; and (b) usable by other objects as a way to subvert that
subgraph's future attempts at regaining consistency.

Good point, although I consider all objects to be children of global, so it really is a question of perspective. :)

Locking access to multiple objects for atomic update would have to be the responsibility of the using programmer.  This is not a drawback; face it, Threads Are Hard.  There is no way the environment can serialize inter-object or inter-property access (I see these as nearly the same thing) for the programmer: he has to design and implement his own theory of thread safety for his application / module / etc.
 
The "worker" model offers a pretty good compromise. Each worker
processes one event at a time. While a worker is processing an event,
its internal state may be inconsistent. However, it does not serve any
other events, so this inconsistency is neither (a) observable nor (b)
subvertible. The programmer in charge of the worker should arrange to
reestablish consistency at the end of each event (even if by a Boolean
flag that says "inconsistent" and causes service requests to queue up
or be rejected).

That's interesting -- I've never used the worker model as such (recall: I've spent most of my life programming in C) -- but this is pretty much the same as the C threading pattern where the "inconsistent"  Bool is really just a mutex; if you hold the mutex, you are allowed to put the data structure in an inconsistent state.
 
> I think the following rules may be sufficient for this:
>  - Underlying JavaScript engine provides serialized property access
>  - require() is guarded by a mutual exclusion lock across the entire JS
> runtime (all threads)
>  - modules are responsible for their own theory of thread safety for
> non-exported functions and variables, and can assume the presence of a
> level-0 concurrency module
>  - exported functions which accept non-Atom arguments must not modify those
> arguments
>  - exported functions which return non-Atoms must return new non-Atoms
>  - Boolean, Number and String are considered Atoms.

I don't know if these rules would work, but they do seem to suggest
that each *module* lives in its own threading realm of sorts, exposing
a simplified interface to other modules.

It implies that each module has, internally, its own theory of thread safety. The module owner knows what does, and does not, represent inconsistent internal state, and he can manage it.  A clever module owner might even write a module which requires no synchronization primitives. :)
 
The worker model keeps the
two concerns largely independent: multiple modules may live within one
worker and communicate in a simple, synchronous manner, or they may
spawn other workers and communicate asynchronously with these workers.

This is not really different from what I've proposed, I think.

What happens in the worker model when two threads want to load the same module? I see three possibilities without my rules:

1. You break the original promise of module singletons, or
2. You lock access to module exports while they are in use by another thread (how? JS programmer remembers?)
3. You hope like hell the module is thread safe

I've proposed trying to arrange it so that in #3 you don't need to hope.  Everything on the module's exports is immutable, and everything being sent to the module is unchanging.
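One way to approximate "everything on the module's exports is immutable" in engines with ES5 support is a deep freeze of the exports object; a sketch (ignoring functions, and using a freeze-first order so cyclic graphs terminate):

```javascript
// Sketch: recursively freeze an object graph so exported data cannot
// be mutated by any thread. Freezing the parent before recursing means
// a cycle is detected via Object.isFrozen and does not loop forever.
function deepFreeze(obj) {
    Object.freeze(obj); // freeze first so cycles terminate
    Object.getOwnPropertyNames(obj).forEach(function (name) {
        var value = obj[name];
        if (value && typeof value === "object" && !Object.isFrozen(value)) {
            deepFreeze(value);
        }
    });
    return obj;
}
```

A module loader could apply this to module.exports after the module body runs.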

Note that I just realized that my rules have a hole in them: 
>  - exported functions which accept non-Atom arguments must not modify those
> arguments

This is not strong enough in the general-threads case.

Thanks to your explanation, I have now considered the case of worker threads calling into singleton modules. The rules I've proposed above "fall away" given that no objects are shared across threads, except this one:


>  - modules are responsible for their own theory of thread safety for
> non-exported functions and variables, and can assume the presence of a
> level-0 concurrency module

...which leads to an interesting question, in a shared-nothing environment, is that "nothing" also module scope?  If so, you can't have stateful modules, or you can't have module singletons.

Wes
 

Kris Zyp

unread,
Sep 24, 2009, 9:07:07 AM9/24/09
to comm...@googlegroups.com

Wes Garland wrote:
>
>
> What happens in the worker model when two threads want to load the
> same module? I see three possibilities without my rules:
>
> 1. You break the original promise of module singletons, or

With a shared nothing/hard/little approach, modules are loaded once for
each different worker. This doesn't break any promises that we have
made; it has always been known that separate processes would each load
their own modules. The promise we made was that it would be
non-observable; code would never have a reference to two different
module exports.

>
> ...which leads to an interesting question, in a shared-nothing
> environment, is that "nothing" also module scope?

Yes.


> If so, you can't have stateful modules, or you can't have module
> singletons.

Modules can't have their own state that is shared across all workers.
Rather, shared state must be emulated through messaging to a central
shared worker.
Kris

Wes Garland

unread,
Sep 24, 2009, 9:52:05 AM9/24/09
to comm...@googlegroups.com
Kris:

Interesting points, all of which make workers much more implementable.

They definitely put more requirements on the implementation of require, however, at least when require isn't written in JavaScript.

It means that require() needs to track exports and module scope on a thread-by-thread basis, presumably storing pointers to them in thread local storage or similar.

"Similar" because TLS won't work if the implementation pools M:N threads:contexts.  I guess I actually mean "worker-by-worker" basis; 1:1 worker:context seems right.

So, if I understand correctly, we can revise the require() semantics to formally say "per-context singletons" and most of the MT issues go away with workers, even from the native engine POV? That's attractive right there.
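The "per-context singleton" semantics can be sketched as a require factory: each worker/context gets its own module cache, so a module loads once per context rather than once per runtime (the `context.load` hook is an assumption standing in for the real module loader):

```javascript
// Sketch: build a require() bound to one context. Within a context a
// module id always yields the same exports object; a second context
// loads its own independent copy.
function makeRequire(context) {
    var cache = {};
    return function require(id) {
        if (!(id in cache)) {
            cache[id] = context.load(id); // load and memoize per context
        }
        return cache[id];
    };
}
```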

Donny Viszneki

unread,
Sep 24, 2009, 12:25:00 PM9/24/09
to comm...@googlegroups.com
On Wed, Sep 23, 2009 at 11:02 PM, Kris Zyp <kri...@gmail.com> wrote:
> Using shared memory with locks:
>
> var hits = 0;
> var lock = new (require("concurrency")).Lock();
> function incrementAndGetHits(){
>     lock.lock();
>     hits++;
>     lock.unlock();
>     return hits;
> }
> exports.app = function(env){
>     return {
>         status: 200,
>         headers: {"content-type": "text/plain"},
>         body: "This page has been hit " + incrementAndGetHits() + " times"
>     };
> };
>
> And here is the same thing using shared nothing with HTML5 workers:
> [snip]

> Is there an easier way to do this with workers?

On Thu, Sep 24, 2009 at 12:59 AM, Kris Zyp <kri...@gmail.com> wrote:
> I
> believe it would be pretty straightforward to write a JSON-RPC library
> on top of workers that would handle posting a message, waiting for
> response, returning a promise, and fulfilling the promise.

> [snip]


> And that doesn't look too bad at all.

As said earlier in this thread, Workers are isolated threads joined by
synchronized messaging queues. RPC to my mind looks about the same as
using those queues for synchronizing. In fact you could do RPC over
those queues. On top of this, you can build mutex and shared locks,
and then your mutex-locked hit-counter example applies to HTML5
Workers.
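A minimal sketch of RPC over such queues, in the HTML5 style of postMessage/onmessage: post a request, return a promise, and fulfil it when the response message arrives. The tiny `defer` promise and the port shape here are illustrative assumptions, not a specified API:

```javascript
// Sketch of a JSON-RPC layer on top of worker-style message queues:
// post a request, return a promise, fulfil it on the response.
function defer() {
  var callbacks = [], value, resolved = false;
  return {
    resolve: function (v) {
      resolved = true;
      value = v;
      callbacks.forEach(function (cb) { cb(v); });
    },
    then: function (cb) {
      if (resolved) cb(value);
      else callbacks.push(cb);
    }
  };
}

function makeRpcClient(port) {
  var nextId = 1, pending = {};
  // route each response back to the promise for its request id
  port.onmessage = function (event) {
    var response = JSON.parse(event.data);
    var d = pending[response.id];
    delete pending[response.id];
    if (d) d.resolve(response.result);
  };
  return function call(method, params) {
    var d = defer();
    var id = nextId++;
    pending[id] = d;
    port.postMessage(JSON.stringify({id: id, method: method, params: params}));
    return d;
  };
}
```

The same queue pair then serves any number of outstanding calls, since responses are matched to requests by id rather than by arrival order.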

--
http://codebad.com/

Donny Viszneki

unread,
Sep 24, 2009, 12:26:09 PM9/24/09
to comm...@googlegroups.com
On Thu, Sep 24, 2009 at 12:25 PM, Donny Viszneki
<donny.v...@gmail.com> wrote:
> RPC to my mind looks about the same as
> using those queues for synchronizing. In fact you could do RPC over
> those queues. On top of this, you can build mutex and shared locks,
> and then your mutex-locked hit-counter example applies to HTML5
> Workers.

Or put another way, both RPC and the messaging queues in HTML5 Workers
provide the primitive synchronization needed for the typical
synchronized programming scenario.

--
http://codebad.com/

Mike Wilson

unread,
Sep 25, 2009, 5:55:16 AM9/25/09
to comm...@googlegroups.com
I can just say that you are touching on very interesting
subjects here, that I hope will provide a lot of revelations
on how to best do things in the web stack. Keep up the good
work!

Best regards
Mike

Kris Zyp

unread,
Sep 25, 2009, 8:39:07 AM9/25/09
to comm...@googlegroups.com

Not completely; asynchronous messaging can't be encapsulated in a sync
function (one that doesn't require a callback or promise) that can provide
locking. It is impossible to use workers to build something that looks
like a synchronously accessible object, or a synchronous mutex. It might
be more accurate to say that mutexes are one of the primitives that can
be used to build workers (for synchronizing the message queue).
Kris

Kris Zyp

unread,
Sep 25, 2009, 9:59:22 AM9/25/09
to comm...@googlegroups.com
I wanted to throw out a proposal based on our discussions to see if we
might have something we can come to more consensus on:

Event-loop concurrency with shared-only-explicitly
* CommonJS will utilize HTML5 style workers with event-loops to isolate
data from other concurrent processes. Each worker will have its own
global environment and its own set of modules (singleton per worker).
Modules will therefore not need thread-safe coding.
* CommonJS standard modules will not guarantee any mutex or other
thread-safe techniques to access or modify data, unless they explicitly
advertise such capability for their particular function. Users should
not expect any other modules to be thread-safe, unless explicitly
advertised. Modules used in workers that access data accessible only
from a single worker/thread will execute deterministically.
* CommonJS will provide a module for creating and accessing shared data
that spans workers. Anyone that uses this shared data must do so with
caution, realizing that passing this to other modules may result in
non-deterministic behavior.

This is based on:
* Event-loops are familiar from browser-based JavaScript; they have been
demonstrated to be a powerful form of concurrency that makes it
relatively simple to avoid race conditions and deadlocks, and preserving
this paradigm is useful.
* CommonJS is intended to have a fairly large scope of potential users.
Different jobs require different tools, so forcing a single concurrency
model on everyone may not be best in our situation. In fact, there is
already existing JavaScript written on multiple threads with shared
state, and there will continue to be, regardless of what we say or do
here. Defining APIs (and practices) for modules is in the scope of
CommonJS. Eliminating every piece of JS code in the world that uses
shared-state multithreading is not.
* Modules need to have some expectations of concurrency to be properly
written.

The following modules and APIs will be defined:

event-queue - This module will be an event-queue which provides a
thread-safe "enqueue" function (generally called internally from
different threads) and a "next" or "take" function that returns the next
event in the queue, or blocks until one is available. The event-queue
may also have an "isIdle" function and an "onready" event that can be
used to construct worker pools to delegate tasks to (especially for JSGI
servers). Entering the event loop is as simple as creating a loop that
gets the next event and executes it.
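That loop can be sketched in plain JavaScript. The EventQueue below is a single-threaded stand-in: in plain JS, enqueue can't be called from another thread and next can't block, so this only illustrates the shape of the proposed API, not its thread-safety:

```javascript
// A minimal, single-threaded sketch of the proposed event-queue semantics:
// "enqueue" adds a task, "next" returns the next task or null when idle.
// A real implementation would make enqueue thread-safe and make next block.
function EventQueue() {
  var tasks = [];
  this.enqueue = function (task) { tasks.push(task); };
  this.next = function () { return tasks.length ? tasks.shift() : null; };
  this.isIdle = function () { return tasks.length === 0; };
}

// Entering the event loop is just a loop that takes and runs events.
// (Here it exits when the queue drains; a blocking next() would not.)
function runLoop(queue) {
  var event;
  while ((event = queue.next()) !== null) {
    event();
  }
}
```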

worker - This module will provide constructors for Worker and
SharedWorker that follow the HTML5 specification, providing a
postMessage function and an onmessage event handler. Each worker can
execute concurrently and will have its own global environment and set of
modules. Each worker, once spawned, will execute the provided script and
then enter the event loop, which is just an infinite loop of getting the
next event from the queue and executing it. When a worker is constructed,
the initiating code may (optionally) also explicitly pass a shared
object (to the constructor or to a "provideShared()" function). This
object will be accessible to the worker through the "getShared()"
function on the worker module (the worker's instance of the module). The
worker module will also provide a postData(object) function which will
be sugar for postMessage(JSON.stringify(object)), with automatic
JSON.parse(message) to fill the event.data property in the onmessage
listeners (noting that this would make it very easy for implementations
to provide much faster processing than actually doing a JSON
serialization/deserialization).
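A sketch of how such a worker might look, using a single-threaded, in-process emulation. The provideShared/getShared/postData surface is the proposal's hypothetical API; a real implementation would run the worker function in its own isolated context rather than inline:

```javascript
// Single-threaded emulation of the proposed worker surface, for API shape
// only: no isolation or concurrency is actually provided here. The worker
// is given a function rather than a script name, purely for illustration.
function Worker(workerFn) {
  var self = this;
  var shared = null;
  // the worker-side view: its instance of the "worker" module
  var workerSide = {
    getShared: function () { return shared; },
    postMessage: function (msg) {
      if (self.onmessage) self.onmessage({data: msg});
    }
  };
  this.provideShared = function (obj) { shared = obj; };
  this.postMessage = function (msg) {
    if (workerSide.onmessage) workerSide.onmessage({data: msg});
  };
  // postData is sugar for postMessage(JSON.stringify(object))
  this.postData = function (obj) { this.postMessage(JSON.stringify(obj)); };
  this.start = function () { workerFn(workerSide); };
}
```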

concurrency - This module will provide mutexes/locking in order to be
able to safely interact with a shared object or any other mutable data
that is accessible to multiple threads through means outside of those
defined in CommonJS (for example, if an environment provided a thread
constructor).
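A sketch of the reentrant Lock surface. With no real threads in plain JavaScript, the owner is faked with an explicit caller id, and contention throws where a real implementation would block:

```javascript
// Reentrant mutual-exclusion lock, API sketch only. The explicit "id"
// argument stands in for the current thread identity, which a real
// implementation would discover itself; lock() would block, not throw.
function Lock() {
  var owner = null, depth = 0;
  this.lock = function (id) {
    if (owner !== null && owner !== id)
      throw new Error("would block: held by " + owner);
    owner = id;
    depth++; // same owner may re-acquire (reentrancy)
  };
  this.unlock = function (id) {
    if (owner !== id) throw new Error("not the owner");
    if (--depth === 0) owner = null; // fully released
  };
}
```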

JSGI will then explicitly require that all concurrent request handling
be performed by workers. JSGI will provide a shared object (initially
empty) to those workers. Other than the shared object, worker
environments should follow the standard isolation (different globals and
module instances).

Also, it might be worth adding a module that would create a new global
and a new loader for modules (creating new singletons in the global).
Given the concurrency module (if it had mutexes as well as
wait/notify-style synchronization), one could probably build the worker
and event-queue modules entirely in JavaScript, and native bindings
would only be required for the concurrency and global creator modules.

An RPC module would also be useful, but I wanted to start low-level.

Anyway, I hope this proposal strikes a good balance of defaulting and
encouraging users towards event-loop concurrency, while still allowing
for shared-state semantics when necessary.

Kris

Wes Garland

unread,
Sep 25, 2009, 10:40:56 AM9/25/09
to comm...@googlegroups.com
Kris:

Thanks for taking the time to draw what I see as an excellent road map. 

The only suggestion in your message which I am not in full agreement with is the discussion of JSGI mechanics -- and I am not in disagreement there either: I have not done the requisite thinking to form an opinion.


> one could probably build the worker and event-queue
> modules entirely in JavaScript, and native bindings would only be
> required for concurrency and the global creator modules.

That is in fact my plan.

Adding a concurrency module for locks would give me the tools to implement either of these easily in pure JS (plus some other classes I have already), and I could back them with either threaded or setTimeout-style events.

Note that it is not strictly necessary to have native bindings for mutexes: IIRC both Dekker and Lamport's Bakery algorithms have been implemented successfully in JS with code "out there" somewhere.
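For illustration, Lamport's bakery algorithm sketched in plain JavaScript. Its correctness on a real engine rests on sequentially consistent reads and writes of the shared arrays, which is an assumption here; the spin loops only make sense when other threads can actually run:

```javascript
// Lamport's bakery algorithm for N cooperating "threads", identified by
// an integer id in [0, N). Shared state: choosing[] and number[].
var N = 2;
var choosing = new Array(N);
var number = new Array(N);
for (var i = 0; i < N; i++) { choosing[i] = false; number[i] = 0; }

function lock(id) {
  // take a ticket one higher than any outstanding ticket
  choosing[id] = true;
  var max = 0;
  for (var i = 0; i < N; i++) if (number[i] > max) max = number[i];
  number[id] = max + 1;
  choosing[id] = false;
  // wait until everyone with a lower (ticket, id) pair has finished
  for (var j = 0; j < N; j++) {
    while (choosing[j]) { /* spin: j is mid-ticket-taking */ }
    while (number[j] !== 0 &&
           (number[j] < number[id] ||
            (number[j] === number[id] && j < id))) { /* spin */ }
  }
  // critical section may now be entered
}

function unlock(id) {
  number[id] = 0; // hand back the ticket
}
```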

Donny Viszneki

unread,
Sep 25, 2009, 7:02:17 PM9/25/09
to comm...@googlegroups.com
On Fri, Sep 25, 2009 at 8:39 AM, Kris Zyp <kri...@gmail.com> wrote:
> It is impossible to use workers to build something that looks
> like a synchronously accessible object, or a synchronous mutex. It might
> be more accurate to say that mutexes are one of the primitives that can
> be used to build workers (for synchronizing the message queue).

I see things differently.

Suppose we have two synchronized message queues, and two threads of
execution. Each thread holds the write-end of one queue, and the
read-end of the other queue. Let's define these things now with a
little pseudo-code:

CODE
var MessagesFromA = new MessageQueue;
var MessagesFromB = new MessageQueue;
var EventSource = whatever;
function ThreadA() {
    /* Thread A will process events from EventSource */
    for (var event in EventSource) {
        /* Inform Thread B of a hit */
        MessagesFromA.write("hit");
        /* Wait for the number of hits */
        var hits = MessagesFromB.read();
        /* presumably you'd then do something with "hits" */
        event.respond("You have hit me " + hits + " times.");
    }
}
function ThreadB() {
    var counter = 0;
    /* Thread B will count */
    for (var message in MessagesFromA) {
        /* Record hits */
        if (message == "hit")
            counter++;
        /* Report hit count */
        MessagesFromB.write(counter);
    }
}
/CODE

Does this clarify things? Is there some limitation of this paradigm I
don't appreciate?

--
http://codebad.com/

Neville Burnell

unread,
Sep 25, 2009, 7:10:30 PM9/25/09
to CommonJS
For those following this discussion, Lamport's Bakery algorithm in
JavaScript:

http://www.aitk.info/Bakery/Bakery.html

Kris Zyp

unread,
Sep 26, 2009, 8:28:57 AM9/26/09
to comm...@googlegroups.com

I understand now; sometimes JavaScript is a lot better for communication
than English :). What you are doing is getting messages synchronously
(your read function blocks), and you can compose and encapsulate with
that. However, HTML5 workers (and E and any other inspirations, I
believe) use asynchronous messaging and do not provide direct access to
the event queue (or a way to programmatically block for the next event).

I think this is further evidence that we should be exposing the
low-level concurrency constructs (mutexes, threads, blocking thread-safe
queues, or other stuff to build them), and then building workers on top
of these capabilities, rather than just a monolithic worker API. In your
example, it would certainly be more efficient to use a mutex rather than
emulating one in such a way that forces a new thread for every mutex
(expensive). My intent with the concurrency plan I proposed was to
provide these low-level constructs as well as workers, so developers
could use what suited their situation.

Kris


Mark Miller

unread,
Sep 26, 2009, 10:34:48 AM9/26/09
to comm...@googlegroups.com
On Sat, Sep 26, 2009 at 5:28 AM, Kris Zyp <kri...@gmail.com> wrote:
> I understand now, sometimes JavaScript a lot better for communication
> than English :). What you are doing is synchronous getting messages
> (your read function blocks), and you can compose and encapsulate with
> that. However, HTML5 workers (and E and any other inspirations, I
> believe) use asynchronous messaging and do not provide direct access to
> the event queue (and programmatically blocking for the next event).
>
> I think this is further evidence that we should be exposing that the
> low-level concurrency constructs (mutexes, threads, blocking thread-safe
> queues, or other stuff to build them), and then building workers on top
> of these capabilities, rather than just a monolithic worker API. In your
> example, it would certainly be more efficient to use a mutex rather than
> emulating one in such a way that forces a new thread for every mutex
> (expensive). My intent with the concurrency plan I proposed was to
> provide these low-levels constructs as well as workers, so developers
> could use what suited their situation.


This argument is only evidence if one accepts that emulating mutexes
across vats is desirable. There is a growing literature explaining
just how awful this concurrency paradigm is.
(<http://www.erights.org/talks/promises/paper/tgc05.pdf> is my own
contribution, adapted into Part 3 of
<http://erights.org/talks/thesis/>, which is more complete but also
longer.)

Not only has JavaScript not yet made the threading mistake (which Java
will probably never recover from), it is already heading in the right
direction. Asynchronously communicating event loops is a great
concurrency model, is already the model that JavaScript is investing
in, and is the only one possible for JavaScript in the browser.

--
Text by me above is hereby placed in the public domain

Cheers,
--MarkM

Message has been deleted

sleepnova

unread,
Sep 26, 2009, 11:56:08 AM9/26/09
to CommonJS
Just to show some prior art.

Concurrent JavaScript - http://osteele.com/sources/javascript/concurrent
Erlang style concurrency in Java -
http://codemonkeyism.com/want-erlang-concurrency-but-are-stuck-with-java-4-alternatives
Concurrent Programming in Clojure - http://clojure.org/concurrent_programming
Erlang-style concurrency with JavaScript 1.7 - http://www.beatniksoftware.com/blog/?p=80

Sleepnova

ihab...@gmail.com

unread,
Sep 26, 2009, 12:56:30 PM9/26/09
to comm...@googlegroups.com, kpr...@mac.com
So ...

On Wed, Sep 23, 2009 at 9:33 PM, <ihab...@gmail.com> wrote:
> I've fired off a thread to e-l...@eros-os.org ...

For those of us not on e-lang, here is Kevin Reid's solution:

http://www.eros-os.org/pipermail/e-lang/2009-September/013274.html

Interestingly, the main cool trick he used is the "pass by
construction" ability which simplifies the passing of live objects
(including far references *back* to the originating vat - note how the
object can refer back to 'incrementAndGetHits', as a far reference,
*after* it has migrated).

None of this is fundamentally unachievable in CommonJS (given 'eval'
and a data channel), but it requires a rich inter-vat RPC library. As
I mentioned, Kevin Reid has built a capability RPC library for Caja,
here:

http://code.google.com/p/caja-captp/

and, as I mentioned, it's not clear to me yet how much of this can be
built on top of pure JS without Caja.

Tom Robinson

unread,
Sep 26, 2009, 1:00:40 PM9/26/09
to comm...@googlegroups.com

On Sep 26, 2009, at 5:28 AM, Kris Zyp wrote:

> I understand now, sometimes JavaScript a lot better for communication
> than English :). What you are doing is synchronous getting messages
> (your read function blocks), and you can compose and encapsulate with
> that. However, HTML5 workers (and E and any other inspirations, I
> believe) use asynchronous messaging and do not provide direct access
> to
> the event queue (and programmatically blocking for the next event).
>
> I think this is further evidence that we should be exposing that the
> low-level concurrency constructs (mutexes, threads, blocking thread-
> safe
> queues, or other stuff to build them), and then building workers on
> top
> of these capabilities, rather than just a monolithic worker API. In
> your
> example, it would certainly be more efficient to use a mutex rather
> than
> emulating one in such a way that forces a new thread for every mutex
> (expensive). My intent with the concurrency plan I proposed was to
> provide these low-levels constructs as well as workers, so developers
> could use what suited their situation.
>
> Kris

FWIW, V8 does not support any sort of multithreading (apparently even
with completely independent contexts within the same process)

http://groups.google.com/group/v8-users/browse_thread/thread/5b30d0a7907f22aa
http://groups.google.com/group/v8-users/browse_thread/thread/451dd45e48f83c1c

In fact, if you fire up Chrome and run a web worker demo you'll see
each worker show up as a separate process.


-tom

sleepnova

unread,
Sep 26, 2009, 1:26:12 PM9/26/09
to CommonJS
It seems function chaining is a good way to explicitly specify the
dependencies between tasks.
In some sense, it's all about dependencies.
An extreme design of explicit dependency management leads to an
event-driven architecture.

On Sep 22, 11:16 pm, Kris Zyp <kris...@gmail.com> wrote:
> Ryan Dahl wrote:
>
> > Does seem rather convoluted, doesn't it? I would still rather see
> > something like:
>
> >   server.addListener("request", function (request, response) {
> >       var resultPromise = doAsyncDatabase("select * from table where id = 3");
> >       resultPromise.addCallback(function (resultSet) {
> >            if (resultSet.length == 0) {
> >                response.sendHeader(404, {});
> >            } else {
> >                response.sendHeader(200, {});
> >                response.sendBody(JSON.stringify(resultSet));
> >            }
> >            response.finish();
> >       });
> >   });
>
> Another thing we could consider is allowing promises to be returned
> directly from the JSGI app, rather than just from the forEach call as I
> demonstrated. Returning a promise from a JSGI app would look like:
>
> app = function databaseApp(env){
>    return doAsyncDatabase("select * from table where id = 3").
>         then(function(resultSet){
>             if(resultSet.length == 0){
>                return {status: 404, headers:{}, body:""};
>             }
>             else{
>                return {
>                 status: 200,
>                 headers:{},
>                 body:JSON.stringify(resultSet[0])
>                };
>             }
>          });
>
> };
>
> That reads a lot nicer, but it would put extra burden on servers and
> middleware to have to look for promises at two different places (you
> still need promises from forEach in order to facilitate streamed responses).
> Kris

Kris Zyp

unread,
Sep 26, 2009, 11:55:44 PM9/26/09
to comm...@googlegroups.com

It's worth a lot! The fact that running V8 in multiple threads will put
the VM in a state of "likely to crash fairly soon" means that it is
simply impossible to write shared-state multithreaded code across all
platforms. It would seem that the only type of concurrency that CommonJS
can standardize support for initiating is shared-nothing (and of course
we would use event loops). If JSGI is to be something that can truly be
implemented on all engines, it would seem it must be specified as having
shared-nothing concurrency (sans intentional work-arounds by users, like
going directly to the JVM, as one should still be able to use JVM
threads to implement this).

If threading is platform specific, then concurrency constructs should be
platform specific (they certainly already exist in Rhino), and not a
part of CommonJS. Providing shared data through workers (as I had
originally suggested) should also not be available. However, providing
real access to event queues (which are used to build workers) through an
event-queue module could still be available. In particular, it may be
helpful to allow for construction of multiple event-queues (within a
single thread/process/vat) and permit associating event-queues with
receiving messages for specific worker ports. One could then
programmatically block on waiting for an event (as one typically is
doing in a normal event-loop while idle), in order to achieve
synchronous execution, like Donny pointed out. I believe this is more
deadlock prone, but that might be a small price to pay for being able to
provide sync messaging when necessary (worker-style async message would
still be the default, encouraged mechanism).
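The per-port queue idea might look like this in outline: each port owns its own queue, and a receive() call waits on just that port's messages. "Port", "deliver", and "receive" are hypothetical names from this discussion, not a standardized API, and in plain single-threaded JS nothing can truly block, so receive here only returns an already-delivered message or null:

```javascript
// Sketch: an event queue associated with a specific worker port, so a
// caller can synchronously wait for messages on that port alone. A real
// implementation would block in receive() until a message arrives, which
// is exactly where the deadlock risk mentioned above comes in.
function Port() {
  var queue = [];
  this.deliver = function (message) { queue.push(message); };
  this.receive = function () {
    // real implementation: block here while the queue is empty
    return queue.length ? queue.shift() : null;
  };
}
```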

Kris

ihab...@gmail.com

unread,
Sep 27, 2009, 12:11:39 AM9/27/09
to comm...@googlegroups.com
On Sat, Sep 26, 2009 at 8:55 PM, Kris Zyp <kri...@gmail.com> wrote:
> ... In particular, it may be

> helpful to allow for construction of multiple event-queues (within a
> single thread/process/vat) and permit associating event-queues with
> receiving messages for specific worker ports.

Am I correct that what you are proposing here is having each "vat"
export multiple message endpoints, each one independently addressable
by the outside world?

> One could then
> programmatically block on waiting for an event (as one typically is
> doing in a normal event-loop while idle), in order to achieve
> synchronous execution, like Donny pointed out.

In other words, within a "vat", computation may be blocked on the
event queues of one or more of these message endpoints, while
computation proceeds in response to events received at other message
endpoints?

If so, then this can be achieved simply building an inter-"vat" RPC
layer where the target of a message is the tuple (vatid, oid). "oid"
is the address of the target object.

Neville Burnell

unread,
Sep 27, 2009, 1:36:58 AM9/27/09
to CommonJS
and if this vat messaging / rpc was via a socket, then vats might be
on distinct engines / machines?

Donny Viszneki

unread,
Sep 27, 2009, 2:50:32 AM9/27/09
to comm...@googlegroups.com
On Sat, Sep 26, 2009 at 11:55 PM, Kris Zyp <kri...@gmail.com> wrote:
> Its worth a lot! The fact that running V8 in multiple threads will put
> the VM in a state of "likely to crash fairly soon", means that is simply
> impossible to write shared state multithreaded code across all
> platforms.

This may not be true. Taking care to respect that "multithreaded"
doesn't need to mean "using kernel threads," there may be reasonable
ways to get V8 to do something called "multithreading."

Also, who knows what will be in V8's future. Do not rule out the
possibility that what CommonJS decides may well influence the
direction of V8, if the group can actually put together some good
ideas.

> If threading is platform specific, than concurrency constructs should be
> platform specific (they certainly already exist in Rhino), and not a
> part of CommonJS.

It certainly raises some important questions. Perhaps we should be
evaluating whether or not it would be prudent to define different
types of compliance. (A "level zero compliance" or some such thing has
already been proposed. "Levels" of compliance are not *so* different
from independent optional specifications.) It may turn out, though,
that it's best to only focus on a single model of concurrency. But I
would *not* say this is the case merely because V8 isn't yet up to the
task.

> originally suggested) should also not be available. However, providing
> real access to event queues (which are used to build workers) through an
> event-queue module could still be available. In particular, it may be
> helpful to allow for construction of multiple event-queues (within a
> single thread/process/vat) and permit associating event-queues with
> receiving messages for specific worker ports. One could then
> programmatically block on waiting for an event (as one typically is
> doing in a normal event-loop while idle), in order to achieve
> synchronous execution, like Donny pointed out.

I disagree. We don't really need formal access to event queues (as
opposed to the popular JavaScript idiom of providing event handler
functions), and if you really insist on needing them, you can implement
them using generators. (I was writing a long email in which one of the
things I address is this confusion: explicitly waiting on an event in a
blocking function, via a formal reference to an event queue, is
semantically identical to what the JavaScript world does now with event
handlers.)

Put another way, imagine that at the top of every Javascript program
is a bit of Javascript code that you just can't see, that goes like
this:

function thisFunctionHostsPrograms() {
    var yourProgram = new Program("your-javascript-source-code");
    var EventQueue = whatever;
    for (var event in EventQueue)
        if (yourProgram.handlesEvent(event.type))
            yourProgram.handle(event); // maybe this is a setInterval()
                                       // interval event, or an onClick
                                       // event being fired
}

The reason I recommend against providing formal access to an event
queue object is because there are really very few use cases (I
believe, maybe I need to spend another day thinking) and you can
always use generators.

--
http://codebad.com/

Kris Zyp

unread,
Sep 27, 2009, 9:36:54 AM9/27/09
to comm...@googlegroups.com

Neville Burnell wrote:
> and if this vat messaging / rpc was via a socket, then vats might be
> on distinct engines / machines?
>
Yes, I believe it would be relatively straightforward for the processes
to span machines. And I think I see where you are going with that. The
potential for developing distributed parallelized applications with
little or no code change is very compelling.
Kris

Kris Zyp

unread,
Sep 27, 2009, 9:36:58 AM9/27/09
to comm...@googlegroups.com

Donny Viszneki wrote:
> On Sat, Sep 26, 2009 at 11:55 PM, Kris Zyp <kri...@gmail.com> wrote:
>
>> Its worth a lot! The fact that running V8 in multiple threads will put
>> the VM in a state of "likely to crash fairly soon", means that is simply
>> impossible to write shared state multithreaded code across all
>> platforms.
>>
>
> This may not be true. Taking care to respect that "multithreaded"
> doesn't need to mean "using kernel threads," there may be reasonable
> ways to get V8 to do something called "multithreading."
>
> Also, who knows what will be in V8's future. Do not rule out the
> possibility that what CommonJS decides may well influence the
> direction of V8, if the group can actually put together some good
> ideas.
>

From what I understood from the mailing list threads Tom referenced, V8
supports global lock-and-switch (like context switching in preemptive
threading), but that by no means provides the real ability of threads to
run on multiple cores simultaneously. I believe they stated that there
are a thousand undocumented places that rely on the single-thread (or
global lock) assumption. It sounds like adding multithreading support
would be a massive project. I know that in Rhino, the necessary
mechanisms for dealing with multiple threads are pervasive throughout
the codebase. I think this was a ground-up architectural decision in
Rhino, and it would be a non-trivial addition to V8.

>
>> If threading is platform specific, than concurrency constructs should be
>> platform specific (they certainly already exist in Rhino), and not a
>> part of CommonJS.
>>
>
> It certainly raises some important questions. Perhaps we should be
> evaluating whether or not it would be prudent to define different
> types of compliance. (A "level zero compliance" or some such thing has
> already been proposed. "Levels" of compliance are not *so* different
> from independent optional specifications.) It may turn out, though,
> that it's best to only focus on a single model of concurrency. But I
> would *not* say this is the case merely because V8 isn't yet up to the
> task.
>

Perhaps it would be worthwhile for us to have some common agreement on
what a concurrency API looks like for those machines that support it,
while noting that it isn't a part of formal CommonJS (both due to the
lack of platform support, and the worker-style concurrency assumption of
other modules).


>
>> originally suggested) should also not be available. However, providing
>> real access to event queues (which are used to build workers) through an
>> event-queue module could still be available. In particular, it may be
>> helpful to allow for construction of multiple event-queues (within a
>> single thread/process/vat) and permit associating event-queues with
>> receiving messages for specific worker ports. One could then
>> programmatically block on waiting for an event (as one typically is
>> doing in a normal event-loop while idle), in order to achieve
>> synchronous execution, like Donny pointed out.
>>
>
> I disagree. We don't really need formal access to event queues (as
> opposed to the popular Javascript idiom of providing event handler
> functions) and if you are really anal about needing them, you can
> implement it using generators. (I was writing a long email in which
> one of the things I address is the confusion that explicitly waiting
> on an event in a blocking function on a formal reference to an event
> queue is semantically identical to what the Javascript world does now
> with event handlers.
>

Generators are only in Mozilla engines; they are not in V8 or JSCore.
Generators hardly solve all our problems either. Their design makes them
very awkward to use, in particular for returning values (vs. yielding).


> Put another way, imagine that at the top of every Javascript program
> is a bit of Javascript code that you just can't see, that goes like
> this:
>
> function thisFunctionHostsPrograms() {
> var yourProgram = new Program("your-javascript-source-code");
> var EventQueue = whatever.
> for(var event in EventQueue)
> if (yourProgram.handlesEvent(event.type))
> yourProgram.handle(event); // maybe this is a
> setInterval() interval event, or an onClick event being fired
> }
>
>

Yup, thats your basic event-loop.


> The reason I recommend against providing formal access to an event
> queue object is because there are really very few use cases (I
> believe, maybe I need to spend another day thinking) and you can
> always use generators.
>

In the example you had provided earlier, I thought you were attempting
to demonstrate how to achieve synchronous processing by connecting
directly to the message/event queues. Isn't that the use case?

Kris

Kris Zyp

unread,
Sep 27, 2009, 9:40:42 AM9/27/09
to comm...@googlegroups.com

No, my point was that the whole vat would block while waiting for a
particular message. It might be nice to provide a synchronous function
for certain actions that are known to be fast, rather than forcing async
on callers every time inter-vat communication is needed.

Donny Viszneki

unread,
Sep 27, 2009, 10:22:24 AM9/27/09
to comm...@googlegroups.com
On Sun, Sep 27, 2009 at 9:36 AM, Kris Zyp <kri...@gmail.com> wrote:
> Donny Viszneki wrote:
>>> If threading is platform specific, than concurrency constructs should be
>>> platform specific (they certainly already exist in Rhino), and not a
>>> part of CommonJS.
>>
>> It certainly raises some important questions. Perhaps we should be
>> evaluating whether or not it would be prudent to define different
>> types of compliance. (A "level zero compliance" or some such thing has
>> already been proposed. "Levels" of compliance are not *so* different
>> from independent optional specifications.) It may turn out, though,
>> that it's best to only focus on a single model of concurrency. But I
>> would *not* say this is the case merely because V8 isn't yet up to the
>> task.
>>
> Perhaps it would be worthwhile for us to have some common agreement on
> what a concurrency API looks like for those machines that support it,
> while noting that it isn't a part of formal CommonJS (both due to the
> lack of platform support, and the worker-style concurrency assumption of
> other modules).

I like my idea better.

> Generators are only in Mozilla engines, they are not in V8 or JSCore.
> Generators hardly solve all our problems either. Their design makes them
> very awkward to use, in particular, for returning values (vs yielding).

As I have posted on the list previously, the generator pattern is
little more than switch()ed closures with non-local jumps (exception
handling) which every Javascript platform that matters has.

http://groups.google.com/group/commonjs/msg/7a60c2251168a5b6
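The pattern referenced there can be sketched like this: a closure with a state switch stands in for `yield`, and an exception serves as the non-local jump that signals exhaustion. This is a hand-rolled illustration, not Mozilla's generator API:

```javascript
// Emulating `function* () { yield "a"; yield "b"; }` with a switch()ed
// closure: the state variable records which "yield" to resume from, and
// an exception plays the role of StopIteration when the generator is done.
function makeGen() {
  var state = 0;
  return function next() {
    switch (state) {
      case 0: state = 1; return "a"; // first yield
      case 1: state = 2; return "b"; // second yield
      default: throw new Error("StopIteration"); // exhausted
    }
  };
}
```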

>> I disagree. We don't really need formal access to event queues (as
>> opposed to the popular Javascript idiom of providing event handler
>> functions) and if you are really anal about needing them, you can
>> implement it using generators. (I was writing a long email in which
>> one of the things I address is the confusion that explicitly waiting
>> on an event in a blocking function on a formal reference to an event
>> queue is semantically identical to what the Javascript world does now
>> with event handlers.
>>

>> Put another way, imagine that at the top of every Javascript program
>> is a bit of Javascript code that you just can't see, that goes like
>> this:
>>
>> function thisFunctionHostsPrograms() {
>>     var yourProgram = new Program("your-javascript-source-code");
>>     var EventQueue = whatever.
>>     for(var event in EventQueue)
>>         if (yourProgram.handlesEvent(event.type))
>>             yourProgram.handle(event); // maybe this is a
>> setInterval() interval event, or an onClick event being fired
>> }
>
> Yup, thats your basic event-loop.
>
>> The reason I recommend against providing formal access to an event
>> queue object is because there are really very few use cases (I
>> believe, maybe I need to spend another day thinking) and you can
>> always use generators.
>>
> In the example you had provided earlier, I thought you were attempting
> to demonstrate how to achieve synchronous processing by connecting
> directly to the message/event queues. Isn't that the use case?

No. It is true that I employed a formal reference to an event queue,
but it was only for demonstration purposes. Also as I said in my
previous post, the way in which I used that event queue reference is
semantically identical to the currently popular event dispatch
paradigm used throughout the Javascript world, where the event
dispatcher is hidden from the Javascript programmer, who merely
provides the handlers/callbacks for those events. Please note again
that you can also build the formal reference style upon the hidden
dispatch style, or so I claim you can. My GMail account still has my
draft for posting a follow-up that really addresses that better.

--
http://codebad.com/

Wes Garland

Sep 27, 2009, 10:46:58 AM
to comm...@googlegroups.com
Kris:

On Sun, Sep 27, 2009 at 9:36 AM, Kris Zyp <kri...@gmail.com> wrote:
From what I understood from the mailing list threads Tom referenced, V8
supports global lock and switch (like context switching in preemptive
threading), but that by no means provides the real ability of threads to
run on multiple cores simultaneously.

I'm not sure we require multi-core "threads" in order to have a useful concurrency API.

I also don't think we need to define that CommonJS implementations must do "real" threads -- in fact, I know that would limit the adoption scope, big-time. It might be an idea to have a CommonJS-MT set of documents, though.

FWIW, GPSEE is (theoretically) MT-capable, in every reasonable sense of the word.  It is the only non-Rhino-based CommonJS platform I'm aware of that has this goal.  And if you're doing a concurrency module, I'll probably eventually copy the API if it's roughly what you proposed earlier.

Now, as for where else a concurrency API might be useful -- consider the case where JS is preempted to run other JS, like setTimeout() in the browser.  lock() in the "primordial" code-hunk and tryLock() in the "event" code-hunk would both be very useful.

And THAT is supported by all engines AFAIK -- preemptive JS events with a promiscuously shared global object graph.
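The lock()/tryLock() pairing Wes describes might look like the following. The Lock shape is assumed from Kris's earlier proposal (`require("concurrency").Lock`); since no such module exists yet, a trivial single-threaded stand-in is inlined so the control flow can be shown — a real implementation would block on native primitives:

```javascript
// Single-threaded stand-in for the proposed Lock constructor (illustrative
// only; a threaded host's lock() would actually suspend the caller).
function Lock() {
  this.held = false;
}
Lock.prototype.lock = function () {
  // A threaded host would block here until the lock is free.
  this.held = true;
};
Lock.prototype.tryLock = function () {
  if (this.held) return false; // already taken: report failure, don't block
  this.held = true;
  return true;
};
Lock.prototype.unlock = function () {
  this.held = false;
};

var stateLock = new Lock();

// "Primordial" code-hunk: take the lock around shared state.
stateLock.lock();

// "Event" code-hunk (e.g. a setTimeout callback): check rather than block.
if (stateLock.tryLock()) {
  // safe to touch shared state here
  stateLock.unlock();
} else {
  // shared state is busy; skip or re-enqueue this event
}

stateLock.unlock();
```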

Pre-emptive threading might also be graft-on-able to V8 with techniques like GNU Pth, although I personally would just bite the bullet and go to spidermonkey if I was in that person's shoes.

By-the-way -- multi-machine message passing for worker threads is also pretty interesting to me.  It also strikes me that CommonJS might specify the inter-machine protocol, although I am fearful that the specification approach that CommonJS has taken thus far is very unlikely to produce a good wire-level protocol.

Wes

Kris Zyp

Sep 27, 2009, 11:30:07 AM
to comm...@googlegroups.com

setTimeout events don't preempt, they simply add an event into the event
queue like other events, which will be executed in turn in the
event-loop. There are certain points in the browser environment where
events will be pulled and executed from the queue before another has
finished (sync XHR in FF3, and alert in some situations), but these are
still cooperative switches, not preemptive.
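Kris's point is easy to demonstrate in any event-loop host: a zero-delay timer cannot interrupt code that is already running.

```javascript
// setTimeout does not preempt: the callback just joins the event queue
// and runs only after the currently executing code returns to the loop.
var log = [];
setTimeout(function () { log.push("timer"); }, 0);

var start = Date.now();
while (Date.now() - start < 100) {
  // busy-wait: the zero-delay timer is already due, but cannot fire
}
log.push("loop");

console.log(log); // logs ["loop"] -- the timer callback has not run yet
```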

It really sounds like V8's execution is simply not stable with
concurrent execution of JavaScript within a process.


>
> Pre-emptive threading might also be graft-on-able to V8 with
> techniques like GNU Pth, although I personally would just bite the
> bullet and go to spidermonkey if I was in that person's shoes.
>
> By-the-way -- multi-machine message passing for worker threads is also
> pretty interesting to me. It also strikes me that CommonJS might
> specify the inter-machine protocol, although I am fearful that the
> specification approach that CommonJS has taken thus far is very
> unlikely to produce a good wire-level protocol.

I hope CommonJS doesn't attempt to specify an inter-machines protocol,
there are plenty of good protocols out there, we don't want to reinvent
that wheel, and we certainly wouldn't want to be limited to only
communicating with other CommonJS machines (the machines could be
running any language).

Kris

ihab...@gmail.com

Sep 27, 2009, 11:39:37 AM
to comm...@googlegroups.com
On Sun, Sep 27, 2009 at 6:40 AM, Kris Zyp <kri...@gmail.com> wrote:
> No, my point was the whole vat would block while waiting for a particular
> message. It might be nice to provide a synchronous function for certain
> actions that are known to be fast, rather than forcing async on callers
> every time inter-vat communication is needed.

Hm. Good point, but I honestly don't know....

So with async-everything, the risk is that casual programmers will be
put off by the extra work and will reason incorrectly about the state
of their "vat" under multiple interleaved events.

With sync-sometimes, these same programmers will probably program
themselves into deadlocks.

ihab...@gmail.com

Sep 27, 2009, 11:42:38 AM
to comm...@googlegroups.com
On Sun, Sep 27, 2009 at 8:30 AM, Kris Zyp <kri...@gmail.com> wrote:
> I hope CommonJS doesn't attempt to specify an inter-machines protocol,
> there are plenty of good protocols out there, we don't want to reinvent
> that wheel, and we certainly wouldn't want to be limited to only
> communicating with other CommonJS machines (the machines could be
> running any language).

We are *so* far away from worrying about this at the moment, so I
agree with you. However, this is a bit of a hobby horse of mine, so
here goes: It would be really great to have a true, easy to use
distributed object protocol for JS (sort of like what Kevin Reid
showed us with E, where he simply schlepped an object from one "vat"
to another). That protocol would of necessity be JS specific. And that
would be ok.

Mark S. Miller

Sep 27, 2009, 11:45:08 AM
to comm...@googlegroups.com, Discussion of E and other capability languages
[+e-lang]


On Sun, Sep 27, 2009 at 8:30 AM, Kris Zyp <kri...@gmail.com> wrote:
I hope CommonJS doesn't attempt to specify an inter-machines protocol,
there are plenty of good protocols out there, we don't want to reinvent
that wheel, and we certainly wouldn't want to be limited to only
communicating with other CommonJS machines (the machines could be
running any language).

Agreed. Both Tyler's web_send protocol <http://waterken.sourceforge.net/web_send/> and E's CapTP protocol <http://code.google.com/p/caja-captp/> are good language neutral distributed capability protocols with implementations in several languages supporting communicating event-loops concurrency. Which one is better is a complex question involving many tradeoffs. Their APIs in JavaScript are quite similar. This community should look at both and understand the considerations each is optimizing for.

I'm cc'ing e-lang because both protocols are often discussed in that forum.

--
   Cheers,
   --MarkM

Mark S. Miller

Sep 27, 2009, 11:50:57 AM
to comm...@googlegroups.com, Discussion of E and other capability languages
[+e-lang]


Although Caja-CapTP supports idiomatic Caja, the protocol is not Caja or JS specific. I had thought at one point that it was but Kevin corrected me. Some improvements were made over prior CapTP implementations regarding choice of identifiers. Once CapTP for E-on-Common Lisp is upgraded with the same improvements, then, transport-layer-issues aside, IIUC, they should be able to interoperate. Kevin?

--
   Cheers,
   --MarkM

Wes Garland

Sep 27, 2009, 12:23:33 PM
to comm...@googlegroups.com
On Sun, Sep 27, 2009 at 11:30 AM, Kris Zyp <kri...@gmail.com> wrote:
setTimeout events don't preempt, they simply add an event into the event
queue like other events, which will be executed in turn in the
event-loop. There are certain points in the browser environment where
events will be pulled and executed from the queue before another has
finished (sync XHR in FF3, and alert in some situations), but these are
still cooperative switches, not preemptive.

You're right, of course, although frankly it feels like pre-emptive switches from the JS user's POV. AFAIK, there is no way to express "please don't interrupt me now", although maybe just executing JS code is enough for that?

(will the browser interrupt for(;;) when setTimeout triggers?)
 
It really sounds like V8's execution is simply not stable with
concurrent execution of JavaScript within a process.

I have to agree with you there.
 
I hope CommonJS doesn't attempt to specify an inter-machines protocol,
there are plenty of good protocols out there, we don't want to reinvent
that wheel, and we certainly wouldn't want to be limited to only
communicating with other CommonJS machines (the machines could be
running any language).


I don't really see CommonJS workers executing anything other than JavaScript code.

On the other hand, I think it would be neat to be able to mix-and-match workers of different CommonJS engines.

Wes
 

Donny Viszneki

Sep 27, 2009, 1:33:17 PM
to comm...@googlegroups.com
On Sun, Sep 27, 2009 at 12:23 PM, Wes Garland <w...@page.ca> wrote:
> On Sun, Sep 27, 2009 at 11:30 AM, Kris Zyp <kri...@gmail.com> wrote:
>> setTimeout events don't preempt, they simply add an event into the event
>
> You're right, of course, although frankly it feels like pre-emptive switches
> from the JS user's POV. AFAIK, there is no way to express "please don't
> interrupt me now", although maybe just executing JS code is enough for that?
>
> (will the browser interrupt for(;;) when setTimeout triggers?)

No. As mentioned prior, the Javascript world handles events with an
implicit loop at the very bottom of the stack that, notionally, waits
for events in some sort of event queue, and dispatches them to your
handlers. Realize, too, that even the "main" entry point (although in
the browser there may be many such "main" events, and many SCRIPT
elements) is responding to such an event.

So far in the Javascript world, the role of "preempting" happens, if
at all, in non-Javascript land. Semantically it is indistinguishable
whether there is any actual preempting/calling-back happening from the
context of the world above Javascript, or whether it is polling for
updates. It is likely that every browser does a mix of both, then
pushes it into a queue, and runs handlers for anything in the queue
any time the queue is not empty.

--
http://codebad.com/

Kris Zyp

Sep 27, 2009, 3:47:42 PM
to comm...@googlegroups.com, Discussion of E and other capability languages

Mark S. Miller wrote:
> [+e-lang]
>
> On Sun, Sep 27, 2009 at 8:30 AM, Kris Zyp <kri...@gmail.com
> <mailto:kri...@gmail.com>> wrote:
>
> I hope CommonJS doesn't attempt to specify an inter-machines protocol,
> there are plenty of good protocols out there, we don't want to
> reinvent
> that wheel, and we certainly wouldn't want to be limited to only
> communicating with other CommonJS machines (the machines could be
> running any language).
>
>
> Agreed. Both Tyler's web_send protocol
> <http://waterken.sourceforge.net/web_send/> and E's CapTP protocol
> <http://code.google.com/p/caja-captp/> are good language neutral
> distributed capability protocols with implementations in several
> languages supporting communicating event-loops concurrency. Which one
> is better is a complex question involving many tradeoffs. Their APIs
> in JavaScript are quite similar. This community should look at both
> and understand the considerations each is optimizing for.
>

And FWIW, Persevere uses and will continue to use HTTP (with a
preferred resource representation of application/javascript) as its
primary inter-machine protocol (perhaps others like the ones you
mentioned could be supported as well). JSON referencing is used for
links/URI referencing, and JSON-RPC when necessary for tighter-coupled
invocations. HTTP seems to be doing pretty decently in terms of adoption,
scalability, and language neutrality :). It has worked really nicely
with JS in Persevere as well. Clearly, there are a number of options out
there that people can and will use, so as we have said, CommonJS
shouldn't dictate how we communicate with others.
Kris

Mark S. Miller

unread,
Sep 27, 2009, 4:01:11 PM9/27/09
to comm...@googlegroups.com, Discussion of E and other capability languages
Both web_send and caja-captp are built to use JSON (as a request encoding) on HTTPS (as a transport). web_send actually uses HTTPS according to HTTP/REST principles, with URLs as remote object references. caja-captp on HTTPS will use HTTPS only as a transport.
--
   Cheers,
   --MarkM

Kris Zyp

unread,
Sep 27, 2009, 5:14:13 PM9/27/09
to comm...@googlegroups.com
I believe there are a number of reasons for providing an API for
accessing the event queue. (Also, I realized that the motivations for
accessing the event queue really don't require multiple event queues per
vat/process, just one). Anyway, some reasons:

* Error management. An important part of the event loop is that each
event execution in the loop is effectively wrapped in a try/catch (you
don't want an error preventing the next event from executing). In the
browser the catch handler does various things (log to the console in FF,
alert or status bar update in IE, etc.). Trying to decide on a universal
catch mechanism in CommonJS doesn't seem desirable. I would think it
would be far better to allow users to explicitly decide how errors will
be managed. In fact the event loop function is great place for error
management, since it is top-level default error handler for all uncaught
errors. There may also be other cleanup or maintenance operations that a
program may wish to execute between events.

* This is a basic construct that can be used to build the higher level
worker constructor/module (the worker module would still require native
code for spawning a process or thread, and creating the JS global scope).

* There needs to be a way for events to be enqueued on the event queue.
In the browser, all events are enqueued through native code (albeit
these operations can be triggered programmatically through setTimeout,
fireEvent, etc). In CommonJS, it seems it would be desirable for events
to be enqueuable through JavaScript rather than having to go to native
code for all enqueuing. Having an enqueue function could also
potentially allow for enqueuing events with different priority levels as
well. (In fact, I believe the basic stubs for an event queue already
exist in Narwhal in the "reactor" module (not sure I understand the
name), albeit there is no real implementation behind it yet.)

* Additional event queue functions can be provided for constructing
workers that monitor other queues for things like availability. Having
an "isEmpty" function would be important for doing worker pools with
efficient event delegation (which would be critical for scalable JSGI
servers).

* Some programs don't even need or want an event loop. A program that
simply copies a file and exits would never need an event loop, so there is
no need to force one on it.

* Perhaps most importantly, synchronous processing can be achieved by
programmatically processing the event-queue until an appropriate event
is received. One of the great pains of asynchronous operations is that
an action can't be encapsulated for use by synchronous callers. For
example, if another module relies on an "incrementAndGetHits" function,
expecting a numerical return value, we can't change the implementation
to use asynchronous messaging without forcing the caller to handle a
promise or callback. If the caller is another module that can't be
changed, we may simply be out of luck (and I have seen this happen often
in the browser code). Generators can provide sugar, but they don't solve
this problem either. This need to change the interface when the
implementation changes is a basic violation of encapsulation. If code
could process events from the event queue, it could effectively wait
for a response from another worker and create an encapsulated
synchronous function that preserves the interface.

Perhaps an API could look like this.
var queue = require("event-queue"); // get the event queue for this
vat/process.
queue.nextEvent(); // returns (or executes?) the next event in the
queue, blocking until one is available if it is empty.
queue.enqueue(event, priority); // add the event (a function) to the queue
queue.isEmpty(); // returns whether or not the queue is empty
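Put together, such a module would let a program own its top-level loop, including the per-event try/catch described under error management above. A sketch, with a toy in-memory queue standing in for the native implementation (the priority argument is ignored here for brevity):

```javascript
// Toy stand-in for the proposed require("event-queue") module, so the
// loop below is runnable; a real host would back this with native code
// and block in nextEvent() when the queue is empty.
var queue = (function () {
  var events = [];
  return {
    enqueue: function (event, priority) { events.push(event); },
    isEmpty: function () { return events.length === 0; },
    nextEvent: function () { return events.shift(); }
  };
})();

var handled = [];

// An explicit, user-owned event loop: each event runs inside try/catch,
// so one failing event cannot take down the loop, and the program itself
// decides how uncaught errors are reported.
function runLoop() {
  while (!queue.isEmpty()) {
    var event = queue.nextEvent();
    try {
      event();
    } catch (e) {
      handled.push("error: " + e.message); // user-chosen top-level handler
    }
  }
}

queue.enqueue(function () { handled.push("ok"); });
queue.enqueue(function () { throw new Error("boom"); });
queue.enqueue(function () { handled.push("still running"); });
runLoop();

console.log(handled); // ["ok", "error: boom", "still running"]
```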

Kris

Ash Berlin

Sep 27, 2009, 5:28:43 PM
to comm...@googlegroups.com

On 27 Sep 2009, at 22:14, Kris Zyp wrote:

>
>
> Perhaps an API could look like this.
> var queue = require("event-queue"); // get the event queue for this
> vat/process.
> queue.nextEvent(); // returns (or executes?) the next event in the
> queue, blocking until one is available if it is empty.
> queue.enqueue(event, priority); // add the event (a function) to the
> queue
> queue.isEmpty(); // returns whether or not the queue is empty
>

I have no problems with this API on first glance, but I do think it is
getting ahead of ourselves a little bit: we haven't yet got an agreed
upon proposal for file I/O. Can we put this as a future effort and
revisit it (hopefully) soon once we have more of the lower level
things ratified?

-ash

Neville Burnell

Sep 27, 2009, 6:17:15 PM
to CommonJS

> Yes, I believe it would be relatively straightforward for the processes
> to span machines. And I think see where you are going with that. The
> potential for developing distributed parallelized applications with
> little or no code change is very compelling.

Yes, I was thinking we could have V8 vats, which are very fast, and
Rhino vats, which provide access to the large Java library base, in the
same app.

And it would be very easy to add vats dynamically to a 'vat-net'
simply by registering them with a distributed 'vat director'

Neville Burnell

Sep 27, 2009, 6:19:08 PM
to CommonJS
And this is also reminiscent of Erlang nodes!

Ryan Dahl

Sep 28, 2009, 5:20:01 AM
to comm...@googlegroups.com
Optimizing the event loop at a low level is important to performance.
Specifying such an interface as you suggest would force us to
implement most of the event loop in javascript.

Browser javascript is inherently event loop based but does not need
such constructs - server side code does not either. Something like
Node's API should be adopted:
http://tinyclouds.org/node/api.html#_events
This allows implementors to employ optimal code for juggling I/O like
libev (http://libev.schmorp.de/bench.html) instead of forcing
implementations to do this at the javascript level.

On Sun, Sep 27, 2009 at 11:14 PM, Kris Zyp <kri...@gmail.com> wrote:
>
> I believe there are a number of reasons for providing an API for
> accessing the event queue. (Also, I realized that the motivations for
> accessing the event queue really don't require multiple event queues per
> vat/process, just one). Anyway, some reasons:
>
> * Error management. An important part of the event loop is that each
> event execution in the loop is effectively wrapped in a try/catch (you
> don't want an error preventing the next event from executing). In the
> browser the catch handler does various things (log to the console in FF,
> alert or status bar update in IE, etc.). Trying to decide on a universal
> catch mechanism in CommonJS doesn't seem desirable. I would think it
> would be far better to allow users to explicitly decide how errors will
> be managed. In fact the event loop function is great place for error
> management, since it is top-level default error handler for all uncaught
> errors. There may also be other cleanup or maintenance operations that a
> program may wish to execute between events.

Uncaught exceptions can be handled by some sort of a global exception event:
process.addListener("exception", ...)
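In released versions of Node this event ended up being named "uncaughtException" rather than "exception"; a sketch of Ryan's suggestion using that later name:

```javascript
// The program, not the platform, decides what uncaught errors mean:
process.on("uncaughtException", function (err) {
  console.error("uncaught:", err.message);
});

setTimeout(function () {
  throw new Error("boom"); // reaches the listener; the loop keeps running
}, 0);
```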

> * This is a basic construct that can be used to build the higher level
> worker constructor/module (the worker module would still require native
> code for spawning a process or thread, and creating the JS global scope).

Although one might want to implement an event loop in javascript,
CommonJS should not specify this as a standard. The standard should
start above the worker-level.

> * Some programs don't even need or want an event loop. A program that
> simply copies a file and exits would never need an event-loop, no need
> to force one on it.

If someone copies a file and exits, then they simply drop out of the
event loop after one rotation - this is not a special case. The event
loop should exit naturally when there are no more pending events.

Ryan Dahl

Sep 28, 2009, 5:22:47 AM
to comm...@googlegroups.com
On Sun, Sep 27, 2009 at 11:28 PM, Ash Berlin
<ash_flu...@firemirror.com> wrote:
>
> I have no problems with this API on first glance, but I do think it is
> getting ahead of ourselves a little bit: we haven't yet got an agreed
> upon proposal for file I/O. Can we put this as a future effort and
> revisit it (hopefully) soon once we have more of the lower level
> things ratified?

Rather the opposite. Defining other I/O before thinking about
concurrency is getting ahead of ourselves.

Wes Garland

Sep 28, 2009, 9:09:33 AM
to comm...@googlegroups.com
> If someone copies a file and exits, then they simply drop out of the
> event loop after one rotation - this is not a special case. The event
> loop should exit naturally when there are no more pending events.

Are you proposing here that all CommonJS programs become event driven?

Kris Zyp

Sep 28, 2009, 10:09:40 AM
to comm...@googlegroups.com

Ryan Dahl wrote:
> Optimizing the event loop at a low level is important to performance.
> Specifying such an interface as you suggests would force us to
> implement most of the event loop in javascript.
>
> Browser javascript is inherently event loop based but does not need
> such constructs - server side code does not either. Something like
> Node's API should be adopted:
> http://tinyclouds.org/node/api.html#_events
> This allows implementors to employ optimal code for juggling I/O like
> libev (http://libev.schmorp.de/bench.html) instead of forcing
> implementations to do this at the javascript level.
>

I am curious how explicitly asking for the next event in a loop breaks
optimizations, it seems like all the optimizations can occur in
nextEvent() (getting the next event and executing it) which could all be
native code. However, your API provides a promise.wait() function that
effectively satisfies the goal of synchronously processing async actions
that I wanted to see (complete with disclosure of the effects of queue
stacking). I am not sure how executing waiting events with
promise.wait() is more doable than an explicit call to nextEvent(), but
that would work for me.
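The encapsulation Kris wants could be sketched like this; SimplePromise and the pending array are illustrative stand-ins for a real event queue and the wait() early Node exposed, not any actual API:

```javascript
var pending = []; // toy stand-in for the host's event queue

function SimplePromise() {
  this.done = false;
  this.value = undefined;
}
SimplePromise.prototype.emitSuccess = function (value) {
  this.done = true;
  this.value = value;
};
// wait() pumps queued events until this promise resolves, so a
// synchronous caller can get a plain return value from an async action.
SimplePromise.prototype.wait = function () {
  while (!this.done && pending.length > 0) {
    pending.shift()(); // run the next queued event
  }
  return this.value;
};

var hits = 0;
function incrementAndGetHitsAsync() { // the async implementation detail
  var promise = new SimplePromise();
  pending.push(function () { promise.emitSuccess(++hits); });
  return promise;
}

// The encapsulated synchronous interface existing callers rely on:
function incrementAndGetHits() {
  return incrementAndGetHitsAsync().wait();
}

incrementAndGetHits(); // 1
incrementAndGetHits(); // 2 -- callers never see a promise or callback
```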

Also, would your API provide a means for creating efficient worker pools
so that a web server could effectively delegate incoming requests to
workers that are ready to process them? I had suggested that an isEmpty
function would be useful for delegating events to workers.

I still feel like explicit control of the event loop would be a more
logical way of composing all that functionality that you provide, but
you have addressed most of my requirements, and I wouldn't be opposed to
using this as a basis for our concurrency model, this is good stuff. I
think there are some spelling differences that would need to be
reconciled where we have already made API proposals (like promises), but
the fact that you have a working implementation is compelling, and
certainly gives your opinion a lot of weight.

Kris

Ryan Dahl

Sep 28, 2009, 4:35:01 PM
to comm...@googlegroups.com
On Mon, Sep 28, 2009 at 3:09 PM, Wes Garland <w...@page.ca> wrote:
>> If someone copies a file and exits, then they simply drop out of the
>> event loop after one rotation - this is not a special case. The event
>> loop should exit naturally when there are no more pending events.
>
> Are you proposing here that all CommonJS programs become event driven?

Yes. It's rather implicit in most of javascript's APIs already. Just
take something like setTimeout(). With an event loop setTimeout is a
very natural construction; with threads it's a nightmare. In the new
HTML5 specifications it is very explicit:
http://www.whatwg.org/specs/web-apps/current-work/multipage/origin-0.html#event-loops

Wes Garland

Sep 29, 2009, 12:03:10 PM
to comm...@googlegroups.com
> > Are you proposing here that all CommonJS programs become event driven?

> Yes. It's rather implicit in most of javascript's APIs already.

I think I could make a reasonable argument that this does not hold true for the majority of JavaScript code written for non-browser environments.

Urban Hafner

Sep 29, 2009, 12:16:02 PM
to comm...@googlegroups.com
Wes Garland wrote:
> > > Are you proposing here that all CommonJS programs become event driven?
>
> > Yes. It's rather implicit in most of javascript's APIs already.
>
> I think I could make a reasonable argument that that argument does not
> hold true for the majority of JavaScript code written for non-browser
> environments.

And one could counter that argument by asking how much of this
JavaScript is really out there. We don't know of course, but I bet it's
a tiny fraction of all the JavaScript out there.

Urban

Kris Zyp

Sep 29, 2009, 12:46:52 PM
to comm...@googlegroups.com

Wes Garland wrote:
> > > Are you proposing here that all CommonJS programs become event driven?
>
> > Yes. It's rather implicit in most of javascript's APIs already.
>
> I think I could make a reasonable argument that that argument does not
> hold true for the majority of JavaScript code written for non-browser
> environments.

It is probably useful to think about how event loops fit into the evolution of
a CommonJS platform. Event loops are proposed for situations when
asynchronous actions are triggered. Maybe this wasn't clear, but if your
platform does not yet have any support for asynchronous actions, you
don't need an event loop. It is pointless if nothing is going to happen
asynchronously (nothing would ever go in the event queue). As I pointed
out on IRC, if you are doing HTTP request processing, you are probably
doing loop of listening for request, request processing, and repeating.
This is implicitly an event loop, where the only type of event is an
incoming request.

The major premise of the event loop concurrency is that if/when you do
add asynchronous capabilities, that you maintain the single-thread per
vat assumption. JavaScript code traditionally hasn't been written to
handle concurrent access to its data, and CommonJS will continue to
allow developers to preserve this assumption. Thus, if you create a
setTimeout function for your platform (or async XHR may be a more common
use case), the callback must not be executed in a separate thread
concurrently with other code in the same scope, or CommonJS modules
should no longer be expected to behave deterministically/correctly.

The discussion about if and how to explicitly process events to block
until a desired event is finished also applies only if you actually have
an event loop, of course. I also still feel it would be helpful if starting
the event loop were explicit, so you don't use one unless you need to.
Kris

Wes Garland

Sep 29, 2009, 1:14:44 PM
to comm...@googlegroups.com
> And one could counter that argument, by asking how much of this
> JavaScript is really out there. We don't know of course, but I bet it's
> a tiny fraction of all the JavaScript out there.

Neither argument is really compelling, though.

The real question is -- should this group consider forcing facilities on non-browser environments due to limitations of the browser environment?

I really don't think it should; I don't think I'm speaking out of school by saying that Kris doesn't either.

But, from my POV, "everything should be async because everything in the browser is" is a pretty weak argument.

Note that I am not arguing against async; I am arguing against a global async paradigm.

Writing a file copy function as an event-driven application, to my mind, borders on the ridiculous.

Wes Garland

Sep 29, 2009, 1:19:48 PM
to comm...@googlegroups.com
> It is probably useful to think how event-loops fit into the evolution of
> a CommonJS platform. Event loops are proposed for situations when
> asynchronous actions are triggered. Maybe this wasn't clear, but if your
> platform does not yet have any support for asynchronous actions, you
> don't need an event loop

You know, I think it's probably worth discussing: when and how are event loops constructed, and how do they halt?

Also, how do modules know they are in an async environment? And will they ever need to?

ihab...@gmail.com

Sep 29, 2009, 1:29:15 PM
to comm...@googlegroups.com
On Tue, Sep 29, 2009 at 10:14 AM, Wes Garland <w...@page.ca> wrote:
> The real question is -- should this group consider forcing facilities on
> non-browser environments due to limitations of the browser environment?

Quite apart from the browser vs. non-browser, I think the issue is how
"large" your "vat" is.

"Large" vat -> many services running in the same event queue. If one
object chooses to invoke a sync API and block, all the *other*
services in this "vat" will be stuck waiting while this one little
object gets its reply. These services will rightfully ask that object,
"can't you just async so we can get work done while you wait?".

=> A pure form of this is a set of APIs that do not allow *any*
objects to do any sync operations whatsoever. This is the case of the
browser assuming developers stay away from sync XHR due to its
shortcomings. This is also (afaik) the E model.

"Small" vat -> essentially one service running in one event queue.
Objects may want the option to do both sync *and* async operations. If
an object invokes a sync API, it is probably because the author of
that object has knowledge about the service and realizes it will not
be able to serve up anything meaningful anyway until the sync API
completes.

=> A pure form of this is a set of APIs that *always* block; if you
want async behavior, you create multiple "vats" and have them talk to
one another. This is Erlang.
