Let's say you're communicating with a GUI over the network: that's
cool, sometimes it's necessary. I recommend writing a wrapper that
abstracts all that asynchronous business into a set of commands that
can be queued up and run, since it's only the last callback you're
really interested in.
> --
> You received this message because you are subscribed to the Google Groups "nodejs" group.
> To post to this group, send email to nod...@googlegroups.com.
> To unsubscribe from this group, send email to nodejs+un...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/nodejs?hl=en.
>
>
browser
  .chain
  .session()
  .open('/')
  .assertTitle('Something')
  .and(login('foo', 'bar'))
  .assertTitle('Foobar')
  .and(login('someone', 'else'))
  .assertTitle('Someone else')
  .end(function(err){
    if (err) throw err;
  });
The language includes named function declarations. Use them. That
alone would make your example way less terrible.
--i
Cut out the "var" and the "=". But yeah, "function" is long and
klunky for how common it is. I'd love a shorter keyword for it.
Here's a very brief untested example of what I'm talking about:
https://gist.github.com/775591
I usually only care about success or failure, so I don't use any kind
of "passalong" util in my programs. But it's not hard to see how this
general technique could be applied in other ways.
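Here's a hedged sketch of that shape: the nested get_val example, but with each callback given a name. `get_val` is a hypothetical stand-in that invokes its callback synchronously, and the names (`onFoo`, `onBar`, `onBaz`) are illustrative, not from any real library.

```javascript
// Hedged sketch: named callbacks plus a success-or-failure check at
// the top of each one. get_val is a synchronous stand-in so the
// control flow is easy to trace.
var vals = { foo: 1, bar: 2, baz: 4 };

function get_val(key, cb) {
  if (key in vals) cb(null, vals[key]);
  else cb(new Error("no such key: " + key));
}

var result = null;

function onError(er) { result = "there was an error"; }

function onFoo(er, foo) {
  if (er) return onError(er);
  get_val("bar", function onBar(er, bar) {
    if (er) return onError(er);
    get_val("baz", function onBaz(er, baz) {
      if (er) return onError(er);
      result = foo + bar + baz;
    });
  });
}

get_val("foo", onFoo);
```

The nesting is still there, but every frame in a stack trace now has a name, and the only-care-about-success-or-failure check stays a one-liner at the top of each callback.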
--i
IMO this is a great example, especially for newcomers. It makes the
strange-looking yet common function nesting look more familiar, then
progresses to show the awesomeness of JS in a simple, clear and
understandable way.
Sometimes I think those who are intimately familiar with the nested
callbacks/anonymous-function style may not realize the amount of angst
and confusion it can cause for the uninitiated.
- shaun
get_val("foo", function(error, foo){
  if (error) print("there was an error");
  else get_val("bar", function(error, bar){
    if (error) print("there was an error");
    else get_val("baz", function(error, baz){
      if (error) print("there was an error");
      else print(foo + bar + baz);
    });
  });
});
Sync:
try {
  print(get_val("foo") + get_val("bar") + get_val("baz"));
} catch(error) {
  print("there was an error");
}
The sync version is cleaner: you can see at a glance what it does. It
prints something -- the sum of 3 values returned by get_val. In case
of an error, you get "there was an error". You get the luxury of
handling the error in a single place.
The async looks like the low-level version of the sync one. If we
could disassemble a JavaScript function, it would look pretty much
like the async version (in terms of program flow). What happens is
that get_val("foo") is executed first, then get_val("bar"), then
get_val("baz"), then their sum is computed and printed -- that's
exactly what both versions do. But which one do you prefer to write?
I don't want to hurt anyone's feelings, but to me at least, Node is
unusable for serious work because of this. I'm prepared to lobby
against do-everything-asynchronously for a long time. :-)
Cheers,
-Mihai
--
Mihai Bazon,
http://mihai.bazon.net/blog
print(a,b,c)
There are many of those. Have fun. (Not being facetious.
Programmers write better programs while enjoying the process, so if
that means using Rhino or v8cgi or Ruby, more power to you.)
This conversation is a bit silly. I mean, conceptually, asynchronous
code *is* a bit more complicated.
For many problems, threads do not scale effectively. But for many
problems, they're perfectly adequate, and if you're using an
implementation that's well understood, you may even be able to work
around the Hard Problems that they introduce.
Node is cool and hot and all the tech blorgs in the blagodome are
weblooging about how awesome and fast and super insanely stupid sexy
it is. But that doesn't mean it's best for every problem, or that
JavaScript is everyone's favorite language, or that you have to love
callbacks, or that you should use it.
When you're writing network programs, your program is a proxy the vast
majority of the time. It is a glue that wires together a few
different things that *by their very nature* are most efficiently
handled with asynchronous evented message-passing. XMPP, HTTP, AMQP,
child processes, etc. Those problems are well served by an event
loop.
You can run an event loop in C. You know what? After juggling
structs and function pointers for a bit, all this "verbose" JavaScript
starts to look pretty nice.
> for many "normal" CRUD type apps where blocking isn't that much an issue...
Are you joking? That's *exactly* the sort of app where blocking very
quickly becomes a *huge* issue.
Or do you mean the kind of app where only one user is ever going to be
creating, reading, updating, and deleting data at once? Because
that's what blocking IO means.
All websites are fast with one user on localhost.
> On 01/12/2011 06:54 PM, Preston Guillory wrote:
>> Sam has a point. Asynchronous code is much larger and has a lot of
>> syntactic noise. Look at his example: synchronous Java does in 2 lines what
>> takes 7 lines of asynch code
Except that the 7 lines of async code *can handle other events while
waiting*, without the overhead of coroutines or threads.
And, with the Java version, there's a *ton* of stuff going on below
the surface. Actually doing real work in a higher-level-than-C
language with an event loop is kind of a newish thing. Granted, those
of us who cut our teeth on web browsers have been doing it forever,
which is part of why JS is so right for this task. But the tooling is
not super mature, and node is a lowlevel library by design.
>> Named intermediate functions
>> don't help -- see how Isaac's example stretched to 14 lines.
They reduce indentation, label blocks, and improve stack traces. They
won't improve your golf game.
>> But for a lot of common programming tasks, Node.js
>> is verbose, for sure.
The advantages are that you can do a bunch of things in parallel, and
handle "local" stuff in the same pattern as "remote" stuff.
I write most of my shell scripts in Bash.
On Wed, Jan 12, 2011 at 23:39, Sam McCall <sam.m...@gmail.com> wrote:
> Plate(mainWindow).menu('File').openMenu().item('Open').click().end(function(err)
> {
> if(err) throw err;
> });
>
> Where all the library functions are still written in callback-as-last-
> parameter style.
See? I *told you* you could do better than my example ;D
If you can come up with a terser way to express callbacks, that
doesn't lose anything in translation, then by all means, implement it.
I think that this problem, such as it is, will be solved by either
writing a new language no one uses, or just sucking it up and using
the JavaScript we have instead of the JavaScript we wish we had.
--i
Really? Seems to me like it's working. Qv: this thread.
> What is needed is a special operator to
> distinguish async function definitions and calls from sync ones. If we
> had such an operator in the language, we could happily use Javascript
> to mix sync and async calls.
Sure. That's a fair point. So... go do that.
But it won't be JavaScript. It'll be some other thing. If you want
to change JavaScript-the-language, you're posting on the wrong mailing
list.
If you wanna design a new language that compiles to node-friendly JS,
you can clearly do that. Some people will love it, others will hate
it, sky's still blue, pope's still catholic, etc.
> It will not
> put the full power of continuations in the hands of programmers
You may find that programmers are more power-hungry than you can
possibly imagine. ;)
On Thu, Jan 13, 2011 at 02:35, Sam McCall <sam.m...@gmail.com> wrote:
> First, there's a difference between syntax and idiom.
Fair point.
Node's idiom is the simplest idiom possible to *enable* more
creative things in userland. It's the base. Think of the
continuation-callback style as the bytecode that your flow control
library interfaces with.
When you're interfacing with the low-level node libraries a lot,
callbacks are actually a very nice abstraction that gives you a lot of
control over what's going on. If your program is going to deal with
streams more often than single fs operations, then perhaps it makes
more sense to structure your program differently. No one's trying to
say that callbacks are best for everything.
In npm, I *have* to do a lot of lowlevel file system operations.
That's what npm is for. So, I need some abstraction around it so that
I can do a list of N things sequentially or in parallel, and compose
multiple callback-taking actors into one.
Since I have those abstractions, it makes sense to kind of shove the
stream-ish operations (writing or uploading a file, for instance) into
a function that takes a callback, so that I can say "Write this file,
create a symlink, make this directory, upload this thing, and then
signal success or failure" and wrap that up as one "operation" to be
composed in another area of my program.
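A minimal sketch of that kind of abstraction (the names `chain` and `steps` are mine, not npm's actual API): run a list of callback-taking operations in sequence, stopping at the first error, so that a composed group can itself be handed around as one callback-taking operation.

```javascript
// Hedged sketch: sequence a list of callback-taking steps, stopping
// at the first error. Not npm's real internals; names are invented.
function chain(steps, done) {
  var i = 0;
  (function next(er) {
    if (er) return done(er);
    if (i >= steps.length) return done(null);
    steps[i++](next);
  })();
}

// Each "operation" takes only a callback, so a whole chain() can be
// wrapped up and composed elsewhere as a single operation.
var log = [];
chain([
  function (cb) { log.push("write file"); cb(null); },
  function (cb) { log.push("create symlink"); cb(null); },
  function (cb) { log.push("upload"); cb(null); }
], function (er) {
  log.push(er ? "failure" : "success");
});
```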
The thing about closures and continuations is that, while they look
obnoxious in these little examples, once embraced fully, they allow
for a huge degree of DRYness and consistency.
> And I think it's entirely appropriate to try to find the most
> expressive idioms, given the syntax we have. In public, both because
> otherwise we might give up (see first post) and because having idioms
> in common mean we can read each others' code.
Rewind the NodeJS clock about 10 months, to when everyone was debating
how to do Promises in the most powerful and correct way.
Shooting that horse was the only way to get everyone to stop beating
it. Now we have 3 promise libraries that are all better than the one
that was in node-core.
It is important for a platform to be very careful about what it lets
in, lest it prevent innovation by the community. Ryan's extreme
aesthetic in that regard is a big part of what has made node
successful.
> if people are aware of the friction points then maybe
> there'll be a proposal or two when ES6 comes around.
I don't disagree. But I've got programs to write in the thousand
years or so before ES6 will ever be a reality.
--i
Yep.
> I really wish we had continuations in node, this stuff would be cake.
The actor/callback pattern basically is CPS, though of course it's
much uglier in JS than in Scheme. There's no TCO, of course, but the
event loop's stack destruction has a similar effect to TCO.
The thing is that, like in Scheme, JS's continuations trap their state
in a closure, rather than a stack-saving Continuation object that
implements coroutines. It's very similar in practice to the Promise
pattern, except done in FP style rather than OOP.
--i
but it requires yield (generators) in the JS implementation.
Havoc
Node is the present of JavaScript.
--i
On Fri, Jan 14, 2011 at 15:52, Mikeal Rogers <mikeal...@gmail.com> wrote:
> you should tell that the es-discuss list :)
>
== Java: ==
mainWindow.menu("File").openMenu().item("Open").click();
Window dialog = mainWindow.getChild(type(Window.class));
Coroutines are possible with V8. That is, you can longjmp into another
stack without disrupting the VM - there are several implementations on
the internet. Node is not going to do this because multiple call
stacks dramatically complicate the mental model of how the system
works. Furthermore it's a hard deviation from JavaScript - browser
programmers have never had to consider multiple execution stacks - it
would confuse them to subtly yet significantly change the language.
Adding multiple stacks is not a seamless abstraction. Users must now
worry that at any function call they may be yielded out and end up in
an entirely different state. Locking is now required in many
situations. Knowing if a function is reentrant or not is required.
Documentation about the "coroutine safety" of methods is required.
Node offers a powerful guarantee: when you call a function the state
of the program will only be modified by that function and its
descendants.
On Sun, Jan 16, 2011 at 4:07 AM, Ryan Dahl <r...@tinyclouds.org> wrote:
> Adding multiple stacks is not a seamless abstraction. Users must now
> worry that at any function call they may be yielded out and end up in
> an entirely different state. Locking is now required in many
> situations. Knowing if a function is reentrant or not is required.
I sympathize with what you're saying and with node's design. There are
real tradeoffs in simplicity and also in overhead. My first reaction
to this was also, "it's better to be explicit and type in your
callbacks."
That said:
- I'm not sure all the points you mentioned about coroutines apply.
JavaScript generators are more controlled.
- I'd love to try something built on node or like node that goes
slightly differently, even if node itself remains with the tradeoffs
it has. I think the way node works is good, but it's not the only
useful way.
It sort of depends on what kind of developer and what kind of task.
- Raw-callback code is truly rocky in many instances. Chains of
callbacks where async result C depends on async result B depends on
async result A, are the worst. And trying to ensure that the outermost
callback always gets called, exactly once, with only one of either an
error or a valid result. It's not writable or readable code.
Some details that I think matter:
1. Generators here are just an implementation detail of promises. They
don't "leak" upward ... i.e. I don't think a caller would ever care
that somewhere below it in the call stack, someone used "yield" - it
should not be distinguishable from someone creating a promise in
another way.
See fromGenerator() here, which is just a promise factory:
http://git.gnome.org/browse/gjs/tree/modules/promise.js#n317
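The shape of fromGenerator() can be sketched with modern function* syntax (which postdates this thread; SpiderMonkey's legacy generators differed in detail). A runner drives the generator, feeding each yielded async result back in through next() or throw(). All names here are illustrative.

```javascript
// Hedged sketch of a generator-driven async runner. Here a yielded
// value is a callback-taking operation; a real promise library would
// yield promise objects instead.
function runGenerator(genFn, done) {
  var it = genFn();
  (function step(er, value) {
    var r;
    try { r = er ? it.throw(er) : it.next(value); }
    catch (e) { return done(e); }             // generator threw: fail
    if (r.done) return done(null, r.value);   // generator returned: succeed
    r.value(step);                            // wire the next result back in
  })();
}

// A toy async operation that calls back synchronously.
function asyncAdd(a, b) {
  return function (cb) { cb(null, a + b); };
}

var result;
runGenerator(function* () {
  var x = yield asyncAdd(1, 2);  // reads sequentially, resumes on callback
  var y = yield asyncAdd(x, 4);
  return y;
}, function (er, value) { result = value; });
```

The caller just sees a callback fire with the final value; the yields never leak upward.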
you mentioned:
> Node offers a powerful guarantee: when you call a function the state
> of the program will only be modified by that function and its
> descendants.
fwiw, I don't think generators and yield modify this guarantee. From
"inside" the generator during a yield, yes program state can be
changed by the caller; but from "outside" the generator, the generator
is just a function. Anytime program state can change "under you,"
you're going to be seeing the yield keyword, and you're going to be
implementing a generator yourself.
2. Promises in this code are just callback-holders. That is, if you
have an API that takes a callback(result, error) then you can
trivially wrap it to return a promise:
function asyncThingWithPromise() {
  var p = new Promise();
  asyncThingWithCallback(function(result, error) {
    if (error) p.putError(error);
    else p.putReturn(result);
  });
  return p;
}
And vice versa:
function asyncThingWithCallback(cb) {
  asyncThingWithPromise().get(cb);
}
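For concreteness, here is a minimal sketch of the callback-holder Promise those two snippets assume (modeled loosely on the gjs promise module's get/putReturn/putError shape, not on any standard API):

```javascript
// Hedged sketch of a callback-holder promise: get() registers a
// callback(result, error); putReturn()/putError() settle it at most once.
function Promise() {
  this._cbs = [];
  this._done = false;
}
Promise.prototype.get = function (cb) {
  if (this._done) cb(this._result, this._error);
  else this._cbs.push(cb);
};
Promise.prototype._put = function (result, error) {
  if (this._done) return;                  // settle at most once
  this._done = true;
  this._result = result;
  this._error = error;
  this._cbs.forEach(function (cb) { cb(result, error); });
};
Promise.prototype.putReturn = function (result) { this._put(result, null); };
Promise.prototype.putError = function (error) { this._put(null, error); };

// usage:
var p = new Promise();
var got;
p.get(function (result, error) { got = result; });
p.putReturn(5);
```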
3. The way I'd see APIs and libraries in this world would be
promise-based, i.e. an async routine returns a promise. Given
asyncThing() that returns a promise, you can set up a callback:
asyncThing().get(function(result, error) { });
Or if you're a generator, you can kick the promise up to your caller,
save your continuation, and either resume or throw when the result
arrives:
result = yield asyncThing();
"yield" only goes up one frame though, it just makes an iterator from
the continuation and returns it, it doesn't unwind the stack or allow
arbitrary code to run. The caller (or some caller of that caller)
still has to set up a callback on the promise. The callback would
resume the continuation.
4. I could be wrong but I don't think JS generators in SpiderMonkey
get longjmp() involved. The yield keyword saves a continuation, but
the generator function then returns "normally" returning an iterator
object. The calling function has to take that iterator object (which
is implemented with a saved continuation) and if it chooses, it can
further unwind the stack by itself yielding or returning, or it
doesn't have to. Anyway there's not really much magic. Continuations
are an implementation detail of iterators.
Another way to put it, is that at each re-entrancy point you're going
to see either the "yield" keyword (push the promise to the caller, by
returning a promise-generating iterator) or you're going to see a
callback (add a callback to the promise).
It isn't like exceptions or threads where you can be interrupted at
any time. Interruptions only happen when you unwind all the way back
to the event loop which then invokes a callback.
5. The flip side of this is that "yield" isn't magic; people have to
understand how it relates to promises, what a generator function is,
etc. Pasting yield-using code from one function to another might not
work if the destination function isn't a generator.
6. I piled several potentially separable but kinda-related things into
the blog post I linked to. One is the "yield" convention. Others
include:
- have "actors" which have their own isolated JS global object, so no
state shared with other actors.
- share the event loop among actors such that callbacks from
different actors can run concurrently but those from the same actor
can't
(i.e. from perspective of each distinct JS global object, there's
only one thread, from perspective of the overall runtime, there are
multiple - but not thread-per-request! the thread pool runs libev
callbacks)
(actors think they each have their own event loop, but it's
implemented with a shared one for efficiency)
- have a framework where each http request gets its own actor
(therefore global object) ... (something that's only efficient with
SpiderMonkey in their most bleeding edge stuff, and I have no idea
about any other JS implementation. and for sure it would always have
_some_ overhead. I haven't measured it.)
- I'd expect state shared among http requests to be handled as in
other actor frameworks, for example you might start an actor which
holds your shared state, and then http request handlers could use
async message passing to ask to set/get the state, just as if they
were talking to another process.
7. In addition to using all cores without shared state or locking, an
actors approach could really help with the re-entrancy dangers of both
callbacks AND a "yield promise" syntax. The reason is that each actor
is isolated, so you don't care that the event loop is running handlers
in *other* actors. The only dangerous re-entrancy (unexpected callback
mixing) comes from callbacks in the actor you're writing right now.
One way to think about actors is that each one works like its own
little server process, even though implementation-wise they are much
lighter-weight.
8. So the idea is Akka-style functionality, and ALSO libev, and ALSO
ability to write in JavaScript, and ALSO wire it up nicely to http
with actor-per-request ;-) and a pony of course
Anyhow, I realize there are lots of practical problems with this from
a node project perspective. And that it's harder to understand than
the browser-like model.
At the same time, perhaps it remains simpler than using actors with
Java, Scala, or Erlang and preserves other advantages of node. Once
you're doing complex things, sometimes it can be simpler to have a
little more power in the framework taking care of hard problems on
your behalf.
Just some food for thought is all since it was sort of an in-the-weeds
thread to begin with. I don't intend it as a highly-unrealistic
feature request ;-) don't worry.
I am hoping there are people out there who explore these kind of
directions sometime.
Havoc
On Sun, Jan 16, 2011 at 10:01 PM, Mikeal Rogers <mikeal...@gmail.com> wrote:
> Furthermore, those implementations introduce a new keyword and are breaking
> changes to the language so considering them for use in node is incredibly
> premature.
Sure. Still, in my opinion there's a pretty interesting problem about
nicer coding models, and I think "callback spaghetti" is a real
problem in code that contains chains of async calls, especially in
more complex cases (making a set of parallel calls then collecting
results, or trying to handle errors robustly, are examples of
complexity).
There are various third-party attempts to address this for node as
mentioned earlier in the thread, showing the demand.
It's sort of like raw threads vs. higher-level tools such as
java.util.concurrent or actors. Those higher-level tools are useful.
> There isn't a task you can't accomplish because you don't have generators
> except sugar.
I've also used them pretty heavily (exactly as described) doing
client-side async code with gjs, and I think they work well for this.
The thing you can uniquely do is write sequential code (not
"inverted"/"nested") that waits on async events [1], because
generators allow you to save a continuation. Error handling also
happens more naturally since the "yield" statement can throw.
I agree generators aren't that exciting just as a way to write an
iterator, and that it's a bit odd that continuation-saving appears as
kind of a side effect of an iterator mechanism.
The usage of "yield" here is almost exactly like C#'s new "await"
keyword, I think. The "await" language feature is done purely for this
write-my-callbacks-sequentially purpose instead of mixing in the
iterator stuff. A JS feature like that might be nicer.
Kilim for Java does a very similar thing as well.
This stuff really is just syntactic sugar for typing your callbacks in
a nicer way, it's true. "Make the rest of this function a callback"
rather than typing the callback as a function literal.
However, sequential code is a lot easier for human brains to work with.
I think that's why multiple frameworks/languages/whatever are
inventing this kind of thing.
Havoc
[1] another way to write sequential async code, that's been used some
e.g. in GTK+ in the past, is the "recursive main loop" which is
allowed by GLib. In this case you don't return to the calling function
but just do a mainloop poll() while inside a callback from the
original mainloop poll(), then continue with your original callback.
There are some worrisome side effects and complexities caused by this
though so the yield/await type thing has more appeal imo.
Havoc
> What if I don't control A? What if I don't really trust A in that sense, and only want to "subscribe" to be notified that A
> has finished, and then let *my* code that I control be responsible for executing (and parameter passing) of B?
> And that line of questioning obviously gets much murkier if there's a chain of A, B, C, and D, etc.
>
This question is very important for programming in single-threaded
call back environment when you have various libraries of various
degrees of quality.
The canonical answer is to put an intermediary between you and the bad
libs you call, keyed by a dict of pending-call IDs. You get called
back from the lib, take the ID handed back, and look it up in the
dict: log an error and return if it's not found; delete the ID from
the dict and proceed if it is (using data from the dict values if your
caller is especially bad).
If there is some chance the cb will never be invoked, then you set a
timer in code you control, delete the pending call from your dict
there, and do some timeout handling.
Then you can explicitly handle the suckiness of your API.
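A hedged sketch of that defense, with all names invented for illustration: every call into an untrusted lib gets an ID; replies with unknown or already-used IDs are logged and dropped; a timer catches callbacks that never arrive.

```javascript
// Hedged sketch of the "dict of pending calls" defense for callbacks
// from libraries you don't fully trust.
var pending = {};
var nextId = 0;
var logLines = [];

function handleReply(id, er, result) {
  if (!(id in pending)) {                   // spurious or duplicate reply
    logLines.push("spurious callback: " + id);
    return;
  }
  var entry = pending[id];
  delete pending[id];                       // proceed exactly once
  clearTimeout(entry.timer);
  entry.cb(er, result);
}

function callUntrusted(lib, arg, cb, timeoutMs) {
  var id = String(nextId++);
  pending[id] = {
    cb: cb,
    timer: setTimeout(function () {         // timeout handling
      handleReply(id, new Error("timed out"));
    }, timeoutMs)
  };
  lib(arg, id, handleReply);
}

// An especially bad lib that calls back twice: the second reply is
// logged and dropped instead of corrupting our state.
function sketchyLib(arg, id, reply) {
  reply(id, null, arg * 2);
  reply(id, null, arg * 2);
}

var results = [];
callUntrusted(sketchyLib, 21, function (er, r) { results.push(r); }, 1000);
```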
On a broader note, the stark reality of network calls is that any
given call can have at least six results:
Request received, succeeded, reply received.
Request received, succeeded, reply lost.
Request received, failed, reply received.
Request received, failed, reply lost.
Request fatal to receiver, no reply, endpoint killed (retries do not
fix all problems).
Request lost.
People who think about these outcomes, and see how different this is
from in-stack function calls, will write more robust networking code
that is faster and has fewer weird failures than code written to an
RPC-type model.
The amazing power that the one-thread, async framework gets you is an
ability to reason much more strongly about your code. This reasoning
power saves you more programmer time than the async syntax is costing
you, at least once you have any trouble or surprises.
--Chris
That's exactly what I'm proposing with the @ operator, a very low-level
basic building block *built into JavaScript* for negotiating async
continuations, localized to a single statement. Many of the common use-cases
would be composable by simple use of the @ operator, and the libraries would
then be more for the advanced use-cases and for providing additional
syntactic sugar as desired.
I'm thinking about implementing this idea as an extension to V8 to make it
available to node. Even if the idea never caught on for mainstream
JavaScript engines, that doesn't mean something low-level like this (or
similar) couldn't be useful to the V8 that powers node.
--Kyle
I'm not suggesting Node add new syntax. I'm suggesting that I'm going to
fork and extend *V8*, to support the new syntax, and then I'm going to allow
that new "V8+" to easily run inside Node.
Frankly, I don't really care if "Node" (as in, the core Node team) ever
wants to accept or promote the V8 extension I propose. Whether you ever
officially endorsed the idea or not is irrelevant -- I'm simply going to
make it easy for someone to use an alternate fork of V8 in their own copy of
Node, to have a simpler way of dealing with sync/async. They should be just
as free to choose that fork of Node/V8 as they would to choose one of the
several promise/defer libs being kicked around.
BTW, extending V8 to experiment with new ideas for the language already has
strong precedent from the Mozilla team. They do new stuff all the time, most
of which has never been accepted into the official language. That doesn't
mean it's not useful for projects that choose to use their engine as its
core. The same should be true here.
In any case, that's a real shame if core Node is completely opposed to
looking at ways to improve server-side JavaScript by extending the language.
----------
I've written enough JavaScript to be convinced that the language is indeed
lacking in its ability for developers to cleanly, efficiently and securely
negotiate async and sync code in the same program. It's enough of an
important fundamental characteristic that I think *something* of a solution
belongs natively in the language, even if only a narrow helper.
If you like nested callbacks and you don't care about why this is important
to the rest of us, fine. Go on writing that god-awful looking code. Don't
expect me to use your code though. I'd rather rewrite an entire library from
scratch than use and try to maintain some of what I've seen out in the
Node.js community's projects around these async code patterns.
I didn't think I'd ever say something like this, but I prefer *JAVA* and
*PHP* over some of the terrible async JS callback code I've seen floating
around. Gah! Someone please just shoot me in the knee cap for saying that.
It's more palatable to me to take on the giant task of extending the JS
engine's language syntax than it is to keep writing (and reading) all this
crappy callback-laden code, or having some group of people arbitrarily
decide for me what *they* think the best API for this is.
The @ operator's advantage is that it's a very simple building-block upon
which more complex and syntax-sugary promise/defer libs could be built to
help appease developer preferences. It may or may not prove to be a good
idea, but at least it's a step in *some* direction rather than the constant
circular arguing that never goes anywhere.
Can't we see that it's utterly pointless to argue over natively included
APIs/libs? No single API will ever solve *all* the different
async/defer/promise use-cases properly. If Node core (or JavaScript itself)
ever chose one specific promise/defer API, they'd be alienating/ignoring all
the other developers for whom that API is either distasteful or inadequate.
Just shouldn't happen. And if it *did* happen, it'd be worse than what we
have now. So, get over it.
With respect to (very slim) chances to get native support/acceptance for @,
I think a single operator for statement-level continuations has an
attractive simplicity and smaller surface area to disagree about than any of
the other stuff being considered. Perhaps it's naïve and outlandish of me to
think @ has any chance of making it into JavaScript, but it's got to have a
microscopic sliver of a better chance than any complicated API/lib making it
in.
----------
I'm going to build the prototype for this idea so that it's not just theory
anymore. If nothing else, at least *I* will be able to start writing cleaner
and better async code. I just wish the community would stop arguing over and
over ad nauseam about the same dead horses. At some point, we just have to
take some step toward a solution. Almost any step in any direction would be
better than this dross.
Wash, rinse, repeat. Yawn.
--Kyle
I am not "asking" for a language feature. I'm suggesting one, and announcing
that I'm going to build it so that perhaps others can benefit. I'm
announcing that effort *here* because I think some people in this community
may benefit from a different approach to this problem instead of the same
old arguing that never goes anywhere. This is a node.js list, and I'm
talking about a fork project of node.js.
> Just because you're afraid of the es-discuss or v8 lists don't think that
> this is a productive way to get these features somewhere you can use them.
I'm not afraid of either list. And I don't need your help or blessing to
"get these features somewhere" that I (or others) can use them.
> ...whatever you build... it won't be node... This is the node.js list.
> Let's get this back to being about node.js.
I fail to see how discussing an alternative idea which relies on an
extension to JavaScript doesn't qualify as being related to Node.js, if my
publicly expressed intent is to use that feature inside of Node wrapped
around the extended V8.
--Kyle
Hi
Aside: I'm a Gnome/GTK fan-boy. I eagerly read your book when I was in
high school - it was great.
> On Sun, Jan 16, 2011 at 4:07 AM, Ryan Dahl <r...@tinyclouds.org> wrote:
>> Adding multiple stacks is not a seamless abstraction. Users must now
>> worry that at any function call they may be yielded out and end up in
>> an entirely different state. Locking is now required in many
>> situations. Knowing if a function is reentrant or not is required.
>
> I sympathize with what you're saying and with node's design. There are
> real tradeoffs in simplicity and also in overhead. My first reaction
> to this was also, "it's better to be explicit and type in your
> callbacks."
>
> That said:
>
> - I'm not sure all the points you mentioned about coroutines apply.
> JavaScript generators are more controlled.
>
> - I'd love to try something built on node or like node that goes
> slightly differently, even if node itself remains with the tradeoffs
> it has. I think the way node works is good, but it's not the only
> useful way.
> It sort of depends on what kind of developer and what kind of task.
Yeah - it will be interesting to see how it works out. You should be
careful to note that it is not ECMAScript.
I think your ultimate goals - what you list at the end - are
achievable without modifying the language.
> - Raw-callback code is truly rocky in many instances. Chains of
> callbacks where async result C depends on async result B depends on
> async result A, are the worst. And trying to ensure that the outermost
> callback always gets called, exactly once, with only one of either an
> error or a valid result. It's not writable or readable code.
>
> Some details that I think matter:
>
> 1. Generators here are just an implementation detail of promises. They
> don't "leak" upward ... i.e. I don't think a caller would ever care
> that somewhere below it in the call stack, someone used "yield" - it
> should not be distinguishable from someone creating a promise in
> another way.
>
> See fromGenerator() here, which is just a promise factory:
> http://git.gnome.org/browse/gjs/tree/modules/promise.js#n317
Okay. So a generator will allow you to "block" a computation - to
save the continuation. I grant this is useful and does not leak upward
- but blocking on pure computation is not what people care about. The
callbacks in Node are never from computation - they are from I/O.
You're not addressing the complaints, actually.
People complain about not being able to do this:
var result = database.query("select * from table");
People do not complain about being forced to structure their
interruptible parser as a state machine rather than as recursive
descent - which generators would allow. Blocking computation vs
blocking I/O.
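To make the distinction concrete, here is a rough sketch of the kind of interruptible parsing that generators do enable - suspending pure computation between input chunks, with no state machine and no I/O involved. (This uses modern `function*` syntax, which V8 lacked at the time of this thread; the names `tokenizer` and `drive` are mine.)

```javascript
// Sketch: an interruptible digit-run tokenizer written as a generator
// instead of an explicit state machine. Each yield suspends the pure
// computation; the driver feeds in chunks as they arrive.
function* tokenizer() {
  let current = '';
  const tokens = [];
  while (true) {
    // Suspend until the driver pushes the next chunk (null ends input).
    const chunk = yield tokens.splice(0);
    if (chunk === null) {
      if (current) tokens.push(current);
      return tokens.splice(0);
    }
    for (const ch of chunk) {
      if (ch >= '0' && ch <= '9') {
        current += ch;            // extend the current digit run
      } else if (current) {
        tokens.push(current);     // non-digit ends the run
        current = '';
      }
    }
  }
}

function drive(chunks) {
  const gen = tokenizer();
  gen.next(); // run to the first yield
  const out = [];
  for (const chunk of chunks) out.push(...gen.next(chunk).value);
  out.push(...gen.next(null).value); // flush the final token
  return out;
}
```

Note that nothing here helps with `database.query`: the suspension points are driven by the caller, not by the event loop.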
> you mentioned:
>> Node offers a powerful guarantee: when you call a function the state
>> of the program will only be modified by that function and its
>> descendants.
>
> fwiw, I don't think generators and yield modify this guarantee. From
> "inside" the generator during a yield, yes program state can be
> changed by the caller; but from "outside" the generator, the generator
> is just a function. Anytime program state can change "under you,"
> you're going to be seeing the yield keyword, and you're going to be
> implementing a generator yourself.
Yes, it seems that's correct to me too. But again, I don't think this
addresses the problems people have.
> 2. Promises in this code are just callback-holders. That is, if you
> have an API that takes a callback(result, error) then you can
> trivially wrap it to return a promise:
>
> function asyncThingWithPromise() {
>   var p = new Promise();
>   asyncThingWithCallback(function(result, error) {
>     if (error) p.putError(error);
>     else p.putReturn(result);
>   });
>   return p;
> }
>
> And vice versa:
>
> function asyncThingWithCallback(cb) {
>   asyncThingWithPromise().get(cb);
> }
Promises are purely library sugar. I'm not sure if you're suggesting
Promises require generators - if so - you're wrong. Here is an
implementation that does not require modifying the language:
https://github.com/kriszyp/node-promise
If the promise has a 'wait()' function, then it becomes interesting.
> 3. The way I'd see APIs and libraries in this world would be
> promise-based, i.e. an async routine returns a promise. Given
> asyncThing() that returns a promise, you can set up a callback:
>
> asyncThing().get(function(result, error) { });
>
> Or if you're a generator, you can kick the promise up to your caller,
> save your continuation, and either resume or throw when the result
> arrives:
>
> result = yield asyncThing();
>
> "yield" only goes up one frame though, it just makes an iterator from
> the continuation and returns it, it doesn't unwind the stack or allow
> arbitrary code to run. The caller (or some caller of that caller)
> still has to set up a callback on the promise. The callback would
> resume the continuation.
>
>
> 4. I could be wrong but I don't think JS generators in SpiderMonkey
> get longjmp() involved. The yield keyword saves a continuation, but
> the generator function then returns "normally" returning an iterator
> object. The calling function has to take that iterator object (which
> is implemented with a saved continuation) and if it chooses, it can
> further unwind the stack by itself yielding or returning, or it
> doesn't have to. Anyway there's not really much magic. Continuations
> are an implementation detail of iterators.
"saves a continuation" means that the call stack was saved - you
return to the previous frame. Once it's saved you now have two
execution stacks. It appears spidermonkey does this by memcpying the
top frames. Magic.
> Another way to put it, is that at each re-entrancy point you're going
> to see either the "yield" keyword (push the promise to the caller, by
> returning a promise-generating iterator) or you're going to see a
> callback (add a callback to the promise).
>
> It isn't like exceptions or threads where you can be interrupted at
> any time. Interruptions only happen when you unwind all the way back
> to the event loop which then invokes a callback.
>
>
> 5. The flip side of this is that "yield" isn't magic; people have to
> understand how it relates to promises, what a generator function is,
> etc. Pasting yield-using code from one function to another might not
> work if the destination function isn't a generator.
It's definitely not EcmaScript and it's not implementable in
EcmaScript. I would call this magic.
> 6. I piled several potentially separable but kinda-related things into
> the blog post I linked to. One is the "yield" convention. Others
> include:
>
> - have "actors" which have their own isolated JS global object, so no
> state shared with other actors.
>
> - share the event loop among actors such that callbacks from
> different actors can run concurrently but those from the same actor
> can't
> (i.e. from perspective of each distinct JS global object, there's
> only one thread, from perspective of the overall runtime, there are
> multiple - but not thread-per-request! the thread pool runs libev
> callbacks)
> (actors think they each have their own event loop, but it's
> implemented with a shared one for efficiency)
This is definitely do-able. In V8 our context creation isn't very
cheap, so creating new "green processes" would not be as cheap as one
would like - but probably much cheaper than starting a whole new
process.
I like my model:
- Processes are thick, they handle many hundreds or thousands of
concurrent connections.
- Process creation is relatively expensive (compared to green
threads), it requires starting a whole OS process.
- However: all the OS features can be applied: killing them
independently, security, etc.
- Most apps live in a single process - which is simple and nice.
Note that this in no way bars one from having multiple event
loops/contexts that communicate. The only limitation is that you won't
have an event loop/context for each connection because the overhead is
too high. You need to pack a couple hundred connections together into
one process.
> - have a framework where each http request gets its own actor
> (therefore global object) ... (something that's only efficient with
> SpiderMonkey in their most bleeding edge stuff, and I have no idea
> about any other JS implementation. and for sure it would always have
> _some_ overhead. I haven't measured it.)
This is something that can't be done in Node without a lot of overhead.
> - I'd expect state shared among http requests to be handled as in
> other actor frameworks, for example you might start an actor which
> holds your shared state, and then http request handlers could use
> async message passing to ask to set/get the state, just as if they
> were talking to another process.
This is how large (multi-process) Node programs tend to work as well.
The shared state actor might be redis.
> 7. In addition to using all cores without shared state or locking, an
> actors approach could really help with the re-entrancy dangers of both
> callbacks AND a "yield promise" syntax. The reason is that each actor
> is isolated, so you don't care that the event loop is running handlers
> in *other* actors. The only dangerous re-entrancy (unexpected callback
> mixing) comes from callbacks in the actor you're writing right now.
> One way to think about actors is that each one works like its own
> little server process, even though implementation-wise they are much
> lighter-weight.
Yes. In Node the processes are real processes - so they are somewhat
heavy. Giving a process to each request is too much overhead in Node -
so there will always be multiple connections on a process. There are a
lot of good things about this but a bad thing is that you cannot kill
a single connection and remove its memory, and consequently you
cannot upgrade a single connection to a new version of the software
without bringing down the entire OS process. Your thin processes will
be better in that area.
> 8. So the idea is Akka-style functionality, and ALSO libev, and ALSO
> ability to write in JavaScript, and ALSO wire it up nicely to http
> with actor-per-request ;-) and a pony of course
That's my idea too. I haven't yet released my IPC/Actor framework for
Node - but this is all possible with thick, simple processes and
without modifying the language. We're working on this outside of the
public repo right now, but we'll release it in a few months.
(By 'thick' I mean that they are OS processes and tend to handle many
hundreds of connections, as opposed to Erlang-style 'green' or 'thin'
processes.)
> Anyhow, I realize there are lots of practical problems with this from
> a node project perspective. And that it's harder to understand than
> the browser-like model.
>
> At the same time, perhaps it remains simpler than using actors with
> Java, Scala, or Erlang and preserves other advantages of node. Once
> you're doing complex things, sometimes it can be simpler to have a
> little more power in the framework taking care of hard problems on
> your behalf.
I do not think Node precludes one from 'Actor'-style systems.
> Just some food for thought is all since it was sort of an in-the-weeds
> thread to begin with. I don't intend it as a highly-unrealistic
> feature request ;-) don't worry.
>
> I am hoping there are people out there who explore these kind of
> directions sometime.
Me too, it's interesting. I just want people to understand that even
though Node takes a very "simplistic" approach to computing, it is not
fundamentally limited. I admit that Node has not done enough to make
building IPC/cross-machine programs obvious - but the foundations are
there.
Ryan
Sam,
I applaud your effort. You're trying to address your problem without
resorting to a more complex programming model. High-performance,
single-call stack attempts at "blocking" I/O are woefully absent from
the literature.
(That said, I personally do not think a pre-processor is needed.)
Ryan
So, let's walk through the progression of code patterns as it relates to
this discussion, as I'm curious exactly where in these steps your superior
knowledge of patterns would have kicked in and just magically solved all my
problems (even the ones I didn't know about yet).
I start out a new project and write:
A(1,2);
B(3,4);
C(5,6);
When I first write that code, the implementations of A, B, and C are all
synchronous/immediate and under my direct control. So my code works fine.
Then, for some reason a few weeks later, I need to change the implementation
of A and B and make them both asynchronously completing (because of XHR, for
instance). So, I go back and try to re-pattern my code with as little a
change as possible, both to the calling code and ALSO to the implementation
code inside A and B.
My first stab at it is:
A(1,2, function(){
  B(3,4, function(){
    C(5,6);
  });
});
Which works! And I shrug off the syntactic ugliness of it. After all, 3
levels of nesting is not terrible. And I'll probably never need more nesting
than that. I hope.
But then, later, I realize that I need to swap in another implementation of
A from a third-party, which I don't really know or trust. And as I start to
examine this pattern, I realize, really, all I care about in this case is to
be "notified" (think: event listening) of when A fully finishes (whether
that is immediately or later).
In fact, I read up on that third-party's documentation for A, and I find
that they actually have conditions inside the implementation where sometimes
the callback will be executed immediately, and sometimes it'll happen later.
This starts to make me nervous, because I'm not sure if I can really trust
that they'll properly call my callback at the right time. What if they call
it twice? Or too early? What if that uncertainty creates other race
conditions in my calling code?
So, at this point, I'm getting really frustrated. I don't know exactly what
pattern I can use to cleanly and efficiently ensure that A(..) will run
(however it wants to run) and then once it's complete, and only then, and
only once, will I be notified of it being complete so I can run B.
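As an aside, the "called twice, or too early" worry described here can at least be contained with a small guard around the callback. A hedged sketch (the `once` name and this exact shape are mine, not from any library in this thread):

```javascript
// Wrap a callback so an untrusted async API can invoke it at most once,
// and never synchronously in the tick it was handed over.
function once(cb) {
  let called = false;
  return function (...args) {
    if (called) return;           // swallow duplicate invocations
    called = true;
    // Defer to the next tick so "immediate" callbacks still behave async.
    process.nextTick(() => cb(...args));
  };
}
```

Usage with the untrusted third-party A would look like `A(1, 2, once(function () { B(3, 4, function () { C(5, 6); }); }));`. It doesn't solve the composition problem, but it does pin down "exactly once, never too early".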
I look around at promises and defers, and I see things like `when()`. Since
those third-party implementations don't support promise/defer, I reluctantly
switch back to re-implementing A and B myself. So I try that pattern:
when(function(){ return A(1,2); }, function(){
  when(function(){ return B(3,4); }, function(){
    C(5,6);
  });
});
Ouch, that's getting uglier quickly. It's starting to get a lot harder to
understand what the heck that code is supposed to do. Then I see `then`
style pattern, and I try that:
A(1,2)
.then(function(){ return B(3,4); })
.then(function(){ C(5,6); });
OK, we're making some progress. That style is not terrible. It's not great,
but I can probably live with it. Except for the fact that I have a heavy
"Promise" library implementation creating these return objects, and the
implementation inside A and B is ugly, and *forces* me to always return a
promise.
What if in another part of my code, I want A to only return an immediate
value and not need/observe its asynchronously completing behavior?
Oops, guess I have to refactor again, and create a wrapper to A like this:
function immediate_A(num1, num2) { ... }

function A(num1, num2) {
  immediate_A(num1, num2);
  return ...; /* some promise creation mechanism */
}
Now, I decide that I want B to actually pass a "message" onto C (aka, the
results of what B accomplished). I'm forced to change the function signature
of C so that it accepts a third parameter, and then I have to hard-code that
B will call C with exactly that signature. I try to get clever and use
partial currying for C for its first two parameters, but then my code keeps
getting even uglier and uglier.
The further down the rabbit hole I go, the more I realize that what was
natural to me in synchronous coding is nowhere even in the same universe as
the code I need to write when things become asynchronous. I realize that
even with perfect future-thinking foreknowledge of all my possible needs of
my use of these functions, I probably wouldn't have used a pattern that
could solve all of them.
All this "abstraction" has become a hindrance rather than a useful tool. I'm
frustrated and about ready to give up. Why oh why can't I just have the
language help me out instead of forcing me to jump through all these
"functional" hoops that are inevitably brittle when I need to make a change
to my usage of the pattern a few weeks from now?
All of the questions (and others) I'm raising here are ones I've actually
run into in my own coding, especially when writing code that I share between
browser and server where I need good (and maintainable) solutions for
sync/async negotiation.
-----
Boy, it sure would be nice if this just worked:
var foo = A(10,20) * B(30,40);
A(1,2) @ B(3,4) @ C(5,6);
And it hid from me most of the pain of dealing with promises and smoothed
over the differences between A, B, and C being sync or async, and let me use
immediate return values from A, B and C if I want to, without having to care
at all about possible asynchronicity under the covers.
> This is a userland thing. That's definitely the consensus.
The word "consensus" indicates that if the majority opinion/decision is not
unanimous, minority objections must be mitigated or resolved. You're either
conflating consensus with unanimity, or suggesting that anyone who thinks
the language could/should solve this (or at least be *part* of the solution)
is wrong and ignorable.
I have, as you can see above, strong objections to the assertion that it's
purely something a "userland" library can solve. At *best*, I'm going to
need several entirely different and probably incompatible promise/defer
libraries/syntaxes that I mix into my project for my different needs. What a
nightmare.
Before you go and casually dismiss the minority opinion that the language
could/should be part of the solution, I'd love to see perfect answers to all
of the above complications (and the dozen others I didn't present).
--Kyle
On Mon, Jan 17, 2011 at 5:41 PM, Ryan Dahl <r...@tinyclouds.org> wrote:
> Aside: I'm a Gnome/GTK fan-boy. I eagerly read your book when I was in
> high school - it was great.
I appreciate it! node.js is sweet as well, otherwise I wouldn't be on
the mailing list of course.
> Yeah - it will be interesting to see how it works out. You should be
> careful to note that it is not EcmaScript.
>
> I think your ultimate goals - what you list at the end - are
> achievable without modifying the language.
I'm glad to hear you're working on actor stuff.
To be clear, I spent a fair bit of time messing around with some of
this, but for now I put it down and just posted what I had as a source
of ideas for others. I figure it'll be cool to see how node and akka
and other projects end up.
> People complain about not being able to do this:
>
> var result = database.query("select * from table");
An example of where we would use the "yield" syntax in gjs code, is
syncing data via http requests to a local sqlite database, where both
the http requests and the sqlite calls are async.
So it would look like:
var result = yield database.query("select * from table");
here database.query returns a promise but the value of "result" is the
promised value, not the promise itself.
Without the "yield" sugar you'd write:
database.query("select * from table").get(function(result, exception)
{ ... whatever ... });
.get() could also be called .addCallback()
Anyway, I guess this syntactic sugar discussion ends up pretty
academic at least in the short term, since V8 doesn't have it.
It is true that it's just sugar.
> Promises are purely library sugar. I'm not sure if you're suggesting
> Promises require generators - if so - you're wrong. Here is an
> implementation that does not require modifying the language:
> https://github.com/kriszyp/node-promise
The gjs implementation of promises I linked to does not require
generators - it's just a dead-simple list of callbacks that look like:
function(result, exception) { }
Generators which yield promises are one way of creating a promise.
When you do promise.putReturn(value) or promise.putError(exception)
then all the callbacks are invoked, and some callbacks may happen to
continue a suspended generator.
> If the promise has a 'wait()' function, then it becomes interesting.
wait() presumably needs a thread, or recursive main loop, or
something, so the implementation in gjs doesn't have that. To "block"
on a promise (you can't truly block) you would use the yield stuff
which boils down to just using callbacks, rather than blocking.
I think promises are kind of useful even without wait() or
simulated-wait, i.e. just as callback holders.
Some reasons are:
- you can have more than one callback on the same event pretty easily
(the promise is like a mini signals-and-slots thing)
- you can write utility routines like "make a promise that is
fulfilled when all these other promises are fulfilled"
> stacks. It appears spidermonkey does this by memcpying the top frames.
> Magic.
yeah, it's a little scary I suppose. :-) but it's better than
longjmp()! (OK a "better than longjmp" argument can justify anything
at all ;-)
> Yes. In Node the processes are real processes - so they are somewhat
> heavy. Giving a process to each request is too much overhead in Node -
> so there will always be multiple connections on a process. There are a
> lot of good things about this but a bad thing is that you cannot kill
> a single connection and remove its memory, and consequently you
> cannot upgrade a single connection to a new version of the software
> without bringing down the entire OS process. Your thin processes will
> be better in that area.
It's kind of nebulous but what I like about actors / thin processes is
the separation of concerns; meaning you can decide separately if you
want to have multiple OS processes, a thread pool, or even a single
thread. That decision moves up into the container. Ideally, it's even
kind of magic; the container figures out how many cores you need to
use and uses them.
Even assuming the JS implementation keeps each new
global-object-plus-stack pretty light, it would definitely have to be
an optional application or web framework decision to have one per
request, since it's overhead for sure. You have http-parser so finely
tuned to be ultra-low-memory that, I'd guess, it wouldn't take much to
make it several times larger - and that wouldn't be suitable in many cases.
> That's my idea too. I haven't yet released my IPC/Actor framework for
> Node - but this is all possible with thick, simple processes and
> without modifying the language. We're working on this outside of the
> public repo right now, but we'll release it in a few months.
Cool.
> Me too, it's interesting. I just want people to understand that even
> though Node takes a very "simplistic" approach to computing, it is not
> fundamentally limited.
Oh, hopefully that's very clear. I think the event loop is the right
foundational building block.
Havoc
I suggested a "cookbook" or best-practice wiki and gave 2 typical
examples - we'll contribute some code if there's a good place to put
this - should we just start a GitHub repo for this?
> (...)
> Then, for some reason a few weeks later, I need to change the implementation of A and B and make them both asynchronously completing (because of XHR, for instance). So, I go back and try to re-pattern my code with as little a change as possible, both to the calling code and ALSO to the implementation code inside A and B.
>
> My first stab at it is:
>
> A(1,2, function(){
>   B(3,4, function(){
>     C(5,6);
>   });
> });
> Which works! And I shrug off the syntactic ugliness of it.
Perhaps, for maximum clarity, that should have been:
A(1,2,b);
function b () { B(3,4,c) }
function c () { C(5,6) }
> After all, 3 levels of nesting is not terrible. And I'll probably never need more nesting than that. I hope.
But it needed no nesting... (?)
> But then, later, I realize that I need to swap in another implementation of A from a third-party, which I don't really know or trust. And as I start to examine this pattern, I realize, really, all I care about in this case is to be "notified" (think: event listening) of when A fully finishes (whether that is immediately or later).
>
> In fact, I read up on that third-party's documentation for A, and I find that they actually have conditions inside the implementation where sometimes the callback will be executed immediately, and sometimes it'll happen later. This starts to make me nervous, because I'm not sure if I can really trust that they'll properly call my callback at the right time. What if they call it twice? Or too early? What if that uncertainty creates other race conditions in my calling code.
var FLAG= 1;
A(1,2,b);
function b () { FLAG && B(3,4,c), FLAG= 0 }
function c () { C(5,6) }
> So, at this point, I'm getting really frustrated. I don't know exactly what pattern I can use to cleanly and efficiently ensure that A(..) will run (however it wants to run) and then once it's complete, and only then, and only once, will I be notified of it being complete so I can run B.
Adding a flag gets you frustrated?
> (...)
>
> Before you go and casually dismiss the minority opinion that the language could/should be part of the solution, I'd love to see perfect answers to all of the above complications (and the dozen others I didn't present).
Honestly, I think there's nothing complicated above. Perhaps somewhere in the other twelve, you mean?
--
Jorge.
Ah, thanks for the explanation - that seems safe and useful. It seems
CoffeeScript should add syntax macros like this.
Since you're raising this point... we've had some experience with this
tension of executing stuff in parallel or sequentially in
StratifiedJS.
When you write a async-to-sequential framework, there is a great
temptation to parallelize everything that's parallelizable. E.g. in an
initial version of StratifiedJS we made function calls evaluate their
parameters in parallel. I.e. in foo(a(),b(),c()), 'a()', 'b()', and
'c()' were automatically evaluated in parallel.
In practice we found this to be very dangerous - in non-trivial code
it can quickly introduce non-obvious race conditions. Our conclusion
was that it is better never to parallelize anything by default; the
programmer should have to do it explicitly ( in StratifiedJS by using
waitfor/and/or or a higher-level abstraction built on these, such as
waitforAll - http://onilabs.com/modules#cutil/waitforAll ).
We've got a working version of StratifiedJS for nodejs btw (see
https://github.com/afri/sjs-nodejs ). Some things in node are very
easy to sequentialize, e.g. this is a blocking version of
child_process.exec:
function exec(command, options) {
  waitfor (var err, stdout, stderr) {
    var child = require('child_process').exec(command, options, resume);
  }
  retract {
    child.kill();
  }
  if (err) throw err;
  return { stdout: stdout, stderr: stderr };
}
With this you can e.g. write a http server that serves shell commands
in a blocking style:
http.createServer(function(req, res) {
  res.writeHead(200, {'Content-Type': 'text/plain'});
  res.end(exec("uname -a").stdout);
}).listen(8124, "127.0.0.1");
Other things in node are harder to convert to sequential, in
particular streams. The fundamental problem here is that node's
streams *push* data, whereas for blocking logic you always want to
*pull*. I'd be interested if anyone has any good idea of how to wrap
streams with a blocking API while still keeping compatibility with
their normal async usage in node.
Cheers,
Alex
You can:
1. resume the stream,
2. wait for 'data' event,
3. pause the stream,
4. wait until the user calls a synchronous read() again,
5. go to 1.
Short version of this post: the gjs promises/generators stuff and C#
await both have a promise/task object, and I think that might help
some. Don't know.
The long version spells out how it works:
On Tue, Jan 18, 2011 at 5:19 PM, jashkenas <jash...@gmail.com> wrote:
> for item in list
> defer func(item)
>
> In synchronous code, and in serial async code, "func" would be able to
> rely on the fact that the previous item had already been processed. In
> parallel async code, it wouldn't. Which is the default? Do you have
> different keywords to specify which one you mean?
In the case of
for each (item in list)
  yield func(item)
Then func() would be returning a promise, which is a handle to either
a result value or an exception, where exactly one of the result or
exception will exist when a future async callback gets invoked.
This is not done in parallel. At each yield, the generator function
containing the yield is suspended until the callback is invoked by the
event loop. So it's done sequentially.
If you wanted to do this in parallel, one simple approach is something
like this:
// fire off the event loop handlers and gather promises of their completion
var promises = [];
for each (item in list)
  promises.push(func(item));

// now wait for ALL the promises
yield promiseFromPromiseList(promises);
Here, promiseFromPromiseList creates a new promise which invokes its
callback when all the promises in the list have invoked theirs.
(Sample implementation: add a callback to each promise in the list
which increments a count, save in the closure the expected count and
the new promise, complete the new promise when the count is as
expected.)
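The sample implementation in that parenthetical is short enough to write out. (A hedged sketch: modern built-in Promises stand in for the gjs promise objects, but the counting technique is exactly the one described - add a callback to each promise, count completions, settle the composite when the count reaches the expected total.)

```javascript
// Sketch of promiseFromPromiseList: complete a composite promise when
// every promise in the list has completed; any error makes the
// composite "throw" instead.
function promiseFromPromiseList(promises) {
  return new Promise((resolve, reject) => {
    const expected = promises.length;
    let count = 0;
    if (expected === 0) return resolve(); // nothing to wait for
    for (const p of promises) {
      p.then(
        () => { if (++count === expected) resolve(); },
        (err) => reject(err) // first error wins
      );
    }
  });
}
```

As the text notes, this kind of combinator is exactly what you get from promises that you can't build from raw callbacks alone.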
This shows the value of having a promise abstraction rather than raw
callbacks. You could not implement callbackFromCallbackList afaik.
(Well I guess with a preprocessor you can do about anything)
The line:
yield promiseFromPromiseList(promises)
would also throw an exception if any of the promises in the list have
an error set on them. "yield" asks the calling function to unpack the
promise into either a throw or a value. In this case the value is
discarded but you'd still get a throw.
Instead of the promiseFromPromiseList you could also just do this:
for each (p in promises)
  yield p
but it would suspend/resume the function more times, if that makes a
difference, probably it doesn't. Also with promiseFromPromiseList you
could decide to do something like collect and log all exceptions
before completing the composite promise.
Say you actually needed the values to go with the individual items in
the list. You could get cute and implement a
reducePromises/foldPromises type thing, but leaving that aside, you
can do something like:
var promises = [];
for each (item in list)
  promises.push(func(item));

var results = [];
for each (p in promises)
  results.push(yield p);
Here, first you create all the promises (adding the corresponding task
to the event loop), and then the first "yield" returns to the event
loop. At that point the event loop could actually work on all of the
tasks. For example maybe all the http requests are written out. When
the http reply comes back, then the promises' callbacks are invoked.
The function here would resume at the "yield p" when the first promise
in the list got a reply back (or threw an exception). Then it would
proceed through the rest of the promises. It waits on them in order,
but that doesn't matter since they're all running in parallel.
It's a critical convention that when an async call returns a promise,
it's already added the needed hooks to the event loop (the promise has
to be "fired off" already, so work starts to happen as soon as the
event loop spins).
> Another issue lies in the fact that async code "pollutes" everything
> it touches, all the way up the call stack to the top. Absent static
> analysis, there's no way to know if the function you're about to call
> is going to "defer" -- if it does, you have to pass a callback, or
> defer yourself. This means that any function that's going to call an
> async function has to know it in advance, and defer the call
> accordingly. At this point, it's already hardly better than just
> passing the callback yourself.
The model we are using is that async calls return a promise object. If
the async call is implemented with a generator, then it looks like:
function waitTwoSeconds() {
  return Promise.fromGenerator(function() {
    // add a timeout, returning a promise which is completed when the
    // timeout fires
    yield addTimeout(2000);
    // do stuff 2000ms later
    // can have infinite "yield" or also manual promise usage in here
    // can just throw here and it makes the promise "throw"
  });
}
Now if the caller itself is a generator it can do:
yield waitTwoSeconds()
if it isn't, it can do the usual callback model:
waitTwoSeconds().get(function(result, error) {} )
"result" is just undefined for a timeout of course.
You only have to pollute upward if you also would have to in the
callback case, i.e. you _can_ fire-and-forget a callback/promise as a
side effect, if you don't need its result in order to return from your
function.
If you do need the async result, yeah you have to pollute upward. That
seems inherent and even desirable. A function either returns a value
or a promise, as part of its type signature.
Because this is all on top of generators, the fromGenerator() wrapper
thing is needed whenever you want to use the yield syntax, which is
pesky.
Another awkwardness with using generators is that just "return 10" at
the end of the generator doesn't work, instead there has to be a
special function to provide the value of the promise, which is
implemented by throwing a special exception. So that sucks.
Both of these would be fixable in CoffeeScript I'd think. I guess it's
all fixable in JS too if the ECMAScript guys got on it ;-)
> Another issue lies in argument ordering. Node assumes that "err" is
> always the first argument to the callback, and should be treated
> specially -- but should this assumption be built-in to the language?
> Surely not every async API will follow this pattern. Same goes for
> callback-as-the-final-argument-to-the-function. Some APIs take a
> callback and an errback, how do you deal with those?
Promises let you easily adapt different callback conventions. For example:
function myThing() {
    var p = new Promise();
    myThingThatTakesAWeirdCallback(function(foo, bar, result, baz, error) {
        if (error) p.putError(error); else p.putReturn(result);
    });
    return p;
}
Built-in to http://git.gnome.org/browse/gjs/tree/modules/promise.js
there's support for a "function(result,error)" flavor, but it's easy
to write the little adapters back and forth to whatever you have.
Another advantage of promises is that they are always returned, so you
don't have this issue of which argument is the callback (first? last?
if last what if you want a variable arg list?)
Another example, convert standard setTimeout to promises:
function addTimeout(ms) {
    var p = new Promise();
    setTimeout(function() { p.putReturn(); }, ms);
    return p;
}
> At the end of the day, you end up with "defer" sugar that makes a lot
> of assumptions, choosing either serial or parallel,
I think as long as promises "fire off" when created, the sugar can
just be serial, because people can create a bunch of promises before
suspending their function to yield any of their values. Fire off all
your tasks before you block on any.
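The "fire off all your tasks before you block" point can be sketched as follows. The Promise stand-in here (putReturn/get) is my own minimal assumption, not the actual gjs promise.js code:

```javascript
// Minimal putReturn-style promise, just enough for the sketch.
function Promise() {
  this.callbacks = [];
  this.done = false;
}
Promise.prototype.putReturn = function(result) {
  this.done = true;
  this.result = result;
  this.callbacks.forEach(function(cb) { cb(result); });
};
Promise.prototype.get = function(cb) {
  if (this.done) cb(this.result);
  else this.callbacks.push(cb);
};

function addTimeout(ms) {
  var p = new Promise();
  setTimeout(function() { p.putReturn(ms); }, ms);
  return p; // the timer is already running: the promise has "fired off"
}

// Both timers start immediately on creation, so the total wait is
// roughly max(50, 30) ms, not 50 + 30 ms, even though we collect the
// results serially.
var a = addTimeout(50);
var b = addTimeout(30);
a.get(function() {
  b.get(function() {
    console.log("both done");
  });
});
```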
C#'s name "await" for the keyword makes a lot more sense than "yield"
probably, yield is more based on using generators for iteration. Also
C# doesn't require the kludges like fromGenerator() and
putReturnFromGenerator(). CoffeeScript could presumably be more
cleaned up, like C#.
> and locking you in
> to a fixed callback structure, for better or for worse.
Since a promise gives you a way to deal with callbacks (and callback
lists) "generically" you can use them as the lingua franca and adapt
to various callback flavors, which is nice.
Havoc
Just out of curiosity, do you think an operator like @ "looks" synchronous
or asynchronous? In other words, does:
A(1,2,3) @ B(4,5,6);
"look" more synchronous (or rather, less asynchronous) than:
A(1,2,3,function(){ B(4,5,6); });
If so, can you please explain how?
I have several places in my current code where I pass around function
references and the code is entirely synchronous. Should I be worried then
that my actually-synchronous code in fact "looks" asynchronous and is thus
harder to understand? That's a pretty slippery slope to stand on.
I'm actually kinda confused by the whole concept of something looking
synchronous or asynchronous. If you're asserting that merely passing around
callbacks (function references) inherently/semantically looks asynchronous,
I take issue with that assertion. The only reason that it looks that way (to
you, maybe not to someone else) is not anything semantic about the usage,
but just because *almost* all uses do in fact turn out to be asynchronous,
so a lot (but not all) of us have been conditioned that way, and it becomes
obvious (but not semantic).
In my mind, there's a big difference between making a decision to pattern
something some way because "that's how everyone does it" versus "that's how
it semantically makes sense".
In fact, I would argue (albeit a little tenuously) that I think the @
operator as shown above "looks" more asynchronous than nested callbacks
because there is actually a semantic reason I chose to pattern it that way,
and why I used the @ operator specifically:
A() @ B() is to be interpreted as "execute A, and if A suspends itself,
then pause the rest of the statement. If A does not pause itself, or once A
completes (or errors), then immediately 'continue' execution of the
statement *AT* B."
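Since the @ operator doesn't actually exist, here's a rough, hypothetical desugaring of that interpretation as a plain helper function (the "suspends itself by returning a thenable" test is my own reading, not a spec):

```javascript
// at(a, b): run a; if a "suspends itself" by returning something
// thenable, continue at b when it completes; otherwise continue at b
// immediately with a's return value.
function at(a, b) {
  var r = a();
  if (r && typeof r.then === "function") r.then(b); // A suspended itself
  else b(r);                                        // A ran synchronously
}

// Synchronous case: B runs immediately after A.
at(function() { return 42; },
   function(v) { console.log("continued at B with", v); });
```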
So, please explain how the A(... ,B) pattern is more semantically
asynchronous (than the alternatives)?
--Kyle
You're right, this works quite well, although there is quite a lot of
adding/removing of listeners (because for each read we need to listen
to 'data', 'end', and 'error' - see
https://github.com/afri/sjs-nodejs/blob/master/lib/stream.sjs ).
Also, from the perspective of sync APIs, it would be desirable to have
all streams paused by default on creation. But I can see how this
clashes with keeping the async APIs simple.
Cheers,
Alex
I think the only reason any programmer knows what any operator does is
because there's clear documentation and plenty of example use-cases. And
this would especially be true if the language introduced a new operator,
there'd be lots and lots written about the new operator and how it works,
what it's for and not for, etc.
I don't think that just because you look at an operator and don't know
immediately what it does, that this means the operator isn't useful or
semantic. Consider ~. That operator is quite strange looking. Most people
never even type that character on their keyboards. But developers (when they
learn it) understand exactly what it's doing, that it's a unary operator
that inverts the bits of the operand. You have to admit, that's not
particularly semantic at first glance. But many developers have learned that
operator, and put it to good use all the time.
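For the record, a quick illustration of ~ and the sort of learned-but-unsemantic use it gets:

```javascript
// ~n flips every bit of n; in two's complement that works out to -(n + 1).
console.log(~5);  // -6
console.log(~-1); // 0

// A common (if cryptic) idiom built on it: ~str.indexOf(x) is falsy
// exactly when indexOf returns -1 ("not found").
var s = "hello";
if (~s.indexOf("ell")) console.log("found");
```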
*I* happen to think that synchronous and asynchronous code *shouldn't* be so
starkly different. I guess that's where I differ from a lot of people on the
list. But I write both synchronous and asynchronous code mixed in my
programs all the time, and I long for the ability to write the same style of
code in both contexts. In fact, I have a number of prominent use-cases in my
code where the same function call is either synchronous or asynchronous
depending on run-time conditions. I even have functions that are both
synchronous and asynchronous at the same time -- that is, the same function
will return an immediate value AND it will fire off asynchronous behavior
later, too.
The intended definition I have for the @ operator, as I alluded to in my
previous message, is that it can be either synchronous OR asynchronous,
depending on what the function call chooses to do. *I* think the function
itself should get to decide if it's sync or async, and the calling code
should just respond to that decision appropriately. I've never liked the
reverse idea, that the calling code is the one who is in control of if a
function is immediate or not.
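A minimal sketch of that idea (all names here are hypothetical, not from any real API): the function itself decides whether to answer synchronously or asynchronously, and the calling code is identical either way.

```javascript
var cache = {};
function getValue(key, cb) {
  if (key in cache) {
    cb(cache[key]);           // cached: answer immediately, synchronously
  } else {
    setTimeout(function() {   // not cached: simulate a slow async lookup
      cache[key] = key.length;
      cb(cache[key]);
    }, 10);
  }
}

getValue("foo", function(v) {    // asynchronous the first time
  getValue("foo", function(w) {  // synchronous the second time
    console.log(v, w);           // same calling code in both cases
  });
});
```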
Contrary to what you suggest about this making developers lazy, I think
usage of that operator will *force* the developer to write code that is
"async safe" because the function in question may or may not defer itself.
setTimeout(function(){ ... }, 5000);
vs.
setTimeout(1000) @ function() { ... };
I don't think that type of syntax would make developers any lazier about
knowing how to safely write async code. And JavaScript could easily be
changed to where a function like setTimeout() could be called in either way,
depending on what the developer prefers.
I absolutely love the idea of, once I understand sync/async issues (which I
do), being able to write code that blurs the lines (while still functioning
the way I want it to) between sync and async. IMHO, if a language like
JavaScript supports async functionality, that async behavior should be a
first-class citizen, just as obvious and front-facing as its sync
sibling. Right now, with clunky nested callback syntax, async "feels" more
like the awkward stepchild of JavaScript. And that's what I want to change.
I'm not suggesting that JavaScript would change to where there'd be no
callbacks for accessing async behavior. I'm simply suggesting that @
operator could be *added* to the equation to let those of us (maybe only
me!?) that want to take advantage of it, do so. Perhaps callbacks would stay
the default/preferred method, but perhaps given long enough exposure, @
might end up being a preferred pattern.
---------
When a developer designs an API (even just a function signature), they've
allowed the calling code to rely on a stable and consistent usage of that
API, regardless of the underlying implementation details changing. This is a
fundamental tenet of software development. I'm basically just extending
that to say that sync vs. async should be able to be "hidden" behind the
scenes as an implementation detail, and the calling code "API" should look
consistent.
In fact, in several of my projects, I currently use a "Promise"
implementation to abstract away the differences between async and sync. I
have code that I run both in the browser and on the server, and in the
browser it's asynchronous, but in the server it's synchronous. I like the
fact that by using promises, my code operates just fine regardless of sync
or async context. The "Promise" hides that detail from my main code base,
and I think that's a great thing. It certainly makes my job of maintaining
my code better.
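Here's a hedged sketch of what that looks like (this tiny wrapper and its names are my own stand-in, not the actual implementation from my projects): the caller's code reads identically whether the value was produced synchronously or asynchronously.

```javascript
function Deferred() { this.cbs = []; }
Deferred.prototype.resolve = function(v) {
  this.resolved = true;
  this.value = v;
  this.cbs.forEach(function(cb) { cb(v); });
};
Deferred.prototype.then = function(cb) {
  if (this.resolved) cb(this.value); // value already there: run now
  else this.cbs.push(cb);            // value pending: run later
};

// "Server" context resolves synchronously, "browser" context asynchronously.
function load(sync) {
  var d = new Deferred();
  if (sync) d.resolve("cached");
  else setTimeout(function() { d.resolve("fetched"); }, 10);
  return d;
}

// Identical calling code for both contexts:
load(true).then(function(v) { console.log("sync:", v); });
load(false).then(function(v) { console.log("async:", v); });
```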
Bottom line: many/most of you seem to want sync and async code to "look"
very different, but I want the opposite. I want to be able to write
consistent code that responds appropriately to either context as
appropriate.
Perhaps that weird desire is just what makes me polar opposite to the
mainstream node.js crowd. Sorry for clogging up this thread with my own
opinions on the topic. I guess I'll just go do my own thing.
--Kyle