[ANN] fibers 0.2.0: Faster, smaller, more stable, just as awesome


Marcel Laverdet

Feb 22, 2011, 1:47:45 PM
to nodejs
Hey this is an update from last month on the state of fibers in nodejs. I've made a lot of improvements to fibers, fixed a lot of bugs, and wanted to send out an update. I'll try to address a lot of questions and misconceptions people have about fibers here.


= = "What are fibers?" = =

Fibers, similar to coroutines and "green threads", are a way to write asynchronous code in a synchronous way. Essentially they are threads, but without the scheduler. That means the operating system isn't cutting off your threads and starting them again whenever it feels like. The job of swapping to another fiber is up to the client. And at no point will more than 1 fiber be running; you don't get to run multiple fibers at once.


= = "How does that help me write asynchronous code?" = =

Consider a simple program which copies one file to another. Ignoring for a moment that stream.pipe() exists, you're going to have to write at least two callbacks within each other. You also have to check for errors manually which is no fun. That's where fibers come in.

Check out this gist:

Note the difference between copyFileWithoutFiber() and copyFileWithFiber(). Without fibers you must check for errors in your callback to ensure you're not doing anything stupid. Then you have to nest another callback which also checks for errors. If your workflow gets even marginally complicated your code will quickly turn into a deeply-nested mess.

But with fibers you can just use a simple try/catch block to catch errors. No callbacks are needed, you just yield execution back to the original code. When the file is read, your fiber will pick back up where it left off. Notice the code looks synchronous, but to the client copyFileWithoutFiber() and copyFileWithFiber() are /indistinguishable/. Both functions will return immediately and call the callback when their job is done. But with fibers you can isolate your callback code succinctly and then just worry about your workflow.
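
The gist is not reproduced in this thread, so what follows is only a rough sketch of the contrast it draws, not the original code. Everything in it is an assumption made for illustration: `disk`, `pending`, `fakeFs` and `drain()` are a hypothetical in-memory stand-in for the fs module and the event loop, and a generator plays the part of the fiber (generators are a feature of current JavaScript engines, not part of the fibers API).

```javascript
// Hypothetical in-memory "disk"; callbacks are queued and fired later by drain(),
// which stands in for the event loop.
const disk = new Map([['a.txt', 'hello']]);
const pending = [];
const fakeFs = {
  readFile(name, cb) {
    pending.push(() => {
      if (disk.has(name)) cb(null, disk.get(name));
      else cb(new Error('ENOENT: ' + name));
    });
  },
  writeFile(name, data, cb) {
    pending.push(() => { disk.set(name, data); cb(null); });
  },
};
function drain() { while (pending.length) pending.shift()(); }

// Callback style: each step nests inside the previous one and checks err by hand.
function copyFileWithCallbacks(from, to, done) {
  fakeFs.readFile(from, (err, data) => {
    if (err) return done(err);
    fakeFs.writeFile(to, data, (err) => done(err));
  });
}

// Generator style (standing in for a fiber): one try/catch covers both steps.
function copyFileWithGenerator(from, to, done) {
  const gen = (function* () {
    let error = null;
    try {
      const data = yield (cb) => fakeFs.readFile(from, cb);
      yield (cb) => fakeFs.writeFile(to, data, cb);
    } catch (err) {
      error = err;
    }
    done(error);
  })();
  // Resume the generator with each result, or throw the error into it.
  function step(err, value) {
    const r = err ? gen.throw(err) : gen.next(value);
    if (!r.done) r.value(step);
  }
  step();
}

copyFileWithGenerator('a.txt', 'b.txt', (err) => console.log(err ? 'failed' : 'copied'));
drain();
```

Both functions return immediately and report through the callback, so a caller cannot tell them apart; only the body of the second one reads top to bottom.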


= = "How does it work?" = =

When you create a new fiber it gets an entirely new execution stack. The first frame on this stack is a function that you pass in. After your fiber has been created you can freely switch into and out of that stack by using run() and yield() respectively. When you switch back into a stack everything is exactly how you left it. You resume in the middle of your function, local variables are intact, closures are fine, and so on. You can pass data into the fiber with run(), and return data back to the caller with yield() (you could also do this with globals). You can even throw exceptions /into/ the fiber with throwInto().

Keep in mind that this is the exact same thing that happens with a thread. Except with a thread you don't control when your thread starts and stops so you have to deal with peculiarities like locking and race conditions. With a fiber you explicitly switch into and out of fibers, so none of that is a problem.
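
To make the pause-and-resume idea concrete without installing anything, here is a small sketch that uses a generator as a stand-in for a fiber. This is an analogy, not the fibers API: `next(value)` plays the part of run(), the `yield` expression plays the part of yield(), and the names `worker` and `fiberLike` are made up for the example. One real difference is that a generator can only pause inside its own function body, whereas a fiber has a full stack and can pause from any depth.

```javascript
// A generator standing in for a fiber: it keeps its own suspended frame,
// and the caller decides when it runs.
function* worker() {
  let total = 0;            // local state survives every pause
  while (true) {
    // Pause here, handing `total` to the caller. Execution resumes with
    // whatever value the caller passes to its next call to next().
    const n = yield total;
    total += n;
  }
}

const fiberLike = worker();
fiberLike.next();                    // start: runs up to the first yield
const a = fiberLike.next(5).value;   // resume with 5, pauses again with total = 5
const b = fiberLike.next(10).value;  // resume with 10, pauses again with total = 15
console.log(a, b);                   // prints "5 15"
```

Nothing runs between the calls to `next()`, which is why no locking is needed: control only changes hands at the points where the code asks for it.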


= = "Sounds expensive..." = =

It's actually not that expensive. Context switches (switching between stacks/threads/fibers) aren't really that expensive; your computer is generally switching between threads thousands of times per second.

In terms of memory, each fiber (while running) will consume around 64 KB of memory. That memory basically all goes to the stack. Fibers are reused in a pool to avoid constantly creating and deleting fibers; creating lots of short-lived fibers is totally acceptable. The pool is currently set to a maximum of 120 fibers. For those of you keeping track at home that's about 8 MB of memory, which incidentally is the cost of a /single/ pthread (by default).
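
As a quick check of that arithmetic, assuming a flat 64 KB per fiber and a completely full pool of 120 (the figures quoted above; the variable names are only for illustration):

```javascript
const stackKb = 64;                  // approximate stack size per fiber, in KB
const poolSize = 120;                // maximum number of pooled fibers
const totalKb = stackKb * poolSize;  // 7680 KB
const totalMb = totalKb / 1024;      // 7.5 MB, i.e. roughly 8 MB
console.log(totalKb + ' KB = ' + totalMb + ' MB');  // prints "7680 KB = 7.5 MB"
```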

If you compare fibers to callbacks in a micro-benchmarking situation you're likely to say "oh no this is too slow!" but that mantra is foolish. Yes, fibers will be considerably slower, but that's merely a consequence of the v8 C++/JS membrane. Basically any time you switch between C++ code and JS code you're going to take a relatively large performance hit when compared to a pure Javascript function. This is why all of the v8 runtime is written directly in Javascript. But what you must keep in mind is that this cost, when compared to something like reading from a database, is insubstantial.

To put this in perspective, these two functions are about the same in terms of cpu time spent. In fact F2() is slightly slower.

    function F1() {
      var fiber = Fiber(function() {
        yield();
      });
      fiber.run();
      fiber.run();
    }

    function F2() {
      var buf = new Buffer(5);
      buf.write('hello');
      var buf2 = new Buffer(5);
      buf2.write('world');
    }

In both cases there are 6 switches between JS and C++ (object destructors count as a JS/C++ switch), and that makes up most of the cost. You wouldn't feel bad about calling native node functions, so don't feel bad about using fibers.

However, I would recommend against using fibers for long polling. Having very large numbers of fibers active (many thousands) may begin to impact garbage collection. For long polling it's not hard to write your long poll in a callback and then start a fiber when you need to do some work.


= = "But there are 100's of libraries which do this already." = =

I'm certain there's nothing out there that will result in code as elegant as you can achieve with fibers. The closest things you can get are transformation-based approaches like streamline.js and narrativejs, but fibers are just easier to work with.

You will need a good library on top of fibers to fully realize this potential, however. I did the hard part, now you figure out how to take care of that :-P. I recommend checking out https://github.com/lm1/node-fiberize or even https://github.com/kriszyp/node-promise .


= = "How can I try it out?" = =

npm install fibers

Be sure to read the documentation at https://github.com/laverdet/node-fibers .

I'm happy to field questions about fibers, just ask. I've been spending a lot of time over the past month or so thinking about this so I've probably got an answer for you.

Thanks for reading!
~ Marcel

Jorge

Feb 22, 2011, 4:39:01 PM
to nod...@googlegroups.com
On 22/02/2011, at 19:47, Marcel Laverdet wrote:
> (...) If your workflow gets even marginally complicated your code will quickly turn into a deeply-nested mess.

Yes Marcel, but in your example there was no need to nest anything... :-)

function copyFileWithoutFiber (from, to, ondone) {

    fs.readFile(from, 'utf8', readCB);

    function readCB (err, data) {
        if (err) {
            ondone(err);
            return;
        }
        fs.writeFile(to, data, 'utf8', writeCB);
    }

    // writeFile passes its error (if any) as the first argument
    function writeCB (err) {
        if (err) {
            ondone(err);
        } else {
            ondone();
        }
    }

}

I'm hearing it perhaps too often, phrases along the lines of "quickly turns into a deeply-nested mess".

It is becoming a myth, but it's a false myth. Simply: it's not true, imho.

> (... great explanation ...)

I appreciate it. Great work, well done. I for one can't wait to experiment/play with fibers :-)

Thank you!
--
Jorge.

Jorge

Feb 22, 2011, 5:35:34 PM
to nod...@googlegroups.com
On 22/02/2011, at 23:22, Preston Guillory wrote:
> On Tue, Feb 22, 2011 at 3:39 PM, Jorge <jo...@jorgechamorro.com> wrote:
>> I'm hearing it perhaps too often, phrases along the lines of "quickly turns into a deeply-nested mess".
>>
>> It is becoming a myth, but it's a false myth. Simply: it's not true, imho.

function f (callback) {

    a(a_CB);
    
    function a_CB (err) {
        if (err) return callback(err);
        b(b_CB);
    }
    
    function b_CB (err) {
        if (err) return callback(err);
        c(c_CB);
    }
    
    function c_CB (err) {
        if (err) return callback(err);
        d(callback);
    }
    
}

Where's the deeply-nested-mess ?
-- 
Jorge.

Bradley Meck

Feb 22, 2011, 5:39:53 PM
to nod...@googlegroups.com
Even without deep nesting and the conventions to avoid it, It is interesting how you can combine errors, as well as provide a new experience to coding style. As explained in the about, this is not too far off from what happens, although I am interested in some things such as benchmarks, and possible need for semaphores. Either way good job. And to those attacking this, I would recommend some time to see what is made from it as a design perspective rather than performance for now.

Jorge

Feb 22, 2011, 5:43:11 PM
to nod...@googlegroups.com
On 22/02/2011, at 23:39, Bradley Meck wrote:

> Even without deep nesting and the conventions to avoid it, It is interesting how you can combine errors, as well as provide a new experience to coding style. As explained in the about, this is not too far off from what happens, although I am interested in some things such as benchmarks, and possible need for semaphores. Either way good job. And to those attacking this, I would recommend some time to see what is made from it as a design perspective rather than performance for now.

Exactly. Agreed. +1
--
Jorge.

Preston Guillory

Feb 22, 2011, 8:02:15 PM
to nod...@googlegroups.com
I hope I didn't come across as attacking it.  On the contrary, it's a cool project and addresses a real need.


Martin Nelson

Feb 22, 2011, 11:29:12 PM
to nod...@googlegroups.com, Jorge
If chaining calls is that common, why not create a helper function like: 

function chain() {
    if (arguments.length === 0) { return; }
    if (arguments.length === 1) {
        arguments[0]();
    } else if (arguments.length === 2) {
        arguments[0](arguments[1]);
    } else {
        var outerArgs = arguments;
        arguments[0](function (err, data) {
            if (err) return outerArgs[outerArgs.length - 1](err);
            var newArgs = Array.prototype.slice.call(outerArgs, 1, outerArgs.length);
            chain.apply(null, newArgs);
        });
    }
}

then you could do:

chain(a, b, c, d, callback);
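
A quick way to see how such a helper behaves, as a sketch only: the `chain()` below is a compact restatement of the helper above so the snippet runs on its own, and `step`, `fail` and `trace` are hypothetical names invented for the demonstration. Note that, like the original, it does not pass data from one step to the next; it only forwards errors to the final callback.

```javascript
// Compact restatement of the chain() helper so this snippet is self-contained.
function chain(...steps) {
  if (steps.length === 0) return;
  const [first, ...rest] = steps;
  if (rest.length === 0) return first();
  if (rest.length === 1) return first(rest[0]);
  const callback = rest[rest.length - 1];
  first((err) => {
    if (err) return callback(err);   // stop at the first error
    chain(...rest);                  // otherwise move on to the next step
  });
}

const trace = [];
// Hypothetical steps: each records its name, then succeeds or fails.
const step = (name) => (next) => { trace.push(name); next(null); };
const fail = (name) => (next) => { trace.push(name); next(new Error(name + ' failed')); };

chain(step('a'), step('b'), step('c'), step('d'),
      (err) => trace.push(err ? 'error' : 'done'));
// trace: a, b, c, d, done

chain(step('a'), fail('b'), step('c'), step('d'),
      (err) => trace.push(err.message));
// trace gains: a, b, "b failed" (c and d never run)
console.log(trace.join(', '));
```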

Preston Guillory

Feb 22, 2011, 5:22:29 PM
to nod...@googlegroups.com
On Tue, Feb 22, 2011 at 3:39 PM, Jorge <jo...@jorgechamorro.com> wrote:
> I'm hearing it perhaps too often, phrases along the lines of "quickly turns into a deeply-nested mess".
>
> It is becoming a myth, but it's a false myth. Simply: it's not true, imho.

Marcel Laverdet

Feb 23, 2011, 12:33:57 AM
to nod...@googlegroups.com
> It is becoming a myth, but it's a false myth. Simply: it's not true, imho.

True, there are techniques to mitigate the nesting, like you mentioned, but Preston's 4 line example turned into 12 when unwrapped. There are other techniques to improve brevity, but there's generally a loss of readability here, often with lots of one-line anonymous functions.


> although I am interested in some things such as benchmarks, and possible need for semaphores.

Performance here is not an issue; if your application is slow it won't be because you used fibers. There IS increased overhead compared to pure callbacks, but it really is insubstantial. Most of the cost is incurred by passing through the JS/C++ membrane, and I imagine we'll see that get better and better as v8 evolves. And if you compare that cost to the profile of a typical application it will be a small fraction of a percent of the application's cpu time. As for semaphores, there's no need because there's no scheduler which will preempt your Javascript; context switches only occur when you request them.


> chain(a, b, c, d, callback);

This is reasonable in the simple case but as your workflow increases in complexity the boilerplate does as well. Instead of separating tasks with a semicolon you begin separating them with "}, function() {". Proper error handling becomes even more tricky. It's just a very unintuitive way to write software. With fibers you can write code naturally with full exception support.

Isaac Schlueter

Feb 23, 2011, 2:07:24 AM
to nod...@googlegroups.com
On Tue, Feb 22, 2011 at 20:29, Martin Nelson <marty....@yahoo.com> wrote:
> If chaining calls is that common, why not create a helper function like:

https://github.com/isaacs/npm/blob/master/lib/utils/chain.js

--i

Floby

Feb 23, 2011, 4:22:25 AM
to nodejs
definitely going to try it.

Martin Nelson

Feb 23, 2011, 10:20:26 AM
to nod...@googlegroups.com, Isaac Schlueter
Thanks for sharing, Isaac. I'm newer to javascript and node; this was a good exercise to do and I (finally) have a workable TDD based environment set up to do it with.

Liam

Feb 23, 2011, 1:07:52 PM
to nodejs
On Feb 22, 1:39 pm, Jorge <jo...@jorgechamorro.com> wrote:
>
> I'm hearing it perhaps too often, phrases along the lines of "quickly turns into a deeply-nested mess".
>
> It is becoming a myth, but it's a false myth. Simply: it's not true, imho.

My experience with constructing SQL statements dynamically for SQLite
(along with fs & net ops) does indeed yield some very deeply nested
code.

Using Node as a middleware layer may not entail much nesting. Building
a complex app with it probably will.

Marco Rogers

Feb 23, 2011, 1:44:58 PM
to nod...@googlegroups.com
In your gist example, it's not really clear how throwInto behaves.  What I assume is the following.

- I start a fiber inside copyFileWithFiber, let's call it F
- I yield at some later point. Let's say on line 10.
- Outside of F I encounter an error and throwInto F.
- F resumes but immediately throws the error *at the point of the last yield*, so at line 10.
- In your example that thrown error is immediately caught by the try/catch in F.  Then I can do whatever I like with it.

Is that correct?  Am I missing anything?

:Marco

Bruno Jouhier

Feb 24, 2011, 2:56:06 AM
to nodejs
Agree with Liam. People who are building technical layers seem happy
with callbacks and actually the mixture of callback and event-driven
programming seems to open the door to creative designs. But people
who try to build business applications with complex logic sitting
on top of async database or web service layers are having a hard time
with callbacks: more code to write, more risks of errors, code is
harder to read, etc.

I've been using streamline.js for more than a month and I have
converted more than 10,000 lines of code to the async style. It makes
a huge difference in terms of readability, ease of maintenance, etc.
And there is no loss in performance.

BTW, I'd like to move to fibers but I have modules that are shared
between server and browser. Any idea on when fibers will become
available browser side (we need Chrome, FF, Safari and IE)?

Bruno

Bruno Jouhier

Feb 24, 2011, 3:39:12 AM
to nodejs
Typo: read "converted code to sync style"

Marcel Laverdet

Feb 24, 2011, 9:46:14 AM
to nod...@googlegroups.com
> Is that correct?  Am I missing anything?

Yes you've got it! Basically when you call yield() the current stack is frozen and control goes back to wherever run() was called from. Then when you call run() it switches back to the fiber (which is currently paused in the middle of yield() somewhere), and yield() returns whatever you passed into run(). But if you call throwInto(), then yield() will throw instead of return.
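
The rule just described can be seen in miniature with a generator standing in for a fiber. This is an assumption for illustration rather than fibers code: the generator methods `next()` and `throw()` play the parts of run() and throwInto(), and `task`, `log`, `ok` and `failing` are hypothetical names.

```javascript
const log = [];

function* task() {
  log.push('started');
  try {
    const value = yield 'paused';       // frozen here until resumed
    log.push('resumed with ' + value);  // reached when resumed normally
  } catch (err) {
    log.push('caught ' + err.message);  // reached when an error is thrown in
  }
  return 'finished';
}

const ok = task();
ok.next();                          // run up to the yield
ok.next('data');                    // the yield expression returns 'data'

const failing = task();
failing.next();                     // run up to the yield
failing.throw(new Error('boom'));   // the yield expression throws instead

console.log(log.join(' | '));
// prints "started | resumed with data | started | caught boom"
```

In both cases the code picks up at the same paused `yield`; the only difference is whether that expression produces a value or raises, which is why an ordinary try/catch around it is enough.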


> Using Node as a middleware layer may not entail much nesting. Building
> a complex app with it probably will.

+1 to this sentiment as well. There are many applications where I would never even feel the need to use fibers. However when I'm building websites which require complex data dispatching, fibers are invaluable.


> BTW, I'd like to move to fibers but I have modules that are shared
> between server and browser. Any idea on when fibers will become
> available browser side (we need Chrome, FF, Safari and IE)?

Well, considering there's no specification for fibers even being discussed right now, I wouldn't hold your breath. For shared async code between server and client, fibers are simply not an option. There are also continuations in Rhino which would be in the running for standardizing some form of "yield". But worth pointing out is that fibers can be implemented very very easily. Probably 40% of the code in node-fibers is there because I didn't want to require any patches to node or v8. The rest of it is super straightforward, whereas continuations (seem like they) would be much harder for browsers to implement. If it proves to be a popular feature on server platforms perhaps it could make it down the road to standardization, but I think that's a long way off. Also it's worth pointing out that v8cgi has a fibers library as well, but theirs is based on pthreads and has a different interface.

I'm wondering, though, what kind of dispatch logic would be shared between server and client. Seems like best practice would be to put an RPC in between your client and server to reduce round trips to the server. Now that I think about it, there's never been a time I wished I had fibers on the client side. It's just pretty rare to need to build out complex dispatch workflows on the client. On the other hand after about 7 minutes of playing with nodejs I wanted to kill myself.

Bruno Jouhier

Feb 24, 2011, 11:10:04 AM
to nodejs
> I'm wondering, though, what kind of dispatch logic would be shared between
> server and client. Seems like best practice would be to put an RPC in
> between your client and server to reduce round trips to the server. Now that
> I think about it, there's never been a time I wish I had fibers on the
> client side. It's just pretty rare to need build out complex dispatch
> workflows on the client. On the other hand after about 7 minutes of playing
> with nodejs I wanted to kill myself.

Scenario is offline apps where we sync data to offline storage and we
want to reuse server logic on the client.