> On 6 Dec 2014, at 03:03, Harry Simons <simon...@gmail.com> wrote:
>
> Hello,
>
> I'm new to Node. It looks very wonderful so far, I must say.
>
> Is there any official documentation on Node Internals, addressing questions such as the following:
>
> 1. How is the priority of the main thread that initiates the asynchronous I/O, creates servers, etc., handled vis-a-vis the event-handler thread that calls the callbacks? Do I have to know about this aspect of things as a Node programmer?
That's the same thread: the main thread runs in a loop, kicking off IO (which lands in a background thread pool when it's filesystem IO, because Unix's async IO APIs are broken for anything that isn't a socket), then looping until an event is ready and invoking its callback.
The background thread pool's priority isn't terribly interesting: it's IO-bound, so there's never CPU scheduling going on for it in any useful sense. You can model the whole thing as single-threaded execution with parallel IO. You really don't have to know about threads unless you're diving under the hood.
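To make that concrete, here's a minimal sketch (the filename and timings are just for illustration): because callbacks run on the same main thread, a busy-loop delays a callback even if its IO finished long ago.

    // single-thread demo: a callback can't fire while the main thread is busy
    var fs = require('fs');
    var start = Date.now();

    fs.readFile('/etc/hosts', function (err, data) {
      // runs only once the loop below has released the main thread
      console.log('callback after ' + (Date.now() - start) + 'ms');
    });

    // hog the main thread for ~1 second
    while (Date.now() - start < 1000) {}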
> 2. To use multi-cores, I can have multiple Node processes, sure. But how do I serialize/synchronize these processes when doing a set of (non-atomic) operations in an atomic manner over, say, a filesystem or a database?
Locking! Filesystem locks provided by your OS, or a distributed lock service. A proper distributed lock service lets you scale across machines, not just CPUs, of course.
And then if you can design things to be stateless or use data types like CRDTs, you may be able to avoid locking for some tasks.
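On a single machine, here's a minimal sketch of the lockfile flavor of this (the path, retry interval, and withLock helper are made up for illustration; real code wants stale-lock handling, a library, or OS-level flock):

    var fs = require('fs');

    function withLock(lockPath, task) {
      // 'wx' means create-exclusive: open fails with EEXIST if the file exists
      fs.open(lockPath, 'wx', function (err, fd) {
        if (err) {
          // another process holds the lock; retry shortly
          return setTimeout(function () { withLock(lockPath, task); }, 50);
        }
        task(function release() {
          fs.close(fd, function () { fs.unlink(lockPath, function () {}); });
        });
      });
    }

    // the non-atomic, multi-step critical section runs in one process at a time
    withLock('/tmp/myapp.lock', function (release) {
      // ...read-modify-write a file or database rows here...
      release();
    });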
> 3. How big is Node's worker thread pool? Is its size configurable, or does it automatically scale up/down with load? When this thread pool is full, say, due to OS limitations or due to configuration parameters (provided that is even possible), does Node block or throw an exception or crash when yet another worker thread needs to be engaged?
Mostly no: it's fixed-size and doesn't scale with load (four threads by default; libuv honors the UV_THREADPOOL_SIZE environment variable if you need to change that). If you overflow it, nothing blocks or crashes: operations are queued. There's a pending-IO queue going in, and a queue of events to process on the way back. It's really pretty invisible, just a shim that fakes asynchronous IO on top of synchronous filesystem primitives. Thanks, Unix!
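You can see the pool's fixed size from the outside. A sketch: crypto.pbkdf2 also runs on the libuv pool, so with the default four threads, eight jobs finish in two waves (the iteration count is arbitrary, chosen to make each job take a noticeable fraction of a second):

    var crypto = require('crypto');
    var start = Date.now();

    for (var i = 0; i < 8; i++) {
      (function (n) {
        // each pbkdf2 call occupies one pool thread until it finishes
        crypto.pbkdf2('secret', 'salt', 500000, 64, 'sha512', function () {
          console.log('job ' + n + ' done after ' + (Date.now() - start) + 'ms');
        });
      })(i);
    }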
> 4. Since I/O happens asynchronously in worker threads, it is possible for a single Node process to quickly/efficiently accept 1000s of incoming requests compared to something like Apache. But surely the outgoing responses for each of those requests will take their own time, won't they? For example, if an isolated request takes a minimum of 3 seconds to get serviced (with no other load on the system), then if concurrently hit with 5000 such requests, won't Node take A LOT of time to service them all? If this 3-second I/O task happens to involve exclusive access to certain resources, then it would take 5000 x 3 sec = 15000 sec, or over 4 hours of wait to see the response for the last request coming out of the app. In such scenarios, would it be correct to say that a single-process Node configuration can handle 1000s of requests per second, especially compared to Apache (akin to the nginx-Apache benchmark), when all that Node may be doing is putting the requests on hold till they get serviced? I'm asking this because as I'm reading up on Node I'm often hearing how Node can address the C10K problem, with no mention of any special application setup.
That depends: can that 3-second IO run concurrently? Your question stipulates exclusive access to certain resources, so no: you'd have to serialize, and with that constraint nothing could be quick about it. It's an artificial restriction you don't run into very often. Node would accept all the requests, but you'd respond to them serially.
Node is remarkably efficient. Without that restriction, compare it to nginx, which is also event-driven and single-threaded, not to Apache, whose threading story is more complicated. If you're doing little CPU work per request, Node can accept a great many connections very quickly; it queues the IO and shovels data to the connections as fast as possible.
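Here's the no-exclusive-resource case from your example, sketched (a timer stands in for 3 seconds of concurrency-friendly IO; the port is arbitrary). Hit this with 5000 concurrent requests and they all complete in roughly 3 seconds total, because the waits overlap; force them through an exclusive resource instead and you're back to your 5000 x 3 arithmetic.

    var http = require('http');

    http.createServer(function (req, res) {
      // 3 seconds of simulated IO per request; all requests wait concurrently
      setTimeout(function () {
        res.end('done\n');
      }, 3000);
    }).listen(8000);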
> 5. What about the context-switching overhead of the workers in the thread pool? If C10K requests hit a Node-based application, won't the workers in the thread pool end up context-switching just as much as Apache's regular thread pool? After all, all that would have happened in Node's main thread would be a quick request-parsing and -routing, with the remainder (or, the bulk) of the processing still happening in some thread somewhere. That is, does it really matter (as far as minimization of thread context-switching is concerned) whether a request/response is handled from start to finish in a single thread (in the manner of Apache), or ends up in a Node-managed worker thread with only minimal work (the request parsing and routing) subtracted from it?
Ignore the thread pool: network IO is non-blocking, so it all runs in a single thread, start to finish. Maybe Node will eventually move a few expensive things -- like parsing HTTP headers -- into threads and handle them as events, but those are small tweaks at C10K scale. You really can model Node as single-threaded and be quite accurate for all but the most esoteric use cases or details.
C10K is an interesting problem for some servers because O(n) scans over the watched connections start to dominate the time. The Unix select() call falls over here, and you need something like epoll. Under the hood, libuv uses epoll or its equivalents (kqueue, IOCP), not select. C10K is not generally about having 10,000 requests come in at once, but about 10,000 concurrently connected clients.
The next barrier is C100K. At that scale the CPU time to handle requests starts to dominate, OS socket limits start hampering you, ephemeral ports for connections start running out (so you need multiple IP addresses), and the data structures at every level start to matter. People have pushed Node that far with success.
Give it a try -- run a tool like apachebench (ab) or wrk against a Node HTTP server, and you can see for yourself how it responds.
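Something like this is enough to get numbers (the port and the tool flags are just examples):

    var http = require('http');

    // trivial server: no per-request work beyond writing the response
    http.createServer(function (req, res) {
      res.end('hello\n');
    }).listen(8000);

and then, from another shell:

    ab -n 10000 -c 1000 http://127.0.0.1:8000/
    wrk -t4 -c1000 -d10s http://127.0.0.1:8000/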
Aria