How to detect and diagnose event-loop blocking?

Pete Kruckenberg

Mar 2, 2012, 8:57:38 AM
to nodejs
I'm looking for suggestions on detecting when the NodeJS event loop is blocked (due to CPU-intensive code), and determining what is causing the block.

I'm running a rather large NodeJS app. Over time, as the app and dataset have grown, things that used to process quickly have become more CPU-intensive and are blocking the event loop longer than "a few milliseconds". Tracking down the problems, and even determining how big the problem is, have proven to be more complicated than expected.

Any recommendations for the following would be much appreciated:

1. Monitor the 'health' of the NodeJS event loop, to know 'how big the problem is' – any extended blocking of the event loop is a Bad Thing, so seems a Good Practice to detect when blocking happens.

2. When the event loop is blocked, detect what is causing the block – I need a 'top' or 'ps' for the NodeJS event loop –  profilers only help with this indirectly (meaning not much at all), because they don't differentiate between blocking and non-blocking activities.

Thanks for your suggestions.

Pete.

Tim Caswell

Mar 2, 2012, 9:36:06 AM
to nod...@googlegroups.com
It will need some adjusting, but the event source hooks I wrote for [trycatch] allow you to insert code at the root of an event stack. You can create a timestamp before the first function is called, and another when it returns. If the difference is larger than some preset, you can flag that stack as misbehaving (throw an exception and you'll see what kind of event source it is). Then maybe you can use the token feature of [trycatch] to inject some data into the stack and guess which stack is causing the issue. Sorry it's not a pre-packaged library to do what you want, but it's a start.

[trycatch]: https://github.com/CrabDude/trycatch/blob/master/lib/hook.js
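
A minimal sketch of the timing idea described above, written as a plain wrapper rather than with the [trycatch] hooks themselves. The names `flagIfSlow` and `onSlow` are hypothetical and not part of trycatch; the sketch assumes a Node.js version that provides `process.hrtime.bigint()`, and it only times the synchronous part of a callback.

```javascript
// Hypothetical helper (not part of trycatch): wraps a callback and reports
// when a single synchronous invocation runs longer than thresholdMs.
function flagIfSlow(fn, thresholdMs, onSlow) {
  return function timedCallback(...args) {
    // Timestamp taken before the wrapped function is called.
    const start = process.hrtime.bigint();
    try {
      return fn.apply(this, args);
    } finally {
      // Timestamp taken when it returns (or throws); convert ns to ms.
      const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
      if (elapsedMs > thresholdMs) {
        onSlow({ name: fn.name || '(anonymous)', elapsedMs: elapsedMs });
      }
    }
  };
}

// Usage: wrap a handler before registering it.
function busyHandler(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* simulate CPU-bound work */ }
  return 'done';
}

const wrapped = flagIfSlow(busyHandler, 5, function (info) {
  console.log('slow callback: ' + info.name + ' took ' + info.elapsedMs.toFixed(1) + ' ms');
});
wrapped(30);
```

Wrapping each handler by hand is more work than hooking the event sources, but the name reported by `onSlow` points directly at the callback that held the loop.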


Paddy Byers

Mar 2, 2012, 10:06:04 AM
to nod...@googlegroups.com
Hi,

> Any recommendations for the following would be much appreciated:
>
> 1. Monitor the 'health' of the NodeJS event loop, to know 'how big the problem is' – any extended blocking of the event loop is a Bad Thing, so seems a Good Practice to detect when blocking happens.

As a very simple health-check, do the following:

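function timeTick() {
  // Record when the timer callback starts running.
  var startTime = Date.now();
  function onTick() {
    // Time elapsed between the timer callback and the following tick.
    var interval = Date.now() - startTime;
    if (interval > 5) // threshold in milliseconds
      console.log('timeTick(): WARNING: interval = ' + interval);
  }
  process.nextTick(onTick);
}
// Sample the loop once per second.
setInterval(timeTick, 1000);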

This simply sets up a periodic timer that checks the interval between one tick and the next. Set the threshold (5 above) to be a value that represents the kind of latency that you consider problematic for your app. It doesn't tell you what's causing the problem but you'll quickly get an idea of what circumstances are triggering a problem.
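
On Node.js 0.10 and later, `process.nextTick` callbacks run before the event loop moves on to other work, so the health check above may report almost no delay there. A possible variant, sketched under that assumption, measures how late a repeating timer fires instead. The name `createLagMonitor` and its parameters are hypothetical, and the clock is injectable only to make the logic easy to check.

```javascript
// Hypothetical helper: reports how late a repeating timer fires.
// A late timer means something held the event loop in the meantime.
function createLagMonitor(intervalMs, thresholdMs, onLag, now) {
  now = now || Date.now;
  var expected = now() + intervalMs;

  // Compare the actual firing time with the expected one.
  function check() {
    var lag = Math.max(0, now() - expected);
    if (lag > thresholdMs) onLag(lag);
    expected = now() + intervalMs; // schedule the next expectation
    return lag;
  }

  return {
    check: check,
    start: function () {
      expected = now() + intervalMs;
      var timer = setInterval(check, intervalMs);
      timer.unref(); // do not keep the process alive just for monitoring
      return timer;
    }
  };
}

// Usage: warn when a one-second timer fires more than 5 ms late.
var monitor = createLagMonitor(1000, 5, function (lag) {
  console.log('event loop lag: WARNING: ' + lag + ' ms');
});
monitor.start();
```

As with the original check, the reported lag is the combined effect of everything that ran ahead of the timer, not of one specific callback.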

Paddy

Mark Hahn

Mar 2, 2012, 3:17:59 PM
to nod...@googlegroups.com
But having a lot of different events, like many users pulling pages at once, would give a bad result in that test. Not that it doesn't sound useful in general.

Paddy Byers

Mar 2, 2012, 6:56:17 PM
to nod...@googlegroups.com
Hi,

> But having a lot of different events, like many users pulling pages at once, would give a bad result in that test.

I'm not claiming it's sophisticated :) and you're right that the indicated latency will be the aggregate of a number of queued events, not just a single event. Nonetheless, it does give an indication of the time between, say, an FD becoming readable and a handler being called with the data, which is the measure of latency that matters to your app.

Thanks - paddy

Tomasz Janczuk

Mar 3, 2012, 2:05:42 AM
to nod...@googlegroups.com
Perhaps something like https://github.com/mape/node-profile would help. 

Bharani K

Aug 25, 2016, 12:11:13 AM
to nodejs
The better way to look into event loop latency now is,


Otherwise, APM tools like AppDynamics offer event loop latency monitoring.

Bharani