accept() -1 EAGAIN (Resource temporarily unavailable)


gjohnson

Jan 9, 2012, 11:12:13 AM
to nod...@googlegroups.com
I have been debugging an odd bug in our app which causes many requests to slow down during certain times of the day. During an strace the only thing that stands out is this:

accept(10, 0x7fff2748f410, [2830782089748545664]) = -1 EAGAIN (Resource temporarily unavailable)
clock_gettime(CLOCK_MONOTONIC, {11840363, 618943181}) = 0
epoll_wait(3, {{EPOLLIN, {u32=10, u64=4294967306}}}, 64, 1114) = 1
clock_gettime(CLOCK_MONOTONIC, {11840363, 975286181}) = 0
accept(10, 0x7fff2748f410, [2830782089748545664]) = -1 EAGAIN (Resource temporarily unavailable)
clock_gettime(CLOCK_MONOTONIC, {11840363, 975588181}) = 0
epoll_wait(3, {{EPOLLIN, {u32=10, u64=4294967306}}}, 64, 757) = 1
clock_gettime(CLOCK_MONOTONIC, {11840363, 975866181}) = 0
accept(10, 0x7fff2748f410, [2830782089748545664]) = -1 EAGAIN (Resource temporarily unavailable)

I am not really familiar with the internals of all this yet, so what exactly is the EAGAIN trying to say?

Ben Noordhuis

Jan 9, 2012, 11:28:49 AM
to nod...@googlegroups.com

That all pending TCP connections have been accepted. Node calls
accept() or accept4() until it returns EAGAIN.
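
To picture the loop described above, here is a toy model in JavaScript. It is only a sketch and not Node's actual implementation, which lives in native code; `FakeListenSocket`, `acceptAll`, and the `'EAGAIN'` string marker are hypothetical names made up for illustration. The idea it shows: once the poller reports the listening socket as readable, accept() is called repeatedly, and EAGAIN simply marks the point where the backlog is empty.

```javascript
// Toy model of a non-blocking listening socket. All names here are
// hypothetical; this is not how Node exposes or implements accept().
function FakeListenSocket(pending) {
  this.backlog = pending.slice(); // connections waiting to be accepted
}

// Mimics a non-blocking accept(): returns a connection, or reports
// EAGAIN once the backlog is empty.
FakeListenSocket.prototype.accept = function () {
  if (this.backlog.length === 0) {
    return { error: 'EAGAIN' };
  }
  return { connection: this.backlog.shift() };
};

// Run when the poller says the socket is readable: keep accepting
// until accept() reports EAGAIN.
function acceptAll(socket) {
  var accepted = [];
  for (;;) {
    var result = socket.accept();
    if (result.error === 'EAGAIN') {
      break; // nothing left to accept; go back to waiting for events
    }
    accepted.push(result.connection);
  }
  return accepted;
}

var socket = new FakeListenSocket(['client-1', 'client-2']);
console.log(acceptAll(socket)); // [ 'client-1', 'client-2' ]
console.log(socket.accept());   // { error: 'EAGAIN' }
```

Read this way, an EAGAIN from accept() in the trace is the normal end of a drain loop rather than an error.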

gjohnson

Jan 9, 2012, 1:10:37 PM
to nod...@googlegroups.com
Ahhh, okay. Thanks! Any ideas then on debugging what appears to be node just halting at certain times? For example, the app is very similar to this:

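var connect = require('connect');
var fs = require('fs');

connect(

  function(req, res, next) {
     // record when the request entered the middleware chain
     req._receivedAt = Date.now();
     next();
  },
  function(req, res, next) {
     // somePath is a placeholder for whatever the app actually stats
     fs.stat(somePath, function(err, stat){
       // .... random stuff
       next();
     });
  },
  function(req, res, next) {
    // main application stuff
  }

).listen(...);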

And during certain times it can take anywhere from 200ms to 12 seconds just to get from the first middleware to the last. The only thing that does any IO between that first and last middleware is the fs.stat, which is what led me to run an strace to see what exactly was going on with the stat call. It is super strange and incredibly hard to reproduce.
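
One way to narrow down a delay like the one described above is to time the suspect call directly instead of the whole middleware chain. The following is a minimal sketch, not code from the app: `timed` and its `report` callback are hypothetical helpers, and it assumes the wrapped function takes a Node-style callback as its last argument.

```javascript
var fs = require('fs');

// Hypothetical helper: wraps a callback-style function and reports how
// long it took from the call until its callback fired.
function timed(label, fn, report) {
  return function () {
    var args = Array.prototype.slice.call(arguments);
    var callback = args.pop(); // assumes the last argument is the callback
    var start = Date.now();
    args.push(function () {
      report(label, Date.now() - start);
      callback.apply(this, arguments);
    });
    return fn.apply(this, args);
  };
}

// Usage: log every fs.stat that takes longer than 100 ms
// (the threshold is an arbitrary choice for this example).
var slowStat = timed('fs.stat', fs.stat, function (label, ms) {
  if (ms > 100) {
    console.log(label + ' took ' + ms + 'ms');
  }
});

slowStat('.', function (err, stat) {
  // .... same handling as with a plain fs.stat call
});
```

If the logged times spike at the same moments the requests slow down, the stall is in the stat call (and so in the disk or filesystem) rather than in the middleware code around it.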

gjohnson

Jan 10, 2012, 6:09:04 PM
to nod...@googlegroups.com
This was solved! It ended up being a rogue cron job starving the server of all IO resources, so any request coming into the node app got held up by the fs.stat in the middle.