1) Look at the logs I already keep to see if there are any patterns.
Add in more logging and hope it helps next time this problem happens
(and doesn't make the problem worse).
2) Try to run ApacheBench (ab) against my server to see if that can trigger it.
3) Try changing out various modules that I know have caused problems
for others in the past (like gzip)
4) Just reason about the code

I'm not too worried about trying things that slow down my server some
- it's not running anywhere near capacity currently. Are there better
ways to debug something like this?
Also, as a side note - not sure if there are any monit experts on this
list, but I'm trying to get monit to just restart node when this
happens. I've used monit successfully for a lot of other processes on
my server, but it doesn't seem to be able to detect the CPU spike and
restart node. I thought this might be because the system as a whole is
going down due to the CPU utilization, but that isn't the case: other
services on my machine are working fine, since I can still access
the site and the Django server. My monit entry for node is
http://gist.github.com/586242. I tried using both ">" and "is greater
than". I have another service on my machine (Xvfb) that leaks quite a
bit of memory, and monit handles restarting that service fine.
Also, I'm not using SSL, which was the subject of the only other report
I could find of an issue like this (I'm fairly sure the broken code is
not in node core, but can't be certain).
Thanks in advance for any suggestions!
-- Saikat
Force a core dump and get a stack trace (with, like, kill -SIGTRAP pid;
core dumps have to be enabled first, e.g. with ulimit -c unlimited).
That'll probably tell you what's spinning.
I also managed to get monit to restart the process in high CPU cases
by checking for connection failures to the port instead of monitoring
the CPU usage. For some reason, I never did get the CPU check working.
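
For anyone who wants the same kind of check outside monit, here is a
rough sketch in Python of what a port-based health check could look
like. This is not my actual monit entry; the function name, the default
path and the timeout are all made up for illustration. It sends a real
HTTP request rather than just opening a socket, because the kernel can
still complete the TCP handshake for a process that is stuck spinning,
so a bare connect may succeed even when node never answers.

```python
import http.client


def server_responds(host, port, path="/", timeout=5.0):
    """Return True if an HTTP GET to host:port gets any response in time.

    Hypothetical helper: the name and defaults are assumptions, not
    something monit or node provides. Any HTTP status counts as "alive";
    only a refused connection, a timeout, or a broken response counts
    as a failure.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        # Reading the body forces us to wait for the server to answer.
        conn.getresponse().read()
        return True
    except (OSError, http.client.HTTPException):
        # Covers refused connections, timeouts, and dropped or garbled replies.
        return False
    finally:
        conn.close()
```

A cron job or supervisor script could call something like
server_responds("127.0.0.1", 8000) and restart node when it returns
False (the port here is just a placeholder).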
Thanks for the suggestions!