Hi all, welcome to my first post!
A quick scan turned up some variations of what I'm experiencing, but
I found no messages describing exactly the same problem.
I'm running god 0.7.12 on CentOS 4.7 to monitor an app we just deployed
that is having some memory issues (we will fix them later, but need the
app up now). My god configuration can be found here:
http://pastie.org/355756
I'm forcing my app to hog memory so that god will restart it, and I'm
seeing the following:
I [2009-01-08 11:11:58] INFO: ridingresource-mongrel-12002 [trigger] memory out of bounds [39728kb, 85004kb, *127152kb, *127152kb, *127152kb] (MemoryUsage)
I [2009-01-08 11:11:58] INFO: ridingresource-mongrel-12002 move 'up' to 'restart'
I [2009-01-08 11:11:58] INFO: ridingresource-mongrel-12002 restart: mongrel_rails restart -P /home/riding/railsapps/equine/log/mongrel.pid
I [2009-01-08 11:12:10] INFO: ridingresource-mongrel-12002 moved 'up' to 'up'
I [2009-01-08 11:12:11] INFO: ridingresource-mongrel-12002 [trigger] process is not running (ProcessRunning)
I [2009-01-08 11:12:11] INFO: ridingresource-mongrel-12002 move 'up' to 'start'
I [2009-01-08 11:12:11] INFO: ridingresource-mongrel-12002 before_start: no pid file to delete (CleanPidFile)
I [2009-01-08 11:12:11] INFO: ridingresource-mongrel-12002 start: mongrel_rails start -c /home/riding/railsapps/equine -p 12002 -P /home/riding/railsapps/equine/log/mongrel.pid -e production -d
I [2009-01-08 11:12:23] INFO: ridingresource-mongrel-12002 moved 'up' to 'up'
I [2009-01-08 11:12:24] INFO: ridingresource-mongrel-12002 [trigger] process is not running (ProcessRunning)
I [2009-01-08 11:12:24] INFO: ridingresource-mongrel-12002 move 'up' to 'start'
I [2009-01-08 11:12:24] INFO: ridingresource-mongrel-12002 before_start: no pid file to delete (CleanPidFile)
I [2009-01-08 11:12:24] INFO: ridingresource-mongrel-12002 start: mongrel_rails start -c /home/riding/railsapps/equine -p 12002 -P /home/riding/railsapps/equine/log/mongrel.pid -e production -d
You can see that once god sees the memory usage is out of bounds, it
does a restart. This restart works fine -- watching the mongrel
process in htop shows that it has been restarted and memory usage
falls.
For some reason, on the next ProcessRunning check, god thinks that
the process isn't running, even though it is, and even though the pid
is there. It then keeps starting the mongrel until it decides
that it is flapping.
I don't see anything egregiously wrong with my configuration file, so
I'm not sure what's going on.
Any suggestions?