|
David Chute, it looks like you are using a 15-second timeout, which is shorter than the default of 60 seconds. You will want to make sure the timeout is longer than the time needed to process the largest catalog you expect, plus the time it takes the agent to send the report and the master to process it, potentially through multiple report processors. If the timeout is too short, the reboot can leave the pid file behind.
That said, the agent should be more robust in the face of failures. The code in lib/puppet/util/pidlock.rb handles locking and reclaiming of the pid file. Note that a few pid file bugs have already been filed in this area, so you might want to check those first.
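
To make the "reclaiming" idea concrete, here is a minimal sketch of how a stale pid file can be detected and taken over. It is not the code from lib/puppet/util/pidlock.rb; the class name SimplePidLock and its methods are hypothetical, and the sketch assumes a Unix-like system where sending signal 0 checks whether a process exists.

```ruby
# Hypothetical illustration only; not Puppet's actual Pidlock implementation.
class SimplePidLock
  def initialize(path)
    @path = path
  end

  # Pid recorded in the file, or nil if the file is missing or unusable.
  def lock_pid
    pid = Integer(File.read(@path).strip)
    pid.positive? ? pid : nil
  rescue Errno::ENOENT, ArgumentError
    nil
  end

  # Signal 0 performs the existence check without delivering a signal.
  def process_alive?(pid)
    Process.kill(0, pid)
    true
  rescue Errno::ESRCH
    false # no such process: the pid file is stale
  rescue Errno::EPERM
    true  # process exists but belongs to another user
  end

  # Take the lock, reclaiming the file if its owner is no longer running.
  def lock
    if File.exist?(@path)
      pid = lock_pid
      return true if pid == Process.pid
      return false if pid && process_alive?(pid)
      File.delete(@path) # stale or unreadable: reclaim it
    end
    # CREAT | EXCL fails if another process created the file in the meantime.
    File.open(@path, File::WRONLY | File::CREAT | File::EXCL) do |f|
      f.write(Process.pid.to_s)
    end
    true
  rescue Errno::EEXIST, Errno::ENOENT
    false # lost a race with another process
  end

  # Release the lock only if this process owns it.
  def unlock
    return false unless lock_pid == Process.pid
    File.delete(@path)
    true
  end
end
```

With something like this, a pid file left behind by an interrupted run would be reclaimed on the next start, as long as the recorded pid no longer belongs to a live process. Note that pid reuse after a reboot can still make a stale file look live, which is one reason this area is prone to bugs.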
|