Kill agent after timeout


Alex

Jan 12, 2014, 11:33:10 AM
to puppet...@googlegroups.com
Hi there,

On some machines, I see the following:

* There are two puppet processes: the agent and a child process, "Puppet: applying configuration".
* There is a lock file containing the PID of the child process.
* The lock file is two days old.
* There are no log entries in the agent's log file.

It seems the agent hangs for an unknown reason.

Is there an option in the agent to kill the child process after a timeout?

If not, what would be an appropriate way to automatically kill an agent that hangs? Should I use a cron job instead and implement the timeout myself?

Thanks,
Alex

Toni Schmidbauer

Jan 13, 2014, 2:06:17 AM
to puppet...@googlegroups.com
At Sun, 12 Jan 2014 08:33:10 -0800 (PST),
Alex wrote:
> * There are two puppet processes. The agent and a child process
> "Puppet: applying configuration"
> * A lck file with the PID of the child process.
> * The lock file is two days old
> * There are no log entries in the agent's log file.

we've got exactly the same issue here. any chance you are running the
agent with ruby 1.8?

my assumption is that if the master stops answering requests in a
certain way (in our case we maxed out the httpd processes), there's a
bug in ruby 1.8 where sockets do not get closed.

https://projects.puppetlabs.com/issues/2089 describes the issue.

currently we do not have a solution for this (other than upgrading
ruby). as a workaround you could add a cron job that stops the agent,
kills any remaining puppet agent processes and starts the agent again.
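
a minimal sketch of that stop/kill/start sequence, in python. the
service name "puppet", the use of the `service` command and the
`pkill -f` pattern "puppet agent" are assumptions, not something taken
from a real setup; adjust them to your init system and to the process
titles you actually see.

```python
import subprocess

# Assumptions (not from the thread): the agent runs as a SysV-style service
# named "puppet", and leftover processes match the pattern "puppet agent".
STOP_CMD = ["service", "puppet", "stop"]
KILL_CMD = ["pkill", "-9", "-f", "puppet agent"]
START_CMD = ["service", "puppet", "start"]


def restart_agent(run=subprocess.call):
    """Stop the agent, kill leftover agent processes, then start it again.

    `run` is injectable so the sequence can be exercised without touching
    the real service. Returns a list of (command, exit_code) pairs.
    """
    results = []
    for cmd in (STOP_CMD, KILL_CMD, START_CMD):
        # Exit codes are recorded but not treated as fatal: pkill exits
        # with 1 when no process matched, which is the healthy case.
        results.append((cmd, run(cmd)))
    return results
```

calling `restart_agent()` from a script scheduled in cron would give
the workaround described above. note that `pkill -f` matches against
the full command line, so the pattern should not also match the
watchdog script itself.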

maybe puppetlabs should support rhel software collections because
there's a newer version of ruby included (1.9.3 if i recall
correctly). but you have to change the start/stop script for that.

regards toni
--
Don't forget, there is no security | toni at stderr dot at
-- Wulfgar | Toni Schmidbauer

Alex

Jan 13, 2014, 8:55:41 AM
to puppet...@googlegroups.com
I run Ruby 1.8 indeed.

I ended up scheduling a script that checks whether the agent hangs and restarts it.
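
For anyone finding this later, a check of that kind could look roughly like the sketch below. It is not my exact script: the lock file path and the two-hour threshold are placeholder assumptions, so look up the real lock file location on your own machines. It relies only on the symptoms described earlier, namely a lock file that holds the child's PID and that stays around far longer than a normal run.

```python
import os
import time

# Hypothetical values: the thread gives neither the lock file path nor a
# timeout. Check your agent's state directory for the real location.
LOCK_FILE = "/var/lib/puppet/state/agent_catalog_run.lock"
MAX_AGE_SECONDS = 2 * 60 * 60


def lock_is_stale(path=LOCK_FILE, max_age=MAX_AGE_SECONDS, now=None):
    """Return True if the lock file exists and is older than max_age seconds."""
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        return False  # no lock file, so no run is in progress
    if now is None:
        now = time.time()
    return (now - mtime) > max_age


def locked_pid(path=LOCK_FILE):
    """Return the PID stored in the lock file, or None if it cannot be read."""
    try:
        with open(path) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None
```

A scheduled job would call `lock_is_stale()` and, when it returns True, kill the process returned by `locked_pid()` and restart the agent.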
