getting CFEngine to clean up a cf-agent process pile-up

Aleksey Tsalolikhin

Jun 30, 2015, 4:10:55 PM
to help-c...@googlegroups.com
One failure mode of CFEngine is a pile-up of cf-agent processes that bogs down the system and gets worse the longer it persists.

Here is a policy to detect and kill cf-agent processes more than 10 minutes old.

Feedback welcome.

Best,
-at



bundle agent detect_cf_agent_pileup
{
  processes:

    "cf-agent"
      comment => "Detect and kill cf-agent processes more than 10 minutes old.
        CFEngine won't exit until all external commands complete,
        unless commands exec_timeout is set and succeeds. This
        promise ensures any cf-agent processes more than 10 minutes
        old are terminated and killed, so we never end up with hundreds
        or thousands of cf-agent processes on a system",
      process_count  => pileup_check,
      process_select => proc_finder,
      classes        => if_repaired("agents_purged"),
      signals        => { "term", "kill" };

  reports:

    process_pile_up::
      "cf-agent process pile-up detected!!";

    agents_purged::
      "cf-agents purged";
}

body process_select proc_finder
{
  command        => "^.*cf-agent.*";                      # (Anchored) regular expression matching the command/cmd field of a process
  process_owner  => { "root", };                          # List of regexes matching the user of a process
  stime_range    => irange(ago(0,0,0,0,10,0), now);       # select processes started within 10 minutes
  process_result => "(!stime)&command&process_owner";     # reverse stime to get only processes started over 10 min ago
}

body process_count pileup_check
{
  match_range         => "0,2";                  # Integer range for acceptable number of matches for this process
  out_of_range_define => { "process_pile_up" };  # List of classes to define if the matches are out of range
}
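
For anyone who wants to try this standalone: the bundle uses if_repaired from the standard library, so a test file has to pull that in. A minimal sketch, assuming CFEngine 3.6 or later (where $(sys.libdir) should point at the installed lib directory) and that this control body sits in the same file as the policy above. The file name pileup.cf used below is only a placeholder.

```
body common control
{
  # stdlib.cf provides the if_repaired classes body used by the processes promise
  inputs         => { "$(sys.libdir)/stdlib.cf" };

  # run only the pile-up bundle for this test
  bundlesequence => { "detect_cf_agent_pileup" };
}
```

It can then be run once by hand with "cf-agent -KI -f ./pileup.cf" (-K ignores promise locks, -I reports what was changed). On older versions, or if the standard library lives elsewhere, adjust the inputs path accordingly.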




--
Need CFEngine training?  Email trai...@verticalsysadmin.com