CJK
Sep 2, 2008, 11:40:07 AM
to 福州linux用户组 (Fuzhou Linux User Group)
Reposted from: Linux.com
Taming your daemons with PSMon
By Ben Martin on September 02, 2008 (9:00:00 AM)
The PSMon utility lets you specify which processes should be running,
how much CPU or RAM each is allowed to use while it runs, and how many
instances may run at once. PSMon then ensures that these processes are
running, kills any process that starts to use too many resources, and
can restart a process that has crashed.
PSMon is not in the repositories for Fedora 9, Ubuntu Hardy, or
openSUSE 11. You can install PSMon using CPAN as described in the
PSMon manual. There is also an install script stored in the utility's
support subdirectory that will take care of installation tasks for
you.
PSMon needs a few Perl modules to function. The support/install.sh
script will install those Perl modules for you, or you can get them
from your distribution's package repository first. The advantage of
installing from the package repository is that you can keep the
modules up-to-date through your normal Linux distribution updates. The
commands shown below first install these extra Perl modules, then run
the install script for the PSMon program.
# yum install perl-CPAN perl-YAML
# yum install perl-Config-General perl-Proc-ProcessTable perl-Unix-Syslog
# tar xjf psmon-1.29.tar.bz2
# cd psmon*
# ./support/install.sh
Checking for Config::General ... found
Checking for Proc::ProcessTable ... found
Checking for Unix::Syslog ... found
Checking for Getopt::Long ... found
Installing psmon ... done
Installing psmon-config ... done
Installing etc/psmon.conf ... done
Generating HTML documentation support/psmon.html ... done
Installing manual psmon.1 ... done
The configuration file generated by the script has key-value pairs
either at the top level of the file or nested inside Process
groupings. The syntax is designed to be similar to that of the Apache
configuration file. There is a special Process * group that lets you
apply settings to all processes. However, this might not work as you
expect -- it could end up killing many processes that you did not
intend to get rid of, so you should avoid using the Process * group.
Near the top of the default /etc/psmon.conf file you will see
Disabled True, which makes PSMon do nothing until you have changed
this directive in the configuration file.
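As a rough sketch of that first step, the shell function below flips
the directive in a given configuration file. It assumes the line reads
exactly Disabled True, as described above, and that False is the value
that switches PSMon on; enable_psmon is a hypothetical helper name and
not part of PSMon.

```shell
# enable_psmon CONF: rewrite "Disabled True" to "Disabled False" in CONF.
# Hypothetical helper; the True/False spelling is taken from the article.
enable_psmon() {
    conf=$1
    tmp=$(mktemp) || return 1
    # Only touch a line whose first word is the Disabled directive, so a
    # commented-out copy of the directive is left alone.
    sed 's/^\([[:space:]]*Disabled[[:space:]][[:space:]]*\)True/\1False/' \
        "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"    # overwrite in place, keeping owner and mode
    rc=$?
    rm -f "$tmp"
    return $rc
}
```

Run as root, this would be called as enable_psmon /etc/psmon.conf. It
is worth reviewing the rest of the file before enabling PSMon.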
PSMon supports a small collection of directives that are designed to
be used at the top level, outside of any Process group. One of them
sets the frequency (in seconds, default 60) with which PSMon scans
the process table. Changing this to 5 seconds means that crashed
processes are respawned and badly behaving processes are killed more
quickly, but PSMon will consume more CPU time on the machine. The
AdminEmail directive (default root@localhost) sets the email address
that PSMon notifies when processes are spawned or killed, or when a
failure occurs while it performs those operations.
There are also two directives, NeverKillPID and NeverKillProcessName,
that protect processes from ever being killed. They take
space-delimited lists of process IDs (PIDs) and process names,
respectively. By default they cover PID 1 and a list of kernel threads
that you really don't want to kill by mistake.
The example below shows a Process group, which is started and finished
with XML-like tags. After the Process declaration you put the name of
the process that you are describing. You cannot include path
information in the process name, and should omit any command-line
options that the command might have taken. Being able to specify the
full path (or a regular expression to match against) of the process
you wish to use PSMon with would be a welcome enhancement. For the SSH
daemon, simply using sshd is not likely to generate any false hits
with other running processes. In this example the sshd Process group
restarts the SSH daemon should it exit or crash for any reason.
<Process sshd>
    SpawnCmd /sbin/service sshd start
</Process>
Other directives that you can use in a Process group include
Instances, to control the maximum number of processes that can be
running, and KillCmd, which lets you specify a custom way to close the
process if it is misbehaving. If KillCmd is not specified, a SIGKILL
will be sent to close the process. You might like to consider using a
KillCmd to send a SIGTERM to the process, wait a few seconds, and then
send a stronger SIGKILL if the process is still around. Another good
option for the KillCmd is to use the /etc/init.d scripts to stop a
service.
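The SIGTERM-then-SIGKILL idea could be sketched as a small shell
function like the one below. The name graceful_stop and the default
five-second grace period are assumptions made for illustration. How a
KillCmd would learn the PID of the offending process is not covered
here, so a wrapper script would have to look it up itself, for example
from the daemon's PID file.

```shell
# graceful_stop PID [GRACE_SECONDS]
# Hypothetical helper: ask a process to exit, then force it if needed.
graceful_stop() {
    pid=$1
    grace=${2:-5}

    # Ask the process to exit cleanly; if it is already gone, we are done.
    kill -TERM "$pid" 2>/dev/null || return 0

    # Poll once per second until it exits or the grace period runs out.
    while [ "$grace" -gt 0 ]; do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        grace=$((grace - 1))
    done

    # Still present: send the signal that cannot be caught or ignored.
    kill -KILL "$pid" 2>/dev/null
    return 0
}
```

Giving the process a chance to handle SIGTERM lets it flush buffers and
remove lock files, which a bare SIGKILL does not allow.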
You can set resource limits for a process using the PctCPU, PctMEM,
and TTL directives, which cap its percentage of CPU and RAM usage and
how long it may live in total. The PIDFile directive gives PSMon the
path of a file containing the process ID of a daemon that you do not
want PSMon to kill. The PIDFile directive is only
useful if you are using the PctCPU, PctMEM, or TTL directives too. As
an example of why you might like to use the PIDFile directive,
consider a daemon that spawns many children to perform network
communications. You might like to make sure that the children do not
consume more than 70% of the system's RAM. Using the PIDFile you can
tell PSMon not to kill the main control process, but only the child
worker processes if they start to consume too much memory.
The TTL directive is handy to ensure that processes that are meant to
complete within a known amount of time have done so. For example, you
can cap the updatedb command, or the use of unison or find, to stop
them from running unchecked from a user's cron job. The value is in
seconds: the example below allows 86400 seconds (24 hours), and 3600
would give a one-hour limit:
<Process find>
    ttl 86400
    instances 30
</Process>
You can control how verbose PSMon is using the NoEmail, NoEmailOnKill,
and NoEmailOnSpawn directives. These all default to False, but setting
them to True will result in no emails at all, none on process killing,
or none on process spawning, respectively.
You can also set the LogLevel and AdminEmail directives separately for
each Process section, so you can send email to an SMS gateway when a
very important process such as Apache has crashed. Changing the
LogLevel also affects how failed respawn attempts are reported. PSMon
reports a failure to stop or start a process using the LogLevel plus
one, so setting the Apache group to have a high LogLevel will also
cause PSMon to report respawn errors to syslog with a high priority.
Sending the USR1 signal to PSMon when it is running as a daemon will
make it rescan the running processes immediately. You can start PSMon
as a daemon using the --daemon command-line option.
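A minimal sketch of triggering such a rescan from the shell follows. It
assumes you have the daemon's PID recorded in a file; the function name
request_rescan and any PID file path you pass to it are hypothetical,
since where (or whether) PSMon writes its PID is not covered here.

```shell
# request_rescan PIDFILE: send SIGUSR1 to the PID stored in PIDFILE.
# Hypothetical helper; the PID file location is an assumption.
request_rescan() {
    pidfile=$1
    [ -r "$pidfile" ] || { echo "cannot read $pidfile" >&2; return 1; }
    pid=$(cat "$pidfile")
    # kill -0 only checks that the process exists; it sends no signal.
    kill -0 "$pid" 2>/dev/null || { echo "no process $pid" >&2; return 1; }
    kill -USR1 "$pid"
}
```

If no PID file is available, the PID reported by ps can be passed to
kill -USR1 directly.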
Final words
I am not too sold on the idea of killing processes if they are using
too much of a system's resources, since a process may legitimately be
using 95% of the CPU for a few minutes and you wouldn't want it to be
killed. Enforcing a maximum run time, if you select a time well beyond
what most legitimate uses of the command would require, can help to
protect the system from badly behaving cron jobs when you are not
around to notice them. Being able to respawn processes automatically
if they have exited is certainly useful -- although sshd and Apache do
not tend to crash much, you can bet the one time they do is when you
board an airplane for a nine-hour flight. Its multiple capabilities
make PSMon a worthy utility for your system administration toolkit.
Ben Martin has been working on filesystems for more than 10 years. He
completed his Ph.D. and now offers consulting services focused on
libferris, filesystems, and search solutions.