Thanks in advance.
--geeb
-------------------------------------------------------------
Mark A. Gebert Email: ge...@merit.edu
Senior Research Programmer Voice:+1 734 936 2655
Merit Network, Inc Fax: +1 734 647 3185
4251 Plymouth Rd, Suite C, Ann Arbor, MI 48105-2785
-------------------------------------------------------------
I'd never thought I'd say this... But can I go work now?
% cat std.disclaimers
--geeb
--
- Rick
-------
Rick Carter, System Administrator, Physics Dept., University of Michigan
Rick....@umich.edu Voice: (734) 764-3348 FAX: (734) 763-9694
For Physics computer support, please use Physi...@umich.edu
"As mommy used to say, 'be nice to the man with the death-beam.'" - Kriegman
--geeb
#!/bin/sh
#
# daemoncheck - control process for running various system and daemon checks
#
# Written By: Mark A Gebert
# Date: July, 1999
#
# Thanks to Mark Giuffrida (ma...@umich.edu) for the basic script
#
umask 077
BINDIR="/usr/private/dc"
LOGFILE="/var/adm/daemonrestart.log"
HOST=`hostname | cut -d. -f1`
DOMAIN=merit.edu
MONITOREDEMAIL="n...@noc.ns.itd.umich.edu"
echo $$ > /etc/daemoncheck.pid
/usr/private/dc/bootcheck
cd ${BINDIR}
while true
do
    #
    # Setup Daemoncheck
    #
    DCSLEEPTIME=900
    . /etc/hostconfig
    export STARTX SENDMAIL
    sleep ${DCSLEEPTIME}
    mv ${LOGFILE} ${LOGFILE}.old
    echo "daemonmonitor starting loop at `date`" > ${LOGFILE} 2>&1
    startsize=`/bin/ls -l ${LOGFILE} | awk '{print $5}'`
    #
    # Perform system checks
    #
    SYSCHECKS="*.sys"
    for SYSSCRIPT in $SYSCHECKS
    do
        if [ -s ${BINDIR}/${SYSSCRIPT} ]; then
            ( ${BINDIR}/${SYSSCRIPT} >> ${LOGFILE} 2>&1 )
        fi
    done
    #
    # Perform daemon checks
    #
    DAEMONCHECKS="*.daemon"
    for DAEMONSCRIPT in $DAEMONCHECKS
    do
        if [ -s ${BINDIR}/${DAEMONSCRIPT} ]; then
            ( ${BINDIR}/${DAEMONSCRIPT} >> ${LOGFILE} 2>&1 )
        fi
    done
    endsize=`/bin/ls -l ${LOGFILE} | awk '{print $5}'`
    #
    # Do we need to notify people?
    #
    if [ "$startsize" != "$endsize" ]; then
        to=root+$HOST@$DOMAIN
        if [ "${MONITORED:=-NO-}" = "-YES-" ]; then
            to="$to, $MONITOREDEMAIL"
        fi
        /usr/lib/sendmail -t << EOF
To: $to
Subject: Daemoncheck alerts on `/bin/hostname`
`cat ${LOGFILE}`
EOF
    fi
    echo "daemonmonitor ending loop at `date`" >> ${LOGFILE} 2>&1
done
At 15:21 -0500 17 November 2000, Jason Presnell <presnell> wrote:
> On Fri, 17 Nov 2000, Mark A Gebert wrote:
>
> > The question is does anyone have a clue why this happens?
> >
>
> It would help if we could all see the script in question that is not
> working (remove anything that is private if you wish).
>
> -j
--
It's an irritating problem that seems to happen no matter
how the process is started. I'm assuming that it's the
dc (DaemonCheck) process that you're talking about...
On Fri, Nov 17, 2000 at 04:55:22PM -0500, Mark A Gebert wrote:
> No kerberos in this one.... Zip zero zilch.
>
> --geeb
>
> At 15:36 -0500 17 November 2000, Rick Carter <Rick.Carter> wrote:
>
> > Flying blind here, but whenever I hear "long-running" and "doesn't work"
> > in the same sentence, I start wondering if there's a kerberos ticket
> > expiration involved somewhere.
> >
> > - Rick
> >
> >
> > On Fri, 17 Nov 2000, Mark A Gebert wrote:
> >
> > > The question is does anyone have a clue why this happens?
> > >
> > > --geeb
> > >
> > > At 14:45 -0500 17 November 2000, Mark A Gebert <geeb> wrote:
> > >
> > > > We have a long-running shell script that is started out of the inittab and does
> > > > some monitoring on some of our systems. If it detects a problem, it sends email
> > > > out. After about a week the program still detects the problems but does not send out
> > > > email. We've tried several things but no dice. This is under Solaris 2.6.
> > > >
> > > > Thanks in advance.
> > > >
> > > > --geeb
> > >
> > > --
> > >
> >
> >
>
> --
--
--jlockard - "Welcome to the Psychic Admin hotline.
Don't call us, we'll call you." - KSon
Personally, I run something with a 15 minute delay out of cron
instead of an infinite loop with a sleep....
How do you know it keeps monitoring? Is it because
/var/adm/daemonrestart.log.old is being updated like it should every
15 minutes + processing time?
Try logging *.debug in syslogd.conf to somewhere, and check for any
entries for sendmail when it should be sending out the emails. Maybe
it will give a clue if sendmail is running but refusing to send the e-
mail...
Are any portions that the script runs, or logs, or checks, etc... on
NFS mounts, or other remote file systems?
---------------------------------------------------------------------------
John Lauro email: jla...@flint.umich.edu
University of Michigan - Flint jla...@umich.edu
Information Technology Services
303 E. Kearsley St. phone: (810) 762-3123
Flint, MI 48502 fax: (810) 766-6805
The reason for not running it out of cron is that if cron dies,
you also lose the script. This way, if the script dies, init
will restart it from the inittab.
> How do you know it keeps monitoring? Is it because
> /var/adm/daemonrestart.log.old is being updated like it should every
> 15 minutes + processing time?
You can actually see that the logs (update info) are being written,
just not mailed out.
> Try logging *.debug in syslogd.conf to somewhere, and check for any
> entries for sendmail when it should be sending out the emails. Maybe
> it will give a clue if sendmail is running but refusing to send the e-
> mail...
>
> Are any portions that the script runs, or logs, or checks, etc... on
> NFS mounts, or other remote file systems?
Nope, in this case, everything is completely local on ufs or ffs
partitions.
--
--jlockard - "Why do you have to be so small?" - Lane Myer
-------------------------------------------------------------------
John M. Lockard | U of Michigan - School of Information
Sys Admin III | 400 West Hall - 550 E. University. Ave.
jloc...@umich.edu | Ann Arbor, MI 48109-1092
www.umich.edu/~jlockard | 734-615-8776 | 734-764-2475 FAX
-------------------------------------------------------------------
add some error checking to the script and see the exit status from the
sendmail process. That might give a clue as to what is going wrong.
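A minimal sketch of that exit-status check, assuming the alert is sent through a small wrapper function instead of the bare here-document. The names MAILER, ALERTLOG and send_alert are made up for illustration and are not part of the posted script; only /usr/lib/sendmail -t comes from it.

```sh
#!/bin/sh
# Sketch only: MAILER, ALERTLOG and send_alert are hypothetical names.
MAILER=${MAILER:-"/usr/lib/sendmail -t"}
ALERTLOG=${ALERTLOG:-/tmp/daemoncheck.mail.log}

# send_alert RECIPIENTS SUBJECT BODYFILE
# Pipes a message into the mailer and records a non-zero exit status.
send_alert() {
    to=$1
    subject=$2
    bodyfile=$3
    {
        echo "To: $to"
        echo "Subject: $subject"
        echo ""
        cat "$bodyfile"
    } | $MAILER
    # $? after a pipeline is the status of its last command, the mailer.
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "`date`: mailer exited with status $status" >> "$ALERTLOG"
    fi
    return $status
}
```

In the loop this would be called as something like send_alert "$to" "Daemoncheck alerts on $HOST" "$LOGFILE". If the shell cannot start the mailer at all, the status is still non-zero, so that case lands in the same log.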
You might be hitting a limit problem although it's not clear what limit
you would be hitting.
there might be a bug in solaris sh causing this...try running it under
bash or ksh or zsh or something instead. Also double check that you
are up to date on any /bin/sh patches.
along the same lines, reimplement the thing in perl.
Once it stops sending email, hit it with a truss -p and see what happens
when it attempts to fork sendmail.
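A rough sketch of the truss step, assuming the pid file that the posted script writes to /etc/daemoncheck.pid is still current. TRUSS, PIDFILE, TRACEFILE and trace_daemoncheck are illustrative names, and the trace file location is an arbitrary choice.

```sh
#!/bin/sh
# Sketch only: variable and function names are hypothetical.
TRUSS=${TRUSS:-truss}
PIDFILE=${PIDFILE:-/etc/daemoncheck.pid}
TRACEFILE=${TRACEFILE:-/tmp/daemoncheck.truss}

# Attach truss to the running daemoncheck process.
trace_daemoncheck() {
    if [ ! -s "$PIDFILE" ]; then
        echo "no pid file at $PIDFILE" >&2
        return 1
    fi
    pid=`cat "$PIDFILE"`
    # -f follows children created by fork, -o writes the trace to a file,
    # -p attaches to an already running process.
    $TRUSS -f -o "$TRACEFILE" -p "$pid"
}
```

Run trace_daemoncheck as root once the mail has stopped, wait for the next 15 minute pass, then interrupt it and look in the trace file for the fork and exec calls around sendmail.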
kludge it so that it keeps a counter and exits after a day or so.
Init will restart it and you'll be happy.
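The counter kludge might look roughly like this. It assumes the inittab entry is set to respawn the script, as described elsewhere in the thread. MAXLOOPS, run_checks and main_loop are made-up names; with the script's 900 second sleep, 96 passes come to about 24 hours plus processing time.

```sh
#!/bin/sh
# Sketch only: MAXLOOPS, run_checks and main_loop are hypothetical names.
DCSLEEPTIME=${DCSLEEPTIME:-900}
MAXLOOPS=${MAXLOOPS:-96}    # 96 passes x 900 seconds = 24 hours

# Placeholder for the *.sys and *.daemon passes and the mail step.
run_checks() {
    :
}

# Run a bounded number of passes, then return so the script can exit.
main_loop() {
    loops=0
    while [ "$loops" -lt "$MAXLOOPS" ]
    do
        sleep "$DCSLEEPTIME"
        run_checks
        loops=`expr $loops + 1`
    done
    echo "daemoncheck exiting after $loops passes so init can respawn it"
}
```

The script would end with main_loop followed by exit 0. This hides the underlying problem rather than explaining it, so it is worth pairing with one of the diagnostic steps.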
On Fri, 17 Nov 2000, Mark A Gebert wrote:
> We have a long-running shell script that is started out of the inittab and does
> some monitoring on some of our systems. If it detects a problem, it sends email
> out. After about a week the program still detects the problems but does not send out
> email. We've tried several things but no dice. This is under Solaris 2.6.
>
> Thanks in advance.
>
> --geeb
they say i shot a man named gray
and took his wife to italy              dan pritts
she inherited a million bucks           734/996-0169
and when she died it came to me         da...@umich.edu
i can't help it if i'm lucky...
One could use (shudder) process accounting to look at whether
sendmail is getting started.
Just playing the tyro here. :).
neil
--
Neil Tweedy
Mathematics Computer Group
LS&A Information Technology
twe...@umich.edu
----------------------------------------------------------------------