I could put the code in each program, but why have redundant code at
the beginning of each script to do the same thing? Hence the driver.
It operates as in this example:
# driver myprog
The driver has a reference file that decides how "myprog" should run,
how much time must elapse before "myprog" can be run again, whether to
create a log, where to save the log, how many copies it should keep,
and so on.
But I am having a little problem with my "tee" command. I want both
STDOUT and STDERR to be displayed on the screen and written to my
output log. Under normal circumstances, the following logic works
fine:
# START OF CODE (simplified for readability) #
######################################
if ${CMDLINE}
then
    echo "Execution complete of program: ${CMDLINE}"
else
    echo "ERROR: Problems detected with ${CMDLINE}"
fi | tee -a ${CMDLOG}
######################################
# END OF CODE #
As I said, this works fine for most programs (or shell scripts) I run.
The problem is that a few of the scripts being called end up hanging
after displaying the ERROR line. I checked this with "set -vx" and the
last line executed is the tee. If I check with "ps", I only have these
processes:
PID TTY TIME CMD
1216682 pts/2 0:00 tee -a /testing/log/problem.program.log.1
1605836 pts/2 0:00 /testing/bin/dr problem.program
1622268 pts/2 0:00 ps
1765552 pts/2 0:00 -ksh
From the looks of things, it is almost like the tee command has opened
the pipe but never closes it. It's like the difference between issuing:
# This ends right away
echo "\c" | tee -a sample.out
# This hangs
tee -a sample.out
This only happens in one or two programs that I use with the driver
tool, but it is enough to cause all sorts of havoc as my users sit
there waiting for their program to end and return to the prompt.
And yet, if I run the problem program without the driver it works
fine.
The only thing I can think of is that perhaps the scripts being run
are doing their own redirection internally, which may be interfering
with the tee.
I also am wondering if the "sync" or "wait" commands may be
influencing this...I notice some scripts the driver is calling contain
those commands in an effort to make sure that the previous processes
were completed, and that the logs have not been buffered.
I tried recoding various ways, and it seems to always hang. I guess my
questions on this are:
1) Has anyone seen this type of behavior before?
2) Is there a way to get around it?
I'm going to mess around with my Korn shell commands, but if anyone
has had the issue before perhaps they can save me from re-inventing
the wheel.
Thanks
Steve N.
I've had problems with tee myself, so I don't use it anymore.
I put output in a string and have a command line option to echo the
string and put it to a file. To point out the bleeding obvious, the tee
hangs because the if/fi loop isn't (looping). I see there's a -i
option with tee, but I'd personally flag tee. I believe you're right
about signals etc. with things like sync and wait and also exec. Not
sure how deep Korn goes.
I guess what you are seeing is pretty much expected, not only on
ksh but on any shell.
When command A is piped to command B, A's stdout and B's stdin are
the two ends of the same pipe (connected with dup/dup2).
B repeatedly reads from its stdin and stops only when it receives
end of file, which a pipe delivers once every process holding the
write end has closed it.
In case you start tee in the following manner:
tee -a filename
then no pipe is connected to its stdin, so it expects input from the
keyboard.
If you want this command to terminate, you should give some indication
of end of input, usually CTRL-D.
Refer to the book on programming with UNIX system calls by Richard
Stevens.
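The end-of-file behaviour described above can be checked without a
keyboard. This is only a minimal sketch, assuming a POSIX shell;
sample.out is the scratch file name used earlier in the thread, and
/dev/null stands in for pressing CTRL-D:

```shell
# Start with an empty scratch file for the demonstration
: > sample.out

# stdin is /dev/null: tee sees end of file at once and exits
tee -a sample.out </dev/null

# stdin is a pipe: tee exits when the writer (printf) closes its end
printf 'one line\n' | tee -a sample.out
```

Both commands return to the prompt immediately, because in each case
nothing is left holding tee's input open.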
Thanks and regards,
Rajbir Bhattacharjee
P.S. - The views expressed here are my own, and do not necessarily
reflect that of my employer.
On Mar 3, 3:48 am, "steven_nospam at Yahoo! Canada"
Perhaps I made my example a bit too simplistic. What I meant by that
line was that my program is hanging "as if" it was waiting for input
from STDIN instead of what it was passed from the previous command
line. I definitely know that the tee command is expecting input from
some source and usually terminates with some sort of EOF marker.
Here is the actual code I was using originally when the problem first
raised its ugly head:
### START OF MAIN LOOP IN CODE ###
{
    echo "Starting execution of program:\t${CMDLINE}"
    echo "Start date/time stamp........:\t$(date)"
    ${CMDLINE}
    ERRVAL=$?
    wait;sync
    if test ${ERRVAL} -ne 0
    then
        echo "\t#########################################################"
        echo "\t# Driver detected a non-zero exit status. The program #"
        echo "\t# that was run may not have completed properly. Verify #"
        echo "\t# the log for any error messages or potential problems. #"
        echo "\t#########################################################"
    fi
    echo "Execution complete of program:\t${CMDLINE}"
    echo "Completion date/time stamp...:\t$(date)"
} 2>&1 | tee -a ${Log}
### END OF CODE ###
From the command line, we usually run commands like the following:
1) This would execute the ls command and create a data file containing
the listing
# driver ls
2) This would create a mksysb backup and keep track of the last time
it was created for auditing purposes.
# driver mksysb -i -e /dev/rmt0
3) This would run a COBOL program that performs a series of month-end
processes
# dr runcbl closemonth
The only one I have been having a problem with is the Korn Shell
script that restores my monthly backup into a historical environment
(driver eomrestore). If I run the restore on its own, it works fine.
If I run the driver with anything else, it works fine. If I run the
two together, it hangs.
I am going to do some coding changes to the restore script. The only
area I'm wondering about in there is a nohup branched process that the
"eomrestore" uses, which may be causing the problems when the "driver"
script calls it. I will update this post if anything develops from
that.
Thx,
Steve N.
>I am going to do some coding changes to the restore script. The only
>area I'm wondering about in there is a nohup branched process that the
>"eomrestore" uses, which may be causing the problems when the "driver"
>script calls it.
Running some program in the background may be your problem. As long as
any process still has the write end of the pipe open, the reading
program never sees end of file and will not terminate. Try these
examples:
# Will print a, sleep, print b:
{ echo aaaa ; sleep 10 ; echo bbbb; } | cat
# Will print a, b immediately, then "hang" until the background sleep exits
{ echo aaaa ; sleep 10 & echo bbbb; } | cat
# Will print a, b immediately and return to prompt
{ echo aaaa ; sleep 10 >/dev/null & echo bbbb; } | cat
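The difference between the last two cases can also be timed. The
following is only a sketch, assuming a POSIX shell and a date command
that supports +%s; the helper name time_pipeline is made up for this
illustration, and the sleep is shortened to 3 seconds:

```shell
# time_pipeline (hypothetical helper): run a command string through
# "| cat" and print how many whole seconds the pipeline took.
time_pipeline() {
    start=$(date +%s)
    sh -c "$1" | cat >/dev/null
    end=$(date +%s)
    echo $((end - start))
}

# Background sleep inherits the pipe: cat waits until sleep exits
held=$(time_pipeline '{ echo aaaa; sleep 3 & echo bbbb; }')

# Background sleep writes to /dev/null: cat sees end of file at once
freed=$(time_pipeline '{ echo aaaa; sleep 3 >/dev/null & echo bbbb; }')

echo "pipe held open: ${held}s, pipe released: ${freed}s"
```

The first figure should be close to the length of the sleep and the
second close to zero, which is the same effect a driver sees when a
script it runs leaves a background child attached to the pipe.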
Hth
Cheers
Heinrich
--
Heinrich Mislik
Zentraler Informatikdienst der Universitaet Wien
A-1010 Wien, Universitaetsstrasse 7
Tel.: (+43 1) 4277-14056, Fax: (+43 1) 4277-9140
Hi Heinrich,
That's exactly what I thought too. The restore script we use performs
an Informix "onload" command, which (once started) usually shows no
activity until it completes. We had users who were under the
impression that the process was "hung" and so they pressed all sorts of
buttons to try to get it to do something. To avoid that, one of our
developers added a simple loop that prints periods (.) on the screen
to make it look like there is activity.
# This is used to start the dots
while :
do
sleep 10
echo ".\c"
done &
P_I_D=$!
# This is used to stop the dots
[ -n "$P_I_D" ] && kill $P_I_D >/dev/null 2>&1
This is the only place in the script that spawns a process using the
ampersand (&).
However, I commented out these lines and the hanging still occurs.
Does anyone know if it's possible to use the "script" command instead
of the "tee" command? I tried but it seems that you can only do this
from a command line, but I may be using the command wrong.
SteveN
Do {} and () behave differently?
I'd still dump tee and its ilk and put the output to a variable, and
write a function that echoes to a file and STDOUT. However, I'm also
interested to know the solution, as I have a similar headache with NIM
scripts that do a kill -HUP on the remote client hanging.
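The variable-plus-function approach mentioned above might look like
the sketch below. It assumes a POSIX shell; the names log_both and
LOGFILE are made up here, and "ls /" merely stands in for ${CMDLINE}.
One caveat: command substitution reads through a pipe just as tee
does, so a background child that keeps stdout open would delay it in
the same way, and nothing is displayed until the command has finished.

```shell
# Hypothetical log file name, for illustration only
LOGFILE=./driver_demo.log

# log_both (hypothetical helper): print the arguments to the screen
# and append the same text to ${LOGFILE}
log_both() {
    printf '%s\n' "$*"
    printf '%s\n' "$*" >> "${LOGFILE}"
}

# Capture stdout and stderr of the command in a variable, keep its
# exit status, then hand the text to log_both
OUTPUT=$(ls / 2>&1)
ERRVAL=$?
log_both "${OUTPUT}"
log_both "Exit status: ${ERRVAL}"
```

Because the exit status is saved before any logging is done, the
driver could still branch on ${ERRVAL} afterwards as it does today.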
My driver utility runs its startup routines, but before running the
$CMDLINE program and logging the output (using tee -a), I pass the
name of the log to a separate utility that I nohup. That utility just
monitors for three things:
1) Check fuser and see if there is still an active pid in the logfile.
2) If a pid exists, check if we reached the message "Driver detected a
non-zero exit status." and kill the hanging tee command.
3) If a pid exists, check if we reached the message "Execution
complete" and kill the tee command if it is still hanging.
It's not pretty, but it works.
Steve N
Do your checks include checking the file that tee is writing to?
Yes. The named log file which I pass to the kill program is the one I
write to with the "tee" command.
Steve
That felt like a bit of a "eureka" moment for me :)
I was also wondering if there are any "exec" commands buried in there?
This is the kill script I created. No exec statements in this one. It
uses the name of the output log ($1) to decide if something has
finished writing to that file. Because my output log will always
contain either "Execution complete" or "non-zero exit" as a string at
the end of the log, I grep for that in the log to determine if it may
have become hung. You could instead insert a special unique code to
indicate normal or abnormal completion.
#!/bin/ksh
#
# killtool: watch the log named in $1 and kill a tee that still has
# it open after the driver has written one of its final messages.
if test -n "$1"
then
    while :
    do
        sleep 30
        # First PID (if any) that still has the log file open
        XPID=$(fuser "$1" 2>/dev/null | awk '{print $1}')
        if test -z "${XPID}"
        then
            exit
        elif grep -q "Driver detected a non-zero exit status." "$1" ||
             grep -q "Execution complete of program:" "$1"
        then
            kill -HUP ${XPID}
            exit
        else
            sleep 30
        fi
    done
fi
And here is how it is called from the driver program - I put the kill
tool before the tee command that "might" hang on me. If it does not
hang, then the fuser will show no PIDs and my killtool does nothing
and exits. If it does hang, then I should see a PID in my
$LOGFILENAME, and the $CMDLINE will have finished and should be
showing either message I'm looking for. The only scenario I don't have
covered is if the $CMDLINE hangs without echoing one of the two
messages I'm looking for. But I have yet to come across that case.
if test -x /usr/local/bin/killtool
then
    nohup /usr/local/bin/killtool ${LOGFILENAME} 1>/dev/null 2>&1 &
fi
${CMDLINE} | tee -a ${LOGFILENAME}