I have an application which, at some point, executes a zsh script. This
script in turn calls another script written in Tcl/Expect, and that one
does an rlogin to a remote host, does some interaction, and then
returns.
Note also that the login-shell on the remote host is tcsh. The operating
system on both machines is Linux.
All this works fine when called manually, but when run from a
cron job, it mysteriously dies during the rlogin session.
So the first thing I did was to put a "set echo" in the .cshrc on
the remote system to see at which point the application dies.
When I examined the output that the cron job sent me by email,
I found that on different runs it died at different points
while executing .cshrc (in this case, there is a *lot* to do in .cshrc,
as MANY other scripts are sourced). Moreover, it does not die
with an error message; instead, it is as if someone had just "pulled
the plug" during execution. I had one case where even a command
setenv VAR something
was listed only "half" in the cron job's output, as
setenv VAR som
In another case, the whole login process completed, and Expect
was even able to supply the first command of the interaction.
The next command, which happened to be a cd $HOME, was echoed,
but the shell prompt never appeared after it, so the connection
must have been cut at that point.
There was also no further output from my script after that.
So what could be the reason for this strange behaviour? Could
some kind of timeout be involved that kills the connection?
Alternatively, could it be that cron collects output only up to
a certain amount and kills the application if it produces
more?
The latter would explain why the commands sometimes seem to
be "cut in the middle"; given the complexity of the rlogin
stuff, it is quite possible that each invocation produces
a different amount of output. OTOH, the man
pages for cron don't mention such a restriction.
Any ideas on this?
Ronald
--
Ronald Fischer <ron...@eml.cc>
Posted via http://www.newsoffice.de/
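One way to test the output-limit hypothesis is to take cron's mail out of the picture: send everything the job prints to a log file and record the exit status there. If the log is complete while the mail is truncated, the mail path is suspect; if the log is cut off too, the job itself died. A minimal sketch, assuming a POSIX-compatible shell; the helper name run_logged and any log path passed to it are hypothetical, not part of the original setup.

```shell
#!/bin/sh
# run_logged LOGFILE COMMAND [ARGS...]
# Hypothetical helper: appends the command's stdout and stderr to LOGFILE,
# framed by start/end markers, and returns the command's exit status.
run_logged() {
    log=$1
    shift
    {
        echo "=== start: $(date) ==="
        "$@"
        status=$?
        echo "=== end: status $status ==="
    } >>"$log" 2>&1
    return "$status"
}

# Example use from the script that cron starts (paths are placeholders):
#   run_logged /tmp/myjob.log /path/to/myjob
```

A missing "=== end" marker in the log would suggest that the wrapper itself was killed, not just the command it started.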
I'm sure the problem is solvable. It *does* sound like a mess,
though ... Have you read <URL: http://wiki.tcl.tk/cron > and
<URL: http://wiki.tcl.tk/match_max >?
No clue. I would definitely try 'strace -f -o /tmp/mybug.tra myjob' in
the crontab entry.
That way you'll know exactly what the killer is (a signal, or a silent
exit after an error).
-Alex
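strace gives the full picture, but a cheaper first check for "signal or silent exit" is the exit status that the calling shell sees. A minimal sketch, assuming a shell such as sh, bash or zsh that reports a signal death as 128 plus the signal number; the helper name describe_status is hypothetical. Note that a program can also exit with a status above 128 on its own, so this is a hint, not proof.

```shell
#!/bin/sh
# describe_status STATUS
# Hypothetical helper: prints how a command ended, using the common shell
# convention that a status above 128 means "terminated by signal
# (STATUS - 128)".
describe_status() {
    if [ "$1" -gt 128 ]; then
        echo "killed by signal $(( $1 - 128 ))"
    elif [ "$1" -eq 0 ]; then
        echo "exited normally"
    else
        echo "exited with error status $1"
    fi
}

# Example use right after the suspect command (path is a placeholder):
#   /path/to/myjob; describe_status $?
```

On Linux, "killed by signal 1" would point at SIGHUP and "killed by signal 15" at SIGTERM; kill -l lists the numbers for the local system.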
Yes, but nothing which seems to be applicable in that particular case.
Actually, I was able to rewrite my application so that it does not need
interaction anymore, so it can be "remote controlled" via ssh instead
of Tcl/Expect.
Ronald
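For readers after the same workaround, a non-interactive ssh call from a cron job might look like the sketch below. It assumes key-based authentication is already set up; the helper name remote_run, the SSH override variable and the host name in the usage comment are hypothetical and do not come from the original application. BatchMode=yes is a standard OpenSSH client option that makes ssh fail instead of prompting for a password.

```shell
#!/bin/sh
# remote_run HOST COMMAND
# Hypothetical helper: runs COMMAND on HOST without interactive prompts.
# With BatchMode=yes, ssh fails instead of asking for a password, which
# suits an unattended cron job. Setting SSH=echo gives a dry run that
# prints the arguments instead of connecting.
remote_run() {
    host=$1
    shift
    ${SSH:-ssh} -o BatchMode=yes "$host" "$@"
}

# Example use (host name is a placeholder):
#   remote_run remotehost 'cd "$HOME" && ls'
```

The command string is interpreted by the login shell on the remote side (tcsh in this thread), so it should stick to syntax that shell accepts.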
On Jan 24, 10:41 am, "Ronny" <ro.naldfi.sc...@gmail.com> wrote:
>
> Actually, I was able to rewrite my application so that it does not need
> interaction anymore, so it can be "remote controlled" via ssh instead
> of Tcl/Expect.
Glad you found a workaround. However, it would be good if you could
report back about the strace output. Understanding the problem does no
harm.
-Alex