
Please help ssh spawn loop problem, expect crashing


John NoSpam

Jun 4, 2003, 1:57:44 PM
Hello,
I am having a few problems with the following Expect script. In addition
to my own script, I have included a previous post by Brian that handles
ssh errors in a loop very well. I like his method of reporting the exact
ssh error to the user and then continuing with the host list. The bad
thing is that it is the only Expect ssh/su example I can find, and it is
missing some code that is important for understanding how it really
works. This is my first Expect script, so please be kind.

1. Does anyone have an example of how to capture the errors/status of a
spawned ssh (such as Successful, Connection refused, or ssh not running)
and pass them to the user? A good example would be greatly appreciated.
Otherwise my script blows up when Expect's timeout of about 10 seconds
expires, the script tries to continue, and it can't send to a process
that does not exist.

2. How can I send these errors to a log along with the hostname that had
the problem? I would also like to keep the logs neat and avoid logging a
password, if that is possible.

3. How can I let the script continue with the other hosts after an
error/status, so that I don't have to keep restarting it? This is
important. It would be very cool if the script could drop into interact
on an error, so that I can try to intervene manually and then pass
control back to the script. It would also be good to have a different
control sequence that skips the host and continues the script if I
cannot fix the problem manually.

4. I want to stay reasonably secure, so I am closing the root session,
and I need to close the connection when done because otherwise I run out
of ptys. However, if I do close -i and wait -i, the script hangs waiting
for the process to die, and if I use wait -nowait instead I still seem
to run out of ptys. What am I missing?

Many Thanks


#!/usr/local/bin/expect -f
# $hosts is a list of hosts that is passed to the loop; that code is working and was omitted.
# $pass is the root password, which the user is asked for; that code is working and was omitted.
foreach host $hosts
{
Spawn ssh $host #starts user ssh session to host
set ssh_id $spawn_id
"\$" {\r} #check for connection
send "su\r" #su to root
expect "Password:" #look for password prompt
send "$pass\r" #send root password
expect {
"/#" {\r} #expect root prompt
timeout 10 {interact + return} #allow to interact if problem occurs, fix problem, and press + return to continue
}
send "root command(s)" #example
expect "/#" #expect root prompt to see command(s) are done
send "exit" #exit root for security
expect "/$"
close -i $ssh_id wait -i $ssh_id
}
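
For question 2, one possible approach is a small logging helper. This is a minimal sketch, not something taken from either script in this post: the name log_status and the choice of log file are assumptions. It writes only the host name and a status string, so the contents of $pass never reach the file.

```tcl
# Hypothetical helper (not part of either script in this thread):
# append one timestamped "host: status" line to a log file.
# Only the host and the status text are written, never the password.
proc log_status {logfile host status} {
    set stamp [clock format [clock seconds] -format "%Y-%m-%d %H:%M:%S"]
    set fh [open $logfile a]
    puts $fh "$stamp $host: $status"
    close $fh
}
```

Inside the foreach loop it could be called as log_status /tmp/hosts.log $host "Connection refused" (the path is only an example). Expect's own log_file command records a transcript of the whole session instead, which is harder to keep neat.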

Below is the example expect script posted by Brian, which I don't fully
understand because some code is missing. It does exactly what I'm trying
to accomplish, if it works and if anyone knows how to fill in the
missing pieces.

A basic overview of what this script is doing is in order.
First it spawns an ssh session to the host(s), does su - to root, and
runs a simple command to stop/kill a process and then restart it.
The command that restarts the process works from the command line on
the servers themselves.
Here is a copy of the restart command that is called by the expect script.

#!/bin/sh
cd /opt/prfmgr/prfmgr/bin
./hepm-shutdown.csh
sleep 3
ps -ef |grep jmq |awk '{print $2}' |xargs kill -9
sleep 1
ps -ef |grep DSPIRIT |awk '{print $2}' |xargs kill -9
sleep 1
ps -ef |grep PingApmSer |awk '{print $2}' |xargs kill -9
sleep 1
./hepm-startup.csh
sleep 3
cd /opt/SpiritWave4_31/bin
nohup jmq &
sleep 2
cd /opt/prfmgr/scripts
./KeepAlive.sh
sleep 2

When I run the expect script, it logs into the machine(s) in question
and does the su successfully. The restart command has to be run as a
specific user, so in order to call the restart script I call a simple
script that executes
su - prfmgr -c "/opt/prfmgr/scripts/apmrestart.sh"
This also works from the command line. When the expect script calls this
script, it shuts down the processes, kills them if they hang, and
restarts everything successfully (the script contains a ps -ef | egrep
"all|the|process|names" that returns and shows a successful restart).
But it seems that when the script exits, it kills the processes it just
started up. Is there any way to keep this from happening? I've attached
the important parts of the script (sans passwords, of course). If anyone
can see why this would happen, please let me know.

proc restart_apm {} {

global ssh_id DEBUG

# Return code constants
set SUCCESS 0
set EFAIL 1
set ETIME 255 ;# Interaction timedout

set timeout 30
set return_value -1

# Login responses
set root_prompt "# "
set password_prompt "Password:"
set sorry "Sorry, try again."
set no_access "is not in the sudoers file.*"
set syntax_err "sudoers file: syntax error*"

# Commands
set restart "/root/apmstart.sh"
set ps "/root/prfmgr-ps.sh"

send -i $ssh_id "$restart\n"
expect -i $ssh_id -re "$root_prompt"
send -i $ssh_id "$ps\r"
expect {
-i $ssh_id "dstb.utils.PingApmSer" {
set return_value $SUCCESS
}

} ;# end expect
expect -i $ssh_id -re "$root_prompt"
return $return_value
} ;# end restart_apm

########
# #
# MAIN #
# #
########

foreach host $hosts {
set timeout 30
if { $DEBUG == 1} {
exp_internal 1
}
# make sure host is pingable
if { $DEBUG == 0} {
log_user 0
}

# Start SSH process
spawn /opt/local/bin/ssh -l biscadm $host; set ssh_id $spawn_id
if { $DEBUG == 1} {
exp_internal 1
}
if { $DEBUG == 0} {
log_user 0
}

# Execute login procedure
set return_code [ssh_login 0]
switch -- $return_code \
0 {
# If this is the final host status, we failed further down
# but did not catch it.
set status($host) "ssh: Successfully logged in"
} 1 {
set status($host) \
"FAILED: ssh: Known password(s) did not work"
close -i $ssh_id; wait -i $ssh_id
continue
} 2 {
set status($host) \
"FAILED: ssh: Authentication failed: account does not exist"
close -i $ssh_id; wait -i $ssh_id
continue
} 3 {
set status($host) \
"FAILED: ssh: Connection Refused: SSH not running or not allowed"
close -i $ssh_id; wait -i $ssh_id
continue
} 4 {
set status($host) \
"FAILED: ssh: SecurID prompt: requires interactive login"
close -i $ssh_id; wait -i $ssh_id
continue
} 255 {
# This is here just in case. This status should not show
# up. We should be collecting something more specific
# than a timeout.
set status($host) "FAILED: ssh: timeout (unexpected response)"
close -i $ssh_id; wait -i $ssh_id
continue
} default {
# This is BAD. The script should NEVER end up here.
set status($host) "ERROR: ssh: check script (unknown return code)"
close -i $ssh_id; wait -i $ssh_id
continue
}

if { $DEBUG == 0} {
log_user 1
}

# Should be at a prompt at this point so send su
set return_code [su_to_root 0]
switch -- $return_code \
0 {
# If this is the final host status, we failed further down
# but did not catch it.
set status($host) "su: OK"
} 1 {
set status($host) \
"FAILED: su: Known password(s) did not work"
close -i $ssh_id; wait -i $ssh_id
continue
} 2 {
set status($host) "FAILED: su: No root account or bad passwd file"
close -i $ssh_id; wait -i $ssh_id
continue
} 255 {
# This is here just in case. This status should not show
# up. We should be collecting something more specific
# than a timeout.
set status($host) "FAILED: su: timeout (unexpected response)"
close -i $ssh_id; wait -i $ssh_id
continue
} default {
# This is BAD. The script should NEVER end up here.
set status($host) "ERROR: su: check script (unknown return code)"
close -i $ssh_id; wait -i $ssh_id
continue
}

# Should be at a root prompt at this point so send restart
# Restart apm processes
set return_code [restart_apm]
switch -- $return_code \
0 {
set status($host) "SUCCESS"
} 1 {
set status($host) "FAILED: restart_prfmgr: prfmgr failed to restart"
close -i $ssh_id; wait -i $ssh_id
continue
} 2 {
set status($host) \
"FAILED: restart_prfmgr: /opt/prfmgr/scripts/apmrestart.sh: command not found"
close -i $ssh_id; wait -i $ssh_id
continue
} 255 {
# This is here just in case. This status should not show
# up. We should be collecting something more specific
# than a timeout.
set status($host) \
"FAILED: restart_prfmgr: timeout (unexpected response)"
close -i $ssh_id; wait -i $ssh_id
continue
} default {
# This is BAD. The script should NEVER end up here.
set status($host) \
"ERROR: restart_prfmgr: check script (unknown return code)"
close -i $ssh_id; wait -i $ssh_id
continue
}

if { $DEBUG == 0} {
log_user 0
}

#expect -i $ssh_id "$ "

send -i $ssh_id "exit\r"
expect -i $ssh_id -re $prompt
close -i $ssh_id; wait -i $ssh_id
} ;# end foreach host $hosts

puts "\n\n\nHost: Status:"
puts "-------------------------------------------------------------"
#Print the status of all hosts
foreach host [array names status] {
if { $host != "" } {
set line [format "%-30s%-s" $host $status($host)]
puts "$line\n"
}
}
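
The close -i / wait -i pair that is repeated throughout the loop above could be folded into one helper, which also bears on question 4. What follows is a sketch only: end_session is a hypothetical name, the 5-second timeout is arbitrary, and it assumes the session is sitting at a login shell that exits when it is sent exit. Waiting for eof before closing gives the child a chance to exit on its own, so that wait can reap it and the pty is released.

```tcl
package require Expect

# Hypothetical helper: ask the remote shell to exit, then close the
# connection and reap the child so that its pty is released.
proc end_session {id} {
    set timeout 5                    ;# arbitrary; local to this proc
    catch {send -i $id "exit\r"}     ;# the session may already be gone
    expect {
        -i $id
        eof     {}
        timeout {}
    }
    catch {close -i $id}             ;# may fail if eof already closed it
    # wait blocks until the child has exited. It returns a list of
    # pid, spawn id, OS error flag and exit status.
    return [wait -i $id]
}
```

If the child ignores the hangup that follows close, wait can still block; in that case the process would have to be killed by pid before waiting.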


Cameron Laird

Jun 4, 2003, 2:25:12 PM
In article <sqqDa.25578$Xl.5...@twister.rdc-kc.rr.com>,

John NoSpam <NoS...@kc.rr.com> wrote:
>Hello,
>I am having a few problems with the following Expect script. In addition
>to my own script, I have included a previous post by Brian that handles
>ssh errors in a loop very well. I like his method of reporting the exact
>ssh error to the user and then continuing with the host list. The bad
>thing is that it is the only Expect ssh/su example I can find, and it is
>missing some code that is important for understanding how it really
>works. This is my first Expect script, so please be kind.
.
[several good questions]
.
.
It might help if you clarify what you're after at a high level.
Thorough study of the Expect book and introductory ssh manuals is
sufficient to solve all these problems. You seem to want something more
than that; perhaps you don't have the time/opportunity/inclination/...
to learn so much about Expect. If *that* is the case, I urge you to
consider paying for consulting. For what I am certain will be a modest
charge, you can have complete solutions, and on a definite schedule.
comp.lang.tcl is better at providing suggestions to developers than
solutions to users, and it could easily happen that no one makes the
time to work out all you appear to require.
--

Cameron Laird <Cam...@Lairds.com>
Business: http://www.Phaseit.net
Personal: http://phaseit.net/claird/home.html

John NoSpam

Jun 4, 2003, 2:37:17 PM
No, the book has no good examples on this that are very detailed for
someone who is learning Expect. All of its host loops assume there will
be no problems connecting. I have only been working with Expect for a
week, but I am stuck. I need some advice, not a consultant, or I would be
posting to Monster, not here. This script is something for my own use as
a sysadmin. If you could offer some help with what I am doing wrong in
the code, it would be much appreciated.
Thanks for your comments.
"Cameron Laird" <cla...@lairds.com> wrote in message
news:vdsec8d...@corp.supernews.com...

Cameron Laird

Jun 4, 2003, 4:18:09 PM
In article <sqqDa.25578$Xl.5...@twister.rdc-kc.rr.com>,
John NoSpam <NoS...@kc.rr.com> wrote:
>Hello,
>I am having a few problems with the following Expect script. In addition
>to my own script, I have included a previous post by Brian that handles
>ssh errors in a loop very well. I like his method of reporting the exact
>ssh error to the user and then continuing with the host list. The bad
>thing is that it is the only Expect ssh/su example I can find, and it is
>missing some code that is important for understanding how it really
>works. This is my first Expect script, so please be kind.
>
>1. Does anyone have an example of how to capture the errors/status of a
>spawned ssh (such as Successful, Connection refused, or ssh not running)
>and pass them to the user? A good example would be greatly appreciated.
>Otherwise my script blows up when Expect's timeout of about 10 seconds
>expires, the script tries to continue, and it can't send to a process
>that does not exist.
.
.
.
You write nearby that "No, the book has no good examples on this that
are very detailed for someone who is learning Expect." I'm at a loss as
to how to reply. For me, the book is consummate; I don't know how
anything I have to say could add to it. Are you looking for something
like

log_user 0

proc connect {ssh_host user password expected_prompt} {
    spawn ssh -l $user $ssh_host
    while 1 {
        expect {
            {Are you sure you want to continue connecting (yes/no)?} {
                send yes\r
            }
            password: {
                send $password\r
                break
            }
            {Connection refused} {
                puts "ssh not in service for $ssh_host."
                return
            }
            timeout {
                puts "Networking problem--are you sure $ssh_host exists?"
                return
            }
        }
    }
    expect {
        $expected_prompt {
            puts "Ready for action with spawn_id $spawn_id."
            return $spawn_id
        }
        {Permission denied, please try again.} {
            puts "Bad password"
            exp_close
        }
        timeout {
            expect *
            puts "Expected prompt '$expected_prompt' not detected in output stream '$expect_out(buffer)'."
            exp_close
        }
    }
}


connect myhost.com claird mypassword {$ }
connect myhost.com claird xxx {$ }
connect myhost.com claird mypassword junk
connect nowhere claird mypassword {$ }
?
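
A proc like connect can be tied into a host loop so that one bad host does not stop the run. The following is a sketch under stated assumptions, not code from the thread: visit_hosts, connect_cmd and work_cmd are hypothetical names, and it assumes that a failed login yields an empty string (as the bare return statements in connect do) while a successful one yields a spawn id.

```tcl
# Hypothetical driver, not from the thread: try every host in turn and
# record one status string per host, so a failure never stops the loop.
# connect_cmd is a command prefix that takes a host name and returns a
# spawn id, or an empty string when the login failed.
# work_cmd is a command prefix that takes the host and the spawn id.
proc visit_hosts {hosts connect_cmd work_cmd} {
    set report {}
    foreach host $hosts {
        # A Tcl error raised while connecting is recorded, not fatal.
        if {[catch {uplevel #0 $connect_cmd [list $host]} id]} {
            lappend report $host "FAILED: $id"
            continue
        }
        if {$id eq ""} {
            lappend report $host "FAILED: no session"
            continue
        }
        if {[catch {uplevel #0 $work_cmd [list $host $id]} err]} {
            lappend report $host "FAILED: $err"
        } else {
            lappend report $host "OK"
        }
    }
    # Flat list of host/status pairs, suitable for "array set".
    return $report
}
```

Because connect takes four arguments, a one-line wrapper that supplies the user, password and prompt would be needed as connect_cmd. The work_cmd proc is where commands are sent with send -i and where the session is closed and waited for.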

Cameron Laird

Jun 4, 2003, 4:20:43 PM
In article <x%qDa.25746$Xl.5...@twister.rdc-kc.rr.com>,

John NoSpam <NoS...@kc.rr.com> wrote:
>No, the book has no good examples on this that are very detailed for
>someone who is learning Expect. All of its host loops assume there will
>be no problems connecting. I have only been working with Expect for a
>week, but I am stuck. I need some advice, not a consultant, or I would be
>posting to Monster, not here. This script is something for my own use as
>a sysadmin. If you could offer some help with what I am doing wrong in
>the code, it would be much appreciated.
.
.
.
? Does Monster actually serve such a purpose? That
surprises me; my impression was that it is *not* a
good place to look for modestly-priced expertise for
episodic help. That's good to know.

Cameron Laird

Jun 4, 2003, 4:27:33 PM
In article <x%qDa.25746$Xl.5...@twister.rdc-kc.rr.com>,
John NoSpam <NoS...@kc.rr.com> wrote:
.
.

.
>If you could offer some help in what I am doing wrong in the code, it would
>be much appreciated.
.
.
.
I see several blemishes in the code. My first advice, though, is to
reduce your current focus to *one* problem at a time. I find what you've
presented confusing. Please make a smaller example of the symptom you
want to solve.

John NoSpam

Jun 7, 2003, 11:28:20 PM
Hello,
What are the claird statements at the end? Also, I am dealing with a
foreach loop over a host list, which gets ugly when you try to tie
something like this in. How would an su to root fit in, especially if
the password failed, and where in this example would the root password
go? This is on the right track, but could you post a complete example,
please?

"Cameron Laird" <cla...@lairds.com> wrote in message
news:vdsl01q...@corp.supernews.com...
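
On the question of where su and the root password would fit: one possible shape is a second proc that is called with the spawn id once the login has succeeded. This is a sketch, not the su_to_root that is missing from Brian's script. The name su_to_root_sketch, the default prompt pattern and the failure messages are assumptions and differ between systems; the return codes (0 success, 1 password rejected, 255 timeout or lost session) simply mirror the codes in his switch.

```tcl
package require Expect

# Hypothetical stand-in for an su step. Returns 0 when a root prompt is
# seen, 1 when the password is rejected, 255 on timeout or lost session.
# The prompt regexp and the error patterns are assumptions.
proc su_to_root_sketch {id root_password {root_prompt {# $}}} {
    set timeout 30
    send -i $id "su -\r"
    expect {
        -i $id
        -nocase "password:" {
            # A real su turns terminal echo off at this prompt, so the
            # password should not show up in the session output.
            send -i $id "$root_password\r"
            exp_continue
        }
        -re {[Ss]orry|incorrect password|Authentication failure} {
            return 1
        }
        -re $root_prompt { return 0 }
        timeout { return 255 }
        eof     { return 255 }
    }
}
```

In a host loop the caller would check the result, record a status for the host, and close and wait on the session before moving on to the next host whenever the result is not 0.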