I have a somewhat strange problem that I could not figure out; maybe someone
here can help.
I have a script that uses scp to distribute files to many servers. I use the
'-o BatchMode=yes -o ConnectTimeout=2' options so that scp will not get stuck
if something is wrong with the remote host.
If the remote host is down (non-pingable) or sshd is down, the timeout option
works and scp continues to the next host after 2 seconds. But if the host is
up (pingable) and sshd is listening on port 22, yet the host is in trouble,
e.g. unresponsive, then the ssh connection hangs and won't ever time out.
In that case, I have to log on to the server initiating the scp and kill
that process so the script will continue to run.
Is there any option that I can use to disconnect the scp connection after
a given time, regardless of whether scp is actually transferring a file, so
the connection won't hang forever? The actual transfer time is about 5
seconds. I have just upgraded ssh to 5.2p1, but it did not help. The remote
servers' platforms are AIX, Linux and Solaris, and I see this behavior on all
of them at some point.
Thank you,
_______________________________________________
openssh-unix-dev mailing list
openssh-...@mindrot.org
https://lists.mindrot.org/mailman/listinfo/openssh-unix-dev
I always recommend:
cd dir; tar czf - dir | ssh user@host "tar xzvf -"
or
cd dir; ssh user@host "tar czvf - dir" | tar xzf -
> Hi all,
>
> I have a somewhat strange problem that I could not figure out; maybe someone
> here can help.
>
> I have a script that uses scp to distribute files to many servers. I use the
> '-o BatchMode=yes -o ConnectTimeout=2' options so that scp will not get stuck
> if something is wrong with the remote host.
For some time, ConnectTimeout has applied to both the TCP connection and
the first exchange of the protocol (banner exchange). This should allow
a client to detect a fully stuck server. Are you running a recent version
on the client or just the server?
> If the remote host is down (non-pingable) or sshd is down, the timeout option
> works and scp continues to the next host after 2 seconds. But if the host is
> up (pingable) and sshd is listening on port 22, yet the host is in trouble,
> e.g. unresponsive, then the ssh connection hangs and won't ever time out.
>
> In that case, I have to log on to the server initiating the scp and kill
> that process so the script will continue to run.
>
> Is there any option that I can use to disconnect the scp connection after
> a given time, regardless of whether scp is actually transferring a file, so
> the connection won't hang forever? The actual transfer time is about 5
> seconds. I have just upgraded ssh to 5.2p1, but it did not help. The remote
> servers' platforms are AIX, Linux and Solaris, and I see this behavior on all
> of them at some point.
You could probably shell script a timeout to unconditionally kill the scp
process after a certain amount of time.
-d
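One way to script the timeout suggested above is a small watchdog wrapper.
This is only a minimal sketch in portable sh, not something posted in the
thread: the function name run_with_timeout, the 10-second limit, and the file
and host names in the usage comment are all hypothetical. It starts the
command in the background, starts a second background job that sleeps and
then signals the command, and returns the command's exit status. A status
above 128 normally means the command was ended by a signal, i.e. it ran out
of time.

```shell
#!/bin/sh
# run_with_timeout SECONDS COMMAND [ARGS...]
# Run COMMAND; if it is still running after SECONDS, send SIGTERM and,
# one second later, SIGKILL. Returns COMMAND's exit status.
run_with_timeout() {
    limit=$1
    shift

    # Start the real command in the background and remember its PID.
    "$@" &
    cmd_pid=$!

    # Watchdog: sleeps, then kills the command if it is still there.
    # Its output is discarded so it never holds the caller's stdout open.
    (
        sleep "$limit"
        kill -TERM "$cmd_pid" 2>/dev/null
        sleep 1
        kill -KILL "$cmd_pid" 2>/dev/null
    ) >/dev/null 2>&1 &
    dog_pid=$!

    # Block until the command ends, either on its own or via the watchdog.
    wait "$cmd_pid"
    status=$?

    # The command is gone; stop the watchdog so it cannot signal a reused PID.
    kill "$dog_pid" 2>/dev/null
    wait "$dog_pid" 2>/dev/null

    return "$status"
}

# Example loop (hypothetical file name and host list):
# for h in $(cat hosts.txt); do
#     run_with_timeout 10 scp -o BatchMode=yes -o ConnectTimeout=2 \
#         file.txt "$h":/tmp/ || echo "scp to $h failed or timed out" >&2
# done
```

With a transfer that normally takes about 5 seconds, a limit of 10 seconds or
so leaves some headroom; the exact value is a judgment call.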
On Thu, Dec 24, 2009 at 1:31 PM, Vahid Moghaddasi <
vahid.mo...@gmail.com> wrote:
> [...]

On Fri, Dec 25, 2009 at 4:07 PM, Bryan Whitehead <dri...@megahappy.net> wrote:
> Try using ServerAliveInterval=3 and a reasonable ServerAliveCountMax as
> well. The full TCP handshake could be happening, but then sshd might not be
> doing anything with the established TCP connection. ConnectTimeout will only
> bail out if there is a problem setting up the connection. Once the connection
> is established, ConnectTimeout's job is done.
I tried ssh -o ServerAliveCountMax=5 -o ServerAliveInterval=1 -o
ConnectTimeout=3 user@server "date" but the connection still hung. I tried
with OpenSSH_4.3p2 and OpenSSH_5.2p1 with the same result. I also added
ServerAliveCountMax 5
ServerAliveInterval 1
hoping that might have a different effect.
There might be a bug, as I see some people have reported:
http://marc.info/?l=openssh-unix-dev&m=121081969105167&w=2
The SSH versions on the server side differ from platform to platform, but I
don't think that would matter anyway.
How would I count the connection seconds in my script and kill the hung
process?
Thanks,
> On Fri, Dec 25, 2009 at 4:07 PM, Bryan Whitehead <dri...@megahappy.net>
> wrote:
> Try using ServerAliveInterval=3 and a reasonable
> ServerAliveCountMax as well. The full TCP handshake could be
> happening but then sshd might not be doing anything with the
> established tcp connection. ConnectTimeout will only bail if
> there is a problem setting up the connection. Once the
> connection is established ConnectTimeout's job is done.
>
> I tried ssh -o ServerAliveCountMax=5 -o ServerAliveInterval=1 -o
> ConnectTimeout=3 user@server "date" but the connection still hung. I tried
> with OpenSSH_4.3p2 and OpenSSH_5.2p1 with the same result. I also added
> ServerAliveCountMax 5
> ServerAliveInterval 1
> hoping that might have a different effect.
Please post a client debug trace from a failing connection. It isn't possible
to determine what is going wrong without it: "ssh -vvv host"
> There might be a bug, as I see some people have reported:
> http://marc.info/?l=openssh-unix-dev&m=121081969105167&w=2
> The SSH versions on the server side differ from platform to platform, but I
> don't think that would matter anyway.
> How would I count the connection seconds in my script and kill the hung
> process?
I think that bug was fixed. Certainly, ServerAliveCountMax works for me.
-d
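For reference, the two ServerAlive options work together. Per ssh_config(5),
ssh sends a probe through the encrypted channel after ServerAliveInterval
seconds without data from the server, and disconnects once
ServerAliveCountMax probes have gone unanswered. An established but silent
session is therefore dropped after roughly interval times count seconds. A
small sketch of that arithmetic, using the values tried in this thread; the
helper name alive_budget, file.txt and the destination path are hypothetical,
and the scp command is only printed, not run.

```shell
#!/bin/sh
# alive_budget INTERVAL COUNT_MAX
# Print the approximate number of seconds ssh tolerates a silent,
# already established session before disconnecting.
alive_budget() {
    echo $(( $1 * $2 ))
}

# Values from the thread: ServerAliveInterval=1, ServerAliveCountMax=5.
interval=1
count_max=5
echo "ssh gives up after roughly $(alive_budget "$interval" "$count_max")s of silence"

# Print the matching scp invocation (hypothetical file and destination).
echo scp -o BatchMode=yes -o ConnectTimeout=3 \
    -o ServerAliveInterval="$interval" -o ServerAliveCountMax="$count_max" \
    file.txt user@server:/tmp/
```

These options only help once the session is established; ConnectTimeout
covers the phase before that.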
> Please post a client debug trace from a failing connection. It isn't possible
> to determine what is going wrong without it: "ssh -vvv host"
Here it is:
# ssh -vvv -o ServerAliveCountMax=5 -o ServerAliveInterval=1 -o ConnectTimeout=3 user@server1 date
OpenSSH_4.3p2, OpenSSL 0.9.8c 05 Sep 2006
debug1: Reading configuration data /usr/local/etc/ssh_config
debug2: ssh_connect: needpriv 0
debug1: Connecting to server1 [192.168.1.24] port 22.
debug2: fd 4 setting O_NONBLOCK
debug1: fd 4 clearing O_NONBLOCK
debug1: Connection established.
debug1: permanently_set_uid: 0/1
debug1: identity file /.ssh/identity type -1
debug1: identity file /.ssh/id_rsa type -1
debug3: Not a RSA1 key file /.ssh/id_dsa.
debug2: key_type_from_name: unknown key type '-----BEGIN'
debug3: key_read: missing keytype
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug3: key_read: missing whitespace
debug2: key_type_from_name: unknown key type '-----END'
debug3: key_read: missing keytype
debug1: identity file /.ssh/id_dsa type -1
debug1: Remote protocol version 2.0, remote software version Sun_SSH_1.1
debug1: no match: Sun_SSH_1.1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_4.3
debug2: fd 4 setting O_NONBLOCK
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug2: kex_parse_kexinit: diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
debug2: kex_parse_kexinit: ssh-rsa,ssh-dss
debug2: kex_parse_kexinit: aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour128,arcfour256,arcfour,aes192-cbc,aes256-cbc,rijnda...@lysator.liu.se,aes128-ctr,aes192-ctr,aes256-ctr
debug2: kex_parse_kexinit: aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,arcfour128,arcfour256,arcfour,aes192-cbc,aes256-cbc,rijnda...@lysator.liu.se,aes128-ctr,aes192-ctr,aes256-ctr
debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,hmac-ripemd160,hmac-ri...@openssh.com,hmac-sha1-96,hmac-md5-96
debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,hmac-ripemd160,hmac-ri...@openssh.com,hmac-sha1-96,hmac-md5-96
debug2: kex_parse_kexinit: none,zl...@openssh.com,zlib
debug2: kex_parse_kexinit: none,zl...@openssh.com,zlib
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit: first_kex_follows 0
debug2: kex_parse_kexinit: reserved 0
debug2: kex_parse_kexinit: diffie-hellman-group-exchange-sha1,diffie-hellman-group1-sha1
debug2: kex_parse_kexinit: ssh-rsa,ssh-dss
debug2: kex_parse_kexinit: aes128-cbc,blowfish-cbc,3des-cbc
debug2: kex_parse_kexinit: aes128-cbc,blowfish-cbc,3des-cbc
debug2: kex_parse_kexinit: hmac-sha1,hmac-md5
debug2: kex_parse_kexinit: hmac-sha1,hmac-md5
debug2: kex_parse_kexinit: none,zlib
debug2: kex_parse_kexinit: none,zlib
debug2: kex_parse_kexinit: en-CA,en-US,es,es-MX,fr,fr-CA,i-default
debug2: kex_parse_kexinit: en-CA,en-US,es,es-MX,fr,fr-CA,i-default
debug2: kex_parse_kexinit: first_kex_follows 0
debug2: kex_parse_kexinit: reserved 0
debug2: mac_init: found hmac-md5
debug1: kex: server->client aes128-cbc hmac-md5 none
debug2: mac_init: found hmac-md5
debug1: kex: client->server aes128-cbc hmac-md5 none
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
debug2: dh_gen_key: priv key bits set: 125/256
debug2: bits set: 492/1024
debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
debug3: check_host_in_hostfile: filename /.ssh/known_hosts
debug3: check_host_in_hostfile: match line 282
debug3: check_host_in_hostfile: filename /.ssh/known_hosts
debug3: check_host_in_hostfile: match line 282
debug1: Host 'xsuadm38' is known and matches the RSA host key.
debug1: Found key in /.ssh/known_hosts:282
debug2: bits set: 507/1024
debug1: ssh_rsa_verify: signature correct
debug2: kex_derive_keys
debug2: set_newkeys: mode 1
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug2: set_newkeys: mode 0
debug1: SSH2_MSG_NEWKEYS received
debug1: SSH2_MSG_SERVICE_REQUEST sent
debug2: service_accept: ssh-userauth
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug2: key: /.ssh/identity (0)
debug2: key: /.ssh/id_rsa (0)
debug2: key: /.ssh/id_dsa (0)
debug1: Authentications that can continue: gssapi-keyex,gssapi-with-mic,publickey,password,keyboard-interactive
debug3: start over, passed a different list gssapi-keyex,gssapi-with-mic,publickey,password,keyboard-interactive
debug3: preferred publickey,keyboard-interactive,password
debug3: authmethod_lookup publickey
debug3: remaining preferred: keyboard-interactive,password
debug3: authmethod_is_enabled publickey
debug1: Next authentication method: publickey
debug1: Trying private key: /.ssh/identity
debug3: no such identity: /.ssh/identity
debug1: Trying private key: /.ssh/id_rsa
debug3: no such identity: /.ssh/id_rsa
debug1: Trying private key: /.ssh/id_dsa
debug1: read PEM private key done: type DSA
debug3: sign_and_send_pubkey
debug2: we sent a publickey packet, wait for reply
debug1: Authentication succeeded (publickey).
debug1: channel 0: new [client-session]
debug3: ssh_session2_open: channel_new: 0
debug2: channel 0: send open
debug1: Entering interactive session.
debug2: callback start
debug2: client_session2_setup: id 0
debug1: Sending command: date
debug2: channel 0: request exec confirm 0
debug2: callback done
debug2: channel 0: open confirm rwindow 0 rmax 32768
debug2: channel 0: rcvd adjust 131072
^Cdebug1: channel 0: free: client-session, nchannels 1
debug3: channel 0: status: The following connections are open:
#0 client-session (t4 r0 i0/0 o0/0 fd 5/6 cfd -1)
debug3: channel 0: close_fds r 5 w 6 e 7 c -1
Killed by signal 2.
Thanks again.
On Fri, 25 Dec 2009, Vahid Moghaddasi wrote:
> On Fri, Dec 25, 2009 at 6:09 PM, Damien Miller <d...@mindrot.org> wrote:
>
> Please post a client debug trace from a failing connection. It isn't
> possible
> to determine what is going wrong without it: "ssh -vvv host"
>
>
> Here it is:
> # ssh -vvv -o ServerAliveCountMax=5 -o ServerAliveInterval=1 -o
> ConnectTimeout=3 user@server1 date
> OpenSSH_4.3p2, OpenSSL 0.9.8c 05 Sep 2006
Could you try a recent OpenSSH at the client side? OpenSSH 4.3 is quite old.
-d
> Could you try a recent OpenSSH at the client side? OpenSSH 4.3 is quite old.
>
> -d
Damien,
It's always like that: when you want it to fail, it won't fail. I
upgraded ssh on the machine to OpenSSH_5.2p1 anyway and added all the
options; hopefully that will do it.
Thank you all very much, you have been very helpful.