Hi, Everyone.
I'm trying to run an Ansible playbook on two servers, and occasionally either one or both of them fail a task with the error "/usr/bin/python: can't open file 'command': [Errno 13] Permission denied". I started looking into why this was happening (and why it was only happening some of the time). I traced the error to the temporary directory where Ansible copies its command file sometimes being empty instead of the normal random directory created in /tmp (runner/__init__.py:Runner._make_tmp_path()). Looking into it further, it looks like sometimes the mkdir && echo command used to create the temporary directory is assumed to be successful, but the output, which should be the created directory, is empty (runner/connection_plugins/ssh.py:Connection._communicate()).
The mkdir && echo commands looks identical to me:
<server1> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/home/user/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=30 server1 /bin/sh -c 'mkdir -p /tmp/ansible-tmp-1415576481.52-184054006705534 && chmod a+rx /tmp/ansible-tmp-1415576481.52-184054006705534 && echo /tmp/ansible-tmp-1415576481.52-18405400670
<server2> EXEC ssh -C -tt -vvv -o ControlMaster=auto -o ControlPersist=60s -o ControlPath="/home/user/.ansible/cp/ansible-ssh-%h-%p-%r" -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=30 server2 /bin/sh -c 'mkdir -p /tmp/ansible-tmp-1415576481.52-82470479460561 && chmod a+rx /tmp/ansible-tmp-1415576481.52-82470479460561 && echo /tmp/ansible-tmp-1415576481.52-82470479460561'
But when I print out the stderr output from the commands, they are very different:
Server 1 (failed):
OpenSSH_5.3p1, OpenSSL 1.0.1e-fips 11 Feb 2013debug1: Reading configuration data /etc/ssh/ssh_configdebug1: Applying options for *debug1: auto-mux: Trying existing masterdebug1: Control socket "/home/user/.ansible/cp/ansible-ssh-server1-22-user" does not existdebug2: ssh_connect: needpriv 0debug1: Connecting to server1 [[server1 IP]] port 22.debug2: fd 3 setting O_NONBLOCKdebug1: fd 3 clearing O_NONBLOCKdebug1: Connection established.debug3: timeout: 30000 ms remain after connectdebug1: identity file /home/user/.ssh/identity type -1debug1: identity file /home/user/.ssh/identity-cert type -1debug3: Not a RSA1 key file /home/user/.ssh/id_rsa.debug2: key_type_from_name: unknown key type '-----BEGIN'debug3: key_read: missing keytypedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug3: key_read: missing whitespacedebug2: key_type_from_name: unknown key type '-----END'debug3: key_read: missing keytypedebug1: identity file /home/user/.ssh/id_rsa type 1debug1: identity file /home/user/.ssh/id_rsa-cert type -1debug1: identity file /home/user/.ssh/id_dsa type -1debug1: identity file /home/user/.ssh/id_dsa-cert type -1debug1: identity file /home/user/.ssh/id_ecdsa type -1debug1: identity file /home/user/.ssh/id_ecdsa-cert type -1debug1: Remote protocol version 2.0, remote software version OpenSSH_5.3debug1: match: OpenSSH_5.3 pat OpenSSH*debug1: Enabling compatibility mode for protocol 2.0debug1: Local version string SSH-2.0-OpenSSH_5.3debug2: fd 3 setting O_NONBLOCKdebug1: SSH2_MSG_KEXINIT sentdebug3: Wrote 960 bytes for a total of 981debug1: SSH2_MSG_KEXINIT receiveddebug2: kex_parse_kexinit: diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1debug2: kex_parse_kexinit: ssh-rsa-...@openssh.com,ssh-dss-...@openssh.com,ssh-rsa-...@openssh.com,ssh-dss-...@openssh.com,ssh-rsa,ssh-dssdebug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijnda...@lysator.liu.sedebug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijnda...@lysator.liu.sedebug2: kex_parse_kexinit: hmac-md5,hmac-sha1,uma...@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-ripemd160,hmac-ri...@openssh.com,hmac-sha1-96,hmac-md5-96debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,uma...@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-ripemd160,hmac-ri...@openssh.com,hmac-sha1-96,hmac-md5-96debug2: kex_parse_kexinit: zl...@openssh.com,zlib,nonedebug2: kex_parse_kexinit: zl...@openssh.com,zlib,nonedebug2: kex_parse_kexinit:debug2: kex_parse_kexinit:debug2: kex_parse_kexinit: first_kex_follows 0debug2: kex_parse_kexinit: reserved 0debug2: kex_parse_kexinit: diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1debug2: kex_parse_kexinit: ssh-rsa,ssh-dssdebug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijnda...@lysator.liu.sedebug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijnda...@lysator.liu.sedebug2: kex_parse_kexinit: hmac-md5,hmac-sha1,uma...@openssh.com,hmac-ripemd160,hmac-ri...@openssh.com,hmac-sha1-96,hmac-md5-96debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,uma...@openssh.com,hmac-ripemd160,hmac-ri...@openssh.com,hmac-sha1-96,hmac-md5-96debug2: kex_parse_kexinit: none,zl...@openssh.comdebug2: kex_parse_kexinit: none,zl...@openssh.comdebug2: kex_parse_kexinit:debug2: kex_parse_kexinit:debug2: kex_parse_kexinit: first_kex_follows 0debug2: kex_parse_kexinit: reserved 0debug2: mac_setup: found hmac-md5debug1: kex: server->client aes128-ctr hmac-md5 zl...@openssh.comdebug2: mac_setup: found hmac-md5debug1: kex: client->server aes128-ctr hmac-md5 zl...@openssh.comdebug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sentdebug1: expecting SSH2_MSG_KEX_DH_GEX_GROUPdebug3: Wrote 24 bytes for a total of 1005debug2: dh_gen_key: priv key bits set: 146/256debug2: bits set: 521/1024debug1: SSH2_MSG_KEX_DH_GEX_INIT sentdebug1: expecting SSH2_MSG_KEX_DH_GEX_REPLYdebug3: Wrote 144 bytes for a total of 1149debug3: check_host_in_hostfile: host server1 filename /home/user/.ssh/known_hostsdebug3: check_host_in_hostfile: host server1 filename /home/user/.ssh/known_hostsdebug3: check_host_in_hostfile: match line 9debug3: check_host_in_hostfile: host [server1 IP] filename /home/user/.ssh/known_hostsdebug3: check_host_in_hostfile: host [server1 IP] filename /home/user/.ssh/known_hostsdebug3: check_host_in_hostfile: match line 10debug1: Host 'server1' is known and matches the RSA host key.debug1: Found key in /home/user/.ssh/known_hosts:9debug2: bits set: 501/1024debug1: ssh_rsa_verify: signature correctdebug2: kex_derive_keysdebug2: set_newkeys: mode 1debug1: SSH2_MSG_NEWKEYS sentdebug1: expecting SSH2_MSG_NEWKEYSdebug3: Wrote 16 bytes for a total of 1165debug2: set_newkeys: mode 0debug1: SSH2_MSG_NEWKEYS receiveddebug1: SSH2_MSG_SERVICE_REQUEST sentdebug3: Wrote 48 bytes for a total of 1213debug2: service_accept: ssh-userauthdebug1: SSH2_MSG_SERVICE_ACCEPT receiveddebug2: key: /home/user/.ssh/identity ((nil))debug2: key: /home/user/.ssh/id_rsa (0x7f4dc338f2d0)debug2: key: /home/user/.ssh/id_dsa ((nil))debug2: key: /home/user/.ssh/id_ecdsa ((nil))debug3: Wrote 64 bytes for a total of 1277debug1: Authentications that can continue: publickey,gssapi-keyex,gssapi-with-mic,passworddebug3: start over, passed a different list publickey,gssapi-keyex,gssapi-with-mic,passworddebug3: preferred gssapi-with-mic,gssapi-keyex,hostbased,publickeydebug3: authmethod_lookup gssapi-with-micdebug3: remaining preferred: gssapi-keyex,hostbased,publickeydebug3: authmethod_is_enabled gssapi-with-micdebug1: Next authentication method: gssapi-with-micdebug3: Trying to reverse map address [server1 IP].debug1: Unspecified GSS failure. Minor code may provide more informationCredentials cache file '/tmp/krb5cc_12444' not founddebug1: Unspecified GSS failure. Minor code may provide more informationCredentials cache file '/tmp/krb5cc_12444' not founddebug1: Unspecified GSS failure. Minor code may provide more informationdebug1: Unspecified GSS failure. Minor code may provide more informationCredentials cache file '/tmp/krb5cc_12444' not founddebug2: we did not send a packet, disable methoddebug3: authmethod_lookup gssapi-keyexdebug3: remaining preferred: hostbased,publickeydebug3: authmethod_is_enabled gssapi-keyexdebug1: Next authentication method: gssapi-keyexdebug1: No valid Key exchange contextdebug2: we did not send a packet, disable methoddebug3: authmethod_lookup publickeydebug3: remaining preferred: ,publickeydebug3: authmethod_is_enabled publickeydebug1: Next authentication method: publickeydebug1: Trying private key: /home/user/.ssh/identitydebug3: no such identity: /home/user/.ssh/identitydebug1: Offering public key: /home/user/.ssh/id_rsadebug3: send_pubkey_testdebug2: we sent a publickey packet, wait for replydebug3: Wrote 368 bytes for a total of 1645debug1: Server accepts key: pkalg ssh-rsa blen 277debug2: input_userauth_pk_ok: SHA1 fp d3:31:a6:e5:17:09:6c:cb:aa:f1:02:85:5f:fc:e9:94:56:e5:3e:d1debug3: sign_and_send_pubkey: RSA d3:31:a6:e5:17:09:6c:cb:aa:f1:02:85:5f:fc:e9:94:56:e5:3e:d1debug1: read PEM private key done: type RSAdebug3: Wrote 640 bytes for a total of 2285debug1: Enabling compression at level 6.debug1: Authentication succeeded (publickey).debug1: setting up multiplex master socketdebug3: muxserver_listen: temporary control path /home/user/.ansible/cp/ansible-ssh-server1-22-user.FZ7QTgXgs6oGvNasdebug2: fd 4 setting O_NONBLOCKdebug3: fd 4 is O_NONBLOCKdebug3: fd 4 is O_NONBLOCKdebug1: channel 0: new [/home/user/.ansible/cp/ansible-ssh-server1-22-user.FZ7QTgXgs6oGvNas]debug3: muxserver_listen: mux listener channel 0 fd 4debug1: Entering interactive session.debug2: set_control_persist_exit_time: schedule exit in 60 seconds Server 2 (ok):
OpenSSH_5.3p1, OpenSSL 1.0.1e-fips 11 Feb 2013
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: Applying options for *
debug1: auto-mux: Trying existing master
debug2: fd 3 setting O_NONBLOCK
debug2: mux_client_hello_exchange: master version 4
debug3: mux_client_request_forwards: requesting forwardings: 0 local, 0 remote
debug3: mux_client_request_session: entering
debug3: mux_client_request_alive: entering
debug3: mux_client_request_alive: done pid = 18226
debug3: mux_client_request_session: session request sent
debug1: mux_client_request_session: master session id: 2
debug1: mux_client_request_session: master session id: 2
debug3: mux_client_read_packet: read header failed: Broken pipe
debug2: Received exit status from master 0
Shared connection to server2 closed.
I'm not familiar with SSH, so I don't understand much what's going on in the failing case. It looks to me like it's trying out different authentication methods, and eventually is successful. (For authentication, I have keys set up in authorized_keys in both servers.)
Version info:
control:
CentOS release 6.6 (Final)
Python 2.7.3
openssh.x86_64 5.3p1-104.el6
server1:
CentOS release 6.2 (Final)
Python 2.6.6
openssh.x86_64 5.3p1-70.el6
server2:
CentOS release 6.2 (Final)
Python 2.6.6
openssh.x86_64 5.3p1-70.el6
I hope someone could shed some light on this, or at least be able to propose a workaround, so I could get it to work.
Paavo