Ansible Failing But SSH Connections are Successful


Michael Ellis

Oct 20, 2017, 3:37:26 PM
to Ansible Project
Hello Group,

New to Ansible and the Project, but like what I see so far!

I am running Ansible 2.4.0 on RHEL 7.4.  I have SSH keys set up on several hosts and can connect to the remote hosts from my Ansible control node using the key pair without being prompted for a password, all as the root user in my POC.

The issue I am seeing is that even though I can use the SSH keys to connect, Ansible fails for anything but a "raw" connection.  I am guessing there is some oddity in my system or root shell profile (bash), but I have compared working machines in my Lab against non-working machines in the Dev environment and have not found the difference yet.

The issue seems to be with stringing commands together over the SSH connection; I can see it even when running commands over SSH by hand.  But I know for certain that the actual login process using the SSH keys does work.  Below is an example of the debug output when trying to run an ad-hoc command.  I tried to include what I thought was the most relevant info, but if more is needed, please let me know!

 22433 1508524844.83617: _low_level_execute_command(): starting
 22433 1508524844.83629: _low_level_execute_command(): executing: /bin/sh -c 'echo ~ && sleep 0'
 22433 1508524845.34744: stderr chunk (state=2):
>>>
********************************************************************************
SSH Banner Displayed Here....
********************************************************************************

<<<

 22433 1508524845.72531: stderr chunk (state=3):
>>>~: -c: line 0: unexpected EOF while looking for matching `''
~: -c: line 1: syntax error: unexpected end of file
<<<

 22433 1508524845.72595: stdout chunk (state=3):
>>><<<

 22433 1508524845.72611: stderr chunk (state=3):
>>><<<

 22433 1508524845.72639: _low_level_execute_command() done: rc=1, stdout=, stderr=
********************************************************************************
SSH Banner Displayed Here....
********************************************************************************

~: -c: line 0: unexpected EOF while looking for matching `''
~: -c: line 1: syntax error: unexpected end of file

 22433 1508524845.72665: _low_level_execute_command(): starting
 22433 1508524845.72678: _low_level_execute_command(): executing: /bin/sh -c '( umask 77 && mkdir -p "` echo ~/.ansible/tmp/ansible-tmp-1508524844.84-267267221542940 `" && echo ansible-tmp-1508524844.84-267267221542940="` echo ~/.ansible/tmp/ansible-tmp-1508524844.84-267267221542940 `" ) && sleep 0'
 22433 1508524845.97754: stderr chunk (state=2):
>>>umask: -c: line 0: unexpected EOF while looking for matching `''
umask: -c: line 1: syntax error: unexpected end of file
<<<

 22433 1508524845.97820: stdout chunk (state=3):
>>><<<

 22433 1508524845.97839: stderr chunk (state=3):
>>><<<

 22433 1508524845.97873: _low_level_execute_command() done: rc=1, stdout=, stderr=umask: -c: line 0: unexpected EOF while looking for matching `''
umask: -c: line 1: syntax error: unexpected end of file

 22433 1508524845.97910: _execute() done
 22433 1508524845.97921: dumping result to json
 22433 1508524845.97933: done dumping result, returning
 22433 1508524845.97955: done running TaskExecutor() for lqil0219icma01.cardinalhealth.net/TASK: ping [005056a4-c5bf-11a9-98aa-000000000053]
 22433 1508524845.97981: sending task result for task 005056a4-c5bf-11a9-98aa-000000000053
 22433 1508524845.98058: done sending task result for task 005056a4-c5bf-11a9-98aa-000000000053
 22433 1508524845.98122: WORKER PROCESS EXITING
    "changed": false,
    "msg": "Authentication or permission failure. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote temp path in ansible.cfg to a path rooted in \"/tmp\". Failed command was: ( umask 77 && mkdir -p \"` echo ~/.ansible/tmp/ansible-tmp-1508524844.84-267267221542940 `\" && echo ansible-tmp-1508524844.84-267267221542940=\"` echo ~/.ansible/tmp/ansible-tmp-1508524844.84-267267221542940 `\" ), exited with result 1",
    "unreachable": true
}
 22421 1508524845.98229: no more pending results, returning what we have
 22421 1508524845.98237: results queue empty

To me this appears to be making the SSH connection, as the banner is displayed, but when Ansible tries to string a couple of commands together, the second and subsequent commands fail, which causes the play to fail.
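
For what it's worth, the wrapped command from the debug output can be replayed by hand over plain SSH, which should reproduce the failure outside of Ansible entirely (the quoting below is my own reconstruction of what Ansible sends):

# ssh -q broken-host "/bin/sh -c 'echo ~ && sleep 0'"

On a working Lab node that should simply print /root; on the broken Dev nodes I would expect the same "unexpected EOF while looking for matching" errors shown above.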

I have tried comparing /etc/profile and any .profile/.bashrc files between working and non-working nodes, but I must be missing something, as I cannot get the non-working nodes to, well, work.
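
In case it helps, the comparison was nothing more sophisticated than diffing the files between a known-good Lab node and a broken Dev node, something along these lines (hostnames are placeholders):

# diff <(ssh -q working-host cat /etc/profile) <(ssh -q broken-host cat /etc/profile)
# diff <(ssh -q working-host cat /root/.bashrc) <(ssh -q broken-host cat /root/.bashrc)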

Below is a successful run using the 'raw' module against the same host:
# ansible broken-host -m raw -a 'uptime'
broken-host | SUCCESS | rc=0 >>
Shared connection to broken-host closed.
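
For comparison, the failing debug output further up came from running the ping module against the same host with Ansible's debug tracing turned on, roughly:

# ANSIBLE_DEBUG=1 ansible broken-host -m ping

As far as I can tell, raw succeeds because it skips the /bin/sh -c wrapping and the ~/.ansible/tmp directory setup that regular modules need, which would explain why it is the only thing that works against these hosts.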

On the SSH side, we do have root logins set to "forced-commands-only", so I am not sure if that is a factor.  I did test setting it to "yes" and restarting sshd, but my Ansible commands still fail:
# ssh -q broken-host "grep Root /etc/ssh/sshd_config"
PermitRootLogin forced-commands-only



I am just not sure where to go from here, any help would be greatly appreciated!

Thanks,

-Mike

Michael Ellis

Oct 24, 2017, 4:52:59 PM
to Ansible Project
I think I have this narrowed down to two issues revolving around SSH connections to the remote machines, but testing is still ongoing for one of them.

By default in our environment we have PermitRootLogin set to forced-commands-only on all machines.  This parameter in the sshd config seems to be the crux of the problem; it appears PermitRootLogin needs to be set to yes.
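
If I am reading the sshd_config man page correctly, forced-commands-only only permits root public-key logins for keys that carry a command= option in authorized_keys, and that forced command is what actually runs, the command Ansible sends being ignored, which would line up with the failures in the debug output.  The change being tested on the target hosts is simply:

PermitRootLogin yes     (in /etc/ssh/sshd_config)
# systemctl restart sshd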

The other issue I am seeing, and still testing, is that the SSH key I am attempting to use is an existing key from our admin jump servers, and that key has the "no-pty" option set, which may be causing additional issues.  I have not 100% proven that "no-pty" is a problem; as mentioned, testing continues.
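
For reference, the key's entry in /root/.ssh/authorized_keys on the targets looks something like this (the key material and comment here are placeholders, not the real ones):

no-pty ssh-rsa AAAAB3NzaC1yc2E...rest-of-public-key... admin@jumphost

The no-pty option tells sshd not to allocate a pseudo-terminal for sessions authenticated with that key, and since Ansible's ssh connection requests a tty (-tt) for many commands, it seems plausible that could interfere.  The next round of testing is with a dedicated key that does not carry the option.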

More updates shortly.

-me

Michael Ellis

Oct 26, 2017, 1:49:52 PM
to Ansible Project
I think I have this working as expected at this point.  There is something very odd in our environment that is breaking SSH in wonderful ways where Ansible is concerned, and I suspect the current CFEngine setup is not going to give up so easily.  Even after stopping CFE, there are still some custom, home-grown processes tied to CFE that seemingly at random break my ability to SSH in from Ansible.

To get around this and allow my testing to proceed, I have set up an Ansible service account locally on my test machines, added the proper entries to sudoers, and added the needed SSH keys to make it all work.  Everything is working as expected at this time, so now on to further testing!  A rough sketch of the setup is below.
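
The account name, sudoers drop-in, and host name here are simply what I picked for the POC.  On each test machine (as root):

# useradd ansible
# echo 'ansible ALL=(ALL) NOPASSWD: ALL' > /etc/sudoers.d/ansible
# chmod 0440 /etc/sudoers.d/ansible

From the control node, push the key and point Ansible at the new account in ansible.cfg:

# ssh-copy-id ansible@test-host

[defaults]
remote_user = ansible

[privilege_escalation]
become = True
become_user = root

This keeps everything off the root/forced-command path entirely.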

-me
