playbooks hang forever when client is swapping or has (NFS) mount problems


Frank Thommen

Aug 19, 2016, 7:19:44 AM
to Ansible Project
Dear all,

While taking my first steps with Ansible, I noticed that playbook
execution hangs completely on some clients. What these hosts have in
common is that they are either swapping (even with very small amounts of
swap in use) or have hanging/unresponsive NFS filesystems. In all cases
the two problems appeared together, so I cannot say which one is the
actual cause.

However, all local filesystems are perfectly fine and responsive, and
the system overall is working fine too.

Running even the simplest playbooks on these hosts hangs completely,
even though the playbooks don't access the problematic filesystems.
Running the same commands as ad-hoc commands works fine:

$ ansible buggyhost -m shell -a '/bin/ls' --key=id_rsa
buggyhost | SUCCESS | rc=0 >>
[... `/bin/ls` output here..]

$

... but ...

$ ansible-playbook ls.yml --extra-vars "target=buggyhost" --private-key=id_rsa

PLAY [buggyhost] ***************************************************************

TASK [setup] *******************************************************************
[...and here it hangs...]
^C
$

Using -vvv doesn't really help:


$ ansible-playbook -vvv ls.yml --extra-vars "target=buggyhost" --private-key=id_rsa
Using /root/ansible-config/ansible.cfg as config file

PLAYBOOK: ls.yml ***************************************************************
1 plays in ls.yml

PLAY [buggyhost] ***************************************************************

TASK [setup] *******************************************************************
Using module file /root/ansible/ansible/lib/ansible/modules/core/system/setup.py
<buggyhost> ESTABLISH SSH CONNECTION FOR USER: None
<buggyhost> SSH: EXEC ssh -q -C -o ControlMaster=auto -o
ControlPersist=60s -o StrictHostKeyChecking=no -o
'IdentityFile="id_rsa"' -o KbdInteractiveAuthentication=no -o
PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey
-o PasswordAuthentication=no -o ConnectTimeout=10 -o
ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r buggyhost '/bin/sh -c
'"'"'( umask 77 && mkdir -p "` echo
$HOME/.ansible/tmp/ansible-tmp-1471604358.1-265208088531534 `" && echo
ansible-tmp-1471604358.1-265208088531534="` echo
$HOME/.ansible/tmp/ansible-tmp-1471604358.1-265208088531534 `" ) &&
sleep 0'"'"''
<buggyhost> PUT /tmp/tmp7cAXoN TO
/root/.ansible/tmp/ansible-tmp-1471604358.1-265208088531534/setup.py
<buggyhost> SSH: EXEC sftp -b - -C -o ControlMaster=auto -o
ControlPersist=60s -o StrictHostKeyChecking=no -o
'IdentityFile="id_rsa"' -o KbdInteractiveAuthentication=no -o
PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey
-o PasswordAuthentication=no -o ConnectTimeout=10 -o
ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r '[buggyhost]'
<buggyhost> ESTABLISH SSH CONNECTION FOR USER: None
<buggyhost> SSH: EXEC ssh -q -C -o ControlMaster=auto -o
ControlPersist=60s -o StrictHostKeyChecking=no -o
'IdentityFile="id_rsa"' -o KbdInteractiveAuthentication=no -o
PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey
-o PasswordAuthentication=no -o ConnectTimeout=10 -o
ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r buggyhost '/bin/sh -c
'"'"'chmod -R u+x
/root/.ansible/tmp/ansible-tmp-1471604358.1-265208088531534/ && sleep
0'"'"''
<buggyhost> ESTABLISH SSH CONNECTION FOR USER: None
<buggyhost> SSH: EXEC ssh -q -C -o ControlMaster=auto -o
ControlPersist=60s -o StrictHostKeyChecking=no -o
'IdentityFile="id_rsa"' -o KbdInteractiveAuthentication=no -o
PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey
-o PasswordAuthentication=no -o ConnectTimeout=10 -o
ControlPath=/root/.ansible/cp/ansible-ssh-%h-%p-%r -tt buggyhost
'/bin/sh -c '"'"'/usr/bin/python
/root/.ansible/tmp/ansible-tmp-1471604358.1-265208088531534/setup.py; rm
-rf "/root/.ansible/tmp/ansible-tmp-1471604358.1-265208088531534/" >
/dev/null 2>&1 && sleep 0'"'"''
^C
$

The playbook is:

$ cat ls.yml
---
- hosts: '{{ target }}'
  tasks:
  - name: Run ls
    shell: /bin/ls
$


I am using Ansible 2.2.0 (devel 3afe50dfe2). Server and clients are
running openSUSE 13.1.

Any idea why playbooks hang in this situation?

Cheers
frank


Jean-Yves LENHOF

Aug 19, 2016, 7:55:12 AM
to ansible...@googlegroups.com
The first task Ansible runs is fact gathering. The facts include the
mounted filesystems, and therefore the NFS ones too.

Regards,

Frank Thommen

Aug 19, 2016, 8:20:36 AM
to ansible...@googlegroups.com
Thanks a lot. Using 'gather_facts: no' in the playbook solved this issue:

---
- hosts: '{{ target }}'
  gather_facts: no
  tasks:
  - name: Run ls
    shell: /bin/ls


However, I found this feature to be quite hidden in the documentation.
IMHO gather_facts should be off by default and only enabled on request.

Cheers
frank


Jean-Yves LENHOF

Aug 19, 2016, 8:34:08 AM
to ansible...@googlegroups.com
On 19/08/2016 at 14:20, Frank Thommen wrote:
> However, I found this feature to be quite hidden in the documentation.
> IMHO gather_facts should be off by default and only enabled on request.

Hi,

No, it is fine that it is on by default, and it should stay that way. A lot of my playbooks (and other people's too) depend on facts (ansible_distribution, ansible_distribution_version, ansible_lsb.major_release, to name the most common ones).


If you need to, you can filter facts:
http://stackoverflow.com/questions/34485286/ansible-gathering-facts-with-filter-inside-a-playbook
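
A minimal sketch of limiting fact gathering at the play level, assuming Ansible 2.1 or later, where the `gather_subset` keyword is available. The assumption here is that mount information is collected as part of the `hardware` subset, so excluding that subset should skip the mount scan while keeping the distribution facts. This has not been verified against the 2.2.0 devel build used in this thread. By contrast, the `filter` option of the `setup` module, as far as I know, only limits which facts are returned, not which ones are collected.

```yaml
---
- hosts: '{{ target }}'
  gather_facts: yes
  # Assumption: "!hardware" excludes the collector that inspects mounts.
  # Check the behaviour on your own Ansible version before relying on it.
  gather_subset: "!hardware"
  tasks:
  - name: Show distribution facts that are still gathered
    debug:
      msg: "{{ ansible_distribution }} {{ ansible_distribution_version }}"
```

If the play still hangs with this setting, falling back to `gather_facts: no` as shown earlier in the thread remains the safe option.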


But from my point of view, if NFS is not responding, it is your server that is broken. Perhaps automounting (and thus unmounting) NFS is an option for you.

Regards,

JYL

Frank Thommen

Aug 19, 2016, 10:07:55 AM
to ansible...@googlegroups.com
On 08/19/2016 02:33 PM, Jean-Yves LENHOF wrote:
> On 19/08/2016 at 14:20, Frank Thommen wrote:
>> On 08/19/2016 01:54 PM, Jean-Yves LENHOF wrote:
>>> The first task Ansible runs is fact gathering. The facts include the
>>> mounted filesystems, and therefore the NFS ones too.
>>
>> [...]
>>
>> However, I found this feature to be quite hidden in the documentation.
>> IMHO gather_facts should be off by default and only enabled on request.
>
> Hi,
>
> No, it is fine that it is on by default, and it should stay that way.
> A lot of my playbooks (and other people's too) depend on facts
> (ansible_distribution, ansible_distribution_version,
> ansible_lsb.major_release, to name the most common ones).

I see it the same way as with services, open ports, access permissions
etc.: minimum by default, more on request. But of course, once the
maximum has been established as the default, a change can break
established and working mechanisms. It is probably too late now to
change this initial design decision.


> If you need to, you can filter facts:
> http://stackoverflow.com/questions/34485286/ansible-gathering-facts-with-filter-inside-a-playbook
>
> But from my point of view, if NFS is not responding, it is your server
> that is broken. Perhaps automounting (and thus unmounting) NFS is an
> option for you.
>
> Regards, JYL

Yes, there is a technical problem, but that's not the issue. The issue
is that this shouldn't break my scripts. When, very simplified, I run a
script which does an `ls` in my home directory, I don't want it to
break (rather: it /must/ not break) just because some other, completely
unrelated filesystem or service is not working. But that's what was
happening in our case until we disabled gather_facts.

Anyway: Our current problem is solved - thanks to your hint - and I will
set "gathering = explicit" in the configuration file, which should also
have the desired effect.
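
As a rough illustration of what that setting implies: with `gathering = explicit` in the `[defaults]` section of ansible.cfg, plays should skip the setup task unless they ask for it. A play that does depend on facts would then have to opt in, roughly like the sketch below (the task itself is only a hypothetical example):

```yaml
---
- hosts: '{{ target }}'
  gather_facts: yes    # explicit opt-in; needed once gathering = explicit is set
  tasks:
  - name: Use a gathered fact
    debug:
      var: ansible_distribution
```

Plays without the `gather_facts: yes` line, such as the ls.yml playbook above, would then run without touching the mounted filesystems during setup.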


Cheers
frank




