Gathering facts just hung to infinity for my AWS hosts

Steven Truong

Oct 7, 2014, 2:55:12 PM
to ansible...@googlegroups.com
Hi,

I have a mix of Amazon Linux and CentOS 6.5 hosts on AWS, and I have recently run into this problem.  Running the "setup" script directly works, but Ansible's fact gathering just hangs.

# pwd
/home/adsymp/.ansible

 # su root -c /bin/sh -c 'echo SUDO-SUCCESS-dhyqcckwxzuthyzxqniggbjvfxvuewha; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /home/adsymp/.ansible/tmp/ansible-tmp-1412705402.97-194667564226294/setup'
SUDO-SUCCESS-dhyqcckwxzuthyzxqniggbjvfxvuewha
/home/adsymp/.ansible/tmp/ansible-tmp-1412705402.97-194667564226294/setup:2922: DeprecationWarning: object.__new__() takes no parameters
  return super(cls, subclass).__new__(subclass, *arguments, **keyword)
{"verbose_override": true, "changed": false, "ansible_facts": {"ansible_product_serial": "", "ansible_form_factor": "", "ansible_product_version": "", "ansible_swaptotal_mb": 0, "ansible_user_id": "root", "module_setup": true, "ansible_userspace_bits": "64", "ansible_distribution_version":
.....


Ansible verbose output:

EXEC ['ssh', '-C', '-tt', '-vvv', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'ControlPath=/home/steven/.ansible/cp/ansible-ssh-%h-%p-%r', '-o', 'StrictHostKeyChecking=no', '-o', 'IdentityFile="/home/steven/.ssh/key-user"', '-o', 'KbdInteractiveAuthentication=no', '-o', 'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', '-o', 'PasswordAuthentication=no', '-o', 'User=user', '-o', 'ConnectTimeout=10', 'ip-10-123....', "/bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1412705402.97-194667564226294 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1412705402.97-194667564226294 && echo $HOME/.ansible/tmp/ansible-tmp-1412705402.97-194667564226294'"]
<ip-10-123-71-225> PUT /tmp/tmpLdghlG TO /home/user/.ansible/tmp/ansible-tmp-1412705402.97-194667564226294/setup
<ip-10-123-71-225> EXEC ['ssh', '-C', '-tt', '-vvv', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'ControlPath=/home/steven/.ansible/cp/ansible-ssh-%h-%p-%r', '-o', 'StrictHostKeyChecking=no', '-o', 'IdentityFile="/home/steven/.ssh/key-user"', '-o', 'KbdInteractiveAuthentication=no', '-o', 'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', '-o', 'PasswordAuthentication=no', '-o', 'User=user', '-o', 'ConnectTimeout=10', 'ip-10-123.....', u'/bin/sh -c \'su root -c "/bin/sh -c \'"\'"\'echo SUDO-SUCCESS-dhyqcckwxzuthyzxqniggbjvfxvuewha; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /home/adsymp/.ansible/tmp/ansible-tmp-1412705402.97-194667564226294/setup; rm -rf /home/adsymp/.ansible/tmp/ansible-tmp-1412705402.97-194667564226294/ >/dev/null 2>&1\'"\'"\'"\'']


Amazon Linux:
# rpm -q openssh
openssh-5.3p1-15.12.amzn1.x86_64
# python --version
Python 2.6.5

CentOS:
$ rpm -q openssh
openssh-6.4p1.el6-1.x86_64
$ python --version
Python 2.6.6

I am using ansible 1.7.1.

Please help, as this is really strange.  I ran the setup script with "python -m trace --trace" and nothing stood out as the cause of the issue.

Thanks,
Steven.


James Cammarata

Oct 7, 2014, 9:44:37 PM
to ansible...@googlegroups.com
Hi Steven,

Can you hop on one of the systems while the fact gathering is going on and see what may be hung that way? My bet is it could be something mount-related.

Steven Truong

Oct 8, 2014, 1:23:57 PM
to ansible...@googlegroups.com
Hi James,

Here are the mount points for two typical systems affected by this issue.

bash-3.2$ df
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/xvda1            10321208   4753692   5462660  47% /
tmpfs                  3825972         0   3825972   0% /dev/shm
/dev/xvdc            433455904   3674680 407762920   1% /media/ephemeral0


Filesystem     1K-blocks     Used Available Use% Mounted on
/dev/xvde1      51606140 21435556  27549656  44% /
tmpfs            7685444        0   7685444   0% /dev/shm
/dev/xvdg       82558640  3882592  74482308   5% /media/ephemeral0
/dev/xvdh       82558640   188292  78176608   1% /media/ephemeral1


When I logged in to the systems and used strace to attach to the PID of the Python process started by Ansible, I saw that it was simply hung on:

Process 406 attached - interrupt to quit
read(0, 

But when I simply ran /usr/bin/python /home/adsymp/.ansible/tmp/ansible-tmp-1412788904.43-83621415975693/setup by hand, I got the complete set of expected facts.

Thank you,
Steven.
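
Note on the strace output above: a process parked in read(0, ... is waiting on file descriptor 0 (stdin). That would be consistent with a command waiting for input that never arrives, such as a password prompt, although the post does not confirm what was being read. Below is a minimal Python 3 sketch of the same symptom. It uses a hypothetical stand-in child process (not the Ansible setup module), and the function name and timeout are my own choices: the child blocks while its stdin is an open pipe with nothing written to it, and returns at once when stdin is /dev/null.

```python
import subprocess
import sys

# Hypothetical stand-in for a program that prompts on stdin (for example,
# a password prompt). This is NOT the Ansible module from the thread.
CHILD = [sys.executable, "-c", "import sys; sys.stdin.readline()"]


def blocks_on_stdin(stdin, timeout=3.0):
    """Return True if the child is still waiting on stdin after `timeout` seconds."""
    proc = subprocess.Popen(CHILD, stdin=stdin)
    try:
        proc.wait(timeout=timeout)
        return False  # child saw a line or EOF and exited
    except subprocess.TimeoutExpired:
        proc.kill()   # child was still parked in read(0, ...)
        proc.wait()
        return True
    finally:
        if proc.stdin is not None:
            proc.stdin.close()


# Open pipe with nothing ever written: the child waits indefinitely.
print(blocks_on_stdin(subprocess.PIPE))     # True
# /dev/null gives an immediate EOF, so the child exits straight away.
print(blocks_on_stdin(subprocess.DEVNULL))  # False
```

This may also explain why running the script by hand completes: an interactive shell run does not go through the same wrapper that is waiting for input.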

Steven Truong

Oct 10, 2014, 3:44:57 PM
to ansible...@googlegroups.com
Hi all,

As a workaround, I tried not to use any facts and disabled fact gathering in my playbook.  I then ran into another strange problem: even the simplest task, such as running a shell command, just hangs...


< TASK: bash | cp /usr/local/bin/help to /usr/local/bin/help.old >
 --------------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||


<mysever> ESTABLISH CONNECTION FOR USER: user
<mysever> REMOTE_MODULE command cp /bin/bash /bin/bash.old #USE_SHELL
<mysever> EXEC ['ssh', '-C', '-tt', '-vvv', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'ControlPath=/home/steven/.ansible/cp/ansible-ssh-%h-%p-%r', '-o', 'StrictHostKeyChecking=no', '-o', 'IdentityFile="/home/steven/.ssh/key-user"', '-o', 'KbdInteractiveAuthentication=no', '-o', 'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', '-o', 'PasswordAuthentication=no', '-o', 'User=user', '-o', 'ConnectTimeout=10', 'mysever', "/bin/sh -c 'mkdir -p $HOME/.ansible/tmp/ansible-tmp-1412960308.87-49429622058592 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1412960308.87-49429622058592 && echo $HOME/.ansible/tmp/ansible-tmp-1412960308.87-49429622058592'"]
<mysever> PUT /tmp/tmpVCdj27 TO /home/user/.ansible/tmp/ansible-tmp-1412960308.87-49429622058592/command
<mysever> EXEC ['ssh', '-C', '-tt', '-vvv', '-o', 'ControlMaster=auto', '-o', 'ControlPersist=60s', '-o', 'ControlPath=/home/steven/.ansible/cp/ansible-ssh-%h-%p-%r', '-o', 'StrictHostKeyChecking=no', '-o', 'IdentityFile="/home/steven/.ssh/key-user"', '-o', 'KbdInteractiveAuthentication=no', '-o', 'PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey', '-o', 'PasswordAuthentication=no', '-o', 'User=user', '-o', 'ConnectTimeout=10', 'mysever', u'/bin/sh -c \'su root -c "/bin/sh -c \'"\'"\'echo SUDO-SUCCESS-jqiqepwwyueomscsvvidoflkvqfxpkff; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /home/user/.ansible/tmp/ansible-tmp-1412960308.87-49429622058592/command; rm -rf /home/user/.ansible/tmp/ansible-tmp-1412960308.87-49429622058592/ >/dev/null 2>&1\'"\'"\'"\'']

On the AWS server:

$ date
Fri Oct 10 19:27:02 UTC 2014
$ ps axuw|grep python
root      7691  0.0  0.0 143044  1056 pts/1    Ss+  16:58   0:00 su root -c /bin/sh -c 'echo SUDO-SUCCESS-jqiqepwwyueomscsvvidoflkvqfxpkff; LANG=en_US.UTF-8 LC_CTYPE=en_US.UTF-8 /usr/bin/python /home/adsymp/.ansible/tmp/ansible-tmp-1412960308.87-49429622058592/command; rm -rf /home/user/.ansible/tmp/ansible-tmp-1412960308.87-49429622058592/ >/dev/null 2>&1'
root     17783  0.0  0.1 331228 13320 ?        SN    2012   0:00 /usr/bin/python -tt /usr/sbin/yum-updatesd

As you can see, I let it run from 16:58 UTC until 19:27 UTC and it is still there.

I would like to know how to fix this as this is a show stopper for me. 

The weird thing is that I can use "-m setup" or "-m shell -a uptime" directly, but the same operations hang when run as tasks in the playbook.

By the way, does the Ansible community have a tool to capture the environment (Ansible facts might do) to provide information for troubleshooting?

Steven.

Steven Truong

Oct 10, 2014, 6:48:20 PM
to ansible...@googlegroups.com
Well, you do not have to guess anymore.  I feel very stupid after spending so much time looking into this.

Apparently I used the wrong options for ansible-playbook....

ansible-playbook base.yml --tags useful -i hosts.awse --limit awse-c --private-key ~/.ssh/key-user -u user -s

This works.

Sorry to all..
Steven.
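
Note for anyone reading the -vvv output earlier in the thread: the hung remote command is wrapped in su root -c, whereas the working invocation above passes -s. As far as I know, in Ansible 1.x -s means --sudo and -S means --su, so one plausible reading (an assumption, not something the poster states) is that the earlier runs used su mode and sat waiting for a root password. Below is a small Python sketch for checking which wrapper a logged command uses. The function name and regular expressions are my own, the su sample is abbreviated from the ps output in the thread, and the sudo sample is hypothetical.

```python
import re

# "su <user> -c ..." or "su - <user> -c ...", not preceded by a word char or '-'.
SU_PATTERN = re.compile(r"(?<![\w-])su\s+(?:-\s+)?\w+\s+-c\b")
# "sudo " as a separate word. Case-sensitive, so the SUDO-SUCCESS marker
# echoed by the wrapped command does not count as a match.
SUDO_PATTERN = re.compile(r"(?<![\w-])sudo\s")


def escalation_wrapper(logged_command):
    """Return "sudo", "su", or None for a command line copied from -vvv or ps output."""
    if SUDO_PATTERN.search(logged_command):
        return "sudo"
    if SU_PATTERN.search(logged_command):
        return "su"
    return None


# Abbreviated from the ps output posted in this thread.
su_sample = "su root -c /bin/sh -c 'echo SUDO-SUCCESS-jqiqepwwyueomscsvvidoflkvqfxpkff'"
# Hypothetical sudo-style command, for contrast only.
sudo_sample = "sudo -u root /bin/sh -c 'echo ok'"

print(escalation_wrapper(su_sample))    # su
print(escalation_wrapper(sudo_sample))  # sudo
```

If the wrapper in your own log is su and you meant to use sudo, check the case of the flag on the command line before digging any deeper.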
