Ansible sometimes fails to get the output from runner._low_level_exec_command, which causes the task to fail


qiu jiawei

Apr 3, 2014, 3:28:24 AM
to ansible...@googlegroups.com
Ansible version 1.5.3

Our playbook looks like this:
debug.ym:

- hosts:
  - cnode463
  tasks:
  - include: roles/conf/tasks/hadoop.yml


hadoop.yml
- name: copy hadoop conf
  sudo: yes
  template: src={{ TEMPLATE_DIR }}/hadoop/{{item}}.j2 dest=/etc/hadoop/conf/{{item}}
  with_items:
    - core-site.xml
    - hdfs-site.xml
    - hdfs-site.private.xml
    - log4j.properties
    - hadoop-env.sh


When we run the playbook, it sometimes fails:

TASK: [copy hbase conf] ******************************************************* 
ok: [cnode463] => (item=hbase-site.xml)
ok: [cnode463] => (item=log4j.properties)
failed: [cnode463] => (item=hbase-env.sh) => {"failed": true, "item": "hbase-env.sh", "parsed": false}

FATAL: all hosts have already failed -- aborting

PLAY RECAP ******************************************************************** 
cnode463                   : ok=1    changed=0    unreachable=0    failed=1   
I debugged the Ansible code and added the code below to print the result of running runner._low_level_exec_command:
 
        print "****"
        print "cmd "+str(cmd)
        print "out "+str(out)
        print "err "+str(err)
        print "____"

In the end I found that _low_level_exec_command sometimes does not capture the output of the command: in the failing runs, both out and err come back empty.

The debug log is below:
****
cmd mkdir -p $HOME/.ansible/tmp/ansible-tmp-1396508595.41-255928955172534 && chmod a+rx $HOME/.ansible/tmp/ansible-tmp-1396508595.41-255928955172534 && echo $HOME/.ansible/tmp/ansible-tmp-1396508595.41-255928955172534
out /home/hadoop/.ansible/tmp/ansible-tmp-1396508595.41-255928955172534

err
____
****
cmd rc=0; [ -r "/etc/hadoop/conf/yarn-site.private.xml" ] || rc=2; [ -f "/etc/hadoop/conf/yarn-site.private.xml" ] || rc=1; [ -d "/etc/hadoop/conf/yarn-site.private.xml" ] && echo 3 && exit 0; (/usr/bin/md5sum /etc/hadoop/conf/yarn-site.private.xml ) || (/sbin/md5sum -q /etc/hadoop/conf/yarn-site.private.xml ) || (/usr/bin/digest -a md5 /etc/hadoop/conf/yarn-site.private.xml ) || (/sbin/md5 -q /etc/hadoop/conf/yarn-site.private.xml ) || (/usr/bin/md5 -n /etc/hadoop/conf/yarn-site.private.xml ) || (/bin/md5 -q /etc/hadoop/conf/yarn-site.private.xml ) || (/usr/bin/csum -h MD5 /etc/hadoop/conf/yarn-site.private.xml ) || (/bin/csum -h MD5 /etc/hadoop/conf/yarn-site.private.xml ) || (echo "${rc}  /etc/hadoop/conf/yarn-site.private.xml")
out
err
____
****
cmd /usr/bin/python /home/hadoop/.ansible/tmp/ansible-tmp-1396508595.41-255928955172534/copy; rm -rf /home/hadoop/.ansible/tmp/ansible-tmp-1396508595.41-255928955172534/ >/dev/null 2>&1
out
err
____
failed: [cnode463] => (item=yarn-site.private.xml) => {"failed": true, "item": "yarn-site.private.xml", "parsed": false}
This problem is appearing more and more frequently. Is it possible to fix it?

Michael DeHaan

Apr 3, 2014, 6:12:38 PM
to ansible...@googlegroups.com
Haven't seen this.

If you can set up a minimal example that can reproduce this one and file a ticket we can help take a look.





qiu jiawei

Apr 8, 2014, 3:44:12 AM
to ansible...@googlegroups.com
I think it would be hard to set up an example, because this problem only appears in one of our production environments. We have never seen it in our test environment.
We use Ansible to monitor ports on the machines, so several ansible-playbook runs may be executing at the same time.
Are there any parameters we should pay attention to when running multiple ansible-playbook processes at the same time?
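
One way to narrow this down outside of Ansible might be a small standalone script that runs the same command many times, several at once, and counts how often stdout comes back empty. This is only a rough sketch, assuming Python 3; the names run_once, count_empty_outputs and LOCAL_CMD are made up for illustration and are not part of Ansible. LOCAL_CMD is a harmless local stand-in; to test the real path you would substitute whatever command reaches the affected host.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in command: runs the local Python interpreter and prints "ok".
# Replace it with a command that actually targets the affected machine.
LOCAL_CMD = [sys.executable, "-c", "print('ok')"]


def run_once(cmd):
    """Run cmd once and return (returncode, stdout, stderr) as text."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out, err = proc.communicate()
    return (
        proc.returncode,
        out.decode("utf-8", "replace"),
        err.decode("utf-8", "replace"),
    )


def count_empty_outputs(cmd, runs=20, parallel=4):
    """Run cmd `runs` times, up to `parallel` at once; count runs with empty stdout."""
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        results = list(pool.map(run_once, [cmd] * runs))
    return sum(1 for _rc, out, _err in results if not out.strip())
```

If the count stays at zero for a plain command but the playbook still fails, the problem is more likely in how Ansible invokes the command than in the machine itself.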

On Friday, April 4, 2014 at 6:12:38 AM UTC+8, Michael DeHaan wrote:

Michael DeHaan

Apr 9, 2014, 9:51:38 AM
to ansible...@googlegroups.com
Can't say for sure, but if you can get it to occur, the -vvvv output *MIGHT* be interesting.

You might also have a .bash_profile type script or MOTD outputting something that looks like JSON and has confused the parser - MOTDs normally don't show up the way we invoke SSH but they did with dropbear (which you very very likely aren't using).

If it's reproducible consistently on the one production machine it should be possible to debug things (though perhaps would require some modification of Ansible).
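
To tell the two possibilities apart (no output at all versus extra output wrapped around the JSON), something like the following could be run over the captured "out" strings. This is a sketch only, not Ansible's actual parser; classify_module_output is a hypothetical helper name.

```python
import json


def classify_module_output(out):
    """Classify raw stdout captured from a remote module run.

    Returns one of: "empty", "json", "json_with_noise", "not_json".
    """
    text = out.strip()
    if not text:
        # Nothing came back at all, as in the debug log earlier in this thread.
        return "empty"
    try:
        json.loads(text)
        return "json"
    except ValueError:
        pass
    # Look for a JSON object surrounded by other output (e.g. a login banner).
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            decoder.raw_decode(text[start:])
            return "json_with_noise"
        except ValueError:
            start = text.find("{", start + 1)
    return "not_json"
```

An "empty" result points at the command's output being lost, while "json_with_noise" would point at a profile script or MOTD printing extra text.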



