Hi everyone,
I am stumped on an issue: when I run a command (/sbin/service jetty start) through ansible, whether from the ansible command line or from a playbook, and with either the shell or the command module, ansible never exits. I am using ansible 0.4. For overall context, this work is intended as part of a web application deployment process. If anyone has any input on this I'd be much obliged.
I'll walk through a command module example and a shell module example, as they are easier to explain than the playbook.
Here are the example command lines:
ansible test2 -D --module-name=shell -a "/sbin/service jetty start"
or
ansible test2 -D -a "/sbin/service jetty start"
Executing either of these commands does the "work" of starting jetty on the remote server, but ansible never exits from the task. Running the same command directly on the target host returns right away with an exit code of 0 (i.e. it works as expected when done manually).
The test2 group contains a single server; the hosts file content is below:
----
[test2]
swup-ua-lt02v.swup
----
The commands above never exit. For example, this command:
[root@swup-mgt-util01v playbooks]# ansible test2 -a "/sbin/service jetty start"
has been running for over an hour. The "work" to start the jetty instance does happen, so I know the command is being executed; it just never finishes. Ansible *will* exit if the init script exits with a non-zero exit code, for example if the service is already running:
[root@swup-mgt-util01v playbooks]# ansible test2 -D --module-name=shell -a "service jetty start"
swup-ua-lt02v.swup | FAILED | rc=1 >>
Starting Jetty: Already Running!
[root@swup-mgt-util01v playbooks]#
Other commands seem to run fine:
[root@swup-mgt-util01v playbooks]# ansible test2 -D -a "/bin/ls -l /usr/share/ansible"
swup-ua-lt02v.swup | success | rc=0 >>
total 152
-rwxr-xr-x 1 root root 5299 Jun 1 16:40 apt
-rwxr-xr-x 1 root root 2761 Jun 1 16:40 async_status
-rwxr-xr-x 1 root root 6141 Jun 1 16:40 async_wrapper
-rwxr-xr-x 1 root root 2687 Jun 1 16:40 command
-rwxr-xr-x 1 root root 1880 Jun 1 16:40 copy
-rwxr-xr-x 1 root root 879 Jun 1 16:40 facter
-rwxr-xr-x 1 root root 948 Jun 1 16:40 failtest
-rwxr-xr-x 1 root root 857 Jun 1 16:40 fetch
-rwxr-xr-x 1 root root 10658 Jun 1 16:40 file
-rwxr-xr-x 1 root root 6053 Jun 1 16:40 git
-rwxr-xr-x 1 root root 4715 Jun 1 16:40 group
-rwxr-xr-x 1 root root 784 Jun 1 16:40 ohai
-rwxr-xr-x 1 root root 972 Jun 1 16:40 ping
-rwxr-xr-x 1 root root 876 Jun 1 16:40 raw
-rwxr-xr-x 1 root root 7199 Jun 1 16:40 service
-rwxr-xr-x 1 root root 14545 Jun 1 16:40 setup
-rw-r--r-- 1 root root 230 Jun 1 16:40 shell
-rwxr-xr-x 1 root root 1944 Jun 1 16:40 slurp
-rwxr-xr-x 1 root root 939 Jun 1 16:40 template
-rwxr-xr-x 1 root root 10640 Jun 1 16:40 user
-rwxr-xr-x 1 root root 11481 Jun 1 16:40 virt
-rwxr-xr-x 1 root root 10451 Jun 1 16:40 yum
The /sbin/service jetty start command itself also runs fine when executed manually on the target host, and exits with a code of 0 right away.
As far as I can tell there is something specific about the service command exiting with a 0 that is causing ansible to never exit. I get the same behavior when calling the /etc/init.d/jetty script directly too.
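One guess that I have not been able to confirm: the init script may leave the Jetty process holding the stdout/stderr pipe it inherited, so whatever reads the command's output keeps waiting for end-of-file even though the script itself has already exited with 0. The sketch below reproduces that pattern locally in plain shell. It is only an illustration under that assumption: `sleep 3` is a stand-in for a daemon, the command substitution is a stand-in for anything that reads a child's output through a pipe, and nothing in it is specific to Jetty or ansible.

```shell
#!/bin/sh
# Case 1: the "init script" backgrounds a long-running child that keeps the
# inherited stdout open. The command substitution reads until every writer
# has closed the pipe, so it blocks for the child's lifetime (about 3 seconds)
# even though "echo started" finished immediately.
start=$(date +%s)
out_slow=$( (sleep 3 &) ; echo "started" )
slow_secs=$(( $(date +%s) - start ))

# Case 2: same child, but detached from the pipe by redirecting its
# stdin/stdout/stderr. The reader sees end-of-file as soon as echo is done.
start=$(date +%s)
out_fast=$( (sleep 3 </dev/null >/dev/null 2>&1 &) ; echo "started" )
fast_secs=$(( $(date +%s) - start ))

echo "inherited pipe: output='$out_slow' waited ${slow_secs}s"
echo "redirected fds: output='$out_fast' waited ${fast_secs}s"
```

In both cases the captured output is the same and the foreground command finishes at once; only the time the reader spends waiting differs. If the jetty init script behaves like case 1, that would be consistent with what I am seeing, but I have not verified it.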
The strace output for ansible in its "wait state" is not overly useful, though I honestly can't read this output very well:
wait4(27480, 0x7fffe2227414, WNOHANG, NULL) = 0
wait4(27485, 0x7fffe2227414, WNOHANG, NULL) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f75fd34a9d0) = 27490
wait4(27480, 0x7fffe2227414, WNOHANG, NULL) = 0
wait4(27485, 0x7fffe2227414, WNOHANG, NULL) = 0
wait4(27490, 0x7fffe2227414, WNOHANG, NULL) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f75fd34a9d0) = 27491
wait4(27480, 0x7fffe2227414, WNOHANG, NULL) = 0
wait4(27485, 0x7fffe2227414, WNOHANG, NULL) = 0
wait4(27491, 0x7fffe2227414, WNOHANG, NULL) = 0
wait4(27490, 0x7fffe2227414, WNOHANG, NULL) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f75fd34a9d0) = 27492
wait4(27480, 0x7fffe2227414, WNOHANG, NULL) = 0
wait4(27485, 0x7fffe2227414, WNOHANG, NULL) = 0
wait4(27491, 0x7fffe2227414, WNOHANG, NULL) = 0
wait4(27492, 0x7fffe2227414, WNOHANG, NULL) = 0
wait4(27490, 0x7fffe2227414, WNOHANG, NULL) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f75fd34a9d0) = 27494
wait4(27485, 0x7fffe2227414, WNOHANG, NULL) = 0
wait4(27492, 0x7fffe2227414, WNOHANG, NULL) = 0
wait4(27494, 0x7fffe2227414, WNOHANG, NULL) = 0
wait4(27480, 0x7fffe2227414, WNOHANG, NULL) = 0
wait4(27490, 0x7fffe2227414, WNOHANG, NULL) = 0
wait4(27491, 0x7fffe2227414, WNOHANG, NULL) = 0
clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x7f75fd34a9d0) = 27497
wait4(27490, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 27490
--- SIGCHLD (Child exited) @ 0 (0) ---
wait4(27491, 0x7fffe22272b4, 0, NULL) = ? ERESTARTSYS (To be restarted)
--- SIGCHLD (Child exited) @ 0 (0) ---
wait4(27491, 0x7fffe22272b4, 0, NULL) = ? ERESTARTSYS (To be restarted)
--- SIGCHLD (Child exited) @ 0 (0) ---
wait4(27491, 0x7fffe22272b4, 0, NULL) = ? ERESTARTSYS (To be restarted)
--- SIGCHLD (Child exited) @ 0 (0) ---
wait4(27491,
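As far as I can read it, the trace ends with ansible in a blocking wait4 on child 27491 that keeps being interrupted and restarted. The trace does not show what any lingering process still has open, so one thing I could check on a Linux target is the file descriptors of the started process under /proc. A minimal sketch of that check, with `sleep 2` as a hypothetical stand-in for the Jetty process (not taken from my actual setup):

```shell
#!/bin/sh
# Linux-only sketch; `sleep 2` is a hypothetical stand-in for the daemon.
# Inside the command substitution, stdout is a pipe. The backgrounded sleep
# inherits it, and readlink reports what the child's fd 1 points at.
# Note the substitution itself only returns once sleep exits (about 2 seconds),
# because the child keeps the pipe open until then.
fd1_target=$( sleep 2 & readlink "/proc/$!/fd/1" )

# A value starting with "pipe:" means the child was holding the pipe open.
echo "child stdout -> $fd1_target"
```

On the target host the equivalent check would be ls -l /proc/<pid>/fd with the Jetty process id filled in.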
If I can provide any other information I'd be happy to.
Thanks for everyone's time,
Jonathan