Background task that runs 'forever'


Will Thames

Nov 1, 2013, 12:04:08 AM
to ansible...@googlegroups.com
Is there a way to start a script that runs forever - without passing a phenomenally large number to async?

The problem with async is that after the timeout, it calls os.killpg on the process group with SIGKILL, so no matter how much I nohup, detach, or subshell the child task, it's still in the same process group, even if it has a PPID of 1.
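A minimal sketch of the process-group point above, assuming a POSIX system with `sleep` and `nohup` on the PATH. The helper `pgid_of_child` is hypothetical and not part of Ansible. It shows that nohup only changes how SIGHUP is handled and leaves the process group alone, whereas starting the child in a new session moves it out of the caller's group.

```python
import os
import subprocess


def pgid_of_child(argv, **popen_kwargs):
    """Start argv, report its process group ID, then clean the child up."""
    proc = subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        **popen_kwargs,
    )
    try:
        return os.getpgid(proc.pid)
    finally:
        proc.kill()
        proc.wait()


own = os.getpgrp()

# nohup ignores SIGHUP but does not touch the process group.
same_group = pgid_of_child(["nohup", "sleep", "30"])

# start_new_session=True makes the child call setsid(), which also
# puts it in a brand-new process group.
new_group = pgid_of_child(["sleep", "30"], start_new_session=True)

print(same_group == own, new_group == own)
```

A signal sent with os.killpg to the caller's group would reach the first child but not the second.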

I don't want to have async running forever either, as then there are three long running ansible processes just hanging around making the process table look untidy.

I have tried having the script run the task under nohup and then calling the command module without async, but for some reason it then hangs in run_command.

A fully working example (ok, it only sleeps for 10000 seconds but that will do) is at:
https://gist.github.com/willthames/7260782

It's worth noting that even after the KeyboardInterrupt, the sleep process is running.

Is there a better way of doing this? Are there arguments against having os.killpg send a different signal? (SIGHUP seems like an obvious one that I could protect against with nohup or trap.)

Thanks,
Will

Will Thames

Nov 1, 2013, 1:41:42 AM
to ansible...@googlegroups.com
I've tested that changing the signal to SIGHUP works in the way I'd expect for the async module, and would be willing to submit a pull request if there's any likelihood it would be accepted. I have no idea why the popen.communicate blocks on the script that calls the nohup background task when not using async.
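One possible explanation for the blocking communicate call, not confirmed anywhere in this thread: `Popen.communicate()` reads each pipe until end-of-file, and a background child that inherits the pipe keeps its write end open even after the launching shell has exited. The sketch below assumes a POSIX `sh` and `sleep`; the helper `timed_communicate` is hypothetical.

```python
import subprocess
import time


def timed_communicate(shell_command):
    """Run a shell command with piped output; return seconds spent waiting."""
    start = time.monotonic()
    proc = subprocess.Popen(
        shell_command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    proc.communicate()  # returns only once every holder of the pipes closes them
    return time.monotonic() - start


# The background sleep inherits stdout/stderr, so the pipes stay open
# until it exits, even though the shell itself returns immediately.
inherits_pipe = timed_communicate("sleep 2 &")

# Redirecting the child's stdio means nothing else holds the pipes open.
redirected = timed_communicate("sleep 2 >/dev/null 2>&1 </dev/null &")

print(f"inherited pipes: {inherits_pipe:.1f}s, redirected: {redirected:.1f}s")
```

If this is the cause, redirecting the background task's stdin, stdout and stderr inside the script would let the non-async command module return.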

Will

Alex Rodenberg

Nov 1, 2013, 8:53:21 AM
to ansible...@googlegroups.com
If you set poll to 0 on an async call, it should carry on.

http://www.ansibleworks.com/docs/playbooks_async.html

Or do you need to do stuff after it completes? 

Will Thames

Nov 1, 2013, 9:07:48 AM
to ansible...@googlegroups.com
I agree that that is what the docs say, and what would be desirable to happen.

What I'm saying though is that the task gets killed by an os.killpg call when the timeout expires. I'm happy with the killpg cleaning up the async_wrapper module and associated ansible-playbook related processes, but it kills the processes that the async command creates too. Because it sends SIGKILL, I can't trap it.
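To make the behaviour concrete, here is a rough sketch of a watchdog of the kind described. It is not the actual async_wrapper code; the function name `run_with_watchdog` and its parameters are assumptions for illustration. It starts a command in its own process group, checks on it at a fixed interval, and signals the whole group once the time limit passes.

```python
import os
import signal
import subprocess
import time


def run_with_watchdog(argv, time_limit, step=5.0, kill_signal=signal.SIGKILL):
    """Run argv in its own process group; signal the group on timeout."""
    # With start_new_session=True the child leads a new group,
    # so its process group ID equals its PID.
    proc = subprocess.Popen(argv, start_new_session=True)
    deadline = time.monotonic() + time_limit
    while proc.poll() is None:
        if time.monotonic() >= deadline:
            # Every process still in the group receives the signal,
            # including anything the command started in the background.
            os.killpg(proc.pid, kill_signal)
            proc.wait()
            break
        time.sleep(step)
    return proc.returncode


timed_out = run_with_watchdog(["sleep", "30"], time_limit=0.3, step=0.1)
finished = run_with_watchdog(["true"], time_limit=5.0, step=0.1)
print(timed_out, finished)
```

A negative return code means the command was terminated by that signal number. Because SIGKILL cannot be caught or ignored, nothing left in the group can opt out.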

I'd be happy to see a working example of any approach where a process continues after the playbook ends and any timeouts expire. My gist is, I think, a very simple example (adding async: 5 and poll: 0 just fails in a different way) that hopefully someone has a simple fix for.

Will

Michael DeHaan

Nov 1, 2013, 10:46:46 PM
to ansible...@googlegroups.com
It's fine to insert an insanely high value to async.

Heat death of the universe is fine if you want to.

Nothing to worry about.




--
You received this message because you are subscribed to the Google Groups "Ansible Project" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ansible-proje...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.



--
Michael DeHaan <mic...@ansibleworks.com>
CTO, AnsibleWorks, Inc.
http://www.ansibleworks.com/

Will Thames

Nov 1, 2013, 10:49:11 PM
to ansible...@googlegroups.com
But then the three ansible processes hang around in the background making the process list untidy, not to mention making unnecessary checks every five seconds as to whether the process has timed out yet. 

Any reason why the killpg sends SIGKILL rather than SIGHUP?

Will


Michael DeHaan

Nov 1, 2013, 10:56:25 PM
to ansible...@googlegroups.com
Those processes will die when the operation dies.

You can also change the poll to any interval you want.

If you don't wish to poll, use async with poll set to 0 and fire and forget.

Not sure what you mean about the killpg question.


Will Thames

Nov 1, 2013, 11:01:35 PM
to ansible...@googlegroups.com
This is for an operation that is supposed to live forever.

The 5-second poll is hardcoded and happens even when poll is set to 0.

Also hardcoded is the signal sent to the process group, which is SIGKILL. If it were SIGHUP, the behaviour would be pretty much identical, except that tasks could ignore the signal using nohup or trap (the rest of the process group would still die, as desired).
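The difference between the two signals can be sketched as follows, assuming a POSIX system with `sleep` on the PATH. The helpers `spawn_in_group` and `compare_signals` are hypothetical. Two children share one process group; one ignores SIGHUP the way a process run under nohup would.

```python
import os
import signal
import subprocess
import time


def spawn_in_group(pgid=0, ignore_hup=False):
    """Start `sleep 30` in process group pgid (0 means: lead a new group)."""
    def child_setup():
        os.setpgid(0, pgid)
        # Like nohup: an ignored signal stays ignored across exec.
        handler = signal.SIG_IGN if ignore_hup else signal.SIG_DFL
        signal.signal(signal.SIGHUP, handler)
    return subprocess.Popen(["sleep", "30"], preexec_fn=child_setup)


def compare_signals():
    plain = spawn_in_group()
    stubborn = spawn_in_group(pgid=plain.pid, ignore_hup=True)

    # SIGHUP reaches both members; only the one with default handling dies.
    os.killpg(plain.pid, signal.SIGHUP)
    plain.wait()
    time.sleep(0.2)
    survived_hup = stubborn.poll() is None

    # SIGKILL cannot be ignored, so the remaining member dies too.
    os.killpg(plain.pid, signal.SIGKILL)
    stubborn.wait()
    return plain.returncode, survived_hup, stubborn.returncode


result = compare_signals()
print(result)
```

The first and last values are the negated signal numbers that ended each child; the middle value reports whether the SIGHUP-ignoring child was still alive after the group received SIGHUP.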

Will 

Michael DeHaan

Nov 1, 2013, 11:05:22 PM
to ansible...@googlegroups.com
I think it's reasonable to just do a straight exec in the fire and forget case but we'll have to see about implications -- there's no need for the status watcher in that case, but things elsewhere in Runner might need to change.

I don't believe SIGHUP is the proper fix; the kill is there to kill the beast when it expires.

Please file a ticket and reference this thread.



Will Thames

Nov 1, 2013, 11:09:56 PM
to ansible...@googlegroups.com
I've tested execution with SIGHUP, and that is sufficient to kill the other processes when it expires. The only processes that would survive are ones that trap it or run under nohup.

I'll raise the issue though. 

Michael DeHaan

Nov 2, 2013, 9:00:52 AM
to ansible...@googlegroups.com
Others may wish to see discussion here:
