Interesting questions, Neha. As soon as you deviate from the default
behavior of running tasks in lock-step across all the play hosts,
the plane of intersection between "behaviors I want" and "behaviors
I get" grows incredibly fast.

When you say "preserve the concurrent behavior," which behavior
exactly are you referring to? If it's the default linear strategy
that completes a task across all hosts before proceeding to the next,
then I think the answer is "no."

Let me ask: does a person watching the ansible-playbook output
constitute patch failure notification? If that's the case, what does
that person then do? Can that be automated as well?

Once you start an async task for a given host, are there other tasks
being run for that host before you start the async_status task? If
not, you aren't gaining anything by running those patch tasks async
(at least, as I understand it; someone correct me if I'm wrong). In
fact, it may be hurting you, as each async task is tying up a worker,
as is the async_status task.

Consider taking another approach: in your patch playbook, make the
play that does the actual patching run with "strategy: free". That
will allow each host to run through that play's tasks as fast as it
can - other constraints being considered - without regard to which
tasks other hosts are still executing. Furthermore, put the patch
task in a block, and in that block include a rescue section. Tasks
(plural; you can have lots of them) in the rescue section are only
run on the hosts that failed a task in the main body of the block. In
that rescue section, invoke some sort of mechanism to notify the
admin that host blah-dee-blah failed its patching. It'll also show up
in the ansible-playbook log, but so will a lot of other stuff,
screens scroll, humans blink, etc. An alternative dedicated mechanism
(which could be simply appending a line to a file listing failed
hosts, maybe with timestamps) isn't as likely to hide unicorns in the
forest.
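
Here is a rough sketch of that play, for what it's worth. The group
name, the package task standing in for your real patch task, and the
log path are all made up for illustration, and I haven't run this
against your environment, so treat it as a starting point only:

```yaml
# Sketch only: the group name, patch task, and log path are assumptions.
- name: Patch hosts independently of one another
  hosts: patch_targets            # hypothetical inventory group
  become: true
  strategy: free                  # each host proceeds at its own pace
  tasks:
    - name: Patch, and record any host that fails
      block:
        # Stand-in for whatever your real patch task(s) are.
        - name: Apply all available package updates
          ansible.builtin.package:
            name: "*"
            state: latest
      rescue:
        # Runs only on hosts that failed a task in the block above.
        - name: Append the failed host to a list on the control node
          ansible.builtin.lineinfile:
            path: /tmp/patch_failures.log   # hypothetical path
            line: "{{ now(utc=true, fmt='%Y-%m-%dT%H:%M:%SZ') }} {{ inventory_hostname }} failed patching"
            create: true
          delegate_to: localhost
          become: false
          throttle: 1             # intended to keep concurrent appends from colliding
```

One thing to watch: a host whose rescue section completes is counted
as "rescued" rather than "failed" in the play recap. If you still want
it marked failed, end the rescue section with an ansible.builtin.fail
task.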

I'm reasonably convinced this will actually do what you're trying to
achieve with async and async_status. But then Ansible manages to
throw surprises at me whenever I try something clever, so test, test,
test.

Let us know how you get on. Good luck!
--
Todd