workaround for serial: 1 failures stopping the entire playbook?


Andrew Caldwell

Apr 2, 2019, 11:41:23 AM
to Ansible Project
Hello,

This has probably been addressed 1000 times before, but I can't find an answer (or whether it is even possible). When running a play within a playbook with serial: 1, I want a task failure that is fatal for one node to affect only that node: Ansible should skip the rest of the play for that node and move on to the next node, rather than stopping before the remaining nodes have run.

I have a scenario where I want to perform OS patching on a large-ish group of servers in a Hadoop cluster with no downtime to the cluster itself. So I am using serial: 1 when performing the patching tasks for each node: put it in maintenance mode, take it out of the cluster, patch, reboot, re-join the cluster, and do some basic health checks.

However, if any one of these tasks fails in serial: 1 mode, Ansible considers the entire play failed and will not run against any remaining nodes. Since this is a large cluster (50 nodes), a failure on a single node isn't a showstopper and shouldn't stop the rest of the nodes from being patched.
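
For anyone who wants to reproduce the behavior described above, a minimal sketch might look like the following. The group name hadoop_workers and the forced failure are assumptions made for illustration; they are not from the original playbook.

```yaml
# Sketch only: the group name and the simulated failure are hypothetical.
- name: Demonstrate a serial 1 play stopping at the first failed host
  hosts: hadoop_workers        # hypothetical inventory group
  serial: 1
  gather_facts: false
  tasks:
    - name: Simulate a patching step that fails on the first node only
      fail:
        msg: "Simulated patch failure on {{ inventory_hostname }}"
      when: inventory_hostname == groups['hadoop_workers'][0]

    - name: Stand in for the remaining patching steps
      debug:
        msg: "Patched {{ inventory_hostname }}"
```

With serial: 1 each batch holds a single host, so one failed host means every host in that batch has failed, and Ansible ends the play before it reaches the later nodes.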

I'd like to know if there is a way to keep Ansible from stopping an entire play for all nodes when a single node fails while running with serial: 1. From what I've read online, there doesn't seem to be a way to do this short of setting serial: 2 or higher, but I thought I'd ask.


Brian Coca

Apr 3, 2019, 3:49:57 PM
to Ansible Project
There are several ways. The simplest might be putting the whole thing
in a 'block' with a 'rescue' that always succeeds, so the host is not
marked as failed and the play goes on to the next host.
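
A minimal sketch of what that suggestion could look like, assuming an RPM-based OS (hence the yum module) and Ansible 2.7 or later (for the reboot module). The group name and the script paths are hypothetical placeholders, not taken from the original playbook.

```yaml
# Sketch only: group name and script paths are hypothetical placeholders.
- name: Patch cluster nodes one at a time
  hosts: hadoop_workers        # hypothetical inventory group
  serial: 1
  become: true
  tasks:
    - name: Patch one node without failing the whole play
      block:
        - name: Put the node in maintenance mode
          command: /usr/local/bin/enter-maintenance   # hypothetical script

        - name: Apply OS updates (assumes a yum-based distribution)
          yum:
            name: '*'
            state: latest

        - name: Reboot and wait for the node to come back
          reboot:

        - name: Run basic health checks
          command: /usr/local/bin/health-check        # hypothetical script
      rescue:
        # This task succeeds, so the host is not counted as failed.
        - name: Record the failure and let the play move on
          debug:
            msg: "{{ ansible_failed_task.name }} failed on {{ inventory_hostname }}; continuing with the next host."
```

Because every patching step sits inside the block, a failure skips the remaining steps for that node only. Note that rescue runs for failed tasks but not for unreachable hosts, so a node that drops off the network may still end the play unless that case is handled separately.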



--
----------
Brian Coca

Andrew Caldwell

Apr 10, 2019, 1:07:51 PM
to Ansible Project
Brian,

Thanks for the reply on this. I will definitely test this out in my plays.

Andrew

Rob Wagner

Oct 22, 2019, 10:37:28 AM
to Ansible Project
Hey Andrew - were you able to get anywhere with this? I tried adding a block/rescue without any luck. I've been searching all morning for a way to make Ansible move on to the next host in a serial strategy even if one task on one host fails. I'm thinking it's not possible.

Rob

Kai Stian Olstad

Oct 22, 2019, 11:32:57 AM
to ansible...@googlegroups.com
On 22.10.2019 16:37, Rob Wagner wrote:
> Searching all morning for a way to make
> ansible move onto the next host in a serial strategy even if one task on
> one host fails. I'm thinking it's not possible.

It is possible.


--
Kai Stian Olstad