multimaster salt (minion restart issues when one master is stopped)


dean.s...@eduserv.org.uk

Sep 29, 2016, 9:21:37 AM
to Salt-users
Hi,

I'm running 2015.5.11 on both masters and minions.

If I try to start salt-minion on a minion while one of the salt-masters is stopped, salt-minion fails to start correctly: the minion is unresponsive from the other (running) master until I restart the stopped salt-master.

Are there settings I need to change, or do I need to upgrade?

Best Regards

Charles Baker

Sep 29, 2016, 5:31:13 PM
to salt-...@googlegroups.com
We've arrived at the following for our minions. Hope this helps.

[ec2-user@i-61e65ef1 minion.d]$ pwd
/etc/salt/minion.d

[ec2-user@i-61e65ef1 minion.d]$ sudo cat masters.conf
master:
  - 10.145.0.184
  - 10.145.1.190
master_type: failover
master_shuffle: True
master_alive_interval: 30

[ec2-user@i-61e65ef1 minion.d]$ cat recon.conf
# These settings are valid for SaltStack >= 2014.7.0

# Number of consecutive SaltReqTimeoutError that are acceptable when trying to
# authenticate.
auth_tries: 10

# If authentication fails due to SaltReqTimeoutError, continue without stopping the
# minion.
auth_safemode: True

# Ping Master to ensure connection is alive (minutes).
ping_interval: 30

# If the minion hits an error that is recoverable, restart the minion.
restart_on_error: True

# Let's have all minions reconnect within a 60 second timeframe on a disconnect.
# Each minion will have a randomized reconnect value between 'recon_default'
# and 'recon_default + recon_max', which in this example means between 1000ms
# and 60000ms (or between 1 and 60 seconds). The generated random-value will be
# doubled after each attempt to reconnect.
recon_default: 1000
recon_max: 59000
recon_randomize: True

[ec2-user@i-61e65ef1 minion.d]$
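
As a side note, these drop-in files only take effect after a minion restart. A minimal sketch, assuming a systemd-based box (use "service salt-minion restart" on older init systems):

sudo systemctl restart salt-minion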





--
Charles H. Baker
864.990.1297
Knowing is not enough; we must apply. Willing is not enough; we must do. Bruce Lee

Megan Wilhite

Sep 29, 2016, 5:50:33 PM
to Salt-users
I know there were a lot of improvements to multi-master, which I believe landed in 2015.8.11 and up. I'm fairly certain that if you upgraded your Salt versions you would not run into these issues.
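
For example, a rough sketch of the upgrade on a yum-based system, assuming the official SaltStack repo for your platform is already configured (package names and repo setup may differ):

sudo yum clean expire-cache   # refresh repo metadata
sudo yum update salt-minion   # on minions
sudo yum update salt-master   # on masters
salt-minion --version         # confirm the installed version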

dean.s...@eduserv.org.uk

Sep 30, 2016, 4:01:25 AM
to Salt-users
Thanks,

When I try this, I get the following in the minion log:

[salt.minion                              ][CRITICAL][18472] 'master_type' set to 'failover' but 'retry_dns' is not 0. Setting 'retry_dns' to 0 to failover to the next master on DNS errors.

Should I worry about that?
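
(From the message itself it looks like failover mode forces retry_dns to 0 so the minion moves on to the next master instead of retrying DNS lookups. If I wanted to silence the warning, I suppose I could set it explicitly; a sketch, assuming the failover settings live in /etc/salt/minion.d/masters.conf:

# /etc/salt/minion.d/masters.conf
master_type: failover
retry_dns: 0    # don't retry failed DNS lookups; fail over to the next master instead
)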

dean.s...@eduserv.org.uk

Sep 30, 2016, 5:07:11 AM
to Salt-users
OK, looking at the behaviour:

With one master's salt-master stopped, the minion starts and connects to the other master.
Unfortunately, I'm then only able to manage the minion from the master it originally connected to. Even after starting the stopped salt-master, commands from the other one just time out.

If I remove the failover option I can use both masters, but the minion's initial connection dialog is with one master only, and if that master is not available the other one just gets timeouts.
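
For reference, the non-failover config is just the master list on its own; a sketch with hypothetical addresses:

master:
  - master1.example.com
  - master2.example.com
# no master_type: failover here; the minion connects to every listed master,
# so either master can target it for as long as that master is reachable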

I use a unison script to replicate between the two masters.
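
Roughly along these lines (hypothetical hostname and paths):

# push the state and pillar trees to the second master; -batch runs non-interactively
unison /srv/salt   ssh://salt-master-2//srv/salt   -batch
unison /srv/pillar ssh://salt-master-2//srv/pillar -batch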

dean.s...@eduserv.org.uk

Sep 30, 2016, 6:11:02 AM
to Salt-users
The Salt repo for 2015.8+ depends on a python-tornado version newer than the one supplied by CentOS 7.
I use yum priorities to help prevent repos from overwriting base packages, and I've yet to determine whether there are any unforeseen consequences of using that package from the Salt repo.

Error: Package: salt-2015.8.12-2.el7.noarch (salt-2015.8)
           Requires: python-tornado >= 4.2.1
           Available: python-tornado-2.2.1-8.el7.noarch (base)
               python-tornado = 2.2.1-8.el7
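
If using the Salt repo's python-tornado does turn out to be safe, one way to let it through would be to give the Salt repo a higher priority than base. A sketch, assuming yum-plugin-priorities is installed and the repo file is something like /etc/yum.repos.d/salt-2015.8.repo (the exact file name may differ):

[salt-2015.8]
# ...existing name/baseurl/gpgkey lines...
priority=10   # lower number wins; repos without an explicit priority default to 99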