ping_interval vs. master_alive_interval


Maik Ender

Mar 29, 2023, 11:47:17 AM
to Salt-users

TL;DR:

What is the difference between the ping_interval and master_alive_interval settings? They seem to do the same thing.


Long version: 

I run a Salt fleet of around 600 minions, most of them Raspberry Pis. Since adding the last 100 minions, I often get connection timeouts when running state.apply or similar commands:


```
salt.exceptions.SaltReqTimeoutError: Salt request timed out. The master is not responding. You may need to run your command with `--async`, the CLI tool will print the job id (jid) and exit immediately without listening for responses. You can then use `salt-run jobs.lookup_jid` to look up the results of the job in the job cache later.
```

The server should not be the bottleneck (8 CPUs, 16 GB RAM): I doubled its specs a few days ago without any improvement.


Within the minion config, I set the following:


```
ping_interval: 1

mine_functions:
  network.ip_addrs:
    - type: private

master_alive_interval: 30
```


What is the difference between the ping_interval and master_alive_interval settings? They seem to do the same thing.


Also, there are a lot of minion_ping events on the event bus.


Thanks

Maik

Phipps, Thomas

Mar 29, 2023, 12:53:11 PM
to salt-...@googlegroups.com
ping_interval makes the minion send a ping event to the master, as a way of keeping the connection alive. master_alive_interval runs status.master on the minion to check whether the master is still connected and, if it isn't, to reconnect to the failover master.

ping_interval is why you are seeing those minion_ping events. With 600 minions you are flooding your master, as it tries to deal with 600 events every second, not counting the events from anything else.


Maik Ender

Mar 29, 2023, 5:44:19 PM
to Salt-users
Many thanks. According to the documentation, I think the unit for ping_interval is minutes, while master_alive_interval is in seconds. So I assumed that 600 pings per minute, about 10 per second, would be a small load.

However, is it safe to use only master_alive_interval and disable ping_interval? My minions are all IoT devices, some of which have an unreliable connection. Would a minion reconnect to the master once it is back on the network?

Phipps, Thomas

Mar 29, 2023, 8:01:39 PM
to salt-...@googlegroups.com
You're right, ping_interval is in minutes, not seconds. So many settings are in seconds that they get mixed up sometimes. I would actually stop using master_alive_interval instead, as ping_interval will restart the minion if it can't reach the master, and master_alive_interval is better suited to TCP and multi-master configs.
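
As a rough sketch of that suggestion, the minion config could look like the following. The 5-minute value is only an illustrative assumption and does not come from this thread; ping_interval is given in minutes, and master_alive_interval stays at its default of 0 (disabled) when it is left out.

```yaml
# Minion config sketch (e.g. /etc/salt/minion); the value is an illustrative assumption.
ping_interval: 5          # minutes between pings sent to the master
# master_alive_interval is omitted on purpose, so it keeps its default of 0 (disabled)
```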

You might also want to increase worker_threads slightly, to say 10. Do not set worker_threads above one and a half times the number of CPUs.
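
For the 8-CPU master described earlier in the thread, that rule gives an upper bound of 12, so a master config along these lines would fit. This is a sketch under that assumption, not a tuned recommendation.

```yaml
# Master config sketch (e.g. /etc/salt/master)
worker_threads: 10        # Salt's default is 5; upper bound here is 8 CPUs * 1.5 = 12
```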
