Salt master losing minion connectivity after some time

Manikanta G

Nov 16, 2016, 7:27:39 AM
to Salt-users
Hi,

I'm able to run 'test.ping' successfully immediately after the minion service is started. But after some time, 'test.ping' stops working and returns "Minion did not return. [No response]". If I restart the minion service, the ping works again for a while. This behavior is consistently reproducible; it is not an intermittent issue.


tm@ubuntu-16:~$ sudo salt '*' test.ping
7704550179275276290:
    True
tm@ubuntu-16:~$ sudo salt '*' test.ping
7704550179275276290:
    Minion did not return. [No response]
# restarted the minion service now
tm@ubuntu-16:~$ sudo salt '*' test.ping
7704550179275276290:
    True
tm@ubuntu-16:~$ date
Wed Nov 16 10:21:36 IST 2016
# tested after 5 min, and now it is not working
tm@ubuntu-16:~$ sudo salt '*' test.ping
7704550179275276290:
    Minion did not return. [No response]
tm@ubuntu-16:~$ date
Wed Nov 16 10:26:03 IST 2016


As the runs above show, the minion ping worked at first and then failed about 5 minutes later. I've seen the same behavior with both 2016.3 and 2016.3.4.

I've tried everything mentioned in the troubleshooting documentation. I tried running the minion in the foreground with debug logging (salt-minion -l debug), but when I run 'test.ping' on the master, nothing is logged on the minion. However, salt-call (salt-call -l debug state.apply) works fine. Even after this salt-call, the ping still fails.

Can someone please let me know whether this is a config issue or a bug? I've been stuck on this for the last couple of weeks.



Master versions-report: (DigitalOcean droplet)

tm@ubuntu-16:~$ salt-master --versions-report
Salt Version:
           Salt: 2016.3.4

Dependency Versions:
           cffi: Not Installed
       cherrypy: 3.5.0
       dateutil: 2.4.2
          gitdb: 0.6.4
      gitpython: 1.0.1
          ioflo: Not Installed
         Jinja2: 2.8
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: 1.0.3
   msgpack-pure: Not Installed
 msgpack-python: 0.4.6
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
         pygit2: Not Installed
         Python: 2.7.12 (default, Jul  1 2016, 15:12:24)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.2.0
           RAET: Not Installed
          smmap: 0.9.0
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4

System Versions:
           dist: Ubuntu 16.04 xenial
        machine: i686
        release: 4.4.0-38-generic
         system: Linux
        version: Ubuntu 16.04 xenial


Minion versions-report: (Oracle VirtualBox 5.1.4 guest, on a Windows 8.1 64-bit host)

manikanta@bo-server-1:~$ salt-minion --versions-report
Salt Version:
           Salt: 2016.3.4

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: 2.4.2
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.8
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: 1.0.3
   msgpack-pure: Not Installed
 msgpack-python: 0.4.6
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
         pygit2: Not Installed
         Python: 2.7.12 (default, Jul  1 2016, 15:12:24)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.2.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4

System Versions:
           dist: Ubuntu 16.04 xenial
        machine: x86_64
        release: 4.4.0-47-generic
         system: Linux
        version: Ubuntu 16.04 xenial


Master config:

publish_port: 30000

ret_port: 30001

external_auth:
  pam:
    tm:
      - .*
      - '@wheel'
      - '@runner'

file_roots:
  base:
    - /srv/data
    - /srv/salt
    - /srv/formulas/sun-java-formula
    - /srv/formulas/tomcat-formula
    - /srv/formulas/mysql-formula

hash_type: sha256

pillar_roots:
  base:
    - /srv/pillar

rest_cherrypy:
  host: 127.0.0.1
  port: 8000
  ssl_crt: /data/localhost-cert.pem
  ssl_key: /data/localhost-key.pem
  disable_ssl: true


Minion config:

id: 7704550179275276290
master_tries: -1
auth_tries: 20
hash_type: sha256
master_port: 30001

Note: the master setting (not shown here) points to the correct domain, and it resolves fine.
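
Beyond name resolution, a related check is whether the minion can actually open TCP connections to the two master ports configured above (30000 and 30001). Below is a minimal Python sketch of such a check. The helper name can_connect is made up for illustration and is not part of Salt; it uses only the standard socket module.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper for illustration only; not part of Salt.
    """
    try:
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Covers refused connections, timeouts, and DNS failures.
        return False
    sock.close()
    return True
```

From the minion, one could call can_connect("salt-master.example.com", 30000) and can_connect("salt-master.example.com", 30001), where salt-master.example.com is a placeholder for the real master address. A True result only shows that new connections can be opened; it says nothing about whether a long-idle connection is still alive.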



Thanks,
Manikanta

Manikanta G

Nov 18, 2016, 9:01:11 PM
to Salt-users
Despite my detailed description of the issue I'm facing, I didn't get any solution or tip from anyone here for 2 days. For a while I considered removing the Salt dependency and working on another approach, but it is already late in the game and I can't risk more time. So I had to find a solution myself. I looked through similar issues on the Salt GitHub and found a similar issue and solution here: https://github.com/saltstack/salt/issues/6231#issuecomment-200562537

Basically, I've added the following to my minion conf:

tcp_keepalive: True
tcp_keepalive_idle: 60

This seems to have fixed the issue: the master can ping the minion even after a long time. My complete minion config is below:

Minion config: /etc/salt/minion.d/my-app.conf

id: 7704550179275276290
master_tries: -1
auth_tries: 20
hash_type: sha256
master_port: 30001
recon_default: 1000
recon_max: 59000
tcp_keepalive: True
tcp_keepalive_idle: 60
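
For anyone wondering what the two keepalive options amount to at the socket level, here is a rough Python sketch. It assumes (I have not checked the Salt source) that the ZeroMQ transport ends up asking the OS for the equivalent of SO_KEEPALIVE plus a 60-second idle time. The helper names are made up for illustration, and TCP_KEEPIDLE is only set on platforms where Python exposes it (for example Linux).

```python
import socket


def enable_keepalive(sock, idle_seconds=60):
    """Turn on TCP keepalive and, where supported, set the idle time.

    Hypothetical helper; it mirrors tcp_keepalive: True and
    tcp_keepalive_idle: 60 under the assumption stated above.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE is not available everywhere; skip it if missing.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    return sock


def keepalive_settings(sock):
    """Return (enabled, idle_seconds); idle_seconds is None if unsupported."""
    enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
    idle = None
    if hasattr(socket, "TCP_KEEPIDLE"):
        idle = sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE)
    return enabled, idle
```

With keepalive on, the OS sends probes once the connection has been idle for the configured time, so a NAT device or firewall between minion and master is less likely to silently drop the idle connection. That would be consistent with the ping failing after a few minutes without it.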

I still think the Salt community is active, and I hope I'm not wrong about that :)

Thanks,
Manikanta