All minions can no longer keep a connection to master 3006.8


Dave Macias

Jul 16, 2024, 10:02:01 AM
to Salt-users
Good day everyone,

As the subject says, I cannot get a single minion to keep a connection to the master. I am struggling, so I am posting here now. I use IPv6 to connect the minions to the master, but IPv4 did not work either. I can telnet to ports 4505 and 4506 from the minion without issue. After I accept the minion key, the minion hangs for a little while, then loses its connection to the master, and I cannot understand why... (╯°□°)╯︵ ┻━┻
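The telnet check described above can also be scripted. This is a minimal sketch using only the Python standard library; `port_is_open` is a hypothetical helper name, not part of Salt. It only confirms that a TCP connection can be opened, which is the same thing telnet shows, so a passing result does not rule out problems at a higher layer.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection resolves the name and tries each returned
        # address (IPv6 or IPv4) in turn until one connects.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

From a minion, one could call `port_is_open("salt.example.com", 4505)` and the same for 4506, where `salt.example.com` is a placeholder for the real master address.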

1x salt-master 3006.8 running in Docker with Python 3.11 (I also tried my Docker instance on a separate host, with the same results)

Salt Version:
          Salt: 3006.8
 
Python Version:
        Python: 3.11.9 (main, Jul 10 2024, 19:10:15) [GCC 13.2.1 20240309]
 
Dependency Versions:
          cffi: 1.17.0rc1
      cherrypy: 18.10.0
      dateutil: 2.9.0.post0
     docker-py: Not Installed
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.4
       libgit2: Not Installed
  looseversion: 1.3.0
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 1.0.8
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 23.2
     pycparser: 2.22
      pycrypto: Not Installed
  pycryptodome: 3.20.0
        pygit2: Not Installed
  python-gnupg: Not Installed
        PyYAML: 6.0.1
         PyZMQ: 26.0.3
        relenv: Not Installed
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.5
 
Salt Extensions:
   salt-nornir: 0.21.0
 
System Versions:
          dist: alpine 3.20.1
        locale: utf-8
       machine: x86_64
       release: 5.14.0-427.18.1.el9_4.x86_64
        system: Linux
       version: Alpine Linux 3.20.1

All my minions run Rocky Linux 9, and Salt was installed with pip in a venv (I also tried deleting the .venv and recreating it, with the same results).

Salt Version:
          Salt: 3006.8
 
Python Version:
        Python: 3.11.5 (main, Sep  7 2023, 00:00:00) [GCC 11.4.1 20230605 (Red Hat 11.4.1-2)]
 
Dependency Versions:
          cffi: 1.16.0
      cherrypy: 18.10.0
      dateutil: 2.9.0.post0
     docker-py: 7.1.0
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 3.1.4
       libgit2: Not Installed
  looseversion: 1.3.0
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 1.0.8
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     packaging: 24.1
     pycparser: 2.22
      pycrypto: Not Installed
  pycryptodome: 3.20.0
        pygit2: Not Installed
  python-gnupg: 0.5.2
        PyYAML: 6.0.1
         PyZMQ: 26.0.3
        relenv: Not Installed
         smmap: Not Installed
       timelib: 0.3.0
       Tornado: 4.5.3
           ZMQ: 4.3.5
 
System Versions:
          dist: rocky 9.2 Blue Onyx
        locale: utf-8
       machine: x86_64
       release: 5.14.0-284.30.1.el9_2.x86_64
        system: Linux
       version: Rocky Linux 9.2 Blue Onyx



Hopefully, you folks can see something I don't :(

At a loss here, any input is much appreciated!

Best,
Dave


 

Dave Macias

Jul 16, 2024, 2:17:04 PM
to Salt-users
Fixed.

After we did a packet capture, we noticed that the salt master was doing a DNS query for its configured `master_id`.
The master ID was just a dummy name and not the FQDN used by the minions.
We changed it to the FQDN and the minions connected just fine.
Unfortunately, that behavior was not reflected in the debug logs.
Happy it's working again.
┬─┬ノ( º _ ºノ)
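
Since the fix came down to whether the name configured as `master_id` resolves in DNS, a quick resolution check run on the master host may save someone else a packet capture. This is a minimal sketch using only the Python standard library; `resolve_addresses` is a hypothetical helper name, not part of Salt, and the name you pass in is whatever your own master config uses.

```python
import socket


def resolve_addresses(name: str) -> list[str]:
    """Return the sorted, de-duplicated addresses that `name` resolves to.

    An empty list means the system resolver could not resolve the name.
    """
    try:
        results = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    return sorted({sockaddr[0] for *_, sockaddr in results})
```

An empty list for the configured name would match the situation described above, where a dummy name was set instead of the FQDN the minions use.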