Minion not receiving commands

ckm

Jun 1, 2012, 3:07:45 PM
to salt-...@googlegroups.com
All,

I'm having a problem communicating with a minion. The minion has been authenticated correctly, but no commands reach it. This particular minion is connected to a syndic master (e.g. master -> syndic -> master -> minion). Neither the master_master nor the syndic_master can pass commands to the minion, but I can see the master_master job being relayed. Trying to test.ping the minion from either the syndic or the master_master does nothing.

Here is what I have in the syndic/master debug log:

Clear payload received with command publish
14:43:27,135 [salt.master    ][INFO    ] Published command status.cpuinfo with jid 20120601144326439237
14:43:27,135 [salt.master    ][DEBUG   ] Published command details {'tgt_type': 'glob', 'jid': '20120601144326439237', 'tgt': '*', 'ret': '', 'arg': [], 'fun': 'status.cpuinfo'}

That's the only indication of anything happening in response to salt '*' status.cpuinfo

I've run all four servers (master_master, syndic, master(syndic), minion) with --log-level=debug and the above line was the only indication of activity...  I also looked at the troubleshooting guide to get some guidance, but no joy there...

I'm quite sure it's not a firewall issue, as netstat shows everything is connected properly:

tcp        0      0 172.16.65.228:4505      172.16.65.227:42957     ESTABLISHED 2526/python     
tcp        0      0 172.16.65.228:43981     172.16.65.226:4505      ESTABLISHED 2117/python     

Note: .227 is the minion, .226 is the master/master, .228 is the syndic/master (local machine in this case)


Help?

Thx. Chris.

Thomas S Hatch

Jun 1, 2012, 9:12:49 PM
to salt-...@googlegroups.com
What do your master configs look like on the syndic master and the master?

ckm

Jun 5, 2012, 8:18:36 PM
to salt-...@googlegroups.com
Sorry, spent the last couple of days debugging an iPhone app...

On the master/master, it looks like this (only the delta from master.template):
interface: 172.16.65.226
order_masters: true

On syndic, it looks like this:
interface: 172.16.65.228
syndic_master: 172.16.65.226

ckm

Jun 12, 2012, 5:08:20 PM
to salt-...@googlegroups.com
Any guidance on this?  Where should I be looking for config errors?

Thx.

Chris.


Thomas S Hatch

Jun 12, 2012, 5:13:58 PM
to salt-...@googlegroups.com
Sorry I did not get back, seems I missed this one.

That config looks correct. Are there any firewalls between you and the minion? Ports 4505 and 4506 need to be open on the master servers for the minion to connect.
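
For anyone checking the same thing, reachability of those two ports can be tested from the minion's side with a few lines of Python. This is only a minimal sketch using the standard library: the function name `check_ports` and the 3-second timeout are illustrative choices, not anything Salt provides, and a successful TCP connect only shows that the port is reachable, not that Salt itself is healthy.

```python
import socket

# The two master ports named in this thread.
SALT_MASTER_PORTS = (4505, 4506)


def check_ports(host, ports=SALT_MASTER_PORTS, timeout=3.0):
    """Return {port: bool} saying whether a TCP connection to host:port succeeds."""
    results = {}
    for port in ports:
        try:
            # create_connection completes a full TCP handshake, so a port that
            # is filtered or refused ends up as False.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers connection refused, unreachable host and timeouts.
            results[port] = False
    return results


# Example (hypothetical call, run from the minion against the syndic's master):
#   check_ports("172.16.65.228")
```

Running it from the minion against the master it is configured to use, and from the syndic against the higher-level master, covers both hops.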

ckm

Jun 14, 2012, 1:51:05 PM
to salt-...@googlegroups.com
There is a firewall, but ports 4505 & 4506 are open (see netstat above, iptables -L | grep [port] below).

I'll try disabling it to see if it makes any difference, but they are open:

master/master
ACCEPT     tcp  --  172.16.65.0/24       anywhere            tcp dpt:4505 
ACCEPT     udp  --  172.16.65.0/24       anywhere            udp dpt:4505 
ACCEPT     tcp  --  172.16.65.0/24       anywhere            tcp dpt:4506 
ACCEPT     udp  --  172.16.65.0/24       anywhere            udp dpt:4506 

syndic
ACCEPT     tcp  --  172.16.65.0/24       anywhere            tcp dpt:4505 
ACCEPT     udp  --  172.16.65.0/24       anywhere            udp dpt:4505 
ACCEPT     tcp  --  172.16.65.0/24       anywhere            tcp dpt:4506 
ACCEPT     udp  --  172.16.65.0/24       anywhere            udp dpt:4506 

minion
[ no firewall enabled ]

Weird thing is I'm not getting any sort of communication errors in the logs, which is what I would expect...

Thx for the help on this, I really need to get this solved.

Chris.


Thomas S Hatch

Jun 14, 2012, 2:01:32 PM
to salt-...@googlegroups.com
You still don't have any minion communication right? Is a master running on the same machine the syndic is running on?

ckm

Jun 14, 2012, 2:02:17 PM
to salt-...@googlegroups.com
I have a stupid key question.

On all these machines, there are two interfaces, one public, one private.  For obvious reasons, all the IPs that are listed here are the private one, which salt is supposed to use for communicating.   In all the config files, the master is configured to be the private ip (on syndic & minion), but I noticed the key name of the syndic machine on the master is actually the public DNS name....

Could this be the cause of issues?  Since key assignment is auto-magic, I suspect it might have picked up the public DNS name when requesting the key.   Or does the key/machine reference use IPs?  How would I check this & correct it if need be?   Can I force a key request tied to a particular IP?

It seems this could very well be the source of problems, but I would expect slightly better error reporting (host unreachable or something).

Sorry about asking this, seems like a basic question, but I'm banging my head against the wall trying to solve this.

Thx.

Chris.

Thomas S Hatch

Jun 14, 2012, 2:11:45 PM
to salt-...@googlegroups.com
These are RSA keys, not SSL certs, they don't care about IPs, DNS etc, possession is the law.

Basically I bet that your interfaces are messed up, so that the minions are not making it to the master somehow. There is no error reporting for connections because zeromq just waits for the other side to become available and then connects to it.

What do your minion configs look like?

ckm

Jun 14, 2012, 5:38:41 PM
to salt-...@googlegroups.com
I don't think the interfaces are messed up - if you look at the top post, you'll see the netstat output on the syndic, which clearly shows that the minion is connected to the syndic and the syndic to the master. I was just thinking that perhaps the certs were tied to IPs or somehow verified via DNS query....

Minion config:
master: 172.16.65.228
id: fe1

Everything else is the default
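
With all three config deltas now posted in this thread, the addresses can be cross-checked mechanically. The following is a rough sketch, not a Salt tool: `parse_delta` is a naive stand-in for a real YAML parser that only handles flat `key: value` lines, `check_syndic_chain` is a hypothetical helper, and the comparison assumes every address is given explicitly as an IP, as it is here. The set of accepted "true" spellings is the usual YAML 1.1 list and is assumed to match what Salt's config loader accepts.

```python
# Spellings a YAML 1.1 parser reads as boolean true (assumed to apply to Salt's loader).
TRUE_SPELLINGS = {"true", "True", "TRUE", "yes", "Yes", "YES", "on", "On", "ON"}


def parse_delta(text):
    """Parse flat 'key: value' lines into a dict; comments and blank lines are skipped."""
    conf = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        conf[key.strip()] = value.strip()
    return conf


def check_syndic_chain(top_master, syndic_master, minion):
    """Return a list of mismatches along the master -> syndic -> minion chain."""
    problems = []
    if top_master.get("order_masters") not in TRUE_SPELLINGS:
        problems.append("top master: order_masters is not enabled")
    if syndic_master.get("syndic_master") != top_master.get("interface"):
        problems.append("syndic: syndic_master does not match the top master's interface")
    if minion.get("master") != syndic_master.get("interface"):
        problems.append("minion: master does not match the syndic's interface")
    return problems


# The deltas as posted in this thread.
top = parse_delta("interface: 172.16.65.226\norder_masters: true")
syndic = parse_delta("interface: 172.16.65.228\nsyndic_master: 172.16.65.226")
minion = parse_delta("master: 172.16.65.228\nid: fe1")
print(check_syndic_chain(top, syndic, minion))
```

For the values posted here the list comes back empty, which is consistent with the configs looking correct on paper.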

Minion netstat:
tcp        0      0 172.16.65.227:33415         172.16.65.228:4505          ESTABLISHED 29562/python2.6 

Thx.

Chris.

ckm

Jun 14, 2012, 5:50:32 PM
to salt-...@googlegroups.com
BTW, the reason I was looking at cert/name/interface problems is because I noticed that the key name of the syndic machine was the public DNS name, which references a different IP than the one salt is supposed to use....

However, all that is moot as I turned off the firewalls and nothing changed.   Plus there is an active connection between master & syndic:

master netstat
tcp        0      0 172.16.65.226:4505      172.16.65.228:60842     ESTABLISHED 28538/python 

I just deleted & re-accepted the keys from the syndic on the master, just in case that was an issue...

Even so, I'm still getting zero response from salt '*' test.ping

Chris.

ckm

Jun 14, 2012, 6:03:50 PM
to salt-...@googlegroups.com
There are three machines

master (.226) -> syndic (.228) -> minion (.227)

All are separate Xen instances in the same datacenter running on a private LAN (172.16.65.x)

If I run salt '*' test.ping on syndic (.228) I get {'fe1': True}

If I run salt '*' test.ping on master (.226) I get nothing.

Syndic is running salt-syndic & salt-master and the master in /etc/salt has two lines that are different from stock:
interface: 172.16.65.228
syndic_master: 172.16.65.226

All three machines show (through netstat) that they are connected via port 4505 (see other posts), but only syndic can talk to the minion.

All are running salt 0.9.9.1 - master & syndic are Ubuntu 10.04.4 LTS, minion is running Centos 5.x - all are current.

There are no errors in the logs.

Thx.

Chris.

Thomas S Hatch

Jun 15, 2012, 12:02:53 PM
to salt-...@googlegroups.com
It sounds like everything is set up correctly, and it is good that we can confirm that the syndic's master can communicate with the minion, so the problem is just the higher-level master. On the higher-level master, can you change order_masters to True with a capital T? I am scraping the bottom of the barrel here...

ckm

Jun 19, 2012, 3:56:55 PM
to salt-...@googlegroups.com
I gave up on this temporarily - I'm going to reinstall the whole OS and start from scratch, at least on the syndic & master.

I suspect it's related to a DNS/hostname/IP mapping problem even though everything is referenced with IPv4 addresses. I tried doing some cleanup but it's causing other issues, so I'd rather start over.

Chris.


ckm

Jun 20, 2012, 6:24:26 PM
to salt-...@googlegroups.com
So it looks like I was correct, there seem to be undocumented dependencies on DNS in Salt - the latest version (0.10?) fails to even start:

18:17:23,493 [salt.utils     ][ERROR   ] This master address: salt was previously resolvable but now fails to resolve! The previously resolved ip addr will continue to be used

Since all these machines are on private networks with no DNS entries (and all referenced by IPv4 address) why is there a reverse DNS lookup anywhere in salt?

One of the most frustrating things is code that does forward/reverse DNS lookups and then barfs when they fail even if it has no impact at all....

Chris.

Thomas S Hatch

Jun 20, 2012, 11:43:54 PM
to salt-...@googlegroups.com
There are NO dependencies on DNS in Salt. This message is posted when the master address is set to a DNS hostname and can't be resolved. To not use DNS, don't pass a hostname as the master value, use an IP address.
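
The error quoted above names the master address as `salt`, which is a hostname, not an IP. A quick way to tell whether a given `master` value would require a lookup is to test whether it parses as an IP literal. This is a minimal sketch with a hypothetical helper name (`needs_dns`), using only the standard `ipaddress` module; it says nothing about how Salt itself resolves the value.

```python
import ipaddress


def needs_dns(master_value):
    """Return True if the value is a hostname (needs resolving), False if it is an IP literal."""
    try:
        ipaddress.ip_address(master_value.strip())
        return False
    except ValueError:
        # Not a valid IPv4/IPv6 literal, so it would have to be resolved by name.
        return True


# The address from the error message versus the one used in the minion config in this thread.
print(needs_dns("salt"))
print(needs_dns("172.16.65.228"))
```

The first call returns True and the second False, so a config that really contains `master: 172.16.65.228` should never trigger that resolution message.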

Florian Ermisch

Jul 6, 2012, 7:35:59 AM
to salt-...@googlegroups.com
On 12.06.2012 23:13, Thomas S Hatch wrote:
> [...] You need to have ports 4505 and 4506 open on the master servers for
> the minion to connect
> [...]

This little bit of information is missing from the docs at
http://salt.readthedocs.org/en/latest/topics/installation/fedora.html
It is important on Fedora/RHEL(/CentOS/...)*, which are pretty locked
down firewall-wise by default.

Regards, Florian

*) sudo system-config-firewall-tui
-> Customize
-> Forward
-> Add Port=4505, Protocol=tcp -> OK
-> Add Port=4506, Protocol=tcp -> OK
-> Close
-> OK
-> Yes

Thomas S Hatch

Jul 6, 2012, 11:24:58 AM
to salt-...@googlegroups.com
Good call, I will get this added. We should also put a section into the troubleshooting guide about libvirt turning on iptables without warning.