[rabbitmq-discuss] newbie problem--'Error: unable to connect to node 'rabbit@rabbitmq-small02-dev': nodedown'

John Stoner

Mar 9, 2012, 3:24:19 PM
to rabbitmq...@lists.rabbitmq.com
OK, I realize this is probably a very basic thing, but I've been
googling and screwing around with this for hours and getting nowhere.

I'm trying to set up my first rabbitmq cluster. I have it installed
fine on two ec2 hosts, and I'm trying to get one to cluster with the
other. I've tried the instructions at

http://www.rabbitmq.com/clustering.html

and I keep getting

Error: unable to connect to node 'rabbit@rabbitmq-small02-dev':
nodedown
diagnostics:
- nodes and their ports on rabbitmq-small02-dev: [{rabbit,57641},
  {rabbitmqctl12928,46801}]
- current node: 'rabbitmqctl12928@rabbitmq-small02-dev'
- current node home dir: /var/lib/rabbitmq
- current node cookie hash: 4IHR1JMCSCh55g/tisBydg==

So I found more detailed instructions on skaag.net, which seems to be
down, but I can get the google-cached instructions at

<http://webcache.googleusercontent.com/search?q=cache:pl1EhJadjJoJ:www.skaag.net/2010/03/12/rabbitmq-for-beginners/+howto+set+up+a+rabbitmq+cluster&hl=en&gl=us&strip=1>

So, I install (actually re-install) a fresh, up to date rabbitmq on
the second machine, according to the instructions for ec2.

Then I go through the instructions.

I stop the server:
[jstoner@rabbitmq-small02-dev]> service rabbitmq-server stop
Stopping rabbitmq-server: RabbitMQ is not running
rabbitmq-server.

I change the cookie to the same as the other one:
[jstoner@rabbitmq-small02-dev]> sudo su
root@rabbitmq-small02-dev:/home/jstoner/rclient# echo -n [cookie-text] > /var/lib/rabbitmq/.erlang.cookie
root@rabbitmq-small02-dev:/home/jstoner/rclient# exit
exit

I try to restart:
[jstoner@rabbitmq-small02-dev]> sudo service rabbitmq-server start
Starting rabbitmq-server: FAILED - check /var/log/rabbitmq/startup_{log, _err}
rabbitmq-server.

I look at the logs they mention:
[jstoner@rabbitmq-small02-dev]> more /var/log/rabbitmq/startup_log
Activating RabbitMQ plugins ...

********************************************************************************
********************************************************************************

0 plugins activated:

node with name "rabbit" already running on "rabbitmq-small02-dev"
diagnostics:
- nodes and their ports on rabbitmq-small02-dev: [{rabbit,57641},
  {rabbitmqctl12869,39028},
  {rabbitmqprelaunch12879,37230}]
- current node: 'rabbitmqprelaunch12879@rabbitmq-small02-dev'
- current node home dir: /var/lib/rabbitmq
- current node cookie hash: 4IHR1JMCSCh55g/tisBydg==
[jstoner@rabbitmq-small02-dev]> more /var/log/rabbitmq/startup_err
[empty file]

That tells me nothing I can make sense of. So I see what happens if I
try the next step:
[jstoner@rabbitmq-small02-dev]> sudo rabbitmqctl stop_app
Stopping node 'rabbit@rabbitmq-small02-dev' ...
Error: unable to connect to node 'rabbit@rabbitmq-small02-dev':
nodedown
diagnostics:
- nodes and their ports on rabbitmq-small02-dev: [{rabbit,57641},
  {rabbitmqctl12928,46801}]
- current node: 'rabbitmqctl12928@rabbitmq-small02-dev'
- current node home dir: /var/lib/rabbitmq
- current node cookie hash: [hashtext]

And I get the same kind of problem I got on the first set of
instructions. Any ideas would be much appreciated.
_______________________________________________
rabbitmq-discuss mailing list
rabbitmq...@lists.rabbitmq.com
https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss

John Stoner

Mar 12, 2012, 3:21:36 PM
to rabbitmq...@lists.rabbitmq.com
OK, I realize this is probably a very basic thing, but I've been
googling and screwing around with this for hours and getting nowhere.

I'm trying to set up my first rabbitmq cluster. I have it installed
fine on two ec2 hosts, and I'm trying to get one to cluster with the
other. I've tried the instructions at

http://www.rabbitmq.com/clustering.html

and I keep getting

Error: unable to connect to node 'rabbit@rabbitmq-small02-dev':
nodedown
diagnostics:
- nodes and their ports on rabbitmq-small02-dev: [{rabbit,57641},
  {rabbitmqctl12928,46801}]
- current node: 'rabbitmqctl12928@rabbitmq-small02-dev'
- current node home dir: /var/lib/rabbitmq
- current node cookie hash: 4IHR1JMCSCh55g/tisBydg==

So I found more detailed instructions on skaag.net, at

<http://www.skaag.net/2010/03/12/rabbitmq-for-beginners/>
Which is the same kind of problem I got on the first set of
instructions, even though the cookies match. 

Any ideas would be much appreciated.

--
blogs:
http://johnstoner.wordpress.com/
'In knowledge is power; in  wisdom, humility.'

Emile Joubert

Mar 14, 2012, 7:51:16 AM
to John Stoner, rabbitmq...@lists.rabbitmq.com
Hi John,

On 12/03/12 19:21, John Stoner wrote:
> Error: unable to connect to node 'rabbit@rabbitmq-small02-dev':
> nodedown
> diagnostics:
> - nodes and their ports on rabbitmq-small02-dev: [{rabbit,57641},

Does the RabbitMQ logfile on rabbitmq-small02-dev contain an entry about
a connection attempt from a disallowed node at this time?

> I stop the server:
> [jstoner@rabbitmq-small02-dev]> service rabbitmq-server stop
> Stopping rabbitmq-server: RabbitMQ is not running
> rabbitmq-server.

Subsequent commands indicate that RabbitMQ was still running after this
command was issued. Check whether any running processes have names
starting with "beam". Later steps won't make any difference until you
succeed in restarting the broker. Kill the process if necessary.
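
The "beam" check can be scripted. The Python sketch below is only an illustration and is not part of RabbitMQ: the function names are made up for this example, and it assumes a Linux `ps` that accepts `-eo pid,args`. It looks for processes whose executable name starts with "beam" (the Erlang VM usually runs as `beam` or `beam.smp`).

```python
import os
import subprocess


def beam_processes(ps_output):
    """Return (pid, command) pairs whose executable name starts with 'beam'.

    `ps_output` is the text printed by `ps -eo pid,args`, header included.
    """
    found = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 1)          # -> [pid, full command line]
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, command = int(parts[0]), parts[1]
        executable = os.path.basename(command.split()[0])
        if executable.startswith("beam"):
            found.append((pid, command))
    return found


def running_beam_processes():
    """Run `ps` on this host and return any Erlang VM processes it reports."""
    listing = subprocess.run(
        ["ps", "-eo", "pid,args"], capture_output=True, text=True, check=True
    ).stdout
    return beam_processes(listing)
```

If `running_beam_processes()` still returns entries after `service rabbitmq-server stop`, the broker has not actually stopped, and the reported PIDs are the ones to kill.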

> {rabbitmqctl12928,46801}]
> - current node: 'rabbitmqctl12928@rabbitmq-small02-dev'
> - current node home dir: /var/lib/rabbitmq
> - current node cookie hash: [hashtext]

What is [hashtext]? Does it match the hash on the other node?
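
For comparing cookies without pasting them into an email, a hash of the cookie text can be computed on each node. The Python sketch below assumes that the "cookie hash" in the diagnostics is the Base64-encoded MD5 digest of the cookie text. That is an assumption about how rabbitmqctl derives the value, not something confirmed in this thread, and `cookie_hash` is a made-up helper name.

```python
import base64
import hashlib


def cookie_hash(cookie):
    """Base64-encoded MD5 digest of the cookie text (assumed hash scheme)."""
    digest = hashlib.md5(cookie.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")
```

Even if the assumed scheme differs from what rabbitmqctl prints, running the same function on the cookie text of both nodes still shows whether the two cookies are identical. Note that a trailing newline changes the result, which is why the cookie was written with `echo -n`.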


Also check the EC2-specific instructions at
http://www.rabbitmq.com/ec2.html
paying close attention to the section on hostnames.


-Emile

Emile Joubert

Mar 15, 2012, 5:58:58 AM
to John Stoner, RabbitMQ Discuss
Hi John,

On 14/03/12 22:36, John Stoner wrote:

> Does the RabbitMQ logfile on rabbitmq-small02-dev contain an entry about
> a connection attempt from a disallowed node at this time?

If it doesn't then network access between the cluster nodes needs to be
confirmed or established. See information here:

http://www.rabbitmq.com/clustering.html#firewall

> One other question: is it expected behavior that `rabbitmq-server stop`
> doesn't kill the process?

That is not a recognised parameter. Did you mean `rabbitmqctl stop` ?

Emile Joubert

Mar 15, 2012, 6:59:00 PM
to John Stoner, rabbitmq...@lists.rabbitmq.com

Hi John,

On 15/03/12 20:54, John Stoner wrote:
> OK, I don't think it's the firewall. 4369 is open to TCP (we tried
> adding UDP. Didn't help). Plus when I do a tcpdump, I see packets
> received on small01 on 'rabbitmqctl cluster rabbitmq-small01-dev'.

The EPMD port is necessary but not sufficient. From the details provided
in your previous email you would have needed to open port 57641 also.
This is a random port. Do you see packets on this port in your network
trace?
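
A quick way to see whether both ports are reachable from the other node is a plain TCP connection attempt. The Python sketch below is a minimal illustration, not a RabbitMQ tool: `port_open` and `check_cluster_ports` are made-up names, and the 3 second timeout is an arbitrary choice.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, or name not resolved
        return False


def check_cluster_ports(host, ports, timeout=3.0):
    """Map each port to True/False depending on whether it accepted a connection."""
    return {port: port_open(host, port, timeout) for port in ports}
```

Run from rabbitmq-small01-dev, `check_cluster_ports("rabbitmq-small02-dev", [4369, 57641])` would show whether the port mapper port and the port reported in the diagnostics both accept connections. A `False` for 57641 alongside a `True` for 4369 would match the situation described above.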

I would suggest temporarily eliminating the firewall entirely for
debugging purposes. See the firewall section on the clustering page for
configuration details:

https://www.rabbitmq.com/clustering.html

John Stoner

Mar 15, 2012, 8:02:28 PM
to Emile Joubert, rabbitmq...@lists.rabbitmq.com
aack--I'm wrong, but not really. Looking again I see:

config file(s) : (none)

On Thu, Mar 15, 2012 at 7:00 PM, John Stoner <johns...@gmail.com> wrote:
OK, I'm checking about deactivating the firewall with the people working with me on this. 

Following your link, I see 

Once a distributed Erlang node address has been resolved via epmd, other nodes will attempt to communicate directly with that address using the Erlang distributed node protocol. The port range for this communication can be configured with two parameters for the Erlang kernel application:

  • inet_dist_listen_min
  • inet_dist_listen_max

Firewalls must permit traffic in this range to pass between clustered nodes (assuming all nodes use the same port range). The default port range is unrestricted.

The Erlang kernel_app manpage contains more details on the port range that distributed Erlang nodes listen on. See the configuration page for information on how to create and edit a configuration file.

following the configuration page link I see

Verify Configuration

The active configuration can be verified in the startup banner, e.g. the active configuration file:

config file(s) : /etc/rabbitmq/rabbitmq.config

which I'm not seeing. I have an empty /etc/rabbitmq, and when I stop and restart rabbitmq I don't see any mention of a config file. 



Emile Joubert

Mar 16, 2012, 6:24:48 AM
to John Stoner, RabbitMQ Discuss
Hi John,

On 16/03/12 05:41, John Stoner wrote:
> OK, I got it to start with a good config file. what's an appropriate
> range for these port numbers?

You are free to use any unused port range between 1024 and 65535.

John Stoner

Mar 16, 2012, 1:25:02 PM
to Emile Joubert, rabbitmq...@lists.rabbitmq.com
[Just a bit of context, some of this conversation happened off list--I'm trying to start a rabbitmq cluster on some firewalled EC2 instances, and Emile has been helping me identify all the details about ports and configuration and all that good stuff.]

We're looking to open fewer ports, not more. Is there a minimum we could do? Would one work, or would it break something else?

Also, we have these ports open to all TCP.  In the spirit of securing our systems, I guess we could open 4369 only to the IPs of the other machines in the cluster. Is that a good idea? Can you think of more firewall restrictions to add?


Carl Hörberg

Mar 16, 2012, 1:43:57 PM
to John Stoner, rabbitmq...@lists.rabbitmq.com
You can allow traffic only between instances in the same security group by setting the "source" field to the ID of that security group.

Emile Joubert

Mar 16, 2012, 1:56:57 PM
to John Stoner, rabbitmq...@lists.rabbitmq.com

Hi John,

I assume you have established that the clustering problems
encountered earlier were due to firewall configuration.

On 16/03/12 17:25, John Stoner wrote:
> We're looking to open fewer ports, not more. Is there a minimum we
> could do? Would one work, or would it break something else?

One port is possible (then inet_dist_listen_min = inet_dist_listen_max),
but a small number like 5 is more common. Avoid the ephemeral port range
when you make your selection.
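
Whether a candidate range collides with the ephemeral range can be checked before editing the config. The Python sketch below is illustrative only: the helper names are made up, the `/proc` path applies to Linux hosts, and the numbers in the usage note are example values rather than recommendations.

```python
def read_ephemeral_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Return (low, high) of the local ephemeral port range on Linux."""
    with open(path) as handle:
        low, high = handle.read().split()
    return int(low), int(high)


def overlaps(dist_min, dist_max, eph_low, eph_high):
    """True if [dist_min, dist_max] shares any port with [eph_low, eph_high]."""
    if dist_min > dist_max:
        raise ValueError("inet_dist_listen_min must not exceed inet_dist_listen_max")
    return dist_min <= eph_high and eph_low <= dist_max
```

For example, `overlaps(9100, 9105, *read_ephemeral_range())` checks a hypothetical six-port range against the host's own setting. With an ephemeral range of 32768 to 61000, a randomly assigned port such as 57641 falls inside it, while 9100 to 9105 does not.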

> Also, we have these ports open to all TCP. In the spirit of securing
> our systems, I guess we could open 4369 only to the IPs of the other
> machines in the cluster. Is that a good idea? Can you think of more
> firewall restrictions to add?

As discussed previously and above, you need to open at least one port in
addition to the one used by the port mapper daemon. You are free to add
further firewall restrictions, as long as all cluster nodes are
accessible from all other cluster nodes on the relevant ports, as
discussed here:

http://www.rabbitmq.com/clustering.html#firewall

Michael Cumings

Mar 16, 2012, 2:37:44 PM
to rabbitmq...@lists.rabbitmq.com
So what are the considerations for having 5 rather than 10 or 1? Is there a reasonable criterion for determining an appropriate number of ports to allocate?

John Stoner

Mar 16, 2012, 5:26:52 PM
to Michael Cumings, rabbitmq...@lists.rabbitmq.com
ok...

still getting

March 16, 2012 @ 08:54:01PM: ~
[jstoner@rabbitmq-small02-dev]> sudo rabbitmqctl cluster rabbit@rabbitmq-small01-dev
Clustering node 'rabbit@rabbitmq-small02-dev' with ['rabbit@rabbitmq-small01-dev'] ...
Error: {no_running_cluster_nodes,['rabbit@rabbitmq-small01-dev'],
                                 ['rabbit@rabbitmq-small01-dev']}

I changed the rabbitmq.config to

[
                {kernel,
                        [{inet_dist_listen_min, [some number]},
                         {inet_dist_listen_max, [another number]}]
                }
].

on both servers, and restarted.

I think we have the right ports open now. 

The erlang cookie matches.

seeing traffic on 4369, but not the other ports.

What else could be wrong?


Emile Joubert

Mar 16, 2012, 6:03:10 PM
to John Stoner, rabbitmq...@lists.rabbitmq.com
Hi John,

On 16/03/12 21:26, John Stoner wrote:
> [jstoner@rabbitmq-small02-dev]> sudo rabbitmqctl cluster
> rabbit@rabbitmq-small01-dev
> Clustering node 'rabbit@rabbitmq-small02-dev' with
> ['rabbit@rabbitmq-small01-dev'] ...
> Error: {no_running_cluster_nodes,['rabbit@rabbitmq-small01-dev'],
> ['rabbit@rabbitmq-small01-dev']}

Does the RabbitMQ logfile on rabbitmq-small01-dev contain an entry about
a connection attempt from a disallowed node? If it doesn't then
internode network traffic is probably being blocked by a firewall.

> [
> {kernel,
> [{inet_dist_listen_min, [some number]},
> {inet_dist_listen_max, [another number]}]
> }
> ].

You can confirm whether that has taken effect by checking the port
numbers returned by "epmd -names".
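
The `epmd -names` check can be automated along these lines. This Python sketch assumes the output contains lines of the form `name <node> at port <port>`. The function names and the range values in the usage note are made up for illustration.

```python
import re

_NAME_LINE = re.compile(r"^name (\S+) at port (\d+)$")


def parse_epmd_names(output):
    """Map node name -> distribution port from `epmd -names` output."""
    ports = {}
    for line in output.splitlines():
        match = _NAME_LINE.match(line.strip())
        if match:
            ports[match.group(1)] = int(match.group(2))
    return ports


def port_in_range(output, node, low, high):
    """True if `node` is registered with epmd on a port within [low, high]."""
    port = parse_epmd_names(output).get(node)
    return port is not None and low <= port <= high
```

Feeding it the text printed by `epmd -names` on each server, `port_in_range(text, "rabbit", 9100, 9105)` would return `False` if the broker is still listening on a randomly assigned port such as 57641, which would suggest the config file was not picked up.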

> seeing traffic on 4369, but not the other ports.

If you don't see other internode traffic on the configured port(s) then
the firewall config needs to be updated.

I suggest independently testing the firewall config with another
application such as netcat to get some confidence that it is correct.
Alternatively temporarily eliminate the firewall entirely for debugging
purposes if possible.
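
Where netcat is not installed, a throwaway listener can stand in for it. The Python sketch below is a rough equivalent, with a made-up function name and an arbitrary default timeout. The port must be free, so the broker has to be stopped first if it already holds the port under test.

```python
import socket


def accept_one(port, host="0.0.0.0", timeout=30.0):
    """Listen on host:port and return the peer address of the first connection.

    Returns None if nobody connects before `timeout` seconds have passed.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind((host, port))
        server.listen(1)
        server.settimeout(timeout)
        try:
            connection, peer = server.accept()
        except socket.timeout:
            return None
        connection.close()
        return peer[0]
```

Start `accept_one(<port>)` on one node with one of the configured distribution ports, then try to connect to that port from the other node. If the call returns `None`, nothing got through, which points at the firewall rather than at RabbitMQ.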

Emile Joubert

Mar 19, 2012, 6:20:36 AM
to Michael Cumings, rabbitmq...@lists.rabbitmq.com
Hi Michael,

On 16/03/12 18:37, Michael Cumings wrote:
> So what are the considerations for having 5 rather than 10 or 1? Is
> there a reasonable criterion for determining an appropriate number of
> ports to allocate?

Convenience only, I think. It is possible to configure one or as many
unused ports as you like.

John Stoner

Mar 21, 2012, 1:27:27 PM
to Emile Joubert, rabbitmq...@lists.rabbitmq.com
OK, apparently there was a bit of flakiness in EC2's security. We forced a reapply of the rules by making a change and then undoing it, and now it clusters ok.