RabbitMQ “node not running”


yossi...@credifi.com

Mar 7, 2019, 3:29:04 AM
to rabbitmq-users

I have set up a RabbitMQ cluster on a few workstations on the network. I'm viewing the cluster through the admin UI on my own workstation and seeing that one of the nodes is shown as not running.

I've checked the status of these nodes via rabbitmqctl and all is OK.
Can someone tell me why the admin UI is showing this?


Joe.


01.jfif
node2.jfif
node1.jfif

Luke Bakken

Mar 7, 2019, 1:25:14 PM
to rabbitmq-users
Hello,

More than likely something is blocking port 25672 between your nodes. That's my first guess, at least. You should also check the RabbitMQ logs to see if any errors have been logged.
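
A quick way to test this from each node is to try opening a TCP connection to the other node on that port. The following is a minimal Python sketch, not an official RabbitMQ tool; the function name `port_is_reachable` and the timeout value are assumptions made for illustration. A successful connection only shows that the port accepts TCP connections at that moment; it does not prove the link is stable.

```python
import socket


def port_is_reachable(host: str, port: int = 25672, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False
```

For example, running `port_is_reachable("RABBITMQ-02")` on RABBITMQ-01, and the reverse on RABBITMQ-02, checks the path in both directions.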

Thanks,
Luke

yossi...@credifi.com

Mar 10, 2019, 5:13:00 AM
to rabbitmq-users
I checked: port 25672 is open between the nodes.
What's strange is that in the CLI (rabbitmqctl cluster_status) everything is up, and only in the UI do I see a node down )-:

This is what I see in the node 1 log:


=ERROR REPORT==== 7-Mar-2019::12:05:51 ===
Mnesia('rabbit@RABBITMQ-01'): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, 'rabbit@RABBITMQ-02'}

=INFO REPORT==== 7-Mar-2019::12:05:51 ===
Keep rabbit@RABBITMQ-02 listeners: the node is already back

=INFO REPORT==== 7-Mar-2019::12:05:51 ===
node 'rabbit@RABBITMQ-02' down: net_tick_timeout

=INFO REPORT==== 7-Mar-2019::12:05:51 ===
node 'rabbit@RABBITMQ-02' up

=INFO REPORT==== 7-Mar-2019::12:05:51 ===
global: Name conflict terminating {rabbit_mgmt_db,<31222.24137.5>}

=INFO REPORT==== 10-Mar-2019::01:19:29 ===
rabbit on node 'rabbit@RABBITMQ-02' down

=INFO REPORT==== 10-Mar-2019::01:19:29 ===
Statistics database started.

=INFO REPORT==== 10-Mar-2019::01:19:29 ===
node 'rabbit@RABBITMQ-02' down: connection_closed

=INFO REPORT==== 10-Mar-2019::01:21:11 ===
node 'rabbit@RABBITMQ-02' up

=ERROR REPORT==== 10-Mar-2019::01:21:11 ===
Mnesia('rabbit@RABBITMQ-01'): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, 'rabbit@RABBITMQ-02'}

=INFO REPORT==== 10-Mar-2019::01:21:11 ===
global: Name conflict terminating {rabbit_mgmt_db,<31222.24325.5>}

=INFO REPORT==== 10-Mar-2019::01:21:13 ===
Statistics database started.

=ERROR REPORT==== 10-Mar-2019::01:49:53 ===
** Node 'rabbit@RABBITMQ-02' not responding **
** Removing (timedout) connection **

=INFO REPORT==== 10-Mar-2019::01:49:53 ===
rabbit on node 'rabbit@RABBITMQ-02' down

=INFO REPORT==== 10-Mar-2019::01:49:53 ===
node 'rabbit@RABBITMQ-02' down: net_tick_timeout

=INFO REPORT==== 10-Mar-2019::01:52:14 ===
node 'rabbit@RABBITMQ-02' up

=ERROR REPORT==== 10-Mar-2019::01:52:14 ===
Mnesia('rabbit@RABBITMQ-01'): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, 'rabbit@RABBITMQ-02'}

=INFO REPORT==== 10-Mar-2019::01:52:14 ===
global: Name conflict terminating {rabbit_mgmt_db,<31222.17570.63>}

=INFO REPORT==== 10-Mar-2019::01:52:15 ===
Statistics database started.

=INFO REPORT==== 10-Mar-2019::03:25:01 ===



This is what I see in the node 2 log:

=INFO REPORT==== 7-Mar-2019::09:40:04 ===
closing AMQP connection <0.14960.3> ([::1]:59226 -> [::1]:5672)

=INFO REPORT==== 7-Mar-2019::09:40:04 ===
accepting AMQP connection <0.14982.3> ([::1]:59230 -> [::1]:5672)

=INFO REPORT==== 7-Mar-2019::09:40:04 ===
closing AMQP connection <0.14982.3> ([::1]:59230 -> [::1]:5672)

=INFO REPORT==== 7-Mar-2019::10:04:33 ===
accepting AMQP connection <0.27706.3> ([::1]:59234 -> [::1]:5672)

=INFO REPORT==== 7-Mar-2019::10:04:33 ===
closing AMQP connection <0.27706.3> ([::1]:59234 -> [::1]:5672)

=INFO REPORT==== 7-Mar-2019::10:04:34 ===
accepting AMQP connection <0.27727.3> ([::1]:59238 -> [::1]:5672)

=INFO REPORT==== 7-Mar-2019::10:04:34 ===
closing AMQP connection <0.27727.3> ([::1]:59238 -> [::1]:5672)

=ERROR REPORT==== 7-Mar-2019::11:49:38 ===
** Node 'rabbit@RABBITMQ-01' not responding **
** Removing (timedout) connection **

=INFO REPORT==== 7-Mar-2019::11:49:38 ===
rabbit on node 'rabbit@RABBITMQ-01' down

=INFO REPORT==== 7-Mar-2019::11:49:45 ===
Statistics database started.

=INFO REPORT==== 7-Mar-2019::11:49:45 ===
node 'rabbit@RABBITMQ-01' down: net_tick_timeout

=INFO REPORT==== 7-Mar-2019::11:50:14 ===
node 'rabbit@RABBITMQ-01' up

=ERROR REPORT==== 7-Mar-2019::11:50:14 ===
Mnesia('rabbit@RABBITMQ-02'): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, 'rabbit@RABBITMQ-01'}

=ERROR REPORT==== 7-Mar-2019::12:05:38 ===
** Node 'rabbit@RABBITMQ-01' not responding **
** Removing (timedout) connection **

=INFO REPORT==== 7-Mar-2019::12:05:38 ===
rabbit on node 'rabbit@RABBITMQ-01' down

=INFO REPORT==== 7-Mar-2019::12:05:38 ===
Statistics database started.

=INFO REPORT==== 7-Mar-2019::12:05:38 ===
node 'rabbit@RABBITMQ-01' down: net_tick_timeout

=INFO REPORT==== 7-Mar-2019::12:05:54 ===
node 'rabbit@RABBITMQ-01' up

=ERROR REPORT==== 7-Mar-2019::12:05:54 ===
Mnesia('rabbit@RABBITMQ-02'): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, 'rabbit@RABBITMQ-01'}

=INFO REPORT==== 7-Mar-2019::12:05:54 ===
Statistics database started.

=ERROR REPORT==== 10-Mar-2019::01:19:32 ===
** Node 'rabbit@RABBITMQ-01' not responding **
** Removing (timedout) connection **

=INFO REPORT==== 10-Mar-2019::01:19:32 ===
rabbit on node 'rabbit@RABBITMQ-01' down

=INFO REPORT==== 10-Mar-2019::01:19:32 ===
node 'rabbit@RABBITMQ-01' down: net_tick_timeout

=INFO REPORT==== 10-Mar-2019::01:21:14 ===
node 'rabbit@RABBITMQ-01' up

=ERROR REPORT==== 10-Mar-2019::01:21:14 ===
Mnesia('rabbit@RABBITMQ-02'): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, 'rabbit@RABBITMQ-01'}

=ERROR REPORT==== 10-Mar-2019::01:49:47 ===
** Node 'rabbit@RABBITMQ-01' not responding **
** Removing (timedout) connection **

=INFO REPORT==== 10-Mar-2019::01:49:47 ===
rabbit on node 'rabbit@RABBITMQ-01' down

=INFO REPORT==== 10-Mar-2019::01:49:47 ===
Statistics database started.

=INFO REPORT==== 10-Mar-2019::01:49:47 ===
node 'rabbit@RABBITMQ-01' down: net_tick_timeout

=INFO REPORT==== 10-Mar-2019::01:52:17 ===
node 'rabbit@RABBITMQ-01' up

=ERROR REPORT==== 10-Mar-2019::01:52:17 ===
Mnesia('rabbit@RABBITMQ-02'): ** ERROR ** mnesia_event got {inconsistent_database, running_partitioned_network, 'rabbit@RABBITMQ-01'}

BR,
Yossi.

Luke Bakken

Mar 11, 2019, 10:48:05 AM
to rabbitmq-users
Hi Yossi,

Please note the net_tick_timeout messages. This means that the network between your nodes is unreliable, or your nodes are so overloaded that the distributed Erlang connection can't send heartbeats, so the nodes think they are disconnected.

If your nodes aren't overloaded, I suggest checking the network itself.

Thanks,
Luke

Michael Klishin

Mar 14, 2019, 7:38:57 PM
to rabbitmq-users

--
MK

Staff Software Engineer, Pivotal/RabbitMQ