Unable to join one node in rabbitmq cluster


prasanth reddy

Sep 24, 2025, 7:38:38 AM
to rabbitmq-users
Hi,
I have a 3-node cluster. node2 had disk issues, so I removed the mnesia directory from its home directory. After that, we tried to join node2 to the cluster again. Below are the home directory permissions and the configuration on the nodes.
# ls -ld /u1/rabbitmq_stage
drwxr-xr-x 3 rabbitmq rabbitmq 51 Sep 21 07:39 /u1/rabbitmq_stage
# ls -ld /var/lib/rabbitmq
lrwxrwxrwx 1 root root 21 Sep 21 07:04 /var/lib/rabbitmq -> /u1/rabbitmq_stage
# ls -l /var/lib/rabbitmq/.erlang.cookie
r------- 1 rabbitmq rabbitmq 28 Sep 21 07:39 /var/lib/rabbitmq/.erlang.cookie
# rabbitmqctl eval 'node().'
'rab...@rabbitmq-cluster-02.example.com'
# hostname -f
rabbitmq-cluster-02.example.com
# echo $HOME
/u1/rabbitmq_stage

Config file:
# cat /etc/rabbitmq/rabbitmq.conf
cluster_name = RABBIT-STAGE
cluster_formation.peer_discovery_backend = classic_config
cluster_formation.classic_config.nodes.1 = rab...@rabbitmq-cluster-01.example.com
cluster_formation.classic_config.nodes.2 = rab...@rabbitmq-cluster-02.example.com
cluster_formation.classic_config.nodes.3 = rab...@rabbitmq-cluster-03.example.com
cluster_partition_handling = autoheal
log.file = rabbitmq.log
log.dir = /var/log/rabbitmq
log.file.level = info
log.connection.level = info
log.connection.file = rabbitmq_connecction.log
log.channel.level = info
log.channel.file = rabbitmq_channel.log
log.queue.level = info
log.queue.file = rabbitmq_queue.log

# cat /etc/rabbitmq/rabbitmq-env.conf
HOME=/u1/rabbitmq_stage.
RABBITMQ_USE_LONGNAME=True
NODENAME=rab...@rabbitmq-cluster-02.example.com

The Erlang cookie and the Erlang cookie hash are the same on all 3 nodes. From the Erlang shell on node2 I get pong from node1 and node3, but from the CLI I get pang. From node1 and node3 I get pong from node2.
When I tried to join node2 to the cluster, I saw the following:
# rabbitmqctl join_cluster rab...@rabbitmq-cluster-01.example.com
Clustering node rab...@rabbitmq-cluster-02.example.com with rab...@rabbitmq-cluster-01.example.com
Error: unable to perform an operation on node 'rab...@rabbitmq-cluster-01.example.com'. Please see diagnostics information and suggestions below.

Most common reasons for this are:

 * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
 * CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
 * Target node is not running

In addition to the diagnostics info below:

 * See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
 * Consult server logs on node rab...@rabbitmq-cluster-01.example.com
 * If target node is configured to use long node names, don't forget to use --longnames with CLI tools

DIAGNOSTICS
===========

attempted to contact: ['rab...@rabbitmq-cluster-01.example.com']

rab...@rabbitmq-cluster-01.example.com:
  * connected to epmd (port 4369) on rabbitmq-cluster-01.example.com
  * node rab...@rabbitmq-cluster-01.example.com up, 'rabbit' application running

Current node details:
 * node name: 'rabbitmqcli-...@rabbitmq-cluster-02.example.com'
 * effective user's home directory: /u1/rabbitmq_stage
 * Erlang cookie hash: zQ9nqcnsr2w+afMYJetB1g==

# epmd -names
epmd: up and running on port 4369 with data:
name rabbit at port 25672

Could you please let me know how to sort out this issue? All 3 nodes are running on CentOS 7.
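
For readers who want to compare cookie files by hand rather than by eye, here is a minimal sketch. It assumes the "Erlang cookie hash" line printed by the CLI is the base64-encoded MD5 of the cookie string, and that trailing line endings in the cookie file are ignored; both are assumptions, not something confirmed in this thread. The helper name `cookie_hash` is hypothetical, and the sketch needs `openssl` and `base64` on the PATH.

```shell
#!/bin/sh
# Hypothetical helper: print the base64-encoded MD5 of a cookie file's contents.
# Assumption: this mirrors how the CLI derives its "Erlang cookie hash" line.
cookie_hash() {
    # Strip CR/LF so a trailing newline in the file does not change the digest.
    tr -d '\r\n' < "$1" | openssl dgst -md5 -binary | base64
}

# Usage (path taken from the listing above; run as a user that can read the file):
#   cookie_hash /var/lib/rabbitmq/.erlang.cookie
```

Running the same helper on each node, and as the same OS user that invokes rabbitmqctl, gives values that can be compared against the hash shown in the diagnostics output.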

prasanth reddy

Sep 25, 2025, 5:43:38 AM
to rabbitm...@googlegroups.com
Hi,

Could anyone share some input on how to solve this issue?



Michal Kuratczyk

Sep 29, 2025, 3:08:17 AM
to rabbitm...@googlegroups.com
You didn't provide even the most basic info: version and errors.
However, most likely you need to forget_cluster_node before trying to join it again: https://www.rabbitmq.com/docs/clustering#removal-of-unresponsive-and-unrecoverable-nodes
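
As a rough illustration of the forget-and-rejoin order, the sketch below only prints the commands so they can be reviewed before anything is run. It is one reading of the linked clustering guide, not a verified procedure for this cluster. The function name `print_rejoin_plan` and the `NODE1_NAME` / `NODE2_NAME` placeholders are hypothetical; substitute the full node names, which are elided in this thread.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) a forget-and-rejoin sequence.
# Both arguments are placeholders for full node names.
print_rejoin_plan() {
    healthy="$1"   # a running member of the cluster
    wiped="$2"     # the node whose data directory was removed

    echo "# on $wiped: stop the RabbitMQ application, keep the Erlang VM up"
    echo "rabbitmqctl --longnames -n $wiped stop_app"
    echo "# on $healthy: drop the stale membership record for $wiped"
    echo "rabbitmqctl --longnames -n $healthy forget_cluster_node $wiped"
    echo "# on $wiped: clear local state, join again, then start"
    echo "rabbitmqctl --longnames -n $wiped reset"
    echo "rabbitmqctl --longnames -n $wiped join_cluster $healthy"
    echo "rabbitmqctl --longnames -n $wiped start_app"
}

# Example with placeholder node names:
print_rejoin_plan "NODE1_NAME" "NODE2_NAME"
```

Printing the plan first makes it easy to check which host each command is meant for, since forget_cluster_node targets a remaining member while the other commands target the node being re-added.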




--
Michal
RabbitMQ Team

prasanth reddy

Sep 29, 2025, 5:39:18 AM
to rabbitmq-users
Hi Michal Kuratczyk,

The version is RabbitMQ 3.8.9 on Erlang 23.1.5 on all 3 servers.

We see the error below when we try to join node 02 to the cluster:
rabbitmqctl join_cluster rab...@rabbitmq-cluster-01.example.com
Clustering node rab...@rabbitmq-cluster-02.example.com with rab...@rabbitmq-cluster-01.example.com
Error: unable to perform an operation on node 'rab...@rabbitmq-cluster-01.example.com'. Please see diagnostics information and suggestions below.

Most common reasons for this are:

 * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
 * CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
 * Target node is not running

In addition to the diagnostics info below:

 * See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
 * Consult server logs on node rab...@rabbitmq-cluster-01.example.com
 * If target node is configured to use long node names, don't forget to use --longnames with CLI tools

DIAGNOSTICS
===========

attempted to contact: ['rab...@rabbitmq-cluster-01.example.com']

rab...@rabbitmq-cluster-01.example.com:
  * connected to epmd (port 4369) on rabbitmq-cluster-01.example.com
  * node rab...@rabbitmq-cluster-01.example.com up, 'rabbit' application running

Current node details:
 * node name: 'rabbitmqcli-...@rabbitmq-cluster-02.example.com'
 * effective user's home directory: /u1/rabbitmq_stage
 * Erlang cookie hash: zQ9nqcnsr2w+afMYJetB1g==


I stopped the app on rab...@rabbitmq-cluster-02.example.com with rabbitmqctl stop_app and then ran forget_cluster_node, but it still tells me to stop the app.
rabbitmqctl stop_app
Stopping rabbit application on node rab...@rabbitmq-cluster-02.example.com

rabbitmqctl forget_cluster_node -n rab...@rabbitmq-cluster-01.example.com rab...@rabbitmq-cluster-02.example.com
Removing node rab...@rabbitmq-cluster-02.example.com from the cluster
Error:
RabbitMQ on node rab...@rabbitmq-cluster-02.example.com must be stopped with 'rabbitmqctl -n rab...@rabbitmq-cluster-02.example.com stop_app' before it can be removed


Michal Kuratczyk

Sep 29, 2025, 5:42:40 AM
to rabbitm...@googlegroups.com
RabbitMQ 3.8.9 has been out of support for many years, so I'm not going to investigate this. The docs are pretty clear about what you need to do.



--
Michal
RabbitMQ Team