Cluster Join Error

Jeff S

Apr 22, 2021, 11:56:39 PM
to rabbitmq-users

Hi. I'm using CentOS 8 and installed everything via yum. I have 3 nodes, and all 3 were running separately. For some reason, /var/lib/rabbitmq/.erlang.cookie was missing on all 3 nodes after installation. The only cookie that was ever created on the 3 nodes was in $HOME. So I copied that cookie to /var/lib/rabbitmq as well as to $HOME on both slave nodes. I then stopped and reset them and tried to join the master, but it's giving me:

Error: unable to perform an operation on node 'rabbit@node03'. Please see diagnostics information and suggestions below.

Most common reasons for this are:

 * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
 * CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
 * Target node is not running

In addition to the diagnostics info below:

 * See the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more
 * Consult server logs on node rabbit@node03
 * If target node is configured to use long node names, don't forget to use --longnames with CLI tools

DIAGNOSTICS
===========

attempted to contact: [rabbit@node03]

rabbit@node03:
  * connected to epmd (port 4369) on node03
  * epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic
  * TCP connection succeeded but Erlang distribution failed
  * suggestion: check if the Erlang cookie identical for all server nodes and CLI tools
  * suggestion: check if all server nodes and CLI tools use consistent hostnames when addressing each other
  * suggestion: check if inter-node connections may be configured to use TLS. If so, all nodes and CLI tools must do that
   * suggestion: see the CLI, clustering and networking guides on https://rabbitmq.com/documentation.html to learn more

Looking at the log at /var/log/... on node03, I see many messages like:
Connection attempt from node 'rabbitmqcli-277-rabbit@node03' rejected. Invalid challenge reply. **

It looks like a cookie mismatch, but I verified that all 3 nodes have the same value in $HOME as well as in the cookie that I created in /var/lib/rabbitmq. I also noticed that on the 2 slaves, node02 and node03, I can't even start the app anymore; it fails with the same message as above.

What can I check?

Jeff S

Apr 23, 2021, 12:05:42 AM
to rabbitmq-users
Also, I tried passing the cookie value in the CLI command when starting (or joining) the slave nodes (they are stopped and both reset because I wanted the 2 nodes to join the master), but the messages are the same.

Jeff S

Apr 24, 2021, 1:13:17 PM
to rabbitmq-users
A bit of an update. I removed Erlang and RabbitMQ, left the .erlang.cookie in $HOME, and reinstalled. The node started up and I was able to join_cluster to node01. My /etc/hosts file contains the IP and hostname of all 3 nodes. Joining the cluster and checking the cluster status showed no error messages, but I now have 2 clusters instead of 1: node01 shows 1 node running and node02 shows node02 running.

Please help

Jeff S

Apr 24, 2021, 1:25:33 PM
to rabbitmq-users
Guys, I know this will be dismissed as user error, but after stopping the app, resetting, and joining the cluster, it shows up now. I tried 3 times, and it worked the 3rd time. My only remaining question is why, on CentOS 8, it's using $HOME/.erlang.cookie and not /var/lib/rabbitmq/.erlang.cookie. After installing Erlang and RabbitMQ, I could never see the latter cookie, only the one in $HOME.
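
For reference, the stop, reset, and join sequence described in this post can be sketched as a small shell function. This is a minimal sketch, not an official procedure: stop_app, reset, join_cluster, start_app, and cluster_status are standard rabbitmqctl commands, while the function name join_cluster_steps and the RABBITMQCTL override variable are hypothetical names introduced here for illustration. Note that reset wipes the node's local data, and that a node which has joined a cluster still needs start_app before it shows up as running.

```shell
#!/bin/sh
# join_cluster_steps TARGET_NODE
# Runs the usual sequence for joining the local node to an existing cluster.
# RABBITMQCTL is a hypothetical override (not a RabbitMQ setting) so the
# command can be swapped out, e.g. for a dry run with "echo".
join_cluster_steps() {
    jcs_target="$1"
    jcs_ctl="${RABBITMQCTL:-rabbitmqctl}"

    if [ -z "$jcs_target" ]; then
        echo "usage: join_cluster_steps rabbit@<hostname>" >&2
        return 2
    fi

    # Each step runs only if the previous one succeeded.
    "$jcs_ctl" stop_app &&                     # stop RabbitMQ, keep the Erlang VM up
    "$jcs_ctl" reset &&                        # wipe local state (destructive)
    "$jcs_ctl" join_cluster "$jcs_target" &&   # join the existing cluster
    "$jcs_ctl" start_app &&                    # start RabbitMQ again
    "$jcs_ctl" cluster_status                  # confirm the node list
}
```

On node02, for example, this would be called as join_cluster_steps rabbit@node01, by a user whose Erlang cookie matches the server's. Setting RABBITMQCTL=echo prints the commands instead of running them.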

M K

Apr 26, 2021, 5:58:25 PM
to rabbitmq-users
Hi Jeff,

RabbitMQ nodes use /var/lib/rabbitmq/.erlang.cookie as the default cookie location. I have just verified this on CentOS 8 using a centos:8 container and [1]:

ls -lha /var/lib/rabbitmq/
total 20K
drwxr-xr-x 3 rabbitmq rabbitmq 4.0K Apr 26 20:57 .
drwxr-xr-x 1 root     root     4.0K Apr 26 20:57 ..
-r-------- 1 rabbitmq rabbitmq   20 Apr 26 00:00 .erlang.cookie

The .spec file of the package [2][3] confirms that the default service user and working directory haven't changed in a long time. They are identical on all RPM-based distributions.

RabbitMQ CLI tools use their effective user's $HOME directory, so it matters how the tools are run.
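
Since the server and the CLI tools can read different cookie files, one way to check for a mismatch is to compare the two files byte for byte. The following is a minimal sketch under that assumption; the function names cookies_match and check_cookie_pair are hypothetical helpers, not part of RabbitMQ.

```shell
#!/bin/sh
# cookies_match FILE_A FILE_B
# Exit status 0 only if both files are readable and byte-identical.
cookies_match() {
    [ -r "$1" ] && [ -r "$2" ] && cmp -s "$1" "$2"
}

# check_cookie_pair FILE_A FILE_B
# Prints a one-line verdict for the two cookie files.
check_cookie_pair() {
    if cookies_match "$1" "$2"; then
        echo "match"
    else
        echo "MISMATCH or unreadable"
    fi
}
```

The server's cookie is readable only by the rabbitmq user (mode -r-------- above), so the check has to run as root or as rabbitmq, for example check_cookie_pair /var/lib/rabbitmq/.erlang.cookie /root/.erlang.cookie when rabbitmqctl is run as root and root's home is /root. The second path should be the home directory of whichever user actually runs the CLI tools.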
