MQTT clients not able to connect (rabbitmq upgrade leader selection)

Jimmy Carter

Apr 5, 2021, 9:23:06 AM
to rabbitmq-users
While upgrading a RabbitMQ cluster from 3.8.x to the latest 3.8.14, I get the errors below:

[error] <0.10565.0> MQTT cannot accept connection  -> x.x.x.x:1883 due to an internal error or unavailable component

I have a 3-node cluster: one master and 2 slaves.
Until all nodes are upgraded and up, I keep getting the errors above, and it seems there is an issue with leader selection.
I am doing rolling upgrades.

How can I configure RabbitMQ to do leader selection automatically?

Very urgent
Thanks

Luke Bakken

Apr 5, 2021, 10:52:15 AM
to rabbitmq-users
Hello,

First of all, you directly contacted me at my personal email address. This is extremely impolite and should never be done to members of this or any internet forum.

Second, if this is truly a "very urgent" issue I suggest paying for support - https://www.rabbitmq.com/#support

It appears that this issue is resolved when all nodes are up, correct? Your MQTT applications should be able to handle this and back off of reconnection until the MQTT port is available.
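
As an illustration of the reconnection back-off suggested above, here is a minimal Python sketch (not part of the original post). The function name `backoff_delays` and all of its parameters are hypothetical; it only computes how long a client might wait between reconnection attempts, and the actual MQTT connect call is left to whatever client library is in use.

```python
import random


def backoff_delays(base=1.0, cap=60.0, attempts=8, jitter=0.0, rng=random.random):
    """Yield reconnect delays in seconds: base, 2*base, 4*base, ... capped at `cap`.

    All names and defaults here are illustrative assumptions, not RabbitMQ settings.
    """
    for attempt in range(attempts):
        # Double the wait on every failed attempt, but never exceed the cap.
        delay = min(cap, base * (2 ** attempt))
        # Optional jitter (a fraction of the delay) spreads out clients
        # so they do not all retry at the same moment.
        delay += delay * jitter * rng()
        yield delay


if __name__ == "__main__":
    for seconds in backoff_delays(base=1.0, cap=10.0, attempts=5):
        print(f"would wait {seconds:.1f}s before the next MQTT connect attempt")
```

A client would sleep for each yielded delay after a failed connect and stop iterating once a connection succeeds.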

Luke

Jimmy Carter

Apr 5, 2021, 11:43:02 AM
to rabbitm...@googlegroups.com
Apologies for that.
> It appears that this issue is resolved when all nodes are up, correct?
Yes, but it takes time, maybe 5 minutes, and all new connections fail in the meantime.
I assume we want new connections to fail over when a node is being upgraded.
What is the use of a RabbitMQ cluster if we need to handle such things on the MQTT client side?

M K

Apr 6, 2021, 5:50:42 AM
to rabbitmq-users
There are no "master" and "follower" nodes in RabbitMQ. The MQTT plugin in a cluster forms a Raft cluster for tracking client ID uniqueness.
So a specific node is the leader of this MQTT Raft cluster, but otherwise all nodes are equal peers [1].

Leader election in this client ID tracking cluster is automatic. There is no need or way to trigger it.

With the defaults of the MQTT plugin and its underlying Raft library, it should not take 5 minutes for MQTT client ID tracking to elect a new leader.
It cannot happen in an instant and will require a majority of nodes with the MQTT plugin enabled online, however. This is an inherent
limitation of Raft: if no majority of nodes are online, no operations will be accepted.
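
To make the majority requirement concrete, here is a small Python sketch of the arithmetic (an illustration only, not RabbitMQ code; the function names are hypothetical). A Raft cluster of n members needs floor(n/2) + 1 of them online to elect a leader.

```python
def raft_majority(cluster_size: int) -> int:
    """Smallest number of members that forms a majority of the cluster."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def can_elect_leader(cluster_size: int, nodes_online: int) -> bool:
    """True if enough members are online for a leader election to succeed."""
    return nodes_online >= raft_majority(cluster_size)


if __name__ == "__main__":
    # Three nodes with the MQTT plugin enabled, one of them restarting.
    print(can_elect_leader(3, 2))
    # Three nodes, two of them down at the same time.
    print(can_elect_leader(3, 1))
```

For a 3-node cluster the majority is 2, so restarting one node at a time keeps a majority online, while having two nodes down at once does not.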

So the question comes down to how you provision and upgrade your nodes. You haven't mentioned any specifics beyond "a rolling upgrade".
So all we can do is guess as to what is going on. We do not guess in this community: guessing is an incredibly expensive (time consuming) way of troubleshooting distributed systems,
for everyone involved.

Jimmy Carter

Apr 6, 2021, 7:16:10 AM
to rabbitmq-users
Here is the detailed info.

We have a 3-node RabbitMQ cluster serving MQTT 3.1.1 and AMQP.
One node is the master and 2 are slaves. All run RabbitMQ 3.8.5-management.
It is hosted on AWS.
When I upgrade the master from 3.8.5 to 3.8.14, existing MQTT connections are broken and new connections fail.
Attached are the master node logs.
I see loads of such errors.
I also tried making all queues classic instead of quorum.
Please let me know if more info is needed.


Moreover, I ran these commands on all 3 nodes, but it did not make any difference.
rabbitmqctl eval "ra:trigger_election('mqtt_node')."
rabbitmqctl eval "ra:overview()."


On all 3 nodes I see that the state is "pre_vote".
Please let me know whether the Raft consensus algorithm works in a 3-node cluster while one node is being upgraded.

Also, the errors keep appearing in the logs and I am stuck with no solution.
error_logs.txt

Jimmy Carter

Apr 6, 2021, 7:35:01 AM
to rabbitmq-users

Also, please let me know the proper steps to upgrade a 3-node RabbitMQ cluster from version 3.8.5 to the latest in production without disrupting any running connections, so that everything is smooth and seamless.

Jimmy Carter

Apr 6, 2021, 11:26:05 AM
to rabbitmq-users
Attached are the complete logs from upgrading the master node from 3.8.5 to 3.8.14 while the remaining nodes were still on RabbitMQ 3.8.5.

On Tuesday, April 6, 2021 at 3:20:42 PM UTC+5:30 michael....@gmail.com wrote:
master-final.txt
slave1-final.txt
slave2-final.txt

Luke Bakken

Apr 6, 2021, 12:30:46 PM
to rabbitmq-users
Hello,

Any time you restart a network service's operating system process (RabbitMQ, HTTP server, database server, etc) all TCP connections to that service will be closed. There is no way around it. Your MQTT applications must be written to handle abrupt TCP disconnections.
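
A minimal Python sketch of what handling disconnections on the client side could look like (an illustration, not part of the original post). It assumes the application knows the host names of all cluster nodes and has some `connect(host)` callable, supplied by its MQTT client library, that raises `OSError` on failure. `connect_with_failover` and its parameters are hypothetical names.

```python
import itertools
import time


def connect_with_failover(nodes, connect, max_attempts=10, delay=1.0, sleep=time.sleep):
    """Try the given nodes in round-robin order until `connect(host)` succeeds.

    `connect` is a placeholder for the real MQTT connect call; it must raise
    OSError (or a subclass such as ConnectionRefusedError) when it fails.
    Returns a (host, session) tuple for the first successful attempt.
    """
    last_error = None
    for host in itertools.islice(itertools.cycle(nodes), max_attempts):
        try:
            return host, connect(host)
        except OSError as exc:
            # Node is down, restarting, or refusing connections: wait, then
            # move on to the next node in the list.
            last_error = exc
            sleep(delay)
    raise ConnectionError(
        f"no node accepted a connection after {max_attempts} attempts"
    ) from last_error
```

The same function can be called again whenever an established connection drops. Note that it cannot help while the MQTT plugin refuses connections on every node, for example when no Raft majority is online.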

Luke

Jimmy Carter

Apr 6, 2021, 12:43:21 PM
to rabbitmq-users
I agree. But after the restart things should be back to normal.
In my case they never are.
If I upgrade only the master node, MQTT connections become unstable from then on.
Also, during the upgrade, new MQTT connections should be handled by the running slaves (since only the master is being restarted). Correct?

I just want the steps to smoothly upgrade a RabbitMQ cluster in a production environment without losing connections.