Query on high availability rabbitmq as a pod in AWS EKS


Sagar Ingale

Apr 9, 2025, 7:07:08 AM
to rabbitm...@googlegroups.com
Hi Team,

I have the following configuration.

AWS EKS:
Eks-Node1 with a PV (EBS) in zone1.
Eks-Node2 with a PV (EBS) in zone2.

RabbitMQ is configured as follows:
Rabbitmq-node1 as ReplicaSet Pod1 on Eks-Node1.
Rabbitmq-node2 as ReplicaSet Pod2 on Eks-Node2.

(Note: I don't want to use EFS because of its low performance, and it would have the same issue anyway: if one RabbitMQ node goes down, the other will not serve properly either.)

Issues:
1. When I bring Eks-Node1 down, Rabbitmq-node1 goes down and I cannot see any of the messages that were published to Rabbitmq-node1.
I checked using rabbitmqctl.
Another issue I saw in the RabbitMQ pod log is "errorContext: child_terminated".
2. I suspect that when I publish a message, it is stored on only one EBS volume, depending on which pod receives the request (e.g. pod1). It should be stored on both PVs.

I have tried classic queues as well as quorum queues, and cluster partition handling set to autoheal and to pause_minority, but nothing helped.

Please advise.

Thanks.

Michal Kuratczyk

Apr 9, 2025, 7:16:09 AM
to rabbitm...@googlegroups.com
Your deployment is just wrong:

1. 2-node clusters are basically not supported - this is why you are observing this behaviour (there's no majority in a 2-node cluster: when 1 node goes down, the other one doesn't have a majority)
2. you should be using a StatefulSet

Using the operator is the recommended way to deploy RabbitMQ to Kubernetes:



--
Michal
RabbitMQ Team

Sagar Ingale

Apr 9, 2025, 7:26:51 AM
to rabbitm...@googlegroups.com
Yes, I am using a StatefulSet, with the RabbitMQ operator.

By a 2-node RabbitMQ cluster I mean only 2 replicas, but the replicas are on different AWS EKS nodes with different EBS PVs attached.
So, as I see it, a published message should be stored on and served from both replicas.

Thanks.


Michal Kuratczyk

Apr 9, 2025, 7:40:01 AM
to rabbitm...@googlegroups.com
In the first email you said it was a ReplicaSet, so make sure it's not. If it's a StatefulSet - great.

The documentation clearly discourages using two-node clusters:
As we develop RabbitMQ, this requirement will only get more strict, as more and more core
components use Raft protocol and therefore require a majority of nodes to be available
(1 of 2 nodes is not sufficient).
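
To make the majority arithmetic concrete, here is a minimal sketch in Python. It is not RabbitMQ code; the function names are made up for illustration, and it only assumes the usual Raft rule that a strict majority of the cluster's members must be available.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many nodes can be lost while a majority is still available."""
    return cluster_size - majority(cluster_size)


def has_majority(available_nodes: int, cluster_size: int) -> bool:
    """True if the nodes that are still up form a majority."""
    return available_nodes >= majority(cluster_size)


if __name__ == "__main__":
    for size in (1, 2, 3, 5):
        print(
            f"{size} node(s): majority={majority(size)}, "
            f"tolerated failures={tolerated_failures(size)}"
        )
```

With 2 nodes the majority is 2, so losing either node leaves the survivor without a majority. With 3 nodes the majority is still 2, so one node can be lost.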



Sagar Ingale

Apr 9, 2025, 10:27:06 AM
to rabbitm...@googlegroups.com
But a StatefulSet means it will save messages on the EBS volume, i.e. the PV.

The problem here is only with respect to AWS zones.
AWS does not replicate messages across volumes in different zones.

Also, even if I use 3 nodes, I would expect the same issue.


I think RabbitMQ should replicate messages to all nodes/replicas and store them on all PVs.


Michal Kuratczyk

Apr 9, 2025, 10:33:48 AM
to rabbitm...@googlegroups.com
RabbitMQ does store messages on the persistent volume and if you use quorum queues - they are stored on all nodes
and therefore all persistent volumes.

You didn't share enough about what exactly you are doing but from what you did share, it sounds like the partition handling
strategy kicks in when you lose one node and it stops the second node (because it doesn't have a majority).

RabbitMQ is unaware of the zones, so you might as well experiment in a single-zone setup (even on your laptop if you want)
and the behaviour will be the same. Use 3 nodes and a quorum queue.
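
As a rough illustration of the "3 nodes" part when deploying with the operator, the sketch below builds a minimal `RabbitmqCluster` manifest as a Python dict and prints it as JSON. The cluster name `rabbitmq-ha` and the `20Gi` storage size are assumptions made up for this example, not values from this thread; check the operator documentation for the fields your version supports.

```python
import json


def rabbitmq_cluster_manifest(name: str = "rabbitmq-ha", replicas: int = 3) -> dict:
    """Build a minimal RabbitmqCluster manifest for the Cluster Operator.

    The name and the storage size are placeholders for illustration only.
    """
    return {
        "apiVersion": "rabbitmq.com/v1beta1",
        "kind": "RabbitmqCluster",
        "metadata": {"name": name},
        "spec": {
            # An odd number of nodes, so that losing one still leaves a majority.
            "replicas": replicas,
            # Each replica gets its own persistent volume claim of this size.
            "persistence": {"storage": "20Gi"},  # assumed size
        },
    }


if __name__ == "__main__":
    print(json.dumps(rabbitmq_cluster_manifest(), indent=2))
```

Since kubectl accepts JSON manifests, the printed output could be saved to a file and applied with `kubectl apply -f`.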

Sagar Ingale

Apr 9, 2025, 10:49:51 AM
to rabbitm...@googlegroups.com
What you described in your first paragraph was not working for me, maybe due to a wrong configuration.

Do I need to purchase something to get a better solution for the AWS EKS configuration?
If not, can we have a call? If we need screen sharing, I need to check with the client first.

As I understand it, queues are stored in memory up to the RabbitMQ limit, not on all PVs.

Michal Kuratczyk

Apr 9, 2025, 11:25:39 AM
to rabbitm...@googlegroups.com
You don't need to purchase anything - you just need a basic understanding of RabbitMQ.
You want highly available and replicated queues - use quorum queues. Quorum queues
always write messages to disk - they never store them in memory.
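
For reference, a queue becomes a quorum queue when it is declared durable with the `x-queue-type` argument set to `quorum`. The sketch below shows that declaration against any channel object with a pika-style `queue_declare` method. The queue name `orders` and the `RecordingChannel` stand-in are hypothetical and exist only so the example runs without a broker.

```python
from typing import Any, Dict


def quorum_queue_arguments() -> Dict[str, Any]:
    """Optional arguments that make a declared queue a quorum queue."""
    return {"x-queue-type": "quorum"}


def declare_quorum_queue(channel: Any, name: str) -> None:
    """Declare a durable quorum queue on a pika-style channel."""
    # Quorum queues must be durable; the type is chosen at declaration time.
    channel.queue_declare(
        queue=name,
        durable=True,
        arguments=quorum_queue_arguments(),
    )


class RecordingChannel:
    """Hypothetical stand-in for a real channel; it only records calls."""

    def __init__(self) -> None:
        self.calls = []

    def queue_declare(self, queue: str, durable: bool = False, arguments=None) -> None:
        self.calls.append((queue, durable, arguments))


if __name__ == "__main__":
    channel = RecordingChannel()
    declare_quorum_queue(channel, "orders")  # "orders" is a made-up queue name
    print(channel.calls)
```

With a real broker, the channel would instead come from a client library, for example `pika.BlockingConnection(pika.ConnectionParameters("localhost")).channel()`, where the host is an assumption.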


