Uneven quorum_queue_procs memory utilization across nodes


Maciej Kopczyński

Mar 5, 2024, 5:14:35 AM
to rabbitmq-users
Hello,

I am running a 3-node RabbitMQ cluster (RabbitMQ 3.11.13, Erlang 25.3.1) and use both classic and quorum queues. A few quorum queues process "large" amounts of messages (relative, of course; for now it is only ~100 msg/s), and a few thousand quorum queues see much lower traffic, I would say a few dozen messages per day.

My problem is uneven memory consumption across the cluster nodes. After running memory_breakdown, it seems to be caused by higher utilization by "quorum_queue_procs" on one of the nodes. I assumed this was related to an uneven distribution of queue leaders, which, from what I have observed, consume significantly more memory, but after rebalancing nothing changed. I would appreciate any advice: is this expected behavior? What could be the reason? I am worried that I will reach the memory limit on that node much sooner than on the others, and I would like to distribute memory consumption more evenly.
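
For context, this is roughly how I have been checking leader placement, counting quorum queue leaders per node (a sketch, assuming the default vhost and queue names without spaces):

# tab-separated output: queue name, queue type, node hosting the leader
rabbitmqctl list_queues --quiet -p / name type leader | awk '$2 == "quorum" {print $3}' | sort | uniq -c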

For what it's worth, I can see that more connections are made to the node with the higher memory consumption (load is balanced by Kubernetes, since I deployed RabbitMQ using the operator). I do not know how that relates to quorum queues, though. Below are the memory_breakdown reports for each node:

Node 1:
quorum_queue_procs: 3.4395 gb (65.49%)
allocated_unused: 1.2051 gb (22.95%)
mgmt_db: 0.1279 gb (2.44%)
other_ets: 0.1152 gb (2.19%)
binary: 0.0792 gb (1.51%)
quorum_ets: 0.0626 gb (1.19%)
other_system: 0.0577 gb (1.1%)
other_proc: 0.0485 gb (0.92%)
plugins: 0.0465 gb (0.89%)
code: 0.0362 gb (0.69%)
mnesia: 0.0212 gb (0.4%)
metrics: 0.005 gb (0.1%)
queue_procs: 0.0037 gb (0.07%)
atom: 0.0019 gb (0.04%)
connection_other: 0.0006 gb (0.01%)
connection_channels: 0.0003 gb (0.01%)
msg_index: 0.0003 gb (0.01%)
connection_readers: 0.0001 gb (0.0%)
quorum_queue_dlx_procs: 0.0 gb (0.0%)
connection_writers: 0.0 gb (0.0%)
stream_queue_procs: 0.0 gb (0.0%)
stream_queue_replica_reader_procs: 0.0 gb (0.0%)
queue_slave_procs: 0.0 gb (0.0%)
stream_queue_coordinator_procs: 0.0 gb (0.0%)
reserved_unallocated: 0.0 gb (0.0%)

Node 2:
allocated_unused: 0.7772 gb (35.41%)
quorum_queue_procs: 0.7706 gb (35.11%)
other_ets: 0.1153 gb (5.25%)
mgmt_db: 0.1118 gb (5.09%)
other_proc: 0.0871 gb (3.97%)
reserved_unallocated: 0.0828 gb (3.77%)
quorum_ets: 0.0781 gb (3.56%)
other_system: 0.0579 gb (2.64%)
code: 0.0361 gb (1.65%)
plugins: 0.031 gb (1.41%)
mnesia: 0.0212 gb (0.97%)
binary: 0.0163 gb (0.74%)
metrics: 0.0049 gb (0.22%)
atom: 0.0019 gb (0.09%)
queue_procs: 0.0018 gb (0.08%)
connection_other: 0.0003 gb (0.01%)
connection_readers: 0.0001 gb (0.0%)
quorum_queue_dlx_procs: 0.0 gb (0.0%)
msg_index: 0.0 gb (0.0%)
connection_channels: 0.0 gb (0.0%)
connection_writers: 0.0 gb (0.0%)
stream_queue_procs: 0.0 gb (0.0%)
stream_queue_replica_reader_procs: 0.0 gb (0.0%)
queue_slave_procs: 0.0 gb (0.0%)
stream_queue_coordinator_procs: 0.0 gb (0.0%)

Node 3:
quorum_queue_procs: 0.7714 gb (36.17%)
allocated_unused: 0.6856 gb (32.15%)
other_ets: 0.1147 gb (5.38%)
mgmt_db: 0.11 gb (5.16%)
plugins: 0.0985 gb (4.62%)
quorum_ets: 0.091 gb (4.27%)
other_proc: 0.0712 gb (3.34%)
other_system: 0.0577 gb (2.71%)
reserved_unallocated: 0.0525 gb (2.46%)
code: 0.0362 gb (1.7%)
mnesia: 0.0212 gb (1.0%)
binary: 0.0139 gb (0.65%)
metrics: 0.005 gb (0.24%)
atom: 0.0019 gb (0.09%)
queue_procs: 0.0018 gb (0.09%)
connection_other: 0.0001 gb (0.0%)
msg_index: 0.0 gb (0.0%)
quorum_queue_dlx_procs: 0.0 gb (0.0%)
stream_queue_procs: 0.0 gb (0.0%)
stream_queue_replica_reader_procs: 0.0 gb (0.0%)
connection_readers: 0.0 gb (0.0%)
connection_writers: 0.0 gb (0.0%)
connection_channels: 0.0 gb (0.0%)
queue_slave_procs: 0.0 gb (0.0%)
stream_queue_coordinator_procs: 0.0 gb (0.0%)
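
(For reference, the breakdowns above come from running something along these lines against each node; the node name is a placeholder and the exact --unit spelling may differ:)

rabbitmq-diagnostics memory_breakdown --unit "GB" -n rabbit@<node-name>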

Thanks in advance!

Maciej Kopczyński

Mar 7, 2024, 8:14:29 AM
to rabbitmq-users
Just a follow-up: more than 24 hours after rebalancing the queues I had to restart the cluster, and memory distribution is now perfectly even across the nodes. I am confused as to why that helped, but I am leaving this message in case someone stumbles upon a similar problem.

oren handel

Apr 10, 2024, 4:33:27 AM
to rabbitmq-users
I had a similar issue with a 3-node RabbitMQ cluster: two nodes had much higher memory and storage usage.
I ran `rabbitmq-queues rebalance quorum --queue-pattern "<busy_quorum_prefix>.*"`.
In my case it seems that the quorum leaders of my busier quorum queues had been placed unevenly across the nodes. After rebalancing them, memory and storage usage was balanced across the nodes again.
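
For anyone checking their own cluster, something like this shows where a given queue's leader and follower members live (the queue name is a placeholder):

rabbitmq-queues quorum_status "<busy_queue_name>"

Counting leaders per node, as sketched earlier in the thread, gives the overall picture before and after the rebalance.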