Memory alarm effect during cluster node restart

Vilius Šumskas

Jun 6, 2024, 3:47:42 AM

to rabbitm...@googlegroups.com

Hi,

I'm wondering what the effects on mirrored queues could be if, during a restart of the cluster nodes (one by one), another node is experiencing a continuous memory alarm. Specifically, can it produce a queue state where the leader is missing, or where queues do not sync or get stuck at 100% of sync after the restart?

--

Best Regards,

Vilius Šumskas

Rivile

IT manager

+370 614 75713

jo...@cloudamqp.com

Jun 12, 2024, 11:51:34 AM

to rabbitmq-users

Hi,

Yes, it can produce a state where syncing gets stuck. This happens often with classic mirrored queues (CMQ).

You can limit this somewhat by setting a smaller sync batch size or, better, mirroring_sync_max_throughput (https://github.com/rabbitmq/rabbitmq-server/pull/3925), but the fundamental flaws of CMQ are still there.
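
As a rough illustration of the batch-size suggestion, the sketch below builds a policy body of the shape accepted by the management HTTP API (PUT /api/policies/{vhost}/{name}). The `ha-sync-batch-size` key is the classic mirrored queue policy key for the sync batch size; the queue name pattern, the value 256, and the `ha-mode`/`ha-sync-mode` choices are assumptions for the example, not values from this thread. mirroring_sync_max_throughput is a node-level setting rather than a policy key, so it is not shown here.

```python
import json


def cmq_sync_policy(pattern: str, batch_size: int) -> dict:
    """Build a policy body that lowers the CMQ sync batch size.

    `pattern` and `batch_size` are hypothetical; pick values that match
    your own queues and message sizes.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    return {
        "pattern": pattern,
        "apply-to": "queues",
        "definition": {
            # Assumed mirroring setup; the thread does not state the real one.
            "ha-mode": "all",
            "ha-sync-mode": "automatic",
            # Smaller batches mean less memory held per sync round.
            "ha-sync-batch-size": batch_size,
        },
    }


policy = cmq_sync_policy(pattern="^orders\\.", batch_size=256)
print(json.dumps(policy, indent=2))
```

The printed JSON could be sent as the request body when creating the policy; an existing policy that already matches the same queues would need to be merged with it, since only one policy applies to a queue at a time.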

Note that CMQ are deprecated and will be removed in 4.0.

/Johan

Vilius Šumskas

Jun 12, 2024, 2:31:25 PM

to rabbitm...@googlegroups.com

Thank you for your response, Johan.

So, in essence, it is probably a good idea to ensure that nodes always have at least 33-40% of free memory (given a balanced 3-node cluster), in case any node needs a reboot?
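
The arithmetic behind the 33% figure can be sketched as follows. This is a simplified model under two assumptions that are not stated in the thread: the load of a restarting node is redistributed evenly across the remaining nodes, and memory use grows linearly with load. The function names are made up for the example.

```python
def max_safe_utilisation(nodes: int, nodes_down: int = 1) -> float:
    """Highest per-node memory utilisation (0..1) that still fits after
    `nodes_down` nodes leave and their load is spread evenly."""
    if nodes < 2 or not 0 < nodes_down < nodes:
        raise ValueError("need at least one surviving node and one node down")
    # Each survivor's load grows by a factor of nodes / (nodes - nodes_down),
    # so utilisation before the restart must stay below the inverse of that.
    return (nodes - nodes_down) / nodes


def required_free_fraction(nodes: int, nodes_down: int = 1) -> float:
    """Free memory fraction each node should keep in reserve."""
    return 1.0 - max_safe_utilisation(nodes, nodes_down)


print(f"3 nodes, 1 down: keep {required_free_fraction(3):.0%} free")
print(f"5 nodes, 1 down: keep {required_free_fraction(5):.0%} free")
```

For a 3-node cluster this gives roughly one third, which is the lower end of the 33-40% range; the extra margin covers load that does not redistribute evenly.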

--

Vilius


jo...@cloudamqp.com

Jun 12, 2024, 4:59:38 PM

to rabbitmq-users

That's a good rule of thumb, but more info is needed. Are there a lot of connections/channels? Are they reconnecting? In the worst case you'd have to size for the event that all connections end up on one node, and keeping connections/channels open requires some amount of memory.

A better strategy would be quorum queues for a small number of queues (say hundreds or low thousands; quorum queues do not block on sync and only send the delta, not a complete re-sync) and classic non-mirrored queues for the rest (which in 2024 can easily handle millions of messages and tens of thousands of queues without breaking a sweat).
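
A minimal sketch of how that split might look on the client side: the queue type is chosen per queue through the `x-queue-type` declaration argument. The helper name and the delivery limit of 5 are made up for the example; `x-delivery-limit` is the quorum queue argument used for poison message handling.

```python
from typing import Optional


def queue_arguments(replicated: bool, delivery_limit: Optional[int] = None) -> dict:
    """Return declaration arguments for a quorum or a classic queue."""
    if not replicated:
        if delivery_limit is not None:
            # In this sketch the delivery limit is treated as quorum-only.
            raise ValueError("delivery_limit requires a quorum queue")
        return {"x-queue-type": "classic"}
    args = {"x-queue-type": "quorum"}
    if delivery_limit is not None:
        args["x-delivery-limit"] = delivery_limit
    return args


# The few queues that need replication and redelivery limits:
print(queue_arguments(replicated=True, delivery_limit=5))
# The many short queues that do not:
print(queue_arguments(replicated=False))
```

With a client library such as pika, the returned dict would be passed as the `arguments` parameter of `channel.queue_declare`, with `durable=True` for quorum queues.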

/Johan

Vilius Šumskas

Jun 12, 2024, 5:17:21 PM

to rabbitm...@googlegroups.com

We have 1k connections (auto-reconnecting on failover) and 10k channels which map to around 12k queues. Almost all queues are very short.

Memory breakdown: https://pasteboard.co/AAORtjwdVHcH.png

Interesting. Are you implying that RabbitMQ would not handle such a number of quorum queues easily? Or does this only apply to _mirrored_ quorum queues?


jo...@cloudamqp.com

Jun 13, 2024, 2:17:50 PM

to rabbitmq-users

Q: Are you implying that RabbitMQ would not handle such a number of quorum queues easily?

A: Yes, I would not recommend using more than a few thousand "mirrored" QQs in the same cluster.

I haven't done much testing with "quorum queue size-1" in a cluster, as the unique features not present in CQv2 (poison message handling and at-least-once dead-lettering being the biggest) are usually needed for just a few queues, not all.

/Johan
