Memory alarm effect during cluster node restart


Vilius Šumskas

Jun 6, 2024, 3:47:42 AM
to rabbitm...@googlegroups.com

Hi,

 

I'm wondering what the effects on mirrored queues could be if, during a restart of the cluster nodes (one by one), another node is experiencing a continuous memory alarm. Specifically, can it produce a queue state where the leader is missing, or where queues are not syncing or are stuck at 100% of sync after the restart?

 

--

   Best Regards,

 

    Vilius Šumskas

    Rivile

    IT manager

    +370 614 75713

 

jo...@cloudamqp.com

Jun 12, 2024, 11:51:34 AM
to rabbitmq-users
Hi,
Yes, it can produce a state where syncing gets stuck. This happens often with classic mirrored queues (CMQ).
You can limit this somewhat by setting a smaller sync batch size or, better, mirroring_sync_max_throughput (https://github.com/rabbitmq/rabbitmq-server/pull/3925), but the fundamental flaws of CMQ are still there.

Note that CMQ are deprecated and are removed in 4.0.
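
As a rough illustration of the batch-size knob, the sketch below builds the path and JSON body for a management HTTP API policy update that sets `ha-sync-batch-size` on mirrored queues. The policy name, queue name pattern, `ha-mode` value, and the batch size of 50 are assumptions chosen for the example, not recommended values. Sending the request is left to whatever HTTP client is in use.

```python
import json
from urllib.parse import quote


def sync_policy_request(vhost, name, pattern, batch_size):
    """Build (path, body) for PUT /api/policies/{vhost}/{name}.

    Only the request is constructed here; nothing is sent to a broker.
    """
    # The vhost and policy name must be percent-encoded ("/" becomes "%2F").
    path = "/api/policies/{}/{}".format(quote(vhost, safe=""), quote(name, safe=""))
    body = {
        "pattern": pattern,          # assumed queue name pattern
        "apply-to": "queues",
        "priority": 0,
        "definition": {
            "ha-mode": "all",        # assumed mirroring mode
            "ha-sync-batch-size": batch_size,  # illustrative value only
        },
    }
    return path, json.dumps(body, sort_keys=True)


path, body = sync_policy_request("/", "ha-small-batches", "^ha\\.", 50)
print(path)
print(body)
```

A policy definition replaces whatever policy currently matches the same queues, so in practice the existing mirroring keys would need to be carried over alongside the batch size.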

/Johan

Vilius Šumskas

Jun 12, 2024, 2:31:25 PM
to rabbitm...@googlegroups.com

Thank you for your response, Johan.

 

So, in essence, it is probably a good idea to ensure that nodes always have at least 33-40% of free memory (given a balanced 3-node cluster) in case any node needs a reboot?
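
The arithmetic behind the 33% lower bound can be sketched as follows. This assumes that the load of a stopped node spreads evenly over the remaining nodes and that memory use scales linearly with load; both are simplifications, and the function names are made up for the example.

```python
def max_safe_utilisation(nodes, nodes_down=1):
    """Highest per-node memory utilisation (as a fraction of the limit)
    that still fits once `nodes_down` nodes stop and their load is
    spread evenly over the nodes that remain."""
    if not 0 <= nodes_down < nodes:
        raise ValueError("nodes_down must be between 0 and nodes - 1")
    # Each survivor ends up with nodes / (nodes - nodes_down) times its load,
    # so the load before the restart must stay below the inverse of that.
    return (nodes - nodes_down) / nodes


def required_free_fraction(nodes, nodes_down=1):
    """Fraction of memory each node should keep free before the restart."""
    return 1.0 - max_safe_utilisation(nodes, nodes_down)


free = required_free_fraction(3)
print(f"{free:.1%} free memory needed per node")
```

For three nodes this gives one third, which is where the lower end of the 33-40% range comes from; anything above that is margin for uneven redistribution.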

 

--

    Vilius


jo...@cloudamqp.com

Jun 12, 2024, 4:59:38 PM
to rabbitmq-users
That's a good rule of thumb, but more info is needed. Are there a lot of connections/channels? Are those reconnecting? In the worst case you'd have to size for the event that all connections end up on one node, and keeping connections/channels open requires some amount of memory.

A better strategy would be quorum queues for a small number of queues (say hundreds or low thousands; quorum queues do not block on sync and only send the delta, not a complete re-sync) and classic non-mirrored queues for the rest (which in 2024 can easily handle millions of messages and tens of thousands of queues without breaking a sweat).
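
A minimal sketch of how that split might look at declaration time, using the `x-queue-type` queue argument. The helper name and the choice to make classic queues durable are assumptions for the example; the returned dict is meant to be passed to a client library's queue declare call.

```python
def queue_declare_options(replicated, initial_group_size=None):
    """Return declare options for a replicated (quorum) or plain classic queue.

    Hypothetical helper: it only builds the options and does not talk to a broker.
    """
    if not replicated:
        # Classic, non-mirrored queue; durable=True is an assumption here.
        return {"durable": True, "arguments": {"x-queue-type": "classic"}}
    arguments = {"x-queue-type": "quorum"}
    if initial_group_size is not None:
        # Optional: number of replicas the quorum queue starts with.
        arguments["x-quorum-initial-group-size"] = initial_group_size
    # Quorum queues have to be declared durable.
    return {"durable": True, "arguments": arguments}


print(queue_declare_options(replicated=True))
print(queue_declare_options(replicated=False))
```

With a client such as pika this could be used as `channel.queue_declare(queue="some-queue", **queue_declare_options(True))`, where the queue name is a placeholder.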

/Johan

Vilius Šumskas

Jun 12, 2024, 5:17:21 PM
to rabbitm...@googlegroups.com

We have 1k connections (auto-reconnecting on failover) and 10k channels which map to around 12k queues. Almost all queues are very short.

Memory breakdown: https://pasteboard.co/AAORtjwdVHcH.png .

 

Interesting. Are you implying that RabbitMQ would not handle such a number of quorum queues easily? Or does this apply only to _mirrored_ quorum queues?

jo...@cloudamqp.com

Jun 13, 2024, 2:17:50 PM
to rabbitmq-users
Q: Are you implying that RabbitMQ would not handle such a number of quorum queues easily?
A: Yes, I would not recommend using more than a few thousand "mirrored" QQs in the same cluster.

I haven't done much testing with quorum queues of size 1 in a cluster, as the unique features not present in classic queues v2 (CQv2), poison message handling and at-least-once dead-lettering being the biggest, are usually needed for just a few queues, not all.

/Johan