Classic queues high memory usage on mirror nodes


Daniel Fenert

May 24, 2022, 7:29:02 AM
to rabbitmq-users
Hi, 

I have a 3-node RabbitMQ cluster with ~300 mirrored classic queues (all durable, most in-memory, a few lazy ones).

For the second time within a week I'm experiencing very high memory usage on the mirror nodes.
It seems to happen during a publishing peak to one exchange that delivers messages to 10 queues.
That leads to a high-memory alarm (and downtime). After the alarm fires, memory is never freed on the mirror instances, so I had to restart those slave instances to get things back to working order (screenshots of the UI after the last restart).
Because I restarted 2 nodes, one RabbitMQ node is now the master for all the queues.

Screenshot from UI on current memory stats and history.

Do you have any ideas about such a large difference in memory usage between the master and slave nodes?

I recently upgraded from 3.7.x to 3.9.14. Before that it was pretty stable, but I'm not sure whether the new version is the cause of the problem.
Attachments: master.png, masterhistory.png, slave2.png, overview.png, slave1history.png, slave2history.png, slave1.png
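For context, a mirrored classic-queue setup like the one described here is normally declared via an `ha-mode` policy rather than per-queue arguments. A minimal sketch (the policy name `ha-all` and the catch-all pattern are illustrative assumptions, not taken from this thread):

```shell
# Mirror all queues to every node in the cluster and sync new mirrors
# automatically. "ha-all" is an illustrative policy name; "^" matches
# every queue in the default vhost.
rabbitmqctl set_policy ha-all "^" \
  '{"ha-mode":"all","ha-sync-mode":"automatic"}' \
  --apply-to queues
```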

Luke Bakken

May 24, 2022, 7:59:46 AM
to rabbitmq-users
Hello,

Could you please provide more details about this statement?

" It looks like it happens on publishing peak to one exchange that delivers msgs to 10 queues."

What exactly do you mean by "publishing peak"? Does the message rate increase? Size of the messages? Can the consumers keep up? Without that information we can't try to reproduce this issue.

Have you attempted to rebalance the queue masters after a restart? (https://www.rabbitmq.com/rabbitmq-queues.8.html#rebalance)
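The rebalance operation mentioned above is run with the `rabbitmq-queues` CLI tool against a live cluster, from any cluster node:

```shell
# Spread queue masters evenly across the cluster nodes.
# "all" rebalances every queue type; "classic" limits it to classic queues.
rabbitmq-queues rebalance all

# Classic queues only:
rabbitmq-queues rebalance classic
```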

Thanks,
Luke

Daniel Fenert

May 24, 2022, 9:57:46 AM
to rabbitmq-users
Hello,
The message rate increased; message sizes were normal.
Consumers had some lag: 600K messages in queues globally at the peak, delivered to 4 queues.
Memory usage on the slaves kept increasing even as the messages were consumed (visible in the attached screenshot).
I'll try rebalancing the queues.

What happened after my initial post:
1. I added some RAM (30% more).
2. The team producing this high rate of messages reported that they stopped getting publisher confirms (at least, the rate at which they could publish was very low). Yet in the RabbitMQ UI everything looked fine, no alarms, just more and more memory being used by the slave nodes.
3. Memory on the nodes that became slaves after the restart kept increasing until another high-memory alert, despite the 30% extra memory.
4. I restarted the slaves again at 14:14 (visible on the attached screenshot).
5. Now everything seems normal:
    - memory is not climbing to unusual levels,
    - the team mentioned in point 2 was able to publish all the messages they needed: ~150K published, resulting in ~500K delivered to various queues,
    - everything has looked stable and working for the last few days.

A similar-looking problem occurred 8 days earlier; in between, everything worked stably and without problems.

Daniel Fenert

May 24, 2022, 9:59:25 AM
to rabbitmq-users
Forgotten attachment:
Screenshot Rabbitmq Dashboard - Grafana.png

Luke Bakken

Sep 7, 2022, 1:47:06 PM
to rabbitmq-users
Hello,

I outline some workarounds here that should address your issue:


Note that I tested them on the latest RabbitMQ series, 3.10.x.

Daniel Fenert

Sep 9, 2022, 2:57:59 AM
to rabbitmq-users
Thanks for the info,

switching the queue mode to `lazy` also solves the problem for me.