Classic queues high memory usage on mirror nodes


Daniel Fenert

May 24, 2022, 7:29:02 AM
to rabbitmq-users
Hi, 

I have a 3-node RabbitMQ cluster with ~300 mirrored classic queues (all durable, most in-memory, a few lazy ones).

For the second time within a week I'm experiencing very high memory usage on the mirror nodes.
It seems to happen during a publishing peak to one exchange that delivers messages to 10 queues.
That leads to a high-memory alarm (and downtime). After the alarm fires, memory is never freed on the mirror instances, so I had to restart those slave instances to get things back to working order (screenshots of the UI after the last restart).
Because I restarted 2 nodes, one RabbitMQ node is now the master for all the queues.

Screenshot from UI on current memory stats and history.

Do you have any ideas about such a large difference in memory usage between the master and slave nodes?

I recently upgraded from 3.7.x to 3.9.14. Before that it was pretty stable, but I'm not sure whether the new version is the cause of the problem.
Attachments: master.png, masterhistory.png, slave2.png, overview.png, slave1history.png, slave2history.png, slave1.png
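For context, a mirrored classic-queue setup like the one described here is normally declared via an `ha-mode` policy rather than per-queue arguments. A minimal sketch (the policy name `ha-all` and the catch-all pattern are illustrative assumptions, not taken from this thread):

```shell
# Mirror all queues to every node in the cluster and sync new mirrors
# automatically. "ha-all" is an illustrative policy name; "^" matches
# every queue in the default vhost.
rabbitmqctl set_policy ha-all "^" \
  '{"ha-mode":"all","ha-sync-mode":"automatic"}' \
  --apply-to queues
```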

Luke Bakken

May 24, 2022, 7:59:46 AM
to rabbitmq-users
Hello,

Could you please provide more details about this statement?

" It looks like it happens on publishing peak to one exchange that delivers msgs to 10 queues."

What exactly do you mean by "publishing peak"? Does the message rate increase? Size of the messages? Can the consumers keep up? Without that information we can't try to reproduce this issue.

Have you attempted to rebalance the queue masters after a restart? (https://www.rabbitmq.com/rabbitmq-queues.8.html#rebalance)
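The rebalance operation mentioned above is run with the `rabbitmq-queues` CLI tool against a live cluster, from any cluster node:

```shell
# Spread queue masters evenly across the cluster nodes.
# "all" rebalances every queue type; "classic" limits it to classic queues.
rabbitmq-queues rebalance all

# Classic queues only:
rabbitmq-queues rebalance classic
```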

Thanks,
Luke

Daniel Fenert

May 24, 2022, 9:57:46 AM
to rabbitmq-users
Hello,
The message rate increased; message sizes were normal.
Consumers had some lag: 600K messages in queues globally at the peak, delivered to 4 queues.
Memory usage on the slaves kept increasing even as the messages were consumed (visible in the attached screenshot).
I'll try rebalancing the queues.

What happened after my initial post:
1. I added some RAM (30% more).
2. The team producing this high rate of messages reported that they stopped getting publisher confirms (at least, the rate at which they could publish was very low). Yet in the RabbitMQ UI everything looked fine, no alarms, just more and more memory being used by the slave nodes.
3. Memory on the nodes that became slaves after the restart kept increasing until another high-memory alert, despite the 30% extra memory.
4. I restarted the slaves again at 14:14 (visible on the attached screenshot).
5. Now everything seems normal:
    - memory is not climbing to unusual levels,
    - the team mentioned in point 2 was able to publish all the messages they needed: ~150K published, resulting in ~500K delivered to various queues,
    - everything has looked stable and working for the last few days.

A similar-looking problem occurred 8 days earlier; in between, everything worked stably and without problems.

Daniel Fenert

May 24, 2022, 9:59:25 AM
to rabbitmq-users
Forgotten attachment:
Screenshot Rabbitmq Dashboard - Grafana.png

Luke Bakken

Sep 7, 2022, 1:47:06 PM
to rabbitmq-users
Hello,

I outline some workarounds here that should address your issue:


Note that I tested them on the latest RabbitMQ series, 3.10.x.

Daniel Fenert

Sep 9, 2022, 2:57:59 AM
to rabbitmq-users
Thanks for the info,

switching the queue mode to `lazy` also solves the problem for me.