consume large memory until OOM, lots of lager_error_logger_h dropped


Victor Lee

Apr 22, 2022, 9:45:00 AM
to rabbitmq-users
Hi, everyone.
Our OpenStack cluster has three control nodes (node-30, node-31, node-32) plus a number of compute nodes; one RabbitMQ Docker container runs on each control node. Occasionally, one of the three RabbitMQ nodes consumes a large amount of memory until the system OOMs or we restart the container manually.
fs2_rbmq.png
As the attached picture shows, node-30's memory grew until it triggered an OOM at midnight (2022-04-08 03:16) and the container had to be restarted.
The lager error_logger_hwm setting on all RabbitMQ nodes is {ok,50}.
Here are some log snippets:
/var/log/kolla/rabbitmq/rab...@node-30.log
2022-04-08 02:49:45.007 [warning] <0.32.0> lager_error_logger_h dropped 9 messages in the last second that exceeded the limit of 1000 messages/sec
2022-04-08 03:17:44.003 [warning] <0.32.0> lager_error_logger_h dropped 154 messages in the last second that exceeded the limit of 1000 messages/sec

/var/log/kolla/rabbitmq/log/crash.log(node-30)
2022-04-17 13:23:40 =SUPERVISOR REPORT====
2022-04-17 11:20:37.826
     Offender:   [{nb_children,1},{name,channel_sup},{mfargs,{rabbit_channel_sup,start_link,[]}},{restart_type,temporary},{shutdown,infinity},{child_type,supervisor}]
2022-04-07 23:52:29.201
     Offender:   [{nb_children,1},{name,channel_sup},{mfargs,{rabbit_channel_sup,start_link,[]}},{restart_type,temporary},{shutdown,infinity},{child_type,supervisor}]
     Reason:     shutdown
     Context:    shutdown_error
     Supervisor: {<0.7366.866>,rabbit_channel_sup_sup}
2022-04-07 23:52:22 =SUPERVISOR REPORT====

/var/log/kolla/rabbitmq/rab...@node-31.log
2022-04-08 02:47:17.000 [warning] <0.32.0> lager_error_logger_h dropped 17 messages in the last second that exceeded the limit of 1000 messages/sec

/var/log/kolla/rabbitmq/rab...@node-32.log
2022-04-08 03:17:33.000 [warning] <0.32.0> lager_error_logger_h dropped 2925 messages in the last second that exceeded the limit of 1000 messages/sec
2022-04-08 03:17:38.000 [warning] <0.32.0> lager_error_logger_h dropped 206 messages in the last second that exceeded the limit of 1000 messages/sec
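To get a feel for the scale of the drops, the warnings quoted above can be tallied per node with a small script. A rough sketch over just the lines shown here (the full logs would give the real totals):

```python
import re
from collections import defaultdict

# The lager warnings quoted above, truncated to the relevant part.
LOG_LINES = [
    ("node-30", "2022-04-08 02:49:45.007 [warning] <0.32.0> lager_error_logger_h dropped 9 messages in the last second"),
    ("node-30", "2022-04-08 03:17:44.003 [warning] <0.32.0> lager_error_logger_h dropped 154 messages in the last second"),
    ("node-31", "2022-04-08 02:47:17.000 [warning] <0.32.0> lager_error_logger_h dropped 17 messages in the last second"),
    ("node-32", "2022-04-08 03:17:33.000 [warning] <0.32.0> lager_error_logger_h dropped 2925 messages in the last second"),
    ("node-32", "2022-04-08 03:17:38.000 [warning] <0.32.0> lager_error_logger_h dropped 206 messages in the last second"),
]

DROP_RE = re.compile(r"lager_error_logger_h dropped (\d+) messages")

# Sum the dropped-message counts per node.
totals = defaultdict(int)
for node, line in LOG_LINES:
    m = DROP_RE.search(line)
    if m:
        totals[node] += int(m.group(1))

for node in sorted(totals):
    print(node, totals[node])
# node-30 163
# node-31 17
# node-32 3131
```

Even from these few lines it is clear that node-32 was dropping far more log traffic than the others around the time of the incident.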

I have attached the full logs.

Question:
- 1. What is the root cause of this problem, and how can I fix it?
- 2. Can I fix it by raising error_logger_hwm to a larger value, such as 4000? What would the harm be? I ask because the lager README.md says of error_logger_hwm: "It is probably best to keep this number small."
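If raising the watermark is the route you take, then on 3.7.x (which still uses lager) the value can be overridden via advanced.config. A minimal sketch, assuming a stock config layout — the file path, and whether kolla exposes it, differ per deployment:

```erlang
%% advanced.config -- a list of {App, [{Key, Value}]} Erlang terms.
[
  {lager, [
    %% High-water mark for messages/sec forwarded from error_logger
    %% before lager starts dropping them. 4000 is the value proposed
    %% above; higher values trade memory and CPU for fewer drops.
    {error_logger_hwm, 4000}
  ]}
].
```

Note that a higher watermark only hides the drop warnings; it does not address whatever is generating the flood of error_logger messages in the first place.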

I found a relevant Stack Overflow question, but nobody there could help me. Attached logs:
node-32_rabbitmq.log
node-30_rabbitmq.log
node-31_rabbitmq.log

Victor Lee

Apr 24, 2022, 3:12:08 AM
to rabbitmq-users
FYI: question 2 is solved.

Michal Kuratczyk

Apr 25, 2022, 3:00:19 AM
to rabbitm...@googlegroups.com
Hi,

This is RabbitMQ 3.7.10, which is over 3 years old and has been unsupported for a long time (it also runs on an unsupported Erlang). We don't even use lager these days.
You have a high connection churn - clients keep connecting and disconnecting. This may be the cause or a contributing factor.
The connections seem to be closed due to missed heartbeats. It looks like your client apps or the network are unhealthy.

Best,

--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/rabbitmq-users/0e7ae6ae-6fc6-4c86-8ef1-772884f5e2b9n%40googlegroups.com.


--
Michał
RabbitMQ team

Victor Lee

Apr 26, 2022, 2:13:07 AM
to rabbitmq-users
Thank you, Michał.
>Seems like your client apps or the network are unhealthy.
I disagree: we solved the problem just by restarting the RabbitMQ process that was consuming too much memory.
Doesn't that show there was no issue with other apps or the network?

Michal Kuratczyk

Apr 26, 2022, 3:03:32 AM
to rabbitm...@googlegroups.com
Hi,

There are many potential reasons for excessive memory usage. The first step is to learn what the memory is used for.

Establishing what the memory is used for does not immediately guarantee that we will know why it happened, but it's definitely a step towards it.
Keep in mind that if it turns out to be a bug in RabbitMQ or Erlang, it will not get fixed in the versions you are using as they are no longer supported.
You may be chasing an issue solved a long time ago.
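On 3.7.x, one place to see what the memory is used for is the {memory,[...]} section of `rabbitmqctl status` (newer releases also offer `rabbitmq-diagnostics memory_breakdown`). A rough sketch of turning that output into percentages — the sample numbers below are made up for illustration, not taken from this thread:

```python
import re

# Illustrative sample of the {memory,[...]} section printed by
# `rabbitmqctl status` on 3.7.x (values are hypothetical).
SAMPLE = """{memory,[{connection_readers,1234567},
          {queue_procs,89012345},
          {binary,456789012},
          {total,987654321}]}"""

def memory_breakdown(status_text):
    """Parse {key,Bytes} pairs from the memory section of `rabbitmqctl status`."""
    pairs = re.findall(r"\{(\w+),(\d+)\}", status_text)
    return {k: int(v) for k, v in pairs}

mb = memory_breakdown(SAMPLE)
total = mb.pop("total")
# Print categories sorted by share of total memory, largest first.
for key, nbytes in sorted(mb.items(), key=lambda kv: -kv[1]):
    print(f"{key:20s} {nbytes / total:6.1%}")
```

Whichever category dominates (binary heap, queue processes, connection readers, ...) narrows down where to look next.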

Best,



--
Michał
RabbitMQ team