rabbitmq-server frequently stops with erlang crash dump


rabbitmq_user

Apr 25, 2019, 3:57:57 PM
to rabbitmq-users
Hello! 

I'm running a 3-node cluster with mirrored queues on RabbitMQ 3.7.12 and Erlang/OTP 21.3.5.

Message rates are low (at most 70 messages/second queued in total). The management UI shows memory use of ~500 MB, well below the high watermark of ~20 GB.

rabbitmq-server crashes with an Erlang crash dump anywhere between 1 hour and 1.5 days after the process starts, with the same message every time:

Slogan: eheap_alloc: Cannot allocate 17517010032464 bytes of memory (of type "message").


The process always ends up crashing within 1.5 days.

What can I try to improve stability on my cluster?

Daniil Fedotov

Apr 26, 2019, 1:07:45 PM
to rabbitmq-users
Hello,

This doesn't look right. That's more than 16 terabytes. I don't think anything should allocate that much memory ever.
Do you have the memory metrics collected for the time before this error happens?
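
For scale, here is a minimal sketch of the arithmetic behind that figure. Python is used purely for illustration; the byte count is copied from the crash dump slogan, and the 20 GB watermark is an assumption based on the approximate "~20GB" quoted in the original post.

```python
# Size requested in the crash dump slogan (eheap_alloc, type "message").
requested_bytes = 17_517_010_032_464

# Assumption: "~20GB" high watermark read as 20 decimal gigabytes.
high_watermark_bytes = 20 * 10**9

tb = requested_bytes / 10**12   # decimal terabytes
tib = requested_bytes / 2**40   # binary tebibytes
ratio = requested_bytes / high_watermark_bytes

print(f"requested: {tb:.2f} TB ({tib:.2f} TiB)")
print(f"that is roughly {ratio:.0f} times the high watermark")
```

Either way you count it (about 17.5 decimal terabytes, or just under 16 TiB), the request is hundreds of times larger than the configured watermark, which suggests a bogus allocation size rather than gradual memory growth.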

rabbitmq_user

Apr 26, 2019, 2:22:34 PM
to rabbitmq-users
rabbitmqctl status on a currently running node in the cluster looks like: 

{memory,
     [{connection_readers,13198088},
      {connection_writers,3651644},
      {connection_channels,16082856},
      {connection_other,29032668},
      {queue_procs,26453992},
      {queue_slave_procs,8161408},
      {plugins,36712076},
      {other_proc,6075512},
      {metrics,3861996},
      {mgmt_db,49793632},
      {mnesia,2232200},
      {other_ets,4389376},
      {binary,157039712},
      {msg_index,31024},
      {code,23752080},
      {atom,1213657},
      {other_system,17060751},
      {allocated_unused,172055408},
      {reserved_unallocated,0},
      {strategy,rss},
      {total,[{erlang,398742672},{rss,556539904},{allocated,570798080}]}]},


This is approximately the "normal" status of every node until the crash suddenly happens. Below is a graph of erlang_vm_memory_bytes_total for the node that crashed, covering rabbitmq-server startup until the crash:

Screen Shot 2019-04-26 at 2.17.15 PM.png

rabbitmq_user

Apr 26, 2019, 3:51:42 PM
to rabbitmq-users
(Update: The green line in the graph is System memory and the red line is Processes)

Michael Klishin

May 13, 2019, 3:41:20 PM
to rabbitmq-users
172 MB of about 250 is "allocated but not yet used by the runtime". See [1].
A few dozen more MB are connections, see [2].
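
As a rough cross-check of that breakdown, the sketch below sums the categories from the rabbitmqctl status output quoted earlier in the thread. The numbers are copied from that output; the dictionary layout and variable names are only illustrative and are not part of any RabbitMQ API.

```python
# Per-category byte counts copied from the `rabbitmqctl status` output above.
memory = {
    "connection_readers": 13198088,
    "connection_writers": 3651644,
    "connection_channels": 16082856,
    "connection_other": 29032668,
    "queue_procs": 26453992,
    "queue_slave_procs": 8161408,
    "plugins": 36712076,
    "other_proc": 6075512,
    "metrics": 3861996,
    "mgmt_db": 49793632,
    "mnesia": 2232200,
    "other_ets": 4389376,
    "binary": 157039712,
    "msg_index": 31024,
    "code": 23752080,
    "atom": 1213657,
    "other_system": 17060751,
}
allocated_unused = 172055408
reported_total = {"erlang": 398742672, "rss": 556539904, "allocated": 570798080}

# Memory actually in use by the runtime, and the share taken by connections.
used = sum(memory.values())
connections = sum(v for k, v in memory.items() if k.startswith("connection_"))

print(used == reported_total["erlang"])
print(used + allocated_unused == reported_total["allocated"])
print(f"connections:      {connections / 1e6:.1f} MB")
print(f"allocated_unused: {allocated_unused / 1e6:.1f} MB")
```

For these figures both comparisons print True: the categories sum exactly to the reported erlang total, and adding allocated_unused gives the allocated total. The four connection categories together come to about 62 MB.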




--
MK

Staff Software Engineer, Pivotal/RabbitMQ