rabbitmqctl hanging


Kapil Goyal

Mar 23, 2016, 10:45:15 PM
to rabbitm...@googlegroups.com

Hi Folks,

 

We run a cron job every five minutes that runs 'rabbitmqctl report' and logs the output to a file. On one particular setup, after running successfully for many hours, the command started hanging. Eventually the system ran out of memory, and at that point I found about 350 instances of the command (beam.smp) still running. In other words, the command failed to return 350 times. In the log file, the output is always truncated like this:

 

<rab...@localhost.2.8047.1>     192.168.1.146:26943 -> 192.168.1.240:5671       5671    26943   192.168.1.240   192.168.1.146   true                            PLAIN   tlsv1.2 rsa     aes_256_cbc     sha     {0,9,1} cvn-hv-ded3cfa3-d639-11e5-9add-9dc306575d87     nsx     0       131072  100     [{"product","rabbitmq-c"},{"information","See http://hg.rabbitmq.com/rabbitmq-c/"}]     1458021579876   470366  1053    5629    65      0       running 1

 

Channels:

 

So, apparently, the command hangs while trying to print channels. I also found 700 inet_gethost processes running. This seems to indicate that each time the command runs, it spawns 2 inet_gethost processes, neither of which returns.

 

Has anybody seen this before? Any suggestions on what may be going wrong?

 

Also, after a few hours, some or all of the commands start printing again. We also see these three messages printed, which might be related to this progress:

 

inet_gethost[13423]: WARNING:Timeout waiting for child process to die, ignoring child (pid = 13424).

inet_gethost[31255]: WARNING:Timeout waiting for child process to die, ignoring child (pid = 31256)

inet_gethost[31112]: WARNING:Timeout waiting for child process to die, ignoring child (pid = 24698).

 

Thanks

Kapil

Michael Klishin

Mar 23, 2016, 11:18:15 PM
to rabbitm...@googlegroups.com
rabbitmqctl report lists every single queue, binding, and connection, so it can take a long time to generate output.

Depending on what exactly enforces the timeout, the cause can be either that or a hostname resolution problem.
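
Whichever of the two it turns out to be, one way to keep stuck runs from piling up and exhausting memory is to have the cron job enforce its own deadline. The following is only a rough sketch, not something rabbitmqctl provides: `run_with_deadline` and the 120-second value in the usage comment are made-up names and numbers for illustration. It assumes a POSIX system and that any helper processes stay in the same process group as the command.

```python
import os
import signal
import subprocess


def run_with_deadline(cmd, timeout_s):
    """Run cmd and return (finished, output).

    The command is started in its own session (and so its own process
    group). If it does not finish within timeout_s seconds, the whole
    group is killed, which also takes down any child processes that
    stayed in that group.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        start_new_session=True,  # POSIX only: child becomes a group leader
    )
    try:
        out, _ = proc.communicate(timeout=timeout_s)
        return True, out.decode(errors="replace")
    except subprocess.TimeoutExpired:
        try:
            # The group id equals the child's pid because of start_new_session.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the process exited just before we tried to kill it
        out, _ = proc.communicate()  # collect whatever was printed so far
        return False, out.decode(errors="replace")


# Hypothetical usage from the cron job (the deadline value is an assumption):
#   finished, output = run_with_deadline(["rabbitmqctl", "report"], 120)
#   if not finished: log that the report was cut off at the deadline
```

With a deadline shorter than the five-minute cron interval, at most one instance of the command would be running at any time, and a truncated report would be recorded as such instead of leaving a process behind.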

inet_gethost certainly hints at the latter. Make sure your node hostname resolves either via DNS or the local hosts file.
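
A quick way to check this from the affected machine is sketched below. It is only an illustration: `node_host` and `resolves` are hypothetical helper names, and the sketch assumes the node name has the usual `name@host` form. It asks the system resolver, so it should reflect both DNS and the local hosts file as configured on that machine.

```python
import socket


def node_host(node_name):
    """Return the host part of a node name such as 'rabbit@myhost'.

    If there is no '@' in the name, the whole string is returned.
    """
    return node_name.rpartition("@")[2]


def resolves(host):
    """Return True if the system resolver can resolve host, else False."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except socket.gaierror:
        return False


# Hypothetical usage (substitute the real node name):
#   host = node_host("rabbit@localhost")
#   print(host, "resolves" if resolves(host) else "does NOT resolve")
```

Note that this only tells you whether resolution succeeds, not how long it takes; a resolver that answers very slowly could still cause delays.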