Hi Folks,
We run a cron job every five minutes that runs 'rabbitmqctl report' and logs the output to a file. On one particular setup, after running successfully for many hours, the command started hanging. Eventually the system ran out of memory, and at that point I found about 350 instances of the command (beam.smp) still running. In other words, the command failed to return about 350 times. In the log file, the output is always truncated like this:
<rab...@localhost.2.8047.1> 192.168.1.146:26943 -> 192.168.1.240:5671 5671 26943 192.168.1.240 192.168.1.146 true PLAIN tlsv1.2 rsa aes_256_cbc sha {0,9,1} cvn-hv-ded3cfa3-d639-11e5-9add-9dc306575d87 nsx 0 131072 100 [{"product","rabbitmq-c"},{"information","See http://hg.rabbitmq.com/rabbitmq-c/"}] 1458021579876 470366 1053 5629 65 0 running 1
Channels:
So, apparently, the command hangs while trying to print the channels. I also found 700 inet_gethost processes running. This seems to indicate that each time the command runs, it spawns 2 inet_gethost processes, neither of which returns.
Has anybody seen this before? Any suggestions on what may be going wrong?
Also, after a few hours, some or all of the hung commands start printing again. We also see these 3 messages, which might be related to the commands resuming:
inet_gethost[13423]: WARNING:Timeout waiting for child process to die, ignoring child (pid = 13424).
inet_gethost[31255]: WARNING:Timeout waiting for child process to die, ignoring child (pid = 31256)
inet_gethost[31112]: WARNING:Timeout waiting for child process to die, ignoring child (pid = 24698).
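For reference, one possible stopgap for the pile-up (it does not address the root cause) would be to run the report under a timeout, so that a hung instance is killed instead of accumulating until memory runs out. Below is a minimal sketch in Python, not something we have deployed. The function name run_with_timeout and the 120-second limit are illustrative assumptions, and it assumes the inet_gethost helpers stay in the same process group as the command, which I have not verified.

```python
import os
import signal
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (exit_code, output_text).

    exit_code is None if the command exceeded timeout_s and was killed.
    The child is started in its own session so that the whole process
    group (the command plus any helpers that stay in the group) can be
    killed together.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        start_new_session=True,  # child becomes leader of a new process group
    )
    try:
        out, _ = proc.communicate(timeout=timeout_s)
        return proc.returncode, out.decode(errors="replace")
    except subprocess.TimeoutExpired:
        # Kill every process in the child's group, then collect whatever
        # output was written before the hang.
        os.killpg(proc.pid, signal.SIGKILL)
        out, _ = proc.communicate()
        return None, out.decode(errors="replace")


# Hypothetical use from the cron job (limit chosen arbitrarily):
#   code, report = run_with_timeout(["rabbitmqctl", "report"], 120)
#   if code is None:
#       sys.stderr.write("rabbitmqctl report timed out\n")
```

With this in place, a timed-out run would still leave the truncated output in the log, which would keep the evidence of where the command stalled.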
Thanks
Kapil