msg_store_persistent: rebuilding indices from scratch". So, I thought I'd upgrade to 3.5.6 (via homebrew). The upgrade went without any problems. However, rabbitmq is still "stuck". I rebooted the Mac and restarted rabbitmq, but the problem just won't go away.
Please help. The crash and log files are on google drive:
Many thanks,
Vitaly
The erl_crash.dump and rab...@localhost.log are attached with my prior post. The sasl log is on google drive at https://drive.google.com/file/d/0B826DlZRam76OTBrZ3V3NmNKcW8/view?usp=sharing
--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.
To post to this group, send email to rabbitm...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Hi Alvaro, Michael, et al,
here are the results:
1. This problem is happening with the VPN OFF. I am aware of issues when running with VPN, as experienced personally and discussed on numerous threads here. RabbtMQ would slow down severely with VPN ON, but then would recover once the VPN was turned OFF. Every now and then, it wouldn't recover after turning VPN OFF, and I would have to kill it, and then restart rabbit to get back to normal. This time is different: RabbitMQ seems "stuck" all the time regardless of VPN ON/OFF and rebooting the Mac OS X 10.10.5 doesn't help at all.
2. This all has been running fine for a long time with 3.5.1, except the pesky erl_crash.dump discussed in the thread https://groups.google.com/forum/#!searchin/rabbitmq-users/erl_crash.dump%7Csort:date/rabbitmq-users/3oYTb9feQr8/iSBiRGy8tZ4J. The problem in the current thread started suddenly this night, after successfully running tests and code for much of the preceding day. Now, every time I execute `rabbitmq-server -detached`, erl_crash.dump shows up in the current directory. Per referenced thread, the regular erl_crash dumps were going to be addressed. I installed 3.5.6 in hopes that it would help recover from the stuck state (before posting this thread), but it didn't help. However, I can't tell whether the erl_crash dump in the other thread is related to the one in this thread.
3.
$ /bin/df -kP /usr/local/var/lib/rabbitmq/mnesia/
Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/disk1 487385236 154144276 332984960 32% /
$ hostname
vkruglikovs-MacBook-Pro.local
This shows that the pure DNS resolver doesn't see it:
$ host `hostname`
Host vkruglikovs-MacBook-Pro.local not found: 3(NXDOMAIN)
But the "official" system resolver on Mac OS X sees it just fine (aided by the hosts file, I guess):
$ dscacheutil -q host -a name `hostname`
name: localhost
alias: vkruglikovs-macbook-pro.local
ipv6_address: ::1
name: localhost
alias: vkruglikovs-macbook-pro.local
ip_address: 127.0.0.1
And ping works, too:
$ ping `hostname`
PING localhost (127.0.0.1): 56 data bytes
64 bytes from 127.0.0.1: icmp_seq=0 ttl=64 time=0.065 ms
And in Python:
In [7]: socket.gethostname()
Out[7]: 'vkruglikovs-MacBook-Pro.local'
In [8]: socket.gethostbyname('vkruglikovs-MacBook-Pro.local')
Out[8]: '127.0.0.1'
--
vkruglikov@vkruglikovs-MacBook-Pro-2:~$ ps aux | grep -i rabbit
root 19212 1.9 0.6 2672648 97024 s004 S 6:00PM 0:34.58 /usr/local/Cellar/rabbitmq/3.5.6/erts-6.4/bin/../../erts-6.4/bin/beam.smp -W w -A 64 -P 1048576 -K true -B i -- -root /usr/local/Cellar/rabbitmq/3.5.6/erts-6.4/bin/../.. -progname erl -- -home /Users/vkruglikov -- -pa /usr/local/Cellar/rabbitmq/3.5.6/ebin -noshell -noinput -s rabbit boot -sname rabbit@localhost -boot /usr/local/Cellar/rabbitmq/3.5.6/releases/3.5.6/start_sasl -kernel inet_default_connect_options [{nodelay,true}] -rabbit tcp_listeners [{"127.0.0.1",5672}] -sasl errlog_type error -sasl sasl_error_logger false -rabbit error_logger {file,"/usr/local/var/log/rabbitmq/rab...@localhost.log"} -rabbit sasl_error_logger {file,"/usr/local/var/log/rabbitmq/rab...@localhost-sasl.log"} -rabbit enabled_plugins_file "/usr/local/etc/rabbitmq/enabled_plugins" -rabbit plugins_dir "/usr/local/Cellar/rabbitmq/3.5.6/plugins" -rabbit plugins_expand_dir "/usr/local/var/lib/rabbitmq/mnesia/rabbit@localhost-plugins-expand" -os_mon start_cpu_sup false -os_mon start_disksup false -os_mon start_memsup false -mnesia dir "/usr/local/var/lib/rabbitmq/mnesia/rabbit@localhost" -kernel inet_dist_listen_min 25672 -kernel inet_dist_listen_max 25672
root 19209 0.0 0.0 2467612 388 ?? S 6:00PM 0:00.04 /usr/local/Cellar/rabbitmq/3.5.6/erts-6.4/bin/../../erts-6.4/bin/epmd -daemon
dtrace output for /usr/local/Cellar/rabbitmq/3.5.6/erts-6.4/bin/../../erts-6.4/bin/beam.smp log: https://gist.github.com/vitaly-krugl/739ab8b4e146c31741e1
dtrace output for /usr/local/Cellar/rabbitmq/3.5.6/erts-6.4/bin/../../erts-6.4/bin/epmd -daemon: https://gist.github.com/vitaly-krugl/fc63a0d4d336f7ebc7d7
vkruglikov@vkruglikovs-MacBook-Pro-2:~$ ps aux | grep -i rabbit
vkruglikov 20213 0.0 0.0 2432772 680 s003 S+ 6:59PM 0:00.00 grep -i rabbit
vkruglikov 20165 0.0 0.0 2459420 372 ?? S 6:57PM 0:00.00 /usr/local/Cellar/rabbitmq/3.5.6/erts-6.4/bin/../../erts-6.4/bin/epmd -daemon
vkruglikov@vkruglikovs-MacBook-Pro-2:~$
--
There was very little in mnesia at the time that rabbitmq was unresponsive:
vkruglikov@vkruglikovs-MacBook-Pro-2:/usr/local/var/lib/rabbitmq (master)$ du -sh UNRESPONSIVE-MNESIA-OLD/
4.1M UNRESPONSIVE-MNESIA-OLD/
vkruglikov@vkruglikovs-MacBook-Pro-2:/usr/local/var/lib/rabbitmq (master)$ ls UNRESPONSIVE-MNESIA-OLD/
rabbit@localhost rabbit@localhost-plugins-expand rab...@localhost.pid
vkruglikov@vkruglikovs-MacBook-Pro-2:/usr/local/var/lib/rabbitmq (master)$
$ df -h /
Filesystem Size Used Avail Capacity iused ifree %iused Mounted on
/dev/disk1 465Gi 153Gi 312Gi 33% 40125538 81720769 33% /
The archived mnesia directory is here: https://github.com/vitaly-krugl/publicly-shared-files/blob/master/unresponsive.mnesia.tar
--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.
=WARNING REPORT==== 9-Nov-2015::22:18:24 ===
msg_store_persistent: rebuilding indices from scratch
=WARNING REPORT==== 9-Nov-2015::22:18:24 ===
Mnesia(rabbit@localhost): ** WARNING ** Mnesia is overloaded: {dump_log,write_threshold}
However, if I gracefully stop RabbitMQ (`rabbitmqctl stop`), then set `ulimit -n` back to the default (256) and restart RabbitMq (`rabbitmq-server -detached`), it *hangs* again.
My expectation was that once Mnesia indices are rebuilt, RabbitMQ would be able to work fine with the default `ulimit -n` of 256 after that.
The logs with `ulimit -n`=1024 and subsequent graceful restart with `ulimit -n`=256 are below
=INFO REPORT==== 9-Nov-2015::22:17:58 ===
Halting Erlang VM
=INFO REPORT==== 9-Nov-2015::22:18:23 ===
Starting RabbitMQ 3.5.6 on Erlang 17.5
Copyright (C) 2007-2015 Pivotal Software, Inc.
Licensed under the MPL. See http://www.rabbitmq.com/
=INFO REPORT==== 9-Nov-2015::22:18:23 ===
node : rabbit@localhost
home dir : /Users/vkruglikov
config file(s) : /usr/local/etc/rabbitmq/rabbitmq.config (not found)
cookie hash : weyu9iv/l8eIOTGaIPFcYw==
log : /usr/local/var/log/rabbitmq/rab...@localhost.log
sasl log : /usr/local/var/log/rabbitmq/rab...@localhost-sasl.log
database dir : /usr/local/var/lib/rabbitmq/mnesia/rabbit@localhost
=INFO REPORT==== 9-Nov-2015::22:18:23 ===
Memory limit set to 6006MB of 15015MB total.
=INFO REPORT==== 9-Nov-2015::22:18:23 ===
Disk free limit set to 50MB
=INFO REPORT==== 9-Nov-2015::22:18:23 ===
Limiting to approx 924 file handles (829 sockets)
=INFO REPORT==== 9-Nov-2015::22:18:23 ===
FHC read buffering: ON
FHC write buffering: ON
=INFO REPORT==== 9-Nov-2015::22:18:23 ===
Priority queues enabled, real BQ is rabbit_variable_queue
=INFO REPORT==== 9-Nov-2015::22:18:23 ===
Management plugin: using rates mode 'basic'
=INFO REPORT==== 9-Nov-2015::22:18:24 ===
msg_store_transient: using rabbit_msg_store_ets_index to provide index
=INFO REPORT==== 9-Nov-2015::22:18:24 ===
msg_store_persistent: using rabbit_msg_store_ets_index to provide index
=WARNING REPORT==== 9-Nov-2015::22:18:24 ===
msg_store_persistent: rebuilding indices from scratch
=WARNING REPORT==== 9-Nov-2015::22:18:24 ===
Mnesia(rabbit@localhost): ** WARNING ** Mnesia is overloaded: {dump_log,write_threshold}
=INFO REPORT==== 9-Nov-2015::22:18:24 ===
started TCP Listener on 127.0.0.1:5672
=INFO REPORT==== 9-Nov-2015::22:18:24 ===
rabbit_stomp: default user 'guest' enabled
=INFO REPORT==== 9-Nov-2015::22:18:24 ===
started STOMP TCP Listener on [::]:61613
=INFO REPORT==== 9-Nov-2015::22:18:24 ===
started MQTT TCP Listener on [::]:1883
=INFO REPORT==== 9-Nov-2015::22:18:24 ===
Management plugin started. Port: 15672
=INFO REPORT==== 9-Nov-2015::22:18:24 ===
Statistics database started.
=INFO REPORT==== 9-Nov-2015::22:18:24 ===
Server startup complete; 10 plugins started.
=INFO REPORT==== 9-Nov-2015::22:32:13 ===
Halting Erlang VM
=INFO REPORT==== 9-Nov-2015::22:32:16 ===
Starting RabbitMQ 3.5.6 on Erlang 17.5
Copyright (C) 2007-2015 Pivotal Software, Inc.
Licensed under the MPL. See http://www.rabbitmq.com/
=INFO REPORT==== 9-Nov-2015::22:32:16 ===
node : rabbit@localhost
home dir : /Users/vkruglikov
config file(s) : /usr/local/etc/rabbitmq/rabbitmq.config (not found)
cookie hash : weyu9iv/l8eIOTGaIPFcYw==
log : /usr/local/var/log/rabbitmq/rab...@localhost.log
sasl log : /usr/local/var/log/rabbitmq/rab...@localhost-sasl.log
database dir : /usr/local/var/lib/rabbitmq/mnesia/rabbit@localhost
=INFO REPORT==== 9-Nov-2015::22:32:16 ===
Memory limit set to 6023MB of 15059MB total.
=INFO REPORT==== 9-Nov-2015::22:32:16 ===
Disk free limit set to 50MB
=INFO REPORT==== 9-Nov-2015::22:32:16 ===
Limiting to approx 156 file handles (138 sockets)
=INFO REPORT==== 9-Nov-2015::22:32:16 ===
FHC read buffering: ON
FHC write buffering: ON
=INFO REPORT==== 9-Nov-2015::22:32:16 ===
Priority queues enabled, real BQ is rabbit_variable_queue
=INFO REPORT==== 9-Nov-2015::22:32:16 ===
Management plugin: using rates mode 'basic'
=INFO REPORT==== 9-Nov-2015::22:32:16 ===
msg_store_transient: using rabbit_msg_store_ets_index to provide index
=INFO REPORT==== 9-Nov-2015::22:32:16 ===
msg_store_persistent: using rabbit_msg_store_ets_index to provide index
=WARNING REPORT==== 9-Nov-2015::22:32:16 ===
Mnesia(rabbit@localhost): ** WARNING ** Mnesia is overloaded: {dump_log,write_threshold}
=WARNING REPORT==== 9-Nov-2015::22:32:16 ===
Mnesia(rabbit@localhost): ** WARNING ** Mnesia is overloaded: {dump_log,write_threshold}
=WARNING REPORT==== 9-Nov-2015::22:32:16 ===
Mnesia(rabbit@localhost): ** WARNING ** Mnesia is overloaded: {dump_log,write_threshold}
=WARNING REPORT==== 9-Nov-2015::22:32:16 ===
Mnesia(rabbit@localhost): ** WARNING ** Mnesia is overloaded: {dump_log,write_threshold}
--
Alvaro, thank you for the clarification. I recall reading somewhere in the docs that RabbitMQ maintains a cache of open file descriptors, but is able to close them as needed in order to make resources available for other operations. Did I get it wrong? If this many queue (I had almost 600) requires so many file descriptors (far above 256 that my system allowed originally), then how was RabbitMQ even able to create so many queue in the first place without running out of file descriptors?
Alvaro, thank you for the clarification. I recall reading somewhere in the docs that RabbitMQ maintains a cache of open file descriptors, but is able to close them as needed in order to make resources available for other operations. Did I get it wrong? If this many queue (I had almost 600) requires so many file descriptors (far above 256 that my system allowed originally), then how was RabbitMQ even able to create so many queue in the first place without running out of file descriptors?