Broker unreachable for many seconds after RabbitMQ service has started

319 views
Skip to first unread message

Steve

unread,
Aug 9, 2017, 12:56:01 PM8/9/17
to rabbitmq-users
I have created a .Net Windows service that 'depends on' RabbitMQ such that it is only started on Windows reboot once the RabbitMQ Windows service is underway. I have observed that once the RabbitMQ service is underway it still takes several seconds before a broker can be reached for creating a connection. With one hardware configuration that I intend to support I have observed a 20 second delay between my service confirming that the RabbitMQ service is running and my service being able to create its first connection to RabbitMQ via the RabbitMQ.Cleint.ConnectionFactory.CreateConnection method (effectively polling until CreateConnection returns a connection instead of throwing a BrokerUnreachableException).

My questions are: Is this a known problem? Is this expected behaviour? What are my options for reducing this delay?

Some more detail:

Software:
Operating System: Windows 7
RabbitMQ Server version: 3.6.5
RabbitMQ.Client version: 4.1.1

Dev PC:
Processor: Intel Core i5 6500
RAM: 20 GB

I have created and attached the source code for a simple command line application that demonstrates a 6 second delay on my development machine between the RabbitMQ service starting and the broker becoming available, even outside of system reboot. The application stops the RabbitMQ service, starts polling for a connection, then restarts the service. It assumes that there is a user called 'Username' with a password of 'Password' on localhost, with access to the '/' virtual host. The service takes less than 1 second to start, however the polling continues for a further few polls of 2 seconds each before successfully getting a connection. On less performant hardware this can be 20 seconds instead of 6, but on this hardware even 6 seconds seems excessive.

Example output:

09/08/2017 17:50:06 Running...
09/08/2017 17:50:09 Start RabbitMQ service
09/08/2017 17:50:09 Start poll
09/08/2017 17:50:09 Create factory
09/08/2017 17:50:09 Factory created
09/08/2017 17:50:09 RabbitMQ service started
09/08/2017 17:50:11 Broker Unreachable
09/08/2017 17:50:13 Broker Unreachable
09/08/2017 17:50:15 Connected
09/08/2017 17:50:15 ...Finished
09/08/2017 17:50:15 Press [enter] to continue

Logs:
The SASL log is empty.
The standard log contains:

=INFO REPORT==== 9-Aug-2017::17:50:07 ===
Stopping RabbitMQ


=INFO REPORT==== 9-Aug-2017::17:50:07 ===
stopped TCP
Listener on 0.0.0.0:5672


=INFO REPORT==== 9-Aug-2017::17:50:07 ===
stopped TCP
Listener on [::]:5672


=INFO REPORT==== 9-Aug-2017::17:50:07 ===
Stopped RabbitMQ application


=INFO REPORT==== 9-Aug-2017::17:50:07 ===
Halting Erlang VM


=INFO REPORT==== 9-Aug-2017::17:50:12 ===
Starting RabbitMQ 3.6.5 on Erlang 18.3
Copyright (C) 2007-2016 Pivotal Software, Inc.
Licensed under the MPL.  See http://www.rabbitmq.com/


=INFO REPORT==== 9-Aug-2017::17:50:12 ===
node          
: rabbit@[Computer Name]
home dir      
: C:\Windows
config file
(s) : c:/Users/[My Username]/AppData/Roaming/RabbitMQ/rabbitmq.config
cookie hash    
: Zf3sjQl4MGnEPZAxKzC5rg==
log            
: C:/Users/[My Username]/AppData/Roaming/RabbitMQ/log/RABBIT~1.LOG
sasl log      
: C:/Users/[My Username]/AppData/Roaming/RabbitMQ/log/RABBIT~2.LOG
database dir  
: c:/Users/[My Username]/AppData/Roaming/RabbitMQ/db/RABBIT~1


=INFO REPORT==== 9-Aug-2017::17:50:14 ===
Memory limit set to 8144MB of 20360MB total.


=INFO REPORT==== 9-Aug-2017::17:50:14 ===
Disk free limit set to 50MB


=INFO REPORT==== 9-Aug-2017::17:50:14 ===
Limiting to approx 8092 file handles (7280 sockets)


=INFO REPORT==== 9-Aug-2017::17:50:14 ===
FHC read buffering
:  OFF
FHC write buffering
: ON


=INFO REPORT==== 9-Aug-2017::17:50:14 ===
Priority queues enabled, real BQ is rabbit_variable_queue


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Management plugin: using rates mode 'basic'


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
msg_store_transient
: using rabbit_msg_store_ets_index to provide index


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
msg_store_persistent
: using rabbit_msg_store_ets_index to provide index


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
started TCP
Listener on [::]:5672


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
started TCP
Listener on 0.0.0.0:5672


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Management plugin started. Port: 15672


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics event collector started.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics channel stats collector started.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics queue stats collector started.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics database started.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_queue_stats_fine_stats with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_queue_stats_deliver_get with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_queue_stats_queue_msg_counts with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_queue_stats_queue_msg_rates with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_queue_stats_process_stats with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_queue_exchange_stats_fine_stats with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_vhost_stats_deliver_get with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_vhost_stats_fine_stats with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_vhost_stats_queue_msg_rates with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_vhost_stats_queue_msg_counts with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_vhost_stats_coarse_conn_stats with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_channel_queue_stats_deliver_get with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_channel_queue_stats_fine_stats with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_channel_queue_stats_queue_msg_counts with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_channel_stats_deliver_get with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_channel_stats_fine_stats with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_channel_stats_queue_msg_counts with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_channel_stats_process_stats with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_channel_exchange_stats_deliver_get with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_channel_exchange_stats_fine_stats with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_exchange_stats_fine_stats with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_node_stats_coarse_node_stats with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_node_node_stats_coarse_node_node_stats with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_connection_stats_coarse_conn_stats with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table aggr_connection_stats_process_stats with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table channel_stats with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Statistics garbage collector started for table connection_stats with interval 5000.


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
Server startup complete; 6 plugins started.
 
* rabbitmq_management
 
* rabbitmq_web_dispatch
 
* webmachine
 
* mochiweb
 
* rabbitmq_management_agent
 
* amqp_client


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
accepting AMQP connection
<0.393.0> (127.0.0.1:54758 -> 127.0.0.1:5672)


=INFO REPORT==== 9-Aug-2017::17:50:15 ===
closing AMQP connection
<0.393.0> (127.0.0.1:54758 -> 127.0.0.1:5672)


Any help or guidance would be much appreciated,

Steve
Program.cs

dfed...@pivotal.io

unread,
Aug 9, 2017, 1:12:35 PM8/9/17
to rabbitmq-users
Hello Steve,

Windows service does not wait for RabbitMQ to initialise, it sets status to 'Running' as soon as the erlang VM starts.
Before RabbitMQ can accept connections, the VM should load the bytecode, read the database state, unpack and load the plugins, which can take some time.
Most of the operations tools (e.g. on unix platforms) use `rabbitmqctl wait` command, which will block until RabbitMQ initialise all the internal state.

Hope it helps.

Steve

unread,
Aug 10, 2017, 5:55:41 AM8/10/17
to rabbitmq-users
Hello,

Thank you for your reply. I can understand that the RabbitMQ Windows service marks itself as running before RabbitMQ is fully available for use, this is normal for services. Thank you for the information about 'rabbitmqctl wait', I will use that or an equivalent rather than polling in future. Also, I have now disabled all plugins that I do not need. Disabling the plugins has substantially reduced the time between the initial RabbitMQ log appearing and connections being available, from 11 seconds to 1 second on my worst machine. So thank you.

However I am still concerned, on my worst machine, about the 12 second delay between the RabbitMQ service being in the Running state, and the first log entry from RabbitMQ appearing. Is there anything I can do (e.g. profiling) find out what is actually causing the 12 second delay, or any actions I can take that might reduce it? I would prefer to have RabbitMQ accepting connections before any user-facing applications begin, i.e. before our bespoke Windows Shell starts initialisation (so that our Windows Shell can create connections to RabbitMQ during its initialisation).

Order of timings:
  1. 09:30:36: Evidence that the RabbitMQ service is running.
  2. 09:30:42: Windows Shell initialisation begins.
  3. 09:30:46: First RabbitMQ log entry ('Starting RabbitMQ 3.6.5 on Erlang 18.3')
  4. 09:30:50: RabbitMQ log entry stating that it is accepting connections.
  5. 09:30:51: Evidence that RabbitMQ is accepting connections.
  6. 09:30:51: Windows Shell initialisation complete.
  7. 09:30:56: Windows Shell available for user interaction.
Steve

Chandrika

unread,
Nov 26, 2018, 6:44:40 PM11/26/18
to rabbitmq-users
Steve,

Is there anyway to do the same status check, i.e. wait til RabbitMQ is done with all its initialization steps and is ready for accepting connections,  when I run RabbitMQ in detached mode?

Thanks
Chandrika

Michael Klishin

unread,
Dec 3, 2018, 12:43:30 PM12/3/18
to rabbitm...@googlegroups.com
There are `rabbitmqctl wait` which takes a PID path argument (in 3.6.x and 3.7.x)
plus `rabbitmqctl await_online_nodes` (in recent 3.7.x series).

--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.
To post to this group, send email to rabbitm...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


--
MK

Staff Software Engineer, Pivotal/RabbitMQ
Reply all
Reply to author
Forward
0 new messages