RabbitMq shutting down / force closing connections

9,412 views
Skip to first unread message

Mark Craft

unread,
Nov 10, 2014, 7:28:25 AM11/10/14
to rabbitm...@googlegroups.com

Hello, I was wondering someone can give any insight in to an issue I'm seeing on a rabbitmq installation.

I'm currently running v2.8.2 on a windows box. I have .Net clients hosted on the same machine as rabbit. There isn't a high volume of messages - no more than 30,000 over 20 queues during 24 hours atthe very most, more likely to be closer to 10,000 than 30,000. 
All queues only have a single consumer, no exchanges/routing set up. Messages are persisted and messages are manually acknowledged. All messages are a small piece of XML, never over 1kb.

In terms of resources rabbit is currently showing :

30/829 Socket descriptors
443/1048576 erlang prcoesses
66.8Mb memory used
15.5Gb disk space free (12Gb low watermark).

So I think there are adequate resource for rabbit.

RabbitMq appears to be shutting down of its own accord, e.g. I see this in the client application log:

The AMQP operation was interrupted: AMQP close-reason, initiated by Peer, code=320, text="CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'", classId=0, methodId=0, cause=
   at RabbitMQ.Client.Impl.ConnectionBase.Close(ShutdownEventArgs reason, Boolean abort, Int32 timeout)

And then BrokerUnreachable exceptions until RabbitMQ starts again.

In rabbit.log there doesn't seem to be anything to denote why (extract below), the two log entries preceeding and following this extract are accepting/closing connection info messages. The extract posted below has happened 81 times since the evening of 8th November, always the same set of logs, I do not see any other messages before/after that indicate an error, manual shutdown etc.

I have concerns over one of the applications that consumes messages from the queue, this queue gets about 2/3rds of all messages but is using BasicGet to retrieve a message at a time. The other applications use the BasicConsumer to dequeue multiple messages at a time.
Given the number of messages on this single queue I don't think it should use BasicGet and that dequeue would be more efficient ? Could BasicGet be causes instability with (say) 5,000 messages a day ?

It didn't cause the issue, but because rabbit had been restarting and it couldn't connect this application stopped, allowing a backlog of some 53,000 messages to build up. When I restarted the application it only managed to retrieve 4000 messages before RabbitMq restarted, it then managed 28,000 before restarting again and finally cleared the backlog. Rabbit has now been up for 3 hours and appears to be perfectly okay.

Is there anything I can look at to determine the problem further ?

Thanks

Mark


Rabbit.log extract 

=INFO REPORT==== 10-Nov-2014::10:30:27 ===
Stopping Rabbit

=INFO REPORT==== 10-Nov-2014::10:30:27 ===
    application: rabbitmq_management
    exited: stopped
    type: permanent

=INFO REPORT==== 10-Nov-2014::10:30:27 ===
    application: rabbitmq_management_agent
    exited: stopped
    type: permanent

=INFO REPORT==== 10-Nov-2014::10:30:27 ===
stopped TCP Listener on 0.0.0.0:5672

=INFO REPORT==== 10-Nov-2014::10:30:27 ===
stopped TCP Listener on [::]:5672

=INFO REPORT==== 10-Nov-2014::10:30:27 ===
    application: rabbit
    exited: stopped
    type: permanent

=INFO REPORT==== 10-Nov-2014::10:30:27 ===
    application: os_mon
    exited: stopped
    type: permanent

=INFO REPORT==== 10-Nov-2014::10:30:27 ===
    application: mnesia
    exited: stopped
    type: permanent

=INFO REPORT==== 10-Nov-2014::10:30:27 ===
Halting Erlang VM

=INFO REPORT==== 10-Nov-2014::10:31:30 ===
Limiting to approx 924 file handles (829 sockets)

=INFO REPORT==== 10-Nov-2014::10:31:30 ===
Memory limit set to 4911MB of 12277MB total.

=INFO REPORT==== 10-Nov-2014::10:31:30 ===
Disk free limit set to 12277MB

=INFO REPORT==== 10-Nov-2014::10:31:31 ===
Management plugin upgraded statistics to fine.

=INFO REPORT==== 10-Nov-2014::10:31:31 ===
msg_store_transient: using rabbit_msg_store_ets_index to provide index

=INFO REPORT==== 10-Nov-2014::10:31:31 ===
msg_store_persistent: using rabbit_msg_store_ets_index to provide index

=INFO REPORT==== 10-Nov-2014::10:31:31 ===
started TCP Listener on [::]:5672

=INFO REPORT==== 10-Nov-2014::10:31:31 ===
started TCP Listener on 0.0.0.0:5672

=INFO REPORT==== 10-Nov-2014::10:31:31 ===
Management agent started.

=INFO REPORT==== 10-Nov-2014::10:31:31 ===
Management plugin started. Port: 55672, path: /

in sasl.log I see


=SUPERVISOR REPORT==== 10-Nov-2014::10:30:27 ===
     Supervisor: {<0.402.0>,rabbit_channel_sup_sup}
     Context:    shutdown_error
     Reason:     shutdown
     Offender:   [{pid,<0.404.0>},
                  {name,channel_sup},
                  {mfa,{rabbit_channel_sup,start_link,[]}},
                  {restart_type,temporary},
                  {shutdown,infinity},
                  {child_type,supervisor}]

Michael Klishin

unread,
Nov 10, 2014, 7:40:31 AM11/10/14
to rabbitm...@googlegroups.com, Mark Craft
On 10 November 2014 at 15:28:27, Mark Craft (mark....@gmail.com) wrote:
> The AMQP operation was interrupted: AMQP close-reason, initiated
> by Peer, code=320, text="CONNECTION_FORCED - broker forced
> connection closure with reason 'shutdown'

This means RabbitMQ was shut down. 
--
MK

Staff Software Engineer, Pivotal/RabbitMQ

Michael Klishin

unread,
Nov 10, 2014, 7:43:28 AM11/10/14
to rabbitm...@googlegroups.com, Mark Craft
 On 10 November 2014 at 15:40:29, Michael Klishin (mkli...@pivotal.io) wrote:
> This means RabbitMQ was shut down.

…or the entire VM decided to shut down, or there was another serious unrecoverable
issue.

There should be other log messages that may give a clue, in both regular and SASL logs.

Michael Klishin

unread,
Nov 10, 2014, 7:44:23 AM11/10/14
to rabbitm...@googlegroups.com, Mark Craft
 On 10 November 2014 at 15:28:27, Mark Craft (mark....@gmail.com) wrote:
> =INFO REPORT==== 10-Nov-2014::10:30:27 ===
> Stopping Rabbit

Sorry, haven't noticed this message. Yes, this means RabbitMQ server wa shut down
by a human or another program.
Message has been deleted

Alvaro Videla

unread,
Nov 11, 2014, 5:32:07 PM11/11/14
to Mark Craft, rabbitm...@googlegroups.com
If the Erlang VM would be dying unexpectedly, then RabbitMQ wouldn't be doing a clean shutdown as reported in the logs.

-Alvaro

On Mon, Nov 10, 2014 at 10:16 AM, Mark Craft <mark....@gmail.com> wrote:
Thanks Michael,

I did check the windows event log to look for service stops and starts, but the usual messages you see for any service (e.g. "The RabbitMQ Service enterted the stopped state" or "The RabbitMQ service entered the started state" messages are not present, leading me to believe that this was not a controlled shutdown (manual or automated).

If something is agressively killing ErlSrv there would also be logs in the event viewer for it (I've just tried it). 

Is it possible erl.exe is dynig ? can I make erlang write a log file that might help ?

--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.
To post to this group, send email to rabbitm...@googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Reply all
Reply to author
Forward
0 new messages