Hello, I was wondering someone can give any insight in to an issue I'm seeing on a rabbitmq installation.
I'm currently running v2.8.2 on a windows box. I have .Net clients hosted on the same machine as rabbit. There isn't a high volume of messages - no more than 30,000 over 20 queues during 24 hours atthe very most, more likely to be closer to 10,000 than 30,000.
All queues only have a single consumer, no exchanges/routing set up. Messages are persisted and messages are manually acknowledged. All messages are a small piece of XML, never over 1kb.
In terms of resources rabbit is currently showing :
30/829 Socket descriptors
443/1048576 erlang prcoesses
66.8Mb memory used
15.5Gb disk space free (12Gb low watermark).
So I think there are adequate resource for rabbit.
RabbitMq appears to be shutting down of its own accord, e.g. I see this in the client application log:
The AMQP operation was interrupted: AMQP close-reason, initiated by Peer, code=320, text="CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'", classId=0, methodId=0, cause=
at RabbitMQ.Client.Impl.ConnectionBase.Close(ShutdownEventArgs reason, Boolean abort, Int32 timeout)
And then BrokerUnreachable exceptions until RabbitMQ starts again.
In rabbit.log there doesn't seem to be anything to denote why (extract below), the two log entries preceeding and following this extract are accepting/closing connection info messages. The extract posted below has happened 81 times since the evening of 8th November, always the same set of logs, I do not see any other messages before/after that indicate an error, manual shutdown etc.
I have concerns over one of the applications that consumes messages from the queue, this queue gets about 2/3rds of all messages but is using BasicGet to retrieve a message at a time. The other applications use the BasicConsumer to dequeue multiple messages at a time.
Given the number of messages on this single queue I don't think it should use BasicGet and that dequeue would be more efficient ? Could BasicGet be causes instability with (say) 5,000 messages a day ?
It didn't cause the issue, but because rabbit had been restarting and it couldn't connect this application stopped, allowing a backlog of some 53,000 messages to build up. When I restarted the application it only managed to retrieve 4000 messages before RabbitMq restarted, it then managed 28,000 before restarting again and finally cleared the backlog. Rabbit has now been up for 3 hours and appears to be perfectly okay.
Is there anything I can look at to determine the problem further ?
Thanks
Mark
Rabbit.log extract
=INFO REPORT==== 10-Nov-2014::10:30:27 ===
Stopping Rabbit
=INFO REPORT==== 10-Nov-2014::10:30:27 ===
application: rabbitmq_management
exited: stopped
type: permanent
=INFO REPORT==== 10-Nov-2014::10:30:27 ===
application: rabbitmq_management_agent
exited: stopped
type: permanent
=INFO REPORT==== 10-Nov-2014::10:30:27 ===
=INFO REPORT==== 10-Nov-2014::10:30:27 ===
stopped TCP Listener on [::]:5672
=INFO REPORT==== 10-Nov-2014::10:30:27 ===
application: rabbit
exited: stopped
type: permanent
=INFO REPORT==== 10-Nov-2014::10:30:27 ===
application: os_mon
exited: stopped
type: permanent
=INFO REPORT==== 10-Nov-2014::10:30:27 ===
application: mnesia
exited: stopped
type: permanent
=INFO REPORT==== 10-Nov-2014::10:30:27 ===
Halting Erlang VM
=INFO REPORT==== 10-Nov-2014::10:31:30 ===
Limiting to approx 924 file handles (829 sockets)
=INFO REPORT==== 10-Nov-2014::10:31:30 ===
Memory limit set to 4911MB of 12277MB total.
=INFO REPORT==== 10-Nov-2014::10:31:30 ===
Disk free limit set to 12277MB
=INFO REPORT==== 10-Nov-2014::10:31:31 ===
Management plugin upgraded statistics to fine.
=INFO REPORT==== 10-Nov-2014::10:31:31 ===
msg_store_transient: using rabbit_msg_store_ets_index to provide index
=INFO REPORT==== 10-Nov-2014::10:31:31 ===
msg_store_persistent: using rabbit_msg_store_ets_index to provide index
=INFO REPORT==== 10-Nov-2014::10:31:31 ===
started TCP Listener on [::]:5672
=INFO REPORT==== 10-Nov-2014::10:31:31 ===
=INFO REPORT==== 10-Nov-2014::10:31:31 ===
Management agent started.
=INFO REPORT==== 10-Nov-2014::10:31:31 ===
Management plugin started. Port: 55672, path: /
in sasl.log I see
=SUPERVISOR REPORT==== 10-Nov-2014::10:30:27 ===
Supervisor: {<0.402.0>,rabbit_channel_sup_sup}
Context: shutdown_error
Reason: shutdown
Offender: [{pid,<0.404.0>},
{name,channel_sup},
{mfa,{rabbit_channel_sup,start_link,[]}},
{restart_type,temporary},
{shutdown,infinity},
{child_type,supervisor}]