Closed connection rate much higher than opened


Mantas Audickas

Feb 12, 2022, 4:44:30 AM
to rabbitmq-users
Hello,

We are seeing a strange metric in our RabbitMQ cluster (3 nodes, RabbitMQ 3.9.9, Erlang 24.1.5): it reports a closed connection rate much higher than the opened connection rate, while the total connection count stays quite stable (around 2000 connections):

churning.png
Any ideas?
with best regards
Mantas

Luke Bakken

Feb 12, 2022, 1:34:42 PM
to rabbitmq-users
Hello,

Are you using protocols other than AMQP, like MQTT?

That is a pretty strange statistic. I'll bring it up with the team.

Thanks,
Luke

Mantas Audickas

Feb 12, 2022, 2:39:27 PM
to rabbitm...@googlegroups.com
We use only the .NET client (with a custom-made library on top), so I guess AMQP 0-9-1.

The strange thing is that it's constant. The created connection rate is relatively low (still too high, but we are working on it), while the closed rate is a few hundred times higher.


With best regards
Mantas


Luke Bakken

Feb 14, 2022, 1:33:44 PM
to rabbitmq-users
Hello,

Do your applications choose a node at random in the cluster to which to connect or is there a load balancer in between?

I'm going to try and reproduce this today.

Thanks
Luke

jo...@cloudamqp.com

Feb 14, 2022, 1:39:14 PM
to rabbitmq-users
Hi,

Check the logs and see if you can find why it is closing all those connections there.

Further checks:
- Do you have any shovels? Are they repeatedly failing to connect?
- Enable the event exchange [1], bind "connection.closed" to a queue with a max-length for a second or two, and then unbind it again (to avoid choking the system). Analyze where the "connection.closed" events stem from.

/Johan

Mantas Audickas

Feb 14, 2022, 1:47:25 PM
to rabbitmq-users
> Do your applications choose a node at random in the cluster to which to connect or is there a load balancer in between?
We have a mixed mode: some clients connect directly to nodes and some still use the load balancer. We were in the middle of a transition from the load balancer to direct connections, as we had some other issues related to the load balancer.

> Do you have a shovels?
If you mean the shovel plugin, then no, we do not have this plugin installed.
However, RabbitMQ is accessed using RabbitMQ clients, the Management UI and Queue explorer (over the management port).

> Enable the event exchange...
This might take a while, I will try to see how this can be done.

with best regards
Mantas

Luke Bakken

Feb 14, 2022, 2:33:39 PM
to rabbitmq-users
Hi everyone.

Excellent suggestions Johan. I hope that pinpoints the issue.

I didn't expect to be able to reproduce this and I can't. I have a 3-node cluster running. To simulate connection churn I have a Python script that starts hundreds of threads that connect, open some channels, maybe sleep, then disconnect. I also have PerfTest running with 128 producers/consumers to simulate constant connections.

I think the log files may show that connections are being denied due to credentials or some other error. I wonder if those still show up in the "closed" stats? It's easy enough for me to add that to my testing.

Luke

Luke Bakken

Feb 14, 2022, 3:06:35 PM
to rabbitmq-users
Hi again -

On Monday, February 14, 2022 at 11:33:39 AM UTC-8 Luke Bakken wrote:
> I think the log files may show that connections are being denied due to credentials or some other error. I wonder if those still show up in the "closed" stats? It's easy enough for me to add that to my testing.

Yep, that is the issue. I have attached the program I used to reproduce. Basically, by adding code that opens a socket, doesn't initiate an AMQP handshake, then closes it, I can make the "closed" stat exceed the "created" stat. I suppose this could be considered expected. Now I know what it could indicate if we see stats like this.

So that's where I would check in your environment. What is connecting to port 5672 but not doing anything with the connection?
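
For readers without the attachment, the kind of probe described above can be sketched as follows. This is not the attached churn.py, only a minimal illustration using the Python standard library; the function name `bare_tcp_probe` and its defaults are made up for this example.

```python
import socket


def bare_tcp_probe(host, port=5672, timeout=2.0):
    """Open a TCP connection, send nothing, then close it.

    No AMQP protocol header is sent, so no AMQP handshake is started.
    Returns True if the TCP connect succeeded, False otherwise.
    """
    try:
        # The context manager closes the socket as soon as the block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False
```

Calling `bare_tcp_probe("some-test-node")` in a loop against a test broker (the host name is a placeholder) should, per the finding above, raise the "closed" rate without a matching "created" rate.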

Luke
churn.py

Mantas Audickas

Feb 15, 2022, 5:24:49 AM
to rabbitmq-users
That actually explains everything.
Our client does health checks on nodes by simply opening a socket to port 5672 and closing it afterwards without doing anything more.
So I guess we are doing it wrong :)

Thanks for the explanation!

with best regards
Mantas

Mantas Audickas

Feb 25, 2022, 3:14:30 AM
to rabbitmq-users
Btw, do you have a good suggestion for how to implement a node health check properly?
I mean, what should be done when the connection is opened in order for it to count as "Created", so that the "Closed" metric would match? :)
What we did before was like this (C#):

private bool GetEndpointHealth()
{
    try
    {
        // Plain TCP connect with no AMQP handshake.
        // `using` disposes (closes) the socket even if Connect throws.
        using (var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
        {
            socket.Connect(Endpoint.HostName, Endpoint.Port);
            return true;
        }
    }
    catch
    {
        return false;
    }
}

But it does not validate whether the credentials are correct. That was not our point: we just wanted to make sure that the server is started and listening.
So I would not even consider it as "opening a connection" :)
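
One possible direction, which nobody in this thread suggested and which is sketched here only under assumptions: instead of touching port 5672 at all, ask the management HTTP API whether the node has a listener on that port. The sketch assumes the management plugin is enabled, that the broker version exposes the `/api/health/checks/port-listener/{port}` endpoint, and that the credentials are allowed to call it. The function name, default ports and `guest` credentials are placeholders. It is written in Python to match the reproduction script rather than the C# client.

```python
import base64
import urllib.error
import urllib.request


def port_listener_healthy(host, amqp_port=5672, mgmt_port=15672,
                          user="guest", password="guest", timeout=3.0):
    """Return True if the management API reports a listener on amqp_port.

    Assumption: the management plugin answers 200 when the listener is
    active and a non-2xx status (for example 503) when it is not.
    """
    url = (f"http://{host}:{mgmt_port}"
           f"/api/health/checks/port-listener/{amqp_port}")
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url, headers={"Authorization": f"Basic {token}"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # HTTPError (non-2xx), refused connections and timeouts land here.
        return False
```

Because this check never opens a socket to the AMQP port, it should not add to the "closed" connection rate. It does validate the management credentials, which goes slightly beyond the "is the server listening" intent above.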

with best regards
Mantas

Luke Bakken

Feb 25, 2022, 11:36:48 AM
to rabbitmq-users