/api/connections lists thousands of Management UI user connections, no state


Kevin Freeman

Aug 3, 2018, 6:43:45 PM
to rabbitmq-users
Config: RabbitMQ 3.7.4, Erlang 20.3, Windows Server 2012 R2

We see several thousand connections like below in the Web UI and in the response to /api/connections.  The named user is only used by the management UI.  Is this a known issue?

Cheers,
Kevin


{
"user_who_performed_action": "RabbitUI",
"node": "rabbit@HOST",
"connected_at": 1532351873269,
"type": "direct",
"client_properties": {
},
"vhost": "/",
"user": "RabbitUI",
"name": "<rab...@HOST.2.10000.7419>",
"protocol": "Direct 0-9-1",
"node": "rabbit@HOST"
},
{
"user_who_performed_action": "RabbitUI",
"node": "rabbit@HOST",
"connected_at": 1532351955921,
"type": "direct",
"client_properties": {
},
"vhost": "/",
"user": "RabbitUI",
"name": "<rab...@HOST.2.10009.7981>",
"protocol": "Direct 0-9-1",
"node": "rabbit@HOST"
},
{
"user_who_performed_action": "RabbitUI",
"node": "rabbit@HOST",
"connected_at": 1529930695724,
"type": "direct",
"client_properties": {
},
"vhost": "/",
"user": "RabbitUI",
"name": "<rab...@HOST.2.10015.6149>",
"protocol": "Direct 0-9-1",
"node": "rabbit@HOST"
},


Our application connections typically look like this:
{
"reductions_details": {
"rate": 174.8
},
"reductions": 41037120,
"recv_oct_details": {
"rate": 0.0
},
"recv_oct": 123953,
"send_oct_details": {
"rate": 0.0
},
"send_oct": 79874,
"user_who_performed_action": "SvcUser",
"node": "rabbit@NODE",
"connected_at": 1533108449477,
"client_properties": {
"capabilities": {
"authentication_failure_close": true,
"basic.nack": true,
"connection.blocked": true,
"consumer_cancel_notify": true,
"exchange_exchange_bindings": true,
"publisher_confirms": true
},
"connection_name": "undefined",
"copyright": "Copyright (c) 2007-2016 Pivotal Software, Inc.",
"information": "Licensed under the MPL.  See http://www.rabbitmq.com/",
"platform": ".NET",
"product": "RabbitMQ",
"version": "0.0.0.0"
},
"channel_max": 0,
"frame_max": 131072,
"timeout": 60,
"vhost": "/",
"user": "SvcUser",
"protocol": "AMQP 0-9-1",
"ssl_hash": null,
"ssl_cipher": null,
"ssl_key_exchange": null,
"ssl_protocol": null,
"auth_mechanism": "PLAIN",
"peer_cert_validity": null,
"peer_cert_issuer": null,
"peer_cert_subject": null,
"ssl": false,
"peer_host": "xx.xx.228.6",
"host": "xx.xx.229.103",
"peer_port": 9695,
"port": 5672,
"name": "xx.xx.228.6:9695 -> xx.xx.229.103:5672",
"node": "rabbit@NODE",
"type": "network",
"garbage_collection": {
"minor_gcs": 715,
"fullsweep_after": 65535,
"min_heap_size": 233,
"min_bin_vheap_size": 46422,
"max_heap_size": 0
},
"channels": 10,
"state": "running",
"send_pend": 0,
"send_cnt": 3882,
"recv_cnt": 15191
},
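To quantify the difference between the two shapes above, the `/api/connections` response can be tallied by user, type, and state. The following is a minimal sketch, not part of the original post: the `summarize` helper and the trimmed `sample` records are illustrative assumptions modeled on the output above.

```python
import json
from collections import Counter


def summarize(connections):
    """Count connections by (user, type, state); state is None when the field is absent."""
    counts = Counter()
    for conn in connections:
        key = (conn.get("user"), conn.get("type"), conn.get("state"))
        counts[key] += 1
    return counts


# Hypothetical, trimmed-down records modeled on the two shapes shown above.
sample = json.loads("""
[
  {"user": "RabbitUI", "type": "direct", "protocol": "Direct 0-9-1"},
  {"user": "RabbitUI", "type": "direct", "protocol": "Direct 0-9-1"},
  {"user": "SvcUser", "type": "network", "protocol": "AMQP 0-9-1", "state": "running"}
]
""")

# Print the most frequent combinations first.
for (user, conn_type, state), n in summarize(sample).most_common():
    print(user, conn_type, state, n)
```

In practice `sample` would be replaced by the parsed body of a real `/api/connections` response.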

Kevin Freeman

Aug 3, 2018, 6:56:36 PM
to rabbitmq-users
The management UI connections are not present in the output of:
.\rabbitmqctl.bat list_connections pid state user

Kevin

Michael Klishin

Aug 6, 2018, 2:29:23 AM
to rabbitm...@googlegroups.com
Those direct connections most likely are used by the aliveness check endpoint.

Are you calling that endpoint way too often? Did the node run into alarms recently? That's
the only scenario I recall where they would accumulate.

--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-users+unsubscribe@googlegroups.com.
To post to this group, send email to rabbitmq-users@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
MK

Staff Software Engineer, Pivotal/RabbitMQ

Kevin Freeman

Aug 6, 2018, 12:49:09 PM
to rabbitmq-users
The login is also used by a network appliance aliveness test, and the cluster did recently experience a memory alarm.  The network appliances (2) are each calling aliveness test every 15 seconds.

Kevin

Michael Klishin

Aug 6, 2018, 12:53:39 PM
to rabbitm...@googlegroups.com
IIRC those connections accumulated at the time of the alarm. You can close them via the HTTP API and a script,
e.g. using the username as filter.
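A minimal sketch of such a script, assuming the standard `GET /api/connections` and `DELETE /api/connections/{name}` HTTP API endpoints and basic authentication. The helper names and the "not running" staleness test are illustrative assumptions, not something prescribed in this thread.

```python
import base64
import json
import urllib.parse
import urllib.request


def stale_connection_names(connections, user):
    """Names of connections owned by `user` that do not report state "running"."""
    return [
        c["name"]
        for c in connections
        if c.get("user") == user and c.get("state") != "running"
    ]


def connection_url(base_url, name):
    """URL for one connection; the name is percent-encoded as a single path segment."""
    return base_url.rstrip("/") + "/api/connections/" + urllib.parse.quote(name, safe="")


def api_request(url, username, password, method="GET", timeout=30):
    """Send one HTTP API request with basic auth and return the raw response body."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    req = urllib.request.Request(
        url, method=method, headers={"Authorization": "Basic " + token}
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def close_stale(base_url, username, password, target_user):
    """List all connections, then issue a DELETE for each stale one owned by target_user."""
    body = api_request(base_url.rstrip("/") + "/api/connections", username, password)
    names = stale_connection_names(json.loads(body), target_user)
    for name in names:
        api_request(connection_url(base_url, name), username, password, method="DELETE")
    return len(names)
```

A call might look like `close_stale("http://node:15672", "admin-user", "password", "RabbitUI")`, where the host and credentials are placeholders. Nothing runs against a node until `close_stale` is called.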


Michael Klishin

Aug 6, 2018, 12:55:07 PM
to rabbitm...@googlegroups.com
I should have mentioned that 15 seconds is 4 times as frequent as we usually recommend (most monitoring systems chart things with
a minute precision in practice) but it's not excessive and the interval specifically is not the cause here.

We'll see if we can reproduce after the 3.7.8 release.

Kevin Freeman

Aug 8, 2018, 7:53:28 PM
to rabbitmq-users
I dug a bit more into our rabbitmq nodes and found that those under a load balancer (calling aliveness) list at least 10min worth (~80) of direct connections - they seem to mostly clear after 10 minutes.  This includes standby nodes under very light load.  But some of the nodes accumulate much more.  Nodes that have experienced a memory alarm have the most connections in this state, but all nodes under load have at least a few hundred direct connections more than can be explained by a 10-minute timeout.
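The ~80 figure is consistent with the polling setup described earlier in the thread (two appliances, one call every 15 seconds each). A quick check, assuming every call opens exactly one direct connection, both appliances poll the same node, and connections linger for about 10 minutes; the function name is illustrative:

```python
def expected_direct_connections(callers, interval_seconds, retention_seconds):
    """Connections opened within the retention window if every call opens exactly one."""
    return callers * (retention_seconds // interval_seconds)


# Two appliances, one call every 15 seconds, roughly 10 minutes of retention.
baseline = expected_direct_connections(callers=2, interval_seconds=15, retention_seconds=600)
print(baseline)
```

Anything well above this baseline on a node is what the 10-minute cleanup does not explain.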

I whipped up a script in an attempt to delete the connections, but the DELETE calls start to time out after 1 or 2 iterations.  Script below in case it helps anyone else.

# Credentials for a management user that is allowed to close connections
$secpasswd = ConvertTo-SecureString 'pass' -AsPlainText -Force
$credRabbit = New-Object System.Management.Automation.PSCredential ('Fixer', $secpasswd)
$Params = @{
    BaseUri    = "http://node:15672"
    Credential = $credRabbit
}

Import-Module RabbitMQTools -Force

# The direct connections shown earlier have no state field, so this filter keeps them
$conns = Get-RabbitMQConnection @Params
$conns = $conns | Where-Object { ($_.User -eq 'Aliveness') -and ($_.State -ne 'Running') }
Write-Host "Stale Aliveness connections:" $conns.Count

## Commented out in an attempt to get more error detail
##$names = New-Object 'System.Collections.Generic.List[String]'
##foreach ($conn in $conns) {
##    $names.Add($conn.name)
##}
##$names | Remove-RabbitMQConnection @Params

# Should use the block above instead; the loop below is for testing
$conns = $conns | Select-Object -First 10
foreach ($conn in $conns) {
    # EscapeDataString percent-encodes the name as one path segment and needs no System.Web assembly
    $url = $Params.BaseUri + "/api/connections/" + [uri]::EscapeDataString($conn.name)
    $result = Invoke-RestMethod $url -Credential $Params.Credential -DisableKeepAlive -ErrorAction Continue -Method Delete -Verbose
    Write-Host $result
    Start-Sleep -Milliseconds 10
}

Luke Bakken

Aug 9, 2018, 7:44:37 PM
to rabbitmq-users
Hi Kevin,

Have you checked the output of netstat during this scenario to see if your load balancer is keeping HTTP connections to port 15672 open longer than they should be (via keepalives or something)?

We haven't been able to reproduce this in our local dev environments. If you're using a load balancer like haproxy could you share your configuration?

Thanks,
Luke

Michael Klishin

Aug 9, 2018, 7:57:51 PM
to rabbitm...@googlegroups.com
To be more specific, here are the steps I tried (Luke can share his if they are meaningfully different):

 * Start a 3.7.7 node
 * Set VM memory watermark very low (100kB) so that it hits an alarm immediately
 * Observe the alarm in the logs
 * Hit the aliveness-test endpoint with vhost = / via curl 200 times
 * Use Erlang's Observer app to find no leftover connections
 * Use RabbitMQ management UI to find no leftover connections

Note that the endpoint uses "direct" Erlang client connections that are local to the node, so they are not easy to observe
with e.g. netstat, unlike "regular" client connections, which each use a TCP connection.
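The request loop from the steps above can be sketched as follows. This is an illustrative stand-in for the curl calls, assuming the `GET /api/aliveness-test/{vhost}` endpoint with the vhost percent-encoded and basic authentication; the function names, default count, and timeout are assumptions rather than the exact commands used.

```python
import base64
import urllib.error
import urllib.parse
import urllib.request


def aliveness_url(base_url, vhost="/"):
    """Aliveness-test URL; the default vhost "/" must be encoded as %2F."""
    return base_url.rstrip("/") + "/api/aliveness-test/" + urllib.parse.quote(vhost, safe="")


def hit_aliveness(base_url, username, password, count=200, vhost="/", timeout=10):
    """Call the endpoint `count` times; return a tally keyed by HTTP status (or "error")."""
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    url = aliveness_url(base_url, vhost)
    tally = {}
    for _ in range(count):
        req = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
        try:
            with urllib.request.urlopen(req, timeout=timeout) as resp:
                key = resp.status
        except urllib.error.HTTPError as exc:
            # Non-2xx responses still carry a status code worth counting.
            key = exc.code
        except OSError:
            # Timeouts and connection failures.
            key = "error"
        tally[key] = tally.get(key, 0) + 1
    return tally
```

After a run such as `hit_aliveness("http://localhost:15672", "guest", "guest")` (placeholder host and credentials), the connection list can be checked for leftovers.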

However we have a hypothesis that would require HTTP request connections to stick around for a long time. Is this possible in your setup, Kevin?


Michael Klishin

Aug 9, 2018, 7:59:47 PM
to rabbitm...@googlegroups.com
I also observed non-zero message rates during my tests, which indicates that the aliveness-test endpoint does what
it is supposed to: it opens connections and channels and publishes messages. It doesn't do much, but it does open
a new direct connection to the local node for every HTTP client call.

--
MK

Staff Software Engineer, Pivotal/RabbitMQ