vhost is down after creation


voro...@gmail.com

Apr 16, 2018, 3:23:02 PM
to rabbitmq-users
Hello,

I am seeing a behaviour where a client that tries to connect to a vhost while it is being created runs into a "vhost is down" error. Even more troubling, the vhost stays down until we click the "restart" button for that vhost in the admin UI.

Here are some logs:

2018-04-13 23:46:16.709 [info] <0.21767.381> Adding vhost 'myvhost'

2018-04-13 23:46:17.915 [error] <0.21841.381> Error on AMQP connection <0.21841.381> (10.1.1.87:48750 -> 10.1.1.13:5672, user: 'events', state: opening):
access to vhost 'myvhost' refused for user 'myuser'

2018-04-13 23:46:18.736 [info] <0.21963.381> Setting permissions for 'myuser' in 'myvhost' to '.*', '.*', '.*'

2018-04-13 23:46:18.803 [error] <0.21984.381> Error on AMQP connection <0.21984.381> (10.1.1.87:48850 -> 10.1.1.13:5672, vhost: 'none', user: 'myuser', state: opening), channel 0:
 {handshake_error,opening,
                 {amqp_error,internal_error,
                             "access to vhost 'myvhost' refused for user 'myuser': vhost 'myvhost' is down",
                             'connection.open'}}


RabbitMQ: 3.7.4
Erlang: 20.3
Number of brokers: 3

Please help me define the steps to troubleshoot this situation. So far I've been restarting brokers or vhosts to get rid of this error.

-- Aleksey

Michael Klishin

Apr 17, 2018, 10:54:45 AM
to rabbitm...@googlegroups.com
Something caused the vhost to become unavailable. The log snippet posted doesn't say what it was; see
earlier log entries for clues.

We have seen this in scenarios where a freshly started node received client connections immediately. I don't think we are aware of other
common cases.

Virtual hosts will try to restart if recovery is possible.

--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-users+unsubscribe@googlegroups.com.
To post to this group, send email to rabbitmq-users@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
MK

Staff Software Engineer, Pivotal/RabbitMQ

voro...@gmail.com

Apr 17, 2018, 5:22:36 PM
to rabbitmq-users
I have investigated a few more things. I took a tcpdump on port 15672 and I see the client application doing the following:

Thread-1:
    PUT /api/vhosts/myvhost
Then, 1 second later but *before* I see a response to the first HTTP request:
Thread-2:
    PUT /api/vhosts/myvhost
    Response is 204 No Content (vhost already exists)
    PUT /api/permissions/myvhost/myuser
    Response is 201 Created
    PUT /api/policies/myvhost/ha-all
    Response is 201 Created
    The application then goes on to open a connection on port 5672 to that vhost, and everything is stuck in an error state.

You were correct that immediate client connections seem to matter. When I try to reproduce the same problem with simple curl calls, I do not see it happening. The vhost does enter a "partial" state, but it self-heals.

-- Aleksey
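
The capture above shows a second thread acting on the vhost while the first thread's PUT is still in flight. One way a client could avoid that is to serialize provisioning per vhost, so that later callers wait for the first one to finish. This is a minimal sketch, not the application's actual code: `VhostProvisioner` and the `provision` callable (which would issue the vhost, permissions and policy PUTs) are hypothetical names. It only helps if the first request is allowed to run to completion.

```python
import threading
from typing import Callable, Dict


class _Entry:
    """Per-vhost state: a lock plus a flag set after provisioning succeeds."""

    def __init__(self) -> None:
        self.lock = threading.Lock()
        self.done = False


class VhostProvisioner:
    """Runs `provision(vhost)` at most once per vhost; concurrent callers block."""

    def __init__(self, provision: Callable[[str], None]) -> None:
        # Hypothetical callable that performs the HTTP API calls for one vhost.
        self._provision = provision
        self._guard = threading.Lock()  # protects the _entries dict
        self._entries: Dict[str, _Entry] = {}

    def ensure(self, vhost: str) -> None:
        with self._guard:
            if vhost not in self._entries:
                self._entries[vhost] = _Entry()
            entry = self._entries[vhost]
        # Only one thread per vhost gets past this point at a time.
        with entry.lock:
            if not entry.done:
                # If this raises, `done` stays False and a later call retries.
                self._provision(vhost)
                entry.done = True
```

Each thread would call `provisioner.ensure("myvhost")` before opening its AMQP connection, instead of issuing its own PUT.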

voro...@gmail.com

Apr 17, 2018, 6:49:22 PM
to rabbitmq-users
Actually, I found the cause. The client code had far too short a timeout. I reproduced the issue with

curl -v -XPUT http://<rabbit host>:15672/api/vhosts/test4-005 -H'Content-Type: application/json' -u <user>:<password> -m 1

(it takes longer than 1 second for the 3 nodes in our RabbitMQ cluster to create a vhost between them)

It seems that the Cowboy HTTP server kills the Erlang process handling the request once the client gives up, which leaves the vhost not completely initialized.

This issue only happens with a cluster and, in our case, with a cluster that has been through some things: it already has a noticeable number of vhosts and queues. RabbitMQ has to take longer to create the vhost than the client is willing to wait before it times out the HTTP request.

Increasing HTTP timeout fixed the issue. However, I wonder if I should log this as a bug for RabbitMQ... What's your opinion?

-- Aleksey
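
For illustration, here is a rough Python equivalent of the curl call above with a longer timeout. It is a sketch under assumptions: the function names are hypothetical, the 30-second default is an arbitrary value rather than a recommendation from this thread, and the empty JSON body is assumed to be accepted by the endpoint.

```python
import base64
import urllib.parse
import urllib.request


def build_create_vhost_request(base_url, vhost, user, password):
    """Build the PUT /api/vhosts/<name> request (hypothetical helper)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    # The vhost name is percent-encoded, so "/" becomes "%2F".
    name = urllib.parse.quote(vhost, safe="")
    url = f"{base_url.rstrip('/')}/api/vhosts/{name}"
    return urllib.request.Request(
        url,
        data=b"{}",  # assumption: an empty JSON object is an acceptable body
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def create_vhost(base_url, vhost, user, password, timeout=30.0,
                 urlopen=urllib.request.urlopen):
    """Send the request and return the HTTP status code.

    `timeout` is in seconds; 30.0 is an assumed value chosen to be well above
    the time the cluster needs. `urlopen` is injectable for testing.
    """
    req = build_create_vhost_request(base_url, vhost, user, password)
    with urlopen(req, timeout=timeout) as resp:
        return resp.status
```

With `urllib`, a non-2xx response raises `urllib.error.HTTPError`, so a returned status means the broker answered before the client gave up.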

Michael Klishin

Apr 17, 2018, 10:19:59 PM
to rabbitm...@googlegroups.com
If you create a virtual host it takes some time (likely fractions of a second but can be a few seconds)
for it to be initialised. That is NOT a bug.

So this is exactly the scenario we've seen with some of our test suites, in fact,
using the HTTP API. Give the vhost some time before doing anything with it.
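
One way a client might "give the vhost some time" is to poll until the vhost reports as running before connecting. The sketch below assumes that `GET /api/vhosts/<name>` returns a JSON object with a `cluster_state` map of node name to state, and treats any value other than `"running"` as not ready. `fetch_info` is a hypothetical callable that performs that GET and returns the decoded JSON; the timeout and interval are arbitrary.

```python
import time
from typing import Any, Callable, Mapping


def vhost_is_running(info: Mapping[str, Any]) -> bool:
    """True only if every node listed in `cluster_state` reports "running"."""
    states = info.get("cluster_state") or {}
    return bool(states) and all(s == "running" for s in states.values())


def wait_for_vhost(fetch_info: Callable[[], Mapping[str, Any]],
                   timeout: float = 30.0,
                   interval: float = 0.5,
                   sleep: Callable[[float], None] = time.sleep,
                   clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll until the vhost is running on all nodes or `timeout` seconds pass."""
    deadline = clock() + timeout
    while True:
        try:
            if vhost_is_running(fetch_info()):
                return True
        except OSError:
            # urllib's URLError/HTTPError derive from OSError; treat network
            # errors and a 404 for a not-yet-visible vhost as "not ready".
            pass
        if clock() >= deadline:
            return False
        sleep(interval)
```

The client would open its AMQP connection only after `wait_for_vhost(...)` returns `True`, and report a provisioning failure otherwise.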
