Durable queues and no Mirroring on a cluster


Steven Blanc

Jan 30, 2015, 11:55:28 AM
to rabbitm...@googlegroups.com

Hello,

I have encountered an interesting situation with Durable Queues in a clustered test environment that has no mirroring.

 

My test environment:

* Durable queues

* No Mirroring of queues

* Messages are not persisted by default

* Two node cluster

 

When I begin, the two node cluster is up and functioning normally. 

I then start some consumers, some connecting to one node and some to the second node. This establishes each consumer's queues on the node it is connected to.

I then shut down one of the cluster nodes.

My consumers that were connected to the shut-down node reconnect to the remaining node due to failover logic in our applications. Part of the reconnection logic built into the application is to passively try to create the queue it will consume. The queue name stays the same over time, so the consumer passively tries to recreate the queue it originally created on the node it first connected to.

I now have all consumers connected to the remaining node, and the consumers whose queues are mastered on the down node do not receive any messages.

 

I believe the consumers cannot recreate their Queues because they were originally created as durable and the queues still exist in the database.

So, while the second node is down, consumers trying to consume from the queues that reside on the down node cannot receive messages until the node comes back online. Published messages cannot be routed to those queues either, because they are down.

 

We've also been doing performance testing with your Java perfTool, and see a big hit in message throughput and latency with mirrored queues, so we are leaning toward no mirroring because the environment we are testing for places a higher priority on performance than on high availability of the individual messages.

 

So, my questions are:

* Is there a way to make the queues be available during the node down time without using mirroring and without changing the application to no longer use durable queues?

* If I do not use Durable Queues, will the queue be properly removed from the database when one node goes down (either by shutdown or by crashing) and therefore allow our reconnection logic to recreate the queue on the remaining node?

* Finally, do you have any references or suggestions to make mirrored queues more performant?  We implemented mirroring with a policy "ha-mode: exactly, ha-params: 2".
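
[Editor's sketch] For readers who want to reproduce the policy mentioned above, here is a minimal Python sketch that builds a `rabbitmqctl set_policy` command line for it. Only the definition (`ha-mode: exactly`, `ha-params: 2`) comes from the post; the policy name `ha-two`, the catch-all pattern `^`, and the default vhost `/` are assumptions chosen for illustration.

```python
import json
import shlex


def set_policy_command(name, pattern, definition, vhost="/"):
    """Build a `rabbitmqctl set_policy` command line for a policy definition."""
    # Compact JSON keeps the command short; key order follows the dict.
    body = json.dumps(definition, separators=(",", ":"))
    parts = ["rabbitmqctl", "set_policy", "-p", vhost, name, pattern, body]
    # shlex.quote protects the pattern and the JSON body from the shell.
    return " ".join(shlex.quote(part) for part in parts)


# Hypothetical policy name and pattern; only the definition is from the post.
command = set_policy_command(
    "ha-two", "^", {"ha-mode": "exactly", "ha-params": 2}
)
print(command)
```

The command is only printed, not executed, so it can be reviewed before being run on a cluster node. A narrower pattern than `^` would limit mirroring to the queues that actually need it.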

 

Thank you,

Steve

 

 

Michael Klishin

Jan 30, 2015, 12:03:31 PM
to rabbitm...@googlegroups.com, Steven Blanc
 On 30 January 2015 at 19:55:28, Steven Blanc (steven...@avid.com) wrote:
> So, my questions are:

> * Is there a way to make the queues be available during the node
> down time without using mirroring and without changing the application
> to no longer use durable queues?

No. Without mirroring, queue contents are stored on a single node.

> * If I do not use Durable Queues, will the queue be properly removed
> from the database when one node goes down (either by shutdown
> or by crashing) and therefore be allow for our reconnection logic
> to recreate the queue on the remaining node?

Durability has nothing to do with what you're observing.

> * Finally, do you have any references or suggestions to make mirrored
> queues be more performant? We implemented mirroring with a policy
> "ha-mode: exactly, ha-params: 2".

Using more nodes and spreading connections between them should help.
Mirroring still means more work to be done for the node routing your messages.

You can go to pretty significant numbers by growing your cluster and distributing connections:
http://googlecloudplatform.blogspot.ru/2014/06/rabbitmq-on-google-compute-engine.html

Ideally your consumer should connect to the master node for the queue it uses. Search
the list for more information about this; it has been discussed many times before.
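
[Editor's sketch] One way a consumer could find the master node is to read the queue's description and connect to the host it names. The Python sketch below assumes the description is a dict with a `node` field of the form `rabbit@hostname`, as the management plugin's HTTP API reports for queues. The sample data and the helper name `queue_master_host` are hypothetical, and fetching the description is left out.

```python
def queue_master_host(queue_info):
    """Return the host part of the node that holds the queue master."""
    node = queue_info["node"]  # assumed format: "rabbit@hostname"
    _, _, host = node.partition("@")
    # Fall back to the raw value if there is no "@" in the node name.
    return host or node


# Hypothetical queue description; a real one would come from the broker.
sample = {"name": "orders", "vhost": "/", "node": "rabbit@host-b"}
print(queue_master_host(sample))  # prints "host-b"
```

A consumer would then open its connection to that host instead of a randomly chosen cluster member.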
--
MK

Staff Software Engineer, Pivotal/RabbitMQ

Simon MacMullen

Jan 30, 2015, 12:07:55 PM
to Steven Blanc, rabbitm...@googlegroups.com
On 30/01/15 16:55, Steven Blanc wrote:
> I believe the consumers cannot recreate their Queues because they were
> originally created as durable and the queues still exist in the database.

Yes. Recent (3.4.x) versions of RabbitMQ try to make this as obvious as
possible - showing queues as visible but "down" in the management plugin
/ "rabbitmqctl list_queues" and trying to give a descriptive error
message over AMQP. Older versions will give a less clear error message
over AMQP and not show the down queue in ctl / mgmt.

> * Is there a way to make the queues be available during the node down
> time without using mirroring and without changing the application to no
> longer use durable queues?

Not really - we don't want to allow creation of another durable queue
with the same name as the down queue since it would need to be merged
somehow (and it's not clear how) if the other cluster node came back.

The only way to create another queue under the same name is to either
bring the node it was on back or issue "rabbitmqctl forget_cluster_node
..." (which will delete all unmirrored queues on the forgotten node).

> * If I do not use Durable Queues, will the queue be properly removed
> from the database when one node goes down (either by shutdown or by
> crashing) and therefore be allow for our reconnection logic to recreate
> the queue on the remaining node?

Yes, transient queues vanish as soon as the node goes down.
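
[Editor's sketch] The behaviour described in this reply can be summarised with a toy Python model. It is not RabbitMQ code and does not talk to a broker. The class and method names are invented for illustration, and mirroring and declare-argument checks are left out. It only encodes three rules stated above: transient queues vanish when their node stops, durable queues stay registered but unusable, and forgetting the stopped node frees the names of its unmirrored queues.

```python
class QueueDown(Exception):
    """A durable queue is still registered, but its home node is stopped."""


class ToyCluster:
    """Toy model of unmirrored queues in a cluster; not RabbitMQ code."""

    def __init__(self, nodes):
        self.running = set(nodes)
        self.queues = {}  # queue name -> (home node, durable flag)

    def declare(self, name, via_node, durable):
        """Declare a queue through a running node; return its home node."""
        if via_node not in self.running:
            raise ConnectionError(f"{via_node} is not running")
        if name in self.queues:
            home, _ = self.queues[name]
            if home not in self.running:
                # The name is taken by a queue that cannot be reached now.
                raise QueueDown(f"{name} lives on stopped node {home}")
            return home
        self.queues[name] = (via_node, durable)
        return via_node

    def stop(self, node):
        """Stop a node: transient queues vanish, durable ones stay registered."""
        self.running.discard(node)
        self.queues = {
            name: (home, durable)
            for name, (home, durable) in self.queues.items()
            if home != node or durable
        }

    def forget(self, node):
        """Model of forget_cluster_node: drop every queue homed on the node."""
        if node in self.running:
            raise ValueError(f"{node} must be stopped before it is forgotten")
        self.queues = {
            name: entry for name, entry in self.queues.items() if entry[0] != node
        }


cluster = ToyCluster(["node-a", "node-b"])
cluster.declare("durable.q", "node-b", durable=True)
cluster.declare("transient.q", "node-b", durable=False)
cluster.stop("node-b")

# The transient queue is gone, so it can be recreated on the surviving node.
print(cluster.declare("transient.q", "node-a", durable=False))

# The durable queue is still registered on the stopped node, so declare fails.
try:
    cluster.declare("durable.q", "node-a", durable=True)
except QueueDown as error:
    print("declare failed:", error)

# After forgetting the stopped node, the name is free again.
cluster.forget("node-b")
print(cluster.declare("durable.q", "node-a", durable=True))
```

In the model, as in the reply, the durable queue's contents are lost once the node is forgotten, so that step is a trade of data for availability.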

> * Finally, do you have any references or suggestions to make mirrored
> queues be more performant? We implemented mirroring with a policy
> "ha-mode: exactly, ha-params: 2".

Which version of RabbitMQ are you running? Older versions could have bad
performance with mirrored queues sometimes.

But mirrored queues will always need to do more work *per node* than
unmirrored ones.

Cheers, Simon

Michael Klishin

Jan 30, 2015, 12:09:43 PM
to rabbitm...@googlegroups.com, Steven Blanc
On 30 January 2015 at 20:07:55, Simon MacMullen (si...@rabbitmq.com) wrote:
> Durability has nothing to do with what you're observing.

Oops. This is not correct, as Simon points out. 

Steven Blanc

Jan 30, 2015, 2:00:07 PM
to Simon MacMullen, rabbitm...@googlegroups.com
| Which version of RabbitMQ are you running? Older versions could have bad
| performance with mirrored queues sometimes.

I am currently running RMQ Server 3.3.5 in production and that is what was tested.

| But mirrored queues will always need to do more work *per node* than
| unmirrored ones.

We see about 3-4 times the msg/sec throughput without mirroring. Is that "normal"?

Thank you.

Steve

Steven Blanc

Feb 4, 2015, 3:33:06 PM
to Michael Klishin, rabbitm...@googlegroups.com
| You can go to pretty significant numbers by growing your cluster and distributing connections:
| http://googlecloudplatform.blogspot.ru/2014/06/rabbitmq-on-google-compute-engine.html

Did the very impressive test results obtained here come from using mirroring of the queues, or no mirroring? I couldn't find that aspect defined in any of the articles about this achievement.

Thanks,
SB


Steven Blanc

Feb 4, 2015, 3:36:33 PM
to Michael Klishin, rabbitm...@googlegroups.com
Sorry, forgot my other question in the last message:

| Ideally your consumer should connect to the master node for the queue it uses. Search
| the list for more information about this, it has been discussed many times before.

Did the 1.3M msg/sec test on GCE simply make randomized connections through the load balancer for consumers, or were the connections selectively established to maximize "data locality", i.e. consuming from a connection to the node mastering the queue being consumed?

Thanks again,
Steve


Michael Klishin

Feb 4, 2015, 11:04:56 PM
to Steven Blanc, rabbitm...@googlegroups.com
They document their methodology in the post.

MK