Quorum queues and leader election

Arnaud Morin

Jul 18, 2023, 8:15:32 AM
to rabbitm...@googlegroups.com
Hey again,

I can't find anything in the documentation or tools to "force" a leader
election for quorum queues - is there a way?

This morning I had multiple queues on my cluster with only "followers":

$ rabbitmq-queues quorum_status reply_533edf821dab42fb94b8425f4348c07e
Status of quorum queue reply_533edf821dab42fb94b8425f4348c07e on node rabbit@s3 ...
┌────────────────────────┬────────────┬───────────┬──────────────┬────────────────┬──────┬─────────────────┐
│ Node Name              │ Raft State │ Log Index │ Commit Index │ Snapshot Index │ Term │ Machine Version │
├────────────────────────┼────────────┼───────────┼──────────────┼────────────────┼──────┼─────────────────┤
│ rabbit@s1              │ follower   │ 0         │ 0            │ undefined      │ 0    │ 3               │
├────────────────────────┼────────────┼───────────┼──────────────┼────────────────┼──────┼─────────────────┤
│ rabbit@s2              │ follower   │ 0         │ 0            │ undefined      │ 0    │ 3               │
├────────────────────────┼────────────┼───────────┼──────────────┼────────────────┼──────┼─────────────────┤
│ rabbit@s3              │ follower   │ 0         │ 0            │ undefined      │ 0    │ 3               │
└────────────────────────┴────────────┴───────────┴──────────────┴────────────────┴──────┴─────────────────┘

The only way I found to fix this was to remove the queues, but maybe there is something else?
Note that I am running a 3-node cluster with RabbitMQ 3.12.0.

I also don't know whether the queue ever had a leader; maybe it was created without one from the beginning.
If so, how can I figure that out?

Cheers,
Arnaud.

Arnaud Morin

Jul 18, 2023, 8:54:55 AM
to rabbitm...@googlegroups.com
I had another queue in the same situation a few minutes ago; in the logs I can
see:

** Reason for termination ==
** {{exception,partition_parallel_timeout},

What could cause this issue?

Michal Kuratczyk

Jul 18, 2023, 9:32:56 AM
to rabbitm...@googlegroups.com
Please provide full logs, ideally at the debug level (you can enable it without a restart with `rabbitmqctl set_log_level debug`).

Best,

--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/rabbitmq-users/ZLaLkdDRZsM74o2I%40sync2.


--
Michał
RabbitMQ team

Arnaud Morin

Jul 18, 2023, 10:03:46 AM
to rabbitm...@googlegroups.com
Hello,

Here it is (not at debug level; I don't know yet how to reproduce this):

https://gist.github.com/arnaudmorin/3c0d5d50281db5b1bec573cc9da240c5

Note that only one of the three nodes had logs about the queue.
It seems to happen when my client declares the queue.
I suspect the system times out trying to elect/reach the other nodes when
creating the queue.

Cheers,

Michal Kuratczyk

Jul 19, 2023, 2:57:13 AM
to rabbitm...@googlegroups.com
Please provide full logs. The error itself basically says there was a timeout. The logs before it happened may shed some light on why that could be.

Best,



--
Michał
RabbitMQ team

Arnaud Morin

Jul 19, 2023, 9:09:11 AM
to rabbitm...@googlegroups.com
Is there any way to tune the "START_CLUSTER_TIMEOUT" in
src/rabbit_quorum_queue.erl?

It seems to default to 5000; maybe this is too small on my
infra?

What could prevent the system from completing:
try erpc_call(Leader, ra, start_cluster, ...)

within 5000 ms?

Cheers,

Michal Kuratczyk

Jul 19, 2023, 12:25:24 PM
to rabbitm...@googlegroups.com
It doesn't seem configurable currently. Is this a short-lived queue?



--
Michał
RabbitMQ team

Arnaud

Jul 19, 2023, 12:44:43 PM
to rabbitm...@googlegroups.com
Yes and no :)
The queue is supposed to live until the consumer disconnects, but x-expires is set to 1 minute.

I also noticed in my consumer logs that it disconnected a few minutes before.
Then the queue expired on the RabbitMQ side.
Then the consumer connected again and tried to declare the queue.
Then the partition_parallel_timeout issue occurred in the logs.

Could it be some sort of race condition?

Michal Kuratczyk

Jul 20, 2023, 10:22:19 AM
to rabbitm...@googlegroups.com
Yes, quite likely it is a race condition caused by the queue expiration and redeclaration.
There are very few users of queue TTL with quorum queues (quite likely you're the only one),
so that could be it. If you could put together a test app that simulates this kind of workload,
that would be a great way to contribute to fixing the issue.
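A minimal sketch of what such a test app could look like, assuming the Python pika client (the queue name, TTL, and timings are illustrative, not taken from the reported setup):

import time

import pika

# Sketch only: repeatedly declare a quorum queue with a short queue TTL,
# consume briefly, disconnect, wait past the TTL, then redeclare --
# approximating the reported expire/redeclare cycle.
PARAMS = pika.ConnectionParameters("localhost")  # assumed broker address

def one_cycle():
    connection = pika.BlockingConnection(PARAMS)
    channel = connection.channel()
    channel.queue_declare(
        queue="reply_race_test",  # hypothetical queue name
        durable=True,  # quorum queues must be durable
        arguments={"x-queue-type": "quorum", "x-expires": 60000},
    )
    channel.basic_consume(
        queue="reply_race_test",
        on_message_callback=lambda ch, method, props, body: None,
        auto_ack=True,
    )
    connection.process_data_events(time_limit=5)  # consume for a few seconds
    connection.close()  # consumer goes away; the x-expires clock starts

while True:
    one_cycle()
    time.sleep(65)  # wait until the queue has expired, then redeclare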



--
Michał
RabbitMQ team

Karl Nilsson

Jul 20, 2023, 10:29:02 AM
to rabbitm...@googlegroups.com
For this use case it would be _much_ better to use an exclusive classic queue. Temporary queues are not a use case that quorum queues fit well.

Cheers
Karl



--
Karl Nilsson

Arnaud Morin

Jul 21, 2023, 5:05:40 AM
to rabbitm...@googlegroups.com
Ok,

We recently switched from classic non-HA queues to quorum queues because we
don't want to lose a queue when a node is down or under maintenance.

The queues have a short expiration even though they can be used by clients
for a very long time.

The short expiration time is mostly useful so that fanout queues get deleted
before they fill up with a LOT of messages when a client is restarted
(the client uses a random part in its queue names, so each time it is
restarted a new fanout queue is declared, leaving the former one around
forever/until expiration).
(For the record, our client is OpenStack - mostly the neutron and nova
components.)

From the discussion we had, we see a few points of improvement:
- start looking at streams to replace the fanouts (see the sketch after this list)
- stop using random names in queues, so we can reuse the same queue after a
restart (this would also avoid Erlang atom exhaustion)
- increase the queue expiry to avoid the race condition
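For the streams point, a minimal sketch of declaring a stream in place of a per-client fanout queue, assuming the Python pika client (the stream name is illustrative):

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# A stream is declared like a queue, but with x-queue-type=stream; it must be
# durable and cannot be exclusive or auto-delete. One shared stream can
# replace the per-client fanout queues.
channel.queue_declare(
    queue="agent-fanout-stream",  # hypothetical stable name
    durable=True,
    arguments={"x-queue-type": "stream"},
)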

I'll try to build the test app Michal requested, but I am not yet
sure how to reproduce the issue without the load.

Thanks for your answers.

Arnaud.

Michal Kuratczyk

Jul 21, 2023, 5:19:15 AM
to rabbitm...@googlegroups.com
Hi,

Either I don't understand something or these requirements conflict with one another.
If you want the queue to be deleted when the consumer is not present (e.g. restarted), then that's what exclusive queues are for (no need for queue TTL).
Then you can have random names; the queue gets deleted when the consumer goes away.

An exclusive queue is always on the same node as the connection, so you don't need to worry about the node being down - if the node goes down,
the connection will be terminated and the queue will disappear, which is what you want.

So as Karl said - exclusive queues should be a much better option.
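For reference, a minimal sketch of declaring such an exclusive, server-named queue, assuming the Python pika client:

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# An empty queue name lets the broker generate one; exclusive=True ties the
# queue to this connection, so the broker deletes it automatically when the
# consumer disconnects -- no x-expires needed.
result = channel.queue_declare(queue="", exclusive=True)
reply_queue = result.method.queue  # the server-generated name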

Best,




--
Michał
RabbitMQ team

Arnaud Morin

Jul 23, 2023, 8:15:49 AM
to rabbitm...@googlegroups.com
Hey,

What we want is to avoid losing any messages when a node goes down.
Here is why:
If we go with exclusive queues, as soon as a node is down, we lose 1/3 of
the infrastructure's queues (if balanced correctly).
While most of these queues are empty, some of them may not be.
For those, the messages in the queues are lost forever, creating
inconsistencies in the OpenStack region. Unfortunately, OpenStack is not
always able to recover from this automatically. We usually have to
either ask OpenStack to execute the action again, or fix the consistency
manually.
Sorry for talking about OpenStack here, that is not the point, but
it may help you understand our use case :)

Cheers,
Arnaud.

kjnilsson

Aug 15, 2023, 4:02:49 AM
to rabbitmq-users
Hi Arnaud,

I think changing your approach to use a small set of quorum queues that do not expire, instead of temporary ones, will be much more reliable.

Cheers

Arnaud Morin

Aug 16, 2023, 4:13:22 AM
to rabbitm...@googlegroups.com
Hey,

Yes, and, for the record, here is what we did to achieve something
stable:

- we switched all our fanout queues to streams (greatly reducing the
number of queues)
- we switched from random queue naming to consistent naming (so on
service restart the queues are reused) --> as a result we tweaked
our policy to increase the time before an unused queue is deleted (so
the system is no longer deleting/creating queues in a loop)

With both changes, we greatly reduced the queue churn and everything works
fine!

Now it's time to push these changes into the upstream OpenStack code.

Thank you for your time and valuable answers.

Cheers,

Arnaud.

kjnilsson

Aug 16, 2023, 5:39:44 AM
to rabbitmq-users
Ah, that's great to hear.

Out of curiosity, for the stream fanouts, do you store offsets for your consumers? If not, how do you attach a new or restarted consumer?

Arnaud Morin

Aug 16, 2023, 5:51:00 AM
to rabbitm...@googlegroups.com
On the OpenStack side, the fanouts are mostly used to fill a cache on the
agent, so the agent does not need to make a database call when requesting a
resource.

If the resource is not in the cache (e.g. after a service restart), the
agent asks the DB.

So we don't need to store the offset.
Each time a service restarts, it starts consuming messages from the last one.

OpenStack magic :)
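For the record, a sketch of how such a consumer could attach, assuming the Python pika client and the hypothetical stream name from earlier; "x-stream-offset": "last" starts reading from the most recent chunk of messages, and stream consumers need a prefetch limit and manual acks:

import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

def handle_update(ch, method, properties, body):
    # hypothetical cache-fill logic would go here
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_qos(prefetch_count=100)  # required for stream consumers
channel.basic_consume(
    queue="agent-fanout-stream",  # hypothetical stream name
    on_message_callback=handle_update,
    arguments={"x-stream-offset": "last"},
)
channel.start_consuming()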
