Quorum queue TTL support possible, important for others?


Eric Sampson

Mar 1, 2019, 3:19:14 PM
to rabbitmq-users
Hi, we're really looking forward to using quorum queues, but I noticed that queue TTL is not supported. We use queue TTL (along with DLX) heavily to achieve a simple retry strategy for our services, probably like a lot of other RabbitMQ users, and are wondering whether this feature is expected to be supported with quorum queues.

We also might have some heartburn about quorum keeping all messages in-mem (non-lazy) because we tend to be slow-consuming for a bunch of reasons, but this is probably less of an immediate concern than queue TTL support.

Thanks in advance,
Eric

aviv salem

Mar 1, 2019, 3:22:16 PM
to rabbitm...@googlegroups.com
Just to make sure... You're talking about message ttl, and not queue expiry... Right? 

--
You received this message because you are subscribed to the Google Groups "rabbitmq-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rabbitmq-user...@googlegroups.com.
To post to this group, send email to rabbitm...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Michael Klishin

Mar 1, 2019, 3:54:51 PM
to rabbitm...@googlegroups.com
Queue TTL is not implemented for quorum queues because we don't see any reason to use them for transient data.
Implementing queue TTL in theory should be fairly easy since the deletion part is already there.
Message TTL is more involved, we are yet to complete queue length limits for QQs.

Quorum queues take snapshots every so often or after an inactivity period. They don't keep the entire log in memory
and there's an adaptive part that has to do with the current operation rate (the higher the rate, the more is kept in memory).

Once we release 3.8.0-beta.3 you should give it a try on your workloads (e.g. by simulating them with PerfTest).

We are yet to test it but with Erlang 22 there's a chance that carrier memory fragmentation will be less of an issue,
so memory usage peaks should be smoother with any queue type. It will ship around the same time 3.8 will (June 2019).

--
MK

Staff Software Engineer, Pivotal/RabbitMQ

Michael Klishin

Mar 1, 2019, 4:04:41 PM
to rabbitm...@googlegroups.com
I'm not 100% correct. There's a backlog story that makes QQs react to alarms and reduce their memory consumption.
So it's one of the remaining 3.8 items.

As for TTL support, I cannot make any promises as to whether/when message TTL for quorum queues will be included.

Karl Nilsson

Mar 1, 2019, 4:09:14 PM
to rabbitm...@googlegroups.com
How do you use TTL to implement retries?
Karl Nilsson

Michael Klishin

Mar 1, 2019, 4:16:40 PM
to rabbitm...@googlegroups.com
There is a pretty well-known pattern where a rejected message gets DLX'ed to a queue
with message TTL, which then DLX'es expired messages back to the original queue.
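
For readers who have not seen this pattern, here is a minimal Python sketch of the two-queue loop described above. The queue names, the `.retry` suffix, and both helper functions are hypothetical. The `x-dead-letter-exchange`, `x-dead-letter-routing-key`, and `x-message-ttl` keys are the standard RabbitMQ optional queue arguments. The sketch assumes a pika-style channel whose `queue_declare` accepts `queue`, `durable`, and `arguments`, and it dead-letters through the default exchange (`""`) so that each queue name doubles as its routing key.

```python
from typing import Any, Dict


def retry_topology(work_queue: str, retry_delay_ms: int) -> Dict[str, Dict[str, Any]]:
    """Return queue name -> optional arguments for a two-queue delayed-retry loop."""
    retry_queue = f"{work_queue}.retry"  # hypothetical naming convention
    return {
        # Messages rejected with requeue=False are dead-lettered through the
        # default exchange ("") straight to the retry queue.
        work_queue: {
            "x-dead-letter-exchange": "",
            "x-dead-letter-routing-key": retry_queue,
        },
        # Nothing consumes from the retry queue. Messages wait there until the
        # queue-level message TTL expires, then are dead-lettered back.
        retry_queue: {
            "x-message-ttl": retry_delay_ms,
            "x-dead-letter-exchange": "",
            "x-dead-letter-routing-key": work_queue,
        },
    }


def declare(channel, topology: Dict[str, Dict[str, Any]]) -> None:
    """Declare every queue on a pika-style channel (anything with queue_declare)."""
    for name, arguments in topology.items():
        channel.queue_declare(queue=name, durable=True, arguments=arguments)


topology = retry_topology("orders", retry_delay_ms=30_000)
```

A consumer that fails to process a message would reject it with requeue disabled (in pika, `channel.basic_reject(delivery_tag, requeue=False)`), which moves it to `orders.retry`; once the TTL elapses it is dead-lettered back to `orders`.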

Karl Nilsson

Mar 1, 2019, 4:20:32 PM
to rabbitm...@googlegroups.com
I see. Does the DLX queue really need to be replicated? How long are the TTLs?

Eric Sampson

Mar 3, 2019, 10:32:00 AM
to rabbitmq-users
Ugh, sorry all, I wish I could rewrite the original message :( I meant message-ttl set for a quorum queue, not the TTL of the queue itself. I agree that QQ TTL is of no importance for us, but not being able to use message-ttl on a QQ would be really unfortunate for us, because we need it to support the pattern that was mentioned.

That's why I posted this message, to see how many other people would feel the same way and give some feedback to the Rabbit team.

I can totally live without QQ per-message TTL support, but would hope that QQ message-ttl would be supported as a high-priority backlog item.
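
Since three different kinds of TTL are being discussed in this thread, a small Python sketch may help keep them apart. The function names are hypothetical; `x-expires` and `x-message-ttl` are the standard optional queue arguments, and `expiration` is the AMQP 0-9-1 message property, which carries the TTL as a string of milliseconds.

```python
from typing import Any, Dict


def queue_ttl_args(idle_ms: int) -> Dict[str, Any]:
    """Queue TTL (expiry): the queue itself is deleted after being unused for idle_ms."""
    return {"x-expires": idle_ms}


def queue_message_ttl_args(ttl_ms: int) -> Dict[str, Any]:
    """Queue-level message TTL ("message-ttl"): every message in this queue
    expires ttl_ms after it was enqueued."""
    return {"x-message-ttl": ttl_ms}


def per_message_ttl_properties(ttl_ms: int) -> Dict[str, Any]:
    """Per-message TTL: chosen by the publisher for one message, sent in the
    `expiration` property as a string of milliseconds."""
    return {"expiration": str(ttl_ms)}
```

The first two dictionaries would be passed as `arguments` when declaring a queue. With pika, the third could be used as `pika.BasicProperties(**per_message_ttl_properties(60_000))` when publishing.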

Thanks again for all your hard work; I only bring this up because we're super excited to start using QQs. All the best.

Karl Nilsson

Mar 4, 2019, 4:35:33 AM
to rabbitm...@googlegroups.com
It is perfectly possible to implement message TTLs in quorum queues although it would most likely be more coarse-grained due to the overheads and processing patterns of the consensus system. Also as Michael has already mentioned it feels a bit strange to pay all the cost of consensus then to throw the message away. The only reason I can see would be to support the retry pattern as mentioned.

To support the delayed retry pattern does the queue accepting the dead-letter _need_ to be a quorum queue or would a "classic" durable queue with persistent messages provide enough safety for the short time each message spends on the retry queue?
The other queue could be a quorum queue of course. If you tend to get a bit of a backlog and the messages are important then that would be a good choice.
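
As a rough illustration of this split, here is a Python sketch in which the work queue is a quorum queue and the retry queue is a durable classic queue. The queue names and helper functions are hypothetical; `x-queue-type`, `x-message-ttl`, and the dead-letter keys are standard optional queue arguments, and `delivery_mode=2` is the AMQP 0-9-1 value that marks a message as persistent.

```python
from typing import Any, Dict, List

PERSISTENT = 2  # AMQP 0-9-1 delivery_mode for messages that should be written to disk


def mixed_retry_declarations(work_queue: str, retry_delay_ms: int) -> List[Dict[str, Any]]:
    """Keyword arguments for queue_declare: quorum work queue, classic retry queue."""
    retry_queue = f"{work_queue}.retry"  # hypothetical naming convention
    return [
        {
            "queue": work_queue,
            "durable": True,
            "arguments": {
                "x-queue-type": "quorum",  # replicated, holds the real backlog
                "x-dead-letter-exchange": "",
                "x-dead-letter-routing-key": retry_queue,
            },
        },
        {
            "queue": retry_queue,
            "durable": True,
            "arguments": {
                "x-queue-type": "classic",  # not replicated, only holds waiting retries
                "x-message-ttl": retry_delay_ms,
                "x-dead-letter-exchange": "",
                "x-dead-letter-routing-key": work_queue,
            },
        },
    ]


def publish_properties() -> Dict[str, Any]:
    """Properties the publisher should set so messages survive a broker restart."""
    return {"delivery_mode": PERSISTENT}


declarations = mixed_retry_declarations("orders", retry_delay_ms=30_000)
```

Each dictionary can be passed as `channel.queue_declare(**declaration)` on a pika-style channel. The trade-off is that the retry queue lives on a single node, so messages waiting there are unavailable while that node is down.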

Quorum queues aren't meant to be a wholesale replacement for all other queues. They have costs and downsides and in my opinion should only be used when you really, really need that extra safety.


Cheers
Karl

--
Karl Nilsson

Pivotal/RabbitMQ

aviv salem

Mar 4, 2019, 4:56:26 AM
to rabbitm...@googlegroups.com
The rationale for message TTL doesn't have to collide with the safety of QQs...

On our prod system, we need an absolute guarantee of not losing messages.
But in case a message has been sitting on the queue for too long, we dead-letter it to a different queue for backup and inspection, so it won't be in the way of other messages.

So we need the safety of QQs and still the features that (safely) dead-letter messages from the queue.

Eric Sampson

Mar 4, 2019, 11:44:26 AM
to rabbitmq-users
Thanks Karl and Michael, appreciate your input!

Yes, for us the use case would be the retry pattern. Having QQs support message-ttl would make our migration to them so much easier (and more realistic to do) than having to refactor all our services/approach, which we'd have to go back to the business product owners to get approval to pay for.

For our use of the retry pattern, we don't really care how exact the timeout is (within reason) if that would help with the implementation details in terms of coordination/consensus.

An alternative would be the delayed-exchange plugin, if the messages were persisted on more than one node using a more QQ-like approach. The one-node limitation is why we have not used this approach for our messaging architecture. But not sure if that would be more work than QQ message-ttl support or not...

Re your second point, we commonly leave messages on a given retry queue for hours at a time (up to a max of a day for some services), for a total 'retry residence time' of a few days at maximum.
And NServiceBus cascaded retry architecture* for RabbitMQ bus will leave a message on its longest retry queue for 1500 hours or over four years...
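
To make the idea of a cascade of retry queues concrete, here is a small Python sketch. It is not how NServiceBus implements its retries; the delay values, the queue naming scheme, and the `x-retry-attempt` header are all hypothetical and chosen only to match the "hours at a time, up to a day, a few days in total" figures above. It assumes one retry queue per level, each declared with a queue-level `x-message-ttl` equal to that level's delay and dead-lettering back to the work queue.

```python
from typing import Any, Dict, List, Optional, Tuple

HOUR_MS = 3_600_000

# Hypothetical cascade: each retry waits longer, capped at one day per level.
RETRY_DELAYS_MS: List[int] = [1 * HOUR_MS, 6 * HOUR_MS, 24 * HOUR_MS, 24 * HOUR_MS]

# Hypothetical application header (not a RabbitMQ one) counting retries so far.
ATTEMPT_HEADER = "x-retry-attempt"


def retry_queue_name(work_queue: str, level: int) -> str:
    """Name of the retry queue for a given cascade level (hypothetical convention)."""
    return f"{work_queue}.retry.{level}"


def plan_retry(
    headers: Optional[Dict[str, Any]],
    work_queue: str,
    delays: List[int] = RETRY_DELAYS_MS,
) -> Optional[Tuple[str, Dict[str, Any]]]:
    """Return (retry queue, updated headers) for a failed message,
    or None once every level of the cascade has been used."""
    attempt = int((headers or {}).get(ATTEMPT_HEADER, 0))
    if attempt >= len(delays):
        return None
    new_headers = dict(headers or {})
    new_headers[ATTEMPT_HEADER] = attempt + 1
    return retry_queue_name(work_queue, attempt), new_headers


def total_residence_days(delays: List[int] = RETRY_DELAYS_MS) -> float:
    """Total time a message spends waiting if it goes through every level."""
    return sum(delays) / (24 * HOUR_MS)
```

With these example delays a message that exhausts the cascade waits 55 hours in total, a little over two days. A consumer would republish the failed message to the returned queue with the updated headers and acknowledge the original.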

In our architecture, messages waiting to be retried are deemed just as important durability-wise as first-time messages. And we'd like to move away from mirrored queues as fast as we can due to the known issues.

Michael Klishin

Mar 4, 2019, 7:23:40 PM
to rabbitmq-users
Even if features look orthogonal it doesn't mean they cannot be hard to reconcile in terms of implementation.
Raft is not a magic powder you put on features to make them safe. It's a protocol with plenty of limitations, both well and lesser known.

We also have to ship 3.8 at some point and most users would like that to happen in the next few months.

Companies that badly need feature X can contribute it or pay Pivotal (or Erlang Solutions, or someone else) to contribute it.
I'm afraid enthusiasm for a feature alone doesn't bring it any closer to reality.

Michael Klishin

Mar 4, 2019, 7:25:16 PM
to rabbitmq-users
There are plans to make the delayed-exchange-plugin distributed but it's not realistic to get that into 3.8. However,
as a community plugin it's not really tied to RabbitMQ release schedule and can come after 3.8.0 but before 3.9.0.

The best way to make it happen is to contribute engineer time or money.

Allen Sanborn

Apr 9, 2021, 7:04:58 PM
to rabbitmq-users
Eighteen days ago, the Classic HA page was updated to state that "mirroring of classic queues will be removed in a future version of RabbitMQ " and the comment was "Mention that classic queue mirroring will eventually be removed". https://github.com/rabbitmq/rabbitmq-website/commit/5237191865289f86a1f91b5e83e981ecfcf0f9a9

I'd really like some more clarification on those two lines since that is an extremely large statement to make for us and a WHOLE lot of work that we were planning on doing over the next few years.

We are starting down the path of a large migration to RabbitMQ and we are also heavy users of NServiceBus. There are a number of NServiceBus features that rely on per-message TTL for their implementation, e.g. Saga Timeouts, delayed messages, delayed retries and probably more.

That creates quite the quandary for us since if we are heavy users of NServiceBus, we'd like to use RabbitMQ and we are making a lot of changes that will apparently not be supported at some unknown point-in-time in the future by RabbitMQ. We would also be "legacy on arrival" according to other members of the RabbitMQ team due to our current need for per-message TTL on messages. (https://groups.google.com/g/rabbitmq-users/c/WIWhM-bMWn0/m/9jZp-0kmAwAJ). That doesn't give me a great deal of confidence in our approach at the moment. 

Could you please clarify what the timescale is for that "future version" that would deprecate classic HA mirroring?

Do you have any more recent information on the likelihood of per-message TTL for Quorum Queues being implemented? It isn't clear to me whether that is essentially a non-starter or just a feature that was not going to ship in 3.8.* but possibly would in 3.9.



M K

Apr 14, 2021, 8:13:02 AM
to rabbitmq-users
We would appreciate if two year old threads were not bumped with unrelated questions. Starting a new thread
is quick and costs you nothing ;)

Classic queue mirroring should already be considered deprecated. It will be removed most likely in 4.0, so nothing changes in 3.9.

Quorum queues put data safety first and foremost. It doesn't make much sense to our team to choose to use this queue type and then
voluntarily throw data away because messages have TTL attached. It's a feature combination that does not make much sense.

Those who want to have TTL and do not need to use replication
(contrary to the popular belief, not everything has to be replicated), can use classic queues with per-message TTL today and will
be able to do so in the future.

NServiceBus and similar frameworks do not use per-message TTL because they want message TTL or any kind of TTL. They use it for delayed
delivery in the context of retries. We'd like to have a feature that does just that in quorum queues specifically. There is no ETA right now
and for most of the year our focus is on shipping stream queues and a couple of commercial features.

Speaking of stream queues, frameworks such as NServiceBus could adopt them starting with 3.9. They allow for repeated streams and naturally have a TTL
(retention), so the only open ended question is dead lettering.
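
As a rough sketch of what "naturally have a TTL (retention)" looks like in practice, the Python snippet below builds the optional arguments for a stream queue and for a consumer that replays it from the start. The helper names and the example values are hypothetical; `x-queue-type`, `x-max-age`, `x-max-length-bytes`, and `x-stream-offset` are the argument names documented for streams, with `x-max-age` written as a number followed by a unit such as `D` for days.

```python
from typing import Any, Dict, Optional


def stream_queue_arguments(max_age: str, max_length_bytes: Optional[int] = None) -> Dict[str, Any]:
    """Optional arguments for declaring a stream with age-based retention.

    max_age is a string such as "7D" (seven days); older data becomes
    eligible for removal instead of expiring message by message.
    """
    arguments: Dict[str, Any] = {"x-queue-type": "stream", "x-max-age": max_age}
    if max_length_bytes is not None:
        arguments["x-max-length-bytes"] = max_length_bytes
    return arguments


def stream_consumer_arguments(offset: Any = "first") -> Dict[str, Any]:
    """Consumer arguments selecting where in the stream to start reading.

    "first" replays everything still retained; "next" reads only new messages.
    """
    return {"x-stream-offset": offset}
```

Reading a stream does not remove messages, so several consumers (or the same consumer again later) can read the same data until retention discards it.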

M K

Apr 14, 2021, 8:46:36 AM
to rabbitmq-users
Perhaps I should also mention that this is a good example of how RabbitMQ development has been since 3.8 and will continue going forward: more focussed queue types and
feature combinations. Historically classic queues allow you to support a transient mirrored queue with length limits, per-message TTL, queue TTL, priorities enabled and federate that to a separate cluster.
You have to ask yourself, does it really make sense to combine some or most of those features? Or perhaps two queue types, one for transient data with TTL
and another replicated, durable, and federated off-site would make more sense?

So quorum queues won't support every classic queue feature, in 3.9 or any other version. Classic queues won't be mirrored eventually. There are at least two more queue
types in the works for workloads with sets of needs different from the above.

In the process, we would like to make it possible for NServiceBus to provide the features it needs but we cannot and will not try to support every feature
they currently use for every queue type. Fortunately I don't think they need arbitrary combinations either.

Finally, the earliest I'd expect 4.0 to ship is 2022 and 3.9.x should be supported for 18-24 months after it comes out later this year.
Which means, possibly until late 2023. By then there would be a bunch of new options for NServiceBus and similar tools.
No need to panic.

Allen Sanborn

Apr 14, 2021, 7:03:38 PM
to rabbitmq-users
I apologize for bumping a stale thread. Since QQs don’t have per-message TTL, classic mirroring is being deprecated and it is the only option for HA per-message TTL, I assumed that the participants here would be interested. They were all inquiring about that feature and may be reliant on that in their production systems. This seemed like a zero-cost way to notify the people that will be most affected by the statement in the PR.

I'm trying to do what I can to ensure the success of what will be many years of work to migrate to RabbitMQ. If there is a task with a lot of unknowns, then it has a high degree of uncertainty and it is hard to model the risk. Low probability but high-risk scenarios should still be part of the model. So here I am chasing down the unknowns and trying to squish risk.

Reading that this feature is not an option allows us to know what to expect. The lack of interest in QQ per-message TTL is understandable but it was not clear if it would also not be acceptable. Thank you for clarifying that it will most definitely not be in a future release from the RabbitMQ team and that we probably will not get far advocating for or paying to develop that feature despite our enthusiasm. This constraint is a feature.

Thank you for stating the timeline for deprecation. It is good to know that classic mirroring will not be in v4. The end of 2023 is not that far away and knowing that we need to plan for a major bump to the broker and our libraries that interact with it is something that should be in our plan. It takes a while to update, test and deploy a few hundred applications. 

A larger number of simpler queues sounds wonderful. Classic with mirroring has an extremely high concept load and lots of sharp edges. I look forward to forgetting everything I have learned about them.

Thanks again for your response. You were able to validate my assumptions and help clarify some roadmaps. 

M K

Apr 14, 2021, 7:11:33 PM
to rabbitmq-users
Again, I don't think anyone really needs per-message TTL in quorum queues, some projects just need a way to retry deliveries with a delay, which is not something
we oppose, it just will take time to develop this.

You won't have to upgrade client libraries for RabbitMQ 4.0. It's going to support the same protocols we support today. Frameworks such as NServiceBus will need an update
but I doubt the API they expose will meaningfully change.

Rainer Frey

Apr 15, 2021, 6:07:20 AM
to rabbitmq-users
The one scenario where we use TTLs with mirrored queues is indeed delayed retries. A more direct support for exactly this use case would be very welcome. But in the meantime, I'm not sure that I don't need HA for these messages, so I sure hope this will be delivered before mirrored classic queues are going away. With the emphasis in the quorum queue docs on how these are purpose built for very specific use cases, I'm a little surprised that the seemingly more generally applicable HA alternative is already being deprecated.

Carlos Alberto Balseiro Mayi

Apr 19, 2021, 11:35:24 AM
to rabbitmq-users
What about the MQTT and JMS plugins? Are those being moved to quorum queues? Or will they remain on classic queues and lose the mirroring ability?

M K

Apr 28, 2021, 7:58:07 AM
to rabbitmq-users
Neither MQTT nor JMS are tied to any particular queue type. MQTT is not extensible, so it's not clear
how the type of the queue can be specified by a client. We can potentially use them for QoS 1 consumers.
Ideas are welcome in a separate thread.

As for JMS, it does support optional headers IIRC so it can be exposed to clients.

M K

Apr 28, 2021, 8:02:58 AM
to rabbitmq-users
If you use queue replication then by definition you care more about data safety than anything else.
Quorum queues give you exactly that, and most features of classic queues that make sense.

Classic mirrored queues have unfixable design flaws. They cannot be fixed without a complete rewrite. Yes, really.
So the rewrite happened in a backwards-compatible, more focussed way: quorum queues. Then stream queues in 3.9.

More specialised queue types is the only realistic path forward. Packing more and more features into classic queues
is not an option. Whether a certain use case needs throughput or data safety or low per-queue footprint, this cannot be achieved
by packing more and more into classic queues.

Retries is the only missing feature. We get it. Beyond that, there are really no reasons to choose a classic mirrored queue over a quorum
queue or stream queue. Classic mirrored queues are fundamentally broken. A very substantial amount of complex code
and complex behaviour can be removed from RabbitMQ once and for all if classic mirrored queues are removed.

Carlos Alberto Balseiro Mayi

Apr 28, 2021, 8:29:21 AM
to rabbitm...@googlegroups.com
Thanks for the explanation.

I understand what you mean and why it is the most logical way forward. Unfortunately for me, my use case is probably a corner case scenario: a site with two internal, low-latency data centers, so I will probably have to add an off-site, much higher latency third node to keep the same service currently provided with mirrored queues.
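
The arithmetic behind this can be sketched in a few lines of Python. The site names and helper functions are hypothetical; the only fact relied on is that a Raft-based queue needs a strict majority of its members to be available in order to make progress.

```python
from typing import Dict


def quorum_size(members: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return members // 2 + 1


def survives_loss_of(site: str, members_per_site: Dict[str, int]) -> bool:
    """True if the queue still has a majority after one whole site goes down."""
    total = sum(members_per_site.values())
    remaining = total - members_per_site[site]
    return remaining >= quorum_size(total)


# Three members spread over two data centers: one site always holds the majority.
two_sites = {"dc1": 2, "dc2": 1}
# Adding a third, off-site location lets any single site fail.
three_sites = {"dc1": 1, "dc2": 1, "offsite": 1}
```

With `two_sites`, losing `dc1` leaves one member out of three, which is below the majority of two, so the queue becomes unavailable. With `three_sites`, any single site can be lost and two members remain.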


You received this message because you are subscribed to a topic in the Google Groups "rabbitmq-users" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/rabbitmq-users/EawD3cYjfGY/unsubscribe.
To unsubscribe from this group and all its topics, send an email to rabbitmq-user...@googlegroups.com.
To view this discussion on the web, visit https://groups.google.com/d/msgid/rabbitmq-users/0c0513c8-4899-4ce1-b0fd-77b5d8335016n%40googlegroups.com.