Message TTL configuration for heartbeat queue type

Vilius Šumskas

Jul 26, 2023, 3:33:39 AM
to rabbitm...@googlegroups.com

Hi,

 

we have a Spring Boot application which uses one RabbitMQ queue as a heartbeat mechanism. Basically, hundreds of publishers send a message to that queue every 5 minutes. A service on our side then consumes these messages, and this way we know whether publishers are dead or alive. We can then use the status of publishers in the application UI or during other backend tasks which require those publishers to be alive.

 

I’m wondering if anyone can share their ideas on how such a heartbeat queue should be configured? For now we are using a classic durable queue. We are also thinking about enabling message TTL on this queue. I’ve read through the RabbitMQ documentation and it seems that the recommended way to configure TTL is to use policies. However, given that there is only one such heartbeat queue, and given that the RabbitMQ Java Client doesn’t support configuring policies, would it be better to use x-message-ttl during queue creation in this case? Can we later change x-message-ttl without recreating the queue? And which solution is faster performance-wise, x-arguments or policies?

 

Another question is regarding the exact x-message-ttl value. Should we sync it with our business logic of “one heartbeat per 5 minutes” or set it to something different entirely? If we sync it with the business logic, should we set it to a little bit more than 5 minutes, or a little bit less? Or maybe we should just use x-message-ttl: 0 and emulate the AMQP immediate flag? Sorry for all these questions, but it’s a little bit over my head how RabbitMQ internally handles TTL.

 

By the way, our RabbitMQ is a cluster of 3 nodes, and the heartbeat queue is mirrored, if that makes a difference.

 

Any pointers are much appreciated!

 

--

   Best Regards,

 

    Vilius Šumskas

    Rivile

    IT manager

 

Michal Kuratczyk

Jul 31, 2023, 3:43:19 AM
to rabbitm...@googlegroups.com
Queue arguments provided during declaration cannot be changed without recreating the queue, which is the main reason a policy is recommended.
There should be no performance difference between the two methods.
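
As a rough sketch of the declaration-time option, the TTL is just an entry in the optional arguments map passed when the queue is declared. The class name `HeartbeatQueueArgs`, the queue name `heartbeats`, and the 310000 ms value are illustrative assumptions. Redeclaring an existing queue with a different `x-message-ttl` is rejected by the broker with a `PRECONDITION_FAILED` channel error, which is why the value is effectively fixed once the queue exists.

```java
import java.util.HashMap;
import java.util.Map;

public class HeartbeatQueueArgs {
    // Builds the optional arguments ("x-arguments") for a queue declaration.
    // The TTL is expressed in milliseconds.
    public static Map<String, Object> withMessageTtl(int ttlMillis) {
        Map<String, Object> arguments = new HashMap<>();
        arguments.put("x-message-ttl", ttlMillis);
        return arguments;
    }

    public static void main(String[] args) {
        // Assumed value: 5 minutes plus a 10 second margin.
        Map<String, Object> queueArgs = withMessageTtl(310_000);
        // With the RabbitMQ Java client, this map would be the last parameter of
        // channel.queueDeclare("heartbeats", true, false, false, queueArgs);
        // where "heartbeats" is an assumed queue name.
        System.out.println(queueArgs);
    }
}
```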

I don't really understand the value of RabbitMQ being involved in the process you described. It basically sounds like application monitoring and that's usually
solved by probing an application's endpoint and/or scraping its metrics. Spring Boot gives you a lot of that out of the box: https://docs.spring.io/spring-boot/docs/current/reference/html/actuator.html#actuator.endpoints.

To interpret the heartbeats, the consumer needs to know what publishers should be available (otherwise how do you interpret a lack of heartbeat?).
If you have an app that knows about all the expected publishers:
1. You can just query their endpoints (that sounds preferable to me)
2. TTL doesn't seem to matter much - your app needs to receive a message for a given publisher ID within 5 minutes since the previous message and that's it.
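
Point 2 can be sketched as a small consumer-side tracker that records the last heartbeat per publisher ID and compares it against the expected interval. This is only an illustration under assumptions: `HeartbeatTracker`, the publisher IDs, and the 10 second grace period are hypothetical and not taken from the application described in this thread.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class HeartbeatTracker {
    private final Map<String, Instant> lastSeen = new ConcurrentHashMap<>();
    private final Duration maxSilence;

    // interval: how often publishers are expected to report (5 minutes in the thread)
    // grace: extra allowance for delivery and processing delays (an assumption)
    public HeartbeatTracker(Duration interval, Duration grace) {
        this.maxSilence = interval.plus(grace);
    }

    // Called by the consumer for every heartbeat message it receives.
    // Keeps the most recent timestamp even if messages arrive out of order.
    public void record(String publisherId, Instant receivedAt) {
        lastSeen.merge(publisherId, receivedAt,
                (previous, current) -> current.isAfter(previous) ? current : previous);
    }

    // A publisher counts as alive if its latest heartbeat is no older than
    // interval + grace. Publishers that never reported count as not alive.
    public boolean isAlive(String publisherId, Instant now) {
        Instant last = lastSeen.get(publisherId);
        return last != null && Duration.between(last, now).compareTo(maxSilence) <= 0;
    }

    public static void main(String[] args) {
        HeartbeatTracker tracker =
                new HeartbeatTracker(Duration.ofMinutes(5), Duration.ofSeconds(10));
        Instant start = Instant.now();
        tracker.record("publisher-1", start);
        System.out.println(tracker.isAlive("publisher-1", start.plusSeconds(60)));
        System.out.println(tracker.isAlive("publisher-1", start.plusSeconds(600)));
    }
}
```

With this kind of check the liveness decision depends only on when the consumer last saw a message for a given ID, so the queue's TTL does not enter into it.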

Best,


--
Michał
RabbitMQ team

Vilius Šumskas

Jul 31, 2023, 4:42:20 AM
to rabbitm...@googlegroups.com

Hi,

 

documentation here https://www.rabbitmq.com/queues.html#optional-arguments says that most of the x-arguments can be controlled dynamically, so I assumed that TTL can be changed too. Anyway, we will use policies if TTLs cannot be changed via parameters.

 

Regarding the functionality itself, we are using a RabbitMQ queue as a heartbeat because all these publishers live on the client side. We don’t control or have access to their network, so we cannot simply query actuator endpoints or employ other similar techniques. There are also thousands of these publishers, and some of them could be down most of the time. Querying all of the publishers constantly would not be very effective, even if it were physically possible. Hence, we rely on what the clients (publishers) are reporting instead of querying them. We have an internal database to know what publishers should be available.

 

We ran this queue without any TTLs for the past few years. It worked great until we had an incident in our backend server-side application. Because of other issues in the infrastructure, it stopped processing RabbitMQ messages from the heartbeat queue. This made the heartbeat queue very big in a very short time – we gathered something like 160k messages in a few hours – which overloaded RabbitMQ memory. With TTL we want to avoid that. Are we on the wrong track here?

 

--

    Vilius

Michal Kuratczyk

Jul 31, 2023, 5:03:43 AM
to rabbitm...@googlegroups.com
You have a few years more of experience using RabbitMQ for this purpose than I do, so I'm not in a position to tell you if you are on the wrong track. ;)

The exact value of TTL doesn't seem to change much. As an alternative, I think you could consider a subscription to the heartbeat queue, rather than publishing to it.
Then you can query for consumer details using the Management API:

curl 'http://guest:guest@localhost:15672/api/queues/%2F/heartbeats/' | jq .consumer_details

Pros and cons:
+ no risk of accumulating messages as it happened in the past
+ you can easily change how often you check for present "publishers" (technically consumers, but they would come from the same client app)
- if your heartbeat contains some useful data, then that'd be lost

Best,

Vilius Šumskas

Jul 31, 2023, 5:27:08 AM
to rabbitm...@googlegroups.com

Thank you for your input.

 

I think we had an idea at some point to do it via subscriptions, but later decided to implement heartbeats as we needed some telemetry about the client (OS, CPU, RAM, etc.).

 

Why do you say the TTL value doesn’t matter much? I’m wondering how fast RabbitMQ internals are at delivering those messages. For example, if a heartbeat counts as failed after more than 5 minutes, do I set the message TTL to 5 minutes and 10 seconds to be safe? Maybe more?

Michal Kuratczyk

Jul 31, 2023, 5:54:25 AM
to rabbitm...@googlegroups.com
First, because I assume 5 minutes is an arbitrary number, so I guess it doesn't matter much whether it's actually 5m or 5m10s.

Second, I expect most messages to be delivered almost immediately - nowhere near 5 minutes. So TTL would only kick in in those rare failure cases where messages are not consumed.
When that's the case, it likely doesn't matter much either - we have no idea when the consumer will be fixed.

If you just want to prevent the queue from growing too much in the failure case, max-length sounds like a good solution, but TTL can be used instead of, or together with it.
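
A policy that combines both limits could be applied through the Management HTTP API (`PUT /api/policies/{vhost}/{name}`), for example from Java. The sketch below only builds the request. The policy name `heartbeat-limits`, the queue pattern `^heartbeats$`, the limit values, and the `guest` credentials on `localhost:15672` are assumptions for illustration. Note that policies use the keys `message-ttl` and `max-length`, without the `x-` prefix used for queue arguments.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class HeartbeatPolicy {
    // JSON body of the policy. The pattern is inserted as-is, so it must not
    // contain characters that need JSON escaping (quotes, backslashes).
    public static String policyJson(String queuePattern, long ttlMillis, long maxLength) {
        return "{\"pattern\":\"" + queuePattern + "\","
                + "\"apply-to\":\"queues\","
                + "\"definition\":{\"message-ttl\":" + ttlMillis
                + ",\"max-length\":" + maxLength + "}}";
    }

    // Builds (but does not send) the PUT request for the default vhost "/",
    // which is URL-encoded as %2F in the path.
    public static HttpRequest putPolicy(String baseUrl, String user, String password,
                                        String policyName, String json) {
        String credentials = Base64.getEncoder()
                .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/policies/%2F/" + policyName))
                .header("Content-Type", "application/json")
                .header("Authorization", "Basic " + credentials)
                .PUT(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    public static void main(String[] args) {
        // Assumed limits: 5m10s TTL and at most 10000 ready messages.
        String json = policyJson("^heartbeats$", 310_000L, 10_000L);
        HttpRequest request = putPolicy("http://localhost:15672", "guest", "guest",
                "heartbeat-limits", json);
        // To apply it against a running broker:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(request.method() + " " + request.uri());
        System.out.println(json);
    }
}
```

With `max-length` set and the default overflow behaviour, the oldest messages are dropped from the head of the queue once the limit is reached, so the queue keeps the most recent heartbeats while the consumer is down.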

We keep hearing about use cases where users want to effectively store 1 message per ID of some kind (we refer to this concept as a "key-value queue").
We don't have a good answer for this - after all there are plenty of key-value stores to choose from for those who want that, but we might add this one day.
That would allow you to store exactly 0 or 1 message per publisher (perhaps with TTL), so the queue wouldn't grow if messages are not consumed and on top of that,
you can easily get the latest state if the consumer is restarted or something (similar to how last-value-cache exchange allows you to consume the last update from a given publisher,
even if the consumer wasn't present when that last update was sent).

Best,



--
Michał
RabbitMQ team

Vilius Šumskas

Jul 31, 2023, 6:06:49 AM
to rabbitm...@googlegroups.com

That sounds reasonable. Thank you!
