Redelivery of acknowledged messages when rebooting NATS nodes in cluster

Rossen Blagoev

Oct 24, 2024, 1:44:29 PM
to nats

We are running some torture tests to see how the NATS cluster behaves when a node or two goes down, and we are running into issues.
We have 3 nodes, deployed as StatefulSets on an Azure Kubernetes cluster. The streams use work queue retention and are replicated on all nodes; no explicit limits are set.
When we take a node down, the cluster seems to stop registering that messages have been acknowledged.

To be specific: our application logic explicitly acks a message after 60 redeliveries (that is, after 60 attempts in which we could not process the message and NAKed it so that it would be redelivered). Under normal conditions an acknowledged message is not redelivered, and we see no further increase in its delivery count. But when a node is down, or while nodes are syncing changes, we occasionally see more than 60 deliveries, which basically means the server has not registered the ack we already sent for that message.
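
A minimal sketch of the ack/nak rule described above, written in Python rather than PHP and not tied to any NATS client library. The names `AckTracker`, `decide`, `stream_seq`, and `num_delivered` are hypothetical; the sketch assumes the delivery count and stream sequence are read from the message metadata, and that "after 60 redeliveries" can be approximated as a delivery count of 60 or more. It also records any message that shows up again after it was acked, which is the symptom in question.

```python
from dataclasses import dataclass, field

MAX_DELIVERIES = 60  # threshold mentioned in the post


@dataclass
class AckTracker:
    """Hypothetical helper: picks ack or nak and records redeliveries after an ack."""
    max_deliveries: int = MAX_DELIVERIES
    acked: set = field(default_factory=set)  # stream sequences we already acked
    redelivered_after_ack: list = field(default_factory=list)

    def decide(self, stream_seq: int, num_delivered: int, processed_ok: bool) -> str:
        if stream_seq in self.acked:
            # The message came back although we acked it earlier:
            # record it for diagnosis and ack again.
            self.redelivered_after_ack.append((stream_seq, num_delivered))
            return "ack"
        if processed_ok or num_delivered >= self.max_deliveries:
            # Either processing succeeded or we give up at the threshold.
            self.acked.add(stream_seq)
            return "ack"
        # Processing failed and we are below the threshold: ask for redelivery.
        return "nak"


if __name__ == "__main__":
    tracker = AckTracker()
    print(tracker.decide(stream_seq=7, num_delivered=1, processed_ok=False))
    print(tracker.decide(stream_seq=7, num_delivered=60, processed_ok=False))
    print(tracker.decide(stream_seq=7, num_delivered=61, processed_ok=False))
    print(tracker.redelivered_after_ack)
```

Keeping the set of acked sequences in process memory is only good enough for a test run; it is lost on restart, so it helps with diagnosing the problem rather than preventing duplicate processing.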

 

We are using PHP and the only library available for it. NATS has been updated to 2.10.22, but similar behaviour was observed with 2.10.19.