Hi,
I've been reading through the documentation for NATS and NATS Streaming. It seems that NATS Streaming sits on top of the NATS server (gnatsd), which can be clustered.
However, the documentation does not mention anything about queue resilience when running a NATS Streaming cluster with persistence to achieve HA. Are the queues durable across node failure? Are they mirrored or replicated?
What happens to the messages on a given node in case of a network partition or node failure? Will another node pick up the workload, perhaps in a master/slave manner?
If queues are persisted to disk but not replicated, or NATS does not support other nodes picking up the workload of a failed node, what happens when a dead node comes back online and rejoins the cluster? Does it pick up work from where it left off?
If anyone could help clear that up, it would be great. An example of a highly available setup you've built or seen, across a WAN or multiple availability zones, would also be very helpful.
Thanks,
Siraj