Hi,
We have a cluster of two network brokers, and today we shut down one node to test HA. The servers all connected to the surviving broker, but our clients began failing intermittently with the following errors:
error 2020/12/02 19:16:24: natswrapper.rb:145:in `block in start' Error in NATS connection: NATS::IO::SocketTimeoutError: NATS::IO::SocketTimeoutError
error 2020/12/02 19:16:25: client.rb:39:in `rescue in initialize' Timeout occured while trying to connect to middleware
The facts application failed to run, use -v for full error backtrace details: execution expired
warn 2020/12/02 19:16:25: natswrapper.rb:138:in `block in start' Disconnected from NATS: NATS::IO::SocketTimeoutError: NATS::IO::SocketTimeoutError
Approximately one in two client runs works OK.
Is this to be expected when a broker node fails? Is there a way to tell the clients that one of the brokers is down?
From the servers' point of view all is fine: they all connected to the surviving broker.
Thank you