serf not retrying the connection to peer when "network is unreachable"


shameer n

Apr 21, 2023, 9:27:55 AM
to Serf
Hi All,
I have a serf-based HA application running as a 1-to-1 pair.
I start the serf process as one of the first services after boot-up.
It is a geo-redundant setup (my peer HA instance is on a different network), so I need a static route to reach the peer.
The static route is added 1-2 seconds after serf starts.
I can see from the serf log that it tries to connect twice, but fails with the following error:
    2023/04/21 06:46:40 [DEBUG] memberlist: Initiating push/pull sync with: 10.56.228.200:7946
    2023/04/21 06:46:40 [DEBUG] memberlist: Stream connection from=10.56.228.200:33028
    2023/04/21 06:46:40 [DEBUG] memberlist: Failed to join 10.56.227.200: dial tcp 10.56.227.200:7946: connect: network is unreachable

Since serf does not try the connection again, the member join never happens.

Is there any serf setting for retrying in such scenarios?
Below are the contents of my serf.conf file:

===============================================
{
    "log_level": "trace",
    "leave_on_terminate": true,
    "tags": {
        "timeStamp": "1682074581.47",
        "haMode": "1to1",
        "role": "none",
        "appVersion": "V1",
        "IP": "10.56.227.200"
    },
    "bind": "10.56.227.200:7946",
    "retry_interval": "10s",
    "node_name": "59B22C25A50B6097DB0AE20EA18CC444",
    "reconnect_interval": "10s",
    "start_join": [
        "10.56.227.200",
        "10.56.228.200"
    ],
    "retry_join": [
        "10.56.227.200",
        "10.56.228.200"
    ],
    "interface": "ha",
    "event_handlers": [
        "member-join,member-leave,member-failed,member-update=python /opt/ribbon/ha-arbitration/serfCallback.py member",
        "user:roleChange,user:down,user:startingStandby,user:startingActive,user:active,user:standby,user:custom,user:switchover=python /opt/ribbon/ha-arbitration/serfCallback.py user"
    ],
    "rpc_addr": "127.0.0.1:7373"
}
===============================================
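
For illustration, the retry behaviour being asked about could be approximated outside serf with a small wrapper that keeps issuing `serf join` until the route is up. This is only a rough sketch under assumptions, not a serf feature: it assumes Python 3.7+, that the `serf` binary is on PATH, and that the agent is already running with the RPC address from the config above. The function names `serf_join` and `join_with_retry` and the attempt count and interval are hypothetical.

```python
import subprocess
import time
from typing import Callable


def serf_join(peer: str) -> bool:
    """Ask the local serf agent to join `peer`; True if the CLI exits with 0."""
    # `serf join` talks to the running agent over its RPC address.
    result = subprocess.run(["serf", "join", peer], capture_output=True, text=True)
    return result.returncode == 0


def join_with_retry(
    join_fn: Callable[[str], bool],
    peer: str,
    attempts: int = 30,      # hypothetical default, tune to the boot sequence
    interval: float = 2.0,   # seconds between attempts, also hypothetical
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call join_fn(peer) until it succeeds.

    Returns the number of attempts used, or 0 if every attempt failed.
    """
    for attempt in range(1, attempts + 1):
        if join_fn(peer):
            return attempt
        if attempt < attempts:
            # Wait before the next try so the static route has time to appear.
            sleep(interval)
    return 0


if __name__ == "__main__":
    used = join_with_retry(serf_join, "10.56.228.200")
    if used:
        print(f"joined after {used} attempt(s)")
    else:
        print("gave up: peer still unreachable")
```

Such a script would have to be started after the serf agent, for example from the same boot sequence that adds the static route. It works around the timing problem rather than explaining why the agent stops retrying.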

If anyone could shed some light on this, it would be a great help.
Thanks in advance
-Shameer