Failure Detection Configuration


Liang Yao

Feb 9, 2015, 10:15:37 AM
to ser...@googlegroups.com
According to Serf documentation on Gossip,
-  failure detection is done by periodic random probing using a configurable interval
-  if the suspect member of the cluster does not dispute the suspicion within a configurable period of time, the node is finally considered dead, and this state is then gossiped to the cluster.

It's not clear to me how to configure the above interval and timeout value. Is there a way to configure them through serf cli? What are their default values?

Thanks,
Liang

Armon Dadgar

Feb 9, 2015, 1:26:41 PM
to ser...@googlegroups.com, Liang Yao
Hey Liang,

Sorry for the confusion. Confusingly enough, there is both a Serf library and a Serf CLI.
For the Serf CLI, the configuration is indirect: the “-profile” flag switches
between local, LAN and WAN modes.

You can see the defaults of those here:

The default for LAN mode (standard profile) is a 1 second probe interval,
with a suspicion multiplier of 5 (log(Nodes) * 5 * ProbeInterval).

Hope that helps!

Best Regards,
Armon Dadgar

--
You received this message because you are subscribed to the Google Groups "Serf" group.
To unsubscribe from this group and stop receiving emails from it, send an email to serfdom+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
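
As a rough illustration of the formula quoted above (log(Nodes) * 5 * ProbeInterval), here is a minimal Python sketch of how the suspicion timeout grows with cluster size. The function name is hypothetical, and two details are assumptions that the reply does not state: that the logarithm is base 10, and that the log factor is floored at 1 so small clusters do not get a near-zero timeout.

```python
import math


def suspicion_timeout(nodes: int, probe_interval_s: float = 1.0,
                      suspicion_mult: int = 5) -> float:
    """Seconds a suspected node has to refute the suspicion.

    Follows log(Nodes) * multiplier * ProbeInterval from the reply above.
    ASSUMPTIONS: base-10 logarithm, and a floor of 1 on the log factor.
    """
    node_scale = max(1.0, math.log10(max(1, nodes)))
    return suspicion_mult * node_scale * probe_interval_s


if __name__ == "__main__":
    # LAN-profile values quoted in the thread: 1 s probe interval, multiplier 5.
    for n in (3, 10, 100, 1000):
        print(f"{n:>5} nodes: {suspicion_timeout(n):.1f} s")
```

Under these assumptions the timeout only grows logarithmically, so going from 10 to 1000 nodes triples it rather than multiplying it by 100.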

Liang Yao

Feb 9, 2015, 3:01:46 PM
to ser...@googlegroups.com, liang...@gmail.com
Thanks for the quick responses, it's much appreciated.




vishal yadav

May 17, 2016, 6:40:00 AM
to Serf, liang...@gmail.com
Hey Armon,

Is there a way to change the default values of Serf's ProbeInterval and GossipInterval from the Consul configuration, without needing to build a new Consul executable?

Regards,
Vishal



Armon Dadgar

May 17, 2016, 1:07:52 PM
to ser...@googlegroups.com, vishal yadav, liang...@gmail.com
Hey Vishal,

You can use the “profiles” to switch between LAN/WAN, but for any other values you need to compile a new binary at this point.

Best Regards,
Armon Dadgar
--
This mailing list is governed under the HashiCorp Community Guidelines - https://www.hashicorp.com/community-guidelines.html. Behavior in violation of those guidelines may result in your removal from this mailing list.
 
GitHub Issues: https://github.com/hashicorp/serf/issues
IRC: #serfdom on Freenode
---


László Láng

Aug 5, 2016, 2:52:09 AM
to Serf, vishal...@gmail.com, liang...@gmail.com
Hello,

May I ask a related question in this topic? I have Serf 0.7.0 and I would like to get host failure notifications within 1-2 seconds.
Is that possible with config tuning, or only by recompiling with changed constant values?

Right now, even with the local profile, it takes 6 seconds for the member-failed event to be delivered ([INFO] agent: Received event: member-failed).

Serf on the selected host was killed at 02:46:02; Serf on another host logged the lines below.
The notification reached the subscribed application process at 02:46:08.

    2016/08/05 02:46:05 [INFO] memberlist: Suspect host1 has failed, no acks received
    2016/08/05 02:46:07 [INFO] memberlist: Suspect host1 has failed, no acks received
    2016/08/05 02:46:07 [INFO] serf: EventMemberFailed: host1 20.99.1.0
    2016/08/05 02:46:08 [INFO] agent: Received event: member-failed
    2016/08/05 02:46:09 [INFO] serf: attempting reconnect to host1 20.99.1.0:7946
    2016/08/05 02:46:10 [INFO] serf: attempting reconnect to host1 20.99.1.0:7946
    2016/08/05 02:46:12 [INFO] serf: attempting reconnect to host1 20.99.1.0:7946
    2016/08/05 02:46:14 [INFO] serf: EventMemberReap: host1
    2016/08/05 02:46:15 [INFO] agent: Received event: member-reap


Thank you !
Br, Laci
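
To see where the 6 seconds go, the timestamps in the log above can be broken down per stage. This is only a small Python sketch over the lines quoted in this message; the helper names are made up for illustration, and the kill time (02:46:02) is taken from the text rather than from any log line.

```python
from datetime import datetime

# Log lines copied from the message above (leading indentation removed).
LOG = """\
2016/08/05 02:46:05 [INFO] memberlist: Suspect host1 has failed, no acks received
2016/08/05 02:46:07 [INFO] memberlist: Suspect host1 has failed, no acks received
2016/08/05 02:46:07 [INFO] serf: EventMemberFailed: host1 20.99.1.0
2016/08/05 02:46:08 [INFO] agent: Received event: member-failed
"""


def first_seen(log: str, needle: str) -> datetime:
    """Timestamp of the first log line containing `needle`."""
    for line in log.splitlines():
        if needle in line:
            # The first 19 characters are the "YYYY/MM/DD HH:MM:SS" prefix.
            return datetime.strptime(line[:19], "%Y/%m/%d %H:%M:%S")
    raise ValueError(f"no log line contains {needle!r}")


def stage_latencies(log: str, killed: datetime) -> dict:
    """Seconds spent in each stage between the kill and event delivery."""
    suspect = first_seen(log, "Suspect host1")
    failed = first_seen(log, "EventMemberFailed")
    delivered = first_seen(log, "Received event: member-failed")
    return {
        "kill_to_suspect": (suspect - killed).total_seconds(),
        "suspect_to_failed": (failed - suspect).total_seconds(),
        "failed_to_delivered": (delivered - failed).total_seconds(),
        "total": (delivered - killed).total_seconds(),
    }


if __name__ == "__main__":
    killed_at = datetime(2016, 8, 5, 2, 46, 2)  # from the text, not the log
    for stage, seconds in stage_latencies(LOG, killed_at).items():
        print(f"{stage}: {seconds:.0f} s")
```

The log has one-second resolution, so each stage is only accurate to about a second, but it shows that the delay is spread over the probe, the suspicion period and the event hand-off rather than sitting in one place.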

Armon Dadgar

Aug 8, 2016, 7:14:24 PM
to ser...@googlegroups.com, László Láng, vishal...@gmail.com, liang...@gmail.com
Hey,

Unfortunately, you will need to recompile with modified constant values.
The risk of going down to a 1-2 second interval is that you will be more susceptible
to false positives unless you have a very reliable, low-latency network.

Best Regards,
Armon Dadgar

Ranjib Dey

Aug 9, 2016, 12:10:50 AM
to ser...@googlegroups.com, László Láng, vishal...@gmail.com, liang...@gmail.com
Armon,
we have similar requirements too (i.e. the ability to customize those values) for our Serf (and hence Consul & Nomad) clusters. Would you be interested in accepting a patch for this, i.e. one that lets membership take an arbitrary config? Currently it is hardcoded to fixed WAN or LAN values only. That would at least allow us to start experimenting with these values.

regards
ranjib


Armon Dadgar

Aug 9, 2016, 5:39:39 PM
to ser...@googlegroups.com, Ranjib Dey, vishal...@gmail.com, liang...@gmail.com, László Láng
Ranjib,

We want to expose a few of the knobs up through the configuration, but in general
do not want to allow arbitrary configuration. It requires a pretty deep understanding
of the system to modify those, and the support burden of arbitrary configurations falls
on us as OSS maintainers or via commercial support agreements when we open those up.

Best Regards,
Armon Dadgar

László Láng

May 2, 2017, 7:01:05 AM
to Serf
Hello,

I would like to ask for some advice. I have a Serf 0.8.1 cluster and I want a config that lets me detect node failure quite fast.
More importantly, it must be tolerant of temporary failures.

Now I have only this in config.
{
  "profile"            : "lan",
  "reconnect_interval" : "1s",
  "reconnect_timeout"  : "10s",
  "tombstone_timeout"  : "1s",
  "leave_on_terminate" : true
}

I thought that Serf would check the node's availability 10 times, once every second, and only after 10 member-failed events would it decide on member-reap.
In my application, member-reap triggers an action related to the node, which is quite bad if the node is up and running but Serf reports a false member-reap.

Now I see in my log that the member-reap came right after the first member-failed.

14:08:47 [INFO] agent: Received event: member-join"}
14:10:18 [INFO] memberlist: Marking katerina-428-oam-node-2 as failed, suspect timeout reached (3 peer confirmations)"}
14:10:18 [INFO] memberlist: Suspect katerina-428-oam-node-2 has failed, no acks received"}
14:10:19 [INFO] agent: Received event: member-failed"}
14:10:38 [INFO] agent: Received event: member-reap"}

Why were there no more member-failed events?
What does "3 peer confirmations" mean? Does it prevent the reconnects?

I do not know what caused the member failure; I guess it could be overload or a network problem.

How can I configure Serf to avoid such false member-reap events?

I see quite a lot of errors like this around the member failures:
14:10:18 [ERR] memberlist: Failed TCP fallback ping: read tcp 169.254.0.77:53654->169.254.0.32:7946: i/o timeout"}

How can I avoid it, and what could be the root cause?

Thanks !
Br, Laci

Armon Dadgar

May 2, 2017, 12:28:57 PM
to ser...@googlegroups.com, László Láng
Hey,

You have configured Serf with extremely aggressive settings that will cause it to not handle short lived failures well.
It should be sufficient to use the “LAN” profile and skip the reconnect and tombstone configurations.

Your configuration has Serf attempting to reconnect to failed nodes every second for up to 10 seconds, and then immediately reaping the failed nodes (1s later). This means a node that is offline for even 12 seconds will be permanently reaped from the cluster. Removing those settings will allow Serf to retry for up to 72 hours before reaping.

The root cause of the failure can be any number of things, from CPU starvation, packet loss, network outage, firewalls, etc.

Best Regards,
Armon Dadgar
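
The arithmetic in the reply above can be sketched in a few lines of Python. The sketch reads the three timing settings from the posted config and applies the description given in the reply: reconnect every `reconnect_interval` for up to `reconnect_timeout`, then reap `tombstone_timeout` later. The function names are hypothetical, the duration parser is deliberately simplified (single-unit values such as "1s" or "72h" only), and the result is a model of that description, not of Serf's actual reaper scheduling.

```python
import json
import re

# The config as posted earlier in the thread.
CONFIG = """
{
  "profile"            : "lan",
  "reconnect_interval" : "1s",
  "reconnect_timeout"  : "10s",
  "tombstone_timeout"  : "1s",
  "leave_on_terminate" : true
}
"""

UNIT_SECONDS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}


def parse_duration(text: str) -> float:
    """Parse a single-unit duration such as "10s" or "72h" into seconds.

    Simplification: compound values like "1m30s" are not handled here.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s|m|h)", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return float(match.group(1)) * UNIT_SECONDS[match.group(2)]


def reap_budget(config_json: str) -> dict:
    """Upper bound on reconnect rounds and time until reap, per the reply."""
    cfg = json.loads(config_json)
    interval = parse_duration(cfg["reconnect_interval"])
    timeout = parse_duration(cfg["reconnect_timeout"])
    tombstone = parse_duration(cfg["tombstone_timeout"])
    return {
        "max_reconnect_rounds": int(timeout // interval),
        "seconds_until_reap": timeout + tombstone,
    }


if __name__ == "__main__":
    print(reap_budget(CONFIG))
```

For the posted config this gives at most 10 reconnect rounds and about 11 seconds between the failure being declared and the reap. The log in the original question shows roughly 19 seconds between member-failed and member-reap, so the computed value is best read as a lower bound.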

László Láng

May 2, 2017, 3:20:46 PM
to Serf, flam...@gmail.com
Hello,

Actually, what you described is what I would like to have: "Your configuration has Serf attempting to reconnect to failed nodes every second for up to 10 seconds, and then immediately reaping the failed nodes (1s later). This means a node that is offline for even 12 seconds will be permanently reaped from the cluster."

My problem is that, with this configuration, that is not what happens. Please see the logs below. Based on your explanation I would expect 10 reconnect attempts, but I do not see further member-failed log entries. Is there a log entry for every reconnect attempt after the first member-failed, or do those reconnects happen without logging? I do not have a Wireshark capture, for example; only the Serf logs are available.


14:08:47 [INFO] agent: Received event: member-join"}
14:10:18 [INFO] memberlist: Marking katerina-428-oam-node-2 as failed, suspect timeout reached (3 peer confirmations)"}
14:10:18 [INFO] memberlist: Suspect katerina-428-oam-node-2 has failed, no acks received"}
14:10:19 [INFO] agent: Received event: member-failed"}
14:10:38 [INFO] agent: Received event: member-reap"}

I will investigate the logs I have, but this member-reap happens from time to time in my cloud environment.
If 10 consecutive reconnects fail, at 1-3 second intervals, it would be OK for me to reap the node.
But I would like to have those 10 reconnect attempts, and right now I am not sure that I do.

Thanks for help!
Br, Laci

Armon Dadgar

May 3, 2017, 9:23:08 PM
to ser...@googlegroups.com, László Láng
Hey,

You may need to enable DEBUG level logging to see the right level of information.
When a node fails, Serf prevents every node in the cluster from attempting to reconnect, and instead nodes “flip a coin” to decide if they should try.

This way, if 1 node out of 100 fails, each node has a (1/100) chance of reconnecting so that the node doesn’t get 100 incoming connections per second.
Hope that helps!

Best Regards,
Armon Dadgar
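
The "coin flip" described above can be simulated to show why a single node's log may contain few or no reconnect attempts. This is a minimal Python sketch, not Serf's implementation: it assumes each healthy node independently attempts a reconnect with probability failed / cluster size on every reconnect tick (the 1/100 figure from the reply), and all names are hypothetical.

```python
import random


def reconnect_probability(cluster_size: int, failed: int) -> float:
    """Per-tick chance that one healthy node attempts a reconnect.

    ASSUMPTION: failed / cluster_size, matching the 1-in-100 example above.
    """
    if cluster_size <= 0:
        return 1.0
    return min(1.0, failed / cluster_size)


def simulate_attempts(cluster_size: int, failed: int, ticks: int,
                      seed: int = 0) -> list:
    """Number of reconnect attempts made across the cluster on each tick."""
    rng = random.Random(seed)
    p = reconnect_probability(cluster_size, failed)
    healthy = cluster_size - failed
    return [sum(rng.random() < p for _ in range(healthy))
            for _ in range(ticks)]


def chance_node_never_tries(cluster_size: int, failed: int,
                            ticks: int) -> float:
    """Probability that one given node makes no attempt in `ticks` ticks."""
    return (1.0 - reconnect_probability(cluster_size, failed)) ** ticks


if __name__ == "__main__":
    per_tick = simulate_attempts(cluster_size=100, failed=1, ticks=10)
    print("attempts per tick across the cluster:", per_tick)
    print("chance a given node never tries in 10 ticks:",
          round(chance_node_never_tries(100, 1, 10), 3))
```

Under this model the failed node sees about one incoming attempt per tick in total, while any individual node in a 100-node cluster has roughly a 90% chance of making no attempt at all during a 10-tick window. That fits with needing DEBUG logging on several nodes to observe the reconnects.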