I have filed the text below as a "potential issue" against 0.4.1, but I also want to ask here: when would a node log such a message?
I think a "suspect message" means a message related to the suspicion protocol used by the underlying serf/gossip/SWIM failure detection mechanism?
Or is it a "suspect message" unrelated to the suspicion protocol?
And what does "refuting" mean?
It seems that the other node keeps suspecting this one, so is this node failing to ack the other to tell it "hey, I am alive"?
I have a 3-node cluster: consul1, consul2 and consul3.
consul1 (172.18.32.130) is down.
Still, as expected, both reads and writes worked with a quorum of consul2 and consul3 up.
Now I stop consul2 and start it again.
As expected, consul3 enters the candidate state, but consul2 does not answer it.
On the other side, consul2 keeps "refuting a suspect message" from
consul3, which explains why it does not answer him, and consul3 keeps
timing out consul2.
Running "consul members" on consul2 shows both nodes as "alive".
How can I debug this?
Log/consul3:
2015/03/02 12:07:50 [ERR] raft: Failed to make RequestVote RPC to
172.18.32.130:8300: dial tcp 172.18.32.130:8300: connection refused
2015/03/02 12:07:51 [INFO] memberlist: Suspect consul2 has failed, no acks received
2015/03/02 12:07:52 [WARN] raft: Election timeout reached, restarting election
2015/03/02 12:07:52 [INFO] raft: Node at 172.18.33.110:8300 [Candidate] entering Candidate state
2015/03/02 12:07:52 [INFO] memberlist: Suspect consul2 has failed, no acks received
2015/03/02 12:07:52 [ERR] raft: Failed to make RequestVote RPC to
172.18.32.130:8300: dial tcp 172.18.32.130:8300: connection refused
2015/03/02 12:07:53 [WARN] raft: Election timeout reached, restarting election
2015/03/02 12:07:53 [INFO] raft: Node at 172.18.33.110:8300 [Candidate] entering Candidate state
2015/03/02 12:07:53 [ERR] raft: Failed to make RequestVote RPC to
172.18.32.130:8300: dial tcp 172.18.32.130:8300: connection refused
2015/03/02 12:07:54 [INFO] memberlist: Suspect consul2 has failed, no acks received
2015/03/02 12:07:55 [WARN] raft: Election timeout reached, restarting election
2015/03/02 12:07:55 [INFO] raft: Node at 172.18.33.110:8300 [Candidate] entering Candidate state
2015/03/02 12:07:55 [ERR] raft: Failed to make RequestVote RPC to
172.18.32.130:8300: dial tcp 172.18.32.130:8300: connection refused
2015/03/02 12:07:56 [INFO] memberlist: Suspect consul2 has failed, no acks received
2015/03/02 12:07:57 [INFO] memberlist: Suspect consul2 has failed, no acks received
2015/03/02 12:07:57 [WARN] raft: Election timeout reached, restarting election
2015/03/02 12:07:57 [INFO] raft: Node at 172.18.33.110:8300 [Candidate] entering Candidate state
2015/03/02 12:07:57 [ERR] raft: Failed to make RequestVote RPC to
172.18.32.130:8300: dial tcp 172.18.32.130:8300: connection refused
2015/03/02 12:07:58 [WARN] raft: Election timeout reached, restarting election
2015/03/02 12:07:58 [INFO] raft: Node at 172.18.33.110:8300 [Candidate] entering Candidate state
2015/03/02 12:07:58 [ERR] raft: Failed to make RequestVote RPC to
172.18.32.130:8300: dial tcp 172.18.32.130:8300: connection refused
2015/03/02 12:07:58 [ERR] agent: failed to sync remote state: No cluster leader
2015/03/02 12:07:59 [INFO] memberlist: Suspect consul2 has failed, no acks received
Log/consul2:
2015/03/02 12:07:32 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:34 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:37 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:39 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:41 [WARN] memberlist: Refuting a suspect message (from: consul2)
2015/03/02 12:07:44 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:44 [ERR] agent: failed to sync remote state: No cluster leader
2015/03/02 12:07:48 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:51 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:54 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:56 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:07:58 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:08:01 [WARN] memberlist: Refuting a suspect message (from: consul3)
2015/03/02 12:08:03 [WARN] memberlist: Refuting a suspect message (from: consul3)
--
You received this message because you are subscribed to the Google Groups "Consul" group.
To unsubscribe from this group and stop receiving emails from it, send an email to consul-tool...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Hey Nicolae,
That sounds right. The issue is that the IP->MAC cache never gets invalidated, so if the IP is re-used and the MAC address is different (which I believe Docker randomly generates), then this issue crops up. Sounds like they have fixed the issue in newer versions of Docker, however.
Best Regards,
Armon Dadgar
Failure detection is done by periodic random probing using a configurable interval. If the node fails to ack within a reasonable time (typically some multiple of RTT), then an indirect probe is attempted. An indirect probe asks a configurable number of random nodes to probe the same node, in case there are network issues causing our own node to fail the probe. If both our probe and the indirect probes fail within a reasonable time, then the node is marked "suspicious" and this knowledge is gossiped to the cluster. A suspicious node is still considered a member of cluster. If the suspect member of the cluster does not dispute the suspicion within a configurable period of time, the node is finally considered dead, and this state is then gossiped to the cluster.
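To make the suspect/refute cycle described above more concrete, here is a minimal Python sketch of one node's local membership view. It is not memberlist's actual code: the class and function names (MembershipView, on_probe_failed, on_alive, refute) are hypothetical, and it assumes the SWIM-style rule that a suspected node disputes the suspicion by gossiping an "alive" message with a higher incarnation number. That dispute step is what I understand the "Refuting a suspect message" log line on consul2 to correspond to.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class State(Enum):
    ALIVE = "alive"
    SUSPECT = "suspect"
    DEAD = "dead"


@dataclass
class Member:
    name: str
    state: State = State.ALIVE
    incarnation: int = 0
    suspect_since: Optional[float] = None


class MembershipView:
    """One node's local view of the cluster (hypothetical, simplified)."""

    def __init__(self, suspicion_timeout: float) -> None:
        self.suspicion_timeout = suspicion_timeout
        self.members: Dict[str, Member] = {}

    def add(self, name: str) -> None:
        self.members[name] = Member(name)

    def on_probe_failed(self, name: str, now: float) -> None:
        # Direct and indirect probes both failed: mark the node suspect.
        # A suspect node is still a member of the cluster.
        m = self.members[name]
        if m.state is State.ALIVE:
            m.state = State.SUSPECT
            m.suspect_since = now

    def on_alive(self, name: str, incarnation: int) -> None:
        # An alive message only overrides what we know if it carries a
        # strictly newer incarnation (assumed SWIM-style rule).
        m = self.members[name]
        if incarnation > m.incarnation:
            m.incarnation = incarnation
            m.state = State.ALIVE
            m.suspect_since = None

    def tick(self, now: float) -> None:
        # Suspicion that is not disputed within the timeout becomes "dead".
        for m in self.members.values():
            if (m.state is State.SUSPECT
                    and now - m.suspect_since >= self.suspicion_timeout):
                m.state = State.DEAD


def refute(own_incarnation: int, suspected_incarnation: int) -> int:
    """Incarnation the suspected node would gossip to dispute a suspicion."""
    return max(own_incarnation, suspected_incarnation) + 1


if __name__ == "__main__":
    view = MembershipView(suspicion_timeout=5.0)  # consul3's view, say
    view.add("consul2")
    view.on_probe_failed("consul2", now=0.0)
    # consul2 hears the suspicion and refutes it with a newer incarnation.
    view.on_alive("consul2", refute(0, 0))
    view.tick(now=10.0)
    print(view.members["consul2"].state.value)
```

In this sketch a refutation only helps if the alive message actually reaches the suspecting node. If it does not, the suspecting node keeps re-suspecting while the suspected node keeps refuting, which resembles the repeating pattern in the two logs above.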
Community chat: https://gitter.im/hashicorp-consul/Lobby
---
To view this discussion on the web visit https://groups.google.com/d/msgid/consul-tool/d5d25601-42e2-4bbd-adef-7832a8e48453%40googlegroups.com.