etcd cluster behind a LB and how to debug "Raft Internal Error"

Xu Chen

Jan 28, 2014, 2:09:17 PM
to coreo...@googlegroups.com
Two questions rolled into one here.

1) I have three etcd nodes in a cluster behind a LB. It seems to work fine most of the time. Recently I noticed that requests to etcd are usually redirected, with the redirect pointing at the actual IP of a node. I wonder if there is a way to redirect to the virtual IP instead. I also wonder whether there are potential issues with running an etcd cluster like this.

2) I occasionally (probably 1 out of 15-20 PUTs) see status code 300, "Raft Internal Error". I wonder if there is a way to troubleshoot this further.

Thanks.

Rob Szumski

Jan 29, 2014, 2:05:45 AM
to coreos-dev
I believe etcd redirects each write to the master. Your client will have to follow the redirects to write successfully, and routing through the virtual IP won’t gain you the extra availability it normally does. That said, etcd’s automatic elections will happen just as quickly as your load balancer’s health monitor would trigger.
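
To illustrate what "following the redirects" can look like on the client side, here is a minimal Python sketch. It assumes a follower answers a write with an HTTP redirect whose Location header names the master, and it only helps when the client can actually reach that address. The function names, the `send` parameter, and the hop limit are hypothetical and not part of etcd.

```python
import urllib.error
import urllib.parse
import urllib.request

REDIRECT_CODES = {301, 302, 307, 308}


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from handling redirects so the PUT can be re-sent by hand."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError for the redirect response


def http_put(url, form):
    """Send one PUT and return (status, location_header, body)."""
    data = urllib.parse.urlencode(form).encode()
    req = urllib.request.Request(url, data=data, method="PUT")
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(req, timeout=5) as resp:
            return resp.status, resp.headers.get("Location"), resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location"), err.read()


def put_following_redirects(url, form, send=http_put, max_hops=3):
    """PUT to url, re-sending the same form to each redirect target.

    Returns (final_url, status, body). `send` is injectable so the
    redirect logic can be exercised without a running cluster.
    """
    for _ in range(max_hops + 1):
        status, location, body = send(url, form)
        if status not in REDIRECT_CODES or not location:
            return url, status, body
        # Location may be relative, so resolve it against the current URL.
        url = urllib.parse.urljoin(url, location)
    raise RuntimeError("too many redirects")
```

Capping the number of hops keeps the client from bouncing between nodes indefinitely if an election is in progress and the nodes disagree about who the master is.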

The Raft internal error is surfaced when writes are rejected during a leader election. If elections are happening frequently, you should tune etcd so that an election only happens during a legitimate outage. Cloud environments need higher timeouts than private facilities. Check out this tuning guide:


 - Rob
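
Until the timeouts are tuned, a client can also retry writes that fail with this error. The following is a rough Python sketch, assuming the error arrives as a JSON body with an `errorCode` field equal to 300. The `do_write` callable, the retry count, and the delays are hypothetical and would need adjusting for a real deployment.

```python
import json
import time

RAFT_INTERNAL_ERROR = 300  # assumed errorCode for "Raft Internal Error"


def is_raft_internal_error(body):
    """Return True if body parses as a JSON error with errorCode 300."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        return False
    return isinstance(payload, dict) and payload.get("errorCode") == RAFT_INTERNAL_ERROR


def write_with_retry(do_write, retries=5, base_delay=0.2, sleep=time.sleep):
    """Call do_write() until its response is not a Raft internal error.

    do_write is a hypothetical callable that performs one write and
    returns the raw response body.
    """
    for attempt in range(retries + 1):
        body = do_write()
        if not is_raft_internal_error(body):
            return body
        if attempt < retries:
            # Exponential backoff gives an in-progress election time to finish.
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("write still failing after retries; check election tuning")
```

Retrying only hides the symptom. If the error shows up on 1 in 15-20 writes, the election timeouts are the thing to look at first.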

Xu Chen

Jan 29, 2014, 9:01:02 AM
to coreo...@googlegroups.com
It would be nice to hide the actual IPs of etcd nodes and only expose the virtual IP of the LB. It seems that to make this happen, the LB must be aware of the master node..

Thanks for the tuning guide..

Brandon Philips

Jan 29, 2014, 6:09:11 PM
to coreos-dev
On Wed, Jan 29, 2014 at 6:01 AM, Xu Chen <xch...@gmail.com> wrote:
> It would be nice to hide the actual IPs of etcd nodes and only expose the virtual IP of the LB. It seems that to make this happen, the LB must be aware of the master node..

I don't quite understand your network layout. Would you mind drawing
something up and filing an issue on etcd explaining what sort of
feature you are looking for?

Thanks,

Brandon

Xu Chen

Feb 3, 2014, 3:13:28 PM
to coreo...@googlegroups.com
Basically, you can think of three etcd nodes running on the private range 192.168.0.0/24, which is not reachable from the outside. A load balancer has a public IP open to client access and a private IP on the same subnet as the etcd nodes.

In this case, the client can only reach the public IP on the load balancer. But since etcd isn't aware of the load balancer, I guess it is going to redirect the client to the private IP of the master. In that case, the redirected URL won't be reachable by the client.