Cannot add new node to ETCD cluster - unexpected status code 401

a.sa...@gmail.com

Oct 5, 2016, 7:09:49 AM
to etcd-dev
I have an etcd cluster with 3 nodes. I want to add a new node, but I get an error:

stderr: start to sync cluster using endpoints(http://127.0.0.1:2379,http://127.0.0.1:4001)
cURL Command: curl -X GET http://127.0.0.1:2379/v2/members
cURL Command: curl -X GET http://127.0.0.1:4001/v2/members
got endpoints(https://xxx-29:2379,https://yyy-137:2379,https://zzz-254:2379,https://qqq-82:2379) after sync
Cluster-Endpoints: https://xxx-29:2379, https://yyy-137:2379, https://zzz-254:2379, https://qqq-82:2379
cURL Command: curl -X POST https://xxx-29:2379/v2/members -d "{\"peerURLs\":[\"https://new-127:2380\"]}"
unexpected status code 401



When I check the member list, I see:


start to sync cluster using endpoints(http://127.0.0.1:4001,http://127.0.0.1:2379)
cURL Command: curl -X GET http://127.0.0.1:4001/v2/members
got endpoints(https://xxx-29:2379,https://yyy-137:2379,https://qqq-82:2379,https://zzz-254:2379) after sync
Cluster-Endpoints: https://xxx-29:2379, https://yyy-137:2379, https://qqq-82:2379, https://zzz-254:2379
cURL Command: curl -X GET https://xxx-29:2379/v2/members
cURL Command: curl -X GET https://xxx-29:2379/v2/members/leader
id1: name=i-95d25f28 peerURLs=https://xxx-29:2380 clientURLs=https://xxx-29:2379 isLeader=false
id2: name=i-056f7b98 peerURLs=https://zzz-254:2380 clientURLs=https://zzz-254:2379 isLeader=true
id3: name=i-85d90238 peerURLs=https://yyy-137:2380 clientURLs=https://yyy-137:2379 isLeader=false
id4: name=i-f4e56849 peerURLs=https://qqq-82:2380 clientURLs=https://qqq-82:2379 isLeader=false


Cluster health says:


member id1 is healthy: got healthy result from https://xxx-29:2379
member id2 is healthy: got healthy result from https://zzz-254:2379
failed to check the health of member id3 on https://yyy-137:2379: Get https://yyy-137:2379/health: dial tcp yyy.137:2379: i/o timeout
member id3 is unreachable: [https://yyy-137:2379] are all unreachable
member id4 is healthy: got healthy result from https://qqq-82:2379
cluster is healthy


When I try to remove the broken node from the cluster:


start to sync cluster using endpoints(http://127.0.0.1:2379,http://127.0.0.1:4001)
cURL Command: curl -X GET http://127.0.0.1:2379/v2/members
cURL Command: curl -X GET http://127.0.0.1:4001/v2/members
got endpoints(https://yyy-137:2379,https://xxx-29:2379,https://qqq-82:2379,https://zzz-254:2379) after sync
Cluster-Endpoints: https://yyy-137:2379, https://xxx-29:2379, https://qqq-82:2379, https://zzz-254:2379
cURL Command: curl -X GET https://y-137:2379/v2/members
cURL Command: curl -X GET https://xxx-29:2379/v2/members
cURL Command: curl -X DELETE https://xxx-29:2379/v2/members/id3
Received an error trying to remove member id3: unexpected status code 401


Thank you in advance

anthony...@coreos.com

Oct 5, 2016, 12:29:48 PM
to etcd-dev
What version of etcd?

Also, I suspect you probably don't want to remove 'xxx-29'; the broken node is 'yyy-137'.

a.sa...@gmail.com

Oct 6, 2016, 3:56:49 AM
to etcd-dev
Each node:
etcdctl version 2.3.2

You are right, I want to remove 'yyy-137', and I think that is what I am actually doing. Take a look at this piece of the log:

start to sync cluster using endpoints(http://127.0.0.1:2379,http://127.0.0.1:4001)
cURL Command: curl -X GET http://127.0.0.1:2379/v2/members
cURL Command: curl -X GET http://127.0.0.1:4001/v2/members
got endpoints(https://yyy-137:2379,https://xxx-29:2379,https://qqq-82:2379,https://zzz-254:2379) after sync
Cluster-Endpoints: https://yyy-137:2379, https://xxx-29:2379, https://qqq-82:2379, https://zzz-254:2379
cURL Command: curl -X GET https://y-137:2379/v2/members
cURL Command: curl -X GET https://xxx-29:2379/v2/members
cURL Command: curl -X DELETE https://xxx-29:2379/v2/members/id3
Received an error trying to remove member id3: unexpected status code 401

As far as I understand, etcd picks one node to get information about the cluster. The first element of the list is '-137', and the GET fails because that box is dead. The next node is '-29', and it returns information about the topology. The DELETE request then goes to node '-29' with the id of the dead '-137' node. Am I wrong?
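
The fallback described above (try each endpoint in order, use the first one that answers) can be sketched in shell. This is only an illustration of the idea, not etcdctl's actual implementation. The function names and the `PROBE` hook are hypothetical, and any TLS options the cluster needs (CA or client certificates) are left out.

```shell
#!/bin/sh
# Sketch: walk the endpoint list and return the first member that responds.
# PROBE names the function used to test an endpoint (hypothetical hook, so the
# check can be swapped out); by default it issues a real GET with a short timeout.
PROBE="${PROBE:-probe_with_curl}"

# Succeeds only if the member answers GET /v2/members within 3 seconds.
probe_with_curl() {
  curl --silent --fail --max-time 3 --output /dev/null "$1/v2/members"
}

# Print the first endpoint for which the probe succeeds; return 1 if none do.
first_reachable() {
  for ep in "$@"; do
    if "$PROBE" "$ep"; then
      printf '%s\n' "$ep"
      return 0
    fi
  done
  return 1
}
```

Called as `first_reachable https://yyy-137:2379 https://xxx-29:2379`, it would skip the dead '-137' box and print the '-29' endpoint, which is the member the DELETE is then sent to.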

anthony...@coreos.com

Oct 6, 2016, 1:32:14 PM
to etcd-dev
OK, sorry, you've got it right.

Could you share the etcd server logs around the time of issuing the `curl -X DELETE`? There may be some relevant error messages that would help with debugging. Thanks!

a.sa...@gmail.com

Oct 7, 2016, 9:49:35 AM
to etcd-dev
OK, I just wanted to be sure I understand the behaviour. I looked for logs but did not find anything interesting. Maybe I was checking the wrong places. Can you tell me where I can find more info? Which box should I check?

anthony...@coreos.com

unread,
Oct 7, 2016, 12:29:56 PM10/7/16
to etcd-dev
Is the cluster running with auth enabled? It looks like the only way to get a 401 on that path is if auth is enabled and the requester (e.g., an anonymous curl request) lacks the root role (https://github.com/coreos/etcd/blob/ce63f107382080fda29ed124249b16bfea224dc0/etcdserver/etcdhttp/client.go#L212 ). If there's an error in the logs it would be on the member that receives the curl request, but it seems the code doesn't report any warnings when rejecting anonymous requests so that's probably why the logs look fine.
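
If auth turns out to be the cause, one way to confirm it is to query the v2 auth endpoint and then retry the removal with credentials for a user that holds the root role. The sketch below is a dry run that only prints the commands. The endpoint and member id are taken from the logs in this thread; the `ROOT_USER` variable and the helper function names are hypothetical, and any TLS options the cluster needs are omitted.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of running them.
ENDPOINT="${ENDPOINT:-https://xxx-29:2379}"  # member that receives the request
ROOT_USER="${ROOT_USER:-root}"               # assumption: a user with the root role

# Print the request that asks the v2 auth API whether auth is enabled.
auth_status_cmd() {
  printf 'curl -X GET %s/v2/auth/enable\n' "$1"
}

# Print an authenticated member removal. With "-u user" and no password,
# curl prompts for the password instead of taking it on the command line.
remove_member_cmd() {
  printf 'curl -u %s -X DELETE %s/v2/members/%s\n' "$2" "$1" "$3"
}

auth_status_cmd "$ENDPOINT"
remove_member_cmd "$ENDPOINT" "$ROOT_USER" "id3"
```

If the first request reports that auth is enabled, the anonymous POST and DELETE from the original logs would be expected to fail with 401 until credentials are supplied. etcdctl 2.x also has a global `-u user[:password]` option that should serve the same purpose for `member add` and `member remove`.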