Moving a peer to a new IP address (Docker)


Raman Gupta

Nov 6, 2014, 10:54:54 AM
to consu...@googlegroups.com
If a peer is stopped and started on a new IP address without leaving the cluster, the peers list and the catalog get out of sync: the peers list contains both the old and the new IP address, while the catalog contains only the new one. The existing members keep trying to contact the old address. It seems the only recovery at this point is to edit the raft/peers.json file.
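For context, raft/peers.json lives under the agent's data directory and is just a JSON array of "host:port" strings for the server peers (this sketch assumes Consul 0.4.x with the default server port 8300, addresses illustrative):

```json
[
  "172.17.0.36:8300",
  "172.17.0.38:8300",
  "172.17.0.40:8300"
]
```

Removing the stale entry (with all servers stopped) and restarting is the manual recovery being described.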

The context of this is Docker: when a container is restarted, Docker gives the restarted container a new IP address.

Here is what I am doing to reproduce:

$ docker run -d --name node1 -h node1 progrium/consul -server -bootstrap-expect 3
$ docker run -d --name node2 --link node1:node1 -h node2 progrium/consul -server -join node1
$ docker run -d --name node3 --link node1:node1 -h node3 progrium/consul -server -join node1
$ docker run -d --name node4 --link node1:node1 -p 8400:8400 -p 8500:8500 -p 8600:53/udp -h node4 progrium/consul -join node1

$ http GET localhost:8500/v1/catalog/nodes
HTTP/1.1 200 OK
Content-Length: 165
Content-Type: application/json
Date: Thu, 06 Nov 2014 15:50:56 GMT
X-Consul-Index: 9
X-Consul-Knownleader: true
X-Consul-Lastcontact: 0

[
    {
        "Address": "172.17.0.36", 
        "Node": "node1"
    }, 
    {
        "Address": "172.17.0.37", 
        "Node": "node2"
    }, 
    {
        "Address": "172.17.0.38", 
        "Node": "node3"
    }, 
    {
        "Address": "172.17.0.39", 
        "Node": "node4"
    }
]

$ http GET localhost:8500/v1/status/peers
HTTP/1.1 200 OK
Content-Length: 58
Content-Type: application/json
Date: Thu, 06 Nov 2014 15:51:17 GMT

[
    "172.17.0.36:8300", 
    "172.17.0.37:8300", 
    "172.17.0.38:8300"
]

$ docker stop node2
$ docker start node2

Now the leader shows failed to connect errors continuously:

    2014/11/06 15:52:40 [WARN] raft: Failed to contact 172.17.0.37:8300 in 14.711937062s

$ http GET localhost:8500/v1/catalog/nodes
HTTP/1.1 200 OK
Content-Length: 165
Content-Type: application/json
Date: Thu, 06 Nov 2014 15:53:28 GMT
X-Consul-Index: 21
X-Consul-Knownleader: true
X-Consul-Lastcontact: 0

[
    {
        "Address": "172.17.0.36", 
        "Node": "node1"
    }, 
    {
        "Address": "172.17.0.40",    <---- IP address of node2 has been updated
        "Node": "node2"
    }, 
    {
        "Address": "172.17.0.38", 
        "Node": "node3"
    }, 
    {
        "Address": "172.17.0.39", 
        "Node": "node4"
    }
]

$ http GET localhost:8500/v1/status/peers
HTTP/1.1 200 OK
Content-Length: 77
Content-Type: application/json
Date: Thu, 06 Nov 2014 15:54:06 GMT

[
    "172.17.0.36:8300", 
    "172.17.0.40:8300", 
    "172.17.0.37:8300",       <------ peers still contains the old IP address
    "172.17.0.38:8300"
]
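The divergence above can be checked mechanically: the catalog is keyed by node name and carries each node's current address, while the Raft peer set is raw "ip:port" entries. A hypothetical Python sketch (data adapted from the output above; in practice the lists would come from GET /v1/catalog/nodes and GET /v1/status/peers) that flags peer entries no longer backed by any catalog node:

```python
# Find Raft peer entries whose IP no longer matches any node's current address.
# Addresses are adapted from the transcript; the helper is illustrative only,
# not an existing Consul command.

catalog_nodes = [
    {"Node": "node1", "Address": "172.17.0.36"},
    {"Node": "node2", "Address": "172.17.0.40"},  # node2 restarted with a new IP
    {"Node": "node3", "Address": "172.17.0.38"},
    {"Node": "node4", "Address": "172.17.0.39"},
]
raft_peers = [
    "172.17.0.36:8300",
    "172.17.0.40:8300",  # node2's new address
    "172.17.0.37:8300",  # node2's old address, left behind
    "172.17.0.38:8300",
]

def stale_peers(nodes, peers, server_port=8300):
    """Return peer entries whose address is not any node's current address."""
    current = {'%s:%d' % (n["Address"], server_port) for n in nodes}
    return [p for p in peers if p not in current]

print(stale_peers(catalog_nodes, raft_peers))  # -> ['172.17.0.37:8300']
```

Anything this returns is exactly the set of entries that would have to be pruned from raft/peers.json by hand.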

Can (or should) Consul see that a peer with the same name as a prior peer is joining from a different IP address, and remove the old peer with that name from the peers list?

Regards,
Raman

Andrew Watson

Nov 6, 2014, 12:23:22 PM
to Raman Gupta, consu...@googlegroups.com
This could also be related to the arp cache issue in docker:
https://github.com/progrium/docker-consul#quickly-restarting-a-node-using-the-same-ip-issue




Raman Gupta

Nov 6, 2014, 1:13:57 PM
to consu...@googlegroups.com, rocke...@gmail.com
On Thursday, November 6, 2014 12:23:22 PM UTC-5, Andrew Watson wrote:
This could also be related to the arp cache issue in docker:
https://github.com/progrium/docker-consul#quickly-restarting-a-node-using-the-same-ip-issue

I tried clearing my ARP cache in between the stop and start, and still had the same issue... I think the ARP cache issue was fixed in Docker upstream.

Regards,
Raman

Alvaro Miranda

Nov 6, 2014, 1:18:27 PM
to Raman Gupta, consu...@googlegroups.com
My understanding is that's the normal behaviour.

They mark that host/IP as failed and will retry for... I think 72 hours?


Read the last paragraph.

If the agent is shut down it does send a leave.

You can do consul leave from another session.
Armon Dadgar

Nov 6, 2014, 1:56:18 PM
to Raman Gupta, consu...@googlegroups.com
What version of Consul are you currently running? There was a known issue affecting
reaping of the peers list that was recently fixed in 0.4.1.

Best Regards,
Armon Dadgar

Raman Gupta

Nov 6, 2014, 2:01:00 PM
to consu...@googlegroups.com, rocke...@gmail.com
On Thursday, November 6, 2014 1:18:27 PM UTC-5, Alvaro Miranda Aguilera wrote:
my understanding is thats the normal behaviour 

they mark that host-ip as fail and will retry for.. i think 72 hours?


read last paragraph 


Interesting... I just tried using "docker exec -it node2 consul leave" to shut down one of the nodes, and that worked perfectly -- the node dropped out gracefully and the peers list was updated immediately. However, when I do "docker stop" it doesn't drop out gracefully, leaving it in the "failed" state. This does not seem to be limited to "docker stop" -- it also enters the failed state if I use "exec" to enter the container and then "kill <consulpid>".

Shouldn't consul respond to SIGTERM the same way as SIGKILL? Or is this something specific to consul running inside docker?

Raman Gupta

Nov 6, 2014, 2:01:47 PM
to consu...@googlegroups.com, rocke...@gmail.com
$ docker exec -it node1 consul version
Consul v0.4.1
Consul Protocol: 2 (Understands back to: 1)

Raman Gupta

Nov 6, 2014, 2:03:39 PM
to consu...@googlegroups.com, rocke...@gmail.com
On Thursday, November 6, 2014 2:01:00 PM UTC-5, Raman Gupta wrote:
Shouldn't consul respond to SIGTERM the same way as SIGKILL? Or is this something specific to consul running inside docker?

I meant to say SIGTERM (kill <pid>) the same way as SIGINT (ctrl-c).

Armon Dadgar

Nov 6, 2014, 2:13:30 PM
to Raman Gupta, consu...@googlegroups.com, rocke...@gmail.com
Oh okay, I think I know what is happening… If you do a SIGINT Consul triggers a graceful
leave, while SIGTERM is a forceful termination. (This is configurable behavior).
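For reference, the configurable behavior mentioned here maps to the agent's leave_on_terminate config option (option name as documented for Consul of this era; whether it fits your deployment is a judgment call). A minimal config fragment that makes SIGTERM also trigger a graceful leave might look like:

```json
{
  "leave_on_terminate": true
}
```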

The issue is that the same node name is leaving/joining at a different IP address.
This is handled by the gossip layer, which appropriately updates the IP, but the Raft layer
cannot handle an IP change.

I’ve opened a ticket here:

Best Regards,
Armon Dadgar


Raman Gupta

Nov 6, 2014, 2:27:16 PM
to consu...@googlegroups.com, rocke...@gmail.com
Excellent. Just out of curiosity, why is the default for "leave_on_terminate" false?

Armon Dadgar

Nov 6, 2014, 6:53:44 PM
to Raman Gupta, consu...@googlegroups.com, rocke...@gmail.com
Legacy I guess. No specific reason.