consul leave does not leave


Barry Kaplan

Oct 14, 2016, 3:45:10 AM
to Consul
I have just upgraded all servers and agents to 0.7.0. Everything seems to run fine, except the protocol has remained at 2.

Now I am trying to rebuild the instances the servers are running on. Before I terminate the instances I am running

$ consul leave

When I do, the server logs:

    2016/10/14 07:40:17 [ERR] dns: error starting tcp server: accept tcp [::]:8600: use of closed network connection
    ==> WARNING: Expect Mode enabled, expecting 3 servers
    ==> Starting Consul agent...
    ==> Starting Consul agent RPC...
    ==> Consul agent running!
             Version: 'v0.7.0'
           Node name: 'ops-consul-server-1'
          Datacenter: 'ops'
              Server: true (bootstrap: false)
         Client Addr: 0.0.0.0 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
        Cluster Addr: 10.0.196.12 (LAN: 8301, WAN: 8302)
      Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
               Atlas: <disabled>

    ==> Log data will now stream in as it occurs:

    2016/10/14 07:40:17 [WARN] memberlist: Refuting a suspect message (from: ops-consul-server-1)
    2016/10/14 07:40:17 [WARN] raft: Failed to get previous log: 5614360 log not found (last: 5614358)

And the consul server remains in the cluster with status alive.

What am I missing? 

Barry Kaplan

Oct 14, 2016, 3:46:16 AM
to Consul
Also, the peers list has not changed: the node that ran "leave" is still listed as a peer, both on itself and on the other servers.

James Phillips

Oct 14, 2016, 4:02:19 AM
to consu...@googlegroups.com
Hi Barry,

The protocol remaining at 2 is normal for 0.7.0.

We changed the default leave on terminate behavior for servers
(https://www.consul.io/docs/upgrade-specific.html) to be more
conservative. Is it possible your process supervisor is restarting
consul after it exits when you run "consul leave" and it's rejoining
and then not leaving when you terminate the process? You may want to
change "leave_on_terminate" to true and "skip_leave_on_interrupt" to
false to get back to the pre-0.7.0 configuration.

-- James

On Fri, Oct 14, 2016 at 12:46 AM, Barry Kaplan <mem...@gmail.com> wrote:
> Also the peers is not changed -- the "leave" node is still in the peers for
> itself and the other servers.
>
> --
> This mailing list is governed under the HashiCorp Community Guidelines -
> https://www.hashicorp.com/community-guidelines.html. Behavior in violation
> of those guidelines may result in your removal from this mailing list.
>
> GitHub Issues: https://github.com/hashicorp/consul/issues
> IRC: #consul on Freenode
> ---
> You received this message because you are subscribed to the Google Groups
> "Consul" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to consul-tool...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/consul-tool/34857921-ef5b-4bb3-ac3f-34dc2a777521%40googlegroups.com.
>
> For more options, visit https://groups.google.com/d/optout.
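
For readers following along, the two settings James names could be restored with a small config fragment. This is only a sketch: the `CONFIG_DIR` variable, its `./consul.d` default, and the `leave.json` file name are assumptions for illustration, so point it at whatever directory your agent actually loads its configuration from.

```shell
# Hypothetical config directory; replace with the directory your agent
# is started with (this default is an assumption, not from the thread).
CONFIG_DIR="${CONFIG_DIR:-./consul.d}"
mkdir -p "$CONFIG_DIR"

# Write a fragment that restores the pre-0.7.0 leave behavior:
# leave gracefully on SIGTERM, and do not skip the leave on interrupt.
cat > "$CONFIG_DIR/leave.json" <<'EOF'
{
  "leave_on_terminate": true,
  "skip_leave_on_interrupt": false
}
EOF
```

The agent has to be restarted (or started) with this directory in its configuration for the settings to take effect.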

Barry Kaplan

Oct 14, 2016, 4:10:22 AM
to Consul
I just tried executing

$ consul leave; sudo stop consul

This time the server does leave and is in the left state. 

So is there something that is causing the server to rejoin if not immediately stopped? I do have 'retry_join:[]' in the config, but not 'rejoin' 

James Phillips

Oct 14, 2016, 4:14:41 AM
to consu...@googlegroups.com
Yes the retry-join will cause it to jump back into the cluster again
when it restarts.

Barry Kaplan

Oct 14, 2016, 4:29:18 AM
to Consul


On Friday, October 14, 2016 at 10:14:41 AM UTC+2, James Phillips wrote:
Yes the retry-join will cause it to jump back into the cluster again
when it restarts.

Yes, when it restarts. I was wondering if it blocked the shutdown after a leave. 

It probably was upstart that was restarting it. This means leave followed by stop will have a race condition. I will play with the settings you recommended. Thanks!
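
One way to sidestep the race described above is to confirm the departure before terminating the instance, rather than relying on the timing of `consul leave` followed by `stop`. The helper below is a rough sketch, not a tested procedure: `wait_for_left` is a hypothetical name, and it assumes `consul members` prints the node name in the first column and its status in the third.

```shell
# wait_for_left NODE [TRIES]
# Poll `consul members` once per second until NODE reports status "left".
# Returns 0 once the node has left, 1 if TRIES polls pass without that.
wait_for_left() {
  node="$1"
  tries="${2:-30}"
  i=0
  while [ "$i" -lt "$tries" ]; do
    # Pick the status column for the row whose first column is the node name.
    status="$(consul members | awk -v n="$node" '$1 == n { print $3 }')"
    if [ "$status" = "left" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

Because the local agent is no longer running after a stop, the check would have to run on another member of the cluster, for example `wait_for_left ops-consul-server-1` on a second server after `sudo stop consul` on the first.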
