Single node cluster fails to elect itself leader, even with bootstrap


Clay Bowen

Mar 28, 2016, 1:49:45 PM
to Consul
I'm transitioning off of a datacenter and have moved Consul to a different location, but due to some limitations I've had to leave one server behind in the old datacenter.  That server, however, refuses to act as a leader.

When I start, this is what I get:

[root@vault ~]# tail -f /var/log/consul
        Datacenter: 'devprodaws'
            Server: true (bootstrap: true)
       Client Addr: 0.0.0.0 (HTTP: 8500, HTTPS: -1, DNS: 53, RPC: 8400)
      Cluster Addr: 10.16.19.138 (LAN: 8301, WAN: 8302)
    Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
             Atlas: <disabled>

==> Log data will now stream in as it occurs:

    2016/03/28 17:46:05 [ERR] agent: failed to sync remote state: No cluster leader
    2016/03/28 17:46:07 [WARN] raft: Heartbeat timeout reached, starting election
    2016/03/28 17:46:07 [ERR] consul: failed to reconcile member: {vault.<company>-external.com 10.16.19.138 8301 map[build:0.6.1dev:52ac5530 port:8300 bootstrap:1 role:consul dc:devprodaws vsn:2 vsn_min:1 vsn_max:3] alive 1 3 2 2 4 4}: No cluster leader
    2016/03/28 17:46:07 [ERR] consul: failed to reconcile: No cluster leader
    2016/03/28 17:46:08 [WARN] raft: EnableSingleNode disabled, and no known peers. Aborting election.

It continues to complain endlessly about having no leader.

Config is this:
[root@vault ~]# cat /opt/consul/consul.json
{
  "datacenter": "devprodAWS",
  "data_dir": "/opt/consul/data",
  "bind_addr" : "0.0.0.0",
  "client_addr" : "0.0.0.0",
  "log_level": "warn",
  "ui_dir": "/opt/consul/ui",
  "server": true,
  "bootstrap_expect": 1,
  "retry_max": 10,
  "retry_interval": "20s",
  "ports" : {
       "dns": 53,
       "http": 8500,
       "https": -1,
       "rpc": 8400,
       "serf_lan": 8301,
       "serf_wan": 8302,
       "server": 8300
  }
}


Clay Bowen

Mar 28, 2016, 2:39:57 PM
to Consul
Consul info:

[root@vault ~]# consul info
agent:
        check_monitors = 0
        check_ttls = 0
        checks = 0
        services = 26
build:
        prerelease = dev
        revision = 52ac5530
        version = 0.6.1
consul:
        bootstrap = true
        known_datacenters = 1
        leader = false
        server = true
raft:
        applied_index = 3506210
        commit_index = 3506210
        fsm_pending = 0
        last_contact = never
        last_log_index = 3506210
        last_log_term = 10944
        last_snapshot_index = 3500890
        last_snapshot_term = 10936
        num_peers = 0
        state = Follower
        term = 10944
runtime:
        arch = amd64
        cpu_count = 1
        goroutines = 59
        max_procs = 1
        os = linux
        version = go1.5.1
serf_lan:
        encrypted = false
        event_queue = 1
        event_time = 2
        failed = 0
        intent_queue = 1
        left = 0
        member_time = 2
        members = 1
        query_queue = 0
        query_time = 1
serf_wan:
        encrypted = false
        event_queue = 0
        event_time = 1
        failed = 0
        intent_queue = 0
        left = 0
        member_time = 1
        members = 1
        query_queue = 0
        query_time = 1

Sean Chittenden

Mar 29, 2016, 3:34:18 AM
to Consul
Hello Clay.  How did the other servers leave the cluster?  In the meantime, can you change your `"bootstrap_expect": 1` config option to just `"bootstrap": true` and see if that works for your single-node datacenter "cluster"?  -sc
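(For anyone following along, a minimal sketch of that change against the config posted above -- only the affected keys are shown; everything else stays as-is:)

```json
{
  "datacenter": "devprodAWS",
  "data_dir": "/opt/consul/data",
  "server": true,
  "bootstrap": true
}
```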

Clay Bowen

Mar 29, 2016, 11:53:58 AM
to Consul
Hey Sean.  Yeah, I tried it with just "bootstrap" and got nothing.  Since I'm using this Consul as the backend for Vault, I backed up the KV store (using consul-backup) from the migrated Consul.  I then renamed the "data" directory under Consul to a different name and restarted Consul, and it came up perfectly.  I then restored the KV backup.  I'm able to start Vault and unseal it, but now I'm getting:

[ERR] core: failed to acquire lock: Existing key does not match lock use

and Vault won't consider itself "active".

I'm still working on it, but if you have any ideas I'm receptive.
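(The dump-and-restore part of the workaround above can also be done with Consul's stock KV HTTP API instead of consul-backup; this sketch assumes a local agent on the default port, and the `/v1/kv/?recurse` endpoint with base64-encoded `Value` fields is per the documented KV API. A thing that's easy to get wrong when restoring by hand is the base64 decode, shown at the bottom:)

```python
import base64
import json
import urllib.request

CONSUL = "http://127.0.0.1:8500"  # assumed local agent address

def dump_kv():
    """Fetch every key via GET /v1/kv/?recurse; Values come back base64-encoded."""
    with urllib.request.urlopen(CONSUL + "/v1/kv/?recurse") as resp:
        return json.load(resp)

def restore_kv(entries):
    """PUT each key back; the API expects the raw value, so decode first."""
    for entry in entries:
        raw = base64.b64decode(entry["Value"] or "")  # Value may be null
        req = urllib.request.Request(
            CONSUL + "/v1/kv/" + entry["Key"], data=raw, method="PUT")
        urllib.request.urlopen(req)

# The decode step applied to a made-up sample entry:
sample = {"Key": "vault/core/lock", "Value": base64.b64encode(b"held").decode()}
print(base64.b64decode(sample["Value"]))  # -> b'held'
```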

Thanks,
Clay

Clay Bowen

Mar 29, 2016, 12:02:46 PM
to Consul
Fixed -- I removed the core:lock and core:leader keys in Consul (in the Vault KV) and restarted Vault.  It's working perfectly now.  Since I had a backup, it was easy to take a chance on removing what could have been important keys; if a problem occurred I could just restore.
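(For anyone hitting the same lock error on 0.6.x: the keys can be deleted through the KV HTTP API. The `vault/` prefix below is the Consul storage backend's default path and is an assumption here -- check your Vault storage config for the actual prefix, and dump the KV store first:)

```shell
# Back up the whole KV tree, then remove Vault's lock/leader keys.
curl -s "http://127.0.0.1:8500/v1/kv/?recurse" > kv-backup.json
curl -X DELETE "http://127.0.0.1:8500/v1/kv/vault/core/lock"
curl -X DELETE "http://127.0.0.1:8500/v1/kv/vault/core/leader?recurse"
# Then restart Vault and unseal it again.
```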

Thanks,
Clay