Consul cluster deployed to docker swarm

jankows...@gmail.com

Aug 18, 2017, 12:36:17 PM
to Consul
We're trying to configure Consul (version 0.9.2) to run in a high availability environment. We deploy a Consul cluster to a docker swarm (version 17.05.0-ce) of 3 manager nodes, where each node hosts a docker container running a Consul server. In our development environment, each swarm node is an Ubuntu-based (16.04.2 LTS) virtual machine running under KVM.

I test HA by manually killing (virsh destroy) the VM hosting the Consul leader and then manually restarting (virsh start) the dead VM. I expect to see a new leader elected from among the 2 surviving Consul servers. Once the dead VM is running again, I expect a new instance of Consul server to successfully re-join the cluster. I then repeat by killing the VM hosting the new leader. The cluster never survives many repetitions of this cycle.

For a cluster of 3 Consul servers, I expect to see "raft:num_peers = 2" in the output of "consul info" executed on any server. In practice I have observed values of 0, 1, or 3 for num_peers, and once that happens the cluster starts failing to recover. Consul is configured to run on its own subnet 172.29.20.0/29. Here's a sample test run (the dot notation indicates the IP address of the container that failed or restarted; e.g. .3 is shorthand for 172.29.20.3):

(1) Kill leader node3 (.3 failed) -> new leader node2; Restore node3 (.6 started) -> joined cluster
(2) Kill leader node2 (.5 failed) -> new leader node1; Restore node2 (.3 started, node1 LOST leadership to node3) -> joined cluster

    Node1 LOST leadership to node3:
    "[ERR] consul: failed to add raft peer: leadership lost while committing log"
    "[ERR] consul: failed to reconcile member: {9d9dc35a40b1 172.29.20.3 8301 map[expect:3 id:b315b134-2efe-0946-4296-244a454fa9bc vsn:2 build:0.9.2:75ca2ca wan_join_port:8302 vsn_min:2 vsn_max:3 port:8300 dc:dc1 role:consul raft_vsn:2] alive 1 5 2 2 5 4}: leadership lost while committing log"

(3) Kill leader node3 (.6 failed) -> Leader election FAILED (raft:num_peers = 1 @node1; raft:num_peers = 0 @node2) 
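For reference, the num_peers figures above come from the raft section of "consul info". A quick way to script this check across servers is to parse that output; the following is a hypothetical helper of my own (the parse_consul_info function and the sample text are illustrative, not from the thread), sketched against the "section:" / tab-indented "key = value" format that Consul 0.9.x prints:

```python
# Hypothetical helper: parse the "section:\n\tkey = value" layout of
# `consul info` (Consul 0.9.x) into a nested dict, so a test harness can
# assert raft.num_peers == 2 after each kill/restore cycle.

def parse_consul_info(text):
    sections = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        # Section headers are unindented lines ending in ":".
        if not line.startswith(("\t", " ")) and line.rstrip().endswith(":"):
            current = line.strip().rstrip(":")
            sections[current] = {}
        # Everything else is an indented "key = value" pair.
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            sections[current][key.strip()] = value.strip()
    return sections

if __name__ == "__main__":
    # Trimmed sample of `consul info` output (values illustrative).
    sample = (
        "raft:\n"
        "\tapplied_index = 28\n"
        "\tnum_peers = 2\n"
        "\tstate = Leader\n"
        "serf_lan:\n"
        "\tmembers = 3\n"
    )
    info = parse_consul_info(sample)
    print(info["raft"]["num_peers"])  # a healthy 3-server cluster shows 2
```

In practice one would feed this the output of "docker exec <container> consul info" from each node and flag any server whose raft.num_peers differs from 2.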

We mount Consul's data-dir to a directory on the host VM because the Consul documentation appears to strongly encourage it:

-data-dir - This flag provides a data directory for the agent to store state. This is required for all agents. The directory should be durable across reboots. This is especially critical for agents that are running in server mode as they must be able to persist cluster state. Additionally, the directory must support the use of filesystem locking, meaning some types of mounted folders (e.g. VirtualBox shared folders) may not be suitable.

The following are the contents of the stackfile we use for docker swarm deployment. Do you see anything wrong with this configuration that would threaten HA operation? Thank you...

version: "3"
services:
  consul:
    image: consul:0.9.2

    # Deploy to all docker manager nodes
    deploy:
      mode: global
      placement:
        constraints:
          - node.role == manager
      restart_policy:
        condition: on-failure
    environment:
      CONSUL_LOCAL_CONFIG: "{disable_update_check: true}"
      CONSUL_BIND_INTERFACE: eth0
    entrypoint:
      - consul
      - agent
      - -server
      - -bootstrap-expect=3
      - -config-dir=/consul/config
      - -data-dir=/consul/data
      - -bind={{ GetInterfaceIP "eth0" }}
      - -client=0.0.0.0
      - -ui
      - -rejoin
      - -retry-join=172.29.20.2
      - -retry-join=172.29.20.3
      - -retry-join=172.29.20.4
      - -retry-join=172.29.20.5
      - -retry-join=172.29.20.6
      - -retry-join=172.29.20.7
    networks:
      - net
      - voltha-net
    ports:
      - "8300:8300"
      - "8400:8400"
      - "8500:8500"
      - "8600:8600/udp"
    volumes:
      - /consul/data:/consul/data

networks:
  net:
    driver: overlay
    driver_opts:
      encrypted: "true"
    ipam:
      driver: default
      config:
        - subnet: 172.29.20.0/29
  voltha-net:
    external:
      name: voltha_net

Armon Dadgar

Aug 18, 2017, 8:11:38 PM
to consu...@googlegroups.com, jankows...@gmail.com
Hey,

It’s a bit hard to diagnose given the partial snippets, but my guess is that the Consul server IPs are changing between power loss / restore.
This causes Consul to keep old servers in the quorum, which leads to an outage once enough failures accumulate. This is a known issue (https://github.com/hashicorp/consul/issues/1580) and something we are working on.

In the meantime, you should try to give the servers persistent IPs and enable Autopilot to prevent quorum loss (https://www.consul.io/docs/guides/autopilot.html).
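For anyone landing on this thread later: Autopilot's dead-server cleanup can be turned on in the server config. A minimal sketch, with field names taken from the Autopilot guide linked above (the threshold value here is illustrative, not a recommendation):

```json
{
  "autopilot": {
    "cleanup_dead_servers": true,
    "last_contact_threshold": "200ms"
  }
}
```

The same setting can be changed at runtime with `consul operator autopilot set-config -cleanup-dead-servers=true`. Note that some Autopilot features depend on the servers running Raft protocol version 3 (`-raft-protocol=3`).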

Best Regards,
Armon Dadgar