Hi all,
Apologies for the late reply, and thank you to Armon for a great explanation and to Justin for the auto-scaling example.
I'm very short on time to get my project into production, so I'm going to settle for a fixed number of instances at first. However, I'm still concerned about servers joining and leaving the cluster: in my tests I can only get it working by bootstrapping all servers together, and I can't let one instance leave and then rejoin without getting a "No cluster leader" error.
/etc/consul/config.json:
{
"datacenter": "cg",
"data_dir": "/opt/consul",
"log_level": "INFO",
"server": true
}
/etc/consul/bootstrap.json:
{
The bootstrap setting is in a separate file; I'm not sure it was needed, but at one point I was thinking of deleting it after the initial bootstrap.
After starting consul agent (with config dir set to /etc/consul) on both machines, I do "consul join 10.0.0.1 10.0.0.2" (not the real IPs).
However, if I Ctrl+C the consul agent on one of the machines (or reboot the machine), I break the quorum and can't seem to repair it. Stopping consul makes the machine leave the cluster, so the raft/peers file just contains null. If I then start consul again and do "consul join 10.0.0.1" (if that's the IP of the one left running), the two servers fail to elect a leader.
So far I can only get a working consul cluster by starting from a clean slate: booting both machines with an empty consul data dir (/opt/consul), using bootstrap-expect 2, and then running consul join again (and finally importing the backed-up data with consulate).
Is this expected behaviour?
I only used 2 consul server agents on the development stack, but I know the docs recommend 3 or 5. If I go up to 3 servers, will it solve the leader election problem? i.e. machines A, B and C are in the cluster and C leaves, then tries to rejoin, will A and B agree on a leader and welcome back C with a simple "consul join" request from C? If so, then I can stretch to a 3rd instance even in development.
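To make the 2-versus-3 question concrete, here is the majority arithmetic as I understand it from the general Raft rule (a leader needs a strict majority of the voting servers). This is only a sketch in Python; the function names are my own and are not part of Consul, and it does not model what happens to the peer set when a server leaves gracefully, so it may not fully explain the behaviour I'm seeing.

```python
def quorum(servers: int) -> int:
    """Smallest strict majority of `servers` voting members."""
    return servers // 2 + 1


def failures_tolerated(servers: int) -> int:
    """How many servers can be lost while a majority still remains."""
    return servers - quorum(servers)


if __name__ == "__main__":
    # Compare the cluster sizes mentioned in this thread and in the docs.
    for n in (2, 3, 5):
        print(f"{n} servers: quorum={quorum(n)}, "
              f"tolerated failures={failures_tolerated(n)}")
```

If that arithmetic applies here, a 2-server cluster cannot lose either server and still elect a leader, whereas with 3 servers A and B would still form a majority while C is away.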
About bootstrap-expect, if the same server tries to rejoin the cluster later, presumably it shouldn't have bootstrap-expect anymore? (Tried with and without, no difference).
Thanks,
Lars.