Using Consul in an Amazon auto scaling group?


Lars Janssen

Mar 4, 2015, 5:38:00 AM
to consu...@googlegroups.com
Hi all,

I'm new to Consul and managed to bootstrap my first pair of servers on the test environment yesterday.

I would like to know if anyone has experience of running the agents in server mode in an AWS auto scaling group?

I'm not so much concerned about scaling for high load, more for resilience and auto-recovery. Let's say we use three servers (it could be six if we want two per availability zone in future) and we set the auto scaling group to a fixed size (min 3 instances, max 3 instances). If one instance gets terminated, Amazon will spin up a new one to maintain the correct instance count.

My main concern is that the new instance will have a new IP address, so it can't just "replace" the old one. So:

1. Am I right in thinking it's ok for the new instance to join as a new member of the consensus quorum (which will have already bootstrapped in the past)?

2. Is there a safe/automatic way to clean up/remove the old instance?

Also, not such a big issue, but in terms of discovering the existing members. I guess this needs to be done from outside of Consul, e.g. by querying for the Amazon instances by tag, unless there's any more clever/recommended way to do this.
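For illustration, a tag query along those lines might look something like this (the consul-role tag is just a made-up example, not anything Consul itself defines):

    # Find the private IPs of running instances tagged as Consul servers.
    aws ec2 describe-instances \
        --filters "Name=tag:consul-role,Values=server" \
                  "Name=instance-state-name,Values=running" \
        --query 'Reservations[].Instances[].PrivateIpAddress' \
        --output text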

Any thoughts/experience would be appreciated.

Thanks,

Lars.

Armon Dadgar

Mar 4, 2015, 8:14:58 PM
to Lars Janssen, consu...@googlegroups.com
Hey Lars,

This can certainly be done, but as you point out it requires some additional context that you would need to get by querying the AWS APIs.

WRT your questions, it is fine for a new server to join as a new member with a new IP.
If the cluster is already bootstrapped and has quorum, the new server will be automatically added to the replication group.

There is no automatic / safe way to remove the old instance (under a failure scenario) without
more context. You can query the AWS APIs and then issue a “force-leave” against that node
which will cause it to be removed.
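A sketch of what that cleanup looks like in practice (the node name is whatever the dead server had registered as):

    # The failed server shows up with status "failed" in the member list;
    # force-leave removes it right away instead of waiting for the reap timeout.
    consul members
    consul force-leave <name-of-dead-server>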

Hope that helps!

Best Regards,
Armon Dadgar

Lars Janssen

Mar 5, 2015, 5:11:28 AM
to consu...@googlegroups.com, la...@fazy.net
Hi Armon,

That's very helpful, thanks.

I searched for "force-leave" and found this: http://www.consul.io/docs/commands/force-leave.html

So it seems a node will be removed after 72 hours anyway. If so that's probably ok as I wouldn't expect many failures, so long as the clean-up is complete and automatic.

For now I've left out the auto scaling part, as it seems a lot more work to bootstrap and set up correctly. Specifying a fixed pool of instances gives me fixed IPs (so I can make an internal DNS entry in Route 53 for easier initial discovery), and termination protection. I still hope to do auto scaling and full automatic bootstrapping and recovery in future though.
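As a sketch of that approach, assuming a made-up Route 53 record such as consul.internal.example.com pointing at the fixed server IPs, each agent could simply keep retrying the join through that name:

    # -retry-join accepts a hostname as well as an IP and keeps retrying
    # until the servers are reachable.
    consul agent -config-dir /etc/consul -retry-join consul.internal.example.com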

Regarding the client nodes, what's the usual pattern for running these? For now, I'm thinking of putting a Consul agent running in client mode onto each EC2 instance (e.g. the web server instances), and these are in auto scaling groups and will come and go more frequently. Is this the best way, or should there be a cluster of client agents? i.e. like this:

    [Consul server] [Consul server]
    [Web + Consul client] [Web + Consul client] [Web + Consul client]
    
or like this:
    
    [Consul server] [Consul server]
    [Consul client] [Consul client] [Consul client]
    [Web instance] [Web instance] [Web instance] [Web instance]

Thanks,

Lars.


Armon Dadgar

Mar 6, 2015, 9:02:14 PM
to Lars Janssen, consu...@googlegroups.com, la...@fazy.net
Hey Lars,

The client agent just runs on all the nodes that are part of the cluster. They do not need
any dedicated machines. The clients don’t perform any global functions, so really they are
there to assist the registration and discovery of the other processes on the same machine.
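In practice that means local processes talk to their own agent, for example ("web" is a hypothetical service name):

    # Ask the local client agent (HTTP on 8500 by default) what it has registered,
    # and resolve another service through the agent's DNS interface (8600 by default).
    curl http://127.0.0.1:8500/v1/agent/services
    dig @127.0.0.1 -p 8600 web.service.consul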

Best Regards,
Armon Dadgar


Lars Janssen

Mar 10, 2015, 5:58:33 PM
to consu...@googlegroups.com, la...@fazy.net
Hi Armon,

Thanks for explaining some more; I've looked at the documentation again before asking more questions.


I'm not completely sure about the terminology. What are the "nodes", and what is the "cluster"?

My reading of the docs and your reply is that if I have a number of web servers, for example, and they use Consul (e.g. as a key/value store), then the web servers are each nodes in the Consul cluster. However, the intro doc says "Every node that provides services to Consul runs a Consul agent." I'm not sure what it means by a node providing services to Consul, unless that is either referring to agents in server mode, or to the fact that agents in client mode provide the gossip service.

To use a concrete example, I currently have the following:

1. Amazon EC2 instances running Consul in server mode
2. Load balancer + EC2 instances running Apache + Consul in client mode
3. RDS instance (MySQL)

The EC2 instances in #1 are presumably Consul nodes. How about the instances in #2? The database in #3 is a managed service, so can't run Consul anyway, but presumably I could configure the other agents to know about it (for health checking/discovery).

And yet, is the intention to run a Consul agent (client mode) on every host wherever possible? I understand it's very light, but how about the maintenance overhead e.g. when upgrading, if there are many more nodes/types of nodes than in my example? I don't have a strong opinion on if that's good or bad, just want to understand the best practice.

One final thing I'm having trouble finding in the docs - what information is contained in a Consul client? Does it have a local copy of the key/value store for example, or are the main functions just (a) monitoring the local machine and (b) participating in gossip?

Sorry for so many questions!

Thanks,

Lars.


Armon Dadgar

Mar 16, 2015, 1:32:13 PM
to Lars Janssen, consu...@googlegroups.com, la...@fazy.net
Hey Lars,

Sorry, the terminology can be confusing. A “cluster” is just a logical grouping: all of the nodes that are participating in Consul.
I think of a “node” as an agent, so typically this is the same as a machine, but some people run one agent per
container, so it gets confusing. It depends on your deployment.

In the web server example, each of the web servers (physical machines or VMs) has a single Consul agent
running in client mode. They are “providing” or “registering” services via service definitions.
Those can be supplied as configuration files or driven through the API, for example with registrator.
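For reference, a minimal sketch of the config-file route (the path, service name and port here are hypothetical):

    # Drop a service definition into the agent's config dir and reload it.
    cat > /etc/consul/web.json <<'EOF'
    {
      "service": {
        "name": "web",
        "port": 80
      }
    }
    EOF
    consul reload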

In your concrete example, both #1 and #2 are Consul nodes (since they are running an agent).
Group #1 is running the agent in server mode, group #2 is running in client mode.

Group #3 is an external service (external to the cluster, since it has no agent). Those can still be registered with Consul via the catalog, so they are discoverable like any other service.
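As a sketch, such an external service can be added through the catalog endpoint of any agent (the node name and address below are made up):

    # Register the RDS endpoint as an external service so it becomes
    # discoverable via Consul, even though it runs no agent.
    curl -X PUT http://127.0.0.1:8500/v1/catalog/register -d '{
      "Node": "rds-mysql",
      "Address": "mydb.example.us-west-2.rds.amazonaws.com",
      "Service": {"Service": "mysql", "Port": 3306}
    }'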

The architecture of the system is covered in more detail in the Consul internals documentation.

Ultimately, the clients are there to shield you from the complexity of the internals.
They manage health checks, registration, anti-entropy, server discovery, watches, etc.
They provide an API compatibility promise, while the internals of the system are subject
to change. They also do lots of the heavy lifting and edge-case handling that would otherwise
be required of every client application.
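For example, a health check can be handed to the local agent at runtime over its HTTP API (the check name and script here are hypothetical), and the agent takes care of syncing it to the servers:

    # Register a node-level check with the local agent; the agent runs the
    # script every 30s and anti-entropy keeps the servers up to date.
    curl -X PUT http://127.0.0.1:8500/v1/agent/check/register \
        -d '{"Name": "apache", "Script": "service apache2 status", "Interval": "30s"}'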

Hope that helps clarify!

Best Regards,
Armon Dadgar


Justin Watkinson

Mar 17, 2015, 10:54:47 AM
to consu...@googlegroups.com, la...@fazy.net
Yikes!  Total bummer about the 72 hour thing.  I just asked the same question and got an answer (sort of) while reading this one.  Hmm... anyway, I think I can help with the ASG problem, as I solved it on Friday and just spun up a 3-node cluster with the script below.

I was terrible at commenting, but basically it deletes the raft/serf/tmp/services data to pitch any old info (I'm using registrator to re-populate that information on a set interval anyway).  Then I use the AWS CLI as part of my cfn-init script to get the instance's own IP, its instance ID, and all of the instances in its ASG.  I write the result to disk because it was kind of painful using a subshell and variable assignment with the aws cli text output.

From there, it picks up the file, loops through to find any other instance that isn't itself, and appends the -retry-join parameter with the correct DNS name.  I use retry-join because this server may have started up before the others (I let them horse race, all 3 at a time).  This should work even if you start with 1 and then add 2 or 4 later, because the first one will simply not use a join parameter, but anything else that joins the ASG will.

                      "#!/bin/bash",                      
                     
"rm -f -r /mnt/raft /mnt/serf /mnt/tmp /mnt/services",
                     
"mkdir /tmp",
                     
"internalIP=$(curl http://169.254.169.254/latest/meta-data/local-ipv4)",
                     
"instanceID=$(curl http://169.254.169.254/latest/meta-data/instance-id)",
                     
"instanceASG=$(aws autoscaling describe-auto-scaling-instances --instance-ids $instanceID --region {{aws_region}} --query 'AutoScalingInstances[*].AutoScalingGroupName' --output text) > /tmp/instances",
                     
"aws autoscaling describe-auto-scaling-groups --region us-west-2 --auto-scaling-group-names $instanceASG --query 'AutoScalingGroups[].Instances[*].[InstanceId]' --output text > /tmp/instances",
                     
"while read line;",
                     
"do",
                     
"   if [ \"$line\" != \"$instanceID\" ]; then",
                     
"      echo \"adding join parameter\"",
                     
"      registerDNS=$(aws ec2 describe-instances --region us-west-2 --instance-ids $line --query 'Reservations[*].Instances[*].PrivateDnsName' --output text)",
                     
"      joinParam=\"-retry-join $registerDNS\"",
                     
"      break",
                     
"   fi",
                     
"done < /tmp/instances",
                     
"rm -rf /tmp/instances",

Please note you'll need to make sure the instance has the IAM permissions (and any security group / network access) required for these particular autoscaling and ec2 CLI calls.  Once you have the retry parameter, you can then feed it into Consul.  So far I've just been using the Consul Docker container.
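For completeness, a sketch of how the resulting $joinParam might be consumed by a plain (non-Docker) consul agent; flags other than -retry-join are placeholders:

    # Hypothetical hand-off: start the server agent with the computed join parameter.
    consul agent -server -config-dir /etc/consul -data-dir /mnt $joinParam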

Justin Watkinson

Mar 17, 2015, 10:57:10 AM
to consu...@googlegroups.com, la...@fazy.net
Ahh, just noticed I did a few bad things :)  Since I'm using cfn-init and a context block to find/replace, you may see a few {{<variable name>}} references.  I think the only context I was using was region, so replace as needed.  I even wrote out a few by hand... I was working really quickly, sorry about that :)

Lars Janssen

Apr 7, 2015, 2:39:00 PM
to consu...@googlegroups.com, la...@fazy.net
Hi all,

Apologies for the late reply, but anyway thank you to Armon for a great explanation and to Justin for the auto-scaling example.

I'm very tight for time with respect to getting my project into production, so I'm going to settle for a fixed number of instances at first. However, I'm still concerned about the process of servers joining/leaving the cluster: in tests I can only get it working by bootstrapping everything together, and can't let one instance leave and then re-join without getting a "No cluster leader" error.

/etc/consul/config.json:
{
    "datacenter": "cg",
    "data_dir": "/opt/consul",
    "log_level": "INFO",
    "server": true
}

/etc/consul/bootstrap.json
{
    "bootstrap_expect": 2
}

The bootstrap setting is in a separate file; I'm not sure, but at one point I was thinking of deleting it after the initial bootstrap.

After starting consul agent (with config dir set to /etc/consul) on both machines, I do "consul join 10.0.0.1 10.0.0.2" (not the real IPs).

However, if I Ctrl+C one of the agents (or reboot its machine), then I break the quorum and can't seem to repair it. Stopping Consul makes the machine leave the cluster, so the raft/peers file just contains null. If I then start Consul and do "consul join 10.0.0.1" (if that's the IP of the one left running), they fail to elect a leader.

So far I can only get a working Consul cluster by starting with a clean slate: booting both machines with an empty Consul data dir (/opt/consul), bootstrap-expect 2, running consul join again, and finally importing backed-up data with consulate.

Is this expected behaviour?

I only used 2 Consul server agents on the development stack, but I know the docs recommend 3 or 5. If I go up to 3 servers, will that solve the leader election problem? i.e. if machines A, B and C are in the cluster and C leaves, then tries to rejoin, will A and B agree on a leader and welcome back C with a simple "consul join" request from C? If so, then I can stretch to a 3rd instance even in development.

About bootstrap-expect, if the same server tries to rejoin the cluster later, presumably it shouldn't have bootstrap-expect anymore? (Tried with and without, no difference).

Thanks,

Lars.

Darron Froese

Apr 7, 2015, 2:55:12 PM
to Lars Janssen, consu...@googlegroups.com
Lars,

You need at least 3 to maintain quorum. If you have one leave, it should still be OK until you have another one replace it - but you should always have a minimum of 3 Consul servers.

Bootstrap expect is only used to get the cluster up and running - you shouldn't use it after the cluster is up.

Having only 2 and then killing one is a recipe for frustration - that's never going to work correctly.
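For reference, the underlying arithmetic: Raft needs a quorum of floor(N/2) + 1 servers to elect a leader, so:

    Servers   Quorum   Failures tolerated
       1         1            0
       2         2            0
       3         2            1
       5         3            2

A 2-server cluster therefore loses quorum the moment either server stops, which matches the behaviour described above.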

Michael Fischer

Apr 7, 2015, 2:58:39 PM
to Darron Froese, Lars Janssen, consu...@googlegroups.com
I'm not totally sure I understand the relationship between the Consul servers and EC2 autoscaling.   Why not just configure a server quorum ahead of time?  The other agents can be installed on autoscaling instances, and the creation/destruction of them shouldn't affect the overall stability of the cluster.

Best regards,

--Michael

Darron Froese

Apr 7, 2015, 3:11:28 PM
to Michael Fischer, Lars Janssen, consu...@googlegroups.com
That is essentially what we do. We have a set of 3-5 Consul servers (depending on the cluster size) and they don't autoscale at all.

Everything else does - they're joining and leaving very regularly.

We're using AWS m3.large instances for the Consul server role - they're very lightly loaded and using few resources for approximately 500 nodes.

Lars Janssen

Apr 7, 2015, 3:15:59 PM
to consu...@googlegroups.com, dar...@froese.org, la...@fazy.net
Hi all,

Sorry for the double post, I guess Google Groups doesn't work well on a train.

@Darron, understood. My initial thinking was that data security wasn't so important on a test rig, so I'd just start with 2 Amazon instances. But it seems running just 2 instances is barely viable or supported, even in testing, so I'll increase it to 3. Will that still work even if the leader leaves and re-joins?

@Michael, my original thinking was to use an auto-scaling group with a fixed size. If an instance is terminated, it could simply be replaced. However, I'm not sure that'll work with fixed IP addresses, so then there's a discovery/DNS issue for other servers finding a rejoined node with a new IP. So I might never get to implementing the auto scaling part. I do want it to be as easy as possible to manually recover though.

Thanks,

Lars.

Justin Franks

Feb 13, 2016, 11:24:50 AM
to Consul, dar...@froese.org, la...@fazy.net
Just have the Consul servers write their IP to an S3 bucket. Then make an Auto Scaling group of 3 or 5 for the Consul servers. Every time a server is replaced, it will write its new IP to the S3 bucket, so the S3 bucket will be a living list of live IPs. Then, as other instances boot throughout your infrastructure, they can read the IPs in the S3 bucket and use them to join the cluster.
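A rough sketch of that idea (the bucket name and key layout are made up, and stale IPs from replaced servers would still need pruning):

    # On each Consul server at boot: publish this instance's IP to the bucket.
    myip=$(curl -s http://169.254.169.254/latest/meta-data/local-ipv4)
    echo "$myip" | aws s3 cp - "s3://my-consul-bucket/servers/$myip"

    # On any instance joining the cluster: read the list and retry-join everything in it.
    joinParams=""
    for ip in $(aws s3 ls s3://my-consul-bucket/servers/ | awk '{print $NF}'); do
        joinParams="$joinParams -retry-join $ip"
    done
    consul agent -config-dir /etc/consul $joinParams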