3 vault servers with consul backend / How to advertise?


Nelson Castillo

Aug 6, 2015, 2:42:19 AM
to Vault
Hello there.

I have 3 Vault servers, all of them using consul as backend.

In the clients I deploy Consul in client mode.

In my tests I'm also deploying Vault to the clients, which doesn't make sense.

How do I get the active Vault's IP address from Consul?

Thanks,
Nelson.-

Armon Dadgar

Aug 6, 2015, 1:24:26 PM
to vault...@googlegroups.com, Nelson Castillo
Hey Nelson,

You do not need to deploy Vault to the clients. Instead the clients should just
lean on Consul to do the service discovery by talking to “vault.service.consul”.
The tricky part is the TLS certificates, as you need to make sure the Vault cert
is valid for both the service-based lookup name and each server's advertise address, in case a redirect is done.

Alternatively, you could put a load balancer in front of Vault and have it filter
out the non-active nodes. This way the LB always routes to the current active
instance and no redirect is needed.

Does that help clarify?

Best Regards,
Armon Dadgar

Michael Fischer

Aug 6, 2015, 4:15:47 PM
to vault...@googlegroups.com, Nelson Castillo
As far as I can tell, Vault is not currently registering itself as a service with Consul automatically, even if you configure it as a storage backend.  Is this expected? 

--Michael

Nelson Castillo

Aug 7, 2015, 1:34:42 AM
to Michael Fischer, vault...@googlegroups.com
Thanks all for the replies.

On Thu, Aug 6, 2015 at 3:15 PM, Michael Fischer <mfis...@zendesk.com> wrote:
> As far as I can tell, Vault is not currently registering itself as a service with Consul automatically, even if you configure it as a storage backend.  Is this expected?

Perhaps it has to be registered as a service manually?

Testing...

Armon Dadgar

Aug 7, 2015, 1:02:52 PM
to Michael Fischer, vault...@googlegroups.com, Nelson Castillo, vault...@googlegroups.com
Nelson is correct: Vault does not register with Consul automatically,
but it is straightforward to do with a config file. Additionally, the “/v1/sys/health”
endpoint is designed to fit with the Consul HTTP health check natively.

Nelson Castillo

Aug 7, 2015, 1:06:15 PM
to Armon Dadgar, Michael Fischer, vault...@googlegroups.com
On Fri, Aug 7, 2015 at 12:01 PM, Armon Dadgar <armon....@gmail.com> wrote:
> Nelson is correct, Vault does not register with Consul automatically,
> but it is straightforward to do with a config file. Additionally the “/v1/sys/health”
> endpoint is designed to fit with the Consul HTTP health check natively.

Thanks, Armon.

If you have a config snippet, that would be useful.

Regards,
Nel.-

Armon Dadgar

Aug 7, 2015, 1:12:49 PM
to Nelson Castillo, Michael Fischer, vault...@googlegroups.com
Hey Nelson,

Here is our “service-vault.json”:

{
  "service": {
    "name": "vault",
    "port": 8200,
    "check": {
      "name": "Vault Health",
      "script": "curl -o /dev/stderr -A consul -sw '%{http_code}' --insecure https://localhost:8200/v1/sys/health | egrep -q '200|429'",
      "interval": "10s"
    }
  }
}

We have to do a little curl hacking since our TLS certificate is not valid for the “localhost” common name.

Lars Sommer

Apr 27, 2016, 3:40:17 PM
to Vault, nels...@gmail.com, mfis...@zendesk.com
Hey Armon,

  Won't this return not only the active primary (200s) but also the standby secondaries (429s)? I am actually trying to write a service check for Vault at the moment and am having trouble, as the service always returns all servers, not just the one returning a 200.

{
  "service": {
    "name": "vault",
    "port": 8200,
    "check": {
      "name": "Vault Health",
      "script": "curl -s -o /dev/null -w \"%{http_code}\" http://localhost:8200/v1/sys/health | egrep -q '200'",
      "interval": "10s"
    }
  }
}

Why would this be returning all my servers, when after manually testing using the same curl string (without the grep) only the active primary is returning a 200?

Pete Emerson

Jun 11, 2016, 4:48:29 PM
to Vault, nels...@gmail.com, mfis...@zendesk.com
This was puzzling me as well, and I've learned a bunch of things worth sharing.

We want to know if vault is healthy on a particular node.

The definition of "healthy" is:

1) vault is running
2) vault is unsealed

The definition of "healthy" does not include whether the node is the primary or a standby, since standby nodes just forward the request on to the primary node via the advertise_addr setting in the consul backend portion of the vault configuration file.

According to https://www.vaultproject.io/docs/http/sys-health.html, 200 is returned if the node is primary, 429 if the node is standby, and 500 if the node is sealed.

I had to modify the script to exit with code 2 to get consul's DNS to remove it:

curl -o /dev/stderr -A consul -sw '%{http_code}' --insecure http://localhost:8200/v1/sys/health | egrep -q '200|429' || exit 2
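
The exit code matters because Consul script checks treat exit status 0 as passing, 1 as warning, and anything else as critical. A grep that finds no match exits with 1, so the node only goes to warning and stays in DNS. Here is a minimal sketch of that mapping as a shell function. The function names are hypothetical, and the status codes are the ones listed above.

```shell
#!/bin/sh
# Hypothetical helper: translate the HTTP status returned by
# /v1/sys/health into an exit code for a Consul script check.
# Consul reads 0 as passing, 1 as warning, anything else as critical.
vault_check_exit_code() {
    case "$1" in
        200|429) return 0 ;;  # active (200) or standby (429): unsealed
        *)       return 2 ;;  # sealed, unreachable, or unexpected status
    esac
}

# For comparison: a bare grep miss exits with 1 (warning), not 2,
# which is why a standby stayed in DNS with the grep-only check.
plain_grep_exit_code() {
    printf '%s' "$1" | grep -Eq '200'
}
```

A check script could capture the status code with the curl command above and pass it to vault_check_exit_code as its only argument.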

I then added two services, one for vault and one for vault-primary:

{
  "services": [
    {
      "check": {
        "interval": "10s",
        "name": "Vault Health",
        "script": "curl -o /dev/stderr -A consul -sw '%{http_code}' --insecure http://localhost:8200/v1/sys/health | egrep -q '200|429' || exit 2"
      },
      "name": "vault",
      "port": 8200
    },
    {
      "port": 8200,
      "check": {
        "script": "curl -o /dev/stderr -A consul -sw '%{http_code}' --insecure http://localhost:8200/v1/sys/health | grep 200 || exit 2",
        "name": "Primary Vault Health",
        "interval": "10s"
      },
      "name": "vault-primary"
    }
  ]
}

Vault can then be used at vault.service.us-west-1.consul, and if you hit a standby server it will forward the request on to the primary.

But then how do you seal the vaults elegantly?

If you don't have vault-primary, then when you try to seal vault.service.us-west-1.consul and you hit a standby server, you'll get:

{"errors":["vault cannot seal when in standby mode; please restart instead"]}

With vault-primary:

curl -X PUT -H "X-Vault-Token: <TOKEN>" http://vault-primary.service.us-west-1.consul:8200/v1/sys/seal

Run this once per vault server (waiting for consul DNS to move onto the next vault primary server) and you can lock them all.

A downside to this method is that if you have 3 vault servers, two of them will be marked in consul as critical (because they're not primaries). I'm not sure how to do this gracefully quite yet. You could probably watch consul for a change of the primary node, then de-register the existing primary and register the new one.

I'm really new to vault, so the solution of adding vault-primary to make sealing as easy as possible may be overkill. It seems likely that you don't seal the vaults very often, and if you do, curling each vault server separately (or via 'vault seal' on the command line) is not very difficult. For example:

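# Seal every Vault server, one leader at a time. The loop runs once per
# address returned by DNS; the loop variable itself is not used.
for count in $(host vault.service.us-west-1.consul | cut -d ' ' -f 4); do
    # Ask any node for the current leader's address.
    leader=$(curl -s -X GET -H "X-Vault-Token: <TOKEN>" http://vault.service.us-west-1.consul:8200/v1/sys/leader | sed -e 's/^.*"leader_address":"\(.*\)"}/\1/')
    curl -X PUT -H "X-Vault-Token: <TOKEN>" "$leader/v1/sys/seal"
    sleep 1  # give a standby time to take over as leader
done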
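
The sed expression above only matches when leader_address is the last field in the response. Here is a slightly more tolerant sketch. It assumes the /v1/sys/leader response is a single-line JSON object with no whitespace between the key, the colon, and the quoted value. The function name extract_leader_address is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a /v1/sys/leader JSON response on stdin and
# print the leader_address value, wherever the field sits in the object.
# Prints nothing if the field is absent.
extract_leader_address() {
    sed -n 's/.*"leader_address":"\([^"]*\)".*/\1/p'
}
```

In the loop, the helper would replace the sed call, with the curl output piped into it. The address could be checked for being non-empty before the seal request is sent.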

Pete

Pete Emerson

Jun 11, 2016, 4:58:36 PM
to Vault, nels...@gmail.com, mfis...@zendesk.com
Ah, leader election could help with the N-1 services being critical, of course. Looks like that isn't exposed in DNS, so you'd have to roll your own solution.

Someone else brought up the possibility of exposing a leader in DNS:

David Adams

Jun 11, 2016, 5:13:41 PM
to vault...@googlegroups.com, nels...@gmail.com, mfis...@zendesk.com
FWIW, the upcoming release of Vault (now in RC) will have built-in Consul integration that will correctly tag leader and standby instances so that you can just point to active.vault.service.consul and standby.vault.service.consul if you want that: https://github.com/hashicorp/vault/blob/master/CHANGELOG.md

And of course if you need to get the list of all Vault hosts, even the inactive ones, you can query the Consul catalog API.
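
Consul's DNS interface accepts names of the form [tag.]<service>.service[.<datacenter>].<domain>, so the tagged names combine with the datacenter-qualified names used earlier in the thread. Here is a small sketch that assembles such a name. The function name consul_service_name is hypothetical, and it assumes the default "consul" DNS domain.

```shell
#!/bin/sh
# Hypothetical helper: build a Consul DNS name for a service.
# Usage: consul_service_name SERVICE [TAG] [DATACENTER]
# Assumes the default "consul" DNS domain.
consul_service_name() {
    service=$1
    tag=${2:-}
    dc=${3:-}
    name="${service}.service"
    if [ -n "$tag" ]; then
        name="${tag}.${name}"   # tag goes in front of the service name
    fi
    if [ -n "$dc" ]; then
        name="${name}.${dc}"    # datacenter goes after "service"
    fi
    printf '%s.consul\n' "$name"
}
```

For example, consul_service_name vault active us-west-1 prints active.vault.service.us-west-1.consul.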
