multiple vault for production env.


Hridyesh Pant

Mar 19, 2016, 12:29:55 AM
to Vault
Hi,
We are planning to configure Vault for our production environment.
In our test environment, we have configured one Vault server with an HA-capable DynamoDB backend, and it is working fine.
Is there any documentation on configuring multiple Vault servers with the same HA backend (DynamoDB), so that if one instance goes down, another can take over serving requests?

I would really appreciate help from anyone who has set up such an infrastructure.

--Thanks
Hridyesh



Jeff Mitchell

Mar 19, 2016, 10:18:53 AM
to vault...@googlegroups.com
Hi Hridyesh,

It's generally as simple as setting up the new Vault nodes with the
same Dynamo configuration as the initial node and making sure that the
advertise_addr for each points back to itself (for each node X, this
is the address that standby nodes should give to clients if X is the
current active node).
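
A minimal sketch of what this could look like on one node, assuming the `backend` stanza syntax of Vault releases from around the time of this thread. The table name, region, and the 10.5.50.x addresses are illustrative placeholders, not values from a real deployment:

```hcl
# Config for node X. Nodes Y and Z would use the same table and region;
# only advertise_addr changes to each node's own address
# (for example http://10.5.50.11:8200 on node Y, a hypothetical IP).
backend "dynamodb" {
  table          = "vault-dynamodb"
  region         = "us-west-2"
  advertise_addr = "http://10.5.50.10:8200"
}

listener "tcp" {
  # Assumption: bind to an address that clients can reach;
  # a loopback-only address such as 127.0.0.1 accepts local connections only.
  address     = "0.0.0.0:8200"
  tls_disable = 1  # test setups only; production should use TLS
}
```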

Best,
Jeff
> --
> This mailing list is governed under the HashiCorp Community Guidelines -
> https://www.hashicorp.com/community-guidelines.html. Behavior in violation
> of those guidelines may result in your removal from this mailing list.
>
> GitHub Issues: https://github.com/hashicorp/vault/issues
> IRC: #vault-tool on Freenode
> ---
> You received this message because you are subscribed to the Google Groups
> "Vault" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to vault-tool+...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/vault-tool/1e70d0de-8d4e-40c3-b596-9bbf5353d4f2%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Hridyesh Pant

Mar 19, 2016, 9:28:24 PM
to Vault
So, if I understand correctly, I am launching three Vault servers (nodes X, Y, and Z) with the following configuration, with advertise_addr set to the same value on all three nodes, where 10.5.50.10 is the IP of the first node.

First node X, made active with recovery_mode=1:

backend "dynamodb" {
  table = "vault-dynamodb"
  region = "us-west-2"
  advertise_addr = "http://10.5.50.10:8200"
}
listener "tcp" {
  address = "127.0.0.1:8200"
  tls_disable = 1
}
default_lease_ttl = "1h"
max_lease_ttl = "1h"
recovery_mode = 1

Node Y:

backend "dynamodb" {
  table = "vault-dynamodb"
  region = "us-west-2"
  advertise_addr = "http://10.5.50.10:8200"
}
listener "tcp" {
  address = "127.0.0.1:8200"
  tls_disable = 1
}
default_lease_ttl = "1h"
max_lease_ttl = "1h"

Node Z:

backend "dynamodb" {
  table = "vault-dynamodb"
  region = "us-west-2"
  advertise_addr = "http://10.5.50.10:8200"
}
listener "tcp" {
  address = "127.0.0.1:8200"
  tls_disable = 1
}
default_lease_ttl = "1h"
max_lease_ttl = "1h"

--Thanks
Hridyesh

Hridyesh Pant

Apr 11, 2016, 5:03:55 AM
to Vault
Hi Jeff,
Could you please check whether my configuration above is fine for a three-server Vault setup? Or should advertise_addr be each node's own IP address?

-Thanks
hridyesh

Hridyesh Pant

Apr 11, 2016, 5:11:44 AM
to Vault
I need to put these three nodes behind a common ELB (vault.xx.com), so I am wondering what the right setting would be. Also, do I need to unseal all three nodes?

Jeff Mitchell

Apr 11, 2016, 10:29:03 AM
to vault...@googlegroups.com
Hi Hridyesh,

Your advertise_addr should either be the individual IP address of each
node, or all set to the ELB address.
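
A sketch of the second option, assuming the ELB name `vault.xx.com` mentioned earlier in the thread and an assumed plain-HTTP listener on port 8200 behind it. Every node would carry the identical stanza:

```hcl
# Same stanza on all nodes: standbys hand clients the ELB address
# instead of a node-specific one.
backend "dynamodb" {
  table          = "vault-dynamodb"
  region         = "us-west-2"
  advertise_addr = "http://vault.xx.com:8200"  # ELB address (scheme and port assumed)
}
```

With this variant the load balancer has to send traffic to whichever node is currently active; how that is arranged (for example through an ELB health check) is outside what the thread covers.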

Best,
Jeff


Hridyesh Pant

Apr 11, 2016, 12:46:26 PM
to Vault
Thanks Jeff, got it.
One more question, about the recovery_mode option. The doc says:
>>
It is important that only one node is running in recovery mode! After this node has become the leader, other nodes can be started with regular configuration.
>>
If I have a three-server Vault cluster behind the ELB and the leader crashes, one of the other two standbys will become leader automatically, right?
So my question is how the old lock gets removed automatically. Do I have to manually start another server with the recovery_mode=1 option?
I don't want a manual restart to release the lock. For production we don't want a single point of failure, so I am looking for something like this:
1. All three servers are behind the ELB, with advertise_addr pointing to the ELB address.
2. One of the servers becomes the active node.
3. If the active node crashes or is unreachable (health check fails), one of the standby nodes becomes active and acquires the lock.

Could you please suggest how I can configure such a workflow?

Jeff Mitchell

Apr 11, 2016, 12:56:33 PM
to vault...@googlegroups.com
Hi Hridyesh,

Unfortunately, this is a drawback of the DynamoDB storage backend. It
doesn't have locks tied to an expiring session; the lock is instead a
document with a conditional write. The leader is able to write to that
document and standbys are not; however, if the leader cannot delete
the document being used due to a crash, there is no way for the
standbys to know that it is safe to ignore it. One of the drawbacks is
therefore that in a crash scenario where the leader does not give
ownership willingly, you must use recovery mode.
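
As a hedged sketch of that manual step, the config below is what the single recovering node might be started with. It assumes `recovery_mode` is read as an option of the `dynamodb` backend stanza, an assumption based on the option being documented in the dynamodb backend reference linked later in this thread; check the documentation for your Vault version. The table, region, and address are placeholders:

```hcl
# Start exactly ONE node with this config after the old leader has crashed.
# Once it has become the leader, the other nodes can be started with their
# regular configuration (no recovery_mode).
backend "dynamodb" {
  table          = "vault-dynamodb"
  region         = "us-west-2"
  advertise_addr = "http://10.5.50.11:8200"  # hypothetical address of the recovering node
  recovery_mode  = 1
}
```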

If this tradeoff is not acceptable to you, I suggest looking at
Consul, or one of the other backends that uses expiring session locks.
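
For comparison, a minimal sketch of a Consul-backed configuration, assuming a Consul agent listening locally on its default HTTP port and the same era's `backend` syntax. The path and advertised address are illustrative:

```hcl
backend "consul" {
  address        = "127.0.0.1:8500"          # local Consul agent (assumed)
  path           = "vault"                   # key prefix under which Vault stores its data
  advertise_addr = "http://10.5.50.10:8200"  # this Vault node's own address (hypothetical)
}
```

Per the explanation above, the lock in this case is tied to an expiring session, so a standby should be able to take over after a leader crash without a manual recovery step.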

Best,
Jeff

Hridyesh Pant

Apr 12, 2016, 8:00:13 AM
to Vault
So does this mean we can't use a cluster of Vault nodes with DynamoDB?

--Hridyesh

Hridyesh Pant

Apr 12, 2016, 8:11:22 AM
to Vault
Is there any chance HashiCorp has a solution for this use case?
Also, even if the health check to the active node fails, the standby nodes still can't become the leader?
If standby nodes can't become the active node in DynamoDB HA mode in any case, why does HashiCorp offer DynamoDB as an HA option?
No one wants to run a single node in a production environment, where a single node failure causes all the trouble.

--Hridyesh

Jeff Mitchell

Apr 12, 2016, 9:42:00 AM
to vault...@googlegroups.com
Hridyesh,

I don't really know what to tell you here. Any of the databases that
Vault can use as a backend may have various failure scenarios, and
Vault's usage of them may have various failure scenarios. DynamoDB has
a specific failure scenario in that in the event of an unexpected
shutdown there may be some small, known manual work required to get
things back up and running (whereas with other backends there may be
manual work required that can't be known about ahead of time). Whether
that's acceptable to you is up to you. If not, you should use another
backend.

Best,
Jeff

On Tue, Apr 12, 2016 at 8:00 AM, Hridyesh Pant <hridye...@gmail.com> wrote:
> So does this mean we can't use a cluster of Vault nodes with DynamoDB?
>
> --Hridyesh

Jeff Mitchell

Apr 12, 2016, 9:43:27 AM
to vault...@googlegroups.com
On Tue, Apr 12, 2016 at 8:11 AM, Hridyesh Pant <hridye...@gmail.com> wrote:
> If standby nodes can't become the active node in DynamoDB HA mode in any case, why does HashiCorp offer DynamoDB as an HA option?
> No one wants to run a single node in a production environment, where a single node failure causes all the trouble.

Directly quoting from the documentation
(https://www.vaultproject.io/docs/config/index.html):

*Please note*: The only physical backends actively maintained by
HashiCorp are consul, inmem, and file. The other backends are
community-derived and community-supported. We include them in the hope
that they will be useful to those users that wish to utilize them, but
they receive minimal validation and testing from HashiCorp, and
HashiCorp staff may not be knowledgeable about the data store being
utilized. If you encounter problems with them, we will attempt to help
you, but may refer you to the backend author.

--Jeff

Hridyesh Pant

Apr 12, 2016, 12:52:00 PM
to Vault
Thanks Jeff. Unfortunately, we are not very comfortable with the Consul backend at this time, as we don't want extra dependencies to set up, support, and maintain. That was the reason we chose DynamoDB, which also supports HA mode. But it now seems we can't set up a Vault cluster with DynamoDB, as it does not serve the purpose of automatic failover from standby to active node, because of the lock and the manual intervention it requires.
It would be nice if the documentation covered such limitations more specifically rather than with a generic statement.
I am not sure I can go with the Vault and DynamoDB combination, especially in a production environment.

--Thanks
Hridyesh

Jeff Mitchell

Apr 12, 2016, 1:16:38 PM
to vault...@googlegroups.com
On Tue, Apr 12, 2016 at 12:52 PM, Hridyesh Pant <hridye...@gmail.com> wrote:
> It would be nice if the documentation covered such limitations more specifically rather than with a generic statement.

This limitation is specifically covered in the 'dynamodb' backend
reference section:
https://www.vaultproject.io/docs/config/index.html#recovery_mode

Hridyesh Pant

Apr 12, 2016, 1:21:14 PM
to Vault
But anyway, thanks for making such wonderful open source tools and for answering all our queries so effectively.

Hridyesh Pant

Apr 12, 2016, 1:28:35 PM
to Vault
I still feel the documentation is lacking, especially in examples:
How do you set up a Vault cluster in a production environment?
What are the limitations of each HA backend?
What are the best options for storing the secret keys and root token?
Information such as App ID authentication not having a lease expiry.

sumeet....@collectivehealth.com

May 31, 2016, 12:27:01 PM
to Vault
Hi Hridyesh,
Were you able to figure out a solution? I am in the same boat.

Hridyesh Pant

Jun 1, 2016, 2:59:06 AM
to Vault
No, we haven't found a solution yet.
Describing Vault with DynamoDB as an HA mode, with only one instance, doesn't make any sense to me.

sumeet....@collectivehealth.com

Jun 1, 2016, 12:23:30 PM
to Vault
:( We will probably use etcd or consul. 

ben.t...@roundingwell.com

Jul 6, 2017, 3:32:44 PM
to Vault
Is this still an issue? 

Reading through the docs, it appears that this behavior has changed. Specifically, 

>High Availability – the DynamoDB storage backend supports high availability. Because DynamoDB uses the time on the Vault node to implement the session lifetimes on its locks, significant clock skew across Vault nodes could cause contention issues on the lock.
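
For reference, newer Vault releases configure this backend through a `storage` stanza with an explicit HA switch. A minimal sketch, with the table name and region as placeholders; option names may differ between versions, so treat this as an assumption to verify against the docs for the release in use:

```hcl
storage "dynamodb" {
  ha_enabled = "true"            # opt in to HA (leader election through DynamoDB)
  region     = "us-west-2"
  table      = "vault-dynamodb"
}
```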