Multiple Vault servers with DynamoDB HA


Hridyesh Pant

Feb 26, 2016, 4:05:58 AM
to Vault
Hi,
I am trying to set up multiple Vault servers (3) with AWS DynamoDB.
I have successfully set up a single server with AWS DynamoDB.
As this setup is required for production, it would be really helpful if someone could answer my questions:
1. How do I set up multiple Vault servers using the same HA backend?
2. How does leader selection happen?

--Thanks
Hridyesh


Jeff Mitchell

Feb 26, 2016, 11:56:30 AM
to vault...@googlegroups.com
Hi Hridyesh,

1. Multiple Vault servers will automatically go into HA mode if
pointed at the same HA backend and unsealed. (Even a single server is
in HA mode in this scenario, there are just no other nodes to fall
back on.)

2. It happens opportunistically; all Vault servers attempt to
atomically grab the lock provided by the HA backend. Whichever server
grabs the lock first becomes the active node. Once that node is taken
down, the process repeats. The actual implementation of the lock is
specific to each backend. Details about the DynamoDB implementation
are at https://github.com/hashicorp/vault/pull/878
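
To picture the opportunistic election described in point 2, here is a minimal sketch in Python. It is not Vault's or the DynamoDB backend's code: `ConditionalStore`, `LOCK_KEY`, and `elect` are hypothetical names, and an in-memory put-if-absent stands in for whatever atomic primitive the real HA backend provides.

```python
import threading


class ConditionalStore:
    """Toy key-value store whose only write is an atomic put-if-absent."""

    def __init__(self):
        self._items = {}
        self._mutex = threading.Lock()

    def put_if_absent(self, key, value):
        # Atomically create the record; fail if it already exists.
        with self._mutex:
            if key in self._items:
                return False
            self._items[key] = value
            return True

    def delete_if_equals(self, key, value):
        # Remove the record only if the caller is still the recorded owner.
        with self._mutex:
            if self._items.get(key) == value:
                del self._items[key]
                return True
            return False

    def get(self, key):
        with self._mutex:
            return self._items.get(key)


LOCK_KEY = "ha-lock"  # hypothetical record name


def elect(store, node_ids):
    """Let every node race for the lock; return the winner's id, or None."""
    winners = []

    def attempt(node_id):
        if store.put_if_absent(LOCK_KEY, node_id):
            winners.append(node_id)

    threads = [threading.Thread(target=attempt, args=(n,)) for n in node_ids]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners[0] if winners else None


store = ConditionalStore()
nodes = ["A", "B", "C"]
active = elect(store, nodes)               # exactly one node wins the race
standbys = [n for n in nodes if n != active]
store.delete_if_equals(LOCK_KEY, active)   # the active node is taken down cleanly
new_active = elect(store, standbys)        # the process repeats among the rest
```

Which node wins each round depends on timing, which is the point: there is no ranking of nodes, only a race for a single record.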

Best,
Jeff

Hridyesh Pant

Feb 28, 2016, 11:29:41 PM
to Vault
Thanks Jeff

Chris Ludden

Apr 12, 2016, 9:42:28 PM
to Vault
Hey Jeff,
In regard to this situation described on the pull request:
> One other thing: HA relies on being able to write a specific document. If all Vault nodes go down unexpectedly (so they do not clean up properly), will any of them be able to acquire leadership, or will the existence of that record prevent it? If not, I suggest adding an env var/conf flag to be used in case of emergency that basically says "you're the only active node, if the document exists delete it and create a new one". Then you can start one node with that flag one time, then start the other nodes without it.

Is there any alternative way of handling this? If an instance crashes in the middle of the night (which just happened to me), the other standby nodes are unable to become active until a new instance is created with the 'recovery_mode' flag. I'm looking for a way to automate this process so that it does not require manual intervention to bring the Vault service back online. Any help would be much appreciated!
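
The failure mode described here can be modelled in a few lines. This is a toy sketch, not the DynamoDB backend's code: `LockTable`, `try_acquire`, and `release` are hypothetical names, and the only assumptions are the ones stated in this thread (a lock record with no expiry, and a recovery flag that replaces an existing record).

```python
class LockTable:
    """Toy stand-in for the HA backend: one lock record, no expiry."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, node_id, recovery_mode=False):
        # recovery_mode mirrors the flag discussed above: the node assumes it
        # is the only candidate and overwrites whatever record is left behind.
        if self.holder is None or recovery_mode:
            self.holder = node_id
            return True
        return False

    def release(self, node_id):
        # Clean shutdown: only the current holder may remove the record.
        if self.holder == node_id:
            self.holder = None


table = LockTable()
table.try_acquire("A")                              # A becomes active
# A crashes here: release("A") is never called, so the record stays.
b_ok = table.try_acquire("B")                       # False: stale record blocks B
c_ok = table.try_acquire("C")                       # False: same for C
d_ok = table.try_acquire("D", recovery_mode=True)   # True: D replaces the record
```

In this model, two nodes started with `recovery_mode=True` would each overwrite the other's record and both would consider themselves active, which suggests why the flag is meant for one node at a time.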

Chris

Jeff Mitchell

Apr 12, 2016, 11:20:40 PM
to vault...@googlegroups.com

Hi Chris,

I'm not aware of any alternate way, although I admit I don't know much about Dynamo's capabilities. If an automated way can be found then that would be great, but I assume that the original implementor didn't find a way or that would have been used in the first place.

Sorry, I just really have no advice to give on this topic, other than using a different backend if this is not an acceptable risk.

Best,
Jeff

Hridyesh Pant

Jun 1, 2016, 6:25:43 AM
to Vault
Hi Jeff,
The documentation says:
> It is important that only one node is running in recovery mode! After this node has become the leader, other nodes can be started with regular configuration.

Could you please help me understand this? Let's say I put three Vault servers (A, B, and C) behind one ELB.
Server A is started with recovery_mode and becomes the leader.
Servers B and C are started with the regular configuration.

Now server A crashes. Will B and C be able to serve requests (reading secrets)? In our environment, most of our calls are reads.

If the above works, it would potentially provide a way to keep reads available even if the read/write leader crashes.

In the meantime, we could start a new leader with recovery_mode for writes, and hopefully secret reads would continue unimpeded.



--Thanks
Hridyesh

Jeff Mitchell

Jun 1, 2016, 9:08:31 AM
to vault...@googlegroups.com
On Wed, Jun 1, 2016 at 6:25 AM, Hridyesh Pant <hridye...@gmail.com> wrote:
> Hi Jeff,
> The documentation says:
> It is important that only one node is running in recovery mode! After this
> node has become the leader, other nodes can be started with regular
> configuration.
>
> Could you please help me understand this? Let's say I put three Vault
> servers (A, B, and C) behind one ELB.
> Server A is started with recovery_mode and becomes the leader.
> Servers B and C are started with the regular configuration.
>
> Now server A crashes. Will B and C be able to serve requests (reading
> secrets)? In our environment, most of our calls are reads.

No, they will not.

Best,
Jeff

Hridyesh Pant

Jun 6, 2016, 12:32:52 AM
to Vault
We found a workaround for this locking issue:
1. Set up Vault servers in multiple regions with separate
2. Set up DynamoDB cross-region replication.
3. Set up Amazon Route 53 DNS failover with ELB integration.

This way, if the master goes down, we can still read data from the second Vault server (set up in a different region).
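
The routing decision that DNS failover makes in step 3 can be pictured as a health-checked choice between two endpoints. The sketch below is only an illustration of that idea, not Route 53's behaviour in detail: `pick_endpoint`, the hostnames, and the simulated health table are all hypothetical.

```python
def pick_endpoint(endpoints, is_healthy):
    """Return the first endpoint, in priority order, whose health check passes."""
    for endpoint in endpoints:
        if is_healthy(endpoint):
            return endpoint
    return None  # nothing healthy: the caller has to surface an error


# Hypothetical hostnames, listed primary first.
ENDPOINTS = ["vault.region-a.example.com", "vault.region-b.example.com"]

# Simulated health state: the primary region's active node has crashed.
health = {
    "vault.region-a.example.com": False,
    "vault.region-b.example.com": True,
}

target = pick_endpoint(ENDPOINTS, lambda host: health[host])
```

With the primary marked unhealthy, reads are sent to the second region, whose Vault server reads from the replicated table.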

--Hridyesh