Root token management best practices

Trevor Robinson

Jun 23, 2016, 6:23:22 PM
to Vault
Hello,

I'm new to Vault, and I'm evaluating it for use in a banking application (which must be PCI DSS compliant, which I mention only for those familiar with its Split Knowledge, Dual Control, and Separation of Duties requirements). I'm wondering if there are any recommendations or best practices around managing the root auth token in an environment where no single person should have unrestricted trust/access. The secret sharing approach for unsealing the vault is great, since no one person has the master key, but managing the root token seems to reintroduce the possibility that one person may have access to all the secrets it protects (though the access will be logged).

It seems an alternative was made possible with Vault 0.5 in that the root token could be made ephemeral, being regenerated as part of each unseal. I'm envisioning a front-end process that coordinates the unseal and root token generation, stores the root token in memory, and coordinates any root-level operations using the desired policies. For example, we need to generate data encryption keys for use by service instances, but we need to mitigate the threat that an instance has been compromised (meaning we can't solely trust secrets stored on the service host). One way to do this is requiring that service authentication (performed at startup) be manually approved by ops personnel (who should know when startup is expected). This could be accomplished using cubbyhole authentication, in a variant of the coprocess-based approach, where the coprocess is the front-end process containing the root token. Services request a temp token via the coprocess, which then requires ops approval before generating a time- and count-limited perm token (but ops never sees either token). The coprocess would also perform related operations, such as generating new data encryption keys when requested by ops, without ever divulging those keys directly to ops. Particularly, if service auth approvers do not have access to service host secrets, the threat of a rogue ops person acquiring data encryption keys is mitigated.

So I'm wondering if others have seen this need and if there already exist any components of the solution (or if I'm completely overlooking a simpler, built-in mechanism for achieving a system that trusts no single person completely). I haven't found much written on root token management (except for re-generation after accidental loss). For protection of data encryption keys, the (updated) transit backend could be used, though in our case, the service instances need the actual keys, for various reasons. There was also an unimplemented proposal for secret sharing for tokens, which would be another approach if applied to the root key.

Thanks for any insights or recommendations.

-Trevor

Miguel Terrón

Jun 23, 2016, 11:15:26 PM
to Vault
I guess a mechanism akin to the PGP one used for unseal keys could be devised, where the root token is split between n people and encrypted with their PGP keys, something like:

vault init -key-shares=3 -key-threshold=2 -pgp-keys="jeff.asc,vishal.asc,seth.asc" -root-token-pgp-keys="seth.asc,jeff.asc"

This way the root token will be split in 2, with the left part encrypted with Seth's key and the right part with Jeff's key.

Jeff Mitchell

Jun 24, 2016, 9:29:54 AM
to vault...@googlegroups.com
Hi Trevor,

On Thu, Jun 23, 2016 at 6:23 PM, Trevor Robinson <tre...@scurrilous.com> wrote:
> It seems an alternative was made possible with Vault 0.5 in that the root
> token could be made ephemeral, being regenerated as part of each unseal.

This isn't quite right, and it may be an important difference in your
use-case. The root token isn't ephemeral and it isn't regenerated at
each unseal. What was added in 0.5 was a way to generate a new root
token using unseal keys for authorization. This doesn't happen at
unseal time; it is a totally separate action that happens at any
user-chosen time when enough unseal key holders are around to
authorize the action. This allows you to revoke your initial root
token immediately after setting up enough of Vault to allow
appropriate administrative access, but with disaster-recovery
safeguards in-place. Generally speaking, we encourage as few root
tokens as possible within any given use-case to exist, and in many
cases that number can be "zero".
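The regeneration flow Jeff describes is driven through the `/sys/generate-root` HTTP endpoints. A rough sketch follows; the address is a placeholder, and the OTP handling and response field names should be checked against the docs for your Vault version:

```shell
# Generate a new root token, authorized by a quorum of unseal key holders.
# The OTP is a one-time pad so no single party sees the token in plaintext.
export VAULT_ADDR='https://vault.example.com:8200'   # placeholder address

# 1. Start an attempt, supplying a base64-encoded 16-byte OTP.
curl -s -X PUT "$VAULT_ADDR/v1/sys/generate-root/attempt" \
    -d '{"otp": "'"$(head -c 16 /dev/urandom | base64)"'"}'
# The response includes a "nonce" identifying this attempt.

# 2. Each unseal key holder submits their share against that nonce.
curl -s -X PUT "$VAULT_ADDR/v1/sys/generate-root/update" \
    -d '{"key": "<unseal-key-share>", "nonce": "<nonce-from-step-1>"}'

# Once the threshold is reached, the response carries the new root
# token XORed with the OTP, which the OTP holder can then decode.
```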

It's certainly possible to take the root token and use an outside tool
to perform Shamir's against it, and give those splits to the unseal
key holders. But that root token needs to be used to perform initial
setup of Vault anyways. You're also right that auditing can make it
clear what the root token was used for -- but only after auditing is
enabled, which doesn't have to be the first action carried out.
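An outside split like Jeff describes could be done with, for example, the `ssss` utility, an independent implementation of Shamir's Secret Sharing (the threshold values here are illustrative, and Vault itself plays no part in this step):

```shell
# Split a root token into 3 shares, any 2 of which reconstruct it.
echo -n "$ROOT_TOKEN" | ssss-split -t 2 -n 3 -q

# Later, any 2 share holders can recombine:
ssss-combine -t 2 -q
# (paste two shares when prompted; the secret is printed to stderr)
```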

You may want to think about a workflow where a quorum of people are
present to watch the Vault initialization and initial setup that can
individually attest -- perhaps with a PGP-signed statement -- as to
what occurred...e.g. "Vault was initialized and unseal keys given to
Alice, Bob, and Charlie. Audit backend X was enabled. LDAP was
configured and a policy was loaded to allow administrators to gain
access to the full system. The root token was then revoked." Such a
statement, made/signed individually by multiple people, may pass
muster for auditors.
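As a rough illustration of such an attestation (the wording is just the example above; names and backend are placeholders), each witness could clear-sign the same statement:

```shell
# Each witness independently signs an identical statement of record.
cat > attestation.txt <<'EOF'
Vault was initialized and unseal keys given to Alice, Bob, and
Charlie. Audit backend X was enabled. LDAP was configured and a
policy was loaded to allow administrators to gain access to the
full system. The root token was then revoked.
EOF
gpg --clearsign attestation.txt   # produces attestation.txt.asc
```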

> for re-generation after accidental loss). For protection of data encryption
> keys, the (updated) transit backend could be used, though in our case, the
> service instances need the actual keys, for various reasons.

Transit has a helper feature to generate high-quality AES keys that
are wrapped by a transit key, so you can distribute encrypted
encryption keys to your instances but control at any given time what
machines are allowed to decrypt those keys into memory.
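A sketch of that datakey flow with the CLI of the time (the key name `app-dek` is a placeholder):

```shell
# Create a named transit key; the key itself never leaves Vault.
vault mount transit
vault write -f transit/keys/app-dek

# Ask transit to generate a fresh AES data key, returned only in
# wrapped (encrypted) form -- safe to distribute via config management.
vault write -f transit/datakey/wrapped/app-dek

# An authorized service later decrypts the wrapped key into memory:
vault write transit/decrypt/app-dek ciphertext="<wrapped-datakey>"
```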

You mentioned the cubbyhole blog post. There are some neat things you
can do with that paradigm, but for many (maybe most) things people
want to use that for, response wrapping (new in 0.6) is the better
approach as it formalizes the process and does not require a trusted
service that sees all data passing through. See
https://www.hashicorp.com/blog/vault-0.6.html#response-wrapping
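A minimal response-wrapping sketch (the secret path is a placeholder; in 0.6.0 the wrapped response is redeemed through the cubbyhole, while later versions add a dedicated `vault unwrap` command):

```shell
# Any read can be response-wrapped: the caller receives a short-lived,
# single-use wrapping token instead of the data itself.
vault read -wrap-ttl=60s secret/app/db-creds

# The intended recipient redeems the wrapping token; the wrapped
# response lives in that token's cubbyhole.
VAULT_TOKEN="<wrapping-token>" vault read cubbyhole/response
```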

Hope this helps!
--Jeff

Trevor Robinson

Jun 24, 2016, 4:25:05 PM
to Vault
Hi Jeff,

Thanks for the quick response!

On Friday, June 24, 2016 at 8:29:54 AM UTC-5, Jeff Mitchell wrote:
The root token isn't ephemeral and it isn't regenerated at
each unseal. What was added in 0.5 was a way to generate a new root
token using unseal keys for authorization. This doesn't happen at
unseal time; it is a totally separate action that happens at any
user-chosen time when enough unseal key holders are around to
authorize the action.
 
I'm aware that this isn't the normal or apparently even intended usage. I'm proposing that a front-end process gather the unseal keys and then use them to a) unseal the vault and then b) generate a root token, which it will use for any operation that handles secrets no single person is allowed to access. The root token is ephemeral in the sense that the one generated at init time is immediately revoked, and the subsequent ones are only held in memory by the front-end process. Is there a reason this wouldn't work? Does generating a new root token automatically invalidate the previous one, or would it need to be explicitly revoked? If the latter, perhaps the token accessor could be stored so that the process could revoke the previous (inaccessible) root token before creating a new one.
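For reference, revoking a token by its accessor (without possessing the token itself) can be done through the `auth/token/revoke-accessor` endpoint; a sketch, with placeholder values:

```shell
# Revoke a (possibly inaccessible) token using only its accessor.
# $VAULT_ADDR and $VAULT_TOKEN are the usual environment variables;
# the accessor value is whatever was saved at token-creation time.
curl -s -X POST "$VAULT_ADDR/v1/auth/token/revoke-accessor" \
    -H "X-Vault-Token: $VAULT_TOKEN" \
    -d '{"accessor": "<saved-accessor>"}'
```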

This allows you to revoke your initial root
token immediately after setting up enough of Vault to allow
appropriate administrative access, but with disaster-recovery
safeguards in-place. Generally speaking, we encourage as few root
tokens as possible within any given use-case to exist, and in many
cases that number can be "zero".

It sounds like you're recommending using the root token only to grant subsets of administrative access to various users (analogous to a Unix system where root is used to provide sudo access to administrators at install time and then disabled). The problem is that we have secrets that no single user may be allowed to access. We're looking for a way to have a privileged process, which gains full access to the Vault using a shared secret (ideally the unseal shared secret), perform the secret handling. In other words, we'd like to have custom policy logic in the same security domain as the vault, since I don't think Vault can/should anticipate every policy need. If a plug-in facility existed, that would also solve the problem.

You may want to think about a workflow where a quorum of people are
present to watch the Vault initialization and initial setup that can
individually attest -- perhaps with a PGP-signed statement -- as to
what occurred...e.g. "Vault was initialized and unseal keys given to
Alice, Bob, and Charlie. Audit backend X was enabled. LDAP was
configured and a policy was loaded to allow administrators to gain
access to the full system. The root token was then revoked." Such a
statement, made/signed individually by multiple people, may pass
muster for auditors.

That's exactly what we'd do except for the part about "allow administrators to gain access to the full system", which can't be allowed to happen. Partitioning the Vault across different administrators doesn't help -- some secrets need to be accessible only by a program and never by a person.
 
Transit has a helper feature to generate high-quality AES keys that
are wrapped by a transit key, so you can distribute encrypted
encryption keys to your instances but control at any given time what
machines are allowed to decrypt those keys into memory.

You're referring to "/transit/datakey"? That seems like a potentially useful way to generate our keys. I don't understand the purpose of encrypting them, though, if they're already exchanged over an encrypted channel. If we later need to provide the key-decrypting key, wouldn't it be equivalent to simply providing the (plaintext) data encryption key at that point? (Obviously, the data encryption keys would only be held in memory and never persisted by those machines.)

You mentioned the cubbyhole blog post. There are some neat things you
can do with that paradigm, but for many (maybe most) things people
want to use that for, response wrapping (new in 0.6) is the better
approach as it formalizes the process and does not require a trusted
service that sees all data passing through. See
https://www.hashicorp.com/blog/vault-0.6.html#response-wrapping 
 
Cool, that does seem like it would save us a step. However, I think we still need a trusted service because the token we need to create for the worker service must have access to parts of the vault that the ops person authorizing the service can't access. In other words, since a token can't be used to generate a token with more privilege, we need to have a trusted service (which has more privilege than any person) generate it.

Thanks,
Trevor

Jeff Mitchell

Jun 24, 2016, 4:49:18 PM
to vault...@googlegroups.com
Hi Trevor,

> I'm aware that this isn't the normal or apparently even intended usage.

You'd be surprised at how often things that "weren't intended usage"
have turned into officially-supported enhancements/features :-)

> I'm
> proposing that a front-end process gather the unseal keys and then use them
> to a) unseal the vault and then b) generate a root token, which it will use
> for any operation that handles secrets no single person is allowed to
> access. The root token is ephemeral in the sense that the one generated at
> init time is immediately revoked, and the subsequent ones are only held in
> memory by the front-end process. Is there a reason this wouldn't work?

This would work, but just keep in mind that anyone that manages to
gain access/control over this process can do _anything_.

> Does
> generating a new root token automatically invalidate the previous one, or
> would it need to be explicitly revoked?

The latter.

> If the latter, perhaps the token
> accessor could be stored so that the process could revoke the previous
> (inaccessible) root token before creating a new one.

Yep. Keep in mind that root tokens never expire. Rather than holding
it in memory, you may want to use it to create a token with a policy
that has full access to *, but has an expiration. Then immediately
revoke the root token and use the new one instead. That way there
isn't a valid token sitting around that has no expiration if the token
isn't properly revoked (e.g. due to a crash).
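Jeff's suggestion might look like this with the CLI of the time (the policy name and TTL are illustrative; `policy = "sudo"` is the old-style policy syntax):

```shell
# A policy granting full access to every path.
cat > full-access.hcl <<'EOF'
path "*" {
  policy = "sudo"
}
EOF
vault policy-write full-access full-access.hcl

# Create an expiring token with that policy, then drop the root token
# so nothing without a TTL is left sitting around.
vault token-create -policy=full-access -ttl=1h
vault token-revoke "$ROOT_TOKEN"
```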

> It sounds like you're recommending using the root token only to grant
> subsets of administrative access to various users (analogous to a Unix
> system where root is used to provide sudo access to administrators at
> install time and then disabled). The problem is that we have secrets that no
> single user may be allowed to access. We're looking for a way to have a
> privileged process, which gains full access to the Vault using a shared
> secret (ideally the unseal shared secret), perform the secret handling. In
> other words, we'd like to have custom policy logic in the same security
> domain as the vault, since I don't think Vault can/should anticipate every
> policy need. If a plug-in facility existed, that would also solve the
> problem.

Plugins are on our long term roadmap. We have a plugin system for
other HC products (notably Terraform) but we need to have a more
secure facility for plugins before we support them in Vault. I'm going
to email you separately about the other part of this...

> That's exactly what we'd do except for the part about "allow administrators
> to gain access to the full system", which can't be allowed to happen.
> Partitioning the Vault across different administrators doesn't help -- some
> secrets need to be accessible only by a program and never by a person.

Yep, that totally makes sense.

>> Transit has a helper feature to generate high-quality AES keys that
>> are wrapped by a transit key, so you can distribute encrypted
>> encryption keys to your instances but control at any given time what
>> machines are allowed to decrypt those keys into memory.
>
>
> You're referring to "/transit/datakey"? That seems like a potentially useful
> way to generate our keys.

Yep. You can do exactly what datakey does by reading from your crypto
source and sending those bytes through transit -- it's purely a helper
function.

> I don't understand the purpose of encrypting them
> though, if they're already exchanged over an encryption channel.

The idea is that you can distribute the datakey in the open (for
instance, via config management, or baked into images), then give the
correct services the ability to perform the decryption. It's nothing
magic -- really, just a helper!

> Cool, that does seem like it would save us a step. However, I think we still
> need a trusted service because the token we need to create for the worker
> service must have access to parts of the vault that the ops person
> authorizing the service can't access.

Sure -- but, if the trusted service is sending that information to
someone/something else over an insecure channel, but itself doesn't
need to actually see the data inside, it can do this without the data
ever being exposed in its memory.

> In other words, since a token can't be
> used to generate a token with more privilege

It can! In one specific way: token roles. See
https://www.vaultproject.io/docs/auth/token.html -- among other
things, this is a facility that lets trusted services (or users)
create tokens with sets of permissions other than a subset of their own.
It's not dissimilar to a trusted service/user configuring any other
authentication backend in Vault and setting the policies that should
be granted on generated tokens, but does not require outside
authentication mechanisms.
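A token-role sketch with the CLI of the time (the role name, policy name, and period are placeholders):

```shell
# A role that can mint tokens with policies outside the caller's own set.
vault write auth/token/roles/worker \
    allowed_policies="worker-secrets" \
    period="1h"

# A caller permitted to use this role can then create tokens more
# privileged than itself:
vault token-create -role=worker
```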

Best,
Jeff

Trevor Robinson

Jun 24, 2016, 8:52:08 PM
to Vault
Thanks for all the guidance, Jeff.

One more question: Is it possible to read the value of the keys used by the transit backend (aside from the datakey endpoint)? We basically want rotated keys that are available to our service instances. If we could use the transit keys for this, we wouldn't need to create our own keys (possibly using datakey) and store them in a list in the generic backend (essentially duplicating what transit does internally for rotation). If not, I guess the encryption aspect of datakey could come in handy: Administrators could perform datakey rotation using a script that generates and stores a new datakey without ever seeing the plaintext key. Administrators would not have access to transit/decrypt for the key encryption key but the service instances would. That approach would also sidestep the issue of how to import our existing keys into the transit backend (which I doubt is currently supported).

-Trevor

Jeff Mitchell

Jun 27, 2016, 10:27:54 AM
to vault...@googlegroups.com
Hi Trevor,
Exposing keys is not currently supported, and we've been skeptical of
the idea of supporting it. Part of the contract of transit is that its
keys are not available to users -- even if you misconfigure your
policies.

I'm much more open to allowing importation of keys from authorized
parties. It doesn't exist now, but that way you're at least working
with private key material you already have stored elsewhere, so you're
not potentially increasing your exposure by putting it into Vault.

Best,
Jeff