Vault agent still not renewing database credentials

zxcvb...@gmail.com

Apr 9, 2019, 12:34:14 PM
to Vault
I am attempting to integrate Vault with some commercial software that I don't have the source code for. I thought that vault-agent would be the perfect solution for this, but so far it is not working out.

My policy /sys/policy/prod-awsvpn-x:

path "database/creds/prod-awsvpn-x" {
  capabilities = ["read"]
}

My AWS auth role /auth/aws/role/prod-awsvpn-x:

{
  "role": "prod-awsvpn-x",
  "auth_type": "iam",
  "bound_iam_principal_arn": [
    "arn:aws:iam::x:role/prod-awsvpn-x-*"
  ],
  "ttl": 86400,
  "max_ttl": 0,
  "period": 86400,
  "policies": [
    "prod-awsvpn-x"
  ]
}

My database role /database/roles/prod-awsvpn-x:

{
  "name": "prod-awsvpn-x",
  "db_name": "prod-z",
  "default_ttl": 86400,
  "max_ttl": 0,
  "creation_statements": [
    "CREATE USER '{{name}}'@'%' IDENTIFIED BY '{{password}}';GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, ALTER, LOCK TABLES ON `xyz%`.* TO '{{name}}'@'%';"
  ]
}

There appears to be no issue with the auth role: the Vault agent logs in and the token renewal happens without issue. However, my database credentials seem to die after 24 hours. I can get new ones easily enough by reading the path again, but that means downtime for the software I am using (commercial, closed source), since I have to bring it down, update its configuration, and restart it whenever the database username and password change.

I see these logs from the vault agent after 24 hours:

Apr  8 18:03:40 ip-172-16-6-225 vault: 2019-04-08T18:03:40.899-0400 [INFO]  cache: received request: path=/v1/database/creds/prod-awsvpn-x method=GET
Apr  8 18:03:40 ip-172-16-6-225 vault: 2019-04-08T18:03:40.899-0400 [DEBUG] cache: using auto auth token: path=/v1/database/creds/prod-awsvpn-x method=GET
Apr  8 18:03:40 ip-172-16-6-225 vault: 2019-04-08T18:03:40.899-0400 [DEBUG] cache.leasecache: forwarding request: path=/v1/database/creds/prod-awsvpn-x method=GET
Apr  8 18:03:40 ip-172-16-6-225 vault: 2019-04-08T18:03:40.899-0400 [INFO]  cache.apiproxy: forwarding request: path=/v1/database/creds/prod-awsvpn-x method=GET
Apr  8 18:03:41 ip-172-16-6-225 vault: 2019-04-08T18:03:41.030-0400 [DEBUG] cache.leasecache: processing lease response: path=/v1/database/creds/prod-awsvpn-x method=GET
Apr  8 18:03:41 ip-172-16-6-225 vault: 2019-04-08T18:03:41.030-0400 [DEBUG] cache.leasecache: storing response into the cache: path=/v1/database/creds/prod-awsvpn-x method=GET
Apr  8 18:03:41 ip-172-16-6-225 vault: 2019-04-08T18:03:41.031-0400 [DEBUG] cache.leasecache: initiating renewal: path=/v1/database/creds/prod-awsvpn-x method=GET
Apr  8 18:03:41 ip-172-16-6-225 vault: 2019-04-08T18:03:41.047-0400 [DEBUG] cache.leasecache: secret renewed: path=/v1/database/creds/prod-awsvpn-x
 
Apr  9 11:17:07 ip-172-16-6-225 vault: 2019-04-09T11:17:07.152-0400 [INFO]  auth.handler: renewed auth token
 
Apr  9 11:38:04 ip-172-16-6-225 vault: 2019-04-09T11:38:04.217-0400 [DEBUG] cache.leasecache: secret renewed: path=/v1/database/creds/prod-awsvpn-x
Apr  9 11:38:04 ip-172-16-6-225 vault: 2019-04-09T11:38:04.217-0400 [DEBUG] cache.leasecache: renewal halted; evicting from cache: path=/v1/database/creds/prod-awsvpn-x
Apr  9 11:38:04 ip-172-16-6-225 vault: 2019-04-09T11:38:04.217-0400 [DEBUG] cache.leasecache: evicting index from cache: id=8bf760522285a625dd303f70da77c553bd580c12c1b8d04e6786967f9537ba4f path=/v1/database/creds/prod-awsvpn-x method=GET
 
So 17 hours 34 minutes 23 seconds into my 24-hour lifespan, vault-agent tries to renew the database credential. Good move, why wait for the last millisecond? However, the renewal appears to fail: the last three entries happen in the same millisecond, so the renewal routine is halted and the cache cleared. Then a few hours later, when the current credential expires, my app will die.
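
The elapsed time quoted above can be checked from the two log timestamps. This is a minimal sketch, assuming the 18:03:41 "initiating renewal" entry marks the start of the lease and that the lease lifetime is the 86400 seconds set as `default_ttl` on the database role:

```python
from datetime import datetime, timedelta

# Timestamps copied from the agent log lines above (both at UTC offset -0400).
lease_start = datetime.fromisoformat("2019-04-08T18:03:41")
renew_attempt = datetime.fromisoformat("2019-04-09T11:38:04")

# Assumption: the lease lifetime equals the role's default_ttl of 86400 seconds.
lease_ttl = timedelta(seconds=86400)

elapsed = renew_attempt - lease_start
remaining = lease_ttl - elapsed

print("elapsed:  ", elapsed)    # 17:34:23
print("remaining:", remaining)  # 6:25:37
```

Under those assumptions, roughly six and a half hours of lease lifetime were left when the renewal was halted.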

It is probably something subtle that I am missing but I don't know what. The lack of information from Vault on why it is halting the renewal process doesn't help.



zxcvb...@gmail.com

Apr 10, 2019, 4:53:02 PM
to Vault
I'm not sure what else to do at the moment. I'm enabling trace-level logging to see if that captures any clues.

Brian Kassouf

Apr 10, 2019, 5:39:04 PM
to vault...@googlegroups.com
Hi there,

Seems like Vault Agent doesn't have permissions to renew the lease.
Please try giving your vault policy "prod-awsvpn-x" access to the
renew path: https://www.vaultproject.io/api/system/leases.html#renew-lease

Best,
Brian
> --
> This mailing list is governed under the HashiCorp Community Guidelines - https://www.hashicorp.com/community-guidelines.html. Behavior in violation of those guidelines may result in your removal from this mailing list.
>
> GitHub Issues: https://github.com/hashicorp/vault/issues
> IRC: #vault-tool on Freenode
> ---
> You received this message because you are subscribed to the Google Groups "Vault" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to vault-tool+...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/vault-tool/aca68cbb-e5de-4c3f-b19d-5c34a528e43c%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

zxcvb...@gmail.com

Apr 12, 2019, 5:37:40 PM
to Vault
I've added the following to my policies:

# Allow tokens to renew themselves
path "auth/token/renew-self" {
  capabilities = ["update"]
}

# Allow a token to renew a lease via lease_id
path "sys/leases/renew" {
  capabilities = ["update"]
}

So far it does not seem to have made a difference. On the agent server (loglevel=trace) I see:

Apr 12 12:55:37 ip-172-16-29-118 vault: 2019-04-12T12:55:37.155-0400 [DEBUG] cache.leasecache: secret renewed: path=/v1/database/creds/prod-awsvpn-x
Apr 12 12:55:37 ip-172-16-29-118 vault: 2019-04-12T12:55:37.155-0400 [DEBUG] cache.leasecache: renewal halted; evicting from cache: path=/v1/database/creds/prod-awsvpn-x
Apr 12 12:55:37 ip-172-16-29-118 vault: 2019-04-12T12:55:37.155-0400 [DEBUG] cache.leasecache: evicting index from cache: id=b94369f909510e33dcafe3fd6dc9599c64748031ab15945e686b3d04fc76d63e path=/v1/database/creds/prod-awsvpn-x method=GET

In the server logs (loglevel=trace) I see no mention of the event at all.

In the audit logs I see:

{"time":"2019-04-12T16:55:37.124906514Z","type":"request","auth":{"client_token":"hmac-sha256:7890dce4b524bf0a2e63a374858126037d924f6d0b020a0344e500002f0c5ab4","accessor":"hmac-sha256:24233f713167e930e68a7831e3d99073e4d874a35a791cb6280d1eebd707f6d2","display_name":"aws-prod-awsvpn-x-InstanceRole-1PYL5GT789Z1A","policies":["default","prod-awsvpn-x"],"token_policies":["default","prod-awsvpn-x"],"metadata":{"account_id":"433624884903","auth_type":"iam","canonical_arn":"arn:aws:iam::433624884903:role/prod-awsvpn-x-InstanceRole-1PYL5GT789Z1A","client_arn":"arn:aws:sts::433624884903:assumed-role/prod-awsvpn-x-InstanceRole-1PYL5GT789Z1A/i-02ebb5cf8051af10a","client_user_id":"AROAIRRTWH7NJJUJ73FFQ","inferred_aws_region":"","inferred_entity_id":"","inferred_entity_type":"","role_id":"6982406f-bce9-946c-2c9a-2b3051b597ba"},"entity_id":"7ad8c765-cf01-6a23-5738-dc3882f59d3b","token_type":"service"},"request":{"id":"49a1e38c-4aab-942d-4c37-8ea6f2ae2b80","operation":"update","client_token":"hmac-sha256:7890dce4b524bf0a2e63a374858126037d924f6d0b020a0344e500002f0c5ab4","client_token_accessor":"hmac-sha256:24233f713167e930e68a7831e3d99073e4d874a35a791cb6280d1eebd707f6d2","namespace":{"id":"root","path":""},"path":"sys/leases/renew","data":{"increment":"hmac-sha256:271527f9edaf27ec0e7b164841a18b0fc6eb9936be9cd3d92163806010ddbb07","lease_id":"hmac-sha256:8e54369750c50b11d53a4e6f04a5e91c0944071ada7b1b652ad8f552be0421dc"},"policy_override":false,"remote_address":"172.16.28.252","wrap_ttl":0,"headers":{}},"error":""}
{"time":"2019-04-12T16:55:37.155276434Z","type":"response","auth":{"client_token":"hmac-sha256:7890dce4b524bf0a2e63a374858126037d924f6d0b020a0344e500002f0c5ab4","accessor":"hmac-sha256:24233f713167e930e68a7831e3d99073e4d874a35a791cb6280d1eebd707f6d2","display_name":"aws-prod-awsvpn-x-InstanceRole-1PYL5GT789Z1A","policies":["default","prod-awsvpn-x"],"token_policies":["default","prod-awsvpn-x"],"metadata":{"account_id":"433624884903","auth_type":"iam","canonical_arn":"arn:aws:iam::433624884903:role/prod-awsvpn-x-InstanceRole-1PYL5GT789Z1A","client_arn":"arn:aws:sts::433624884903:assumed-role/prod-awsvpn-x-InstanceRole-1PYL5GT789Z1A/i-02ebb5cf8051af10a","client_user_id":"AROAIRRTWH7NJJUJ73FFQ","inferred_aws_region":"","inferred_entity_id":"","inferred_entity_type":"","role_id":"6982406f-bce9-946c-2c9a-2b3051b597ba"},"entity_id":"7ad8c765-cf01-6a23-5738-dc3882f59d3b","token_type":"service"},"request":{"id":"49a1e38c-4aab-942d-4c37-8ea6f2ae2b80","operation":"update","client_token":"hmac-sha256:7890dce4b524bf0a2e63a374858126037d924f6d0b020a0344e500002f0c5ab4","client_token_accessor":"hmac-sha256:24233f713167e930e68a7831e3d99073e4d874a35a791cb6280d1eebd707f6d2","namespace":{"id":"root","path":""},"path":"sys/leases/renew","data":{"increment":"hmac-sha256:271527f9edaf27ec0e7b164841a18b0fc6eb9936be9cd3d92163806010ddbb07","lease_id":"hmac-sha256:8e54369750c50b11d53a4e6f04a5e91c0944071ada7b1b652ad8f552be0421dc"},"policy_override":false,"remote_address":"172.16.28.252","wrap_ttl":0,"headers":{}},"response":{"secret":{"lease_id":"database/creds/prod-awsvpn-x/2bXA05NwNd5131cNwVopXwIG"},"warnings":["TTL of \"24h0m0s\" exceeded the effective max_ttl of \"6h54m3s\"; TTL value is capped accordingly"],"headers":null},"error":""}

Which suggests to me that the lease was renewed successfully, though not for 24 hours, so I am not sure why the renewer quit. Was it because it didn't get the full time it wanted?

My auth token looks like this:

Key                  Value
---                  -----
creation_time        1555026578
creation_ttl         24h
display_name         aws-prod-awsvpn-x-InstanceRole-1PYL5GT789Z1A
entity_id            7ad8c765-cf01-6a23-5738-dc3882f59d3b
expire_time          2019-04-13T13:22:31.835998052-04:00
explicit_max_ttl     0s
issue_time           2019-04-11T19:49:38.169837273-04:00
last_renewal         2019-04-12T13:22:31.835998164-04:00
last_renewal_time    1555089751
meta                 map[canonical_arn:arn:aws:iam::433624884903:role/prod-awsvpn-x-InstanceRole-1PYL5GT789Z1A inferred_entity_id: role_id:6982406f-bce9-946c-2c9a-2b3051b597ba auth_type:iam client_arn:arn:aws:sts::433624884903:assumed-role/prod-awsvpn-x-InstanceRole-1PYL5GT789Z1A/i-02ebb5cf8051af10a client_user_id:AROAIRRTWH7NJJUJ73FFQ inferred_aws_region: inferred_entity_type: account_id:433624884903]
num_uses             0
orphan               true
path                 auth/aws/login
policies             [default prod-awsvpn-x]
renewable            true
ttl                  20h6m17s
type                 service

Calvin Leung Huang

Apr 12, 2019, 7:46:39 PM
to Vault
Hello there,

Looking at your database role, it seems that you've set the TTL value to 86400 seconds, which translates to 24 hours; that is why those database creds expire after that duration. Even though you've explicitly set the role's `max_ttl` value to 0, this particular value is capped by the mount's/system's TTL value (see https://www.vaultproject.io/api/secret/databases/index.html#max_ttl).
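
The cap is visible in the audit log warning posted earlier in this thread. Below is a rough sketch of the arithmetic, assuming the lease may live at most 24 hours counted from when the credentials were issued; `parse_go_duration` is a hypothetical helper written only for this illustration:

```python
import re
from datetime import datetime, timedelta

def parse_go_duration(text):
    """Parse a Go-style duration such as '6h54m3s' (hours, minutes, seconds only)."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", text)
    if not match or not text:
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = (int(part or 0) for part in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)

# Values copied from the audit log response entry (time is UTC, fraction dropped).
renewed_at = datetime.fromisoformat("2019-04-12T16:55:37")
requested = parse_go_duration("24h0m0s")
effective_max = parse_go_duration("6h54m3s")

# Assumption: the lease may live at most 24 hours from issuance.
assumed_max_lifetime = timedelta(hours=24)

expires_at = renewed_at + effective_max          # when the capped lease ends
issued_at = expires_at - assumed_max_lifetime    # implied creation time of the lease
granted = min(requested, effective_max)          # what the renewal could actually grant

print("lease expires at (UTC):  ", expires_at)
print("implied issue time (UTC):", issued_at)
print("granted on renewal:      ", granted)
```

Under that assumption the implied issue time is 2019-04-11 23:49:40 UTC (19:49:40 at -0400), about two seconds after the auth token's `issue_time` shown in the token lookup, which would be consistent with the credentials having been requested right after login.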

The database lease cannot be renewed past the TTL value, but the Vault token obtained via auto-auth (in this case, the agent's token obtained through the AWS auth method) can. That is because the /auth/aws/role/prod-awsvpn-x role creates a periodic token for Vault Agent to use, which it can renew indefinitely as long as each renewal occurs within the time specified in the "period" field. This field does not exist for database credential creation, which is why your database credentials will eventually expire. My recommendation is to send a new credential request on your side within a try-catch, so that Vault Agent gets a fresh set of database credentials whenever the old one is found to be invalid or expired.
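
A minimal sketch of that try-catch pattern, for a client that can be modified. The agent listener address is a hypothetical value, `PermissionError` stands in for whatever error the database driver raises on rejected credentials, and the agent is assumed to attach its auto-auth token to proxied requests as the logs above suggest:

```python
import json
import urllib.request

# Assumptions: the listener address is hypothetical; the path is the one from this thread.
AGENT_ADDR = "http://127.0.0.1:8200"
CREDS_PATH = "/v1/database/creds/prod-awsvpn-x"

def fetch_creds(addr=AGENT_ADDR, path=CREDS_PATH):
    """Read database credentials through the agent and return (username, password)."""
    with urllib.request.urlopen(addr + path, timeout=10) as resp:
        body = json.load(resp)
    return body["data"]["username"], body["data"]["password"]

class CredentialCache:
    """Hold the last credentials and refetch them when the database rejects them."""

    def __init__(self, fetch=fetch_creds):
        self._fetch = fetch
        self._creds = None

    def run(self, action):
        """Call action(username, password); on PermissionError refetch once and retry."""
        if self._creds is None:
            self._creds = self._fetch()
        try:
            return action(*self._creds)
        except PermissionError:
            # The old credentials were rejected (for example, the lease expired),
            # so request a fresh set and try one more time.
            self._creds = self._fetch()
            return action(*self._creds)
```

The fetch function is injected so that the retry logic can be exercised without a running agent. For closed-source software this logic would have to live in a wrapper around the application rather than inside it.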

On another note, Vault Agent attempts renewal at a percentage of the total TTL plus some jitter, before the lease actually expires, which is why you're seeing a renewal attempt at around the 17-hour mark.
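
To illustrate the idea of "a percentage of the TTL plus jitter", here is a small sketch. The `base_fraction` and `max_jitter_fraction` values are illustrative assumptions, not constants taken from Vault Agent; only the 24-hour TTL and the observed 17h34m23s come from this thread:

```python
import random
from datetime import timedelta

def renewal_delay(ttl, base_fraction=2 / 3, max_jitter_fraction=0.1, rng=random):
    """Return how long to wait before renewing: a fixed share of the TTL plus random jitter.

    base_fraction and max_jitter_fraction are illustrative assumptions,
    not values taken from Vault Agent.
    """
    jitter_fraction = rng.uniform(0, max_jitter_fraction)
    return timedelta(seconds=ttl.total_seconds() * (base_fraction + jitter_fraction))

ttl = timedelta(hours=24)
observed = timedelta(hours=17, minutes=34, seconds=23)  # from the agent logs in this thread

print("observed share of TTL:", round(observed / ttl, 3))  # 0.732
print("example delay:", renewal_delay(ttl))
```

With these assumed parameters the delay falls between 16h and 18h24m for a 24-hour TTL, a range that contains the observed renewal attempt.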

Hope you find this helpful!

- Calvin

zxcvb...@gmail.com

Apr 13, 2019, 12:40:37 AM
to Vault
The good news is that I think the problem is understood now. My goal from the beginning has been dynamic, short-lived credentials, though from an integration perspective I think dynamic credentials that have to be renewed frequently are an acceptable half step. In my mind this situation seems to defeat the purpose of the auto-renew feature in vault-agent, because the TTL will need to be so large that the lease will never renew in the lifetime of the instance where the credentials are used. Any program written to fetch new credentials before the TTL runs out wouldn't need vault-agent to renew anything. Commercial software that I don't have the source code for, and which will never be integrated with Vault, does need an agent of some sort to fetch and renew credentials. Even though I'm not thrilled with having to allow huge TTLs, I think this can still be salvaged so long as the secrets are destroyed when the auth token that created them is. I'll test this tomorrow or Monday.