Changing consul agent acl token


Brian Lalor

Jun 26, 2017, 10:24:46 AM
to Consul
I’m using Consul 0.7.5. I’m providing ACL tokens to the Consul agents and servers via Vault, using a consul-template instance that only talks to Vault. After some period of time (I believe it matches the Consul secret backend’s max TTL), my agents start acting “weird” and I see errors like the following:

[ERR] consul: RPC failed to server 10.64.16.101:8300: rpc error: rpc error: ACL not found
[ERR] agent: failed to sync changes: rpc error: rpc error: ACL not found

When I see this error, the ACL token written into Consul’s config file is valid (I can retrieve the rules applied to it), and the age of the consul process indicates that it was restarted after the config file was regenerated. But services registered with the agent are not being synced to the cluster; the agent is effectively cut off from the cluster. This makes me think that the agent stops honoring the ACL token in the config file at some point after it’s initially started.
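The validity check mentioned above (retrieving the rules applied to a token) could be scripted roughly as follows. This is a minimal sketch, assuming the legacy ACL endpoint `/v1/acl/info/<id>` that Consul 0.7.x exposes; the helper names are hypothetical, and treating both `null` and an empty list as "not found" is an assumption made to stay on the safe side.

```python
import json
from urllib.parse import quote


def acl_info_url(agent_addr, token_id):
    """Build the legacy ACL info URL for a token ID (hypothetical helper)."""
    return f"{agent_addr.rstrip('/')}/v1/acl/info/{quote(token_id, safe='')}"


def token_exists(response_body):
    """Return True if the response body describes at least one ACL.

    Assumption: a missing token comes back as `null` or an empty list,
    so both are treated as "not found".
    """
    acls = json.loads(response_body)
    return bool(acls)
```

Fetching `acl_info_url(...)` with a sufficiently privileged token and passing the body to `token_exists` gives a quick yes/no answer on whether the servers still know about the token in the config file.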

How do I properly rotate the ACL token for a running agent?

— 
Brian Lalor


Brian Lalor

Jul 10, 2017, 7:09:25 AM
to Consul
Bump on this; wondering if I’m going about things the wrong way…

--
This mailing list is governed under the HashiCorp Community Guidelines - https://www.hashicorp.com/community-guidelines.html. Behavior in violation of those guidelines may result in your removal from this mailing list.
 
GitHub Issues: https://github.com/hashicorp/consul/issues
IRC: #consul on Freenode
---
You received this message because you are subscribed to the Google Groups "Consul" group.
To unsubscribe from this group and stop receiving emails from it, send an email to consul-tool...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/consul-tool/2B7157F7-5062-4AE5-91EA-B049A09E46EF%40bravo5.org.
For more options, visit https://groups.google.com/d/optout.

— 
Brian Lalor


James Phillips

Jul 14, 2017, 2:35:02 PM
to consu...@googlegroups.com
Hi Brian,

> When I see this error, the ACL token written into Consul’s config file is valid (I can retrieve the rules applied to it), and the age of the consul process indicates that it was restarted after the config file was regenerated.

The ACL token isn't currently in the reloadable configuration, so I
can see that a reload won't pick it up (or a SIGHUP), but it doesn't
make sense that an agent restart would not see it. Can you clarify if
you are completely restarting the agent process, or if you are just
reloading?

I think we can pretty easily make the ACL token get picked up with a
reload with a small code change; that would support rotation.

-- James

Brian Lalor

Jul 14, 2017, 2:45:58 PM
to Consul
On Jul 14, 2017, at 2:34 PM, James Phillips <ja...@hashicorp.com> wrote:
>
> Hi Brian,
>
>> When I see this error, the ACL token written into Consul’s config file is valid (I can retrieve the rules applied to it), and the age of the consul process indicates that it was restarted after the config file was regenerated.
>
> The ACL token isn't currently in the reloadable configuration, so I
> can see that a reload won't pick it up (or a SIGHUP), but it doesn't
> make sense that an agent restart would not see it. Can you clarify if
> you are completely restarting the agent process, or if you are just
> reloading?
>
> I think we can pretty easily make the ACL token get picked up with a
> reload with a small code change; that would support rotation.

That would be a nice change. I do have “systemctl restart consul” in the consul-template config that manages the credential file. (That’s not quite as scary as it sounds; I have two c-t instances, and the one I call “vault-template” doesn’t talk to Consul, so it is suitable for retrieving the token for the Consul agents.)

When I said “the age of the consul process” I was referring to the etime field in ps, so it was definitely restarted.

I’m upgrading consul-template to 0.19.0 to take advantage of the Vault grace period option that was added. I’m hoping that will head off the possible race condition with the token getting revoked. I’ll keep an eye out for the behavior of the agent not honoring the new token. It doesn’t make any sense to me, either, but I was wondering if perhaps it was stored in the same way the gossip encryption key is.

James Phillips

Jul 26, 2017, 4:19:02 PM
to consu...@googlegroups.com
Hi Brian,

We just created a new agent API that will go out in Consul 0.9.1 which
allows tokens to be introduced (they never have to go into config
files if you want) and rotated -
https://www.consul.io/api/agent.html#update-acl-tokens. We decided not
to make this work with `consul reload` since that could potentially
reset tokens that were rotated later with the API, so this seemed more
flexible.

Please let us know if you think this will work for you, and if you see
any other weird token behavior.

-- James
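
For readers who want to try the agent token API mentioned above, here is a minimal sketch using only the Python standard library. It assumes the endpoint shape documented for Consul 0.9.1 (`PUT /v1/agent/token/<kind>` with a JSON body of the form `{"Token": "..."}`) and that the caller authenticates with the `X-Consul-Token` header. The function names, the agent address, and the list of token kinds are illustrative assumptions; check them against the documentation for your Consul version.

```python
import json
import urllib.request

# Token kinds from the 0.9.1-era docs (assumption: verify for your version).
TOKEN_KINDS = {
    "acl_token",
    "acl_agent_token",
    "acl_agent_master_token",
    "acl_replication_token",
}


def build_token_update_request(agent_addr, kind, new_token, auth_token):
    """Build (but do not send) the PUT request that swaps an agent token."""
    if kind not in TOKEN_KINDS:
        raise ValueError(f"unknown token kind: {kind}")
    url = f"{agent_addr.rstrip('/')}/v1/agent/token/{kind}"
    body = json.dumps({"Token": new_token}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            # Token used to authorize the update itself (hypothetical value).
            "X-Consul-Token": auth_token,
            "Content-Type": "application/json",
        },
    )


def rotate_agent_token(agent_addr, kind, new_token, auth_token, timeout=5):
    """Send the update to a running agent; returns True on HTTP 200."""
    req = build_token_update_request(agent_addr, kind, new_token, auth_token)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status == 200
```

A consul-template command could call something like `rotate_agent_token("http://127.0.0.1:8500", "acl_token", new_token, auth_token)` instead of restarting the agent, so the token never has to be written into the agent's config file.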