Hey folks,
I'm using Consul as a storage backend for Vault. The Consul servers run Consul in server mode, and the Vault servers run Consul as a client agent.
The Consul agent on the Vault host seems to be requesting the Raft configuration from a Consul server. It gets a permission error and then switches to another Consul server. This repeats about once every 10 seconds.
Jan 10 23:22:16 HOSTNAME consul[1880]: 2019/01/10 23:22:16 [ERR] consul: "Operator.RaftGetConfiguration" RPC failed to server 1.2.3.4:8300: rpc error making call: rpc error making call: Permission denied
Jan 10 23:22:16 HOSTNAME consul[1880]: consul: "Operator.RaftGetConfiguration" RPC failed to server 1.2.3.4:8300: rpc error making call: rpc error making call: Permission denied
Jan 10 23:22:16 HOSTNAME consul[1880]: 2019/01/10 23:22:16 [DEBUG] manager: cycled away from server "consul2"
Jan 10 23:22:16 HOSTNAME consul[1880]: 2019/01/10 23:22:16 [ERR] http: Request GET /v1/operator/raft/configuration, error: rpc error making call: rpc error making call: Permission denied from=[::1]:40856
Jan 10 23:22:16 HOSTNAME consul[1880]: 2019/01/10 23:22:16 [DEBUG] http: Request GET /v1/operator/raft/configuration (1.980794ms) from=[::1]:40856
Jan 10 23:22:16 HOSTNAME consul[1880]: manager: cycled away from server "consul2"
Jan 10 23:22:16 HOSTNAME consul[1880]: http: Request GET /v1/operator/raft/configuration, error: rpc error making call: rpc error making call: Permission denied from=[::1]:40856
Jan 10 23:22:16 HOSTNAME consul[1880]: http: Request GET /v1/operator/raft/configuration (1.980794ms) from=[::1]:40856
The IP address in the log (1.2.3.4) belongs to the host named consul2.
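For anyone who wants to reproduce the failing call by hand, here is a minimal Python sketch. It assumes the agent's HTTPS API is reachable on localhost:8443 (the "https" port in the config below), that the CA file lives at /etc/consul/ssl/ca.crt, and that the certificate is valid for the host name in base_url. The function names are hypothetical; the endpoint path comes from the log lines above, and the token is passed in the X-Consul-Token header.

```python
import json
import ssl
import urllib.error
import urllib.request


def build_raft_config_request(token, base_url="https://localhost:8443"):
    """Build (but do not send) a GET for the Raft configuration endpoint."""
    return urllib.request.Request(
        base_url + "/v1/operator/raft/configuration",
        headers={"X-Consul-Token": token},  # ACL token sent as a header
        method="GET",
    )


def fetch_raft_config(token, ca_file="/etc/consul/ssl/ca.crt",
                      base_url="https://localhost:8443"):
    """Send the request and return (status_code, body).

    ca_file and base_url are assumptions taken from the posted config.
    """
    ctx = ssl.create_default_context(cafile=ca_file)
    req = build_raft_config_request(token, base_url)
    try:
        with urllib.request.urlopen(req, context=ctx, timeout=5) as resp:
            return resp.status, json.loads(resp.read())
    except urllib.error.HTTPError as err:
        # An ACL failure comes back as an HTTP error with a text body.
        return err.code, err.read().decode(errors="replace")
```

Calling fetch_raft_config with each token in turn should show whether the denial depends on the token supplied or happens regardless of it.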
/etc/consul/consul.d/50custom.json:
{
  "acl": {
    "default_policy": "deny",
    "down_policy": "extend-cache",
    "enabled": true,
    "tokens": {
      "agent": "$MASTER_TOKEN",
      "agent_master": "$MASTER_TOKEN",
      "master": "$MASTER_TOKEN"
    }
  },
  "telemetry": {
    "disable_hostname": false,
    "statsite_address": "statsite:8125"
  }
}
/etc/consul/config.json:
{
  "acl_agent_master_token": "$MASTER_TOKEN",
  "acl_datacenter": "$DC",
  "acl_down_policy": "deny",
  "acl_ttl": "30s",
  "addresses": {
    "dns": "0.0.0.0",
    "grpc": "0.0.0.0",
    "http": "0.0.0.0",
    "https": "0.0.0.0"
  },
  "advertise_addr": "$IP_OF_HOST",
  "advertise_addr_wan": "$IP_OF_HOST",
  "bind_addr": "$IP_OF_HOST",
  "ca_file": "/etc/consul/ssl/ca.crt",
  "cert_file": "/etc/consul/ssl/server.crt",
  "client_addr": "0.0.0.0",
  "data_dir": "/var/consul",
  "datacenter": "$DC",
  "disable_update_check": false,
  "domain": "$CONSUL_DNS_PREFIX.",
  "enable_script_checks": false,
  "enable_syslog": true,
  "encrypt": "$SECRET",
  "key_file": "/etc/consul/ssl/server.key",
  "log_level": "DEBUG",
  "node_name": "vault0",
  "performance": {
    "leave_drain_time": "5s",
    "raft_multiplier": 1,
    "rpc_hold_timeout": "7s"
  },
  "ports": {
    "dns": 8600,
    "grpc": -1,
    "http": -1,
    "https": 8443,
    "serf_lan": 8301,
    "serf_wan": 8302,
    "server": 8300
  },
  "raft_protocol": 3,
  "retry_interval": "30s",
  "retry_join": [
    (List of consul server IPs)
  ],
  "retry_max": 0,
  "server": false,
  "syslog_facility": "local0",
  "ui": true,
  "verify_incoming": false,
  "verify_incoming_https": false,
  "verify_outgoing": true,
  "verify_server_hostname": false
}
To clarify, the variables shown here (such as $MASTER_TOKEN and $DC) have actual values in the real config files; I replaced them before posting.
I've confirmed the token used is the master token by logging into the Consul UI with it and checking the ACLs page, which shows which token you're logged in with.
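The same check can also be done against the API instead of the UI. The sketch below is a rough illustration, assuming the Consul 1.4 ACL endpoint /v1/acl/token/self and the same localhost:8443 HTTPS address as in the config above. The function names are hypothetical, and the response is assumed to carry a "Policies" list whose entries have a "Name" field.

```python
import urllib.request


def build_token_self_request(token, base_url="https://localhost:8443"):
    """Build (but do not send) a GET that asks Consul about the given token."""
    return urllib.request.Request(
        base_url + "/v1/acl/token/self",
        headers={"X-Consul-Token": token},
        method="GET",
    )


def policy_names(token_info):
    """Return the policy names linked to a token, given the decoded JSON body."""
    # "Policies" may be missing or null when no policy is linked to the token.
    return [p.get("Name", "") for p in token_info.get("Policies") or []]
```

Passing the decoded response for the master token to policy_names would be expected to list global-management, if the token is what the UI says it is.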
I can store values in Vault, and Vault appears to be working, but I'd like to resolve this permissions issue so the Consul agent stops hopping between Consul servers.
This happens with both the master token and a new token I created, whose policy is equivalent to the global-management policy but without ACL write access. The global-management policy grants operator write permission; my token's policy grants operator read permission.
This is the policy used for the token I created:
agent_prefix "" {
  policy = "write"
}
event_prefix "" {
  policy = "write"
}
key_prefix "" {
  policy = "write"
}
keyring = "write"
node_prefix "" {
  policy = "write"
}
operator = "read"
query_prefix "" {
  policy = "write"
}
service_prefix "" {
  policy = "write"
  intentions = "write"
}
session_prefix "" {
  policy = "write"
}
I am aware that my Consul config has some extra lines that configure ACLs for a previous version of Consul. This is an artifact of the public Consul Ansible role I'm using, which still needs to be patched to support the Consul 1.4 ACL syntax, so for now I'm using 50custom.json to feed in the correct ACL config.
This is with Consul 1.4.0 and Vault 1.0.0 on a CentOS 7 system.
Does anyone have an idea how to resolve this?
Thanks in advance,
Jeff.