KV Metrics from Consul


Darron Froese

Jun 16, 2016, 7:10:14 PM
to consu...@googlegroups.com
I'm working on some capacity planning and hoping to find some reasonable capacity limits to a functional Consul cluster.

To that end, I've been reading through the code looking for a bit more detailed metrics. I found a few metrics I'm curious about:

"consul.fsm.kvs.set.count" - it looks like a count of writes - 1 for each Consul server node in the quorum.

"consul.consul.fsm.kvs.delete.count" looks like the count of deletes - 1 for each Consul server node in the quorum.

Both look related to "consul.consul.kvs.apply.count", which appears to be the leader "applying" the write or delete after it's been written to enough server nodes, possibly related to Raft.

But I can't seem to find a metric for KV reads. Is there one?

Darron Froese

Jun 22, 2016, 3:19:32 PM
to consu...@googlegroups.com
Digging through the code a little more I found `consul.rpc.query`.

https://github.com/hashicorp/consul/blob/master/consul/rpc.go#L370

It appears that `consul.rpc.query` is emitted only by server nodes.

Is that an approximation of total reads? It looks like that metric tracks the number of KV reads and DNS requests that reach Consul.

I looked here: https://www.consul.io/docs/agent/telemetry.html - and didn't find anything closely related.

Any confirmation would be great - thanks.

James Phillips

Jun 22, 2016, 4:38:23 PM
to consu...@googlegroups.com
Hey Darron,

There's unfortunately no KV-specific telemetry in the read path. The `consul.rpc.query` metric is in the blocking RPC helper function, so it is basically a measure of all read volume on a given server (KV, health, catalog, etc.). That makes it an upper bound on the KV read volume, but it includes a lot of other stuff. The DNS handler ends up in the ServiceNodes endpoint on the servers, which also calls this helper, so that explains why DNS queries are reflected in here as well.

The only thing I can think of without changing code would be to run `consul monitor` with debug level on a few of your agents and use the HTTP logging to count /v1/kv/* requests over some interval. Sorry about that!

-- James
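
For anyone trying the `consul monitor` workaround above, here is a rough Python sketch of the counting step. It is not from the thread: the function name `count_kv_requests` is made up, and the log line shape the regex expects (`http: Request`, an optional HTTP method, then the path) is an assumption that should be checked against the actual debug output of your Consul version.

```python
import re
from collections import Counter

# ASSUMPTION: debug HTTP log lines contain "http: Request", optionally an
# HTTP method, and then the request path. Verify this against real output
# from your agents before trusting the counts.
KV_REQUEST = re.compile(
    r"http: Request\s+(?:(?P<method>[A-Z]+)\s+)?(?P<path>/v1/kv/\S*)"
)


def count_kv_requests(lines):
    """Count /v1/kv/* requests in an iterable of log lines.

    Returns a Counter keyed by HTTP method, with '?' used when the
    line carries no method.
    """
    counts = Counter()
    for line in lines:
        match = KV_REQUEST.search(line)
        if match:
            counts[match.group("method") or "?"] += 1
    return counts
```

One way to use it would be to capture the agent's debug output to a file for a fixed interval, then pass the open file (or `sys.stdin`) to `count_kv_requests` and divide the totals by the interval length. Note that this counts every KV request the agent's HTTP API sees, writes included, so only the GET bucket approximates reads.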
