big Consul KV Datastore

Chris Hartwig

Jan 20, 2015, 5:22:14 AM
to consu...@googlegroups.com
Hi all

We're using Consul KV to store some read-mostly data (<1 KB per key).
The write rate is < 1 / sec, but I wonder how much data Consul can safely manage.

Consul allows us to keep the read-mostly data in memory and keep it synchronized across the whole cluster when data changes (using ?index&wait).
It works perfectly, but I wonder if it will scale to tens of thousands of keys...
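
For readers unfamiliar with the ?index&wait pattern, here is a minimal sketch of such a long-polling loop. It is not the poster's actual code. It assumes a local agent at http://127.0.0.1:8500 and a hypothetical key prefix, and it takes an injectable `fetch` function (also hypothetical) so the loop can be exercised without a running agent. The `index` and `wait` query parameters and the `X-Consul-Index` response header come from Consul's HTTP API.

```python
import json
import urllib.request

CONSUL = "http://127.0.0.1:8500"  # assumed local agent address


def http_fetch(url):
    """Return (X-Consul-Index, decoded JSON body) for a KV read.

    A prefix with no keys answers 404, which urlopen raises as HTTPError;
    this sketch does not handle that case.
    """
    with urllib.request.urlopen(url, timeout=330) as resp:
        return int(resp.headers["X-Consul-Index"]), json.load(resp)


def watch_prefix(prefix, on_change, fetch=http_fetch, max_rounds=None):
    """Long-poll a KV prefix and call on_change whenever the index moves."""
    index = 0
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        # The request blocks until something under the prefix changes
        # or the wait time elapses, whichever comes first.
        url = f"{CONSUL}/v1/kv/{prefix}?recurse&index={index}&wait=5m"
        new_index, entries = fetch(url)
        if new_index < index:
            index = 0  # index went backwards: start over from scratch
        elif new_index != index:
            index = new_index
            on_change(entries)  # refresh the in-memory copy here
        rounds += 1
    return index
```

Calling `watch_prefix("config/", refresh_cache)` with a callback of your own would keep an in-memory copy current. A request that times out returns the same index, so the callback is not invoked.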

Are you using Consul to store lots of data? Can you share some numbers from actual production systems?
Anyone storing 1M+ keys in Consul?

Thanks
Chris

PS: we currently have hundreds of keys, but must decide whether this approach will scale to a different use case with >10k keys.

Armon Dadgar

Jan 20, 2015, 2:41:53 PM
to Chris Hartwig, consu...@googlegroups.com
Hey Chris,

In our internal benchmarks we have loaded Consul with ~250K keys and tested with that.
There are a few factors to consider such as the amount of RAM available (should be enough
to comfortably fit everything in memory), and the QPS requirements. Assuming low write QPS,
and moderate read QPS, having that many keys in Consul was not an issue.

We haven’t tested (nor have I heard about) anybody doing 1M+ keys. In terms of storage,
it’s really just more memory pressure. The concern is that we have not spent much time optimizing
the periodic checkpointing, which will start to become quite slow with that many keys. Bringing
up a new server will also take a long time, as many keys need to be replicated.

Best Regards,
Armon Dadgar
--
You received this message because you are subscribed to the Google Groups "Consul" group.
To unsubscribe from this group and stop receiving emails from it, send an email to consul-tool...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

sobe...@zendesk.com

Apr 26, 2017, 10:30:16 AM
to Consul, chrisde...@gmail.com
Armon,

Are there newer benchmark figures available for large K/V datastores with Consul 0.7 or newer?  I've been asked about storing 10-20k keys that will likely each contain 2-10 KB of JSON data.  There are certainly better ways to do this, but there is an interest in using Consul.  This would be a new use case for us.  The workload should be read-heavy with very few writes.  Stale queries should be more than sufficient.  Our largest Serf cluster has roughly 1300 members.

Thanks,
--SteveO

Armon Dadgar

Apr 26, 2017, 12:44:57 PM
to consu...@googlegroups.com, chrisde...@gmail.com
SteveO,

You will be fine with that use case. I would recommend a simple script that writes 20K keys of 10K length just to test it out. It should work as long as you have a few GB of RAM on the box, since memory is the main limitation there. We've seen folks with 1M+ keys in Consul 0.7+. Lots of changes have been made since that last reply in January 2015, and having tens of thousands of keys is a non-issue.
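
A minimal sketch of such a test script, under stated assumptions: a local agent at http://127.0.0.1:8500, a hypothetical `loadtest` prefix and key naming scheme, and an injectable `put` function (hypothetical) so the loop can be dry-run without an agent. The only Consul API used is `PUT /v1/kv/<key>`, which answers `true` on success.

```python
import urllib.request

CONSUL = "http://127.0.0.1:8500"  # assumed local agent address


def http_put(url, body):
    """PUT a value to the KV endpoint and report whether Consul accepted it."""
    req = urllib.request.Request(url, data=body, method="PUT")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.read().strip() == b"true"


def load_keys(count=20_000, value_size=10_000, prefix="loadtest", put=http_put):
    """Write `count` keys of `value_size` bytes each; return how many succeeded."""
    value = b"x" * value_size
    written = 0
    for i in range(count):
        # Zero-padded names keep the keys sorted and easy to delete by prefix.
        if put(f"{CONSUL}/v1/kv/{prefix}/key-{i:06d}", value):
            written += 1
    return written
```

Running `load_keys()` against a test cluster while watching server memory gives a rough first answer. Deleting the prefix recursively afterwards cleans up.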

Best Regards,

Armon Dadgar

--
This mailing list is governed under the HashiCorp Community Guidelines - https://www.hashicorp.com/community-guidelines.html. Behavior in violation of those guidelines may result in your removal from this mailing list.
 
GitHub Issues: https://github.com/hashicorp/consul/issues
IRC: #consul on Freenode

Rom Freiman

Apr 28, 2017, 9:55:42 AM
to Consul, chrisde...@gmail.com
Can anyone elaborate on what counts as a high write rate for Consul?
Let's say for a 50-node cluster with 3 servers.
How many keys? What key sizes?
In our environment, we see the Consul leader at approx. 500% CPU (v0.6.4, GOMAXPROCS=24) and we are trying to figure out what causes it.

Thanks

Armon Dadgar

Apr 28, 2017, 2:52:32 PM
to consu...@googlegroups.com, chrisde...@gmail.com
Hey Rom,

I would use the telemetry output to diagnose (https://www.consul.io/docs/agent/telemetry.html).
Typically, Consul is write limited by I/O and read limited by CPU. Since you are seeing such high CPU utilization, my guess is there is an incredibly high query rate and potentially an abusive client.

The number and size of keys is limited by memory. As a heuristic, the RAM available should be at least number of keys * average key size * 2-3x for overhead.
We’ve also made significant performance enhancements in the last few versions, so moving to a later version and removing GOMAXPROCS may help (no longer needed with recent builds).
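
As a worked example of the memory heuristic above, here is a small sketch. It reads "average key size" as the average size of a whole entry (key plus value), which is an assumption, and it defaults to the upper end of the 2-3x overhead range.

```python
def estimate_ram_bytes(num_keys, avg_entry_bytes, overhead_factor=3):
    """Rule-of-thumb RAM estimate: keys * average entry size * overhead."""
    return num_keys * avg_entry_bytes * overhead_factor


def to_gib(num_bytes):
    """Convert bytes to GiB for easier reading."""
    return num_bytes / 2**30


# 20K keys of roughly 10 KB each, as discussed earlier in the thread:
# 20,000 * 10,000 * 3 = 600,000,000 bytes, a bit over half a GiB.
small = estimate_ram_bytes(20_000, 10_000)

# 1M keys of roughly 1 KB each: 1,000,000 * 1,000 * 3 = 3,000,000,000 bytes.
large = estimate_ram_bytes(1_000_000, 1_000)
```

Both cases fit comfortably in "a few GB of RAM", which is consistent with the advice given earlier in the thread.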

Best Regards,
Armon Dadgar

Rom Freiman

Apr 28, 2017, 3:16:57 PM
to consu...@googlegroups.com, chrisde...@gmail.com
Thanks Armon.

A few things:

1. Do you have benchmark numbers for what we should aim for with a 50-node cluster and approx. 20K keys?
2. We also use Consul as our DNS server and for performing health checks on services. Can that affect CPU usage as well?
3. I looked into Consul telemetry; unfortunately the documentation is not very clear for v0.6.4 (it is mostly a Datadog blog post).
4. Upgrading the Consul version is a bit risky for us. It took us some time to stabilize Consul cluster self-healing (we aim for fully automatic recovery in case of node/network failures), and I'm afraid that upgrading will destabilize our solution.

Thanks again,
Rom

Armon Dadgar

Apr 28, 2017, 4:13:59 PM
to consu...@googlegroups.com, chrisde...@gmail.com
Hey Rom,

Here is a little benchmarking setup that we have, it’s quite simplistic but you can expand from here:

Both DNS and health checks affect the system, but in different ways. DNS is basically a read query, and does use CPU. You can help scale the read load by setting “dns.allow_stale” to true (https://www.consul.io/docs/agent/options.html#allow_stale). That is the default for version 0.7+, but in 0.6.4 it is false. Health checks trigger a write anytime they change (passing to failing, etc.). If you have any flappy checks, you will want to investigate those. You may also want to ensure the “check_update_interval” has not been modified from the default (https://www.consul.io/docs/agent/options.html#check_update_interval).

If you have Datadog set up, then all the telemetry will be there. If not, you can send a signal to the Consul process (SIGUSR1 on Linux, BREAK on Windows) and the agent will dump a snapshot to its standard error log. It will not interrupt the running process, so it’s safe to do for a running instance. This page helps document what the various outputs are: https://www.consul.io/docs/agent/telemetry.html

If you are seeing such high CPU, likely there is a very high query volume. If you look for the offending services that have very high counters (e.g. 1K+ lookups to a particular service per second) that can help you identify where the load is coming from. Hope that helps!
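
One way to look for those high counters is to rank the counter lines of a signal dump. The sketch below is only an illustration: it assumes the lines look like `[interval][C] 'name': Count: N ... Sum: X ...`, which is the format of the dump posted later in this thread, and the function name and the default name filter are hypothetical choices.

```python
import re
from collections import defaultdict

# Matches counter ("[C]") lines and captures the metric name and its Sum.
COUNTER_LINE = re.compile(
    r"^\[[^\]]+\]\[C\] '(?P<name>[^']+)': .*?Sum: (?P<total>[0-9.]+)"
)


def top_counters(dump_text, contains="health.service.query.", limit=5):
    """Return the busiest counters as (name, peak sum per interval) pairs."""
    peak = defaultdict(float)
    for line in dump_text.splitlines():
        match = COUNTER_LINE.match(line.strip())
        if match and contains in match["name"]:
            name = match["name"]
            # Keep the highest value seen across the dumped intervals.
            peak[name] = max(peak[name], float(match["total"]))
    return sorted(peak.items(), key=lambda item: item[1], reverse=True)[:limit]
```

Passing a different `contains` string, for example `"rpc.query"`, ranks other counter families the same way.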

Best Regards,
Armon Dadgar

Rom Freiman

May 8, 2017, 3:35:52 PM
to Consul, chrisde...@gmail.com
Hey Armon,

To try to pinpoint the problem, I recorded Consul GET requests on one of the nodes (11 nodes, 3 servers) with 'consul monitor --log-level debug | grep GET > consul.log'.
I then used httperf to replay the reads on all nodes (at different rates and levels of parallelism, in a loop).
In none of the scenarios did we manage to bring Consul to 400+% CPU. The most we could get was 30-40% CPU.
Total keys: ~11K.

In contrast, in our 50-node cluster (3 Consul servers), the Consul leader constantly uses ~200-300% CPU. Total keys: 15K.

From what you explained above, the CPU utilization is caused by reads, so why could I not reproduce it on the smaller cluster? What am I missing?

Thanks,
Rom

Armon Dadgar

May 8, 2017, 4:50:00 PM
to consu...@googlegroups.com, chrisde...@gmail.com
Rom,

It’s really quite hard to say without access to logs, telemetry, or really knowing your usage patterns.
My best guess is that the access pattern you are simulating is not representative of what the production cluster is getting.

For example, if you are mixing blocking queries (edge triggering on changes) with writes, there is much higher load because the edge triggers are firing, vs a purely read only workload that isn’t triggering the edges. Those blocking queries may not show up in the log file since they are blocked waiting on a write to trigger them, but depending on the write pattern they may be getting evaluated internally by Consul. The telemetry tells you how many requests and queries are being processed, and you can use that to see if you have a representative benchmark.

Also make sure your versions are the same between production and your benchmark, as newer versions are more optimized and higher performance. The 0.6.4 build is quite a bit behind the latest releases at this point.

Best Regards,
Armon Dadgar

Rom Freiman

May 8, 2017, 5:10:40 PM
to consu...@googlegroups.com, chrisde...@gmail.com
Which telemetry metrics should I look at: consul_rpc_query and consul_rpc_request? What does each of them mean?
They are not documented.
How would I recognize blocking queries? Which telemetry metric covers them?

Thanks,
Rom

James Phillips

May 9, 2017, 7:35:38 PM
to consu...@googlegroups.com, chrisde...@gmail.com
Hi Rom,

We are in the process of improving our telemetry and documentation so
sorry things are a bit undocumented right now.

The main counter for read queries is "consul.rpc.query" which gets
incremented for each incoming read request. There's currently no
additional info on blocking queries, so you'd need to look at your
application to see what the patterns are there. This counter will at
least let you compare your prod cluster's read rate with your test,
though.

The "consul.raft.apply" metric counts the number of Raft transactions,
which will give you an idea of write rate, which can also help
validate the fidelity of your test setup against the prod cluster. If
writes are waking up blocking reads then you'd need to set up
something similar. It would be interesting to look at "consul.fsm.*" -
these give you an idea of what kind of data is being written, which
might be a useful clue.
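
The comparison described above can be sketched as simple arithmetic on the counter sums. This is only an illustration: it assumes each dumped interval covers 10 seconds (the interval timestamps in the dump posted later in this thread are 10 seconds apart), and the function names and example numbers are hypothetical.

```python
def rates(counter_sums, interval_seconds=10.0):
    """Convert per-interval counter sums into per-second rates."""
    return {name: total / interval_seconds for name, total in counter_sums.items()}


def fidelity(prod_sums, test_sums, interval_seconds=10.0):
    """Ratio of test rate to prod rate per metric (1.0 means a matching load)."""
    prod = rates(prod_sums, interval_seconds)
    test = rates(test_sums, interval_seconds)
    return {name: test.get(name, 0.0) / prod[name] for name in prod if prod[name]}


# Hypothetical sums for one interval on each cluster.
prod = {"consul.rpc.query": 80_000, "consul.raft.apply": 400}
test = {"consul.rpc.query": 8_000, "consul.raft.apply": 0}

ratios = fidelity(prod, test)
```

With these made-up numbers the test drives a tenth of the production read rate and no writes at all, so no blocking queries would be woken up in the test.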

Later versions of Consul added this metric which gives the time it
takes to service each request (and can also provide a count of each
kind of request):

> consul.http.<verb>.<path>: This tracks how long it takes to service the given HTTP request for the given verb and path. Paths do not include details like service or key names; for these an underscore will be present as a placeholder (e.g. consul.http.GET.v1.kv._)

If you can post a gist of a dump from SIGUSR1 I could try to eyeball
some of the internal metrics and look for unusual things.

-- James

Rom Freiman

May 10, 2017, 8:37:20 AM
to consu...@googlegroups.com, chrisde...@gmail.com
Thanks James,

Attached is the Consul telemetry output from the Consul leader:

[2017-05-10 12:27:30 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.num_goroutines': 4441.000
[2017-05-10 12:27:30 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.sys_bytes': 464987008.000
[2017-05-10 12:27:30 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.malloc_count': 326108086272.000
[2017-05-10 12:27:30 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.free_count': 326107496448.000
[2017-05-10 12:27:30 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.total_gc_pause_ns': 352568246272.000
[2017-05-10 12:27:30 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.total_gc_runs': 299967.000
[2017-05-10 12:27:30 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.alloc_bytes': 158495136.000
[2017-05-10 12:27:30 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.heap_objects': 610068.000
[2017-05-10 12:27:30 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.consul.session_ttl.active': 50.000
[2017-05-10 12:27:30 +0000 UTC][C] 'consul.memberlist.udp.sent': Count: 20 Min: 34.000 Mean: 94.450 Max: 154.000 Stddev: 61.097 Sum: 1889.000 LastUpdated: 2017-05-10 12:27:39.637565737 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][C] 'consul.consul.health.service.query.openstack-nova-api': Count: 4 Sum: 4.000 LastUpdated: 2017-05-10 12:27:38.200456187 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][C] 'consul.consul.health.service.query.docker-registry': Count: 1 Sum: 1.000 LastUpdated: 2017-05-10 12:27:39.473204888 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][C] 'consul.consul.rpc.query': Count: 85198 Sum: 85198.000 LastUpdated: 2017-05-10 12:27:39.999334698 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][C] 'consul.raft.replication.appendEntries.logs.1.66.194.186:8300': Count: 398 Min: 0.000 Mean: 0.905 Max: 2.000 Stddev: 0.349 Sum: 360.000 LastUpdated: 2017-05-10 12:27:39.962118982 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][C] 'consul.consul.health.service.query.policy-store': Count: 120 Sum: 120.000 LastUpdated: 2017-05-10 12:27:39.834136441 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][C] 'consul.consul.health.service.query.redis-cache': Count: 8 Sum: 8.000 LastUpdated: 2017-05-10 12:27:35.609343155 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][C] 'consul.raft.replication.appendEntries.logs.1.66.192.5:8300': Count: 390 Min: 0.000 Mean: 0.923 Max: 2.000 Stddev: 0.327 Sum: 360.000 LastUpdated: 2017-05-10 12:27:39.977647684 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][C] 'consul.raft.apply': Count: 359 Sum: 359.000 LastUpdated: 2017-05-10 12:27:39.952323031 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][C] 'consul.consul.health.service.query.gcm': Count: 86 Sum: 86.000 LastUpdated: 2017-05-10 12:27:39.872773347 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][C] 'consul.consul.health.service.query.redis': Count: 9 Sum: 9.000 LastUpdated: 2017-05-10 12:27:39.211787524 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][C] 'consul.consul.rpc.request': Count: 80928 Sum: 80928.000 LastUpdated: 2017-05-10 12:27:39.999988258 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][C] 'consul.consul.health.service.query.rabbitmq-server': Count: 60 Sum: 60.000 LastUpdated: 2017-05-10 12:27:39.916093974 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][C] 'consul.memberlist.udp.received': Count: 20 Min: 35.000 Mean: 94.500 Max: 154.000 Stddev: 61.046 Sum: 1890.000 LastUpdated: 2017-05-10 12:27:39.637777098 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][C] 'consul.consul.health.service.query.ntpd-server': Count: 78 Sum: 78.000 LastUpdated: 2017-05-10 12:27:39.894390733 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][C] 'consul.consul.health.service.query.stratonet-frontend': Count: 6 Sum: 6.000 LastUpdated: 2017-05-10 12:27:39.820476447 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.raft.commitTime': Count: 360 Min: 7.146 Mean: 15.176 Max: 49.823 Stddev: 6.161 Sum: 5463.181 LastUpdated: 2017-05-10 12:27:39.962130451 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.raft.replication.heartbeat.1.66.194.186:8300': Count: 65 Min: 0.098 Mean: 0.581 Max: 15.099 Stddev: 2.351 Sum: 37.783 LastUpdated: 2017-05-10 12:27:39.966365789 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.raft.leader.lastContact': Count: 42 Min: 0.000 Mean: 15.238 Max: 57.000 Stddev: 14.288 Sum: 640.000 LastUpdated: 2017-05-10 12:27:39.733056218 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.raft.replication.heartbeat.1.66.192.5:8300': Count: 65 Min: 0.097 Mean: 1.561 Max: 24.750 Stddev: 4.667 Sum: 101.457 LastUpdated: 2017-05-10 12:27:39.987370278 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.raft.replication.appendEntries.rpc.1.66.194.186:8300': Count: 398 Min: 0.110 Mean: 6.547 Max: 41.401 Stddev: 5.115 Sum: 2605.819 LastUpdated: 2017-05-10 12:27:39.962112114 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.runtime.gc_pause_ns': Count: 11 Min: 1038285.000 Mean: 1407761.727 Max: 1713013.000 Stddev: 223298.423 Sum: 15485379.000 LastUpdated: 2017-05-10 12:27:39.276902858 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.consul.dns.domain_query.stratonode_c1n28': Count: 68 Min: 0.089 Mean: 0.125 Max: 0.209 Stddev: 0.029 Sum: 8.512 LastUpdated: 2017-05-10 12:27:38.599172769 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.consul.fsm.kvs.cas': Count: 101 Min: 0.032 Mean: 0.078 Max: 0.253 Stddev: 0.032 Sum: 7.896 LastUpdated: 2017-05-10 12:27:39.962245916 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.serf.queue.Event': Count: 20 Sum: 0.000 LastUpdated: 2017-05-10 12:27:39.998709789 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.serf.coordinate.adjustment-ms': Count: 10 Min: 0.036 Mean: 0.062 Max: 0.090 Stddev: 0.018 Sum: 0.625 LastUpdated: 2017-05-10 12:27:39.63782256 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.consul.fsm.kvs.set': Count: 257 Min: 0.022 Mean: 0.155 Max: 13.532 Stddev: 1.097 Sum: 39.744 LastUpdated: 2017-05-10 12:27:39.958597536 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.consul.session.renew': Count: 64 Min: 0.008 Mean: 0.016 Max: 0.079 Stddev: 0.011 Sum: 1.035 LastUpdated: 2017-05-10 12:27:39.266474255 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.serf.queue.Query': Count: 20 Sum: 0.000 LastUpdated: 2017-05-10 12:27:39.947081666 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.serf.queue.Intent': Count: 20 Sum: 0.000 LastUpdated: 2017-05-10 12:27:39.798340584 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.raft.leader.dispatchLog': Count: 352 Min: 3.234 Mean: 6.505 Max: 29.071 Stddev: 3.037 Sum: 2289.844 LastUpdated: 2017-05-10 12:27:39.956608978 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.memberlist.probeNode': Count: 10 Min: 0.327 Mean: 0.651 Max: 1.621 Stddev: 0.512 Sum: 6.513 LastUpdated: 2017-05-10 12:27:39.637828135 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.raft.fsm.apply': Count: 360 Min: 0.039 Mean: 0.216 Max: 31.220 Stddev: 1.787 Sum: 77.883 LastUpdated: 2017-05-10 12:27:39.962251505 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.consul.kvs.apply': Count: 358 Min: 7.308 Mean: 16.623 Max: 59.596 Stddev: 7.321 Sum: 5950.878 LastUpdated: 2017-05-10 12:27:39.962261617 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.raft.replication.appendEntries.rpc.1.66.192.5:8300': Count: 390 Min: 0.103 Mean: 20.394 Max: 76.179 Stddev: 14.745 Sum: 7953.496 LastUpdated: 2017-05-10 12:27:39.977643881 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.memberlist.gossip': Count: 70 Min: 0.003 Mean: 0.299 Max: 18.431 Stddev: 2.209 Sum: 20.930 LastUpdated: 2017-05-10 12:27:39.837499415 +0000 UTC
[2017-05-10 12:27:30 +0000 UTC][S] 'consul.consul.fsm.coordinate.batch-update': Count: 2 Min: 0.098 Mean: 0.163 Max: 0.228 Stddev: 0.092 Sum: 0.326 LastUpdated: 2017-05-10 12:27:35.628232172 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.sys_bytes': 464987008.000
[2017-05-10 12:27:40 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.free_count': 326120669184.000
[2017-05-10 12:27:40 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.heap_objects': 1312637.000
[2017-05-10 12:27:40 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.total_gc_pause_ns': 352581582848.000
[2017-05-10 12:27:40 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.consul.session_ttl.active': 50.000
[2017-05-10 12:27:40 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.num_goroutines': 4459.000
[2017-05-10 12:27:40 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.alloc_bytes': 253277024.000
[2017-05-10 12:27:40 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.malloc_count': 326121979904.000
[2017-05-10 12:27:40 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.total_gc_runs': 299977.000
[2017-05-10 12:27:40 +0000 UTC][C] 'consul.consul.rpc.query': Count: 85067 Sum: 85067.000 LastUpdated: 2017-05-10 12:27:49.999960144 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][C] 'consul.raft.apply': Count: 359 Sum: 359.000 LastUpdated: 2017-05-10 12:27:49.951617785 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][C] 'consul.memberlist.udp.sent': Count: 22 Min: 34.000 Mean: 99.818 Max: 154.000 Stddev: 60.751 Sum: 2196.000 LastUpdated: 2017-05-10 12:27:49.837083167 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][C] 'consul.consul.health.service.query.openstack-nova-api': Count: 12 Sum: 12.000 LastUpdated: 2017-05-10 12:27:49.878305582 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][C] 'consul.raft.replication.appendEntries.logs.1.66.194.186:8300': Count: 392 Min: 0.000 Mean: 0.916 Max: 2.000 Stddev: 0.351 Sum: 359.000 LastUpdated: 2017-05-10 12:27:49.964418117 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][C] 'consul.raft.replication.appendEntries.logs.1.66.192.5:8300': Count: 392 Min: 0.000 Mean: 0.916 Max: 2.000 Stddev: 0.351 Sum: 359.000 LastUpdated: 2017-05-10 12:27:49.98102864 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][C] 'consul.consul.health.service.query.ntpd-server': Count: 80 Sum: 80.000 LastUpdated: 2017-05-10 12:27:49.916670605 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][C] 'consul.consul.health.service.query.gcm': Count: 86 Sum: 86.000 LastUpdated: 2017-05-10 12:27:49.874046844 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][C] 'consul.memberlist.udp.received': Count: 22 Min: 35.000 Mean: 89.091 Max: 154.000 Stddev: 60.648 Sum: 1960.000 LastUpdated: 2017-05-10 12:27:49.837012312 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][C] 'consul.consul.health.service.query.redis': Count: 10 Sum: 10.000 LastUpdated: 2017-05-10 12:27:49.231903518 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][C] 'consul.consul.rpc.request': Count: 81178 Sum: 81178.000 LastUpdated: 2017-05-10 12:27:49.999995466 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][C] 'consul.consul.health.service.query.policy-store': Count: 120 Sum: 120.000 LastUpdated: 2017-05-10 12:27:49.873096512 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][C] 'consul.consul.health.service.query.rabbitmq-server': Count: 54 Sum: 54.000 LastUpdated: 2017-05-10 12:27:49.391434012 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][C] 'consul.consul.health.service.query.docker-registry': Count: 3 Sum: 3.000 LastUpdated: 2017-05-10 12:27:46.321694793 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][C] 'consul.consul.health.service.query.stratonet-frontend': Count: 4 Sum: 4.000 LastUpdated: 2017-05-10 12:27:41.364572611 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.memberlist.gossip': Count: 70 Min: 0.003 Mean: 0.006 Max: 0.010 Stddev: 0.002 Sum: 0.425 LastUpdated: 2017-05-10 12:27:49.837537504 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.consul.dns.domain_query.stratonode_c1n28': Count: 67 Min: 0.080 Mean: 0.304 Max: 11.429 Stddev: 1.381 Sum: 20.356 LastUpdated: 2017-05-10 12:27:48.52807155 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.consul.fsm.coordinate.batch-update': Count: 2 Min: 0.143 Mean: 0.197 Max: 0.252 Stddev: 0.077 Sum: 0.395 LastUpdated: 2017-05-10 12:27:45.653438122 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.raft.replication.appendEntries.rpc.1.66.192.5:8300': Count: 392 Min: 0.110 Mean: 26.886 Max: 143.035 Stddev: 24.466 Sum: 10539.189 LastUpdated: 2017-05-10 12:27:49.981024464 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.raft.replication.heartbeat.1.66.194.186:8300': Count: 67 Min: 0.102 Mean: 0.704 Max: 13.436 Stddev: 2.000 Sum: 47.192 LastUpdated: 2017-05-10 12:27:49.874683785 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.raft.replication.heartbeat.1.66.192.5:8300': Count: 67 Min: 0.102 Mean: 1.299 Max: 21.951 Stddev: 3.755 Sum: 87.002 LastUpdated: 2017-05-10 12:27:49.931610906 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.raft.leader.lastContact': Count: 42 Min: 0.000 Mean: 21.214 Max: 74.000 Stddev: 17.986 Sum: 891.000 LastUpdated: 2017-05-10 12:27:49.683955854 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.runtime.gc_pause_ns': Count: 10 Min: 1209334.000 Mean: 1334783.800 Max: 1485276.000 Stddev: 87559.391 Sum: 13347838.000 LastUpdated: 2017-05-10 12:27:49.28691643 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.serf.queue.Intent': Count: 20 Sum: 0.000 LastUpdated: 2017-05-10 12:27:49.799434086 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.serf.coordinate.adjustment-ms': Count: 10 Min: 0.049 Mean: 0.130 Max: 0.310 Stddev: 0.105 Sum: 1.300 LastUpdated: 2017-05-10 12:27:49.6379329 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.raft.leader.dispatchLog': Count: 350 Min: 3.309 Mean: 7.130 Max: 36.046 Stddev: 3.766 Sum: 2495.485 LastUpdated: 2017-05-10 12:27:49.95805014 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.raft.replication.appendEntries.rpc.1.66.194.186:8300': Count: 392 Min: 0.118 Mean: 6.122 Max: 27.720 Stddev: 4.184 Sum: 2399.753 LastUpdated: 2017-05-10 12:27:49.964411679 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.consul.fsm.kvs.cas': Count: 100 Min: 0.030 Mean: 0.119 Max: 3.898 Stddev: 0.383 Sum: 11.938 LastUpdated: 2017-05-10 12:27:49.878270928 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.raft.fsm.apply': Count: 359 Min: 0.041 Mean: 0.159 Max: 10.261 Stddev: 0.747 Sum: 57.238 LastUpdated: 2017-05-10 12:27:49.964525912 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.consul.session.renew': Count: 67 Min: 0.009 Mean: 0.017 Max: 0.065 Stddev: 0.009 Sum: 1.122 LastUpdated: 2017-05-10 12:27:49.971383893 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.memberlist.probeNode': Count: 10 Min: 0.312 Mean: 0.996 Max: 4.158 Stddev: 1.306 Sum: 9.956 LastUpdated: 2017-05-10 12:27:49.637937276 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.consul.dns.ptr_query.stratonode_c1n28': Count: 1 Sum: 0.161 LastUpdated: 2017-05-10 12:27:47.636189214 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.raft.commitTime': Count: 359 Min: 8.050 Mean: 15.685 Max: 54.047 Stddev: 7.032 Sum: 5630.822 LastUpdated: 2017-05-10 12:27:49.964430396 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.consul.kvs.apply': Count: 357 Min: 8.205 Mean: 16.914 Max: 58.241 Stddev: 7.812 Sum: 6038.415 LastUpdated: 2017-05-10 12:27:49.964534928 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.serf.queue.Query': Count: 20 Sum: 0.000 LastUpdated: 2017-05-10 12:27:49.991045287 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.consul.fsm.kvs.set': Count: 257 Min: 0.025 Mean: 0.094 Max: 9.147 Stddev: 0.568 Sum: 24.089 LastUpdated: 2017-05-10 12:27:49.964520941 +0000 UTC
[2017-05-10 12:27:40 +0000 UTC][S] 'consul.serf.queue.Event': Count: 19 Sum: 0.000 LastUpdated: 2017-05-10 12:27:49.197964168 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.alloc_bytes': 286871680.000
[2017-05-10 12:27:50 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.free_count': 326134038528.000
[2017-05-10 12:27:50 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.heap_objects': 1740808.000
[2017-05-10 12:27:50 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.num_goroutines': 4459.000
[2017-05-10 12:27:50 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.sys_bytes': 464987008.000
[2017-05-10 12:27:50 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.malloc_count': 326135775232.000
[2017-05-10 12:27:50 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.total_gc_pause_ns': 352598491136.000
[2017-05-10 12:27:50 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.total_gc_runs': 299987.000
[2017-05-10 12:27:50 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.consul.session_ttl.active': 50.000
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.consul.rpc.request': Count: 81583 Sum: 81583.000 LastUpdated: 2017-05-10 12:27:59.997486667 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.memberlist.udp.received': Count: 21 Min: 35.000 Mean: 91.667 Max: 154.000 Stddev: 60.900 Sum: 1925.000 LastUpdated: 2017-05-10 12:27:59.637809811 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.consul.health.service.query.docker-registry': Count: 8 Sum: 8.000 LastUpdated: 2017-05-10 12:27:59.817120678 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.consul.health.service.query.redis-cache': Count: 92 Sum: 92.000 LastUpdated: 2017-05-10 12:27:56.754715796 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.memberlist.tcp.connect': Count: 1 Sum: 1.000 LastUpdated: 2017-05-10 12:27:56.769737413 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.raft.replication.appendEntries.logs.1.66.194.186:8300': Count: 379 Min: 0.000 Mean: 0.929 Max: 3.000 Stddev: 0.375 Sum: 352.000 LastUpdated: 2017-05-10 12:27:59.955949566 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.consul.health.service.query.gcm': Count: 82 Sum: 82.000 LastUpdated: 2017-05-10 12:27:59.875970989 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.consul.health.service.query.openstack-nova-api': Count: 8 Sum: 8.000 LastUpdated: 2017-05-10 12:27:55.523750648 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.consul.health.service.query.redis': Count: 10 Sum: 10.000 LastUpdated: 2017-05-10 12:27:59.877917233 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.consul.health.service.query.rabbitmq-server': Count: 64 Sum: 64.000 LastUpdated: 2017-05-10 12:27:59.403294077 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.memberlist.udp.sent': Count: 21 Min: 34.000 Mean: 97.190 Max: 154.000 Stddev: 61.055 Sum: 2041.000 LastUpdated: 2017-05-10 12:27:59.637562499 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.raft.apply': Count: 353 Sum: 353.000 LastUpdated: 2017-05-10 12:27:59.988946234 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.raft.replication.appendEntries.logs.1.66.192.5:8300': Count: 383 Min: 0.000 Mean: 0.919 Max: 3.000 Stddev: 0.385 Sum: 352.000 LastUpdated: 2017-05-10 12:27:59.991203292 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.consul.health.service.query.ntpd-server': Count: 81 Sum: 81.000 LastUpdated: 2017-05-10 12:27:59.871616473 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.consul.health.service.query.stratonet-frontend': Count: 4 Sum: 4.000 LastUpdated: 2017-05-10 12:27:51.380358881 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.consul.rpc.query': Count: 85091 Sum: 85091.000 LastUpdated: 2017-05-10 12:27:59.999954949 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.consul.health.service.query.policy-store': Count: 120 Sum: 120.000 LastUpdated: 2017-05-10 12:27:59.915227669 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.memberlist.tcp.sent': Count: 1 Sum: 5383.000 LastUpdated: 2017-05-10 12:27:56.770474577 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.raft.replication.heartbeat.1.66.192.5:8300': Count: 68 Min: 0.103 Mean: 1.505 Max: 26.926 Stddev: 4.118 Sum: 102.370 LastUpdated: 2017-05-10 12:27:59.878420001 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.consul.session.renew': Count: 68 Min: 0.009 Mean: 0.017 Max: 0.043 Stddev: 0.006 Sum: 1.136 LastUpdated: 2017-05-10 12:27:59.998732065 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.memberlist.probeNode': Count: 10 Min: 0.359 Mean: 0.434 Max: 0.519 Stddev: 0.046 Sum: 4.341 LastUpdated: 2017-05-10 12:27:59.637859991 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.consul.fsm.coordinate.batch-update': Count: 2 Min: 0.088 Mean: 0.122 Max: 0.156 Stddev: 0.048 Sum: 0.243 LastUpdated: 2017-05-10 12:27:55.695750346 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.memberlist.pushPullNode': Count: 1 Sum: 3.899 LastUpdated: 2017-05-10 12:27:56.773476661 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.raft.replication.appendEntries.rpc.1.66.194.186:8300': Count: 379 Min: 0.128 Mean: 6.336 Max: 23.939 Stddev: 3.805 Sum: 2401.283 LastUpdated: 2017-05-10 12:27:59.95594326 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.memberlist.gossip': Count: 70 Min: 0.003 Mean: 0.008 Max: 0.078 Stddev: 0.009 Sum: 0.549 LastUpdated: 2017-05-10 12:27:59.837517556 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.consul.fsm.kvs.set': Count: 249 Min: 0.022 Mean: 0.056 Max: 0.206 Stddev: 0.023 Sum: 13.958 LastUpdated: 2017-05-10 12:27:59.928685296 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.consul.fsm.kvs.cas': Count: 101 Min: 0.032 Mean: 0.075 Max: 0.171 Stddev: 0.027 Sum: 7.600 LastUpdated: 2017-05-10 12:27:59.95608219 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.raft.leader.lastContact': Count: 42 Min: 1.000 Mean: 18.500 Max: 50.000 Stddev: 14.813 Sum: 777.000 LastUpdated: 2017-05-10 12:27:59.725196383 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.runtime.gc_pause_ns': Count: 10 Min: 1203588.000 Mean: 1690376.800 Max: 2339817.000 Stddev: 382307.532 Sum: 16903768.000 LastUpdated: 2017-05-10 12:27:59.306493278 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.serf.coordinate.adjustment-ms': Count: 10 Min: 0.086 Mean: 0.114 Max: 0.191 Stddev: 0.037 Sum: 1.139 LastUpdated: 2017-05-10 12:27:59.637855421 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.serf.queue.Query': Count: 20 Sum: 0.000 LastUpdated: 2017-05-10 12:27:59.992237254 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.raft.fsm.apply': Count: 352 Min: 0.040 Mean: 0.090 Max: 0.467 Stddev: 0.038 Sum: 31.751 LastUpdated: 2017-05-10 12:27:59.956087922 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.consul.kvs.apply': Count: 350 Min: 7.082 Mean: 16.153 Max: 59.595 Stddev: 7.235 Sum: 5653.689 LastUpdated: 2017-05-10 12:27:59.956099426 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.serf.queue.Event': Count: 20 Sum: 0.000 LastUpdated: 2017-05-10 12:27:59.202280198 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.raft.commitTime': Count: 352 Min: 6.972 Mean: 14.958 Max: 51.726 Stddev: 5.613 Sum: 5265.270 LastUpdated: 2017-05-10 12:27:59.955963483 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.raft.replication.appendEntries.rpc.1.66.192.5:8300': Count: 383 Min: 0.114 Mean: 17.256 Max: 74.347 Stddev: 12.688 Sum: 6608.957 LastUpdated: 2017-05-10 12:27:59.991199175 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.serf.queue.Intent': Count: 20 Sum: 0.000 LastUpdated: 2017-05-10 12:27:59.800585342 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.raft.replication.heartbeat.1.66.194.186:8300': Count: 68 Min: 0.094 Mean: 0.649 Max: 12.908 Stddev: 2.194 Sum: 44.158 LastUpdated: 2017-05-10 12:27:59.992522232 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.raft.leader.dispatchLog': Count: 341 Min: 3.298 Mean: 6.603 Max: 28.147 Stddev: 3.182 Sum: 2251.689 LastUpdated: 2017-05-10 12:27:59.992496422 +0000 UTC
[2017-05-10 12:27:50 +0000 UTC][S] 'consul.consul.dns.domain_query.stratonode_c1n28': Count: 65 Min: 0.088 Mean: 0.138 Max: 0.383 Stddev: 0.054 Sum: 8.998 LastUpdated: 2017-05-10 12:27:59.110941516 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.alloc_bytes': 172795296.000
[2017-05-10 12:28:00 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.sys_bytes': 464987008.000
[2017-05-10 12:28:00 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.free_count': 326148587520.000
[2017-05-10 12:28:00 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.consul.session_ttl.active': 50.000
[2017-05-10 12:28:00 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.num_goroutines': 4441.000
[2017-05-10 12:28:00 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.malloc_count': 326149275648.000
[2017-05-10 12:28:00 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.heap_objects': 710778.000
[2017-05-10 12:28:00 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.total_gc_pause_ns': 352614383616.000
[2017-05-10 12:28:00 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.total_gc_runs': 299998.000
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.raft.replication.appendEntries.logs.1.66.194.186:8300': Count: 400 Min: 0.000 Mean: 0.912 Max: 2.000 Stddev: 0.324 Sum: 365.000 LastUpdated: 2017-05-10 12:28:09.992360931 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.memberlist.udp.received': Count: 24 Min: 35.000 Mean: 84.583 Max: 154.000 Stddev: 59.930 Sum: 2030.000 LastUpdated: 2017-05-10 12:28:09.89805019 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.consul.health.service.query.rabbitmq-server': Count: 56 Sum: 56.000 LastUpdated: 2017-05-10 12:28:08.953774641 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.consul.health.service.query.redis': Count: 9 Sum: 9.000 LastUpdated: 2017-05-10 12:28:09.407269545 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.consul.rpc.request': Count: 81156 Sum: 81156.000 LastUpdated: 2017-05-10 12:28:09.999970169 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.raft.replication.appendEntries.logs.1.66.192.5:8300': Count: 400 Min: 0.000 Mean: 0.910 Max: 2.000 Stddev: 0.327 Sum: 364.000 LastUpdated: 2017-05-10 12:28:09.989792992 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.memberlist.udp.sent': Count: 24 Min: 34.000 Mean: 104.375 Max: 154.000 Stddev: 59.980 Sum: 2505.000 LastUpdated: 2017-05-10 12:28:09.898125003 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.consul.health.service.query.policy-store': Count: 122 Sum: 122.000 LastUpdated: 2017-05-10 12:28:09.931787044 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.consul.health.service.query.docker-registry': Count: 5 Sum: 5.000 LastUpdated: 2017-05-10 12:28:09.690052857 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.consul.health.service.query.redis-cache': Count: 8 Sum: 8.000 LastUpdated: 2017-05-10 12:28:06.890925302 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.consul.rpc.query': Count: 85208 Sum: 85208.000 LastUpdated: 2017-05-10 12:28:09.999991381 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.consul.health.service.query.ntpd-server': Count: 78 Sum: 78.000 LastUpdated: 2017-05-10 12:28:09.969450472 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.consul.health.service.query.openstack-nova-api': Count: 8 Sum: 8.000 LastUpdated: 2017-05-10 12:28:08.772904214 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.raft.apply': Count: 364 Sum: 364.000 LastUpdated: 2017-05-10 12:28:09.983640615 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.consul.health.service.query.gcm': Count: 80 Sum: 80.000 LastUpdated: 2017-05-10 12:28:09.878594788 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.consul.health.service.query.stratonet-frontend': Count: 4 Sum: 4.000 LastUpdated: 2017-05-10 12:28:01.387917428 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.memberlist.probeNode': Count: 10 Min: 0.324 Mean: 0.774 Max: 2.418 Stddev: 0.695 Sum: 7.736 LastUpdated: 2017-05-10 12:28:09.637825526 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.raft.fsm.apply': Count: 365 Min: 0.039 Mean: 0.235 Max: 32.310 Stddev: 1.815 Sum: 85.834 LastUpdated: 2017-05-10 12:28:09.99251332 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.raft.replication.heartbeat.1.66.192.5:8300': Count: 67 Min: 0.103 Mean: 1.870 Max: 28.219 Stddev: 5.245 Sum: 125.270 LastUpdated: 2017-05-10 12:28:09.947224596 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.serf.queue.Intent': Count: 20 Sum: 0.000 LastUpdated: 2017-05-10 12:28:09.803510165 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.consul.fsm.kvs.cas': Count: 100 Min: 0.043 Mean: 0.083 Max: 0.795 Stddev: 0.077 Sum: 8.306 LastUpdated: 2017-05-10 12:28:09.992508745 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.consul.fsm.coordinate.batch-update': Count: 2 Min: 0.118 Mean: 0.150 Max: 0.182 Stddev: 0.045 Sum: 0.300 LastUpdated: 2017-05-10 12:28:05.728988944 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.raft.commitTime': Count: 365 Min: 7.176 Mean: 14.787 Max: 72.213 Stddev: 6.497 Sum: 5397.096 LastUpdated: 2017-05-10 12:28:09.992376727 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.raft.replication.appendEntries.rpc.1.66.192.5:8300': Count: 400 Min: 0.118 Mean: 17.995 Max: 77.388 Stddev: 12.525 Sum: 7197.885 LastUpdated: 2017-05-10 12:28:09.989786899 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.memberlist.gossip': Count: 70 Min: 0.002 Mean: 0.281 Max: 10.356 Stddev: 1.509 Sum: 19.652 LastUpdated: 2017-05-10 12:28:09.839927544 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.raft.leader.dispatchLog': Count: 359 Min: 3.169 Mean: 6.559 Max: 38.302 Stddev: 3.536 Sum: 2354.571 LastUpdated: 2017-05-10 12:28:09.98861913 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.serf.coordinate.adjustment-ms': Count: 10 Min: 0.077 Mean: 0.170 Max: 0.327 Stddev: 0.097 Sum: 1.699 LastUpdated: 2017-05-10 12:28:09.63782136 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.raft.replication.heartbeat.1.66.194.186:8300': Count: 69 Min: 0.088 Mean: 0.802 Max: 23.346 Stddev: 3.282 Sum: 55.316 LastUpdated: 2017-05-10 12:28:09.92156347 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.raft.leader.lastContact': Count: 42 Min: 0.000 Mean: 17.524 Max: 83.000 Stddev: 19.529 Sum: 736.000 LastUpdated: 2017-05-10 12:28:09.739360141 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.raft.replication.appendEntries.rpc.1.66.194.186:8300': Count: 400 Min: 0.108 Mean: 6.241 Max: 33.982 Stddev: 4.227 Sum: 2496.580 LastUpdated: 2017-05-10 12:28:09.992353728 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.consul.fsm.kvs.set': Count: 263 Min: 0.023 Mean: 0.063 Max: 0.682 Stddev: 0.051 Sum: 16.607 LastUpdated: 2017-05-10 12:28:09.981530156 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.consul.kvs.apply': Count: 363 Min: 7.366 Mean: 16.054 Max: 72.497 Stddev: 8.026 Sum: 5827.665 LastUpdated: 2017-05-10 12:28:09.992520119 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.serf.queue.Query': Count: 20 Sum: 0.000 LastUpdated: 2017-05-10 12:28:09.993190409 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.serf.queue.Event': Count: 20 Sum: 0.000 LastUpdated: 2017-05-10 12:28:09.249490019 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.consul.session.renew': Count: 65 Min: 0.008 Mean: 0.015 Max: 0.031 Stddev: 0.005 Sum: 0.952 LastUpdated: 2017-05-10 12:28:09.331789323 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.runtime.gc_pause_ns': Count: 11 Min: 1247240.000 Mean: 1443793.273 Max: 1728749.000 Stddev: 165740.931 Sum: 15881726.000 LastUpdated: 2017-05-10 12:28:09.317654339 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][S] 'consul.consul.dns.domain_query.stratonode_c1n28': Count: 66 Min: 0.086 Mean: 0.144 Max: 0.294 Stddev: 0.043 Sum: 9.484 LastUpdated: 2017-05-10 12:28:08.54777132 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.num_goroutines': 4447.000
[2017-05-10 12:28:10 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.alloc_bytes': 232524736.000
[2017-05-10 12:28:10 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.sys_bytes': 464987008.000
[2017-05-10 12:28:10 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.heap_objects': 1373216.000
[2017-05-10 12:28:10 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.total_gc_pause_ns': 352632766464.000
[2017-05-10 12:28:10 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.malloc_count': 326163267584.000
[2017-05-10 12:28:10 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.free_count': 326161891328.000
[2017-05-10 12:28:10 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.runtime.total_gc_runs': 300008.000
[2017-05-10 12:28:10 +0000 UTC][G] 'consul.stratonode_c1n28.node.strato.consul.session_ttl.active': 50.000
[2017-05-10 12:28:10 +0000 UTC][C] 'consul.memberlist.udp.received': Count: 12 Min: 35.000 Mean: 134.167 Max: 154.000 Stddev: 46.321 Sum: 1610.000 LastUpdated: 2017-05-10 12:28:19.637780402 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][C] 'consul.consul.rpc.request': Count: 81536 Sum: 81536.000 LastUpdated: 2017-05-10 12:28:19.999936667 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][C] 'consul.raft.replication.appendEntries.logs.1.66.192.5:8300': Count: 388 Min: 0.000 Mean: 0.910 Max: 2.000 Stddev: 0.344 Sum: 353.000 LastUpdated: 2017-05-10 12:28:19.993878698 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][C] 'consul.raft.apply': Count: 353 Sum: 353.000 LastUpdated: 2017-05-10 12:28:19.979684372 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][C] 'consul.consul.health.service.query.policy-store': Count: 120 Sum: 120.000 LastUpdated: 2017-05-10 12:28:19.969006374 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][C] 'consul.consul.health.service.query.rabbitmq-server': Count: 48 Sum: 48.000 LastUpdated: 2017-05-10 12:28:19.39535565 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][C] 'consul.consul.rpc.query': Count: 85418 Sum: 85418.000 LastUpdated: 2017-05-10 12:28:19.999966371 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][C] 'consul.consul.health.service.query.ntpd-server': Count: 80 Sum: 80.000 LastUpdated: 2017-05-10 12:28:19.832095642 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][C] 'consul.memberlist.udp.sent': Count: 12 Min: 34.000 Mean: 54.583 Max: 154.000 Stddev: 46.440 Sum: 655.000 LastUpdated: 2017-05-10 12:28:19.637535533 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][C] 'consul.consul.health.service.query.redis': Count: 10 Sum: 10.000 LastUpdated: 2017-05-10 12:28:19.41142522 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][C] 'consul.consul.health.service.query.docker-registry': Count: 6 Sum: 6.000 LastUpdated: 2017-05-10 12:28:14.234113139 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][C] 'consul.consul.health.service.query.openstack-nova-api': Count: 6 Sum: 6.000 LastUpdated: 2017-05-10 12:28:18.765616023 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][C] 'consul.raft.replication.appendEntries.logs.1.66.194.186:8300': Count: 386 Min: 0.000 Mean: 0.915 Max: 2.000 Stddev: 0.339 Sum: 353.000 LastUpdated: 2017-05-10 12:28:19.992898636 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][C] 'consul.consul.health.service.query.gcm': Count: 92 Sum: 92.000 LastUpdated: 2017-05-10 12:28:19.880553732 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][C] 'consul.consul.health.service.query.stratonet-frontend': Count: 4 Sum: 4.000 LastUpdated: 2017-05-10 12:28:11.403947648 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.consul.fsm.kvs.cas': Count: 100 Min: 0.039 Mean: 0.075 Max: 0.170 Stddev: 0.026 Sum: 7.531 LastUpdated: 2017-05-10 12:28:19.9158602 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.raft.leader.lastContact': Count: 42 Min: 0.000 Mean: 19.643 Max: 60.000 Stddev: 17.548 Sum: 825.000 LastUpdated: 2017-05-10 12:28:19.652969814 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.runtime.gc_pause_ns': Count: 10 Min: 1283707.000 Mean: 1839514.600 Max: 4476970.000 Stddev: 978446.310 Sum: 18395146.000 LastUpdated: 2017-05-10 12:28:19.331869925 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.memberlist.probeNode': Count: 10 Min: 0.302 Mean: 1.785 Max: 13.514 Stddev: 4.126 Sum: 17.849 LastUpdated: 2017-05-10 12:28:19.637843085 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.consul.fsm.coordinate.batch-update': Count: 2 Min: 0.093 Mean: 0.139 Max: 0.186 Stddev: 0.066 Sum: 0.279 LastUpdated: 2017-05-10 12:28:15.764279213 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.raft.replication.appendEntries.rpc.1.66.194.186:8300': Count: 386 Min: 0.108 Mean: 6.291 Max: 36.982 Stddev: 4.528 Sum: 2428.384 LastUpdated: 2017-05-10 12:28:19.992886455 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.raft.commitTime': Count: 353 Min: 6.760 Mean: 14.698 Max: 58.141 Stddev: 6.559 Sum: 5188.406 LastUpdated: 2017-05-10 12:28:19.992909217 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.raft.fsm.apply': Count: 353 Min: 0.038 Mean: 0.139 Max: 12.832 Stddev: 0.692 Sum: 49.185 LastUpdated: 2017-05-10 12:28:19.993007032 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.consul.kvs.apply': Count: 351 Min: 6.868 Mean: 15.714 Max: 58.321 Stddev: 7.465 Sum: 5515.755 LastUpdated: 2017-05-10 12:28:19.993013476 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.consul.fsm.kvs.set': Count: 251 Min: 0.022 Mean: 0.061 Max: 0.175 Stddev: 0.029 Sum: 15.277 LastUpdated: 2017-05-10 12:28:19.993003963 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.raft.replication.heartbeat.1.66.194.186:8300': Count: 64 Min: 0.096 Mean: 0.540 Max: 11.403 Stddev: 1.616 Sum: 34.536 LastUpdated: 2017-05-10 12:28:19.843824792 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.serf.coordinate.adjustment-ms': Count: 10 Min: 0.110 Mean: 0.132 Max: 0.147 Stddev: 0.012 Sum: 1.320 LastUpdated: 2017-05-10 12:28:19.637839013 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.consul.dns.domain_query.stratonode_c1n28': Count: 62 Min: 0.095 Mean: 0.130 Max: 0.212 Stddev: 0.028 Sum: 8.087 LastUpdated: 2017-05-10 12:28:18.519318796 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.raft.leader.dispatchLog': Count: 346 Min: 3.361 Mean: 6.322 Max: 40.028 Stddev: 3.330 Sum: 2187.381 LastUpdated: 2017-05-10 12:28:19.986453765 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.serf.queue.Intent': Count: 20 Sum: 0.000 LastUpdated: 2017-05-10 12:28:19.804605086 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.serf.queue.Query': Count: 20 Sum: 0.000 LastUpdated: 2017-05-10 12:28:19.994968117 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.raft.replication.appendEntries.rpc.1.66.192.5:8300': Count: 388 Min: 0.114 Mean: 21.837 Max: 125.461 Stddev: 19.901 Sum: 8472.816 LastUpdated: 2017-05-10 12:28:19.993874928 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.memberlist.gossip': Count: 70 Min: 0.003 Mean: 0.332 Max: 11.107 Stddev: 1.856 Sum: 23.260 LastUpdated: 2017-05-10 12:28:19.837516492 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.consul.session.renew': Count: 66 Min: 0.010 Mean: 0.017 Max: 0.044 Stddev: 0.008 Sum: 1.114 LastUpdated: 2017-05-10 12:28:19.856947489 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.raft.replication.heartbeat.1.66.192.5:8300': Count: 65 Min: 0.100 Mean: 1.383 Max: 22.077 Stddev: 4.192 Sum: 89.897 LastUpdated: 2017-05-10 12:28:19.975301528 +0000 UTC
[2017-05-10 12:28:10 +0000 UTC][S] 'consul.serf.queue.Event': Count: 20 Sum: 0.000 LastUpdated: 2017-05-10 12:28:19.250573806 +0000 UTC



Thanks,
Rom

Motty Porat

May 11, 2017, 1:08:19 PM
to Consul, chrisde...@gmail.com
* Sorry if you saw the earlier mess in the thread. I deleted my previous post in order to update it.

Hi James.
I am with Rom.
Thank you for all the support!
I monitored our KV usage on an 11-node and a 50-node cluster. If all the math is correct, the number of KV GETs per second (from all nodes together) is:

#Nodes   KV GETs/sec   Typical server %CPU   rpc.requests/sec   rpc.queries/sec (see note 1)
11       600           20-40%                1500               700
50       8670          250-500%              8100               8500

Notes:
1. The number from the telemetry is per 10 seconds, isn't it? So I divided by 10.
2. Yes, we read in a crappy way with square complexity; I have some fixes in mind.
3. Each node is waiting for ~20 blocking queries at any given moment.
 
So enjoy this data.
I'll let you know if & when we straighten up our querying (see note 2).
Motty

James Phillips

May 18, 2017, 11:23:57 AM
to consu...@googlegroups.com, Chris Hartwig
Thanks for the follow up note!

> 1. The number from the telemetry is per 10 seconds, isn't it? So I divided by 10.

That's correct - these are accumulated over 10 second intervals.
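
As a rough sketch of that conversion, the snippet below parses counter (`[C]`) lines in the format of the telemetry dump posted earlier in the thread and divides each `Sum` by the 10-second interval. The function name `per_second_rates` and the regular expression are illustrative assumptions, not part of Consul.

```python
import re

# Matches counter ([C]) lines of the telemetry dump and captures name and Sum.
COUNTER_LINE = re.compile(r"\[C\] '(?P<name>[^']+)':.*?\bSum: (?P<sum>[\d.]+)")

INTERVAL_SECONDS = 10  # each dump line accumulates over one 10-second interval


def per_second_rates(dump_text, interval=INTERVAL_SECONDS):
    """Return {metric name: [per-second rate for each interval seen]}."""
    rates = {}
    for line in dump_text.splitlines():
        match = COUNTER_LINE.search(line)
        if match:
            rate = float(match["sum"]) / interval
            rates.setdefault(match["name"], []).append(rate)
    return rates


# Two counter lines copied from the dump above.
SAMPLE = """\
[2017-05-10 12:27:50 +0000 UTC][C] 'consul.consul.rpc.query': Count: 85091 Sum: 85091.000 LastUpdated: 2017-05-10 12:27:59.999954949 +0000 UTC
[2017-05-10 12:28:00 +0000 UTC][C] 'consul.consul.rpc.request': Count: 81156 Sum: 81156.000 LastUpdated: 2017-05-10 12:28:09.999970169 +0000 UTC
"""

rates = per_second_rates(SAMPLE)
print(rates["consul.consul.rpc.query"])    # [8509.1]
print(rates["consul.consul.rpc.request"])  # [8115.6]
```

These two values line up with the roughly 8500 queries/sec and 8100 requests/sec reported for the 50-node cluster.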

> 2. Yes, we read in a crappy way with square complexity; I have some fixes in mind.

Your 8k/sec is roughly in line with our bench results on some 8 core
machines - https://github.com/hashicorp/consul/blob/master/bench/results-0.7.1.md.

> I'll let you know if & when we straighten up our querying (see note 2).

Sounds good.

Rom Freiman

Nov 21, 2017, 5:25:02 AM
to Consul
Hey Armon,
We're back to researching our Consul load.
I have a few questions:
1. What is better performance wise for consul - long blocking queries (few min), short with retries (30s), or probing it periodically?
2. We use consul exec a lot. Recently we noticed that it adds a lot of gossip load - not even sure if and how it affects consul performance.
3. We run GET /v1/agent/services every 3 seconds, while we have about 120 services registered. Once in few min, the query times out after 20 sec. No idea how to debug it.
4. In general, I cannot pinpoint the relevant metrics to look at. I prepared pretty extended grafana monitor dashboard, but 

Thanks,
Rom

Rom Freiman

Nov 21, 2017, 5:25:46 AM
to Consul
BTW, we use Consul 0.8.4, with 10 physical servers and 3 Consul servers.

James Phillips

Jan 4, 2018, 9:08:32 PM
to consu...@googlegroups.com
Sorry for the late reply on this one!

> 1. What is better performance wise for consul - long blocking queries (few min), short with retries (30s), or probing it periodically?

This depends a bit on the use case. If the churn is low then blocking
queries are definitely better as they are cheap internally and will
only wake as necessary. If the churn is high and you don't care to be
updated as often as possible then a slow poll is probably better. One
easy strategy is to wait a fixed period after a blocking query wakes
up in order to rate limit.
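
A minimal sketch of that rate-limiting strategy, assuming the standard KV endpoint with the `index` and `wait` query parameters and the `X-Consul-Index` response header. The helper names (`fetch_kv`, `watch`), the 10-second pause, the local agent address, and the key in the usage note are illustrative assumptions.

```python
import json
import time
import urllib.request


def fetch_kv(key, index, wait="5m", base_url="http://127.0.0.1:8500"):
    """One blocking GET on a KV key; returns (new_index, entries).

    A missing key (HTTP 404) raises urllib.error.HTTPError and is not
    handled in this sketch.
    """
    url = f"{base_url}/v1/kv/{key}?index={index}&wait={wait}"
    # Client timeout is set a bit above the server-side wait.
    with urllib.request.urlopen(url, timeout=330) as resp:
        return int(resp.headers["X-Consul-Index"]), json.load(resp)


def watch(fetch, on_change, min_interval=10.0, sleep=time.sleep, max_wakeups=None):
    """Run blocking queries; after each change, pause for a fixed period."""
    index = 0
    wakeups = 0
    while max_wakeups is None or wakeups < max_wakeups:
        new_index, entries = fetch(index)
        wakeups += 1
        if new_index < index:
            index = 0  # index went backwards: start over from scratch
            continue
        if new_index != index:
            index = new_index
            on_change(entries)
            sleep(min_interval)  # fixed pause caps the update rate
        # Same index means the wait simply timed out: query again.
```

With a hypothetical key, this could be driven as `watch(lambda idx: fetch_kv("config/app", idx), print)`. Under high churn the pause bounds the work to one update per `min_interval`, while under low churn the loop mostly sits inside the blocking call.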

> 2. We use consul exec a lot. Recently we noticed that it adds a lot of gossip load - not even sure if and how it affects consul performance.

This uses events internally which are transferred via gossip. The
gossip rate is fixed for a given cluster size, so this should degrade
gracefully and not be a problem. It also writes results to the KV
store, so that does load the Consul servers as well.

> 3. We run GET /v1/agent/services every 3 seconds, while we have about 120 services registered. Once in few min, the query times out after 20 sec. No idea how to debug it.

This can get blocked waiting to get a lock that's held while the agent
is reading from the Consul servers during anti-entropy syncs. The
anti-entropy process was completely reworked in Consul 1.0.1, so I'd
be curious if you still see this with newer versions of Consul.

> 4. In general, I cannot pinpoint the relevant metrics to look at. I prepared pretty extended grafana monitor dashboard, but

Looks like that got cut off :-)

-- James

Rom Freiman

Jan 7, 2018, 6:16:59 AM
to Consul
Hey,

Thanks for the response.
We actually made huge progress at Stratoscale in understanding our Consul load.

The first reason for the slowness and high CPU consumption was that we accidentally used a blocking query on a huge nested key that was continuously updated.
Once we removed that query and moved to polling, the Consul servers' CPU usage decreased dramatically.

Another significant issue that really hurt our performance was the size of raft.db, which grew to 800MB. Beyond that point, Consul becomes much less responsive.
The root cause of the growth was that one of our processes kept updating a big key (40K or so) over a short period of time, so we ended up with ~8000 transactions
updating the same huge key. According to our configuration (I guess it's the default), Consul takes a snapshot every ~8200 transactions, and only then does it start reusing
the space allocated in the raft DB; until then, all 8000 transactions have to be stored. If the key is big, the raft DB grows, and since it is never compacted, it affects Consul performance forever.
The above is only an empirical analysis; we didn't dig into the code.
Comparing the I/O rate of the Consul servers, we see ~500KB/s when the raft DB is 70MB and ~15-20MB/s when the raft DB is 800MB (under the same query rate).
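
As a back-of-the-envelope check on those numbers, the sketch below multiplies the value size by the number of writes retained between snapshots. It assumes every write stores the full 40 KiB value in the log and ignores per-entry overhead and all other traffic, so it is only a rough lower bound; the function name is illustrative.

```python
KIB = 1024
MIB = 1024 ** 2


def retained_log_bytes(value_bytes, writes_between_snapshots):
    """Bytes of log kept when every write stores the full value."""
    return value_bytes * writes_between_snapshots


estimate = retained_log_bytes(40 * KIB, 8000)
print(f"{estimate / MIB:.1f} MiB")  # 312.5 MiB
```

Under these assumptions a single 40K key rewritten ~8000 times accounts for a few hundred MB on its own, the same order of magnitude as the observed file size.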

Thanks,
Rom

James Phillips

Jan 9, 2018, 10:40:39 PM
to consu...@googlegroups.com
Thanks Rom - that's an interesting data point about the Raft DB size
and performance. We are tracking the lack of ability to resize the db
here - https://github.com/hashicorp/consul/issues/866.

-- James
