Metrics at scale


Alessandro Bellina

Oct 18, 2016, 4:11:04 PM
to metrics-user
Hello

I am looking to learn what common patterns others have used when adding Codahale metrics to many processes (say, thousands of JVMs).

Is the common approach to use metrics directly in each JVM and leave the complexity of collecting from that many connections to the aggregator/storage layer? E.g. I see Graphite has caches, relays, and aggregators, and I've seen references to folks putting load balancers in front of that layer. Or do some funnel raw stats to intermediary services that eventually run the Reporters? I'm not sure whether Codahale supports a "relay" of sorts.
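
For concreteness, here is roughly the per-JVM setup I mean, as a minimal sketch using the metrics-graphite module (the relay host, port, and "myapp" prefix are placeholders):

    import com.codahale.metrics.MetricRegistry;
    import com.codahale.metrics.graphite.Graphite;
    import com.codahale.metrics.graphite.GraphiteReporter;
    import java.net.InetSocketAddress;
    import java.util.concurrent.TimeUnit;

    public class ReporterSetup {
        public static void main(String[] args) {
            MetricRegistry registry = new MetricRegistry();
            registry.meter("requests").mark();

            // Each JVM reports over its own connection; whether this should
            // point at carbon-cache directly or at some relay/load balancer
            // in between is exactly my question. "relay.example.com" is a
            // placeholder.
            Graphite graphite =
                new Graphite(new InetSocketAddress("relay.example.com", 2003));

            GraphiteReporter reporter = GraphiteReporter.forRegistry(registry)
                .prefixedWith("myapp.host42")  // placeholder per-host prefix
                .convertRatesTo(TimeUnit.SECONDS)
                .convertDurationsTo(TimeUnit.MILLISECONDS)
                .build(graphite);
            reporter.start(1, TimeUnit.MINUTES);
        }
    }

With thousands of JVMs doing this, that is thousands of sockets into whatever sits behind port 2003, which is why I'm asking where the fan-in usually happens.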

Any pointers to projects you know of that have this many processes producing stats would be welcome.

Thanks,

Alessandro

Timothy Ehlers

Nov 9, 2016, 10:42:56 AM
to metrics-user
Graphite is built on old filesystem-based storage and does not scale well. You will need to dust off every ops trick in the book: SSDs, kernel tunings, and so on. Move off all the Python processes; I recommend carbon-c-relay and go-carbon instead of the Python ones. As for the alternatives: Cyanite looks like it works well, but it just doesn't hold up at large scale, period. OpenTSDB seems to work great but requires non-Graphite input. Then you have InfluxDB, which also does not hold up and keeps changing the fundamentals of how it works.
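
For what it's worth, a minimal carbon-c-relay config along those lines might look like this (hostnames are placeholders; carbon_ch keeps the hashing compatible with the original Python carbon relay):

    cluster storage
        carbon_ch replication 2
            go-carbon-1:2003
            go-carbon-2:2003
        ;
    match *
        send to storage
        stop
        ;

Your JVMs (or a load balancer in front of several relays) point at the relay on 2003, and it handles the fan-in and consistent hashing to the go-carbon nodes.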

When you say scale, I assume you mean 5 million metrics a minute or more.