I've been doing a lot of work with grafana (fed by graphite and influxdb) lately, with data from metrics and other sources. I've noticed a fairly large philosophical divide in how to use the system as a whole.
Here in metrics, the statistics tend to be calculated in the application itself - using reservoirs, EWMAs, etc.
A lot of the functions available in graphite/influxdb can do those calculations for you, if you push all the raw data into the TSDB itself. This includes per-second rates, histogram plots, moving averages, etc. (I don't see EWMA explicitly, but I'm sure that could be added there as well.)
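As an illustration of the server-side approach, graphite render targets can derive rates and rolling statistics from raw series at query time. The metric names below are hypothetical; `perSecond`, `movingAverage`, and `summarize` are real graphite functions:

```
# assuming raw values are pushed as app.requests.count / app.requests.duration
perSecond(app.requests.count)                      # request rate, computed server-side
movingAverage(app.requests.duration, '5min')       # rolling average over 5 minutes
summarize(app.requests.duration, '10min', 'avg')   # downsampled average per 10-minute bucket
```

Because these are applied at render time, the aggregation can be changed in a dashboard without touching the application.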
Now, there is clearly a trade-off here: more cpu/memory usage (and less network traffic) in the application vs. more storage in the TSDB and the network cost of sending every event.
Note that this is looking at metrics that track the time it takes to do some action (like the request response time of a web server) or other similar event-based metrics - the things Meters and Timers are good at in metrics. However, in metrics I don't see a way to push each "event" to the Reporter. Therefore, as developers, we cannot change that trade-off without swapping out the entire metrics library in our application.
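To make the event-driven side of the trade-off concrete, here is a minimal sketch of what a batching, per-event reporter could look like. Everything here (the `EventBatcher` name, the flush-on-full policy) is hypothetical and not part of the metrics API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: instead of a scheduled snapshot of pre-computed
// statistics, every raw event (e.g. a request duration) is recorded and
// shipped to the TSDB in batches once the batch fills.
public class EventBatcher {
    private final int batchSize;
    private final List<Long> pending = new ArrayList<>();
    private int flushes = 0;

    public EventBatcher(int batchSize) {
        this.batchSize = batchSize;
    }

    // record one raw event, flushing when the batch is full
    public void record(long durationMs) {
        pending.add(durationMs);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // in a real reporter this would write the batch over the network;
    // here we only count flushes to keep the sketch self-contained
    private void flush() {
        flushes++;
        pending.clear();
    }

    public int flushCount() {
        return flushes;
    }

    public static void main(String[] args) {
        EventBatcher batcher = new EventBatcher(100);
        for (int i = 0; i < 250; i++) {
            batcher.record(i);
        }
        // 250 events with a batch size of 100 -> two full batches sent,
        // 50 events still pending for the next flush
        System.out.println(batcher.flushCount());
    }
}
```

The point is that the per-event cost can be amortized with batching, so sending raw data need not mean one network write per event.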
To me, having the power to send all the raw data into the TSDB, and then being able to "play" with that data later (without an application update) using all those downstream tools, would be great.
I guess my question is this: was this choice intentional? Has there been talk of supporting both scheduled and event-driven reporting (with batching)? Looking at the influxdb-java library, it seems their system is set up to batch events like this - but of course they don't have a full MetricRegistry system in front of it.
Thoughts?