On 18/04/2020 13:18, 'vivapolonium' via Prometheus Users wrote:
> Hey everyone,
>
> I'm fairly new to Prometheus and trying to wrap my head around some
> concepts which are not really clear to me.
>
> I'm running a Scala-Application with the official Prometheus Java
> client. I'm trying to measure the performance of http endpoints and
> use a `Summary` for that. I implemented an endpoint where I serve the
> metrics via an internal endpoint by taking the `TextFormat.write004`
> method and serving it by myself (not via the included HTTPServlet).
>
> I've set up a Prometheus instance querying that endpoint every 15s and
> set the maxAge of the Summary also to 15s. Now I have a PromQL-Query
> like this: `sum
> by(route)(requests_latency_seconds_sum/requests_latency_seconds_count)*1000`,
> which should give me the average response time of an endpoint in
> milliseconds for each scrape interval.
>
> When rendering the data though, I get some kind of weirdly aggregated
> data points, which is probably down to a mixture of bad settings and
> misunderstanding. Take this metric for example:
>
> ```
> requests_latency_seconds_count{route="library.get",} 83.0
> requests_latency_seconds_sum{route="library.get",} 949.2774687769999
> ```
>
> This summary does not reset after 15s; instead it keeps accumulating
> all the data, which makes it useless for pinpointing time-based
> anomalies in my application.
>
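That isn't the quantile part of a summary (that would have quantile
labels), or at least the bit you are showing doesn't cover that. What you
show are the _sum and _count series, which behave like normal counters.
Normal counters don't reset except when the application restarts. As far
as I know, maxAge only affects the sliding window used for the quantiles,
not _sum and _count.
Within PromQL there is the rate() function, which gives you the
per-second increase of a counter over a time window. Dividing the rate of
the _sum series by the rate of the _count series gives the average
latency over that window, which lets you see spikes in latency over time.
So try to add rate() as described at
https://www.robustperception.io/rate-then-sum-never-sum-then-rate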
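
As a rough sketch of what that could look like for your metric (the 1m
window is my assumption; with a 15s scrape interval, any window that
covers at least two scrapes, ideally a few more, should work):

```
# Average latency in milliseconds per route over the last minute.
# rate() is applied to each counter first, then the results are summed.
1000 * (
    sum by (route) (rate(requests_latency_seconds_sum[1m]))
  /
    sum by (route) (rate(requests_latency_seconds_count[1m]))
)
```

Routes with no requests in the window divide zero by zero, so expect NaN
or gaps for those rather than a value of 0.
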
Generally I don't use summaries and instead use histograms. Summary
quantiles aren't aggregatable (for example if you run multiple
instances), and they can't be adjusted at query time within Prometheus.
With histograms you can aggregate across instances and calculate
percentiles over any range.
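
For illustration only, assuming you switched the metric to a Histogram
under the same name (so the client would expose
requests_latency_seconds_bucket series with an le label), a 95th
percentile per route could be estimated roughly like this. The 0.95
quantile and the 5m window are arbitrary choices on my part:

```
# Estimated 95th percentile latency in seconds per route over 5 minutes.
# The le label must be kept in the aggregation for histogram_quantile().
histogram_quantile(
  0.95,
  sum by (le, route) (rate(requests_latency_seconds_bucket[5m]))
)
```

The accuracy of the result depends on how well the bucket boundaries
match your actual latencies, so the buckets are worth tuning.
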
> I dug into the source code of the Java library and did not find a
> way to reset the values to zero or remove them after scraping them. Is
> this intentional? Did I miss something in my configuration? Also, as
> I understood it, the summary is supposed to reset itself?
>
> Hope someone can give me some hints on how to solve this.
>
--
Stuart Clark