On Tuesday, April 19, 2016 at 7:48:06 AM UTC-7, Brian Brazil wrote:
> On 19 April 2016 at 15:38, Nick Yantikov <ovo...@gmail.com> wrote:
>> Hello all,
>> I have a monotonically incrementing counter of processed requests. My goal is to have an equivalent of Graphite's summarize(nonNegativeDerivative(simulated_requests_total), '1m', 'sum', false), or '1h', '1d', etc. The closest to the expected result I could get is increase(simulated_requests_total[1m]), but it still does not match the result I expect. Could you please point me in the right direction?
>
> When doing Prometheus evaluation over a time range, each point in time is evaluated independently. So if you want alignToFrom=true, you need to specify a start time that is aligned with what you want and then use increase().
The other thing is that it looks like, in order to properly plot a 1-minute counter, I also have to specify a proper "step" value of 60 seconds. Otherwise, if I understand correctly, it plots results of overlapping 1-minute intervals, which does not make much sense. Or does it?
A question about the "increase" function. The docs say it "calculates the increase in the time series in the range vector". If my scrape interval is 15 seconds, then simulated_requests_total[1m] seems to always return four data points (which is expected). However, does the "increase" function only calculate an increase between these four data points?
For the purpose of what I am trying to achieve, it should account for the data point from the scrape immediately preceding the 1m interval. Is that right?
Otherwise, I was able to model a situation where, if the first data point in the 1m window is a significant spike in the request count (compared to the data point right before it), my chart completely misses the spike.
On 19 April 2016 at 16:51, Nick Yantikov <ovo...@gmail.com> wrote:
> The other thing is that it looks like, in order to properly plot a 1-minute counter, I also have to specify a proper "step" value of 60 seconds. Otherwise, if I understand correctly, it plots results of overlapping 1-minute intervals, which does not make much sense. Or does it?

It makes sense, but not if you're trying to produce a per-minute report.

> A question about the "increase" function. The docs say it "calculates the increase in the time series in the range vector". If my scrape interval is 15 seconds, then simulated_requests_total[1m] seems to always return four data points (which is expected). However, does the "increase" function only calculate an increase between these four data points?

Yes, it calculates based only on those data points. There's a small bit of extrapolation in there too, to produce more accurate numbers overall.
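To illustrate why "only those data points" matters, here is a hypothetical sketch (sample timestamps and values are made up, and it ignores Prometheus's extrapolation and counter-reset handling): an increase computed solely from the samples inside a window cannot see a jump that happened between the sample just before the window and the first sample inside it.

```python
# Hypothetical 15s-interval scrapes of a cumulative counter,
# as (timestamp_seconds, value). A burst of 100 requests lands
# between the t=45 and t=60 scrapes.
samples = [(0, 10), (15, 20), (30, 30), (45, 40),
           (60, 140), (75, 150), (90, 160), (105, 170)]

def window_increase(samples, start, end, include_preceding=False):
    """Increase over (start, end]: last in-window value minus the first.
    With include_preceding=True, also count the last sample at or
    before `start`. (No reset handling, no extrapolation.)"""
    inside = [(t, v) for t, v in samples if start < t <= end]
    if include_preceding:
        before = [(t, v) for t, v in samples if t <= start]
        if before:
            inside.insert(0, before[-1])
    return inside[-1][1] - inside[0][1]

# Window (45, 105] holds the samples at t=60..105, so the jump from
# 40 to 140 falls just outside it and the spike is missed:
print(window_increase(samples, 45, 105))                          # 30
print(window_increase(samples, 45, 105, include_preceding=True))  # 130
```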
> For the purpose of what I am trying to achieve, it should account for the data point from the scrape immediately preceding the 1m interval. Is that right?

That's one way you could do it, but what we have works quite well without that.
> Otherwise, I was able to model a situation where, if the first data point in the 1m window is a significant spike in the request count (compared to the data point right before it), my chart completely misses the spike.
If that's the sort of thing you're looking for, then this is not how you should go about it. We recommend using rate() to keep things consistently measured per second, and then having your step smaller than the range in your rate(). This will let you see any spikes.
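For reference, that recommendation might look like the following (the metric name is from this thread; the range and step values are illustrative):

```promql
# Per-second request rate averaged over a 1-minute window.
# Graphed with a query step smaller than the range (e.g. 15s),
# the overlapping windows mean a short spike still shows up.
rate(simulated_requests_total[1m])
```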
On Tue, Apr 19, 2016 at 8:59 AM, Brian Brazil <brian....@robustperception.io> wrote:
> Yes, it calculates based only on those data points. There's a small bit of extrapolation in there too, to produce more accurate numbers overall.

Could you please point me to the source code so I can learn more?
> That's one way you could do it, but what we have works quite well without that.

What is the other way that works well?
> If that's the sort of thing you're looking for, then this is not how you should go about it. We recommend using rate() to keep things consistently measured per second, and then having your step smaller than the range in your rate(). This will let you see any spikes.

I'd say that plotting counters over time periods (1m, 5m, 1h, 1d, etc.) is a very common requirement, and I cannot override it. There is a Graphite dashboard (in Grafana) that plots counters over time intervals; summarize(nonNegativeDerivative(simulated_requests_total), '1m', 'sum', false) produces exact results.
My goal is to get the same dashboard using Prometheus data and queries.
Summarizing the conversation so far, it looks like in order to produce a dashboard with 1-minute counters I need to query increase(simulated_requests_total[75s]), where 75s = 60s + the 15-second scrape interval, and use step = 60. Is this how you would recommend approaching this, or am I missing the mark entirely? Are there better ways of achieving this requirement?
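For concreteness, that approach expressed against Prometheus's range-query API would look something like this (the server address and timestamps are made up, and the query would need URL-encoding in practice):

```
GET http://prometheus:9090/api/v1/query_range
    ?query=increase(simulated_requests_total[75s])
    &start=2016-04-19T15:00:00Z    # aligned to a minute boundary
    &end=2016-04-19T16:00:00Z
    &step=60
```

With start aligned and step equal to the bucket width, each returned point covers one non-overlapping minute, which is the closest analogue of Graphite's alignToFrom behaviour mentioned earlier in the thread.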
It was never my intention to slide into religious wars or a framework battle. My question really is how to derive certain results from the information already recorded in Prometheus. The reasons I brought Graphite up are: a) as people currently use Graphite queries, they will be asking (myself included) what the path is to achieve the same in Prometheus, and b) Graphite does return results that match my test data.
Namely, how do I aggregate counters into larger time intervals based on a monotonically increasing counter metric? I might be oversimplifying things, but it looks like if there were a function that takes the diff between the datapoints at t and t-1 (accounting for counter resets, of course), then I could "sum_over_time" the results of that function to get the desired result. By the same token, if I reset the counter in my test harness after every scrape period (just for the sake of the experiment), then sum_over_time(simulated_requests_total[1m]) produces results that match the test data.
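The diff-then-sum idea described above can be sketched in a few lines of Python (the sample data is made up, and the reset handling simply assumes the counter restarted near zero, which is roughly how Prometheus treats a drop):

```python
def bucketed_increase(samples, bucket_seconds=60):
    """Roughly summarize(nonNegativeDerivative(series), '1m', 'sum'):
    diff each (timestamp, value) sample against the previous one,
    treat a drop as a counter reset (count the post-reset value),
    and sum the deltas per aligned time bucket."""
    buckets = {}
    prev = None
    for t, v in samples:
        if prev is not None:
            delta = v - prev if v >= prev else v  # reset: restart near 0
            bucket = (t // bucket_seconds) * bucket_seconds
            buckets[bucket] = buckets.get(bucket, 0) + delta
        prev = v
    return buckets

# 15s scrapes with a counter reset between t=60 and t=75:
samples = [(0, 0), (15, 10), (30, 25), (45, 30),
           (60, 50), (75, 5), (90, 20), (105, 40), (120, 60)]
print(bucketed_increase(samples))  # {0: 30, 60: 60, 120: 20}
```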