Looks like rate/irate/delta/increase are not calculating consistently


Alex

Jan 28, 2021, 6:58:57 AM
to Prometheus Users
Hi,
I think rate/irate/delta/increase are not working correctly for most resolutions.
However, I was redirected from GitHub to here, so please have a look and tell me what you think about this.
Is this a bug, or am I getting something wrong here?
Please have a look at the details in: https://github.com/prometheus/prometheus/issues/8413 

Thanks
Alex

Jeyrce Lu

Jan 28, 2021, 7:11:53 AM
to Prometheus Users
Try boosting the scrape frequency (i.e. shortening the scrape interval) as much as you can.

Alex

Jan 28, 2021, 9:02:44 AM
to Prometheus Users
I am quite sure it has nothing to do with the interval.

Julius Volz

Jan 28, 2021, 9:08:34 AM
to Alex, Prometheus Users
Hi,

Regarding "For a counter increase by 1 I expect a rate() result/value of 1/60 = 0.01666666666." - rate()'s extrapolating behavior might be the thing surprising you here, and it can be extra surprising for very slow-moving counters like yours. rate() tries to calculate the best approximation of the increase of a counter *on average*, but since it has to operate on sampled values over time, it can never know the "right" value for sure.

Imagine you provide a [5m] window, and the actual first and last samples under the window are only 4m apart, and thus don't 100% coincide with the beginning and end of your [5m] window. The rate() (and increase()) functions therefore operate on the question "OK, but WHAT IF we had data matching exactly the full window?" and extrapolate the observed 4m-based slope to the whole 5m window. Thus even if you use increase() on a counter that only increases by an integer amount, you will typically get back a non-integer result that represents the whole window, and not just the raw increases seen between actual samples.

You can find the exact details of how this works (including an exception when the first/last samples are too far away from the window boundaries) in the code here: https://github.com/prometheus/prometheus/blob/275f7e7766f80648d6e63ed968685f3963b494e9/promql/functions.go#L55-L131

The short summary is: rate() / increase() will give you an on-average decent approximation of the actual rate of increase by extrapolating to the window boundaries.
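For reference, here is a simplified Python sketch of that extrapolation logic (an approximation of the linked functions.go code; it ignores counter resets and some edge cases, and the sample timestamps/values are made up for illustration):

```python
def extrapolated_rate(samples, range_start, range_end, is_rate=True):
    """Simplified sketch of Prometheus' rate()/increase() extrapolation.

    samples: sorted list of (timestamp_seconds, value) pairs falling
    inside the range. Counter resets are assumed not to occur here
    (Prometheus handles them separately).
    """
    if len(samples) < 2:
        return None  # need at least two samples under the window

    (first_t, first_v), (last_t, last_v) = samples[0], samples[-1]
    delta = last_v - first_v
    sampled_interval = last_t - first_t
    avg_between = sampled_interval / (len(samples) - 1)

    # Extrapolate out to the window boundaries, but only all the way if
    # the first/last sample is close enough to the boundary; otherwise
    # only extend by half an average sample interval.
    threshold = avg_between * 1.1
    extrapolate_to = sampled_interval
    to_start = first_t - range_start
    to_end = range_end - last_t
    extrapolate_to += to_start if to_start < threshold else avg_between / 2
    extrapolate_to += to_end if to_end < threshold else avg_between / 2

    result = delta * (extrapolate_to / sampled_interval)
    if is_rate:
        result /= range_end - range_start  # per-second rate
    return result


# A counter that increases by 1 once, scraped every 15s, over a 60s window:
samples = [(15, 5), (30, 5), (45, 6), (60, 6)]
print(extrapolated_rate(samples, 0, 60, is_rate=False))  # increase(): ~1.33, not exactly 1
print(extrapolated_rate(samples, 0, 60))                 # rate(): ~0.0222, not 1/60
```

Even in this toy example, the integer increase of 1 gets scaled up to roughly 1.33 because the samples only span 45s of the 60s window.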

Regards,
Julius

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/097e9b1d-8c12-49d4-8a78-66ff4c4f555fn%40googlegroups.com.


--
Julius Volz
PromLabs - promlabs.com

Alex

Jan 28, 2021, 11:29:08 AM
to Prometheus Users
A rate series would never be exact, but a rate on a counter could be exact, since the counter is exact. Do you agree?

I also agree with your argument about multiple counter increases within one window, but that's not the case here. There is only one counter value increase (by one) within the window.

All values are known and in the past. Also, the 1m window is larger than the series interval, so no extrapolation should be needed, though I would understand interpolation / downsampling due to the 1m. Am I missing something here?

All this does not explain why the query results behave like they do.
So let me phrase the questions again:
a) Why does [1m] behave differently from [1m:]? The optional resolution should behave the same in both cases - right?
b) Why are [1m:15s] and [1m:30s] also different from [1m] and [1m:]? In my understanding, the result should be the same for those, given the data and the series interval.
c) Why is the result of [1m] (and most others) twice as high as it should be (or even other strange values not matching the functions' descriptions)?
d) Why is the result sometimes even stepped in an unexpected way, e.g. for [5m:1m]?

Julius Volz

Jan 28, 2021, 4:22:56 PM
to Alex, Prometheus Users
On Thu, Jan 28, 2021 at 5:29 PM Alex <alexander....@gmail.com> wrote:
A rate series would never be exact, but a rate on a counter could be exact, since the counter is exact. Do you agree?

I'm not 100% sure what you mean by the rate being exact because the counter is exact. The tricky bit is that we have to try to guess how the counter behaves outside of the exact data points that we actually have.

I felt like doing a meditative exercise, so I drew this example of how "rate(foo[1m])" would work with an assumed scrape interval of 15s:

[attached image: rate-extrapolated.png]

I did the calculation purely visually, getting a value of around 0.25/s, which is close enough for me to the ~0.29/s you are getting. You can see that the extrapolated slope is steeper than what the actual samples around the rate window would yield, but rate() has no idea how the data outside the window behaves, and thus guesses that on average it will behave similarly to the data under the window - which is not always precisely right.
 
I also agree with your argument about multiple counter increases within one window, but that's not the case here. There is only one counter value increase (by one) within the window.

Yep, same as above.
 
All values are known and in the past. Also, the 1m window is larger than the series interval, so no extrapolation should be needed, though I would understand interpolation / downsampling due to the 1m. Am I missing something here?

All this does not explain why the query results behave like they do.
So let me phrase the questions again:
a) Why does [1m] behave differently from [1m:]? The optional resolution should behave the same in both cases - right?

This comes down to subtle evaluation semantics.

- foo[1m] is a range vector selector, running in a single outer PromQL query. It passes all raw samples as-is into rate().
- foo[1m:] can also be written as (foo)[1m:]. It takes the instant vector expression "foo" and runs it as a subquery, and every subquery aligns its output points to the subquery resolution step rather than giving you raw samples, so slightly shifted samples get passed into rate(). Generally you want to run rate() on completely raw data (also so as not to lose any counter resets if subqueries run at a coarser resolution than the underlying data).
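To make the difference concrete, here is a hypothetical Python sketch (timestamps, values, and function names are made up for illustration) of what each form ends up handing to rate(): the range selector passes raw samples through unchanged, while the subquery re-evaluates the inner expression on a fixed resolution grid:

```python
# Raw scraped samples, (timestamp_seconds, value), roughly 15s apart:
raw = [(7, 10.0), (22, 10.0), (37, 11.0), (52, 11.0)]

def range_selector(samples, start, end):
    # foo[1m]: raw samples whose timestamps fall inside (start, end]
    return [(t, v) for t, v in samples if start < t <= end]

def subquery(samples, start, end, step):
    # foo[1m:STEP]: evaluate the inner expression at each resolution step;
    # each evaluation picks the most recent sample at or before the
    # evaluation timestamp (staleness handling ignored in this sketch)
    out = []
    t = start + step
    while t <= end:
        past = [(ts, v) for ts, v in samples if ts <= t]
        if past:
            out.append((t, past[-1][1]))  # timestamp is the step, not the scrape time
        t += step
    return out

print(range_selector(raw, 0, 60))  # raw scrape timestamps pass through
print(subquery(raw, 0, 60, 15))    # timestamps snapped to the 15s step grid
```

Both forms see the same underlying data, but the subquery's output timestamps no longer match the scrape timestamps, which is enough to change the extrapolation result.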
 
b) Why are [1m:15s] and [1m:30s] also different from [1m] and [1m:]? In my understanding, the result should be the same for those, given the data and the series interval.

I haven't looked into these more, but I assume they're similar issues.
 
c) Why is the result of [1m] (and most others) twice as high as it should be (or even other strange values not matching the functions' descriptions)?

See the diagram.
 
d) Why is the result sometimes even stepped in an unexpected way, e.g. for [5m:1m]?

As far as I can see there's no example data or image provided for this, so it's hard to answer this one.
 

Alex

Jan 29, 2021, 7:56:31 AM
to Prometheus Users
First of all, thank you very much for your detailed answer. Some things are still not clear to me, however.

Your drawn rate window is 75s in total (not the 60s I expect it to be), since it contains 4x 15s samples plus 2x half a step to the next sample.
I would expect the window to be exactly 60s including the edges, and aligned with the data points.
4x 15s samples in my mind sums up to a perfect 60s interval for the window. When the window is aligned with the data and matches a multiple of the interval, you should get nice/perfect results - right?
Can you explain where the difference comes from? What is wrong with my understanding of a 60s rate window?

Thank you a lot for explaining the semantic difference between [1m] and [1m:]; that helped a lot. As a user, however, I would not have expected this, but it makes sense given that it picks 1m for the subquery.

In the GitHub issue there is a screenshot with some graphs of example data for different queries.

Julius Volz

Jan 29, 2021, 8:12:53 AM
to Alex, Prometheus Users
On Fri, Jan 29, 2021 at 1:56 PM Alex <alexander....@gmail.com> wrote:
First of all, thank you very much for your detailed answer. Some things are still not clear to me, however.

Your drawn rate window is 75s in total (not the 60s I expect it to be), since it contains 4x 15s samples plus 2x half a step to the next sample.

The drawn rate window is indeed exactly 60s. In the underlying grid of the drawing tool, I made 2 boxes represent 15s. You can count that there are exactly 4 of those double boxes under the rate window. And a 60s rate window will usually contain 4 samples that are 15 seconds apart.
 
I would expect the window to be exactly 60s including the edges, and aligned with the data points.

The range time windows are usually *not* aligned to the data points at all, but chosen completely separately (based on the evaluation timestamp and the size of the window stretching backwards from that timestamp); the range vector selector then just selects any samples that happen to fall under the selected time window. Consider also that a single range vector selector like "foo[5m]" (with exactly one set of window boundaries) can select many different time series at once, where not even the samples of the different series are necessarily aligned with each other. And the window size will also almost never form a *perfect* multiple of your actual scrape timestamps, even if you configure your scrape intervals to match your rate window boundaries in that way.

So you end up with a blanket pre-chosen time window, and then whatever samples happen to fall into it are used for the rate calculation under that window for each series.
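As a rough illustration (with made-up scrape timestamps), this Python sketch shows how the same 60s window, anchored only to the evaluation timestamp, covers different raw samples as the evaluation time moves:

```python
scrape_ts = [7, 22, 37, 52, 67, 82]  # scrape timestamps, roughly 15s apart

def window_samples(eval_ts, window=60):
    # A range selector covers (eval_ts - window, eval_ts]; it is never
    # shifted or snapped to the scrape timestamps in any way.
    return [t for t in scrape_ts if eval_ts - window < t <= eval_ts]

for eval_ts in (60, 70, 80):
    print(eval_ts, window_samples(eval_ts))
```

Moving the evaluation timestamp by 10 seconds silently swaps which samples fall under the window, even though the window size never changes.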
 
4x 15s samples in my mind sums up to a perfect 60s interval for the window. When the window is aligned with the data and matches a multiple of the interval, you should get nice/perfect results - right?
Can you explain where the difference comes from? What is wrong with my understanding of a 60s rate window?

Same explanation as above, I guess.

Thank you a lot for explaining the semantic difference between [1m] and [1m:]; that helped a lot. As a user, however, I would not have expected this, but it makes sense given that it picks 1m for the subquery.

Yeah... to be honest, subqueries don't usually even come up in the normal rate context; they are mostly useful for situations where you want to pass a derived expression (vs. a raw series) into a function that expects a range vector, but don't want to go the route of using a recording rule to record the intermediate result into a materialized time series first.

In general, the articles https://promlabs.com/blog/2020/06/18/the-anatomy-of-a-promql-query and https://promlabs.com/blog/2020/07/02/selecting-data-in-promql could be interesting for understanding a bit more about the execution semantics of instant/range queries, as well as of instant/range vectors.

Hope this helps a bit!
 

Julius Volz

Jan 29, 2021, 12:25:11 PM
to Alex, Prometheus Users
This actually triggered me to write a blog post about exactly this :) https://promlabs.com/blog/2021/01/29/how-exactly-does-promql-calculate-rates

Julien Pivotto

Jan 29, 2021, 12:29:44 PM
to Julius Volz, Alex, Prometheus Users
There are other resources:

https://www.youtube.com/watch?v=67Ulrq6DxwA ( https://slideshare.net/brianbrazil/counting-with-prometheus-cloudnativeconkubecon-europe-2017 )
https://www.robustperception.io/what-range-should-i-use-with-rate
https://grafana.com/go/grafanaconline/prometheus-rate-queries-in-grafana/



--
Julien Pivotto
@roidelapluie

Julius Volz

Jan 29, 2021, 12:59:37 PM
to Julius Volz, Alex, Prometheus Users
Yup, thanks! Goes to show just how much there is to say about this topic :)

Alex

Feb 1, 2021, 7:38:32 AM
to Prometheus Users
Thank you both for the detailed explanations and the additional material on this topic. I now understand why I get the results I get.
However, I think the method used for the calculation can be improved to provide more precise results and a better match to expectations.

I propose one change to the algorithm:
align the window with the data points. This way you would not need to extrapolate at all in most cases, and if you did, you would only do it at one edge. In the example above this would mean the 1m window covers 5 data points from edge to edge with a 15s scrape interval, with data points aligning perfectly on the edges.
I understand that this may add complexity for multiple series in one query, especially when they are on different intervals, but in general the results would match up better, especially for smaller ranges.

What do you think about that?

Stuart Clark

Feb 1, 2021, 7:51:50 AM
to Alex, Prometheus Users
On 2021-02-01 12:38, Alex wrote:
> Thank you both for the detailed explanation and the additional
> material on this topic. I do now understand why I get the results I
> get.
> I however think the method used for calculation can be improved to
> provide more precise results and provide a better match to
> expectations.
>
> I propose one changes to the algorithm for this:
> align the window with the data points. This way you do not need to
> extrapolate at all in most cases and if you need to do so you would
> only do it to one edge. In the example above this would mean the 1m
> window covers 5 datapoints from edge to edge with a 15s scrape
> interval with datapoints aligning on the edges perfectly.
> I understand that this may add complexity for multiple series in one
> query especially when they are on a different interval but in general
> the results would match up better especially for smaller ranges.
>
> What do you think about that?
>

As Julius mentioned, that is in general not possible to achieve.

Say you are monitoring 4 servers with a 1 minute scrape interval. You
would therefore expect one server to be scraped every 15 seconds (as
Prometheus tries to spread out scrapes evenly). If you then do a query,
there is no window that would include 5 data points for all 4 time series.

In general I'd expect the vast majority of queries to be working on
multiple series in this way (multiple servers, pods, etc.).
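A quick Python sketch of that staggering (hypothetical numbers: 4 servers on a 1-minute interval, scrapes spread 15 seconds apart) shows that no single window boundary can sit flush against the last sample of every series at once:

```python
# Four servers scraped once a minute, offset 15s from each other:
servers = {f"server{i}": [i * 15 + m * 60 for m in range(5)] for i in range(4)}

window_end = 120  # any evaluation timestamp you pick
for name, scrapes in servers.items():
    last_inside = max(t for t in scrapes if t <= window_end)
    # The gap between the window boundary and the nearest sample differs
    # per series, so no boundary can be "aligned" for all of them.
    print(name, "gap between window end and last sample:", window_end - last_inside)
```

Whatever boundary you choose, each series ends up with a different distance between the window edge and its nearest sample, which is exactly why per-series extrapolation is needed.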

Equally, you are also expecting that all scrapes are going to happen at
precisely the same interval (exactly 15 seconds apart). You might lose a
scrape occasionally (network or app error), and scrapes might take
slightly less or more time to be returned. While Prometheus tries to
keep things regular, it can't guarantee that points are going to be 15s
apart to the millisecond - you could have a scrape that sometimes takes
an extra second to return data.

--
Stuart Clark

Alex

Feb 1, 2021, 8:31:25 AM
to Prometheus Users
Within the data series of one scrape source this still makes sense, if I am not mistaken.
You could align the data to its interval (which I think is done anyway, if I understand this correctly: https://promlabs.com/blog/2020/06/18/the-anatomy-of-a-promql-query).
But even if one data point fell outside the window due to slower scrapes, you would end up with the same extrapolation error as without alignment.
The same is true for missing a sample - it would still work the same way as it does now in such cases.

For multiple series this approach might not be better in general, but at least it would not be worse than the current approach, since if the window is not aligned, the extrapolation works the same way and should give the same error on average as it does now.
You could also consider using a different window alignment for series from a different scrape source.

Maybe I am still missing something here, but in my understanding an alignment cannot make the extrapolation result worse.