A query on increase function

Manish G

Sep 1, 2020, 9:03:34 AM
to promethe...@googlegroups.com
Hi All,

I have a question about how the increase function works.

Suppose a query returns multiple series (a use case could be an application running in multiple pods in Kubernetes, so there are multiple sources for the same metric).
I apply a 1h time window:
mymetrics[1h]

This results in several series, say m1, m2 and m3.
m1 exists only until somewhere in the middle of the time window, m2 is there throughout the window, and m3 starts partway through the window and exists until the end.

So if I do:
sum(increase(mymetrics[1h]))

Would this take into account all 3? I mean, would I get:
increase(m1[1h]) + increase(m2[1h]) + increase(m3[1h])

With regards



Brian Candler

Sep 1, 2020, 9:30:33 AM
to Prometheus Users
On Tuesday, 1 September 2020 14:03:34 UTC+1, Manish G wrote:
I have a question about how the increase function works.

Suppose a query returns multiple series (a use case could be an application running in multiple pods in Kubernetes, so there are multiple sources for the same metric).
I apply a 1h time window:
mymetrics[1h]

This results in several series, say m1, m2 and m3.
m1 exists only until somewhere in the middle of the time window, m2 is there throughout the window, and m3 starts partway through the window and exists until the end.


mymetrics[1h] is a "range vector": a two-dimensional entity.  Showing series on the Y axis and timestamp on the X axis, what you describe might look like this:

^ mymetrics{instance="m1"}  m1.0 m1.1 m1.2
| mymetrics{instance="m2"}  m2.0 m2.1 m2.2 m2.3 m2.4
| mymetrics{instance="m3"}            m3.2 m3.3 m3.4
                             t0   t1   t2   t3   t4 ----->


 
So if I do:
sum(increase(mymetrics[1h]))

Would this take into account all 3? I mean, would I get:
increase(m1[1h]) + increase(m2[1h]) + increase(m3[1h])

Yes, the expression is evaluated from the inside out.  So first you calculate

increase(mymetrics[1h])

which gives an instant vector: a separate value for m1, m2, m3 at a single point in time.

^ mymetrics{instance="m1"}  (m1.2-m1.0)/(t2-t0) * 1h
| mymetrics{instance="m2"}  (m2.4-m2.0)/(t4-t0) * 1h
| mymetrics{instance="m3"}  (m3.4-m3.2)/(t4-t2) * 1h
                                      t4 ----->

As you can see, the increase() is calculated separately for each series.

Then you run sum() across this instant vector, and get a single answer.
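
The two-step evaluation can be sketched in Python. This is only an illustration of the simplified formula in the diagram above, not Prometheus's implementation: the real increase() also handles counter resets and limits how far it extrapolates towards the window boundaries. The sample values and the helper names simple_increase and sum_of_increases are made up for the example.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def simple_increase(samples: List[Sample], window_s: float) -> float:
    """Slope between the first and last sample, scaled to the whole window.

    Same quantity as (v_last - v_first) / (t_last - t_first) * window,
    with the multiplication done first to keep the arithmetic exact here.
    """
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    return (v_last - v_first) * window_s / (t_last - t_first)


def sum_of_increases(range_vector: Dict[str, List[Sample]], window_s: float) -> float:
    # Step 1: increase() produces one value per series (an instant vector).
    per_series = {
        name: simple_increase(samples, window_s)
        for name, samples in range_vector.items()
        if len(samples) >= 2  # a slope needs at least two samples
    }
    # Step 2: sum() adds those per-series values into a single answer.
    return sum(per_series.values())


# Hypothetical samples 15 minutes apart (t0 = 0 s ... t4 = 3600 s).
range_vector = {
    "m1": [(0, 10), (900, 20), (1800, 30)],                         # stops mid-window
    "m2": [(0, 0), (900, 5), (1800, 10), (2700, 15), (3600, 20)],   # whole window
    "m3": [(1800, 100), (2700, 104), (3600, 108)],                  # starts mid-window
}

total = sum_of_increases(range_vector, window_s=3600)
print(total)  # 40.0 (m1) + 20.0 (m2) + 16.0 (m3) = 76.0
```

Note that m1 and m3 each contribute more than their raw deltas (20 and 8) because the slope is scaled to the full hour.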

Manish G

Sep 1, 2020, 9:55:25 AM
to Brian Candler, Prometheus Users
Thanks for the detailed response.

Going by (m1.2-m1.0)/(t2-t0) * 1h, even though m1 only existed for the period t2-t0, we still multiply by 1h, and the same goes for m3.
So while I expected increase() to give the absolute delta (m1.2-m1.0), it seems that's not the case.

Another concern: when we apply the increase function, does the system divide the whole timeline into time slots (total time / time window), so that we get one data point per slot? I am asking because even for large time windows like 3 hours I see a continuous curve, which would not be possible if we got one data point per time slot.

With regards

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/d114496f-9c4a-4501-ab80-bfcbce98d37do%40googlegroups.com.

Brian Candler

Sep 1, 2020, 10:12:29 AM
to Prometheus Users
On Tuesday, 1 September 2020 14:55:25 UTC+1, Manish G wrote:
Going by (m1.2-m1.0)/(t2-t0) * 1h, even though m1 only existed for the period t2-t0, we still multiply by 1h, and the same goes for m3.
So while I expected increase() to give the absolute delta (m1.2-m1.0), it seems that's not the case.

It's not expected to give the absolute delta. See:

If you want the absolute delta, then you can say

mymetric - mymetric offset 1h

Note that this will not do anything special for counter resets, and may give you a meaningless or negative value if the counter has reset.

There is also a delta() function, but that also extrapolates the value to cover the full range, and is also not intended to be used for counters.
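
As a rough illustration of the warning about resets, here is what the plain subtraction in mymetric - mymetric offset 1h amounts to for a single series. The function name absolute_delta and the numbers are hypothetical.

```python
def absolute_delta(value_now: float, value_1h_ago: float) -> float:
    # Mirrors `mymetric - mymetric offset 1h`: a plain subtraction of two
    # sampled values, with no special handling of counter resets.
    return value_now - value_1h_ago


# Counter grew steadily from 100 to 150 over the hour.
no_reset = absolute_delta(150, 100)

# Process restarted during the hour: the counter began again from 0 and has
# only reached 20, so the subtraction yields a negative, meaningless result.
with_reset = absolute_delta(20, 100)

print(no_reset, with_reset)  # 50 -80
```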



Another concern: when we apply the increase function, does the system divide the whole timeline into time slots (total time / time window), so that we get one data point per slot? I am asking because even for large time windows like 3 hours I see a continuous curve, which would not be possible if we got one data point per time slot.

Unless you are using a subquery, the values used are the actual values in the database, with their actual timestamps when they were scraped.  They are not resampled.

Your "continuous curve" is probably because you are displaying this query as a graph.  Graphing *does* repeat the query over a range in steps, so that it calculates the answer at t(0), t(1), t(2), t(3) ... t(N) which become the points of the graph.

At each instant the same query is run.  So if your query includes mymetrics[1h], then to generate the graph point at t(0) it will use mymetrics(-3600,0].  At the graph point t(1) it will use mymetrics(-3599,1].  And so on.
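
The sliding window can be sketched as follows, assuming times in seconds and a graph step of 1 second, as in the example above (real graphs usually use a larger step). The helper graph_windows is hypothetical and only lists the window boundaries; it does not model which end of the interval is open or closed.

```python
from typing import List, Tuple


def graph_windows(start: int, end: int, step: int, range_s: int) -> List[Tuple[int, int]]:
    """Return the (window_start, window_end) pair used at each graph point."""
    # The same query is evaluated once per step; each evaluation looks back
    # range_s seconds from its own evaluation time t.
    return [(t - range_s, t) for t in range(start, end + 1, step)]


# Graph points t(0)..t(3) for a query containing mymetrics[1h].
windows = graph_windows(start=0, end=3, step=1, range_s=3600)
print(windows)  # [(-3600, 0), (-3599, 1), (-3598, 2), (-3597, 3)]
```

Because consecutive windows overlap almost entirely, neighbouring graph points have similar values, which is why the plotted line looks continuous.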

Manish G

Sep 1, 2020, 10:29:10 AM
to Brian Candler, Prometheus Users
Got it. Thanks a lot.

BTW, can a reset happen on a metric even while it's alive? If not, then we may not have much of an issue, as all individual series are handled independently in the increase calculation.

With regards


Brian Candler

Sep 1, 2020, 10:39:24 AM
to Prometheus Users
Counters should be monotonically increasing.  A "reset" just means that the exporter caused the counter to decrease for some reason (typically a process restart).

If a counter goes 3 ... 7 ... 10 ... 2 ... 5 ...  then you have no idea what happened between the "10" and the "2".  Therefore, increase() treats the drop as a reset to zero: it adds up the increases over 3...7...10 and over 0...2...5, and then extrapolates the result to cover the whole period.
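
A minimal sketch of the reset handling for the example sequence, assuming (as Prometheus's rate functions do) that any decrease means the counter restarted from zero. It computes only the raw increase; the extrapolation to the full window is left out, and the function name raw_increase_with_resets is made up for the example.

```python
from typing import List


def raw_increase_with_resets(values: List[float]) -> float:
    """Total increase before extrapolation, treating any drop as a reset to 0."""
    total = 0.0
    for prev, cur in zip(values, values[1:]):
        if cur < prev:
            # Reset: assume the counter restarted at 0 and climbed to cur.
            total += cur
        else:
            total += cur - prev
    return total


raw = raw_increase_with_resets([3, 7, 10, 2, 5])
print(raw)  # (7-3) + (10-7) + 2 + (5-2) = 12.0
```

Whatever the counter gained between the last sample before the restart and the restart itself is unknowable, so it is not counted.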

As for the metric being "alive": a metric is either present in a scrape, or missing, in which case it's immediately considered "stale" by Prometheus.  A metric which has not been scraped for 5 minutes is also considered stale.