Usage of rate function on a recording rule metric

Venkata Bhagavatula

Feb 26, 2020, 9:05:52 AM
to Prometheus Users
Hi,

In our application, there is one metric that we derive from another metric using recording rules.
When we plot the recorded metric and the original metric in Grafana, both graphs follow the same trend. But when we apply increase(), the recorded metric shows huge spikes, whereas the original metric does not.

The plotted graph follows:
image.png
The bottom panels show the increase of both metrics over time. As you can see, there are points where the metric values go down. Prometheus handles these as resets for the metric type "Counter", and the increase function handles them gracefully.

Can you let us know how these recorded metrics are treated in Prometheus? Any pointers on how to debug this issue would also help.

Thanks n Regards,
Chalapathi.

Stuart Clark

Feb 26, 2020, 12:19:57 PM
to Venkata Bhagavatula, Prometheus Users
Counters must only increase. Any reduction is seen as a counter reset.

If this isn't a counter, use the derivative function (deriv()) rather than rate().
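
A minimal sketch of why a drop is read as a reset, assuming a simplified version of the counter logic: the real increase() also extrapolates to the edges of the range window, which is left out here. The function name and the sample values are hypothetical, not taken from the thread.

```python
def reset_adjusted_increase(samples):
    """Sum the growth between consecutive samples, treating any drop as a reset."""
    total = 0.0
    prev = samples[0]
    for cur in samples[1:]:
        if cur < prev:
            # A drop is assumed to be a restart from zero, so the whole
            # current value is counted as new growth.
            total += cur
        else:
            total += cur - prev
        prev = cur
    return total


monotonic = [100, 110, 120, 130]   # a well-behaved counter
with_drop = [100, 110, 90, 100]    # same trend, but one sample goes down

print(reset_adjusted_increase(monotonic))  # 30.0
print(reset_adjusted_increase(with_drop))  # 110.0 (10 + 90 + 10)
```

The second series only moves by a few units, yet the drop from 110 to 90 adds 90 to the result, which is the kind of spike described in the question.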
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.

Venkata Bhagavatula

Feb 27, 2020, 1:24:28 AM
to Stuart Clark, Prometheus Users
Hi,
Thanks for the response. When we use a recording rule, what will be the metric type of the derived counter? The documentation for the increase function says:
increase(v range-vector) calculates the increase in the time series in the range vector. Breaks in monotonicity (such as counter resets due to target restarts) are automatically adjusted for.

Also, why did it work on the original metric, given that both the derived and the original metric show the reduction?

Thanks n Regards,
Chalapathi
   

Stuart Clark

Feb 27, 2020, 2:29:15 AM
to promethe...@googlegroups.com, Venkata Bhagavatula, Prometheus Users
The graph showed a reduction at various points. Is there a bug in the recording rule calculation that can cause reduction which needs fixing?

Venkata Bhagavatula

Mar 9, 2020, 7:22:00 AM
to Stuart Clark, Prometheus Users
Hi Stuart,

Sorry for the late reply, I was on vacation.
I will check the recording rule.
The reduction was the same for both the recorded metric and the original metric. Can you correct my understanding?
After the rule is evaluated, will the metric be treated as a Gauge?

Thanks n Regards,
Chalapathi

Stuart Clark

Mar 9, 2020, 7:35:52 AM
to Venkata Bhagavatula, Prometheus Users
If the metric reduced (and it isn't a counter reset), that suggests it
isn't a counter, and therefore rate() shouldn't be used.

--
Stuart Clark

Julien Pivotto

Mar 9, 2020, 7:39:46 AM
to Stuart Clark, Venkata Bhagavatula, Prometheus Users
Hi there,

Please note that metric types in Prometheus are informative. You can in
theory run rate() on both gauges and counters, even if you should only
do it on counters.
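
To illustrate the point, here is a rough sketch of what happens when counter-style logic runs on gauge-like samples. It assumes a simplified reset handling without window extrapolation, and uses a plain least-squares slope as a stand-in for deriv(), which is documented as using simple linear regression. All names and values are hypothetical.

```python
def reset_adjusted_increase(samples):
    """Counter-style increase: every drop is treated as a restart from zero."""
    total = 0.0
    prev = samples[0]
    for cur in samples[1:]:
        total += cur if cur < prev else cur - prev
        prev = cur
    return total


def slope_per_second(points):
    """Least-squares slope of (timestamp_seconds, value) pairs."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in points)
    var = sum((t - mean_t) ** 2 for t, _ in points)
    return cov / var


# A gauge that falls steadily by 10 every 15 seconds.
gauge_points = [(0, 50), (15, 40), (30, 30), (45, 20)]
values = [v for _, v in gauge_points]

# The counter logic still returns a number, but it is a positive "increase"
# for a value that only ever went down.
print(reset_adjusted_increase(values))  # 90.0

# The regression slope reports the actual downward trend (about -0.67 per second).
print(slope_per_second(gauge_points))
```

Nothing stops the counter-style calculation from running, which is why the type is only informative, but the result is only meaningful when the samples really behave like a counter.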

Regards

--
(o- Julien Pivotto
//\ Open-Source Consultant
V_/_ Inuits - https://www.inuits.eu

Venkata Bhagavatula

Mar 10, 2020, 3:48:04 AM
to Julien Pivotto, Stuart Clark, Prometheus Users
Hi Stuart, Julien,

The application does the following for the metrics discussed in the earlier mails:

Some of the metrics used in these charts have multiple labels. Because of the many labels and the range of values they can take, the cardinality of the metrics can be very high. To avoid exponential growth in the number of metric combinations that Prometheus ends up scraping, the application cleans up counters that have not been incremented for some period. So at some point, some of the series (which still hold a current value) are removed.

Can this be the reason why we see a drop in the counter value in the above charts?


Thanks n Regards,

Chalapathi.




 


Stuart Clark

Mar 10, 2020, 4:56:32 AM
to Venkata Bhagavatula, Julien Pivotto, Prometheus Users
Counters must not decrease unless due to a service restart or overflow event. Otherwise the counter logic will assume the decrease is actually due to a reset/overflow and rate() will return a spike at that point.
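
A sketch of how removing idle series could produce such a spike. The recording rule itself is not shown in this thread, so this assumes, purely for illustration, that it sums the counter across label combinations. It also uses a simplified reset handling without window extrapolation, and every name and value is hypothetical.

```python
def reset_adjusted_increase(samples):
    """Counter-style increase: every drop is treated as a restart from zero."""
    total = 0.0
    prev = samples[0]
    for cur in samples[1:]:
        total += cur if cur < prev else cur - prev
        prev = cur
    return total


def summed(*series_list):
    """Per-scrape sum over the series that still exist (assumed rule: a sum)."""
    return [sum(v for v in column if v is not None) for column in zip(*series_list)]


def present(series):
    """Samples of one series, skipping scrapes where it no longer exists."""
    return [v for v in series if v is not None]


# Two label combinations of one counter. None marks scrapes after the
# application removed the idle series.
series_a = [500, 500, None, None]   # idle, then cleaned up
series_b = [100, 120, 140, 160]     # still being incremented

recorded = summed(series_a, series_b)
print(recorded)                           # [600, 620, 140, 160]

# Increase of the summed series: the drop from 620 to 140 looks like a reset.
print(reset_adjusted_increase(recorded))  # 180.0 (20 + 140 + 20)

# Increase per series, then summed: the removed series just ends, no drop is seen.
per_series = sum(reset_adjusted_increase(present(s)) for s in (series_a, series_b))
print(per_series)                         # 60.0
```

Under that assumption, the per-series view never sees a decrease because the idle series simply stops, while the summed series drops by the removed value and is read as a reset. If the rule is shaped this way, taking the increase per series first and aggregating afterwards avoids the spike.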

