Why does the result of delta() not match the raw data?

Sungup Moon

Nov 19, 2020, 9:29:12 PM
to Prometheus Users
Hello.

Currently I’m making a dashboard to detect device errors using an error counter. The error counter is cumulative, so I thought delta() would be a good function for detecting them.

But the delta value does not show the real difference between samples; the output is always a little bigger than the real difference.

Below are the queries and the result from Grafana. When I query Prometheus directly, I get a similar result with a 15s ~ 1m interval.

Query:
 1. normal query: error_counter_something{job="monitor", device="dev0", serial="xxxxxxxx"}
 2. delta query: delta(error_counter_something{job="monitor", device="dev0", serial="xxxxxxxx"}[$__interval]) > 0

Time Range: 2020-11-19 16:16:00 ~ 2020-11-19 16:20:00 with 15sec interval

Result:

Between 16:16:15 and 16:16:30 the device raised 2 errors, moving the error counter from 7616 to 7618, but the delta query shows a result of 3.

time               , delta, normal
2020-11-19 16:16:00,      , 7616
2020-11-19 16:16:15,      , 7616
2020-11-19 16:16:30,     3, 7618
2020-11-19 16:16:45,      , 7618
2020-11-19 16:17:00,      , 7618
(these values stay the same until the end of the query time range)

Am I misusing the delta() function? I would be very grateful if anybody could tell me how to detect the error count using delta() or any other way.

(I’m sorry that I can’t share the detailed query information or a data snapshot.)

Thanks!

Brian Brazil

Nov 20, 2020, 7:11:49 AM
to Sungup Moon, Prometheus Users
On Fri, 20 Nov 2020 at 02:29, Sungup Moon <mono...@gmail.com> wrote:
> Currently I’m making a dashboard to detect device errors using an error counter. The error counter is cumulative, so I thought delta() would be a good function for detecting them.

delta() is for gauges; it is incorrect to use it with counters. You're probably looking for increase().

Brian
 

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/b26e2db7-0a67-4e13-ad3f-046caeec6a78n%40googlegroups.com.


--

b.ca...@pobox.com

Nov 20, 2020, 7:48:59 AM
to Prometheus Users
On Friday, 20 November 2020 at 02:29:12 UTC mono...@gmail.com wrote:
> Query:
>  1. normal query: error_counter_something{job="monitor", device="dev0", serial="xxxxxxxx"}
>  2. delta query: delta(error_counter_something{job="monitor", device="dev0", serial="xxxxxxxx"}[$__interval]) > 0
>
> Time Range: 2020-11-19 16:16:00 ~ 2020-11-19 16:20:00 with 15sec interval
>
> Result:
>
> Between 16:16:15 and 16:16:30 the device raised 2 errors, moving the error counter from 7616 to 7618, but the delta query shows a result of 3.
>
> time               , delta, normal
> 2020-11-19 16:16:00,      , 7616
> 2020-11-19 16:16:15,      , 7616
> 2020-11-19 16:16:30,     3, 7618
> 2020-11-19 16:16:45,      , 7618
> 2020-11-19 16:17:00,      , 7618
> (these values stay the same until the end of the query time range)

See https://prometheus.io/docs/prometheus/latest/querying/functions/#delta

"delta(v range-vector) calculates the difference between the first and last value of each time series element in a range vector v, returning an instant vector with the given deltas and equivalent labels. The delta is extrapolated to cover the full time range as specified in the range vector selector, so that it is possible to get a non-integer result even if the sample values are all integers."

You haven't said what $__interval expands to in your query.  It must be at least 30 seconds, because otherwise you wouldn't have two values in your range vector.

So let's see what happens with 30 seconds.  The window contains two values:

[...X........X...]
   7616    7618
    <--15s-->

The difference between these is 2, and the time interval between them is 15 seconds.  However this increase is then extrapolated to cover the whole window period of 30 seconds, so the value returned by delta() would be 4.

What about if $__interval was 45 seconds?  Then you'd have three values, the difference between the first and last is 2, the time difference is 30 seconds extrapolated to 45 seconds, so the result would be 2 x (45/30) = 3.
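
The arithmetic in the two cases above can be checked with a minimal Python sketch. It assumes the simplified model described here: the first-to-last difference is scaled by the window length divided by the time actually covered by the samples. The function name `extrapolated_delta` and the relative timestamps are hypothetical, and Prometheus's real implementation applies additional limits near the window edges that this sketch ignores.

```python
def extrapolated_delta(samples, window_seconds):
    """Simplified model of delta() extrapolation (an assumption, not Prometheus code).

    samples: list of (timestamp_seconds, value) tuples inside the window,
             sorted by timestamp.
    window_seconds: length of the range selector, e.g. 30 for [30s].
    """
    if len(samples) < 2:
        return None  # a range function needs at least two samples
    (t_first, v_first), (t_last, v_last) = samples[0], samples[-1]
    covered = t_last - t_first
    if covered <= 0:
        return None  # guard against duplicate timestamps
    # Scale the observed difference up to the full window length.
    return (v_last - v_first) * window_seconds / covered


# 30s window: two samples 15s apart, counter goes 7616 -> 7618.
two_samples = [(15, 7616), (30, 7618)]
print(extrapolated_delta(two_samples, 30))    # 4.0

# 45s window: three samples spanning 30s, same overall change.
three_samples = [(0, 7616), (15, 7616), (30, 7618)]
print(extrapolated_delta(three_samples, 45))  # 3.0
```

Under this model, the value of 3 reported in the question is what a 45-second window would produce.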

If you want the actual difference between the metric now and the metric some time ago, you can do:

something - something offset 15s

However, both that expression and delta() will give you nonsense values if a counter resets, because it will jump back down towards zero and give you a large negative value.

Better:

(something - something offset 15s) >= 0

but it won't handle counter resets as well as rate() or increase() can.
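
To illustrate why a plain subtraction misbehaves across a counter reset, here is a small Python sketch. The sample values after the reset (3 and 5) are made up for illustration, the function names are hypothetical, and the reset rule (treat any decrease as a restart from zero) is a simplified stand-in for what rate() and increase() do, leaving out their extrapolation.

```python
def plain_difference(values):
    """Last value minus first value, like `something - something offset ...`."""
    return values[-1] - values[0]


def reset_aware_increase(values):
    """Sum the increases between consecutive samples.

    A decrease is interpreted as a counter reset: the counter is assumed to
    have restarted from zero, so the new value itself counts as the increase.
    """
    total = 0
    for prev, cur in zip(values, values[1:]):
        total += cur - prev if cur >= prev else cur
    return total


# Hypothetical series: two errors, then the device restarts and counts 3, then 5.
values = [7616, 7618, 3, 5]
print(plain_difference(values))      # -7611
print(reset_aware_increase(values))  # 7
```

The `>= 0` filter shown above would simply hide the -7611 result, whereas the reset-aware version still counts the errors seen on both sides of the restart.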

Aliaksandr Valialkin

Nov 21, 2020, 9:15:35 AM
to b.ca...@pobox.com, Prometheus Users
An alternative is to use the increase() function from MetricsQL: it doesn't extrapolate results, and it takes into account the last value before the window in square brackets, so it returns the exact expected values. See more details at https://victoriametrics.github.io/MetricsQL.html
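
As a rough illustration of that idea, the following Python sketch computes an increase without extrapolation by using the last sample before the window as the baseline. It is based only on the one-sentence description above, not on the VictoriaMetrics source; the function name, the half-open window convention, and the relative timestamps are assumptions, and counter resets are not handled.

```python
def increase_without_extrapolation(samples, window_start, window_end):
    """Difference between the last sample in (window_start, window_end]
    and the last sample at or before window_start.

    samples: list of (timestamp_seconds, value) tuples sorted by timestamp.
    """
    before = [v for t, v in samples if t <= window_start]
    inside = [v for t, v in samples if window_start < t <= window_end]
    if not inside:
        return None  # nothing to report for an empty window
    # Fall back to the first in-window sample if no earlier sample exists.
    baseline = before[-1] if before else inside[0]
    return inside[-1] - baseline


# Samples from the question, with 16:16:00 taken as t=0.
samples = [(0, 7616), (15, 7616), (30, 7618), (45, 7618)]
print(increase_without_extrapolation(samples, 15, 30))  # 2
print(increase_without_extrapolation(samples, 30, 45))  # 0
```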

--
Best Regards,

Aliaksandr Valialkin, CTO VictoriaMetrics

Ben Kochie

Nov 21, 2020, 9:34:47 AM
to Aliaksandr Valialkin, b.ca...@pobox.com, Prometheus Users
While that sounds like a good idea, it's going to produce less accurate results for most use cases.

Aliaksandr Valialkin

Nov 21, 2020, 9:50:59 AM
to Ben Kochie, b.ca...@pobox.com, Prometheus Users
On Sat, Nov 21, 2020 at 4:34 PM Ben Kochie <sup...@gmail.com> wrote:
> While that sounds like a good idea, it's going to produce less accurate results for most use cases.

Could you provide practical examples?

Julien Pivotto

Nov 22, 2020, 9:07:26 AM
to Ben Kochie, Aliaksandr Valialkin, b.ca...@pobox.com, Prometheus Users
On 21 Nov 15:34, Ben Kochie wrote:
> While that sounds like a good idea, it's going to produce less accurate
> results for most use cases.

It was agreed at a previous dev summit that we would look into how we can
improve rate()-like functions. I don't think anyone has started on this
yet.

--
Julien Pivotto
@roidelapluie