Help with absent data


Jennifer K

Jan 11, 2021, 7:10:36 PM
to Prometheus Users
To anyone who can help...
I've been trying to get a total number of "scrapes" by adding successful, unsuccessful and absent points together.
This query isn't working. Can anyone explain why?

(probe_success==bool 0) + (probe_success == bool 1) + ignoring (target) sum without (target) (absent (probe_success))

It should be simple, but I just can't get it to work. Any help would be greatly appreciated.
Thanks!
Jennifer

Ben Kochie

Jan 12, 2021, 2:43:26 AM
to Jennifer K, Prometheus Users
count_over_time(probe_success[5m])

--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/a8b8a4a0-59e7-4087-bba6-a54252ebafdcn%40googlegroups.com.

jennife...@gmail.com

Jan 12, 2021, 7:29:56 AM
to Ben Kochie, Prometheus Users
Ben, thank you very much for responding- but if there is absent data then probe_success won’t count that data-
Correct? 
Jennifer 


Julius Volz

Jan 12, 2021, 9:40:17 AM
to Jennifer K, Prometheus Users
Do you want to count across scraped instances at *one* point in time, or do you want to count scrapes of a / each single instance *over* time?

--
Julius Volz
PromLabs - promlabs.com

Jennifer K

Jan 12, 2021, 11:26:02 AM
to Julius Volz, Prometheus Users
Basically, I'm trying to fix an error I'm seeing when calculating the percentage. What I'm seeing:
sometimes the number of "good" scrapes is higher or lower than the number of scrapes in [$__range]. I think this is just because sometimes I get one extra scrape compared to the number of scrapes in [$__range], and sometimes it's spot on. I can't have a dashboard that only sometimes shows the right value.
Therefore, I'm trying to make this work so it doesn't show values over or under 100% when the value should be 100%. I am trying to "fix" the denominator to be a value from a rule instead of [$__range].
To do this, I know I have "successful/unsuccessful probes" and "absent data". I can quantify the probes using "probe_success", but when I add the absent function to "probe_success" it doesn't work. I think it has something to do with the vectors being different, which is why I was trying to use the ignoring modifier.

so... to answer your question-
I would want to count scrapes over the specified interval to include absent data as well- basically if my scrape is set to 30s, an interval of 1 hour should return 120 (to include successful/unsuccessful and absent data)
what is the best algorithm to use for that?
Thanks so much!
Jennifer
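
For reference, the expected count described above (a 30s scrape interval over 1 hour giving 120) is just the range divided by the scrape interval. A minimal Python sketch of that arithmetic follows; the function name `expected_scrapes` is hypothetical and only illustrates the calculation, it is not a Prometheus API.

```python
def expected_scrapes(range_seconds: int, scrape_interval_seconds: int) -> int:
    """Number of scrapes that should fall in the range if none were missed.

    Hypothetical helper for illustration; not part of Prometheus.
    """
    if scrape_interval_seconds <= 0:
        raise ValueError("scrape interval must be positive")
    return range_seconds // scrape_interval_seconds


# 1 hour range, 30s scrape interval, as in the post above.
print(expected_scrapes(3600, 30))  # 120
```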

Ben Kochie

Jan 13, 2021, 10:19:20 AM
to Jennifer K, Julius Volz, Prometheus Users
If you're looking for a "percent of OK scrapes" you can use:

avg_over_time(probe_success[$__range])

Then in Grafana, you can select "Percent (0 - 1)" to display the ratio as a percent.

probe_success will always be returned as long as the blackbox_exporter is functioning properly.

Marcelo Magallón

Jan 13, 2021, 10:30:19 PM
to Prometheus Users
The problem with using avg_over_time for this is that it will ignore missing data points.

If I understand Jennifer's request correctly, she's looking for a way to start with data like {timestamp, value}, say the following values:

{0, 1}
{10, 1}
{20, 0}
{40, 1}
{50, 1}

and consider the missing {30, x} as a 0.

so in that example, avg_over_time would return (1+1+0+1+1)/5 = 80% and Jennifer wants (1+1+0+0+1+1)/6 = 67%

My understanding of the original question is how to obtain the 6.
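
The two percentages in that example can be checked with a short Python sketch. It assumes a 10-second scrape interval (inferred from the timestamps in the example) and treats any expected timestamp with no sample as a 0. The function names are hypothetical and only illustrate the arithmetic; they are not PromQL.

```python
# Samples from the example as {timestamp: value}; the scrape at t=30 is missing.
samples = {0: 1, 10: 1, 20: 0, 40: 1, 50: 1}


def ratio_ignoring_gaps(samples: dict) -> float:
    """Average over the samples that exist (what avg_over_time does in the example)."""
    return sum(samples.values()) / len(samples)


def ratio_counting_gaps(samples: dict, start: int, end: int, interval: int) -> float:
    """Average over every expected timestamp, counting a missing sample as 0."""
    expected = range(start, end + 1, interval)  # 0, 10, ..., 50 -> 6 timestamps
    return sum(samples.get(t, 0) for t in expected) / len(expected)


print(ratio_ignoring_gaps(samples))                        # 0.8
print(round(ratio_counting_gaps(samples, 0, 50, 10), 2))   # 0.67
```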

The numerator is easy: sum_over_time.

For the denominator the best I can come up with is: count_over_time((vector(1))[5m:]), which is a really weird way of asking "how many samples should there be in this range?"

Marcelo




--
Marcelo Magallón

Jennifer K

Jan 20, 2021, 8:08:05 PM
to Marcelo Magallón, Prometheus Users
Marcelo,
You are correct, I am looking for the (1+1+0+0+1+1)/6 = 67% value.
I tried count_over_time as you explained above. However, if Prometheus isn't collecting from a particular target, the count_over_time value doesn't count the steps that are absent. Following your example, I get 5, not 6.
Jennifer
 
