site uptime calculation


Alan Miller

Sep 29, 2023, 3:41:41 AM
to Prometheus Users
I'm trying to put together a query to show the average uptime of the sites my blackbox exporter is polling.

So there's a metric probe_http_status_code that has the result code (eg. 200) of the probe.

BB polls my sites every minute, so I see 1440 samples per day and 40320 per 4 weeks:
  count_over_time(probe_http_status_code{instance="https://www.mysite.com:443"}[4w]) 

I verified the status code is always 200
    probe_http_status_code{instance="https://www.mysite.com:443"} 

so in theory I should be able to calculate uptime percentage as:
   sum_over_time(metric[4w]) / (200 * count_over_time(metric[4w]))

Why does this query show uptimes >100%?

100 * ((sum_over_time(probe_http_status_code{instance="https://www.mysite.com:443"}[4w]) / (200 * (count_over_time(probe_http_status_code{instance="https://www.mysite.com:443"}[4w])))))


Ben Kochie

Sep 29, 2023, 3:46:07 AM
to Alan Miller, Prometheus Users
The `probe_success` metric should give you a boolean result for each probe. For example, the default http_2xx example module reports 0 for non-2xx responses.

Then you can simply do `avg_over_time(probe_success[4w])` to find out your availability.
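
To see why the status-code ratio can exceed 100% while an average of a 0/1 metric cannot, here is a rough Python sketch of the arithmetic. The sample values are invented for illustration, and the `availability` helper only imitates the default 2xx rule described above; it is not how the exporter computes `probe_success`. Any error code such as 404 or 503 is numerically larger than 200, so it pushes the sum above `200 * count`. An instant query only shows the latest value, so earlier non-200 samples in the 4w window could go unnoticed.

```python
# Hypothetical probe results, one per scrape; values invented for illustration.
status_codes = [200] * 8 + [404, 503]

def status_code_ratio(codes):
    """The formula from the question: sum / (200 * count), as a percentage."""
    return 100 * sum(codes) / (200 * len(codes))

def availability(codes):
    """Imitates avg_over_time(probe_success): mean of 1 (2xx) or 0 (anything else).

    Assumption: success means a 2xx status code, as with the default http_2xx example.
    """
    success = [1 if 200 <= c < 300 else 0 for c in codes]
    return 100 * sum(success) / len(success)

print(status_code_ratio(status_codes))  # above 100, because 404 and 503 are > 200
print(availability(status_codes))       # 80.0, since 8 of the 10 probes succeeded
```

Since every sample of a boolean metric is 0 or 1, its average always stays between 0 and 1, so something like `100 * avg_over_time(probe_success{instance="https://www.mysite.com:443"}[4w])` should give a percentage that never goes over 100.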
