Alert Query

sri L

Oct 4, 2023, 10:59:23 PM
to Prometheus Users
Hi all,

Can anyone please suggest an alert expression for an alerting rule that covers the condition below?

"Metric data is not being received by Prometheus; alert that there is an issue with Prometheus and that it is unable to scrape."

Thanks


hartfordfive

Oct 6, 2023, 9:43:59 AM
to Prometheus Users
If you're looking to determine whether a target is reachable, you can use the "up" metric, which Prometheus automatically adds for each scrape of a target (see docs). The alerting rule could look something like this:

alert: TargetIsUnreachable
expr: up == 0
for: 3m
labels:
  severity: warning
annotations:
  title: Instance {{ $labels.instance }} is unreachable
  description: Prometheus is unable to scrape {{ $labels.instance }}. This could indicate that the target is down or that there is a network issue.



This will trigger the alert if the "up" metric stays equal to 0 (in other words, the instance is unreachable) for a period of 3 minutes. The value of the "for" parameter should probably be at least 2 to 3 times your scrape_interval setting (see docs for reference). It's often advised to add the "for" parameter to alerting conditions to avoid noise from flapping alerts: you wouldn't necessarily want to be notified when a single scrape fails, say due to a transient network connectivity problem.

There is also the "absent" function (see docs), which you can use to determine whether any series exist for a given metric name and label combination. You would use that in cases where you want to be notified if a given metric disappears, for example because the target itself has disappeared from service discovery.
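As a rough illustration of the "absent" approach, a rule along these lines could fire when no "up" series exists at all for a job. The job name "node", the alert name, and the 5m duration are placeholders picked for the example, not values from this thread.

```yaml
# Sketch only: job name, alert name and duration are assumed placeholders.
alert: JobHasNoTargets
# absent() returns a single series with value 1 when nothing matches the selector.
expr: absent(up{job="node"})
for: 5m
labels:
  severity: warning
annotations:
  title: No "up" series for job {{ $labels.job }}
  description: Prometheus has no "up" series for job {{ $labels.job }}. The targets may have disappeared from service discovery.
```

Unlike "up == 0", which needs a target that is still being scraped (and failing), this covers the case where the target is no longer known to Prometheus at all.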

As for determining whether there is an actual problem with Prometheus itself, that can vary depending on the issue, but here's a good list of known alerting conditions that you can use to monitor the state of Prometheus instances:
https://samber.github.io/awesome-prometheus-alerts/rules.html#prometheus-self-monitoring
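For a sense of what such self-monitoring rules look like, here is a minimal sketch built on two of Prometheus's own internal counters. It assumes Prometheus scrapes its own /metrics endpoint; the alert names, the 5m windows, and the severities are illustrative choices rather than a copy of the rules on that page.

```yaml
# Sketch only: alert names, windows and severities are assumed placeholders.
- alert: PrometheusTsdbCompactionsFailed
  # Storage-side problem: TSDB compactions have failed recently.
  expr: increase(prometheus_tsdb_compactions_failed_total[5m]) > 0
  labels:
    severity: critical
  annotations:
    title: Prometheus TSDB compactions failed on {{ $labels.instance }}
- alert: PrometheusRuleEvaluationFailures
  # Rule evaluation problem: some alerting or recording rules could not be evaluated.
  expr: increase(prometheus_rule_evaluation_failures_total[5m]) > 0
  labels:
    severity: critical
  annotations:
    title: Prometheus rule evaluation failures on {{ $labels.instance }}
```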

sri L

Oct 7, 2023, 10:12:29 AM
to Prometheus Users
In case of any issue with storage or networking within Prometheus that prevents metrics from being scraped successfully, we would like to identify that scenario using a specific metric that can alert us that metrics are not being received.