Requirements:
We need a stability filter because the service is sometimes unable to push metrics when the push gateway service goes down for 1 minute and recovers within 2 minutes. We do not want to send firing alerts in this scenario.
Prometheus configurations:
evaluation_interval: 1m
scrape_interval: 30s
Alert rule:
- alert: forecaster
  expr: rate(forecasts_published_counter{job="metrics_job", module_name="forecaster"}[5m]) <= 0
  for: 5m

I experimented with a stability filter using the FOR clause. It works for firing alerts, but it does not work for resolving alerts.
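To make the asymmetry concrete, here is a minimal sketch of the behaviour I am seeing. It is a simplified model and not Prometheus code: it assumes the 1m evaluation_interval above, so `for: 5m` spans five evaluation intervals, and the function name `alert_states` and the sample series are hypothetical.

```python
from typing import List


def alert_states(expr_true: List[bool], for_evals: int) -> List[str]:
    """Return the alert state after each rule evaluation (simplified model)."""
    states = []
    consecutive = 0  # evaluations in a row where the expression matched
    for matched in expr_true:
        if not matched:
            # A single non-matching evaluation resets the alert immediately;
            # there is no hold-off on the way back to inactive.
            consecutive = 0
            states.append("inactive")
            continue
        consecutive += 1
        # The alert stays pending until the expression has matched for the
        # whole `for` duration, i.e. for_evals intervals after the first match.
        states.append("firing" if consecutive > for_evals else "pending")
    return states


# Hypothetical series: the expression matches for 7 evaluations, fails once,
# then matches again for 2 evaluations.
samples = [True] * 7 + [False] + [True] * 2
for minute, state in enumerate(alert_states(samples, for_evals=5)):
    print(f"t+{minute}m: {state}")
```

In this model the alert needs six consecutive matching evaluations to fire, but one non-matching evaluation is enough to resolve it, which is the gap I would like to close.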
The service has not published forecasts for over 5 minutes:
The service has been publishing forecasts for over 5 minutes:
I could change the evaluation interval to 5m, but that would affect other services, so I do not want to change it.
Is there any other way to set a 5m stability filter in Prometheus for the alert state change from firing to inactive (resolved)?
Thanks,
Shivakumar Sajjan