alert: K8sPodRestartingTooMuch
expr: rate(kube_pod_container_status_restarts_total[1m]) > 1 / (5 * 60)
for: 30m
labels:
  severity: warning

alert: PodFrequentlyRestarting
expr: increase(kube_pod_container_status_restarts_total[1h]) > 5
for: 10m
labels:
  severity: warning
annotations:
  description: Pod {{`{{$labels.namespace}}`}}/{{`{{$labels.pod}}`}} was restarted {{`{{$value}}`}} times within the last hour
  summary: Pod is restarting frequently
{{ end }}

Currently, we alert when pods restart too often, and our alert uses the rate function.
We checked the standard rules provided at https://github.com/coreos/prometheus-operator/blob/master/helm/exporter-kube-state/templates/kube-state-metrics.rules.yaml, where the increase function is used instead of rate. What is the difference, and which one is better to use in this scenario?
--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-users+unsubscribe@googlegroups.com.
To post to this group, send email to prometheus-users@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/7df7e427-22b7-4571-9062-61cab7077f89%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
On 8 June 2018 at 10:12, isha girdhar < hashag...@gmail.com > wrote:
> Currently, we are alerting if pods are restarting too much. We are using rate function in our alert.
> We checked the standard rules provided here https://github.com/coreos/prometheus-operator/blob/master/helm/exporter-kube-state/templates/kube-state-metrics.rules.yaml where increase function is used instead of rate. What is the difference and which one is better to use in this scenario?

The difference is fairly minor: increase is syntactic sugar over rate, so increase(x[1h]) is exactly the same as rate(x[1h]) * 3600. rate produces a per-second result, while increase gives the result over the time range you pass it.
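
To make the equivalence concrete, here is a rough sketch that rewrites the two alert expressions from the original post in terms of each other. It relies only on the relationship stated above (increase over a window equals rate times the window length in seconds); the rewritten forms are illustrations, not rules taken from any published rule file.

```promql
# Standard rule, as posted:
increase(kube_pod_container_status_restarts_total[1h]) > 5
# Same condition written with rate (1h = 3600 seconds):
rate(kube_pod_container_status_restarts_total[1h]) * 3600 > 5
rate(kube_pod_container_status_restarts_total[1h]) > 5 / 3600

# Rule from the original post:
rate(kube_pod_container_status_restarts_total[1m]) > 1 / (5 * 60)
# Same condition written with increase (1m = 60 seconds, so 60 / 300 = 0.2):
increase(kube_pod_container_status_restarts_total[1m]) > 0.2
```

Seen this way, the two alerts differ in window length ([1m] versus [1h]) and in per-second threshold (1/300 versus 5/3600), not in which function they call.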
--
Brian Brazil
Apparently Ctrl-Enter sends an email, let me finish that...