I have two alert rules that are used to monitor whether Node/Pod status is normal.
Something like:
- alert: node-down
  expr: kube_node_status_condition{condition="Ready",status!="true"} > 0
  for: 5m
  annotations:
    summary: '[p0] Node status abnormal'
    description: 'Node `{{ $labels.node }}` down'
However, my cluster sometimes needs to autoscale out and then scale back down (triggered by Azure). When that happens, the alert rules always fire.
Is there a better way to monitor Node/Pod status?
I only want to be notified if the Node/Pod status is truly abnormal.
If a Node/Pod goes down because of autoscaling, I would like that to be filtered out.