Alertmanager email issues

Daniello

Dec 19, 2024, 4:11:16 AM
to Prometheus Users
Hi everyone
I am running kube-prometheus on K8s cluster - installed with helm.
Additionally, I set up AlertmanagerConfig and PrometheusRule custom resources.

Issue:
I have two sets of alerts, one called PodErrorAlert and the other PodPendingFor30mAlert. When I check the UI I can see both alerts, and both get triggered. However, only PodPendingFor30mAlert successfully sends an email.
I do see a difference in how the two alerts are treated in the logs, but I am not sure how to solve the issue.

Technical info/ logs/ manifests
1. You can find alertmanager debug logs here

2. AlertmanagerConfig

3. PrometheusRules



Daniello

Dec 19, 2024, 4:12:40 AM
to Prometheus Users
Any help is greatly appreciated :) 

Brian Candler

Dec 19, 2024, 6:11:17 AM
to Prometheus Users
> I am running kube-prometheus on K8s cluster - installed with helm.

Software versions?

> When I check the UI

Which UI are you referring to? Prometheus' web UI has an "alerts" section, and Alertmanager's web UI also has an "alerts" section. Do they both show the active PodErrorAlert?

Check the exact set of labels on each active alert, to see if there's any significant difference. AFAICS they should both have namespace, pod and severity.
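
One way to make that label comparison less error-prone is to diff the two label sets programmatically. The sketch below is only an illustration: the label values are hypothetical, and label_diff is a helper made up for this example. In practice the labels could be copied from either web UI, or taken from the "labels" field of each entry returned by Alertmanager's GET /api/v2/alerts endpoint.

```python
def label_diff(a, b):
    """Return {label: (value_in_a, value_in_b)} for every label that differs.

    A label missing from one side shows up with None on that side.
    """
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


# Hypothetical label sets, for illustration only (not taken from the thread).
pending = {
    "alertname": "PodPendingFor30mAlert",
    "namespace": "default",
    "pod": "web-0",
    "severity": "warning",
}
error = {
    "alertname": "PodErrorAlert",
    "pod": "web-1",
    "severity": "warning",
}

# Print each label whose value differs, or that exists on one side only.
for name, (left, right) in label_diff(pending, error).items():
    print(f"{name}: {left!r} vs {right!r}")
```

A label that is present on one alert and absent on the other (shown as None) is the kind of difference worth looking at first. A missing namespace label in particular matters because, by default, the Prometheus Operator scopes the routes generated from an AlertmanagerConfig to alerts whose namespace label matches the namespace of that resource.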

Does the PodErrorAlert alert remain active for the full groupWait period (30 seconds), or does it keep resolving and firing again? If the alerts are resolving too quickly, you could try this:

        - alert: PodErrorAlert
          for: 3m
          keep_firing_for: 3m    # (Prometheus v2.42.0+)
