Need help for alert test using Promtool

Shivanand Shete

Jul 18, 2022, 1:49:42 AM
to Prometheus Users
Dear all,

Please find the alert rule below. I want to test this alert using promtool.

groups:
  - name: replicas-mismatch
    rules:
      - alert: KubernetesDeploymentReplicasMismatch-authproxy
        expr: kube_replicaset_spec_replicas{namespace="auth-proxy"} != kube_deployment_status_replicas_available{namespace="auth-proxy"}
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: Kubernetes Deployment replicas mismatch (instance {{ $labels.instance }})
          description: "Deployment Replicas mismatch\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"

I have also written the test case below, but it is not working. Please suggest a fix.

rule_files:
  - /testdata/deployment_replicas_mismatch.yaml
evaluation_interval: 1m
tests:
  - interval: 1m
    # Series Data
    input_series:
      - series: kube_replicaset_spec_replicas{job="prometheus", namespace="auth-proxy"}
        values: '5+0x9 5+0x20 5+0x100000'
      - series: kube_deployment_status_replicas_available{job="prometheus", namespace="auth-proxy"}
        values: '5+0x9 4+0x20 5+0x100000'
    alert_rule_test:
      # Unit Test 1
      - eval_time: 9m
        alertname: KubernetesDeploymentReplicasMismatch-authproxy
        exp_alerts:

      - eval_time: 20m
        alertname: KubernetesDeploymentReplicasMismatch-authproxy
        exp_alerts:
          - exp_labels:
              namespace: auth-proxy
              job: prometheus
              severity: critical
            exp_annotations:
              summary: "Kube_replicaset_spec_replicas_authproxy missmatches"
              description: "YaRD_Kubernetes Deployment Replicas Mismatch in authproxy namespace from 11 min getting alert"
replicas_mismatch_test.yaml
deployment_replicas_mismatch.yaml

David Leadbeater

Jul 18, 2022, 2:52:24 AM
to Shivanand Shete, Prometheus Users
In your attachment the alertname in the rules and the one in the test
don't match. The unit tests match on the alert name first, so fix that
first; then you can iterate on the other fields that need to match (it
looks like the annotations need adjusting).

David

David Leadbeater

Jul 18, 2022, 3:16:41 AM
to Shivanand Shete, Prometheus Users
You're alerting in the rules with annotations as follows:

annotations:
  summary: Kubernetes Deployment replicas mismatch (instance {{ $labels.instance }})
  description: "Deployment Replicas mismatch\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"

Then expecting they match:

exp_annotations:
  summary: "Kube_replicaset_spec_replicas_authproxy missmatches"
  description: "YaRD_Kubernetes Deployment Replicas Mismatch in authproxy namespace from 11 min getting alert"

You need to update the expected annotations to match what the rules
are generating.

David

Shivanand Shete

Jul 18, 2022, 4:30:19 AM
to David Leadbeater, Prometheus Users
Hi David,

I corrected the alert name. Please find the updated .yaml files attached.
Alert:

groups:
  - name: replicas-mismatch
    rules:
      - alert: KubernetesDeploymentReplicasMismatch-authproxy
        expr: kube_replicaset_spec_replicas{namespace="auth-proxy"} != kube_deployment_status_replicas_available{namespace="auth-proxy"}
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: Kubernetes Deployment replicas mismatch (instance {{ $labels.instance }})
          description: "Deployment Replicas mismatch\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"


TestCase: 
Regards,
Shivanand Shete.

deployment_replicas_mismatch.yaml
replicas_mismatch_test.yaml

Brian Candler

Jul 18, 2022, 4:53:47 AM
to Prometheus Users
If I run those tests, I get an error describing exactly what the problem is:

root@prometheus:~# /opt/prometheus/promtool test rules replicas_mismatch_test.yaml
Unit Testing:  replicas_mismatch_test.yaml
  FAILED:
    alertname: KubernetesDeploymentReplicasMismatch-authproxy, time: 20m,
        exp:[
            0:
              Labels:{alertname="KubernetesDeploymentReplicasMismatch-authproxy", job="prometheus", namespace="auth-proxy", severity="critical"}
              Annotations:{description="YaRD_Kubernetes Deployment Replicas Mismatch in authproxy namespace from 11 min getting alert", summary="Kube_replicaset_spec_replicas_authproxy missmatches"}
            ],
        got:[
            0:
              Labels:{alertname="KubernetesDeploymentReplicasMismatch-authproxy", job="prometheus", namespace="auth-proxy", severity="critical"}
              Annotations:{description="Deployment Replicas mismatch\n  VALUE = 5\n  LABELS = map[__name__:kube_replicaset_spec_replicas job:prometheus namespace:auth-proxy]", summary="Kubernetes Deployment replicas mismatch (instance )"}
            ]

That's very clear.  Notice how "exp" (expected) is different to "got" (what the alerting rule actually produced).  You can either fix the "exp" to match the annotations generated by the alerting rule:

                         exp_annotations:
                               summary: "Kubernetes Deployment replicas mismatch (instance )"
                               description: "Deployment Replicas mismatch\n  VALUE = 5\n  LABELS = map[__name__:kube_replicaset_spec_replicas job:prometheus namespace:auth-proxy]"

(Notice that the instance in the summary is blank, because you didn't set an instance label in your test data).
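
If a populated instance is wanted in the summary, one option is to give the test series an instance label. The fragment below is only a sketch: the value `host-1:8080` is made up for illustration and does not come from the original files. Both series carry the same label set so that the `!=` comparison can pair them up.

```yaml
input_series:
  # instance="host-1:8080" is a hypothetical value, not from the original test data
  - series: 'kube_replicaset_spec_replicas{job="prometheus", namespace="auth-proxy", instance="host-1:8080"}'
    values: '5+0x9 5+0x20'
  - series: 'kube_deployment_status_replicas_available{job="prometheus", namespace="auth-proxy", instance="host-1:8080"}'
    values: '5+0x9 4+0x20'
```

With input like this the alert itself also carries the instance label, so `exp_labels` would need a matching `instance` entry, and the expected summary and description would have to be updated to whatever promtool reports under "got".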

Or you can change the alerting rules themselves to give the annotations that you expect in your test.  That's the whole point of testing - to see that what you generate is what you expect.

Aside: I see no point in putting LABELS in an annotation.  They are already part of the alert.  It only duplicates information, and makes the alert harder to understand and harder to test.
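
As a rough illustration of that aside, the rule could drop the LABELS dump and keep only the value in the description. This is a sketch, not the original poster's rule: the annotation wording is invented here, and only the alert name, expression, `for` and severity are taken from the thread.

```yaml
groups:
  - name: replicas-mismatch
    rules:
      - alert: KubernetesDeploymentReplicasMismatch-authproxy
        expr: kube_replicaset_spec_replicas{namespace="auth-proxy"} != kube_deployment_status_replicas_available{namespace="auth-proxy"}
        for: 10m
        labels:
          severity: critical
        annotations:
          # Hypothetical wording; the labels are already attached to the alert itself
          summary: "Kubernetes Deployment replicas mismatch in {{ $labels.namespace }}"
          description: "Deployment replicas mismatch, VALUE = {{ $value }}"
```

For the test data in this thread, the expected annotations should then reduce to something like `summary: "Kubernetes Deployment replicas mismatch in auth-proxy"` and `description: "Deployment replicas mismatch, VALUE = 5"`, which is easier to keep in sync with the test file.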

Shivanand Shete

Jul 18, 2022, 5:34:22 AM
to David Leadbeater, Prometheus Users
Hi,

I made some changes as you suggested. After that, I am getting the error below.

Unit Testing:  replicas_mismatch_test.yaml
  FAILED:
    alertname: KubernetesDeploymentReplicasMismatch-authproxy, time: 20m,
        exp:[
            0:
              Labels:{alertname="KubernetesDeploymentReplicasMismatch-authproxy", job="prometheus", namespace="auth-proxy", severity="critical"}
              Annotations:{description="Deployment Replicas mismatch\n VALUE = {{ $value }}\n LABELS = {{ $labels }}", summary="Kubernetes Deployment replicas mismatch (instance {{$labels.instance }})"}
            ],
        got:[]


Alert:


groups:
  - name: replicas-mismatch
    rules:
      - alert: KubernetesDeploymentReplicasMismatch-authproxy
        expr: kube_replicaset_spec_replicas{namespace="auth-proxy"} != kube_deployment_status_replicas_available{namespace="auth-proxy"}
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: Kubernetes Deployment replicas mismatch (instance {{ $labels.instance }})
          description: "Deployment Replicas mismatch\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"

TestCase:
rule_files:
  - /testdata/deployment_replicas_mismatch.yaml
evaluation_interval: 1m
tests:
  - interval: 1m
    # Series Data
    input_series:
      - series: kube_replicaset_spec_replicas{job="prometheus", namespace="auth-proxy"}
        values: '5+0x9 5+0x20'
      - series: kube_deployment_status_replicas_available{job="prometheus", namespace="auth-proxy"}
        values: '5+0x9 5+0x20'
    alert_rule_test:
      # Unit Test 1
      - eval_time: 9m
        alertname: KubernetesDeploymentReplicasMismatch-authproxy
        exp_alerts:

      - eval_time: 20m
        alertname: KubernetesDeploymentReplicasMismatch-authproxy
        exp_alerts:
          - exp_labels:
              namespace: auth-proxy
              job: prometheus
              severity: critical
            exp_annotations:
              summary: Kubernetes Deployment replicas mismatch (instance {{$labels.instance }})
              description: "Deployment Replicas mismatch\n VALUE = {{ $value }}\n LABELS = {{ $labels }}"
deployment_replicas_mismatch.yaml
replicas_mismatch_test.yaml

Brian Candler

Jul 18, 2022, 5:52:04 AM
to Prometheus Users
It's saying there's no alert firing at time 20m ("got" is empty).

That seems correct to me.  You've now made the two input_series have identical values:

         input_series:
               - series: kube_replicaset_spec_replicas{job="prometheus", namespace="auth-proxy"}
                 values: '5+0x9 5+0x20'
               - series: kube_deployment_status_replicas_available{job="prometheus", namespace="auth-proxy"}
                 values: '5+0x9 5+0x20'

Hence you wouldn't expect the alert to fire, would you?

Shivanand Shete

Jul 18, 2022, 6:03:18 AM
to Brian Candler, Prometheus Users
Yes,
In the above case, both the ReplicaSet and Deployment replica counts are the same.
If the counts are different, as below:
         input_series:
               - series: kube_replicaset_spec_replicas{job="prometheus", namespace="auth-proxy"}
                 values: '5+0x9 5+0x20'
               - series: kube_deployment_status_replicas_available{job="prometheus", namespace="auth-proxy"}
                 values: '5+0x9 4+0x20'
It will generate an alert.

But if the count is the same, I still get an error.

Brian Candler

Jul 18, 2022, 7:12:57 AM
to Prometheus Users
Of course.  What you are doing is *testing your alerting rules*.  You give it a specific set of inputs and an alerting rule, and you tell it whether you expect an alert to be generated (or not).

Clearly, if you change the inputs, you may get different alerts generated (or not).

Therefore, you should make different test cases for these different scenarios, to test your alerts under different conditions. Like:

1. If the two input metrics are the same, I expect no alert to be raised.
2. If the two input metrics are different, I expect an alert to be raised with particular labels / annotations.
3. etc
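
The first two scenarios could be sketched as two separate test groups in one test file, roughly as below. Treat this as an outline under assumptions rather than a verified file: the rule file path is the one from the original post, the expected annotations are copied from the "got" output earlier in this thread (they must match the rule's rendered text exactly, including whitespace, and may differ on other Prometheus versions), and the `5+0x30` series lengths are chosen here just to cover 30 minutes.

```yaml
rule_files:
  - /testdata/deployment_replicas_mismatch.yaml
evaluation_interval: 1m

tests:
  # Scenario 1: both metrics stay equal, so no alert is expected.
  - interval: 1m
    input_series:
      - series: 'kube_replicaset_spec_replicas{job="prometheus", namespace="auth-proxy"}'
        values: '5+0x30'
      - series: 'kube_deployment_status_replicas_available{job="prometheus", namespace="auth-proxy"}'
        values: '5+0x30'
    alert_rule_test:
      - eval_time: 20m
        alertname: KubernetesDeploymentReplicasMismatch-authproxy
        exp_alerts: []

  # Scenario 2: available replicas drop to 4 from minute 10 onward.
  - interval: 1m
    input_series:
      - series: 'kube_replicaset_spec_replicas{job="prometheus", namespace="auth-proxy"}'
        values: '5+0x30'
      - series: 'kube_deployment_status_replicas_available{job="prometheus", namespace="auth-proxy"}'
        values: '5+0x9 4+0x20'
    alert_rule_test:
      # Before the mismatch starts, nothing should be firing.
      - eval_time: 9m
        alertname: KubernetesDeploymentReplicasMismatch-authproxy
        exp_alerts: []
      # The mismatch has lasted 10m by now, so "for: 10m" is satisfied.
      - eval_time: 20m
        alertname: KubernetesDeploymentReplicasMismatch-authproxy
        exp_alerts:
          - exp_labels:
              namespace: auth-proxy
              job: prometheus
              severity: critical
            exp_annotations:
              # Rendered text, not the {{ ... }} templates from the rule
              summary: "Kubernetes Deployment replicas mismatch (instance )"
              description: "Deployment Replicas mismatch\n  VALUE = 5\n  LABELS = map[__name__:kube_replicaset_spec_replicas job:prometheus namespace:auth-proxy]"
```

Note that the expected annotations hold the rendered text: as the earlier failure output shows, a `{{ $value }}` written in `exp_annotations` appears literally on the "exp" side and is not expanded.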

If the alerts don't behave how you expect - e.g. you don't get an alert when you think there should be one, or you get a different set of labels or annotations than you expect - then this is a good way for you to learn how alerting works in prometheus.

> But if the count is the same, I still get an error.

The system is telling you what *actually happens* under the input conditions you have given. In other words: the error tells you exactly how the actual alert generated (or not generated) differs from the alert(s) you told it to expect. 

It's up to you to decide: is your alerting rule working correctly, but your declared expectations were wrong?  Or the alerting rule itself needs to be changed, so it works in the way you expect?