Unit testing

Jimmy the Greek

Oct 23, 2020, 5:41:43 PM10/23/20
to Prometheus Users
I have been experimenting with the unit test capabilities provided by promtool and have run into a few issues/gotchas that I can't seem to understand.

example code:

rule_files:
  - ../nodelocal-cache.yaml

evaluation_interval: 1m

tests:
  - interval: 1m
    external_labels:
      cluster: test
    input_series:
    - series: 'coredns_nodecache_setup_errors_total{pod="unit-test", errortype="configmap"}'
      values: '1 2 3 4 5 6 7 8 9 10'
    - series: 'coredns_dns_response_rcode_count_total{job="nodelocal-dns", rcode="SERVFAIL", zone="."}'
      values: '0 60 120 180 240 300 360 420 480 540'
    - series: 'coredns_dns_response_rcode_count_total{job="nodelocal-dns", rcode="NOERROR", zone="."}'
      values: '0 120 240 360 480 600 720 840 960 1080'

    promql_expr_test:
    - expr: rate(coredns_nodecache_setup_errors_total{}[5m])
      eval_time: 5m
      exp_samples:
        - labels: '{pod="unit-test", errortype="configmap"}'
          value: 1.6666666666666666E-02
    - expr: rate(coredns_dns_response_rcode_count_total{}[5m])
      eval_time: 10m
      exp_samples:
        - labels: '{job="nodelocal-dns", rcode="SERVFAIL", zone="."}'
          value: 1
        - labels: '{job="nodelocal-dns", rcode="NOERROR", zone="."}'
          value: 2

    alert_rule_test:
      - eval_time: 6m
        alertname: NodeLocalDNSSetupErrorsHigh
        exp_alerts:
          - exp_labels:
              severity: critical
              alertname: NodeLocalDNSSetupErrorsHigh
              errortype: configmap
              pod: unit-test
            exp_annotations:
              description: test:unit-test There are configmap errors setting up NodeLocalDNS
              summary: NodeLocalDNS setup errors on test:unit-test
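(For reference, the exp_samples values in the promql_expr_test above can be checked by hand. A minimal sketch of the arithmetic, assuming promtool spaces the input_series samples at the test's interval of 1m:)

```python
# coredns_nodecache_setup_errors_total rises by 1 per minute, so over any
# full 5m window rate() returns 1 increase per minute = 1/60 per second.
setup_errors_rate = 1 / 60
print(setup_errors_rate)  # ~1.6667e-02, the value expected at eval_time: 5m

# SERVFAIL rises by 60 per minute and NOERROR by 120 per minute, giving
# per-second rates of 1 and 2 respectively at eval_time: 10m.
servfail_rate = 60 / 60   # 1.0
noerror_rate = 120 / 60   # 2.0
```

(The test file itself is run with `promtool test rules <file>`.)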

----

groups:
- name: NodeLocalDNS
  rules:
  - alert: NodeLocalDNSSetupErrorsHigh
    labels:
      severity: critical
    for: 5m
    expr: |
      rate(coredns_nodecache_setup_errors_total{}[5m]) > 0
    annotations:
      summary: "NodeLocalDNS setup errors on {{ $externalLabels.cluster }}:{{ $labels.pod }}"
      description: "{{ $externalLabels.cluster }}:{{ $labels.pod }} There are {{ $labels.errortype }} errors setting up NodeLocalDNS"


As you can see, I run the PromQL test rate(coredns_nodecache_setup_errors_total{}[5m]), which evaluates to about 1.67e-02. So when I test NodeLocalDNSSetupErrorsHigh, which should fire when that value stays above 0 for a 5-minute period, the test only passes if I set eval_time to 6m, and fails if I set it to 5m (the alert doesn't fire).

What is the relation between the for time in the alert rule itself and the eval_time in the test?

David Leadbeater

Oct 23, 2020, 6:47:14 PM10/23/20
to Jimmy the Greek, Prometheus Users
On Fri, 23 Oct 2020 at 22:41, Jimmy the Greek <matthe...@gmail.com> wrote:
[...]
> As you can see, I run the PromQL test rate(coredns_nodecache_setup_errors_total{}[5m]), which evaluates to about 1.67e-02. So when I test NodeLocalDNSSetupErrorsHigh, which should fire when that value stays above 0 for a 5-minute period, the test only passes if I set eval_time to 6m, and fails if I set it to 5m (the alert doesn't fire).
>
> What is the relation between the for time in the alert rule itself and the eval_time in the test?

In this case "for: 5m" means that the alert expression has to evaluate
true for 5 minutes. Because rate() needs two samples in order to
calculate a rate, your rate() expression starts returning a value at 1m;
the alert is pending from then, so when your rules are evaluated at 6m
the alert starts firing.
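That timing can be sketched as a quick calculation (assuming, as in the test file above, that the series start at t=0 with 1m sample spacing):

```python
# rate() needs two samples, so the expression first returns a value at 1m.
first_rate_result_min = 1
# The rule's "for: 5m" keeps the alert pending for 5 minutes after that.
for_duration_min = 5
first_firing_min = first_rate_result_min + for_duration_min
print(first_firing_min)  # 6 -> the alert only fires with eval_time: 6m
```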

Aside: with the promtool from 2.22.0 it's now possible to look at the
ALERTS time series, including pending alerts where the "for" threshold
hasn't been reached. I wouldn't recommend actually testing the "for"
threshold in rules in most cases (you're then kind of testing
Prometheus rather than your rules), but it is possible to temporarily
add a test for debugging like:

    - expr: ALERTS{alertstate="pending"}
      eval_time: 5m

Its failure output will tell you that your alert is pending at that
point, e.g.:

expr: "ALERTS{alertstate=\"pending\"}", time: 5m,
exp:"nil"
got:"{__name__=\"ALERTS\",
alertname=\"NodeLocalDNSSetupErrorsHigh\", alertstate=\"pending\",
errortype=\"configmap\", pod=\"unit-test\", severity=\"critical\"}
1E+00"

David