Prometheus AlertManager integration with ServiceNow.

Zhang Zhao

May 15, 2020, 3:55:41 AM
to Prometheus Users
Hi, I have a question about integrating with ServiceNow for ticketing via webhook. The alertmanager.yaml is straightforward, as shown below. In the event captured on the ServiceNow side (also below), a single event contained multiple types of alerts. What should I modify to split the alerts apart? What I need is for one event to correspond to one alert. Thanks.


global:
  resolve_timeout: 5m
receivers:
- name: prometheus-snow
  webhook_configs:
  - url: "https://abcde"
    http_config:
      basic_auth:
        username: "id"
        password: "passwd"
route:
  receiver: prometheus-snow


______________________________________________________________________


{
   "receiver":"prometheus-snow",
   "status":"firing",
   "alerts":[
      {
         "status":"firing",
         "labels":{
            "alertname":"Availability",
            "endpoint":"http-metrics",
            "instance":"",
            "job":"kube-etcd",
            "namespace":"kube-system",
            "pod":"etcd-minikube",
            "prometheus":"monitoring/demo-prometheus-operator-prometheus",
            "service":"demo-prometheus-operator-kube-etcd",
            "severity":"critical"
         },
         "annotations":{
            "message":"The service is not available."
         },
         "startsAt":"2020-05-13T20:02:44.237Z",
         "endsAt":"0001-01-01T00:00:00Z",
         "generatorURL":"http://demo-prometheus-operator-prometheus.monitoring:9090/graph?g0.expr=up+%3D%3D+0&g0.tab=1",
         "fingerprint":"abaaa1ad00692f32"
      },
      {
         "status":"firing",
         "labels":{
            "alertname":"etcdInsufficientMembers",
            "job":"kube-etcd",
            "prometheus":"monitoring/demo-prometheus-operator-prometheus",
            "severity":"critical"
         },
         "annotations":{
            "message":"etcd cluster \"kube-etcd\": insufficient members (0)."
         },
         "startsAt":"2020-05-15T07:12:05.291Z",
         "endsAt":"0001-01-01T00:00:00Z",
         "generatorURL":"http://demo-prometheus-operator-prometheus.monitoring:9090/graph?g0.expr=sum+by%28job%29+%28up%7Bjob%3D~%22.%2Aetcd.%2A%22%7D+%3D%3D+bool+1%29+%3C+%28%28count+by%28job%29+%28up%7Bjob%3D~%22.%2Aetcd.%2A%22%7D%29+%2B+1%29+%2F+2%29&g0.tab=1",
         "fingerprint":"b62c0895f98bd49a"
      },
      {
         "status":"firing",
         "labels":{
            "alertname":"KubeCPUOvercommit",
            "prometheus":"monitoring/demo-prometheus-operator-prometheus",
            "severity":"warning"
         },
         "annotations":{
            "message":"Cluster has overcommitted CPU resource requests for Pods and cannot tolerate node failure.",
            "runbook_url":"https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubecpuovercommit"
         },
         "startsAt":"2020-05-15T07:14:20.872Z",
         "endsAt":"0001-01-01T00:00:00Z",
         "generatorURL":"http://demo-prometheus-operator-prometheus.monitoring:9090/graph?g0.expr=sum%28namespace%3Akube_pod_container_resource_requests_cpu_cores%3Asum%29+%2F+sum%28kube_node_status_allocatable_cpu_cores%29+%3E+%28count%28kube_node_status_allocatable_cpu_cores%29+-+1%29+%2F+count%28kube_node_status_allocatable_cpu_cores%29&g0.tab=1",
         "fingerprint":"93f930c75b3f06ee"
      },
      {
         "status":"firing",
         "labels":{
            "alertname":"KubeMemoryOvercommit",
            "prometheus":"monitoring/demo-prometheus-operator-prometheus",
            "severity":"warning"
         },
         "annotations":{
            "message":"Cluster has overcommitted memory resource requests for Pods and cannot tolerate node failure.",
            "runbook_url":"https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubememoryovercommit"
         },
         "startsAt":"2020-05-15T07:14:20.872Z",
         "endsAt":"0001-01-01T00:00:00Z",
         "generatorURL":"http://demo-prometheus-operator-prometheus.monitoring:9090/graph?g0.expr=sum%28namespace%3Akube_pod_container_resource_requests_memory_bytes%3Asum%29+%2F+sum%28kube_node_status_allocatable_memory_bytes%29+%3E+%28count%28kube_node_status_allocatable_memory_bytes%29+-+1%29+%2F+count%28kube_node_status_allocatable_memory_bytes%29&g0.tab=1",
         "fingerprint":"e94552875fd8b6ab"
      },
      {
         "status":"firing",
         "labels":{
            "alertname":"Watchdog",
            "prometheus":"monitoring/demo-prometheus-operator-prometheus",
            "severity":"none"
         },
         "annotations":{
            "message":"This is an alert meant to ensure that the entire alerting pipeline is functional.\nThis alert is always firing, therefore it should always be firing in Alertmanager\nand always fire against a receiver. There are integrations with various notification\nmechanisms that send a notification when this alert is not firing. For example the\n\"DeadMansSnitch\" integration in PagerDuty.\n"
         },
         "startsAt":"2020-05-12T03:48:10.933Z",
         "endsAt":"0001-01-01T00:00:00Z",
         "generatorURL":"http://demo-prometheus-operator-prometheus.monitoring:9090/graph?g0.expr=vector%281%29&g0.tab=1",
         "fingerprint":"e2cea8350b46b7df"
      }
   ],
   "groupLabels":"",
   "commonLabels":"",
   "commonAnnotations":"",
   "externalURL":"http://demo-prometheus-operator-alertmanager.monitoring:9093",
   "version":"4",
   "groupKey":"{}:{}",
   "commonLabels_prometheus":"monitoring/demo-prometheus-operator-prometheus"
}

Brian Candler

May 15, 2020, 4:35:23 AM
to Prometheus Users
Have you tried:

route:
  group_by ['...']
  receiver: prometheus-snow

(literally the string "...").  The doc says:

# To aggregate by all possible labels use the special value '...' as the sole label name, for example:
# group_by: ['...'] 
# This effectively disables aggregation entirely, passing through all 
# alerts as-is. This is unlikely to be what you want, unless you have 
# a very low alert volume or your upstream notification system performs 
# its own grouping.

However, it doesn't say what happens if you omit group_by entirely.

Brian Candler

May 15, 2020, 4:38:03 AM
to Prometheus Users
Oops, missed the colon.
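
For completeness, the route section with that fix applied would presumably look something like this (keeping the receiver name from your original alertmanager.yaml; note that the special '...' value is only supported in newer Alertmanager releases):

route:
  receiver: prometheus-snow
  group_by: ['...']

Since every label is then used for grouping, each alert ends up in its own group, so each one should arrive as a separate webhook POST and therefore produce its own ServiceNow event.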