firing alerts not in alertmanager logs

joshua...@pearson.com

Mar 11, 2018, 10:37:19 AM
to Prometheus Users
Can someone please tell me whether there is any way to get the alertmanager pod's logs to include firing alerts?

I used this trick to cause some alerts to fire:
$ kubectl run crasher --image=rosskukulinski/crashing-app
deployment "crasher" created
$ kubectl run fail --image=rosskukulinski/dne:v1.0.0
deployment "fail" created

amtool of course shows the firing alerts...:
$ amtool alert
Labels                                                                                                                                                                                                                                                                             Annotations                                                                                                                                       Starts At                Ends At                  Generator URL
alertname="DeadMansSwitch" severity="none"                                                                                                                                                                                                              description="This is a DeadMansSwitch meant to ensure that the entire Alerting pipeline is functional." summary="Alerting DeadMansSwitch"         2018-03-10 18:33:37 UTC  0001-01-01 00:00:00 UTC  http://prometheus-k8s-1:9090/graph?g0.expr=vector%281%29&g0.tab=1
alertname="PodFrequentlyRestarting" container="crasher" endpoint="https-main" instance="172.17.0.11:8443" job="kube-state-metrics" namespace="default" pod="crasher-679745dd49-rkzvx" service="kube-state-metrics" severity="warning"                   description="Pod /crasher-679745dd49-rkzvx is was restarted 9.125210869565219 times within the last hour" summary="Pod is restarting frequently"  2018-03-11 13:34:33 UTC  0001-01-01 00:00:00 UTC  http://prometheus-k8s-1:9090/graph?g0.expr=increase%28kube_pod_container_status_restarts_total%5B1h%5D%29+%3E+5&g0.tab=1
alertname="DeploymentReplicasNotUpdated" deployment="crasher" endpoint="https-main" instance="172.17.0.11:8443" job="kube-state-metrics" namespace="default" pod="kube-state-metrics-5799bbb88c-df5cj" service="kube-state-metrics" severity="warning"  description="Replicas are not updated and available for deployment /crasher" summary="Deployment replicas are outdated"                           2018-03-11 13:36:33 UTC  0001-01-01 00:00:00 UTC  http://prometheus-k8s-1:9090/graph?g0.expr=%28%28kube_deployment_status_replicas_updated+%21%3D+kube_deployment_spec_replicas%29+or+%28kube_deployment_status_replicas_available+%21%3D+kube_deployment_spec_replicas%29%29+unless+%28kube_deployment_spec_paused+%3D%3D+1%29&g0.tab=1
alertname="DeploymentReplicasNotUpdated" deployment="fail" endpoint="https-main" instance="172.17.0.11:8443" job="kube-state-metrics" namespace="default" pod="kube-state-metrics-5799bbb88c-df5cj" service="kube-state-metrics" severity="warning"     description="Replicas are not updated and available for deployment /fail" summary="Deployment replicas are outdated"                              2018-03-11 13:37:03 UTC  0001-01-01 00:00:00 UTC  http://prometheus-k8s-1:9090/graph?g0.expr=%28%28kube_deployment_status_replicas_updated+%21%3D+kube_deployment_spec_replicas%29+or+%28kube_deployment_status_replicas_available+%21%3D+kube_deployment_spec_replicas%29%29+unless+%28kube_deployment_spec_paused+%3D%3D+1%29&g0.tab=1

...but this is the only stuff in the logs:
$ kubectl logs  alertmanager-main-0 -n monitoring -c alertmanager
level=info ts=2018-03-11T13:19:37.260119204Z caller=main.go:136 msg="Starting Alertmanager" version="(version=0.14.0, branch=HEAD, revision=30af4d051b37ce817ea7e35b56c57a0e2ec9dbb0)"
level=info ts=2018-03-11T13:19:37.260340376Z caller=main.go:137 build_context="(go=go1.9.2, user=root@37b6a49ebba9, date=20180213-08:16:42)"
level=info ts=2018-03-11T13:19:37.302733894Z caller=main.go:275 msg="Loading configuration file" file=/etc/alertmanager/config/alertmanager.yaml
level=info ts=2018-03-11T13:19:37.332733105Z caller=main.go:350 msg=Listening address=:9093
level=info ts=2018-03-11T13:34:37.261513558Z caller=silence.go:269 component=silences msg="Running maintenance"
level=info ts=2018-03-11T13:34:37.261702547Z caller=nflog.go:293 component=nflog msg="Running maintenance"
level=info ts=2018-03-11T13:34:37.26728836Z caller=silence.go:271 component=silences msg="Maintenance done" duration=5.794022ms size=0
level=info ts=2018-03-11T13:34:37.268134833Z caller=nflog.go:295 component=nflog msg="Maintenance done" duration=6.447975ms size=0

By the way, I'm using kube-prometheus to deploy alertmanager.
Perhaps a more verbose log level than Alertmanager's default (I presume "info") would include firing alerts in the pod logs, and then I could simply specify that level? But unfortunately AlertmanagerSpec doesn't have a field that lets me set the log level.
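(To be clear, I know I can query Alertmanager's API for the currently firing alerts instead of reading the logs. A rough sketch, reusing the pod/namespace names above and assuming curl and jq are available locally:

$ kubectl -n monitoring port-forward alertmanager-main-0 9093:9093 &
$ curl -s http://localhost:9093/api/v1/alerts | jq '.data[] | {alertname: .labels.alertname, startsAt: .startsAt}'

That is the same API amtool uses. What I'm after here is having the alerts show up in the pod logs.)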

Any suggestions would be greatly appreciated. Thank you!

joshua...@pearson.com

Mar 12, 2018, 11:04:11 AM
to Prometheus Users
Note that I used the Kubernetes Helm charts (stable/prometheus) to run an experiment, i.e. setting the alertmanager log.level to debug:
~/tmp/charts/stable
$ git diff prometheus/templates/alertmanager-deployment.yaml
           args:
             - --config.file=/etc/config/alertmanager.yml
             - --storage.path={{ .Values.alertmanager.persistentVolume.mountPath }}
+            - --log.level=debug
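Side note: if the chart version you're using already exposes an alertmanager.extraArgs map in values.yaml (I haven't verified that mine does, so treat this as an assumption), the same flag could be set there instead of patching the template, roughly:

alertmanager:
  extraArgs:
    log.level: debug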

And in my local copy of values.yaml, under prometheus.yml, I specified the DeadMansSwitch alert.
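The rule itself is trivial: roughly the following in the Prometheus 2.x rule format (the expression is the vector(1) visible in the generator URL above; the group name here is arbitrary, and which values.yaml key it belongs under depends on the chart version):

groups:
- name: meta
  rules:
  - alert: DeadMansSwitch
    expr: vector(1)
    labels:
      severity: none
    annotations:
      summary: Alerting DeadMansSwitch
      description: This is a DeadMansSwitch meant to ensure that the entire Alerting pipeline is functional.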

Then (as I suspected/hoped) alertmanager does indeed include firing alerts in the logs:
~/tmp/charts/stable
$ helm install prometheus --namespace monitoring --name monitoring --values ./prometheus/values.yaml
$ kubectl logs monitoring-prometheus-alertmanager-76888b675f-78zs8 -n monitoring -c prometheus-alertmanager
level=info ts=2018-03-12T14:18:44.254314568Z caller=main.go:136 msg="Starting Alertmanager" version="(version=0.14.0, branch=HEAD, revision=30af4d051b37ce817ea7e35b56c57a0e2ec9dbb0)"
level=info ts=2018-03-12T14:18:44.254989852Z caller=main.go:137 build_context="(go=go1.9.2, user=root@37b6a49ebba9, date=20180213-08:16:42)"
level=info ts=2018-03-12T14:18:44.256272062Z caller=main.go:275 msg="Loading configuration file" file=/etc/config/alertmanager.yml
level=info ts=2018-03-12T14:18:44.378824804Z caller=main.go:350 msg=Listening address=:9093
level=debug ts=2018-03-12T14:19:10.322446753Z caller=dispatch.go:188 component=dispatcher msg="Received alert" alert=DeadMansSwitch[f0e0b66][active]
level=debug ts=2018-03-12T14:19:12.324200616Z caller=dispatch.go:429 component=dispatcher aggrGroup="{}:{alertname=\"DeadMansSwitch\"}" msg=Flushing alerts=[DeadMansSwitch[f0e0b66][active]]
level=debug ts=2018-03-12T14:20:10.321705844Z caller=dispatch.go:188 component=dispatcher msg="Received alert" alert=DeadMansSwitch[f0e0b66][active]

But is there any easy way to specify the alertmanager log.level when using kube-prometheus / prometheus-operator? The only way I can see is to build my own prometheus-operator with the following change:
~/tmp/prometheus-operator
$ git diff pkg/alertmanager/statefulset.go
@@ -173,6 +173,7 @@ func makeStatefulSetSpec(a *monitoringv1.Alertmanager, config Config) (*v1beta1.
        amArgs := []string{
                fmt.Sprintf("-config.file=%s", alertmanagerConfFile),
+               fmt.Sprintf("-log.level=debug"),
                fmt.Sprintf("-web.listen-address=:%d", 9093),

Should I create a new prometheus-operator issue to request that functionality be added to AlertmanagerSpec to allow the log.level to be specified?
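What I have in mind is something like the following in the Alertmanager manifest. The logLevel field is hypothetical (it is the field being requested and does not exist yet); the rest roughly mirrors what kube-prometheus generates for alertmanager-main:

apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: main
  namespace: monitoring
spec:
  replicas: 3
  version: v0.14.0
  logLevel: debug   # hypothetical: the field being requested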

Tyler Roscoe

Mar 12, 2018, 11:08:43 AM
to joshua...@pearson.com, Prometheus Users
On Mon, Mar 12, 2018 at 9:04 AM, <joshua...@pearson.com> wrote:
But is there any easy way to specify the alertmanager log.level when using kube-prometheus / prometheus-operator? The only way I can see is to build my own prometheus-operator with the following change:

I encountered the same shortcoming and didn't find a better way to solve it.
 
Should I create a new prometheus-operator issue to request that functionality be added to AlertmanagerSpec to allow the log.level to be specified?

 +1 for this functionality.

joshua...@pearson.com

Mar 22, 2018, 11:38:38 AM
to Prometheus Users
Thank you, Tyler, for confirming my suspicions.

FYI, I created a new prometheus-operator issue: Add log.level to AlertmanagerSpec.