Thanks for the quick response.
I just checked and prom is using the correct yml. I had missed that I actually changed the alertmanager target to localhost:9093, which is why my error says localhost:9093 and not 127.0.0.1:9093.
Furthermore, I don't know why, but I just restarted prom.exe and alertmanager.exe and it's working fine.
Another question I had was how to configure the webhook to Teams correctly. The commented block at the end of my alertmanager.yml is now the following:
route:
  receiver: 'email'
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  group_by: [cluster, alertname]
  routes:
    - receiver: email
      group_interval: 1m
      group_wait: 10s
      repeat_interval: 1m
      matchers:
        - severity="critical"
    - receiver: email
      group_interval: 1m
      group_wait: 10s
      repeat_interval: 1m
      matchers:
        - severity="warning"
    - receiver: teams
      group_interval: 1m
      group_wait: 10s
      repeat_interval: 1m
      matchers:
        - severity="critical"
receivers:
  - name: 'email'
    email_configs:
      - to: 'm...@company.com'
        send_resolved: true
  - name: 'teams'
    webhook_configs:
      - url: 'https://company.webhook.office.com/webhookb2/XXX'
        send_resolved: true
error in prom console:
ts=2023-03-07T07:42:29.984Z caller=notifier.go:532 level=error component=notifier alertmanager=http://localhost:9093/api/v2/alerts count=25 msg="Error sending alert" err="Post \"http://localhost:9093/api/v2/alerts\": dial tcp [::1]:9093: connectex: No connection could be made because the target machine actively refused it."
and my alertmanager console says:
ts=2023-03-07T08:10:21.942Z caller=dispatch.go:352 level=error component=dispatcher msg="Notify for alerts failed" num_alerts=150 err="teams/webhook[0]: notify retry canceled due to unrecoverable error after 1 attempts: unexpected status code 400: https://company.webhook.office.com/webhookb2/XXX: Summary or Text is required."
I will attach my tmpl.
I copied prometheus-msteams to my machine and run it via:
.\prometheus-msteams-windows-amd64.exe -http-addr "localhost:2000" -teams-incoming-webhook-url "https://company.webhook.office.com/webhookb2/XXX"
What I already checked:
- ports used (running netstat -ao) -> everything looks fine, ports aren't blocked and only the prom services are running on those ports
- firewall rules (non blocking)
Now I'm kind of stuck, because I really want the Teams integration, but running prom on Windows Server (without Docker) doesn't seem to be very common.
And one final question regarding this Google group: when I have a completely different question, should I create a new conversation or just ask in an open one like mine here? (E.g. for the relabeling part, because I actually don't know anything about regex and so on.)
Another question I had was how to configure the webhook to teams correctly.
receivers:
  ...
  - name: 'teams'
    webhook_configs:
      - url: 'https://company.webhook.office.com/webhookb2/XXX'
        send_resolved: true
This means you are attempting to send the alertmanager webhook message directly to Microsoft (*.office.com). You can't do this. You need to send it to your prometheus-msteams process instead, e.g. localhost:2000, and *that* will send a message to Microsoft.
There is a valid example here: https://github.com/prometheus-msteams/prometheus-msteams#static-uri-handler-eg-alertmanager
receivers:
  - name: 'prometheus-msteams'
    webhook_configs:
      - send_resolved: true
        url: 'http://localhost:2000/alertmanager' # the prometheus-msteams proxy
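Applied to your own alertmanager.yml, a minimal sketch might look like the following. It assumes prometheus-msteams is still listening on localhost:2000 as in your command line and serves the default /alertmanager path from the linked README. The per-route timing settings are left out only for brevity. The `continue: true` line is my addition and not something from your file: Alertmanager stops at the first matching sibling route unless `continue` is set, so without it critical alerts would be taken by the first email route and would likely never reach the teams route.

```yaml
route:
  receiver: 'email'
  group_by: [cluster, alertname]
  routes:
    - receiver: email
      matchers:
        - severity="critical"
      continue: true   # assumption: keep evaluating sibling routes so 'teams' also gets critical alerts
    - receiver: email
      matchers:
        - severity="warning"
    - receiver: teams
      matchers:
        - severity="critical"
receivers:
  - name: 'email'
    email_configs:
      - to: 'm...@company.com'
        send_resolved: true
  - name: 'teams'
    webhook_configs:
      # point at the local prometheus-msteams proxy, not at *.office.com
      - url: 'http://localhost:2000/alertmanager'
        send_resolved: true
```

The office.com webhook URL then only appears on the prometheus-msteams command line, which is what you already have.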