{"message":"Not Found"} - mail Alertmanager (and prometheus-msteams)

Kolja Krückmann

Mar 6, 2023, 7:33:21 AM
to Prometheus Users
Hi y'all

I'm currently trying to get my alerting to work via email.

When running my prom.exe I get the following error quite often:

ts=2023-03-06T10:32:02.804Z caller=notifier.go:532 level=error component=notifier alertmanager=http://localhost:9093/api/v2/alerts count=30 msg="Error sending alert" err="Post \"http://localhost:9093/api/v2/alerts\": dial tcp [::1]:9093: connectex: No connection could be made because the target machine actively refused it."

My prom.yml is:

# my global config
global:
  scrape_interval: 1m # Scrape targets every minute (this is also the default).
  evaluation_interval: 30s # Evaluate rules every 30 seconds. The default is every 1 minute.
  scrape_timeout: 30s # Overrides the global default of 10s.

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
           - 127.0.0.1:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
    - rules.yml

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'node'
    file_sd_configs:
      - files:
        - C:/Prometheus/prometheus-2.41.0.windows-amd64/target_cluster_b.yml


My alertmanager.yml is:

global:
  resolve_timeout: 1m
  smtp_smarthost: 'smtp.ionos.de:587'
  smtp_from: 'internal-mail1'
  smtp_auth_username: ' internal-mail1'
  smtp_auth_password: 'password'
  smtp_require_tls: true

templates:
  - 'C:\Prometheus\templates\default-message-card.tmpl'
   
route:
  receiver: 'email'
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  group_by: [cluster, alertname]
 
  routes:
  - receiver: email
    group_interval: 1m
    group_wait: 10s
    repeat_interval: 1m
    matchers:
    - severity="critical"
 
  - receiver: email
    group_interval: 1m
    group_wait: 10s
    repeat_interval: 1m
    matchers:
    - severity="warning"    


 
receivers:
- name: 'email'
  email_configs:
  - to: 'internal-mail2'
    send_resolved: true


#- name: 'alert_channel'
#  webhook_configs:
#  - url: 'http://127.0.0.1:2000/alertmanager'
#    send_resolved: true
#
#  
#- name: 'teams'
#  webhook_configs:
#    - url: 'my-teams-webhook'
#      send_resolved: true


Can someone help me find my mistake?

If I missed something please let me know :)

Kind regards - Kolja

Brian Candler

Mar 6, 2023, 10:35:12 AM
to Prometheus Users
On Monday, 6 March 2023 at 12:33:21 UTC Kolja Krückmann wrote:
> # Alertmanager configuration
> alerting:
>   alertmanagers:
>     - static_configs:
>         - targets:
>            - 127.0.0.1:9093

The error message that you've shown suggests otherwise:

* It seems Prometheus is actually using "localhost:9093", not "127.0.0.1:9093", as the alertmanager target
* "localhost" is being resolved to ::1 (the IPv6 loopback address)
* alertmanager is not accepting connections on IPv6

What flags are you running alertmanager with? If you've set --web.listen-address=127.0.0.1:9093 then that would explain it.

You should double-check what actual config file prometheus is running with, because I believe it's not actually using 127.0.0.1:9093 to talk to alertmanager.  That is: it may be using a different config file than the one you're looking at.
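
To check which addresses a target name maps to on a given machine, a minimal Python sketch like the following may help. It only uses the standard library; the helper name `resolve` is made up for illustration, and the port 9093 is taken from the config above. What "localhost" returns depends on the local resolver and hosts file, so no particular output is assumed for it.

```python
import socket


def resolve(host: str, port: int) -> list[tuple[str, str]]:
    """Return the distinct (family, address) pairs a TCP client could try."""
    names = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    seen = []
    for family, _type, _proto, _canon, sockaddr in infos:
        pair = (names.get(family, str(family)), sockaddr[0])
        if pair not in seen:
            seen.append(pair)
    return seen


if __name__ == "__main__":
    # "localhost" may yield ::1, 127.0.0.1, or both; the literal yields IPv4 only.
    for host in ("localhost", "127.0.0.1"):
        print(host, "->", resolve(host, 9093))
```

If "localhost" lists an IPv6 entry while Alertmanager only listens on 127.0.0.1, that would be consistent with the "dial tcp [::1]:9093" refusal in the log.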

Kolja Krückmann

Mar 7, 2023, 3:48:37 AM
to Prometheus Users

Thanks for the quick response.

I just checked, and prom is using the correct yml. I had just missed that I actually changed the alertmanager target to localhost:9093, which is why my error says localhost:9093 and not 127.0.0.1:9093.

 

Furthermore, I actually don't know why, but I just restarted prom.exe and alertmanager.exe and it's working fine.

 

 

Another question I had was how to configure the webhook to Teams correctly. The commented-out block at the end of my alertmanager.yml now looks like this:

 

route:
  receiver: 'email'
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  group_by: [cluster, alertname]
 
  routes:
  - receiver: email
    group_interval: 1m
    group_wait: 10s
    repeat_interval: 1m
    matchers:
    - severity="critical"
 
  - receiver: email
    group_interval: 1m
    group_wait: 10s
    repeat_interval: 1m
    matchers:
    - severity="warning"    
 

  - receiver: teams
    group_interval: 1m
    group_wait: 10s
    repeat_interval: 1m
    matchers:
    - severity="critical"


receivers:
- name: 'email'
  email_configs:
  - to: 'm...@company.com'
    send_resolved: true

 
- name: 'teams'
  webhook_configs:
    - url: 'https://company.webhook.office.com/webhookb2/XXX'
      send_resolved: true
      


Error in the prom console:

ts=2023-03-07T07:42:29.984Z caller=notifier.go:532 level=error component=notifier alertmanager=http://localhost:9093/api/v2/alerts count=25 msg="Error sending alert" err="Post \"http://localhost:9093/api/v2/alerts\": dial tcp [::1]:9093: connectex: No connection could be made because the target machine actively refused it."

 

and my alertmanager console says:

ts=2023-03-07T08:10:21.942Z caller=dispatch.go:352 level=error component=dispatcher msg="Notify for alerts failed" num_alerts=150 err="teams/webhook[0]: notify retry canceled due to unrecoverable error after 1 attempts: unexpected status code 400: https://company.webhook.office.com/webhookb2/XXX: Summary or Text is required."

 

I will attach my tmpl.

 

I copied prometheus-msteams to my machine and run it via:

.\prometheus-msteams-windows-amd64.exe -http-addr "localhost:2000" -teams-incoming-webhook-url "https://company.webhook.office.com/webhookb2/XXX"

 

What I already checked:

- ports used (running netstat -ao) -> everything looks fine, ports aren't blocked and only the prom services are running on those ports

- firewall rules (none blocking)

 

Now I'm kind of stuck: I really want the Teams integration, but running prom on Windows Server (without Docker) doesn't seem to be very common.

 

 

And one final question regarding the use of this Google group:
When I have a completely different question, shall I create a new conversation or just ask in an open one like mine here? (e.g. for the relabeling part, because I actually don't know anything about regex and so on)

Brian Candler

Mar 7, 2023, 5:29:31 AM
to Prometheus Users
On Tuesday, 7 March 2023 at 08:48:37 UTC Kolja Krückmann wrote:

> I just checked, and prom is using the correct yml. I had just missed that I actually changed the alertmanager target to localhost:9093, which is why my error says localhost:9093 and not 127.0.0.1:9093.
>
> Furthermore, I actually don't know why, but I just restarted prom.exe and alertmanager.exe and it's working fine.

localhost:9093 and 127.0.0.1:9093 are different

The first connects to any address which maps to "localhost" in your hosts file, and this includes ::1 (IPv6)

127.0.0.1 *only* connects to 127.0.0.1 (IPv4)
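
A rough way to see which of those addresses actually accepts connections is to try a TCP connect to each one. The sketch below assumes nothing beyond the Python standard library; `probe` is a hypothetical helper name, and the one-second timeout is an arbitrary choice.

```python
import socket


def probe(host: str, port: int, timeout: float = 1.0) -> dict[str, bool]:
    """Try a TCP connect to every address `host` resolves to.

    Returns a mapping of address -> True if the connect succeeded,
    False if it was refused, timed out, or the family is unsupported.
    """
    results = {}
    for family, socktype, proto, _canon, sockaddr in socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    ):
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            results[sockaddr[0]] = True
        except OSError:
            results[sockaddr[0]] = False
    return results
```

Calling `probe("localhost", 9093)` while Alertmanager is running would show, per address, whether the listener is reachable; a result where only the IPv4 entry is `True` would match the situation described here.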
 

> Another question I had was how to configure the webhook to Teams correctly.

You can't simply point a webhook to Teams.  The Alertmanager webhook sends a fixed format JSON payload, which is not in the format that Teams expects (as the error says).

You'll need some middleware to convert it into something that Teams will understand, for example:

Google for "prometheus alertmanager teams" for more info.
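
To illustrate what such middleware has to do, here is a minimal Python sketch that turns an Alertmanager webhook payload into a card carrying the `summary` and `text` fields that the Teams error asks for. This is not how prometheus-msteams is implemented; the function name `to_message_card`, the card layout, and the `@context` value are assumptions for illustration. The input field names (`status`, `commonLabels`, `alerts`, `labels`, `annotations`) follow the Alertmanager webhook format.

```python
import json


def to_message_card(payload: dict) -> dict:
    """Convert an Alertmanager webhook payload into a minimal Teams card."""
    alerts = payload.get("alerts", [])
    name = payload.get("commonLabels", {}).get("alertname", "Prometheus alert")
    status = payload.get("status", "unknown")
    lines = []
    for alert in alerts:
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        detail = annotations.get("summary") or annotations.get("description") or ""
        lines.append(
            f"- [{alert.get('status', status)}] {labels.get('instance', '?')}: {detail}"
        )
    return {
        "@type": "MessageCard",
        "@context": "http://schema.org/extensions",  # assumed legacy card context
        "summary": f"{name} ({status}, {len(alerts)} alert(s))",
        "text": "\n".join(lines) or name,
    }


if __name__ == "__main__":
    sample = {
        "status": "firing",
        "commonLabels": {"alertname": "HighLoad"},
        "alerts": [
            {
                "status": "firing",
                "labels": {"instance": "host1:9100"},
                "annotations": {"summary": "Load is high"},
            }
        ],
    }
    print(json.dumps(to_message_card(sample), indent=2))
```

The raw Alertmanager payload has neither a top-level `summary` nor a top-level `text`, which is why posting it to Teams unchanged is rejected.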

Kolja Krückmann

Mar 7, 2023, 6:57:11 AM
to Prometheus Users
I am trying to use prometheus-msteams as middleware.
There I am using the .tmpl I attached earlier. The error says that a summary or text is missing, but my tmpl should have that.

Any suggestions?

Brian Candler

Mar 7, 2023, 7:22:44 AM
to Prometheus Users
This one? https://github.com/prometheus-msteams/prometheus-msteams

As your problem is with configuring that piece of software, then you're probably best off asking your question there.

ISTM you'd be best off starting with their supplied default-message-card.tmpl, and get it working with that, before modifying it.

> And one final question regarding the use of this Google group:
> When I have a completely different question, shall I create a new conversation or just ask in an open one like mine here? (e.g. for the relabeling part, because I actually don't know anything about regex and so on)

Separate threads for each topic, please, with a suitable Subject: heading for each one.  It makes them much easier to track and search.

Brian Candler

Mar 7, 2023, 7:43:33 AM
to Prometheus Users
You showed that you are using the following config:

receivers:
...


- name: 'teams'
  webhook_configs:
    - url: 'https://company.webhook.office.com/webhookb2/XXX'
      send_resolved: true

This means you are attempting to send the alertmanager webhook message directly to Microsoft (*.office.com).  You can't do this.  You need to send it to your prometheus-msteams process instead, e.g. localhost:2000, and *that* will send a message to Microsoft.

There is a valid example here: https://github.com/prometheus-msteams/prometheus-msteams#static-uri-handler-eg-alertmanager

receivers:
- name: 'prometheus-msteams'
  webhook_configs:
  - send_resolved: true
    url: 'http://localhost:2000/alertmanager' # the prometheus-msteams proxy
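
To test the proxy on its own, without waiting for a real alert, one option is to post a hand-made payload in the Alertmanager webhook format to it. The sketch below is only an assumption-laden example: the alert name, labels, and timestamps are invented test data, the helper names `build_test_request` and `send` are hypothetical, and the URL is the one from the example above. Whether the proxy accepts exactly this payload depends on its version and template.

```python
import json
import urllib.request


def build_test_request(url: str) -> urllib.request.Request:
    """Build a POST request carrying a made-up Alertmanager-style payload."""
    payload = {
        "version": "4",
        "status": "firing",
        "receiver": "prometheus-msteams",
        "groupLabels": {"alertname": "TestAlert"},
        "commonLabels": {"alertname": "TestAlert", "severity": "critical"},
        "commonAnnotations": {"summary": "Test alert sent by hand"},
        "externalURL": "http://localhost:9093",
        "alerts": [
            {
                "status": "firing",
                "labels": {"alertname": "TestAlert", "severity": "critical"},
                "annotations": {"summary": "Test alert sent by hand"},
                "startsAt": "2023-03-07T08:00:00Z",
                "endsAt": "0001-01-01T00:00:00Z",
                "generatorURL": "http://localhost:9090",
            }
        ],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 5.0) -> int:
    """Send the request and return the HTTP status code of the response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

With prometheus-msteams running, `send(build_test_request("http://localhost:2000/alertmanager"))` should show whether the proxy and the Teams webhook work independently of Prometheus and Alertmanager.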
