AlertManager firing duplicate alerts


sunil sagar

Mar 25, 2020, 12:17:59 AM
to Prometheus Users
Hi,

I have a Prometheus environment in HA mode, and Alertmanager is also in HA mode.
I am receiving duplicate alerts.
When I start both Prometheus servers, I get duplicate alerts because each one sets a global external label with a different replica name. Please advise.

Prometheus Config:

Prometheus Node1:
global:
   external_labels:
      replica: 1

alerting:
  alertmanagers:
     - static_configs:
         - targets: 
              - alertmanager1:9093
              - alertmanager2:9093

------------------------------------------------------
Prometheus Node2:
global:
   external_labels:
      replica: 1

alerting:
  alertmanagers:
     - static_configs:
         - targets: 
              - alertmanager1:9093
              - alertmanager2:9093
-------------------------------------------------------------
Sample alerting rule:
expr: max(up == 0 ) by (host)

sunil sagar

Mar 25, 2020, 10:17:07 AM
to Prometheus Users
Hi All,

Correcting one error:
Node 2 is marked as replica 2.

Prometheus Config:

Prometheus Node1:
global:
   external_labels:
      replica: 1

alerting:
  alertmanagers:
     - static_configs:
         - targets: 
              - alertmanager1:9093
              - alertmanager2:9093

------------------------------------------------------
Prometheus Node2:
global:
   external_labels:
      replica: 2

alerting:
  alertmanagers:
     - static_configs:
         - targets: 
              - alertmanager1:9093
              - alertmanager2:9093
-------------------------------------------------------------


Thanks

On 25 Mar 2020, at 12:18 PM, sunil sagar <sunils...@gmail.com> wrote:


--
You received this message because you are subscribed to the Google Groups "Prometheus Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to prometheus-use...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/prometheus-users/c1b788da-59c0-439c-8fc2-d1ded5f0bf46%40googlegroups.com.

Christian Hoffmann

Mar 25, 2020, 5:37:22 PM
to sunil sagar, Prometheus Users
Hi,

you seem to be using external_labels without alert_relabel_configs to
drop this label from your alerts again. Therefore, your alerts will have
different labels and will not be de-duplicated.

See this blog post:
https://www.robustperception.io/high-availability-prometheus-alerting-and-notification

It has an example for the dc label (where you would need replica).

Kind regards,
Christian

Sagar

Apr 3, 2020, 12:00:28 PM
to Christian Hoffmann, Prometheus Users
Hi Christian,

Thank you for helping me with this. For some reason, I am not able to drop these global labels.
I tried the config from the link you shared.

For more information: there are other global labels (such as environment, cluster, etc.) apart from replica.

Do I need to remove the other global labels and keep only replica?

I also tried the option below (from https://github.com/prometheus/prometheus/issues/3239):

alerting:
  alert_relabel_configs:
    - action: drop
      source_labels: [replica]
      regex: '1'
Please advise, thanks.

Is it possible to query Alertmanager alerts like metrics in Prometheus? For example, http://server:9093/metrics or something else.

Thanks, 
Sunil Sagar

Christian Hoffmann

Apr 3, 2020, 3:05:01 PM
to Sagar, Prometheus Users
Hi Sunil,

On 4/3/20 6:00 PM, Sagar wrote:
> For more information , there are more global labels (such as
> environment, cluster , etc) apart from replica .
>
> Do I need to remove other global variables and maintain only one label
> as replica .
You need to drop those labels which are used to distinguish different
Prometheus instances which scrape the same targets.

An example:
Say you've got two Prometheus servers scraping all your database
servers, with external labels cluster=database on both, replica=A on
one and replica=B on the other. You've also got another pair of
Prometheus servers for your web servers, with external labels
cluster=web and, again, replica=A and replica=B. In that case you would
only drop the replica label, not the cluster label.
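
To make the example above concrete, here is a minimal sketch of the external_labels blocks for those four servers. The layout (one prometheus.yml per server, shown here as separate YAML documents) and the comments are illustrative assumptions; only the label names and values come from the example.

```yaml
# prometheus.yml on the first database Prometheus (hypothetical layout)
global:
  external_labels:
    cluster: database   # keep on alerts: says what is being monitored
    replica: A          # differs only between the two HA twins
---
# prometheus.yml on the second database Prometheus
global:
  external_labels:
    cluster: database
    replica: B
---
# prometheus.yml on the first web Prometheus
global:
  external_labels:
    cluster: web
    replica: A
---
# prometheus.yml on the second web Prometheus
global:
  external_labels:
    cluster: web
    replica: B
```

Within each pair, replica is the only label that differs, so it is the only one that has to be removed from alerts for the pair's alerts to carry identical label sets.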

> I also tried with options given below ( from the link
> : https://github.com/prometheus/prometheus/issues/3239 )
>
> alerting:
>   alert_relabel_configs:
>     - action: drop
>       source_labels: [replica]
>       regex: '1'
>
> Please advise , thanks
I don't think you should have regex: 1 here? Just drop the regex line so
that it will default to '(.*)' which matches all values.

If this doesn't help, can you share your actual config + the behaviour
you are seeing?

> Is it possible to query alertmanager alerts like metrics in prometheus ?
> Like http://server:9093/metrics or something else .
Alertmanager does have metrics, but it does not export alerts there.
However, there is an API for reading them programmatically.

https://petstore.swagger.io/?url=https://raw.githubusercontent.com/prometheus/alertmanager/master/api/v2/openapi.yaml#/alert


Kind regards,
Christian

Sagar

Apr 4, 2020, 12:42:38 PM
to Christian Hoffmann, Prometheus Users
Hi Christian, 

Thank you for the helping hand.

My prometheus.yml file looks like this:

global: 
  external_labels:
     dc: europe
     replica: primary

alerting:
  alertmanagers:
    - static_configs:
      - targets:
        - am:9093
     relabel_configs:
     - action: drop
       source_labels: [replica]
       regex: (.*)

rule_files:
  - prom_rules.yml

scrape_configs:

  - job_name: 'prometheus'

.
.
.

For the other server, replica is secondary.
I want to drop only the label in Alertmanager, but this drops the entire alert.

In Alertmanager, I see the labels in boxes like this:

dc=europe        instance=servername          replica=primary
dc=europe        instance=servername          replica=secondary

I want output like the following, to avoid the duplicate entry from the secondary server:
dc=europe        instance=servername

Please suggest.

Thanks 

Christian Hoffmann

Apr 4, 2020, 12:58:11 PM
to Sagar, Prometheus Users
On 4/4/20 6:42 PM, Sagar wrote:
>      relabel_configs:
>      - action: drop
>        source_labels: [replica]
>        regex: (.*)
[...]
>
> For other server, replica is secondary 
> I want to drop only label in alert manager , but it drops entire alert
> in alert_manager. 

Ah, now I see the problem. You are using action: drop -- this means
exactly that: dropping the whole alert.
You want to use action: labeldrop instead.

Kind regards,
Christian
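
For readers landing on this thread, here is a minimal sketch of how the fix might look in the prometheus.yml posted earlier. It assumes the relabel rule sits under alerting as alert_relabel_configs (the key suggested earlier in the thread) and keeps the am:9093 target and the dc/replica labels from the posted config; rule files and scrape configs are omitted.

```yaml
global:
  external_labels:
    dc: europe
    replica: primary   # "secondary" on the other server

alerting:
  # Applied to alerts just before they are sent to Alertmanager.
  alert_relabel_configs:
    - action: labeldrop   # removes matching labels, keeps the alert itself
      regex: replica      # for labeldrop, matched against label names
  alertmanagers:
    - static_configs:
        - targets:
            - am:9093
```

With labeldrop, regex is matched against label names rather than values, so only replica is removed and dc stays on the alert. Both servers then send alerts with identical label sets, which lets Alertmanager de-duplicate them.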

Sagar

Apr 4, 2020, 1:30:36 PM
to Christian Hoffmann, Prometheus Users
Thank you so much for the pointer Christian. 
It resolved the issue. 

Thanks 
Sunil Sagar