Grafana Alertmanager

Julien Pivotto

Sep 14, 2021, 9:35:09 AM
to grafana-team
Dear Grafana developers,

I am trying to keep up with the alertmanager maintainership these days.
A bunch of recent pull requests export functions or add abstraction
layers on top of the alertmanager.

The alertmanager code is tricky, and adding abstraction layers creates
bugs and makes the code less readable.

I think that Grafana should make a choice here: use the
alertmanager, or fork it. If it looks like an alertmanager but
behaves differently, e.g. runs different validations, accepts other
forms of silences, etc., maybe it should not be called alertmanager.

I also think that once Grafana commits to a "prometheus way" of
alerting, it should follow the Prometheus standards, e.g. the format
of the alerts, including which labels are allowed, etc. If Grafana uses
the alertmanager, it should ensure that e.g. the matchers and alert
labels are "prometheus compatible".

I'd also note that if we are missing some feature or are not flexible
enough, alertmanager is still pre-1.0. We can probably still change some
concepts here and there if we have them wrong.

--
Julien Pivotto
@roidelapluie

Josue Abreu

Sep 23, 2021, 10:20:12 AM
to Grafana Developers
Thank you very much for reaching out to us on this, Julien. We've had a closer look and managed to avoid adding any unnecessary abstractions.

Sometimes these are non-obvious, and it takes a bit more effort on our side to avoid straying from the Alertmanager way.

All that's left from the original PR is: https://github.com/grafana/alertmanager/commit/b2bf570cb2349d39b69a66ba8180471a9948fdbd - do you think this is acceptable?

We currently maintain parity with all the main components of the upstream Alertmanager (routing, grouping, deduping, notification dispatching, silencing, and even HA gossiping in Grafana); it's just the API and configuration side that's slightly different. Eventually, we want to get back to running upstream Alertmanager as a whole. Consider this a bit like the Grafana Agent vs Prometheus Agent scenario.

Kind regards,