Best practice for storing hundreds of alert rules


jthunder

Oct 22, 2021, 3:31:49 AM
to Prometheus Users
Hi guys,

We use Prometheus to monitor K8s clusters and the applications running on them. We have created 500-800 alert rules to be applied to Prometheus, but we can't find any documentation on how to organize them.

Currently we have 3 options:
1. One file per rule; or
2. One file including all rules; or
3. Spread all rules to 5-8 files based on clusters or severity;

In addition, we need to modify these rules and trigger Prometheus to reload them.

Any suggestions? 

Thanks

Evelyn Pereira Souza

Oct 22, 2021, 6:27:04 AM
to Prometheus Users
On 22.10.21 09:31, jthunder wrote:
> Hi guys,
>
> We use Prometheus to monitor K8s clusters and the applications running
> on them. We have created 500-800 alert rules to be applied to Prometheus,
> but we can't find any documentation on how to organize them.

Why do you need 800 alert rules? Could you write the rules more generically?

kind regards
Evelyn

Brian Candler

Oct 22, 2021, 10:47:51 AM
to Prometheus Users
On Friday, 22 October 2021 at 08:31:49 UTC+1 jthunder wrote:
> We use Prometheus to monitor K8s clusters and the applications running on them. We have created 500-800 alert rules to be applied to Prometheus, but we can't find any documentation on how to organize them.
>
> Currently we have 3 options:
> 1. One file per rule; or
> 2. One file including all rules; or
> 3. Spread all rules to 5-8 files based on clusters or severity;

It's entirely up to you how to organize them.  If they are being generated automatically, then you might as well put them in one file.  If you are maintaining them manually, then whatever makes sense from an organizational point of view: by host, by service, by responsible team etc.

> In addition, we need to modify these rules and trigger Prometheus to reload them.
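
For a manually maintained setup, a minimal sketch of a per-team layout could look like the following. The directory, file name, job name and alert are made up for illustration; only `rule_files`, `groups` and the rule fields are actual Prometheus configuration keys.

```yaml
# prometheus.yml (fragment). The rules directory is a hypothetical path.
# A glob picks up every rule file, so adding a file needs no config change.
rule_files:
  - /etc/prometheus/rules/*.yml
---
# /etc/prometheus/rules/team-payments.yml (hypothetical file, one per team)
groups:
  - name: team-payments
    rules:
      - alert: PaymentsInstanceDown        # illustrative alert name
        expr: up{job="payments-api"} == 0  # assumes a job called payments-api
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
```

After editing a file you can validate it with `promtool check rules <file>`. Prometheus re-reads its configuration and rule files when it receives SIGHUP, or on an HTTP POST to `/-/reload` if it was started with `--web.enable-lifecycle`.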

Ben Kochie

Oct 22, 2021, 11:52:30 AM
to jthunder, Prometheus Users
The best option is to use the Prometheus Operator "PrometheusRule" object. This will allow you to directly push rules to your clusters where they need to be.
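
As a rough sketch, assuming the Prometheus Operator is installed, a rule group wrapped in a `PrometheusRule` object might look like this. The name, namespace, label and alert are placeholders; the label has to match whatever `ruleSelector` your `Prometheus` resource is configured with.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: payments-alerts        # hypothetical name
  namespace: monitoring        # hypothetical namespace
  labels:
    role: alert-rules          # placeholder; must match the Prometheus ruleSelector
spec:
  groups:
    - name: team-payments
      rules:
        - alert: PaymentsInstanceDown        # illustrative alert name
          expr: up{job="payments-api"} == 0  # assumes a job called payments-api
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "Instance {{ $labels.instance }} is down"
```

The operator turns matching objects into rule files and reloads Prometheus for you, so `kubectl apply` covers both the update and the reload.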