Hi,
I searched this group for a proper solution, but I still can't find one that works.
I am monitoring over 1000 AWS instances,
and I have 3 alert rules for CPU, memory, and disk.
There is a common threshold for all instances, something like:
CPU : 0.7, Memory : 0.7, Disk: 0.7
-> e.g. the common rules (all instances):
- alert: CPU
expr: CPU > 0.7
...
- alert: MEM
expr: MEM > 0.7
...
Now I need to set a different CPU threshold for specific instances, for example:
- alert: CPU
expr: CPU{instance="AAA"} > 0.8
The simplest approach is shown below, but as more instances need their own threshold, the file will get complicated.
# For specific one
- alert: CPU
expr: CPU{instance="AAA"} > 0.8
- alert: CPU
expr: CPU{instance="BBB"} > 0.9
# For common
- alert: CPU
expr: CPU{instance!~"AAA|BBB"} > 0.7
- alert: MEM
expr: MEM > 0.7
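To make the scaling problem concrete, here is a rough sketch of the pattern above written as a small Python generator. The names `build_cpu_rules` and `overrides` are made up for illustration, `CPU` stands in for the real expression as in the examples above, and I assume instance names contain no regex metacharacters. Every new special case adds one more rule and one more alternative in the common rule's exclusion regex:

```python
def build_cpu_rules(overrides, default=0.7):
    """Return one CPU rule per overridden instance plus one common rule.

    `overrides` maps an instance label value to its threshold.
    Hypothetical helper, only meant to illustrate the pattern above.
    Assumes instance names contain no regex metacharacters.
    """
    rules = []
    for instance, threshold in sorted(overrides.items()):
        rules.append({
            "alert": "CPU",
            "expr": f'CPU{{instance="{instance}"}} > {threshold}',
        })
    if overrides:
        # The common rule has to exclude every overridden instance by name.
        excluded = "|".join(sorted(overrides))
        common = f'CPU{{instance!~"{excluded}"}} > {default}'
    else:
        common = f"CPU > {default}"
    rules.append({"alert": "CPU", "expr": common})
    return rules


if __name__ == "__main__":
    for rule in build_cpu_rules({"AAA": 0.8, "BBB": 0.9}):
        print(rule["expr"])
```

With N special instances this gives N + 1 CPU rules, and the same would repeat for memory and disk.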
In this case, isn't there a more efficient way to write rules for both the specific and the common cases?
Thanks in advance for your answers.