promql optimization


ishu...@gmail.com

Nov 24, 2021, 5:58:21 AM
to Prometheus Users
Hi Team,

Any suggestion is very much appreciated.

I have an alert that fires when the 95th percentile of API request latency breaches a threshold. The problem is that different API requests have different thresholds, so hard-coding them would mean adding 10 alerts for 10 different APIs, each with its own threshold.

The alerts would look something like this:
es_exporter_percentiles_values_95_0{httppath="path1",app="app1"} > 10
es_exporter_percentiles_values_95_0{httppath="path2",app="app2"} > 5
es_exporter_percentiles_values_95_0{httppath="path3",app="app3"} > 3

I don't want to chain these together with "or", as I would like to do this in a better, smarter way. I tried using recording rules, but that would mean adding recording rules for 20 apps * 4 envs * 5 http paths.

Is there a better way, so that I have only one alert definition that applies to all apps, paths and envs?

Thanks
Eswar


Brian Candler

Nov 24, 2021, 7:26:19 AM
to Prometheus Users
Generate additional static timeseries such as:

es_exporter_percentiles_values_95_0_threshold{httppath="path1",app="app1"} 10
es_exporter_percentiles_values_95_0_threshold{httppath="path2",app="app2"} 5
es_exporter_percentiles_values_95_0_threshold{httppath="path3",app="app3"} 3

Then your alerting rule becomes just:

es_exporter_percentiles_values_95_0 > on(httppath,app) es_exporter_percentiles_values_95_0_threshold

Those static timeseries can be served from a webserver that you scrape, or exposed through the node_exporter textfile collector.  If the same httppath/app pair appears more than once, e.g. in different environments, add extra labels to both the threshold timeseries and the on(...) clause, so that there's always a 1:1 match between value and threshold.

which also shows how you can have a default threshold for those which aren't explicitly set.
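
For the textfile collector route, here is a minimal sketch of a script that writes the threshold timeseries in the Prometheus text exposition format. The file name thresholds.prom, the THRESHOLDS list and the helper names are assumptions made up for illustration; the three values are the ones from this thread. The file would need to land in the directory that node_exporter's --collector.textfile.directory flag points at.

```python
from pathlib import Path

METRIC = "es_exporter_percentiles_values_95_0_threshold"

# Hypothetical threshold table, mirroring the three examples in this thread.
# Add e.g. an "env" label here if thresholds differ per environment.
THRESHOLDS = [
    ({"httppath": "path1", "app": "app1"}, 10),
    ({"httppath": "path2", "app": "app2"}, 5),
    ({"httppath": "path3", "app": "app3"}, 3),
]


def escape_label_value(value: str) -> str:
    """Escape backslash, double quote and newline for the text format."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render(thresholds) -> str:
    """Render one gauge sample per (labels, value) pair."""
    lines = [f"# TYPE {METRIC} gauge"]
    for labels, value in thresholds:
        label_str = ",".join(
            f'{name}="{escape_label_value(val)}"' for name, val in labels.items()
        )
        lines.append(f"{METRIC}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"


def write_atomically(path: Path, text: str) -> None:
    """Write to a temp file, then rename, so a scrape never sees a partial file."""
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(text)
    tmp.replace(path)


if __name__ == "__main__":
    # Assumed output location: adjust to your textfile collector directory.
    write_atomically(Path("thresholds.prom"), render(THRESHOLDS))
```

Note that the textfile collector adds the usual target labels (instance, job) of the node_exporter scrape to these series; because the alerting rule matches with on(httppath,app), those extra labels do not get in the way.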