What does count by job rule mean?

radhamani...@gmail.com

Aug 24, 2020, 7:02:11 PM
to Prometheus Users
The TargetDown alert rule contains a PromQL query, and I have a few questions about it.

What do job and service mean in this rule?
Why is there (up) in count by(job, namespace, service) (up)? I understand what up == 0 means, but I am not sure what plain (up) means in the denominator.

If I want to customize this query, how can I do that? Is there a document reference for it? I am looking for a reference for Prometheus Operator alert rules.

Brian Candler

Aug 25, 2020, 5:18:45 AM
to Prometheus Users
"job" and "service" are labels.

"up" is a metric generated by Prometheus itself for every target it scrapes, with value 1 for a successful scrape or 0 for a failed scrape.

Inside this query, "up" and "up == 0" are both PromQL (sub)expressions. They both return an "instant vector": that is, a collection of timeseries at a given point in time. Both are filters which return a subset of the universe of timeseries which are available. "up" returns all timeseries with metric name "up". "up == 0" returns all timeseries with metric name "up" which also have a value of 0 (that is, it filters out all timeseries where the value is not 0).

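count by (labels) (metric) is an aggregation expression in Prometheus. It generates a vector of counts, one for each unique combination of the given labels. So count by (job, namespace, service) (up) is the total number of targets in each group, and the same count over "up == 0" is the number of those targets which are down.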
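To make the grouping concrete, here is a rough Python sketch that mimics what the aggregation does. It is not how Prometheus is implemented; the sample targets, label values and the helper name count_by are invented for illustration, and the percentage calculation is only an assumption about the shape of the rule being discussed.

```python
from collections import Counter

# Hypothetical `up` samples: (labels, value), where 1 = scrape succeeded, 0 = scrape failed.
up = [
    ({"job": "node", "namespace": "monitoring", "service": "node-exporter", "instance": "10.0.0.1:9100"}, 1),
    ({"job": "node", "namespace": "monitoring", "service": "node-exporter", "instance": "10.0.0.2:9100"}, 0),
    ({"job": "node", "namespace": "monitoring", "service": "node-exporter", "instance": "10.0.0.3:9100"}, 1),
    ({"job": "api", "namespace": "default", "service": "api", "instance": "10.0.1.1:8080"}, 1),
    ({"job": "api", "namespace": "default", "service": "api", "instance": "10.0.1.2:8080"}, 1),
]

GROUP = ("job", "namespace", "service")


def count_by(series, label_names):
    """Mimic `count by (label_names) (vector)`: count series per label combination."""
    counts = Counter()
    for labels, _value in series:
        # A series missing one of the labels is grouped under an empty value.
        key = tuple(labels.get(name, "") for name in label_names)
        counts[key] += 1
    return dict(counts)


# Like `count by (job, namespace, service) (up)`: all targets per group.
total = count_by(up, GROUP)

# Like `count by (job, namespace, service) (up == 0)`: only the failed targets.
down = count_by([s for s in up if s[1] == 0], GROUP)

# Dividing the two vectors keeps only groups present on both sides,
# so groups with no failed targets drop out of the result.
percent_down = {key: 100 * down[key] / total[key] for key in down}

print(total)
print(down)
print(percent_down)
```

With this sample data the "node" group has three targets and one of them is down, while the "api" group has nothing down and so does not appear in the final result at all.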

For customising this, you need to learn PromQL. Start with the "concepts" pages in the official Prometheus documentation.

Then there are lots of good resources on the Internet; use the ones which make most sense to you.


You can enter these expressions in the PromQL expression browser in the Prometheus web interface (by default port 9090) to test them.  If you are writing alerting rules, then any PromQL expression which returns any non-empty instant vector will generate an alert.
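
The "non-empty vector means alert" behaviour can be sketched in Python as well. This is only a simulation under assumptions: the input values, the thresholds and the helper names apply_threshold and alerts_for are made up, and it ignores the optional "for" duration of a rule, which keeps an alert pending for a while before it fires.

```python
# Hypothetical result of an alert expression before its final comparison:
# one value per (job, namespace, service) combination.
percent_down = {
    ("node", "monitoring", "node-exporter"): 33.3,
    ("api", "default", "api"): 0.0,
}

LABEL_NAMES = ("job", "namespace", "service")


def apply_threshold(vector, threshold):
    """Mimic `vector > threshold`: keep only elements whose value exceeds it."""
    return {labels: value for labels, value in vector.items() if value > threshold}


def alerts_for(vector):
    """Each element left in the result becomes one alert carrying its labels."""
    return [dict(zip(LABEL_NAMES, labels)) for labels in vector]


# Threshold of 10 (hypothetical): one element survives, so one alert.
firing = alerts_for(apply_threshold(percent_down, 10))

# Threshold of 50 (hypothetical): the result is empty, so no alert.
quiet = alerts_for(apply_threshold(percent_down, 50))

print(firing)
print(quiet)
```

The comparison at the end of an alert expression is therefore a filter, not a true/false test: whatever elements survive it are what generate alerts.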

Good luck!