CPU Usage


Sandesh Shivakumar

Dec 9, 2022, 12:23:04 AM
to Prometheus Users

Hello All,

 

I am scraping CPU usage data every minute, and the query I'm using is ((1 - avg(irate(node_cpu_seconds_total{mode="idle",instance=~"$ip"}[5m])) by (instance)) * 100).

I'm getting averaged data, but I want the last sample that was evaluated.

Could you please help me with a proper query to fetch the CPU usage of the entire instance for a particular minute?

Thanks and Regards,

Sandesh S


Brian Candler

Dec 9, 2022, 3:43:16 AM
to Prometheus Users
You're averaging across all the CPUs to get a single figure for the instance, instead of a separate figure per CPU.  You're not averaging over time.

You're using irate(...) which uses the last two figures available for CPU usage.  I'd use rate(...[2m]) instead of irate(...[5m]) but they should give the same results when you have a 1 minute scrape interval.  Either of these will take the difference between node_cpu_seconds_total@now and node_cpu_seconds_total@1_minute_ago and use this to calculate the rate of CPU usage.
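
As a rough sketch of the rate(...[2m]) variant mentioned above (it keeps the $ip dashboard variable and the "idle" mode selector from the original query, and assumes the 1-minute scrape interval described in the question):

```promql
# CPU usage (%) per instance: 1 minus the idle fraction, averaged across all CPUs.
# With a 1-minute scrape interval, a 2-minute window normally holds two samples,
# so rate() works from the same pair of points that irate() would pick.
(1 - avg by (instance) (rate(node_cpu_seconds_total{mode="idle",instance=~"$ip"}[2m]))) * 100
```

If scrapes are occasionally missed, a slightly wider window may be needed for rate() to return a value at all.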

So what exactly is wrong with this query - in other words, what do you want that's different?

Sandesh Shivakumar

Dec 9, 2022, 5:06:16 AM
to Prometheus Users
Hello Sir,

Actually, I want to find anomalies in the memory and CPU time series using Prometheus, so I thought of implementing a Z-score.
The Z-score formula is (x - mean) / standard deviation.

We have already referred to the following links:



but couldn't get any positive result.

So I would like a Prometheus query that helps me detect anomalies over a specified time interval, whether it uses a Z-score or any other method.

Brian Candler

Dec 9, 2022, 5:38:49 AM
to Prometheus Users
Then you need avg_over_time and stddev_over_time to get the mean and standard deviation over time, and those links give you the exact queries to use, especially https://stackoverflow.com/a/73351330

Instead of sum(rate(http_server_requests_seconds_count[1m])) you'll use 1 - avg(irate(node_cpu_seconds_total{mode="idle",instance=~"$ip"}[5m])) by (instance)
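
Put together, the substitution might look roughly like the sketch below. The 1h lookback and the 1m subquery resolution are assumptions chosen for illustration (the resolution simply mirrors the 1-minute scrape interval); they are not taken from the linked answer, so tune them to your data.

```promql
# Z-score of per-instance CPU usage: (x - mean) / standard deviation
(
  # x: current CPU usage fraction per instance
  (1 - avg(irate(node_cpu_seconds_total{mode="idle",instance=~"$ip"}[5m])) by (instance))
  -
  # mean of the same expression over the last hour, evaluated every minute
  avg_over_time((1 - avg(irate(node_cpu_seconds_total{mode="idle",instance=~"$ip"}[5m])) by (instance))[1h:1m])
)
/
# standard deviation of the same expression over the same window
stddev_over_time((1 - avg(irate(node_cpu_seconds_total{mode="idle",instance=~"$ip"}[5m])) by (instance))[1h:1m])
```

All three parts carry only the "instance" label, so the binary operators match without extra qualifiers. If usage is perfectly flat over the window, the standard deviation is zero and the result will not be a usable number.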

If the expression you've built doesn't work, then you'll need to debug it.  I'd approach that by breaking it into parts, and putting the parts individually into the PromQL expression browser in the web UI, and drawing graphs of each subexpression.  Then start to combine the subexpressions and look at the results of those, until you've built up the whole query.

Typical problems are that one subexpression produces an empty instant vector, or that you're trying to combine two subexpressions with different label sets, so the result set is empty unless you use appropriate "on" or "ignoring" qualifiers.
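
To illustrate the label-set problem, here is a hypothetical pair of queries (run them separately in the expression browser). The 2m, 1h and 1m values are arbitrary choices for the example, and the exact labels on your series depend on your scrape configuration.

```promql
# Returns an empty result: the left side still has per-CPU labels
# (typically cpu, mode, instance, job), while the right side only has
# "instance" after the aggregation, so no pair of series matches one-to-one.
rate(node_cpu_seconds_total{mode="idle"}[2m])
  - avg_over_time((avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[2m])))[1h:1m])

# Matches on "instance" only; group_left lets the many per-CPU series on the
# left pair with the single per-instance series on the right.
rate(node_cpu_seconds_total{mode="idle"}[2m])
  - on (instance) group_left
    avg_over_time((avg by (instance) (rate(node_cpu_seconds_total{mode="idle"}[2m])))[1h:1m])
```

Graphing each side on its own first makes it easy to see which labels each subexpression actually carries.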