You will need to write a PromQL query which performs the same calculation that "top" is doing to produce that value. I don't know what time period top averages over, nor what scrape interval you are using for your node_exporter metrics.
As I said before, if you want example queries to copy for any of the node_exporter metrics, there are ready-made Grafana dashboards available. Just open them up and copy the queries they make.
If you are scraping at 1-minute intervals, then something like this should do the trick:
avg by (instance) (rate(node_cpu_seconds_total{mode="steal"}[2m])) * 100
You can use 'sum' instead of 'avg', but then the percentages are summed across all CPUs, so the scale depends on the CPU count (e.g. a host with 8 CPUs will report values out of 800%).
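That is, the sum variant of the query would be:

sum by (instance) (rate(node_cpu_seconds_total{mode="steal"}[2m])) * 100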
How this works:
- the metric node_cpu_seconds_total{mode="steal"} accumulates all the time that each CPU has spent in the "steal" state
- taking a rate(...) of this metric will tell you the fraction of time in this state, i.e. the number of seconds in "steal" state, per second of real time
- there will be separate values of this metric for each host (instance) and each cpu on that host (see the example after this list)
- avg by (instance) will group together all the metrics for each unique host, i.e. all CPUs on that host, and average them - giving one metric per host
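You can see the intermediate step by running the inner expression on its own; it returns one series per (instance, cpu) pair, each a fraction between 0 and 1:

rate(node_cpu_seconds_total{mode="steal"}[2m])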
The values for all CPU states *should* add up to 100%. In practice, they won't quite exactly; you can check with:
sum by (instance,cpu)(rate(node_cpu_seconds_total[2m])) * 100
If this matters, you can write a more complex query to normalize the results.
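For example, one approach (a sketch, not the only option) is to divide the steal rate by the total rate across all modes, so the states sum to exactly 100% by construction:

sum by (instance) (rate(node_cpu_seconds_total{mode="steal"}[2m]))
/
sum by (instance) (rate(node_cpu_seconds_total[2m]))
* 100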