Parsing mpstat

Richard Harnden

Nov 11, 2020, 12:18:17 PM

Hi,

I'm trying to get the average cpu busy%, where busy = 100 - idle%, from
mpstat.

My problem is that the %idle column changes:
sometimes like this:
16:50:03 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle

and sometimes like this:
16:45:36 CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s

I want the average after 5 seconds, which is the last line of mpstat 1 5

So what I have is this ...

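#!/bin/ksh

# Find the position of the %idle column in the mpstat header.
typeset -i IDLE=1

for X in $(mpstat | grep %idle)
do
    [ "x${X}" = "x%idle" ] && break
    IDLE=$((${IDLE}+1))
done

# Take that column from the last line (the average) of a 5-second run.
IDLE_PC=$(mpstat 1 5 | awk -v IDLE_COL=${IDLE} '{print $IDLE_COL}' |
    tail -1)
BUSY_PC=$((100.0 - ${IDLE_PC}))

echo ${BUSY_PC}
return 0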

... but that just seems awfully clunky.

There must be a better way? Any help much appreciated.

Thanks.

Kenny McCormack

Nov 11, 2020, 12:39:46 PM

In article <roh6ci$mru$1...@dont-email.me>,

Richard Harnden <nospam....@gmail.com> wrote:
>Hi,
>
>I'm trying to get the average cpu busy%, where busy = 100 - idle%, from
>mpstat.

% mpstat | gawk '/%idle/ { for (i=1; i<=NF; i++) if ($i == "%idle") break;next } i { print 100-$i }'
6.39
%

6.39 seems to be the right answer, since I have idle == 93.61.
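
For anyone who finds the one-liner dense, here is a minimal sketch of the same idea spread over several lines. The function name busy_from_mpstat is made up for this illustration, and it reads mpstat output on stdin so it can be tried on saved output as well as on a live pipe. It is not a drop-in replacement for the command above, only an expanded reading of it.

```shell
# busy_from_mpstat: read mpstat output on stdin and print 100 - %idle
# for every data line that follows the header.
# (Hypothetical helper name, chosen for this example only.)
busy_from_mpstat() {
    awk '
        # Header line: remember which field is labelled "%idle".
        /%idle/ {
            col = 0
            for (i = 1; i <= NF; i++)
                if ($i == "%idle") { col = i; break }
            next
        }
        # Data lines: only after a header was seen, and only if the
        # line is long enough to contain that column.
        col && NF >= col { printf "%.2f\n", 100 - $col }
    '
}
```

Typical use would be `mpstat | busy_from_mpstat`. Because the column is looked up by name, both header layouts shown in the original question are handled the same way.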

Note: If you don't have gawk, you should either:

1) Install it (preferred).
or
2) Try it with plain old "awk".

--
The randomly chosen signature file that would have appeared here is more than 4
lines long. As such, it violates one or more Usenet RFCs. In order to remain
in compliance with said RFCs, the actual sig can be found at the following URL:
http://user.xmission.com/~gazelle/Sigs/EternalFlame

Chris Elvidge

Nov 11, 2020, 12:44:22 PM

> ... but that just seems awfully clunky.

>
> There must be a better way? Any help much appreciated.
>
> Thanks.

Not good at ksh, but in bash:
echo "100-$(mpstat 1 5 | tail -n1 | awk '{print $NF}')" | bc
seems to work

--

Chris Elvidge, England

Richard Harnden

Nov 11, 2020, 1:15:32 PM

On 11/11/2020 17:39, Kenny McCormack wrote:
> In article <roh6ci$mru$1...@dont-email.me>,
> Richard Harnden <nospam....@gmail.com> wrote:
>> Hi,
>>
>> I'm trying to get the average cpu busy%, where busy = 100 - idle%, from
>> mpstat.
>
> % mpstat | gawk '/%idle/ { for (i=1; i<=NF; i++) if ($i == "%idle") break;next } i { print 100-$i }'
> 6.39
> %
>
> 6.39 seems to be the right answer, since I have idle == 93.61.

Thank you.

This works for me:
$ mpstat 1 5 | awk '/%idle/ { for (i=1; i<=NF; i++) if ($i == "%idle") break;next } /Average/ { print 100-$i }'
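
One caveat, offered as an assumption rather than something tested across all the Red Hat versions mentioned: in locales with a 12-hour clock, mpstat can print the timestamp as two fields (for example "04:50:03 PM"), while the "Average:" line always starts with a single field. The header and the average line are then no longer aligned when counted from the left. A sketch that counts the column from the right edge instead (busy_average is a made-up name):

```shell
# busy_average: read "mpstat 1 5" style output on stdin and print
# 100 - %idle for the "Average:" line.
# The %idle column is remembered as a distance from the LAST field,
# so an extra AM/PM field in the header does not shift it.
busy_average() {
    awk '
        /%idle/ {
            for (i = 1; i <= NF; i++)
                if ($i == "%idle") { from_end = NF - i; found = 1 }
            next
        }
        found && /^Average/ { printf "%.2f\n", 100 - $(NF - from_end) }
    '
}
```

Used as `mpstat 1 5 | busy_average`. With a 24-hour timestamp it should give the same answer as the command above.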

>
> Note: If you don't have gawk, you should either:
>
> 1) Install it (preferred).
> or
> 2) Try it with plain old "awk".
>

$ ls -l /usr/bin/awk
lrwxrwxrwx. 1 root root 4 Aug 12 2018 /usr/bin/awk -> gawk

But, yeah, I ought to spell it gawk.

Thanks again.

Richard Harnden

Nov 11, 2020, 1:21:59 PM

Thanks, but the %idle column isn't always the last one and that was half
my problem.

Chris Elvidge

Nov 11, 2020, 2:22:42 PM

OIC

--

Chris Elvidge, England

Ben Bacarisse

Nov 11, 2020, 3:37:48 PM

Richard Harnden <richard...@gmail.com> writes:

> I'm trying to get the average cpu busy%, where busy = 100 - idle%, from mpstat.
>
> My problem is that the %idle column changes:
> sometimes like this:
> 16:50:03 CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
>
> and sometimes like this:
> 16:45:36 CPU %user %nice %sys %iowait %irq %soft %steal %idle intr/s
>
> I want the average after 5 seconds, which is the last line of mpstat 1 5

Another approach is to use the -o JSON option. If you have a js
processor available you can get properly reliable results, but just
using sed you can get the raw percentages on their own:

$ mpstat -o JSON 1 5 | sed -ne '/idle/s/.*"idle": \([0-9.]*\).*/\1/p'
95.49
95.24
93.73
94.47
92.98
$ mpstat -o JSON 1 5 |\
sed -ne '/idle/s/.*"idle": \([0-9.]*\).*/\1/p' |\
awk '{s+=$1} END {print s/NR}'
95.628

--
Ben.

Jorgen Grahn

Nov 11, 2020, 4:06:55 PM

On Wed, 2020-11-11, Richard Harnden wrote:
> Hi,
>
> I'm trying to get the average cpu busy%, where busy = 100 - idle%, from
> mpstat.
>
> My problem is that the %idle column changes:
> sometimes like this:
> 16:50:03 CPU %usr %nice %sys %iowait %irq %soft %steal
> %guest %gnice %idle
>
> and sometimes like this:
> 16:45:36 CPU %user %nice %sys %iowait %irq %soft %steal
> %idle intr/s

Do they really vary the spelling of "%usr" and skip "intr/s" too? At
random? Does all this vary within one invocation of mpstat?

I used to parse mpstat output with much success ten years ago. It
seems to have gotten worse since then, but it would be unlikely to be
actively hostile to scripting.

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Kenny McCormack

Nov 11, 2020, 4:09:26 PM

In article <87v9ebt...@bsb.me.uk>,
Ben Bacarisse <ben.u...@bsb.me.uk> wrote:
...

>$ mpstat -o JSON 1 5 |\
> sed -ne '/idle/s/.*"idle": \([0-9.]*\).*/\1/p' |\
> awk '{s+=$1} END {print s/NR}'
>95.628

Whenever you end up with both sed and awk in a pipeline, you have almost
certainly made a mistake.

With very rare exceptions (*), anything sed can do, awk can do (better).
(And much more clearly - in a way that doesn't look like line noise).

(*) I grant that these exceptions exist (as it has come up in previous
discussions with hardcore sedsters), but I am not personally aware of
what they are.
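
To illustrate the single-tool variant, here is a rough sketch that does both the extraction and the averaging in awk. It assumes, as the sed expression quoted above does, that each sample line of the JSON output contains the text "idle": followed by a number; the function name idle_average_json is invented for the example. It uses only match(), substr() and sub(), which are in POSIX awk.

```shell
# idle_average_json: read "mpstat -o JSON" output on stdin and print
# the mean of all "idle" values found.
# (Hypothetical helper; the JSON layout is assumed, not parsed properly.)
idle_average_json() {
    awk '
        # Locate the text  "idle": <number>  anywhere on the line.
        match($0, /"idle": *[0-9.]+/) {
            s = substr($0, RSTART, RLENGTH)
            sub(/^"idle": */, "", s)    # keep just the number
            sum += s
            n++
        }
        END { if (n) print sum / n }
    '
}
```

Used as `mpstat -o JSON 1 5 | idle_average_json`. A real JSON processor would still be the more reliable choice where one is available.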

--
Mike Huckabee has yet to consciously uncouple from Josh Duggar.

Ben Bacarisse

Nov 11, 2020, 5:12:03 PM

gaz...@shell.xmission.com (Kenny McCormack) writes:

> In article <87v9ebt...@bsb.me.uk>,
> Ben Bacarisse <ben.u...@bsb.me.uk> wrote:
> ...
>>$ mpstat -o JSON 1 5 |\
>> sed -ne '/idle/s/.*"idle": \([0-9.]*\).*/\1/p' |\
>> awk '{s+=$1} END {print s/NR}'
>>95.628
>
> Whenever you end up with both sed and awk in a pipeline, you have almost
> certainly made a mistake.

I solve problems like this in stages. Picking out the number that
follows '"idle": ' is the sort of thing I know how to do in sed without
thinking. Averaging a list of numbers is another one most of us would
type without much thought. I don't consider this sort of development,
or the solutions that come from it, a mistake. It has served me well
for nearly 40 years. YMMV.

--
Ben.

Richard Harnden

Nov 11, 2020, 5:14:44 PM

On 11/11/2020 21:06, Jorgen Grahn wrote:
> On Wed, 2020-11-11, Richard Harnden wrote:
>> Hi,
>>
>> I'm trying to get the average cpu busy%, where busy = 100 - idle%, from
>> mpstat.
>>
>> My problem is that the %idle column changes:
>> sometimes like this:
>> 16:50:03 CPU %usr %nice %sys %iowait %irq %soft %steal
>> %guest %gnice %idle
>>
>> and sometimes like this:
>> 16:45:36 CPU %user %nice %sys %iowait %irq %soft %steal
>> %idle intr/s
>
> Do they really vary the spelling of "%usr" and skip "intr/s" too? At
> random? Does all this vary within one invocation of mpstat?

Those are cut'n'pastes and I hadn't noticed the %usr/%user thing.

Not at random, no. I have different versions of redhat from 5.6 thru
8.1 and that's why I have different versions of mpstat.

Janis Papanagnou

Nov 11, 2020, 5:15:21 PM

On 11.11.2020 19:15, Richard Harnden wrote:
> On 11/11/2020 17:39, Kenny McCormack wrote:
>> In article <roh6ci$mru$1...@dont-email.me>,
>> Richard Harnden <nospam....@gmail.com> wrote:

> [...]

>
> This works for me:
> $ mpstat 1 5 | awk '/%idle/ { for (i=1; i<=NF; i++) if ($i == "%idle")
> break;next } /Average/ { print 100-$i }'
>
>>
>> Note: If you don't have gawk, you should either:
>>
>> 1) Install it (preferred).
>> or
>> 2) Try it with plain old "awk".
>>
>
> $ ls -l /usr/bin/awk
> lrwxrwxrwx. 1 root root 4 Aug 12 2018 /usr/bin/awk -> gawk
>
> But, yeah, I ought to spell it gawk.

Unless you use GNU awk specific features I consider using the name of the
standard tool 'awk' better to indicate that. Kenny's solution is standard.
(The suggestion to install GNU awk is unrelated to such code maintenance
and portability considerations; using GNU awk has its advantages, and you
have it anyway installed as you showed us.)

In your original post you say "My problem is that the %idle column changes"
and I wonder under which conditions that happens; using different options?
or on different systems? or depending on the mpstat release?

For that reason I always appreciate the existence of options to control
output (like -o in 'ps'); that way you can simplify parsing and standardize
access to tool features.

Janis

Richard Harnden

Nov 11, 2020, 5:19:23 PM

Thanks, but the JSON output option seems pretty new and most of my
mpstat versions don't have it.

Kenny's awk one-liner was the clue I needed.

Richard Harnden

Nov 11, 2020, 5:26:54 PM

On 11/11/2020 22:15, Janis Papanagnou wrote:
> On 11.11.2020 19:15, Richard Harnden wrote:
>> On 11/11/2020 17:39, Kenny McCormack wrote:
>>> In article <roh6ci$mru$1...@dont-email.me>,
>>> Richard Harnden <nospam....@gmail.com> wrote:
>> [...]
>>
>> This works for me:
>> $ mpstat 1 5 | awk '/%idle/ { for (i=1; i<=NF; i++) if ($i == "%idle")
>> break;next } /Average/ { print 100-$i }'
>>
>>>
>>> Note: If you don't have gawk, you should either:
>>>
>>> 1) Install it (preferred).
>>> or
>>> 2) Try it with plain old "awk".
>>>
>>
>> $ ls -l /usr/bin/awk
>> lrwxrwxrwx. 1 root root 4 Aug 12 2018 /usr/bin/awk -> gawk
>>
>> But, yeah, I ought to spell it gawk.
>
> Unless you use GNU awk specific features I consider using the name of the
> standard tool 'awk' better to indicate that. Kenny's solution is standard.
> (The suggestion to install GNU awk is unrelated to such code maintenance
> and portability considerations; using GNU awk has its advantages, and you
> have it anyway installed as you showed us.)
>
> In your original post you say "My problem is that the %idle column changes"
> and I wonder under which conditions that happens; using different options?
> or on different systems? or depending on the mpstat release?

Different systems and therefore different versions of mpstat.

>
> For that reason I always appreciate the existence of options to control
> output (like -o in 'ps'); that way you can simplify parsing and standardize
> access to tool features.

-o as in ps would've been exactly what I needed, yes.

Kenny McCormack

Nov 11, 2020, 6:01:59 PM

In article <rohnpk$8qd$1...@news-1.m-online.net>,
Janis Papanagnou <janis_pa...@hotmail.com> wrote:
...

>Unless you use GNU awk specific features I consider using the name of
>the standard tool 'awk' better to indicate that.

The problem with this mentality is that if you just say "awk", you really
don't know what you're getting and it can vary a lot from system to system.
Even if you don't explicitly use any "GAWK extensions", you still have to
deal with the fact that "awk" can be old and/or buggy. Even if it is
current, it can still be buggy.

Although this problem is probably lessened over the years, I think we all,
as long time readers of comp.lang.awk, know which particular system I have
in mind as the poster child for this problem.

The way I look at it is, if I can't trust a system to have gawk, then I can't
really trust it to have a usable awk. So, I always write "gawk".

For example, unless/until you install gawk on a Debian system (or any
Debian-like system, such as Ubuntu, etc), "awk" gets you "mawk". Now,
"mawk" is not terrible, but it isn't particularly good either. In
particular, one of the things I love about GAWK (and GNU tools in general)
is the "no arbitrary limits" philosophy. This means that if I use GAWK,
even if I don't explicitly use any "GAWK extensions", I can rely on the
program doing the right thing with very long input lines. I have
successfully used GAWK to process lines of several hundred thousand
characters. Most other AWKs, including mawk, will barf on this.

$ mawk -W version
mawk 1.3.3 Nov 1996, Copyright (C) Michael D. Brennan

compiled limits:
max NF 32767
sprintf buffer 1020
$

P.S. Yes, I know that, in theory at least, the Debian "alternatives"
system will take care of this - that is, if I install "gawk", then "awk"
will (hopefully) get me "gawk" (i.e., it no longer gets me mawk). However,
this can and does break in practice. I won't bore you further with the
details.

--
"Insisting on perfect safety is for people who don't have the balls to live
in the real world."

- Mary Shafer, NASA Ames Dryden -

Jorgen Grahn

Nov 11, 2020, 6:26:00 PM

On Wed, 2020-11-11, Richard Harnden wrote:
> On 11/11/2020 21:06, Jorgen Grahn wrote:
>> On Wed, 2020-11-11, Richard Harnden wrote:
>>> Hi,
>>>
>>> I'm trying to get the average cpu busy%, where busy = 100 - idle%, from
>>> mpstat.
>>>
>>> My problem is that the %idle column changes:
>>> sometimes like this:
>>> 16:50:03 CPU %usr %nice %sys %iowait %irq %soft %steal
>>> %guest %gnice %idle
>>>
>>> and sometimes like this:
>>> 16:45:36 CPU %user %nice %sys %iowait %irq %soft %steal
>>> %idle intr/s
>>
>> Do they really vary the spelling of "%usr" and skip "intr/s" too? At
>> random? Does all this vary within one invocation of mpstat?
>
> Those are cut'n'pastes and I hadn't noticed the %usr/%user thing.

Not nice (pun unintended) to rename the columns between versions ...

> Not at random, no. I have different versions of redhat from 5.6 thru
> 8.1 and that's why I have different versions of mpstat.

I suspect this is a very non-shell suggestion, but if you're
collecting stuff like this from a large number of machines,
you may want to look into Prometheus:

<https://en.wikipedia.org/wiki/Prometheus_(software)>

You'd have, among other things, the samples mpstat would have taken,
accumulating in a central database, and you could query it for
e.g. "idle on hosts named foo* since January".

I haven't tried to query that database from a shell script, so I
cannot tell how well it's supported. (It would be useful and it's
technically feasible to stream in a columnar text format out of that
database, but I don't know if anyone has cared enough to implement it.)

Janis Papanagnou

Nov 11, 2020, 6:53:55 PM

On 12.11.2020 00:01, Kenny McCormack wrote:
> In article <rohnpk$8qd$1...@news-1.m-online.net>,
> Janis Papanagnou <janis_pa...@hotmail.com> wrote:
> ...
>> Unless you use GNU awk specific features I consider using the name of
>> the standard tool 'awk' better to indicate that.
>
> The problem with this mentality is that if you just say "awk", you really
> don't know what you're getting and it can vary a lot from system to system.
> Even if you don't explicitly use any "GAWK extensions", you still have to
> deal with the fact that "awk" can be old and/or buggy. Even if it is
> current, it can still be buggy.

While I understand the "buggy" issue we can explicitly name SunOS/Solaris
with its standard path pointing to a really old and buggy awk, but even
on Solaris you have the standard awk available in /usr/xpg4/bin. All other
awks should anyway be standard. (That's at least widely my observation.)

That said, I even wonder whether your awk solution wouldn't work on Solaris
even with its default old awk, but I have no system to check.

So distinguishing standard from non-standard is, even on Solaris, simply
achievable, and if one has problems with the Solaris default he should just
fix his path setting. And there's always room in /usr/local/bin to install
(if not already there) other versions.

> Although this problem is probably lessened over the years, I think we all,
> as long time readers of comp.lang.awk, know which particular system I have
> in mind as the poster child for this problem.
>
> The way I look at it is, if I can't trust a system to have gawk, then I can't
> really trust it to have a usable awk. So, I always write "gawk".

I also understand the wish to have a reliable (and powerful) awk explicitly
named. That's why [on Solaris] I first set the standard path and thus fix
the issues with standard-conformity; and not only WRT awk but with all tools.

And I'd also install GNU awk locally; though - it's been a long time since I
used SunOS/Solaris, so I may be misremembering - I seem to recall that at
some point Sun had GNU awk also available in its delivery (i.e. without need
to install it yourself).

> For example, unless/until you install gawk on a Debian system (or any
> Debian-like system, such as Ubuntu, etc), "awk" gets you "mawk". Now,
> "mawk" is not terrible, but it isn't particularly good either.

I'd expected it to be standard.

> In
> particular, one of the things I love about GAWK (and GNU tools in general)
> is the "no arbitrary limits" philosophy. This means that if I use GAWK,
> even if I don't explicitly use any "GAWK extensions", I can rely on the
> program doing the right thing with very long input lines.

Yes, then you probably need non-obviously visible (non-standard) features.

My point was a view from another perspective: is my program standard, so that
it can be taken from system A to system B, or do I need to install (or be
able to install - which is *not* generally possible!) a specific version
of awk?

To illustrate it: professionally I have worked in several companies over time,
and quite a few of them had security policies that disallowed installing your
own software. If I see software posted that relies on 'gawk' I'd be out of luck
running it in my standard environment. If, OTOH, I see code using 'awk' I'm
confident of being able to run it. Effectively it's just a hint and indication
that the code is standard.

That's why I suggest using 'awk' where the standard applies, and 'gawk' to
indicate the use of non-standard GNU awk specific features. That of course
needs knowledge of what is standard and what is an extension.
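
As a small practical aid to that last point, here is a hedged sketch for finding out which implementation the name 'awk' resolves to on a given system. The function name awk_flavor is invented for this example. It relies on two flags that are documented for specific implementations, --version (GNU awk) and -W version (mawk, also accepted by GNU awk); other awks may support neither, in which case the sketch simply reports that it could not tell.

```shell
# awk_flavor: print the first line of the version banner of whatever
# "awk" resolves to, or a fallback text if neither flag is understood.
# stdin is redirected from /dev/null so an awk that ignores the flag
# cannot sit waiting for input.
awk_flavor() {
    v=$(awk --version 2>/dev/null </dev/null | head -n 1)
    [ -n "$v" ] || v=$(awk -W version 2>/dev/null </dev/null | head -n 1)
    echo "${v:-unknown awk}"
}
```

Running `awk_flavor` on the systems a script is meant for gives a quick idea of whether writing plain 'awk' there means gawk, mawk or something else.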

Similarly with shells; unless I am using shell specific features I speak of
"shell" or of 'sh', implying using the standard base. Often I use extensions
(typically the ones that the "prominent", "modern" shells support the same
way) and name/explain it accordingly. And sometimes I name a specific shell
explicitly (often ksh) because of its distinguished feature used.

This was the central point of my suggestion; to indicate by tool name whether
a standard or non-standard feature set is used. And it was only a suggestion,
YMMV.

> I have
> successfully used GAWK to process lines of several hundred thousand

> characters. Most other AWKs, including mawk, will barf on this. [...]

Janis

John D Groenveld

Nov 12, 2020, 10:32:12 PM

In article <rohtie$aml$1...@news-1.m-online.net>,

Janis Papanagnou <janis_pa...@hotmail.com> wrote:
>And I'd also install GNU awk locally; though - it's been a long time since I
>used SunOS/Solaris, so I may be misremembering - I seem to recall that at
>some point Sun had GNU awk also available in its delivery (i.e. without need
>to install it yourself).

$ cat /etc/release
Oracle Solaris 11.4 X86
Copyright (c) 1983, 2020, Oracle and/or its affiliates.
Assembled 23 September 2020
$ pkg search -l /usr/bin/gawk OR /usr/gnu/bin/awk
INDEX ACTION VALUE PACKAGE
path link usr/bin/gawk pkg:/text/ga...@5.0.1-11.4.24.0.1.75.1
path file usr/gnu/bin/awk pkg:/text/ga...@5.0.1-11.4.24.0.1.75.1

John
groe...@acm.org

Chris Elvidge

Nov 13, 2020, 6:31:39 AM

On 11/11/2020 06:21 pm, Richard Harnden wrote:

Why not go to the source and calculate what you want?
awk '/cpu[0-9]/{for(i=2;i<=NF;i++)T+=$i;I=($5/T)*100;print toupper($1),I"%",100-I"%";T=0}' /proc/stat
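
Note that the counters in /proc/stat are cumulative since boot, so the command above reports a since-boot figure per CPU rather than a 5-second average. A minimal sketch of an interval-based variant follows; busy_between is a made-up name, and the field layout (user nice system idle iowait irq softirq steal guest guest_nice) is the one described in proc(5), taken here as an assumption about the running kernel.

```shell
# busy_between: given two "cpu ..." lines from /proc/stat (older first),
# print the busy percentage over the interval between them.
busy_between() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total = 0
            # Fields 2..9: user nice system idle iowait irq softirq steal.
            # guest and guest_nice (fields 10 and 11) are skipped because
            # the kernel already counts them inside user and nice.
            for (i = 2; i <= 9 && i <= NF; i++) total += $i
            idle = $5
            if (NR == 2 && total > t0)
                printf "%.2f\n", 100 - 100 * (idle - i0) / (total - t0)
            t0 = total; i0 = idle
        }
    '
}

# Usage on Linux: sample the aggregate line twice, five seconds apart.
# a=$(grep '^cpu ' /proc/stat); sleep 5; b=$(grep '^cpu ' /proc/stat)
# busy_between "$a" "$b"
```

Like the original, this treats only the idle field as idle time, which matches how busy = 100 - %idle was defined at the start of the thread.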

--

Chris Elvidge, England

Kenny McCormack

Nov 13, 2020, 7:30:31 AM

In article <rolqql$rj7$1...@dont-email.me>,
Chris Elvidge <ch...@mshome.net> wrote:
...

>Why not go to the source and calculate what you want?
>awk '/cpu[0-9]/{for(i=2;i<=NF;i++)T+=$i;I=($5/T)*100;print
>toupper($1),I"%",100-I"%";T=0}' /proc/stat

Yes, that's clever, but I don't think it is a sensible answer to OP's
question.

--
Hindsight is (supposed to be) 2020.

Trumpers, don't make the same mistake twice.
Don't shoot yourself in the feet - and everywhere else - again!.

Chris Elvidge

Nov 13, 2020, 8:33:59 AM

On 13/11/2020 12:30 pm, Kenny McCormack wrote:
> In article <rolqql$rj7$1...@dont-email.me>,
> Chris Elvidge <ch...@mshome.net> wrote:
> ...
>> Why not go to the source and calculate what you want?
>> awk '/cpu[0-9]/{for(i=2;i<=NF;i++)T+=$i;I=($5/T)*100;print
>> toupper($1),I"%",100-I"%";T=0}' /proc/stat
>
> Yes, that's clever, but I don't think it is a sensible answer to OP's
> question.
>

Ah but it still works in the event (Void Linux) that sysstat is not
installed.

--

Chris Elvidge, England