[PATCH] lparstat: Fix negative values for %idle PURR

Saket Kumar Bhaskar

<skb99@linux.ibm.com>
Jan 13, 2025, 3:13:47 AM
to powerpc-utils-devel@googlegroups.com, tyreld@linux.ibm.com, srikar@linux.ibm.com, sshegde@linux.ibm.com, nathanl@linux.ibm.com, naveen.n.rao@linux.ibm.com
In certain scenarios, the %idle PURR metric displays negative values [1]
while %busy PURR exceeds 100%, giving users a false impression of resource
utilisation. This occurs when delta_purr becomes greater than delta_tb,
which makes the following expression for %idle PURR negative, particularly
at 100% system utilization:

%idle = (delta_tb - delta_purr + delta_idle_purr) / delta_tb * 100;
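
To see how the sign flips, the expression can be evaluated for a hypothetical
interval. The sketch below is illustrative only: idle_purr_pct() and the
sample deltas are assumptions made for this example, not code or data taken
from lparstat.c.

```c
#include <assert.h>

/*
 * Illustrative helper (not part of lparstat.c): evaluates the %idle PURR
 * expression quoted above for one sampling interval.
 */
static double idle_purr_pct(double delta_tb, double delta_purr,
			    double delta_idle_purr)
{
	/* A zero or negative interval would make the ratio meaningless. */
	assert(delta_tb > 0.0);
	return (delta_tb - delta_purr + delta_idle_purr) / delta_tb * 100.0;
}
```

With made-up values delta_purr = 103 and delta_idle_purr = 0,
idle_purr_pct(100, 103, 0) is negative, while a slightly larger delta_tb
such as 104 gives a small positive result.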

Without change:

./lparstat -E 1 30

System Configuration
type=Shared mode=Uncapped smt=8 lcpu=20 mem=208057792 kB cpus=42 ent=2.00

---Actual---                 -Normalized-
%busy  %idle   Frequency     %busy  %idle
------ ------  ------------- ------ ------
103.88  -3.88  2.75GHz[ 98%] 101.80   0.00
103.46  -3.46  2.67GHz[ 95%]  98.28   1.49
101.53  -1.53  2.74GHz[ 98%]  99.50   0.51
103.41  -3.41  2.70GHz[ 96%]  99.27   0.37

The delta_tb is computed using get_scaled_tb, which calculates the
timebase for a given time difference. Previously, nanoseconds were
ignored in the calculation of time difference, which led to delta_tb
being underestimated.

This patch addresses the issue by incorporating nanoseconds into the
time difference, ensuring precise calculations.
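
The effect of dropping the nanosecond field can be sketched as follows. Both
helper names are hypothetical and exist only for this illustration; the
actual change is in the diff below.

```c
#include <time.h>

/* Hypothetical helper: elapsed time from whole seconds only (old behaviour). */
static double elapsed_whole_sec(struct timespec start, struct timespec end)
{
	return (double)(end.tv_sec - start.tv_sec);
}

/* Hypothetical helper: elapsed time including the nanosecond field. */
static double elapsed_with_ns(struct timespec start, struct timespec end)
{
	long long start_ns = (long long)start.tv_sec * 1000000000LL + start.tv_nsec;
	long long end_ns = (long long)end.tv_sec * 1000000000LL + end.tv_nsec;

	return (end_ns - start_ns) / 1000000000.0;
}
```

For two assumed samples taken at 100.950000000 s and 101.990000000 s, the
whole-second version reports 1.0 s while the nanosecond-aware version reports
about 1.04 s, so an elapsed time derived from tv_sec alone would be about 4%
short in this case.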

Also, rename get_time() to get_time_ns() to denote that it records the time
in nanoseconds. get_delta_time() is introduced as a wrapper that returns the
delta time in seconds.

With change:
./lparstat -E 1 30

System Configuration
type=Shared mode=Uncapped smt=8 lcpu=20 mem=208057792 kB cpus=42 ent=2.00

---Actual---                 -Normalized-
%busy  %idle   Frequency     %busy  %idle
------ ------  ------------- ------ ------
 99.52   0.48  2.74GHz[ 98%]  97.53   2.66
 99.53   0.47  2.71GHz[ 97%]  96.54   3.67
 99.49   0.51  2.71GHz[ 97%]  96.51   3.87
 99.51   0.49  2.70GHz[ 97%]  96.53   3.90
 99.48   0.52  2.69GHz[ 96%]  95.50   4.38

[1] https://github.com/ibm-power-utilities/powerpc-utils/issues/103

Signed-off-by: Saket Kumar Bhaskar <sk...@linux.ibm.com>
---
src/lparstat.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/src/lparstat.c b/src/lparstat.c
index fe8b0fc..db22316 100644
--- a/src/lparstat.c
+++ b/src/lparstat.c
@@ -254,7 +254,7 @@ long long get_delta_value(char *se_name)
 	return (value - old_value);
 }

-void get_time()
+void get_time_ns(void)
 {
 	struct sysentry *se;
 	struct timespec ts;
@@ -266,7 +266,12 @@ void get_time()

 	se = get_sysentry("time");
 	sprintf(se->value, "%lld",
-		(long long)ts.tv_sec);
+		(long long)ts.tv_sec * 1000000000LL + (long long)ts.tv_nsec);
+}
+
+double get_delta_time(void)
+{
+	return (get_delta_value("time") / 1000000000.0);
 }

 int get_time_base()
@@ -307,7 +312,7 @@ double get_scaled_tb(void)
 	se = get_sysentry("online_cores");
 	online_cores = atoi(se->value);

-	elapsed = get_delta_value("time");
+	elapsed = get_delta_time();

 	se = get_sysentry("timebase");
 	timebase = atoi(se->value);
@@ -386,7 +391,7 @@ void get_cpu_physc(struct sysentry *unused_se, char *buf)

 		physc = delta_purr / delta_tb;
 	} else {
-		elapsed = get_delta_value("time");
+		elapsed = get_delta_time();

 		se = get_sysentry("timebase");
 		timebase = atoi(se->value);
@@ -415,7 +420,7 @@ void get_cpu_app(struct sysentry *unused_se, char *buf)
 	float timebase, app, elapsed_time;
 	long long new_app, old_app;

-	elapsed_time = get_delta_value("time");
+	elapsed_time = get_delta_time();

 	se = get_sysentry("timebase");
 	timebase = atof(se->value);
@@ -1018,7 +1023,7 @@ void init_sysdata(void)
 {
 	int rc = 0;

-	get_time();
+	get_time_ns();
 	parse_lparcfg();
 	parse_proc_stat();
 	parse_proc_ints();
--
2.43.5

Srikar Dronamraju

<srikar@linux.ibm.com>
Jan 19, 2025, 7:25:20 AM
to Saket Kumar Bhaskar, powerpc-utils-devel@googlegroups.com, tyreld@linux.ibm.com, sshegde@linux.ibm.com
* Saket Kumar Bhaskar <sk...@linux.ibm.com> [2025-01-13 13:43:39]:

Saket,
This change looks good. It's good that we rename the function to get_time_ns
so that there is no confusion about what unit of time we are looking for.

Reviewed-by: Srikar Dronamraju <sri...@linux.ibm.com>
--
Thanks and Regards
Srikar Dronamraju

Samir Alamshaha Mulani

<samir@linux.ibm.com>
Feb 25, 2025, 3:48:38 AM
to Saket Kumar Bhaskar, powerpc-utils-devel@googlegroups.com, tyreld@linux.ibm.com, srikar@linux.ibm.com, sshegde@linux.ibm.com, nathanl@linux.ibm.com, naveen.n.rao@linux.ibm.com

Hello,

I have tested the patch, and the issue is no longer present.


------------Logs---------------
-----Without Patch-------
 lparstat -E 4 4

System Configuration
type=Shared mode=Uncapped smt=8 lcpu=10 mem=103223680 kB cpus=17 ent=5.00

---Actual---                 -Normalized-
%busy  %idle   Frequency     %busy  %idle
------ ------  ------------- ------ ------
 80.57  19.43  3.13GHz[112%]  90.24  10.10
101.21  -1.21  3.11GHz[111%] 112.35   0.00
101.20  -1.20  3.11GHz[111%] 112.33   0.00
101.19  -1.19  3.12GHz[111%] 112.32   0.00


-----With Patch------
./lparstat -E 4 4

System Configuration
type=Shared mode=Uncapped smt=8 lcpu=10 mem=103223680 kB cpus=17 ent=5.00

---Actual---                 -Normalized-
%busy  %idle   Frequency     %busy  %idle
------ ------  ------------- ------ ------
 99.25   0.75  2.98GHz[106%] 105.20   0.00
 99.24   0.76  3.00GHz[107%] 106.19   0.00
 99.23   0.77  3.00GHz[107%] 106.18   0.00
 99.25   0.75  3.00GHz[107%] 106.19   0.00


Thanks for the fix !!
Tested-by: Samir Mulani <sa...@linux.vnet.ibm.com>


Samir Alamshaha Mulani

<samir@linux.ibm.com>
Feb 25, 2025, 3:59:18 AM
to Saket Kumar Bhaskar, powerpc-utils-devel@googlegroups.com, tyreld@linux.ibm.com, srikar@linux.ibm.com, sshegde@linux.ibm.com, nathanl@linux.ibm.com, naveen.n.rao@linux.ibm.com

Hello,

On 13/01/25 1:43 pm, Saket Kumar Bhaskar wrote:
I have tested the patch, and the issue is no longer present.


Thanks for the fix !!
Tested-by: Samir Mulani <sa...@linux.vnet.ibm.com>

Tyrel Datwyler

<tyreld@linux.ibm.com>
Feb 25, 2025, 5:02:28 PM
to Saket Kumar Bhaskar, powerpc-utils-devel@googlegroups.com, srikar@linux.ibm.com, sshegde@linux.ibm.com, nathanl@linux.ibm.com, naveen.n.rao@linux.ibm.com