Operational intensity inf on cluster


bob lukas

Feb 28, 2024, 12:02:22 PM
to likwid-users
Hi all,

I am using likwid-5.2.2 to compute the arithmetic intensity of some parts of an application. In particular, I am using the marker API and likwid-perfctr in the following way:

likwid-perfctr -C 0 -g MEM_DP -V 3 ./likwidprogram

Here's the relevant part of the output. As can be seen, I am not able to obtain a meaningful operational intensity: all memory metrics are 0, so the intensity is reported as inf.


CPU name: Intel(R) Xeon(R) CPU E5-2640 0 @ 2.50GHz
CPU type: Intel Xeon SandyBridge EN/EP processor
CPU clock: 2.50 GHz

+-----------------------------------+------------+
| Metric | HWThread 0 |
+-----------------------------------+------------+
| Runtime (RDTSC) [s] | 254.2896 |
| Runtime unhalted [s] | 209.4750 |
| Clock [MHz] | 2947.3560 |
| CPI | 0.8580 |
| Energy [J] | 0 |
| Power [W] | 0 |
| Energy DRAM [J] | 0 |
| Power DRAM [W] | 0 |
| MFLOP/s | 8.1904 |
| AVX [MFLOP/s] | 0.2023 |
| Packed [MUOPS/s] | 0.0506 |
| Scalar [MUOPS/s] | 7.9881 |
| Memory read bandwidth [MBytes/s] | 0 |
| Memory read data volume [GBytes] | 0 |
| Memory write bandwidth [MBytes/s] | 0 |
| Memory write data volume [GBytes] | 0 |
| Memory bandwidth [MBytes/s] | 0 |
| Memory data volume [GBytes] | 0 |
| Operational intensity | inf |
+-----------------------------------+------------+
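
The `inf` in the last row is consistent with a division by zero: operational intensity is the number of floating-point operations divided by the bytes moved to and from memory, and every memory metric in the table reads 0. A minimal Python sketch of that arithmetic, using the values from the table. The function name is hypothetical, and this is only an illustration, not LIKWID's actual implementation:

```python
import math


def operational_intensity(mflops_per_s, mem_bandwidth_mbytes_per_s):
    """Return FLOP per byte from two rates measured over the same interval.

    Hypothetical helper for illustration. Returns inf when no memory
    traffic was recorded, mirroring the 'inf' shown in the table.
    """
    if mem_bandwidth_mbytes_per_s == 0:
        return math.inf
    return mflops_per_s / mem_bandwidth_mbytes_per_s


# Values from the table: 8.1904 MFLOP/s and 0 MBytes/s memory bandwidth.
print(operational_intensity(8.1904, 0))
```

Once the memory controller counters can be read, the denominator becomes non-zero and the metric takes a finite value.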


likwid was installed through spack on my institution's cluster. In particular, the config.mk file contains:
ACCESSMODE = perf_event#NO SPACE

After searching a bit, I discovered that perf_event_paranoid is set to 2. Moreover, looking through the full output, I found that likwid at some point reported the following:

DEBUG - [perfmon_setupCountersThread_perfevent:905] Cannot measure Uncore with perf_event_paranoid value = 2


This looks like https://github.com/RRZE-HPC/likwid/issues/327, so I guess the solution is to set that value to 0, right?

Since I am using likwid on a cluster, does this mean that the only way to make this work is to contact the cluster administrator and ask them to change this value?


Best,
Bob

Thomas Gruber

Feb 28, 2024, 12:28:11 PM
to likwid-users
Hi,

yes, you have to ask your sysadmin how you can reduce the paranoid level to zero, otherwise you won't get data from the memory controllers. In our center, we have a special job submission flag for that, and I know of other centers that do as well. It is often mentioned in the docs in relation to Intel VTune, PAPI, or any other performance tool.
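
The current level can be checked from inside a job before and after requesting access. A minimal Python sketch, assuming (as the debug message and this thread suggest) that any value above 0 blocks Uncore measurements in perf_event mode for a non-root user. The function names are hypothetical; only the /proc path is taken from the thread:

```python
from pathlib import Path

# Kernel setting discussed in this thread.
PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")


def read_paranoid(path=PARANOID_PATH):
    """Read the perf_event_paranoid level as an integer."""
    return int(Path(path).read_text().strip())


def uncore_measurable(level):
    """Assumption based on this thread: Uncore counters (e.g. memory
    controllers) need a level of 0 or lower for non-root users."""
    return level <= 0


if __name__ == "__main__":
    try:
        level = read_paranoid()
    except OSError:
        print("perf_event_paranoid is not readable on this system")
    else:
        status = "possible" if uncore_measurable(level) else "blocked"
        print(f"perf_event_paranoid = {level}: Uncore measurement {status}")
```

With the value of 2 reported above, this sketch would print that Uncore measurement is blocked, which matches the all-zero memory metrics.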

Remark: You are not using the MarkerAPI at the moment. You have to add -m to the command.

bob lukas

Feb 28, 2024, 4:20:00 PM
to likwid-users
First, thanks a lot for the quick reply!

Are you specifically talking about this:
https://github.com/RRZE-HPC/likwid/wiki/LIKWID-and-SLURM#enabling-cpu-performance-monitoring
?

> A suitable way for HPC clusters with Slurm is to configure a prolog that detects if a job is running exclusively on a node and then sets /proc/sys/kernel/perf_event_paranoid to 0. Correspondingly, an epilog is needed that sets it back to the default value of 2.

Best,
Bob

Thomas Gruber

Mar 16, 2024, 7:54:10 AM
to likwid-users
Hi bob,

yes, this is the page. The actual code snippets required for SLURM are at the bottom: https://github.com/RRZE-HPC/likwid/wiki/LIKWID-and-SLURM#slurm-integration

There is a submit filter that allows the "gimme access" flag, aka "--constraint hwperf", only for exclusive jobs, plus the two pieces for the prologue/epilogue that actually set the value.

Best,
Tom