On 8/10/21 11:11 AM, Vince Weaver wrote:
> On Wed, 4 Aug 2021, Christopher Kraemer wrote:
>
>> I am using PAPI with TAU to profile machine learning applications that will
>> run on an NVIDIA NX. I am unable to get any events when I run ./papi_avail
>> and ./papi_component gives Unknown libpfm4 related error as the reason. I
>> have pasted the output of both of those commands below. Is there a way to
>> fix this?
>
> I don't know if libpfm4 has support for the NVIDIA NX ARM processors.
>
> The fix would be to add support for this to libpfm4 but that's not
> necessarily a trivial thing to do.
>
> Does the "perf" tool work on your device? Something like
> "perf stat /bin/ls"
>
> If perf is working it might be possible to convince PAPI to use the built
> in kernel events even if libpfm4 has no support, but you seem to be
> running a fairly old version of Linux so that might be an additional
> problem.
>
> Vince
Hi Christopher,
Yes, checking that perf supports the processor would be a good idea. As Vince mentioned, Linux 4.9 is fairly old. You can list the PMU events that perf knows about with:
perf list pmu
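If perf itself is not installed on the board, a rough alternative is to look at what the kernel exposes directly. Below is a minimal Python sketch, assuming the standard sysfs and procfs locations (/sys/bus/event_source/devices and /proc/sys/kernel/perf_event_paranoid). The function names are only illustrative and not part of PAPI or perf.

```python
from pathlib import Path


def list_kernel_pmus(sysfs_root="/sys"):
    """Return the names of the PMUs the kernel registers with perf_event.

    Returns an empty list if the directory is missing, which usually
    means the kernel was built without perf_event support.
    """
    devices = Path(sysfs_root) / "bus" / "event_source" / "devices"
    if not devices.is_dir():
        return []
    return sorted(entry.name for entry in devices.iterdir())


def read_paranoid_level(proc_root="/proc"):
    """Return perf_event_paranoid as an int, or None if it cannot be read."""
    path = Path(proc_root) / "sys" / "kernel" / "perf_event_paranoid"
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    pmus = list_kernel_pmus()
    print("PMUs exposed by the kernel:", ", ".join(pmus) or "none found")
    print("perf_event_paranoid:", read_paranoid_level())
```

On ARM I would expect the hardware PMU to show up under a name like armv8_pmuv3. If only software-style entries are listed, the kernel is probably not exposing the hardware counters at all, and neither perf nor PAPI will be able to use them.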
Looking at the kernel info, it looks like the problem might be due to the old kernel (or the libpfm/PAPI versions). What versions are being used? On an NVIDIA Jetson Nano Developer Kit running Fedora 34 with linux-5.13.8, I definitely see some events from "perf list pmu" and from "papi_avail":
$ papi_avail -a
Available PAPI preset and user defined events plus hardware information.
--------------------------------------------------------------------------------
PAPI version : 6.0.0.1
Operating system : Linux 5.13.8-200.fc34.aarch64
Vendor string and code : ARM (7, 0x7)
Model string and code : ARM Cortex A57 (1, 0x1)
CPU revision : 1.000000
CPUID : Family/Model/Stepping 8/3335/1, 0x08/0xd07/0x01
CPU Max MHz : 1912
CPU Min MHz : 204
Total cores : 4
SMT threads per core : 1
Cores per socket : 4
Sockets : 1
Cores per NUMA region : 4
NUMA regions : 1
Running in a VM : no
Number Hardware Counters : 6
Max Multiplex Counters : 384
Fast counter read (rdpmc): no
--------------------------------------------------------------------------------
================================================================================
PAPI Preset Events
================================================================================
Name          Code        Deriv  Description (Note)
PAPI_L1_DCM   0x80000000  Yes    Level 1 data cache misses
PAPI_L1_ICM   0x80000001  No     Level 1 instruction cache misses
PAPI_L2_DCM   0x80000002  No     Level 2 data cache misses
PAPI_L2_LDM   0x80000019  No     Level 2 load misses
PAPI_L2_STM   0x8000001a  No     Level 2 store misses
PAPI_BR_MSP   0x8000002e  No     Conditional branch instructions mispredicted
PAPI_TOT_IIS  0x80000031  No     Instructions issued
PAPI_TOT_INS  0x80000032  No     Instructions completed
PAPI_FP_INS   0x80000034  No     Floating point instructions
PAPI_LD_INS   0x80000035  No     Load instructions
PAPI_SR_INS   0x80000036  No     Store instructions
PAPI_BR_INS   0x80000037  No     Branch instructions
PAPI_VEC_INS  0x80000038  No     Vector/SIMD instructions (could include integer)
PAPI_TOT_CYC  0x8000003b  No     Total cycles
PAPI_L2_DCH   0x8000003f  No     Level 2 data cache hits
PAPI_L1_DCA   0x80000040  Yes    Level 1 data cache accesses
PAPI_L1_DCR   0x80000043  No     Level 1 data cache reads
PAPI_L2_DCR   0x80000044  No     Level 2 data cache reads
PAPI_L1_DCW   0x80000046  No     Level 1 data cache writes
PAPI_L2_DCW   0x80000047  No     Level 2 data cache writes
PAPI_L1_ICA   0x8000004c  No     Level 1 instruction cache accesses
--------------------------------------------------------------------------------
Of 21 available events, 2 are derived.
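To make it easier to compare the two setups, the version and event-count lines can be pulled out of the papi_avail output with a short script. This is only a sketch in Python: it assumes the output layout shown above (the "PAPI version", "Operating system" and "Of N available events" lines), and summarize_papi_avail is a made-up helper name, not something PAPI provides.

```python
import re
import subprocess


def summarize_papi_avail(text):
    """Extract versions and event counts from `papi_avail -a` output.

    Assumes the layout shown in this mail; fields that are not found
    are simply left out of the returned dict.
    """
    info = {}
    m = re.search(r"^\s*PAPI version\s*:\s*(\S+)", text, re.MULTILINE)
    if m:
        info["papi_version"] = m.group(1)
    m = re.search(r"^\s*Operating system\s*:\s*(.+)$", text, re.MULTILINE)
    if m:
        info["operating_system"] = m.group(1).strip()
    m = re.search(r"Of (\d+) available events, (\d+) (?:is|are) derived", text)
    if m:
        info["available_events"] = int(m.group(1))
        info["derived_events"] = int(m.group(2))
    # Preset lines start with the event name followed by its hex code.
    info["presets"] = re.findall(
        r"^\s*(PAPI_\w+)\s+0x[0-9a-fA-F]+", text, re.MULTILINE
    )
    return info


if __name__ == "__main__":
    # Requires papi_avail to be on PATH.
    result = subprocess.run(
        ["papi_avail", "-a"], capture_output=True, text=True
    )
    print(summarize_papi_avail(result.stdout))
```

Running it on the NX and sending the result along with the libpfm version should be enough to tell whether the kernel or the libraries are the limiting factor.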
-Will