PAPI support for ThunderX (Cavium) ARM processor


moha...@gmail.com

Jun 19, 2017, 4:30:27 PM
to ptools-perfapi
Hi,

I am trying to make PAPI work on the ThunderX (Cavium) ARM processor.

So far, compilation (./configure; make) works fine, but when I run the tests (make test) I get the following error:

Component perf_event disabled due to Error initializing libpfm4
test_utils.c                           FAILED
Line # 702
Error: ERROR! Zero Counters Available!
Makefile.inc:225: recipe for target 'test' failed



My investigation suggests that libpfm4 works fine, as shown by this command and others:
$ ./examples/check_events
Supported PMU models:
    [51, perf, "perf_events generic PMU"]
    [114, perf_raw, "perf_events raw PMU"]
    [151, arm_ac57, "ARM Cortex A57"]
    [152, arm_ac53, "ARM Cortex A53"]
    [156, arm_xgene, "Applied Micro X-Gene"]
Detected PMU models:
    [51, perf, "perf_events generic PMU"]
    [114, perf_raw, "perf_events raw PMU"]
Total events: 295 available, 81 supported
Requested Event: PERF_COUNT_HW_CPU_CYCLES
Actual    Event: perf::PERF_COUNT_HW_CPU_CYCLES
PMU            : perf_events generic PMU
IDX            : 106954752
Codes          : 0x0
Requested Event: PERF_COUNT_HW_INSTRUCTIONS
Actual    Event: perf::PERF_COUNT_HW_INSTRUCTIONS
PMU            : perf_events generic PMU
IDX            : 106954755
Codes          : 0x1

Any idea what the problem is and how to fix it?
Please let me know if you need any additional information.

Vince Weaver

Jun 19, 2017, 6:20:23 PM
to moha...@gmail.com, ptools-perfapi
On Mon, 19 Jun 2017, moha...@gmail.com wrote:

> I am trying to make PAPI work on the ThunderX (Cavium) ARM processor.
>
> So far, the compilation (./configure; make) does fine. But when I try to
> execute the tests (make test) I receive the following error:
>
> Component perf_event disabled due to Error initializing libpfm4
> test_utils.c                           FAILED
> Line # 702
> Error: ERROR! Zero Counters Available!
> Makefile.inc:225: recipe for target 'test' failed

what version of PAPI are you trying?

What does the output of "papi_component_avail" say?

Vince

Steve Kaufmann

Jun 19, 2017, 6:23:14 PM
to Vince Weaver, moha...@gmail.com, ptools-perfapi

libpfm4 does not support Cavium ThunderX processors: it neither recognizes the CPU nor supplies any event tables.


Steve
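
The check_events output posted at the start of the thread is consistent with this: the ARM entries (arm_ac57, arm_ac53, arm_xgene) appear only under "Supported PMU models", while "Detected PMU models" lists just the generic perf and perf_raw PMUs. A minimal Python sketch of that check follows. The function names and the GENERIC_PMUS set are illustrative and not part of libpfm4; the sample text is copied from the post.

```python
import re

# Generic perf_events PMUs; treating these two as "not a core PMU" is an
# assumption made for this sketch.
GENERIC_PMUS = {"perf", "perf_raw"}

# Matches lines such as:  [151, arm_ac57, "ARM Cortex A57"]
PMU_LINE = re.compile(r'\[\s*(\d+),\s*(\w+),\s*"([^"]*)"\s*\]')


def parse_pmu_sections(text):
    """Split check_events output into supported and detected PMU short names."""
    sections = {"supported": [], "detected": []}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        match = PMU_LINE.search(stripped)
        if stripped.startswith("Supported PMU models"):
            current = "supported"
        elif stripped.startswith("Detected PMU models"):
            current = "detected"
        elif match and current:
            sections[current].append(match.group(2))
        else:
            current = None  # any other line ends the current PMU listing
    return sections


def detected_core_pmus(text):
    """Return detected PMUs other than the generic perf_events ones."""
    detected = parse_pmu_sections(text)["detected"]
    return [name for name in detected if name not in GENERIC_PMUS]


SAMPLE = """\
Supported PMU models:
    [51, perf, "perf_events generic PMU"]
    [114, perf_raw, "perf_events raw PMU"]
    [151, arm_ac57, "ARM Cortex A57"]
    [152, arm_ac53, "ARM Cortex A53"]
    [156, arm_xgene, "Applied Micro X-Gene"]
Detected PMU models:
    [51, perf, "perf_events generic PMU"]
    [114, perf_raw, "perf_events raw PMU"]
Total events: 295 available, 81 supported
"""

print(detected_core_pmus(SAMPLE))  # prints []
```

An empty list here means libpfm4 found no CPU-specific PMU on the machine, which would fit the "Error initializing libpfm4" message PAPI reports.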



moha...@gmail.com

Jun 19, 2017, 7:02:54 PM
to ptools-perfapi, moha...@gmail.com
It's the latest version: PAPI 5.5.1.

$ ./utils/papi_component_avail 
Available components and hardware information.
--------------------------------------------------------------------------------
PAPI Version             : 5.5.1.0
Vendor string and code   : ARM (7)
Model string and code    :  (0)
CPU Revision             : 0.000000
CPU Max Megahertz        : 2
CPU Min Megahertz        : 2
Hdw Threads per core     : 1
Cores per Socket         : 48
Sockets                  : 2
NUMA Nodes               : 2
CPUs per Node            : 48
Total CPUs               : 96
Running in a VM          : no
Number Hardware Counters : 0
Max Multiplex Counters   : 384
--------------------------------------------------------------------------------

Compiled-in components:
Name:   perf_event              Linux perf_event CPU counters
   \-> Disabled: Error initializing libpfm4
Name:   perf_event_uncore       Linux perf_event CPU uncore and northbridge
   \-> Disabled: No uncore PMUs or events found

Active components:

--------------------------------------------------------------------------------
component.c                             PASSED

moha...@gmail.com

Jun 19, 2017, 7:12:48 PM
to ptools-perfapi, vincent...@maine.edu, moha...@gmail.com, s...@cray.com
libpfm4 works fine. That's probably because ThunderX is based on ARMv8 (?)

For example, the following command gives me this output:
$./perf_examples/task ls
config.mk  COPYING  debian  docs  examples  include  lib  libpfm.spec  Makefile  perf_examples python README rules.mk  tests

           3,307,661 cycles (0.00% scaling, ena=1,660,570, run=1,660,570)
           1,186,952 instructions (0.00% scaling, ena=1,660,570, run=1,660,570)

I suspect that only a small modification to PAPI is needed to add support.
BTW, the output of "uname -m" is "aarch64".

Vince Weaver

Jun 19, 2017, 7:28:02 PM
to moha...@gmail.com, ptools-perfapi, vincent...@maine.edu, s...@cray.com
On Mon, 19 Jun 2017, moha...@gmail.com wrote:

> Libpfm4 works fine. That's probably due to the fact that ThunderX is
> composed of Armv8 (?)
> For example, the following command give me this output:
> $./perf_examples/task ls

I misread your initial e-mail, so no, libpfm4 does not support this chip
directly.

Is there documentation available listing the events and their numbers?
It probably wouldn't be that bad to get libpfm4 patched to support this.

Alternatively, I think ARMv8 does have a standard set of supported
events. It would be nice if PAPI/libpfm4 could fall back to these.

Also, even if libpfm4 doesn't support things, in theory PAPI should be
able to use the perf::: events if Linux supports things. I'll have to
check to see why PAPI doesn't currently try to do this.

Vince

moha...@gmail.com

Jun 19, 2017, 7:53:00 PM
to ptools-perfapi, moha...@gmail.com, vincent...@maine.edu, s...@cray.com
I don't have the necessary documentation yet, but hopefully I will soon.

Indeed, the "perf" tool works fine on the installed Linux, so the perf_event API should work.

Enrico Calore

Jun 20, 2017, 8:07:12 AM
to ptools-...@icl.utk.edu, vince Weaver, moha...@gmail.com, s...@cray.com
On 06/20/2017 01:27 AM, Vince Weaver wrote:
> Also, even if libpfm4 doesn't support things, in theory PAPI should be
> able to use the perf::: events if Linux supports things. I'll have to
> check to see why PAPI doesn't currently try to do this.

Hi,
as far as I know, the BSC (Barcelona Supercomputing Center) has developed
patches to use PAPI on the Cavium ThunderX: one to support the ThunderX
PMU on old Linux kernels (it should already be supported since 4.4) and
one to extend the PAPI event definitions to support the ThunderX
hardware counters.

I used it and it's working fine. I guess it will be released soon, and I
can make sure it gets announced here as well.


Regards,

Enrico


Steve Kaufmann

Jun 20, 2017, 8:32:27 AM
to moha...@gmail.com, ptools-perfapi, vincent...@maine.edu

Setting LIBPFM_FORCE_PMU to "arm_xgene" should make both the libpfm4 executables and PAPI treat the ThunderX as if its events conform to the ARM standard. Some of the events might not be valid or might not return valid counts, but it might be enough to get by for the more common events found on ARM systems.


Steve
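
As a rough illustration of the workaround above, the variable can be set for a single command rather than for the whole session. The Python sketch below is one way to do that under stated assumptions: run_with_forced_pmu is a hypothetical helper name, and the child process used here is only a stand-in that echoes the variable. On the ThunderX machine one would pass a real tool instead, for example ["./utils/papi_avail"] from the PAPI build directory.

```python
import os
import subprocess
import sys


def run_with_forced_pmu(cmd, pmu="arm_xgene"):
    """Run cmd with LIBPFM_FORCE_PMU set, leaving the parent environment untouched."""
    env = dict(os.environ, LIBPFM_FORCE_PMU=pmu)
    return subprocess.run(cmd, env=env, capture_output=True, text=True)


# Stand-in child process: it only reports the value it inherited.
probe = [sys.executable, "-c", "import os; print(os.environ['LIBPFM_FORCE_PMU'])"]
print(run_with_forced_pmu(probe).stdout.strip())  # prints arm_xgene
```

Scoping the variable to one child process keeps other libpfm4 users on the same machine unaffected, which matters because the X-Gene event table is only an approximation for this CPU.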



moha...@gmail.com

Jun 20, 2017, 11:15:00 AM
to ptools-perfapi, moha...@gmail.com, vincent...@maine.edu, s...@cray.com
Indeed, forcing arm_xgene works.

Are there any tests I can run to validate the counters?

For now I have tried some of the tests in ./ctests. One of them, "zero", gives me this output:

Test case 0: start, stop.
-----------------------------------------------
Default domain is: 11 (PAPI_DOM_USER|PAPI_DOM_KERNEL|PAPI_DOM_SUPERVISOR)
Default granularity is: 1 (PAPI_GRN_THR)
Using 20000000 iterations of c += a*b
-------------------------------------------------------------------------
Test type    :                1
PAPI_FP_INS  :          40000041
PAPI_TOT_CYC :         540557238
Real usec    :            270321
Real cycles  :          27030565
Virt usec    :            270303
Virt cycles  :            540604
-------------------------------------------------------------------------
Verification: PAPI_TOT_CYC should be roughly real_cycles
NOTE: Not true if dynamic frequency scaling is enabled.
Verification: PAPI_FP_INS should be roughly 40000000
PAPI_TOT_CYC Error of 1899.80%
zero.c                                       FAILED
Line # 125
Error: Cycles validation

 
Should I worry?

Vince Weaver

Jun 20, 2017, 11:27:45 AM
to moha...@gmail.com, ptools-perfapi, vincent...@maine.edu, s...@cray.com

I've just committed code to git that should let PAPI fall back to the
Linux perf default events if there is Linux support but libpfm4 isn't
there yet. So you can manually run native events like
perf::PERF_COUNT_HW_CPU_CYCLES
but still the PAPI preset events won't work unless libpfm4 support is
added.
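
A minimal Python sketch of trying such native events, assuming a PAPI build that includes this fallback and has the papi_command_line utility on PATH. The helper names are hypothetical, and the output of the utility is returned as-is rather than parsed.

```python
import shutil
import subprocess

# Native event names as they appear in the check_events output earlier in the thread.
NATIVE_EVENTS = [
    "perf::PERF_COUNT_HW_CPU_CYCLES",
    "perf::PERF_COUNT_HW_INSTRUCTIONS",
]


def native_event_command(events, tool="papi_command_line"):
    """Build the argument list: the utility name followed by the event names."""
    return [tool, *events]


def try_native_events(events, tool="papi_command_line"):
    """Run the utility if it is installed; return its text output, else None."""
    if shutil.which(tool) is None:
        return None  # PAPI utilities not found on PATH
    result = subprocess.run(
        native_event_command(events, tool), capture_output=True, text=True
    )
    return result.stdout + result.stderr


output = try_native_events(NATIVE_EVENTS)
print(output if output is not None else "papi_command_line not found on PATH")
```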

> Is there some tests that I can do to validate the counters ?

no, the PAPI test suite does not do a good job of this (though maybe that
will be changing soon).

> PAPI_TOT_CYC Error of 1899.80%
> zero.c                                       FAILED
> Line # 125
> Error: Cycles validation
>
>  
> Should I worry ?

not necessarily. The assumptions made in the zero test don't hold on the
ARM architecture.

Vince
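
The size of the reported error can be reproduced from the posted numbers. The Python sketch below recomputes it and also derives the two clock rates the figures imply. Reading those rates as a roughly 2 GHz core and a roughly 100 MHz source for "Real cycles" is an inference from the output, not something confirmed in the thread.

```python
# Values copied from the "zero" test output posted above.
papi_tot_cyc = 540_557_238
real_cycles = 27_030_565
real_usec = 270_321


def percent_error(measured, reference):
    """Relative difference between measured and reference, in percent."""
    return 100.0 * abs(measured - reference) / reference


error = percent_error(papi_tot_cyc, real_cycles)

# Cycles per microsecond is numerically the same as MHz.
core_mhz = papi_tot_cyc / real_usec
timer_mhz = real_cycles / real_usec

print(f"PAPI_TOT_CYC error  : {error:.2f}%")          # 1899.80%
print(f"implied core rate   : {core_mhz:.0f} MHz")    # 2000 MHz
print(f"implied 'real' rate : {timer_mhz:.0f} MHz")   # 100 MHz
```

If that reading is right, the factor of about 20 between the two counts is just the ratio of the two clocks, so the failed check says little about whether PAPI_TOT_CYC itself is accurate.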

moha...@gmail.com

Jun 20, 2017, 11:42:28 AM
to ptools-perfapi, moha...@gmail.com, vincent...@maine.edu, s...@cray.com
Thank you!

Filippo Mantovani

Aug 8, 2017, 6:40:26 AM
to ptools-perfapi, vincent...@maine.edu, moha...@gmail.com, s...@cray.com, enrico...@fe.infn.it

Hi all,
The patch mentioned above has finally been released as part of a technical report, which you can download here: http://upcommons.upc.edu/handle/2117/107063.
The patch enabling Cavium ThunderX was originally developed for PAPI v5.4.3 but should be easy to adapt to newer versions.
It can be downloaded here: https://goo.gl/MXn2h2
Best Regards,
Filippo

Vince Weaver

Aug 10, 2017, 4:39:40 PM
to Filippo Mantovani, ptools-perfapi, moha...@gmail.com, s...@cray.com, enrico...@fe.infn.it, Stephane Eranian
On Tue, 8 Aug 2017, Filippo Mantovani wrote:

> The mentioned patch has eventually been released within a technical report
> which you can download here: http://upcommons.upc.edu/handle/2117/107063.
> The actual patch for enabling Cavium ThunderX was originally developed for
> PAPI v5.4.3 but should be easily adaptable to newer versions.
> It can be downloaded here: https://goo.gl/MXn2h2

Most of your patch is against libpfm4 and should be submitted upstream to
the libpfm4 list. Once the changes make it into libpfm4 then the
remaining PAPI support can be added.

Vince

Steve Kaufmann

Aug 10, 2017, 5:39:44 PM
to vince Weaver, Filippo Mantovani, ptools-perfapi, moha...@gmail.com, enrico...@fe.infn.it, Stephane Eranian

I've had a Cavium ThunderX patch to libpfm4 for over two years now (for our own use). I'd have to replace any new patches to libpfm4 with our version. I don't know a good workaround for this. Any suggestions?


Steve



Vince Weaver

Aug 11, 2017, 1:00:48 PM
to Steve Kaufmann, Filippo Mantovani, ptools-perfapi, moha...@gmail.com, enrico...@fe.infn.it, Stephane Eranian
On Thu, 10 Aug 2017, Steve Kaufmann wrote:

>
> I've had a Cavium ThunderX patch to libpfm4 for over two years now (for our
> own use). I'd have to replace any new patches to libpfm4 with our version. I
> don't know a good workaround for this. Any suggestions?

Are the patches incompatible? Do they use different names for the events?

This seems like it's mostly an issue that needs to be worked out in
libpfm4. As far as PAPI goes it would mostly be sorting out the proper
preset event choices.

Vince

Steve Kaufmann

Aug 11, 2017, 1:16:57 PM
to vince Weaver, Filippo Mantovani, ptools-perfapi, moha...@gmail.com, enrico...@fe.infn.it, Stephane Eranian

I presume that the differences would be in the naming of events, although I based our names on the ARMv8 specification. We'd have to see how these changes would complement or conflict with ours (a devil's-in-the-details kind of thing). I could always skip applying any ThunderX patches, although I'd hate to get out of sync with libpfm4. If the event names were "close enough" I would just defer to the patch and drop ours.


Steve

