On Thu, 3 Aug 2017, Michael Knobloch wrote:
> /sys/devices/system/cpu/ shows 272 entries, so just what I'd expect from
> the 68 cores with 4 threads each.
>
> Still wondering where the additional 32 cpus are coming from and whether
> there is a binding of the cpu qualifier of the counters and the entries
> in /sys/devices/system/cpu/.
>
> Anyway, my understanding is that the uncore counters cannot be mapped to
> individual cores, so I'm still struggling to understand what the cpu
> qualifier is doing.

Each package has its own set of uncore counters. I am not an expert on
KNL, but as an example, a high-end Haswell-EP server might have two
packages, each with 16 cores, and each core with 2 threads, for a total
of 64 CPUs seen by Linux. In this case there are two uncores, one for
each package.

So when specifying the event, you give a CPU number to indicate which
package you want the measurements from. The PAPI and perf interfaces are
not great for this. For example, in the case I gave above (and if Linux
numbers the CPUs contiguously by package), specifying any of CPU=0 to
CPU=31 would give you the results for the first uncore (they are aliased
to give the same results), and any of CPU=32 to CPU=63 would give you the
results for the second uncore.

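As a rough sketch of the aliasing in that example: assuming the
hypothetical two-package, 16-core, 2-thread layout, and assuming the CPUs
are numbered contiguously by package (real machines often are not, so
check the topology files), the CPU-to-uncore mapping is just an integer
division. The function name and its parameters are made up for
illustration.

```python
def uncore_for_cpu(cpu, packages=2, cores_per_package=16, threads_per_core=2):
    """Return the package (uncore) index that a CPU number aliases to.

    Assumption: CPUs are numbered contiguously by package, as in the
    example above. This only illustrates the arithmetic; it is not how
    the kernel decides which uncore a CPU belongs to.
    """
    cpus_per_package = cores_per_package * threads_per_core
    total_cpus = packages * cpus_per_package
    if not 0 <= cpu < total_cpus:
        raise ValueError(f"CPU {cpu} is outside 0..{total_cpus - 1}")
    return cpu // cpus_per_package
```

With the defaults, CPUs 0 to 31 map to uncore 0 and CPUs 32 to 63 map to
uncore 1.
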
I assume KNL is similar, but I don't know.

I'm not sure where your extra CPUs are coming from. The
perf_event_open() call takes the CPU field directly and the kernel should
reject any that are invalid. The actual CPU= parsing is done by libpfm4,
so it's possible there are some bugs there too.

You can find the package/CPU mapping under
/sys/devices/system/cpu/cpu0/topology/
(and the corresponding directory for each of the other CPUs).

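A minimal sketch of reading that mapping, assuming each cpuN directory
has a topology/physical_package_id file (standard on Linux, but it can be
missing, for example for offline CPUs). The function name and the
sysfs_cpu_root parameter are my own, added so the code can be pointed at
a different directory.

```python
import glob
import os
import re
from collections import defaultdict


def cpus_by_package(sysfs_cpu_root="/sys/devices/system/cpu"):
    """Group logical CPU numbers by their physical_package_id."""
    packages = defaultdict(list)
    for cpu_dir in glob.glob(os.path.join(sysfs_cpu_root, "cpu[0-9]*")):
        match = re.fullmatch(r"cpu(\d+)", os.path.basename(cpu_dir))
        if match is None:
            continue  # not a cpuN directory
        id_file = os.path.join(cpu_dir, "topology", "physical_package_id")
        try:
            with open(id_file) as f:
                package = int(f.read().strip())
        except (OSError, ValueError):
            continue  # no topology information for this CPU
        packages[package].append(int(match.group(1)))
    return {pkg: sorted(cpus) for pkg, cpus in sorted(packages.items())}


if __name__ == "__main__":
    for package, cpus in cpus_by_package().items():
        print(f"package {package}: {len(cpus)} CPUs, first is CPU {cpus[0]}")
```

If the uncore events are aliased as described, any one CPU from each
list should be enough to select that package's uncore.
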
Vince