perf events (node-loads, node-load-misses, ...)

Hyojong

Apr 24, 2017, 7:51:27 PM
to ptools-perfapi
Hello,

This may not be the best place to ask, but here I go, just in case someone could shed some light :)

My processor has several hardware cache events, of which I am interested in node-loads, node-load-misses, node-stores, node-store-misses. I think it is kind of obvious that node-loads and node-stores represent references to local memory (memory physically attached to a processor package). However, I have no clue as to what the rest represents. I have some theory that "node-load-misses" represent the number of misses in local memory, or the number of accesses to disk. Based on the profiled results I have (page-faults:16M << node-load-misses+node-store-misses:748M), however, this cannot be the raw number of disk accesses. Maybe some kind of piggybacking mechanism (by some MSHR kind of structure in memory?) could be the reason behind this discrepancy, but no clue. If anyone has some idea on this, please share it with me. I appreciate your help!

Thanks.

Servat, Harald

Apr 25, 2017, 5:23:27 AM
to Hyojong, ptools-perfapi

Hello Hyojong,

 

  If you are referring to the Linux perf tool, you should probably ask on the linux-perf-users mailing list (see [1]), even though it is possible that somebody following this list could help you.

 

Best,

 

[1] http://vger.kernel.org/vger-lists.html#linux-perf-users

--
You received this message because you are subscribed to the Google Groups "ptools-perfapi" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ptools-perfap...@icl.utk.edu.
To post to this group, send email to ptools-...@icl.utk.edu.
Visit this group at https://groups.google.com/a/icl.utk.edu/group/ptools-perfapi/.


-----------------------------------------------------------
Intel Corporation Iberia, S.A.
Registered Office: Torre Picasso, 25th Floor,
Plaza Pablo Ruiz Picasso, no. 1, 28020  Madrid

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.

Hyojong

Apr 25, 2017, 11:42:35 AM
to ptools-perfapi, hki...@gmail.com
Hi Harald, 

Thank you for the pointer.


Vince Weaver

Apr 25, 2017, 5:09:37 PM
to Servat, Harald, Hyojong, ptools-perfapi
On Tue, 25 Apr 2017, Servat, Harald wrote:

> My processor has several hardware cache events, of which I am interested in node-loads,
> node-load-misses, node-stores, node-store-misses. I think it is kind of obvious that
> node-loads and node-stores represent references to local memory (memory physically
> attached to a processor package).

For predefined perf_event events you really have to look at the Linux
source code to see what the events map to.

Assuming you are using a recent Intel processor it would be in
arch/x86/events/intel/core.c
and search for NODE

What perf reports for these events varies, but typically I think the node
access events are reporting on last-level-cache and the misses involve
accesses to main memory.

It is confusing because in many (all?) cases the events used are different
from the Last Level Cache access/miss events, though the idea might be
to show things per-package. They definitely have nothing to do with disk
accesses though.

Vince
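
As a rough illustration of what "predefined" means here: the generic cache events are requested from user space as a (cache, operation, result) triple, and the kernel then looks that triple up in per-CPU tables such as the ones in core.c. The sketch below builds the four node event configs using the encoding described in the perf_event_open(2) man page. The helper names (hw_cache_config, node_event_config, init_node_event_attr) are made up for this example and are not kernel or perf functions.

```c
#include <stdint.h>
#include <string.h>
#include <linux/perf_event.h>

/* Generic cache-event encoding for PERF_TYPE_HW_CACHE, per perf_event_open(2):
 *   config = cache_id | (op_id << 8) | (result_id << 16)
 * hw_cache_config() is a hypothetical helper name. */
uint64_t hw_cache_config(uint64_t cache_id, uint64_t op_id, uint64_t result_id)
{
    return cache_id | (op_id << 8) | (result_id << 16);
}

/* Hypothetical wrapper: fix the cache id to NODE, which is the C(NODE)
 * index used by the kernel tables. */
uint64_t node_event_config(uint64_t op_id, uint64_t result_id)
{
    return hw_cache_config(PERF_COUNT_HW_CACHE_NODE, op_id, result_id);
}

/* Fill a perf_event_attr that requests one node event. The caller would
 * pass it to the perf_event_open system call; that step is omitted here. */
void init_node_event_attr(struct perf_event_attr *attr,
                          uint64_t op_id, uint64_t result_id)
{
    memset(attr, 0, sizeof(*attr));
    attr->size = sizeof(*attr);
    attr->type = PERF_TYPE_HW_CACHE;
    attr->config = node_event_config(op_id, result_id);
    attr->disabled = 1; /* start stopped; enable explicitly later */
}
```

Under this encoding, node-loads corresponds to (OP_READ, RESULT_ACCESS), node-load-misses to (OP_READ, RESULT_MISS), node-stores to (OP_WRITE, RESULT_ACCESS) and node-store-misses to (OP_WRITE, RESULT_MISS). What the hardware actually counts for each triple is decided by the kernel table for your processor model, not by this encoding.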

Hyojong

May 1, 2017, 12:43:39 PM
to ptools-perfapi, harald...@intel.com, hki...@gmail.com
Hi Vince,

I appreciate your answer. I looked at the Linux source code before asking the question, but a cursory look did not turn up anything useful. I also speculated that the NODE counters might be related to the LLC, but I'm skeptical about that given the existence of the LL counters. I'll ask on the linux-perf-users mailing list and report back as soon as I hear anything.

Thanks,
Hyojong

Vince Weaver

May 1, 2017, 4:42:38 PM
to Hyojong, ptools-perfapi, harald...@intel.com
On Mon, 1 May 2017, Hyojong wrote:

> I appreciate your answer. I looked at the Linux source code before asking question but
> was not able to find any useful information from a cursory look. I also speculated
> that NODE counters might be related to LLC, but I'm skeptical about that due to the
> existence of LL counters. I'll ask this to perf user mailing list and report back as
> soon as I hear anything.

There's no need to speculate, the events being measured are right there.

It just depends what type of processor you have. Did you say what you
were using?

in arch/x86/events/intel/core.c

Let's assume you have a Haswell processor, look for
hsw_hw_cache_event_ids
which says the events used are

 [ C(NODE) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = 0x1b7, /* OFFCORE_RESPONSE */
        [ C(RESULT_MISS)   ] = 0x1b7, /* OFFCORE_RESPONSE */
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = 0x1b7, /* OFFCORE_RESPONSE */
        [ C(RESULT_MISS)   ] = 0x1b7, /* OFFCORE_RESPONSE */
    },
    [ C(OP_PREFETCH) ] = {
        [ C(RESULT_ACCESS) ] = 0x0,
        [ C(RESULT_MISS)   ] = 0x0,
    },
 },
};

in combination with

 [ C(NODE) ] = {
    [ C(OP_READ) ] = {
        [ C(RESULT_ACCESS) ] = HSW_DEMAND_READ|
                               HSW_L3_MISS_LOCAL_DRAM|
                               HSW_SNOOP_DRAM,
        [ C(RESULT_MISS)   ] = HSW_DEMAND_READ|
                               HSW_L3_MISS_REMOTE|
                               HSW_SNOOP_DRAM,
    },
    [ C(OP_WRITE) ] = {
        [ C(RESULT_ACCESS) ] = HSW_DEMAND_WRITE|
                               HSW_L3_MISS_LOCAL_DRAM|
                               HSW_SNOOP_DRAM,
        [ C(RESULT_MISS)   ] = HSW_DEMAND_WRITE|
                               HSW_L3_MISS_REMOTE|
                               HSW_SNOOP_DRAM,
    },
    [ C(OP_PREFETCH) ] = {
        [ C(RESULT_ACCESS) ] = 0x0,
        [ C(RESULT_MISS)   ] = 0x0,
    },
 },


so it's an offcore response event, which is measuring something similar to
(but not exactly the same as) LLC due to measuring snooping effects, etc.

Vince
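
To make the two Haswell tables quoted above easier to read side by side, here is a small sketch that pairs each perf event name with its event code and symbolic mask, copied from the snippets in this message. Going by the mask names alone, the "access" variants appear to select L3 misses served from local DRAM and the "miss" variants L3 misses served from a remote node, but that reading is an inference from the names and not something the thread confirms. The struct and function names are invented for illustration, and the mask is kept as a string because the numeric values of the HSW_* macros are not given here.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* One row per generic node event, transcribed from the Haswell tables
 * quoted in the thread. These names are illustrative, not kernel identifiers. */
struct node_event_row {
    const char *perf_name;   /* event name as used in the thread */
    unsigned    event_code;  /* from hsw_hw_cache_event_ids */
    const char *extra_mask;  /* symbolic mask from the second quoted table */
};

static const struct node_event_row hsw_node_rows[] = {
    { "node-loads",        0x1b7,
      "HSW_DEMAND_READ|HSW_L3_MISS_LOCAL_DRAM|HSW_SNOOP_DRAM" },
    { "node-load-misses",  0x1b7,
      "HSW_DEMAND_READ|HSW_L3_MISS_REMOTE|HSW_SNOOP_DRAM" },
    { "node-stores",       0x1b7,
      "HSW_DEMAND_WRITE|HSW_L3_MISS_LOCAL_DRAM|HSW_SNOOP_DRAM" },
    { "node-store-misses", 0x1b7,
      "HSW_DEMAND_WRITE|HSW_L3_MISS_REMOTE|HSW_SNOOP_DRAM" },
};

/* Return the row for a perf event name, or NULL if it is not a node event. */
const struct node_event_row *find_node_row(const char *perf_name)
{
    size_t n = sizeof(hsw_node_rows) / sizeof(hsw_node_rows[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(hsw_node_rows[i].perf_name, perf_name) == 0)
            return &hsw_node_rows[i];
    }
    return NULL;
}

/* Print every row as "name: event 0x... mask ...". */
void print_node_rows(void)
{
    size_t n = sizeof(hsw_node_rows) / sizeof(hsw_node_rows[0]);
    for (size_t i = 0; i < n; i++) {
        printf("%s: event 0x%x mask %s\n",
               hsw_node_rows[i].perf_name,
               hsw_node_rows[i].event_code,
               hsw_node_rows[i].extra_mask);
    }
}
```

All four events share the same OFFCORE_RESPONSE event code, so only the mask distinguishes them. On a different processor model the corresponding tables would need to be read from core.c again.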
