A long question regarding perf_event_open, rdpmc and rdmsr.


Tong Zhou

Aug 3, 2017, 11:56:47 AM
to ptools-perfapi
Hi,

Is the perf_event_open system call (__NR_perf_event_open) implemented using the privileged RDPMC instruction? I saw the example in the perf_event_open man page, and linux/tools/testing/selftests/powerpc/pmu/event.c has a similar example. I can see how to read an event counter through the fd returned by perf_event_open, but I don't know whether this is implemented with RDPMC internally.

The second one is more general. There is a perf event called "ROB_MISC_EVENTS.LBR_INSERTS", with the description "Count cases of saving new LBR records by hardware". I don't understand the relation between this event and the values of the LBR registers. For example, does this event mean that there is another performance counter that is dedicated to recording the number of LBR_INSERTs, and the actual value in the LBR is another thing? Basically, I do not have a grasp of the difference between events, hardware performance counters and model-specific registers. Is LBR not an event, but LBR_INSERTS is an event?

The third question is about the difference between rdpmc and rdmsr. It is related to the second question. My understanding is that rdmsr is used to set up some counters, like configuration. So can rdpmc read MSRs?

Thanks,
Tong

Vince Weaver

Aug 3, 2017, 12:52:03 PM
to Tong Zhou, ptools-perfapi

These questions really aren't PAPI related, but I suppose I can answer
them here.

> Is system call __NR_perf_event_open implemented by using RDPMC privileged
> instruction?

perf_event_open is a Linux system call that sets up the performance
counters. The codepath in the kernel for event setup does eventually
set some MSRs (model specific registers) on x86 machines, but does
not use rdpmc at all.
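
For reference, the counting usage that the man page example follows can be
sketched roughly as below. This is a minimal sketch of the user-space side
only, not the kernel's implementation. The helper names
sys_perf_event_open(), init_instructions_attr() and count_instructions()
are made up for illustration, and the busy loop is just a placeholder
workload. The value comes back through an ordinary read() system call; no
rdpmc is issued by this code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* glibc has no wrapper for perf_event_open, so go through syscall(). */
static long sys_perf_event_open(struct perf_event_attr *attr, pid_t pid,
                                int cpu, int group_fd, unsigned long flags)
{
    return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Describe "retired user-space instructions"; the counter starts disabled. */
static void init_instructions_attr(struct perf_event_attr *attr)
{
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_HARDWARE;
    attr->size = sizeof(*attr);
    attr->config = PERF_COUNT_HW_INSTRUCTIONS;
    attr->disabled = 1;
    attr->exclude_kernel = 1;
    attr->exclude_hv = 1;
}

/*
 * Count instructions over a placeholder workload in the calling thread.
 * Returns 0 and stores the count on success, -1 if the event could not be
 * opened or read (no PMU access, restrictive perf_event_paranoid, ...).
 */
static int count_instructions(uint64_t *count)
{
    struct perf_event_attr attr;
    volatile uint64_t sink = 0;
    ssize_t n;
    int fd;

    assert(count != NULL);
    init_instructions_attr(&attr);

    /* pid = 0, cpu = -1: measure this thread on whichever CPU it runs. */
    fd = (int)sys_perf_event_open(&attr, 0, -1, -1, 0);
    if (fd < 0)
        return -1;

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

    for (uint64_t i = 0; i < 1000000; i++)   /* placeholder workload */
        sink += i;

    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    /* With read_format left at 0, read() returns a single u64 count. */
    n = read(fd, count, sizeof(*count));
    close(fd);
    return n == (ssize_t)sizeof(*count) ? 0 : -1;
}
```

A caller would declare a uint64_t, pass its address to
count_instructions(), and check for a return value of 0 before using the
count.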

rdpmc is an x86 instruction that can read only a very small subset of
the MSRs: those that hold the event counts for the core's fixed and
generic counters.

rdpmc is slightly faster than the equivalent rdmsr instruction. rdpmc can
also be configured to allow access to the counters from userspace, without
being privileged. However, getting proper values out of the counters
while perf_event is running is a somewhat complicated process (though
still about 5 times faster than using a read() system call).
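
To give an idea of why it is complicated, here is a rough sketch of the
read loop, following the protocol described in the comments of
linux/perf_event.h. It assumes the event's fd has been mmap()ed (one page,
PROT_READ, MAP_SHARED) to obtain the struct perf_event_mmap_page. The
helper names sign_extend_pmc() and read_count_rdpmc() are made up for
illustration, the sign extension relies on arithmetic right shift of
signed values (true for gcc and clang), and the multiplexing time scaling
is left out.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stdint.h>

/* Compiler barrier: keep accesses from being reordered across it. */
#define barrier() __asm__ volatile("" ::: "memory")

#if defined(__x86_64__) || defined(__i386__)
/* Raw rdpmc: counter index in ECX, result in EDX:EAX. */
static inline uint64_t rdpmc(uint32_t counter)
{
    uint32_t lo, hi;
    __asm__ volatile("rdpmc" : "=a"(lo), "=d"(hi) : "c"(counter));
    return ((uint64_t)hi << 32) | lo;
}
#endif

/* Hardware counters are narrower than 64 bits; sign-extend from `width`. */
static int64_t sign_extend_pmc(uint64_t raw, unsigned width)
{
    unsigned shift;

    assert(width >= 1 && width <= 64);
    shift = 64 - width;
    return (int64_t)(raw << shift) >> shift;
}

/*
 * Read the current event count without a system call.
 * Returns 0 and stores the count on success, or -1 when user-space rdpmc
 * is not available for this event (caller should fall back to read()).
 */
static int read_count_rdpmc(volatile struct perf_event_mmap_page *pc,
                            uint64_t *count)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t seq, index;
    int64_t value;

    do {
        seq = pc->lock;          /* sequence count, bumped on each update */
        barrier();

        index = pc->index;       /* 0 means the event is not on a counter */
        value = pc->offset;      /* count accumulated by the kernel so far */
        if (!pc->cap_user_rdpmc || index == 0)
            return -1;

        /* The kernel publishes the hardware counter number plus one. */
        value += sign_extend_pmc(rdpmc(index - 1), pc->pmc_width);

        barrier();
    } while (pc->lock != seq);   /* retry if the kernel updated the page */

    *count = (uint64_t)value;
    return 0;
#else
    (void)pc;
    (void)count;
    return -1;
#endif
}
```

The retry loop is there because the kernel may reschedule the event onto a
different counter, or off the PMU entirely, between the individual reads.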

> The second one is more general. So there is a perf event
> called "ROB_MISC_EVENTS.LBR_INSERTS", with the description "Count cases of
> saving new LBR records by hardware". I don't understand the relation between
> this event and the value of LBR registers. For example, does this event mean
> that there is another performance counter that is dedicated to record the
> number of times of LBR_INSERT? And the actual value in LBR is another thing?
> Basically I do not have a grasp of the difference between events, hardware
> performance counter and machine specific registers. Is LBR not an event
> but LBR_INSERTS is an event?

Questions about obscure events such as ROB_MISC_EVENTS.LBR_INSERTS can
only really be answered by people with inside Intel knowledge.

The LBR registers are more or less completely unrelated to the regular
core performance counters.

perf_event shoves everything even vaguely performance related into the
one perf_event_open() system call because it's the only interface they
have.

Vince

Vince Weaver

Aug 3, 2017, 3:54:54 PM
to Tong Zhou, ptools-perfapi
On Thu, 3 Aug 2017, Tong Zhou wrote:

> Thank you so much for the answer! The reason I am asking is because I found
> that reading LBRs by seeking and reading file /dev/cpu/xx/msr has too much
> overhead. So I was wondering if I can use rdpmc to read LBRs in
> unprivileged mode, since that'll be faster if it works. But I guess rdpmc
> can't read most MSRs then.

The official Linux way of getting the LBR values is using the
PERF_SAMPLE_BRANCH_STACK sample type of perf_event_open().

If you get the perf_event_tests git tree
    git clone https://github.com/deater/perf_event_tests

the included
    tests/record_sample/sample_branch_stack
test has some sample code that accesses the values in the LBR stack.
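
As a rough idea of what is involved, the attr setup and the layout of the
branch-stack data look something like the sketch below. This is not the
code from that test; it assumes sample_type is exactly
PERF_SAMPLE_BRANCH_STACK, so the payload of each PERF_RECORD_SAMPLE is a
u64 entry count followed by that many struct perf_branch_entry. The helper
names, the cycles event and the sample period are illustrative choices,
and the mmap ring buffer handling needed to actually fetch the records is
not shown.

```c
#include <assert.h>
#include <linux/perf_event.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Sample on cycles and ask for the branch stack (LBR) with each sample. */
static void init_branch_stack_attr(struct perf_event_attr *attr)
{
    memset(attr, 0, sizeof(*attr));
    attr->type = PERF_TYPE_HARDWARE;
    attr->size = sizeof(*attr);
    attr->config = PERF_COUNT_HW_CPU_CYCLES;
    attr->sample_period = 100000;              /* arbitrary choice */
    attr->sample_type = PERF_SAMPLE_BRANCH_STACK;
    attr->branch_sample_type = PERF_SAMPLE_BRANCH_ANY |
                               PERF_SAMPLE_BRANCH_USER;
    attr->disabled = 1;
    attr->exclude_kernel = 1;
    attr->exclude_hv = 1;
}

/*
 * Print the branch entries of one sample. `payload` points just past the
 * struct perf_event_header of a PERF_RECORD_SAMPLE and `len` is the number
 * of payload bytes. Returns the number of entries, or -1 if the record is
 * too short for what it claims to hold.
 */
static long print_branch_stack(const void *payload, size_t len)
{
    const unsigned char *p = payload;
    uint64_t nr;

    assert(payload != NULL);
    if (len < sizeof(nr))
        return -1;
    memcpy(&nr, p, sizeof(nr));
    if (nr > (len - sizeof(nr)) / sizeof(struct perf_branch_entry))
        return -1;

    p += sizeof(nr);
    for (uint64_t i = 0; i < nr; i++) {
        struct perf_branch_entry e;

        /* memcpy avoids alignment assumptions about the buffer. */
        memcpy(&e, p + i * sizeof(e), sizeof(e));
        printf("%2llu: 0x%llx -> 0x%llx%s\n",
               (unsigned long long)i,
               (unsigned long long)e.from,
               (unsigned long long)e.to,
               e.mispred ? " (mispredicted)" : "");
    }
    return (long)nr;
}
```

Whether the open succeeds depends on the CPU and kernel having branch
stack support, so the return value of perf_event_open() still needs to be
checked.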

Vince