Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

perf test BPF failing on f24: fix

62 views
Skip to first unread message

Arnaldo Carvalho de Melo

unread,
Aug 2, 2016, 4:00:05 PM8/2/16
to
Hi Wang,

Something changed and a function used in a perf test for BPF is
not anymore appearing on vmlinux, albeit still available on
/proc/kallsyms:

# readelf -wi /lib/modules/4.7.0+/build/vmlinux | grep -w sys_epoll_wait
#

But:

[root@jouet ~]# grep -i sys_epoll_wait /proc/kallsyms
ffffffffbd295b50 T SyS_epoll_wait
ffffffffbd295b50 T sys_epoll_wait
[root@jouet ~]#

I noticed that it is some sort of aliasing so I checked the other
variant:

[root@jouet ~]# readelf -wi /lib/modules/4.7.0+/build/vmlinux | grep -w SyS_epoll_wait
<2bc9b85> DW_AT_name : (indirect string, offset: 0xe7524): SyS_epoll_wait
[root@jouet ~]#

Trying to use perf probe also produces the same resuls I notice when
running the perf test that is failing:

[root@jouet ~]# perf probe sys_epoll_wait
Failed to find debug information for address ffffffffbd295b50
Probe point 'sys_epoll_wait' not found.
Error: Failed to add events.
[root@jouet ~]# perf probe SyS_epoll_wait
Added new events:
probe:SyS_epoll_wait (on SyS_epoll_wait)
probe:SyS_epoll_wait_1 (on SyS_epoll_wait)
probe:SyS_epoll_wait_2 (on SyS_epoll_wait)

You can now use it in all perf tools, such as:

perf record -e probe:SyS_epoll_wait_2 -aR sleep 1

[root@jouet ~]#

So I am changing the BPF perf test to use the CamelCase notation alias:

Failing (with sys_epoll_wait):

[root@jouet ~]# perf test BPF
37: Test BPF filter :
37.1: Test basic BPF filtering : FAILED!
37.2: Test BPF prologue generation : Skip
37.3: Test BPF relocation checker : Skip
[root@jouet ~]#

Changing to (SyS_epoll_wait):

[root@jouet ~]# perf test BPF
37: Test BPF filter :
37.1: Test basic BPF filtering : Ok
37.2: Test BPF prologue generation : Ok
37.3: Test BPF relocation checker : Ok
[root@jouet ~]#

I'm adding a fix that just makes it use SyS_epoll_wait.

- Arnaldo

Alexei Starovoitov

unread,
Aug 2, 2016, 7:00:05 PM8/2/16
to
that change will break all sorts of scripts that relying on syscalls to start
with sys_
I guess we can workaround in user space, but what was the motivation to
disable kprobe attach on sys_* while it's still in kallsyms?

Arnaldo Carvalho de Melo

unread,
Aug 2, 2016, 10:20:05 PM8/2/16
to
Is there a promise that internal kernel functions will not change names?

> I guess we can workaround in user space, but what was the motivation to
> disable kprobe attach on sys_* while it's still in kallsyms?

Was it disabled? What I noticed whas that the sys_epoll_wait wasn't
present in the DWARF info present in the vmlinux file, which would be
good for others to confirm, which I'll check on other machines here at
home, tomorrow.

- Arnaldo

Alexei Starovoitov

unread,
Aug 2, 2016, 11:00:05 PM8/2/16
to
as far as I can see sys_ and SyS_ both present in kallsyms,
so kprobe infra should recognize both,
the question is why perf decided to ingore sys_ version?

Wangnan (F)

unread,
Aug 2, 2016, 11:00:06 PM8/2/16
to


On 2016/8/3 3:51, Arnaldo Carvalho de Melo wrote:
> Hi Wang,
>
> Something changed and a function used in a perf test for BPF is
> not anymore appearing on vmlinux, albeit still available on
> /proc/kallsyms:
>
> # readelf -wi /lib/modules/4.7.0+/build/vmlinux | grep -w sys_epoll_wait
> #
>
> But:
>
> [root@jouet ~]# grep -i sys_epoll_wait /proc/kallsyms
> ffffffffbd295b50 T SyS_epoll_wait
> ffffffffbd295b50 T sys_epoll_wait
> [root@jouet ~]#
>
> I noticed that it is some sort of aliasing so I checked the other
> variant:
>
> [root@jouet ~]# readelf -wi /lib/modules/4.7.0+/build/vmlinux | grep -w SyS_epoll_wait
> <2bc9b85> DW_AT_name : (indirect string, offset: 0xe7524): SyS_epoll_wait
> [root@jouet ~]#

I also see no sys_epoll_wait in output of 'readelf -wi', but I can't
reproduce the whole problem.

> Trying to use perf probe also produces the same resuls I notice when
> running the perf test that is failing:
>
> [root@jouet ~]# perf probe sys_epoll_wait
> Failed to find debug information for address ffffffffbd295b50
> Probe point 'sys_epoll_wait' not found.
> Error: Failed to add events.

For me:

# uname -r
4.7.0

# perf probe -v sys_epoll_wait
probe-definition(0): sys_epoll_wait
symbol:sys_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.7.0/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.7.0/build/vmlinux
Try to find probe point from debuginfo.
Symbol sys_epoll_wait address found : ffffffff81281240
Matched function: SyS_epoll_wait
found inline addr: 0xffffffff81281240
Probe point found: SyS_epoll_wait+0
found inline addr: 0xffffffff812813a7
Probe point found: SyS_epoll_pwait+135
found inline addr: 0xffffffff8128157a
Probe point found: compat_SyS_epoll_pwait+154
Found 3 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:probe/sys_epoll_wait _text+2626112
Writing event: p:probe/sys_epoll_wait_1 _text+2626471
Writing event: p:probe/sys_epoll_wait_2 _text+2626938
Added new events:
probe:sys_epoll_wait (on sys_epoll_wait)
probe:sys_epoll_wait_1 (on sys_epoll_wait)
probe:sys_epoll_wait_2 (on sys_epoll_wait)

You can now use it in all perf tools, such as:

perf record -e probe:sys_epoll_wait_2 -aR sleep 1


I'll try different kernel version and kconfig.

Thank you.

Wangnan (F)

unread,
Aug 2, 2016, 11:50:05 PM8/2/16
to
Still unable reproduce. I tried tip/core/core, which is '4.7.0+'.
I also turns off CONFIG_DEBUG_INFO. In this case, testcase
BPF prologue fails as expected, and basic BPF test is okay.

After removing vmlinux, with kallsyms information
'perf probe sys_epoll_wait' sets only 1 kprobe, but it still
works.

I think there is a bug in your vmlinux. Could you please
do a bisect to find which commit causes this problem?

Thank you.

Wangnan (F)

unread,
Aug 3, 2016, 4:30:05 AM8/3/16
to
Add Masami to Cc list. He knowns debug info.

Please see below...

On 2016/8/3 3:51, Arnaldo Carvalho de Melo wrote:
> Hi Wang,
>
> Something changed and a function used in a perf test for BPF is
> not anymore appearing on vmlinux, albeit still available on
> /proc/kallsyms:
>
> # readelf -wi /lib/modules/4.7.0+/build/vmlinux | grep -w sys_epoll_wait
> #

As I have said in another mail, although I don't know why,
I don't think the missing of sys_epoll_wait is the root cause
of your failure.

I did a bisect search and found commit
83460ec8dcac14142e7860a01fa59c267ac4657c:
"syscalls.h: use gcc alias instead of assembler aliases for syscalls"
After this commit, sys_xxx symbols is gone from output of 'readelf -wi'.
It is
introduced in v3.13.

> But:
>
> [root@jouet ~]# grep -i sys_epoll_wait /proc/kallsyms
> ffffffffbd295b50 T SyS_ep

> oll_wait
> ffffffffbd295b50 T sys_epoll_wait
> [root@jouet ~]#

If we remove vmlinux, perf should use /proc/kallsyms. I think
the failure would gone. The problem is: when symbol searching
fail using vmlinux, should we fallback to kallsyms? However,
this is another question.

> I noticed that it is some sort of aliasing so I checked the other
> variant:
>
> [root@jouet ~]# readelf -wi /lib/modules/4.7.0+/build/vmlinux | grep -w SyS_epoll_wait
> <2bc9b85> DW_AT_name : (indirect string, offset: 0xe7524): SyS_epoll_wait
> [root@jouet ~]#
>
> Trying to use perf probe also produces the same resuls I notice when
> running the perf test that is failing:
>
> [root@jouet ~]# perf probe sys_epoll_wait
> Failed to find debug information for address ffffffffbd295b50
> Probe point 'sys_epoll_wait' not found.
> Error: Failed to add events.
> [root@jouet ~]# perf probe SyS_epoll_wait
> Added new events:
> probe:SyS_epoll_wait (on SyS_epoll_wait)
> probe:SyS_epoll_wait_1 (on SyS_epoll_wait)
> probe:SyS_epoll_wait_2 (on SyS_epoll_wait)
>
> You can now use it in all perf tools, such as:
>
> perf record -e probe:SyS_epoll_wait_2 -aR sleep 1
>
> [root@jouet ~]#
>
> So I am changing the BPF perf test to use the CamelCase notation alias:

Changing to SyS_xxx is okay, byt we still need to know the root
cause.

Thank you.

Masami Hiramatsu

unread,
Aug 3, 2016, 10:50:06 AM8/3/16
to
As below result shown, perf probe tries to fallback, but it seems to
fail to find corresponding debuginfo entry...

> > [root@jouet ~]# perf probe sys_epoll_wait
> > Failed to find debug information for address ffffffffbd295b50
> > Probe point 'sys_epoll_wait' not found.
> > Error: Failed to add events.

So, at least I should investigate it.

Thanks,

> > [root@jouet ~]# perf probe SyS_epoll_wait
> > Added new events:
> > probe:SyS_epoll_wait (on SyS_epoll_wait)
> > probe:SyS_epoll_wait_1 (on SyS_epoll_wait)
> > probe:SyS_epoll_wait_2 (on SyS_epoll_wait)
> >
> > You can now use it in all perf tools, such as:
> >
> > perf record -e probe:SyS_epoll_wait_2 -aR sleep 1
> >
> > [root@jouet ~]#
> >
> > So I am changing the BPF perf test to use the CamelCase notation alias:
>
> Changing to SyS_xxx is okay, byt we still need to know the root
> cause.
>
> Thank you.
>


--
Masami Hiramatsu <mhir...@kernel.org>

Arnaldo Carvalho de Melo

unread,
Aug 3, 2016, 4:10:05 PM8/3/16
to
Em Wed, Aug 03, 2016 at 11:45:57PM +0900, Masami Hiramatsu escreveu:
> > If we remove vmlinux, perf should use /proc/kallsyms. I think

I am not removing vmlinux, it is being used, its just that the function
chosen by the 'perf test BPF' testcase isn't there.

So lets go again trying to chase this without missing a single step of
the way:

We start with:

[root@jouet ~]# perf test bpf
37: Test BPF filter :
37.1: Test basic BPF filtering : FAILED!
37.2: Test BPF prologue generation : Skip
37.3: Test BPF relocation checker : Skip
[root@jouet ~]#

Ok, so we add -v to get more information:

[root@jouet ~]# perf test -v bpf
<BIG SNIP>
bpf: config program 'func=sys_epoll_wait'
symbol:sys_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null)
bpf: config 'func=sys_epoll_wait' is ok
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.7.0+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.7.0+/build/vmlinux
Try to find probe point from debuginfo.
Symbol sys_epoll_wait address found : ffffffffbd295b50
Failed to find debug information for address ffffffffbd295b50
Probe point 'sys_epoll_wait' not found.
bpf_probe: failed to convert perf probe eventsFailed to add events
selected by BPF
test child finished with -1
---- end ----
Test BPF filter subtest 0: FAILED!

--------------

See? It _is_ using /lib/modules/4.7.0+/build/vmlinux, and it should
because:

[acme@jouet linux]$ file /lib/modules/4.7.0+/build/vmlinux
/lib/modules/4.7.0+/build/vmlinux: ELF 64-bit LSB executable, x86-64,
version 1 (SYSV), statically linked,
BuildID[sha1]=a08d121dcee2a0ea0cfa5d84363de0c1cfdc729a, not stripped
[acme@jouet linux]$

Its the kernel that is in use:

[acme@jouet linux]$ perf buildid-list --kernel
a08d121dcee2a0ea0cfa5d84363de0c1cfdc729a
[acme@jouet linux]$ perf buildid-list -h --kernel

Usage: perf buildid-list [<options>]

-k, --kernel Show current kernel build id

[acme@jouet linux]$

And, in this vmlinux file, there is _no_ such function:

[acme@jouet linux]$ readelf -wi /lib/modules/4.7.0+/build/vmlinux | grep -w sys_epoll_wait
[acme@jouet linux]$

Exactly like the 'perf probe -v bpf' says:

Symbol sys_epoll_wait address found : ffffffffbd295b50
Failed to find debug information for address ffffffffbd295b50

-----

It mapped it to an address, sure, it found it in /proc/kallsyms, but
then it didn't find it in the matching vmlinux file.

Since the test was working before, when did it stop to be available on
vmlinux?

Looking at a distro kernel vmlinux file, that comes in the
kernel-debuginfo package...

[acme@jouet linux]$ readelf -wi /usr/lib/debug/usr/lib/modules/4.6.3-300.fc24.x86_64/vmlinux | grep -w sys_epoll_wait
[acme@jouet linux]$
[acme@jouet linux]$ readelf -wi /usr/lib/debug/usr/lib/modules/4.6.3-300.fc24.x86_64/vmlinux | grep -w SyS_epoll_wait
<2cb9655> DW_AT_name : (indirect string, offset: 0xe0d53): SyS_epoll_wait
[acme@jouet linux]$

So the situation is the same, i.e. it seems that we were doing the fallback
from vmlinux to kallsyms and this somehow regressed, so now lets do a tools
bisect:

Ok, I tried this on a RHEL7 machine and got this:

Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.7.0-rc3+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.7.0-rc3+/build/vmlinux
Try to find probe point from debuginfo.
Symbol sys_epoll_pwait address found : ffffffff8128b4a0
Matched function: SyS_epoll_pwait
Probe point found: SyS_epoll_pwait+0

So it seems that sys_epoll_wait (of sys_epoll_pwait) was never in the debuginfo:

But tooling was finding it in kallsyms, getting its address, then being able to
"reverse lookup" it to the SyS_ alias, and now this isn't happening :-\

[root@felicio ~]# cat /etc/redhat-release
Red Hat Enterprise Linux Server release 7.2 (Maipo)
[root@felicio ~]# gcc --version | head -1
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4)
[root@felicio ~]#
[root@felicio ~]# readelf -wi /lib/modules/4.7.0-rc3+/build/vmlinux | grep -wi sys_epoll_pwait
<23e4444> DW_AT_name : (indirect string, offset: 0xe089f): SyS_epoll_pwait
[root@felicio ~]#

So I'll concentrate on this path of investigation...

- Arnaldo

Masami Hiramatsu

unread,
Aug 3, 2016, 7:10:04 PM8/3/16
to
On Wed, 3 Aug 2016 23:45:57 +0900
Masami Hiramatsu <mhir...@kernel.org> wrote:

> > > [root@jouet ~]# perf probe sys_epoll_wait
> > > Failed to find debug information for address ffffffffbd295b50
> > > Probe point 'sys_epoll_wait' not found.
> > > Error: Failed to add events.
>
> So, at least I should investigate it.

FYI, on ubuntu16.04, I can put probes on sys_epoll_wait as below

mhiramat@devnote:~/ksrc/linux/tools/perf$ sudo ./perf probe -vnf sys_epoll_wait
probe-definition(0): sys_epoll_wait
symbol:sys_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /usr/lib/debug/boot/vmlinux-4.4.0-31-generic for symbols
Open Debuginfo file: /usr/lib/debug/boot/vmlinux-4.4.0-31-generic
Try to find probe point from debuginfo.
Symbol sys_epoll_wait address found : ffffffff81257410
Matched function: SyS_epoll_wait
found inline addr: 0xffffffff81257410
Probe point found: SyS_epoll_wait+0
found inline addr: 0xffffffff8125756d
Probe point found: SyS_epoll_pwait+141
found inline addr: 0xffffffff81257793
Probe point found: compat_SyS_epoll_pwait+147
Found 3 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:probe/sys_epoll_wait _text+2454544
Writing event: p:probe/sys_epoll_wait_1 _text+2454893
Writing event: p:probe/sys_epoll_wait_2 _text+2455443
Added new events:
probe:sys_epoll_wait (on sys_epoll_wait)
probe:sys_epoll_wait_1 (on sys_epoll_wait)
probe:sys_epoll_wait_2 (on sys_epoll_wait)

You can now use it in all perf tools, such as:

perf record -e probe:sys_epoll_wait_2 -aR sleep 1

--
Masami Hiramatsu <mhir...@kernel.org>

Wangnan (F)

unread,
Aug 3, 2016, 10:00:05 PM8/3/16
to
Have you checked 'readelf -wi'? I guess you wouldn't find
sys_epoll_wait.

Thank you.

Masami Hiramatsu

unread,
Aug 4, 2016, 2:50:04 AM8/4/16
to
Would you remember the previous test was done with this kernel or other one?

Actually, I could put probes on sys_epoll_wait on F24 standard kernel.

-----
[mhiramat@devbox perf]$ sudo ./perf probe -vnf sys_epoll_wait
probe-definition(0): sys_epoll_wait
symbol:sys_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /usr/lib/debug/lib/modules/4.6.4-301.fc24.x86_64/vmlinux for symbols
Open Debuginfo file: /usr/lib/debug/lib/modules/4.6.4-301.fc24.x86_64/vmlinux
Try to find probe point from debuginfo.
Symbol sys_epoll_wait address found : ffffffff81292d60
Matched function: SyS_epoll_wait
found inline addr: 0xffffffff812930f3
Probe point found: compat_SyS_epoll_pwait+147
found inline addr: 0xffffffff81292ed6
Probe point found: SyS_epoll_pwait+134
found inline addr: 0xffffffff81292d60
Probe point found: SyS_epoll_wait+0
Found 3 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:probe/sys_epoll_wait _text+2699507
Writing event: p:probe/sys_epoll_wait_1 _text+2698966
Writing event: p:probe/sys_epoll_wait_2 _text+2698592
Added new events:
probe:sys_epoll_wait (on sys_epoll_wait)
probe:sys_epoll_wait_1 (on sys_epoll_wait)
probe:sys_epoll_wait_2 (on sys_epoll_wait)

You can now use it in all perf tools, such as:

perf record -e probe:sys_epoll_wait_2 -aR sleep 1

-----
FYI, if perf-probe failed to find the symbol in debuginfo, it tries to find its address
from symbol table(kallsyms, here), and then tries to convert the address to the symbol
in debuginfo again. It seems that in your case this convert process is failed.

Then, could you grep DEBUG_INFO in .config? I guess your kernel enables some
reduced debuginfo related config enabled...

Thank you,


--
Masami Hiramatsu <mhir...@kernel.org>

Masami Hiramatsu

unread,
Aug 4, 2016, 5:00:11 AM8/4/16
to
Yes, my understandig is that the sys_* functions (alias) are
generated by linker, so debuginfo doesn't know about it.

As the above log said, perf-probe(and bpf too) tries to find
symbol in kallsyms and find the corresponding debuginfo entry(DIE)
and find the probe point again in debuginfo.

Arnaldo Carvalho de Melo

unread,
Aug 4, 2016, 8:50:07 AM8/4/16
to
No, you did not, you put probes in SyS_epoll_wait, because the part that
is failing for me, i.e. mapping sys_epoll_wait kallsyms's addr to
SyS_epoll_wait's addr is working for you.

I will test with the standard kernel for f24, this is my hunch as well,
that something on the 4.7.0 kernel series (or one after the
f24/ubuntu-you-use ones) is causing this, will diff the configs

more below
Yeah, that is what is failing for me.

> Then, could you grep DEBUG_INFO in .config? I guess your kernel enables some
> reduced debuginfo related config enabled...

If that is the case, then we better add a proper warning because this is
very subtle :-)

Checking...

[acme@jouet linux]$ grep ^CONFIG_DEBUG_ ../build/v4.7.0+/.config | grep 'INFO\|REDUCED'
CONFIG_DEBUG_INFO=y
[acme@jouet linux]$

Nope.

- Arnaldo

Arnaldo Carvalho de Melo

unread,
Aug 4, 2016, 3:40:06 PM8/4/16
to
[root@jouet ~]# grep ffffffffbd295b50 /proc/kallsyms
ffffffffbd295b50 T SyS_epoll_wait
ffffffffbd295b50 T sys_epoll_wait
[root@jouet ~]#

Adding some printfs in debuginfo__find_probe_point()

Using /lib/modules/4.7.0+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.7.0+/build/vmlinux
Try to find probe point from debuginfo.
Symbol sys_epoll_wait address found : ffffffffbd295b50
dwarf_addrdie(ffffffffbd295b50)
dwarf_addrdie(ffffffffbd295b50)
Failed to find debug information for address ffffffffbd295b50
Probe point 'sys_epoll_wait' not found.
Error: Failed to add events. Reason: No such file or directory (Code: -2)
[root@jouet ~]# grep ffffffffbd295b50 /proc/kallsyms
ffffffffbd295b50 T SyS_epoll_wait
ffffffffbd295b50 T sys_epoll_wait
[root@jouet ~]#

So:

int err = debuginfo__get_text_offset(dbg, &baseaddr);

is returning 0, no relocation, its dwarf_addrdie() that is not finding
SyS_epoll_wait from its address.

Trying to figure out why dwarf_addrdie(0xffffffffbd295b50) fails...

- Arnaldo

Arnaldo Carvalho de Melo

unread,
Aug 4, 2016, 5:50:05 PM8/4/16
to
Em Thu, Aug 04, 2016 at 04:36:56PM -0300, Arnaldo Carvalho de Melo escreveu:
> So:
>
> int err = debuginfo__get_text_offset(dbg, &baseaddr);
>
> is returning 0, no relocation, its dwarf_addrdie() that is not finding
> SyS_epoll_wait from its address.
>
> Trying to figure out why dwarf_addrdie(0xffffffffbd295b50) fails...

So, trying to use that vmlinux with objdump to do disassembly I found
that I need to do some offsetting, and after calculating it, this made
things works for me:

diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
index f2d9ff064e2d..9b95754f28ed 100644
--- a/tools/perf/util/probe-finder.c
+++ b/tools/perf/util/probe-finder.c
@@ -1486,6 +1486,8 @@ retry:
/* Find cu die */
if (!dwarf_addrdie(dbg->dbg, (Dwarf_Addr)addr, &cudie)) {
if (!reloc && debuginfo__get_text_offset(dbg, &baseaddr) == 0) {
+ if (baseaddr == 0)
+ baseaddr = -0x3c000000;
addr += baseaddr;
reloc = true;
goto retry;

----------------

With it, which is not a proper fix, of course, we get:

[root@jouet ~]# perf probe sys_epoll_wait
Added new events:
probe:sys_epoll_wait (on sys_epoll_wait)
probe:sys_epoll_wait_1 (on sys_epoll_wait)
probe:sys_epoll_wait_2 (on sys_epoll_wait)

You can now use it in all perf tools, such as:

perf record -e probe:sys_epoll_wait_2 -aR sleep 1

[root@jouet ~]#

Which, using -v shows it "finding" the alias:

[root@jouet ~]# perf probe -v sys_epoll_wait
probe-definition(0): sys_epoll_wait
symbol:sys_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.7.0+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.7.0+/build/vmlinux
Try to find probe point from debuginfo.
Symbol sys_epoll_wait address found : ffffffffbd295b50
Matched function: SyS_epoll_wait
found inline addr: 0xffffffff81295ee7
Probe point found: compat_SyS_epoll_pwait+151
found inline addr: 0xffffffff81295cca
Probe point found: SyS_epoll_pwait+138
found inline addr: 0xffffffff81295b50
Probe point found: SyS_epoll_wait+0
Found 3 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:probe/sys_epoll_wait _text+2711271
Writing event: p:probe/sys_epoll_wait_1 _text+2710730
Writing event: p:probe/sys_epoll_wait_2 _text+2710352
Added new events:
probe:sys_epoll_wait (on sys_epoll_wait)
probe:sys_epoll_wait_1 (on sys_epoll_wait)
probe:sys_epoll_wait_2 (on sys_epoll_wait)

You can now use it in all perf tools, such as:

perf record -e probe:sys_epoll_wait_2 -aR sleep 1

[root@jouet ~]#

----------

Now to figure out why baseaddr is returning as zero...

- Arnaldo

Masami Hiramatsu

unread,
Aug 5, 2016, 5:50:05 AM8/5/16
to
On Thu, 4 Aug 2016 18:47:24 -0300
Arnaldo Carvalho de Melo <ac...@kernel.org> wrote:

> Em Thu, Aug 04, 2016 at 04:36:56PM -0300, Arnaldo Carvalho de Melo escreveu:
> > So:
> >
> > int err = debuginfo__get_text_offset(dbg, &baseaddr);
> >
> > is returning 0, no relocation, its dwarf_addrdie() that is not finding
> > SyS_epoll_wait from its address.
> >
> > Trying to figure out why dwarf_addrdie(0xffffffffbd295b50) fails...
>
> So, trying to use that vmlinux with objdump to do disassembly I found
> that I need to do some offsetting, and after calculating it, this made
> things works for me:
>
> diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
> index f2d9ff064e2d..9b95754f28ed 100644
> --- a/tools/perf/util/probe-finder.c
> +++ b/tools/perf/util/probe-finder.c
> @@ -1486,6 +1486,8 @@ retry:
> /* Find cu die */
> if (!dwarf_addrdie(dbg->dbg, (Dwarf_Addr)addr, &cudie)) {
> if (!reloc && debuginfo__get_text_offset(dbg, &baseaddr) == 0) {
> + if (baseaddr == 0)
> + baseaddr = -0x3c000000;

Hmm, strange... what's this magic number ...?
Actually debuginfo__get_text_offset() is for kernel modules and it relocate the offset.
But I guess sys_epoll_wait is in the kernel.
Mine shows
----
Using /usr/lib/debug/lib/modules/4.6.4-301.fc24.x86_64/vmlinux for symbols
Open Debuginfo file: /usr/lib/debug/lib/modules/4.6.4-301.fc24.x86_64/vmlinux
Try to find probe point from debuginfo.
Symbol sys_epoll_wait address found : ffffffff81292d60 <-(*)
Matched function: SyS_epoll_wait
found inline addr: 0xffffffff812930f3
Probe point found: compat_SyS_epoll_pwait+147
found inline addr: 0xffffffff81292ed6
Probe point found: SyS_epoll_pwait+134
found inline addr: 0xffffffff81292d60
Probe point found: SyS_epoll_wait+0
Found 3 probe_trace_events.
-----

See the symbol address calcurated from symbol map, in successful case
the address exactly same address of SyS_epoll_wait. This indicates
something might wrong in the symbol map. (maybe KASLR?)
Could you check what happen if nokaslr is passed to your kernel?

Thank you,

> Found 3 probe_trace_events.
> Opening /sys/kernel/debug/tracing//kprobe_events write=1
> Writing event: p:probe/sys_epoll_wait _text+2711271
> Writing event: p:probe/sys_epoll_wait_1 _text+2710730
> Writing event: p:probe/sys_epoll_wait_2 _text+2710352
> Added new events:
> probe:sys_epoll_wait (on sys_epoll_wait)
> probe:sys_epoll_wait_1 (on sys_epoll_wait)
> probe:sys_epoll_wait_2 (on sys_epoll_wait)
>
> You can now use it in all perf tools, such as:
>
> perf record -e probe:sys_epoll_wait_2 -aR sleep 1
>
> [root@jouet ~]#
>
> ----------
>
> Now to figure out why baseaddr is returning as zero...
>
> - Arnaldo


--
Masami Hiramatsu <mhir...@kernel.org>

Arnaldo Carvalho de Melo

unread,
Aug 5, 2016, 10:40:06 AM8/5/16
to
Em Fri, Aug 05, 2016 at 06:45:50PM +0900, Masami Hiramatsu escreveu:
> On Thu, 4 Aug 2016 18:47:24 -0300
> Arnaldo Carvalho de Melo <ac...@kernel.org> wrote:
>
> > Em Thu, Aug 04, 2016 at 04:36:56PM -0300, Arnaldo Carvalho de Melo escreveu:
> > > So:
> > >
> > > int err = debuginfo__get_text_offset(dbg, &baseaddr);
> > >
> > > is returning 0, no relocation, its dwarf_addrdie() that is not finding
> > > SyS_epoll_wait from its address.
> > >
> > > Trying to figure out why dwarf_addrdie(0xffffffffbd295b50) fails...
> >
> > So, trying to use that vmlinux with objdump to do disassembly I found
> > that I need to do some offsetting, and after calculating it, this made
> > things works for me:
> >
> > diff --git a/tools/perf/util/probe-finder.c b/tools/perf/util/probe-finder.c
> > index f2d9ff064e2d..9b95754f28ed 100644
> > --- a/tools/perf/util/probe-finder.c
> > +++ b/tools/perf/util/probe-finder.c
> > @@ -1486,6 +1486,8 @@ retry:
> > /* Find cu die */
> > if (!dwarf_addrdie(dbg->dbg, (Dwarf_Addr)addr, &cudie)) {
> > if (!reloc && debuginfo__get_text_offset(dbg, &baseaddr) == 0) {
> > + if (baseaddr == 0)
> > + baseaddr = -0x3c000000;
>
> Hmm, strange... what's this magic number ...?
> Actually debuginfo__get_text_offset() is for kernel modules and it relocate the offset.

I see no mention to "kernel modules" in this function, and it is there
that the message is being emitted :-\

> But I guess sys_epoll_wait is in the kernel.

sure
[acme@jouet linux]$ grep RANDOMIZE_BASE ../build/v4.7.0+/.config
CONFIG_RANDOMIZE_BASE=y

Yes, it is in place, I'll check rebootint with nokaslr.

Can you think about how to properly warn the user about this
possibility, if it is confirmed to cause this issue?

I wonder why this doesn't seem to affect annotation, that also uses vmlinux,
for instance:

[root@jouet ~]# perf record --all-kernel -F 10000 ping -f -c 100000 localhost
PING localhost.localdomain (127.0.0.1) 56(84) bytes of data.

--- localhost.localdomain ping statistics ---
100000 packets transmitted, 100000 received, 0% packet loss, time 732ms
rtt min/avg/max/mdev = 0.003/0.003/0.048/0.002 ms, ipg/ewma 0.007/0.003 ms
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 0.298 MB perf.data (7340 samples) ]
[root@jouet ~]# perf report -v | grep '\[' | head -10
Using /lib/modules/4.7.0+/build/vmlinux for symbols
20.42% ping /lib/modules/4.7.0+/build/vmlinux 0x960374 v [k] ipt_do_table
2.55% ping /lib/modules/4.7.0+/build/vmlinux 0x900b8a v [k] nf_iterate
1.94% ping /lib/modules/4.7.0+/build/vmlinux 0x95190b v [k] fib_table_lookup
1.68% ping /lib/modules/4.7.0+/build/vmlinux 0x2accfa v [k] __local_bh_enable_ip
1.68% ping /lib/modules/4.7.0+/build/vmlinux 0x9e41ab v [k] _raw_spin_lock_irqsave
1.66% ping /lib/modules/4.7.0+/build/vmlinux 0x55fd94 v [k] avc_has_perm
1.65% ping /lib/modules/4.7.0+/build/vmlinux 0x90e3e2 v [k] __ip_append_data.isra.44
1.31% ping /lib/modules/4.7.0+/build/vmlinux 0x5ead59 v [k] copy_user_enhanced_fast_string
1.21% ping /lib/modules/4.7.0+/build/vmlinux 0x938820 v [k] raw_local_deliver
1.14% ping /lib/modules/4.7.0+/build/vmlinux 0x6c6cea v [k] n_tty_write
[root@jouet ~]#

[root@jouet ~]# perf annotate -v --stdio ipt_do_table
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.7.0+/build/vmlinux for symbols
symbol__disassemble: filename=/lib/modules/4.7.0+/build/vmlinux, sym=ipt_do_table, start=0xffffffffbd7601c0, end=0xffffffffbd7607b0
annotating [0x3cef9b0] /lib/modules/4.7.0+/build/vmlinux : [0x4159750] ipt_do_table
Executing: objdump --start-address=0xffffffff817601c0 --stop-address=0xffffffff817607b0 -l -d --no-show-raw -S -C /lib/modules/4.7.0+/build/vmlinux 2>/dev/nu
Percent | Source code & Disassembly of vmlinux for cycles:ppp (1498 samples)
----------------------------------------------------------------------------------
<SNIP>

[root@jouet ~]# grep ffffffffbd7601c0 /proc/kallsyms
ffffffffbd7601c0 T ipt_do_table
[root@jouet ~]#

Anyway, will reboot this machine with nokaslr to check that possibility.

- Arnaldo

Arnaldo Carvalho de Melo

unread,
Aug 5, 2016, 12:00:06 PM8/5/16
to
Em Fri, Aug 05, 2016 at 11:35:32AM -0300, Arnaldo Carvalho de Melo escreveu:
> Em Fri, Aug 05, 2016 at 06:45:50PM +0900, Masami Hiramatsu escreveu:
> > See the symbol address calcurated from symbol map, in successful case
> > the address exactly same address of SyS_epoll_wait. This indicates
> > something might wrong in the symbol map. (maybe KASLR?)
> > Could you check what happen if nokaslr is passed to your kernel?

> [acme@jouet linux]$ grep RANDOMIZE_BASE ../build/v4.7.0+/.config
> CONFIG_RANDOMIZE_BASE=y

> Yes, it is in place, I'll check rebooting with nokaslr.
So, more info, but executive summary: distro kernels don't exhibit this
problem, but they have CONFIG_RANDOMIZE_BASE=y, the kernel where I
noticed this problem (4.7.0+) doesn't exhibit this problem when used
with 'nokaslr', but I think more investigation is needed to get to the
bottom of this, i.e. to emit meaningful messages when kaslr is in place
(or is suspected to be):

Using a distro kernel:

[root@jouet ~]# uname -r
4.6.4-301.fc24.x86_64
[root@jouet ~]# perf probe sys_epoll_wait
Added new event:
probe:sys_epoll_wait (on sys_epoll_wait)

You can now use it in all perf tools, such as:

perf record -e probe:sys_epoll_wait -aR sleep 1

[root@jouet ~]# rpm -q kernel-debuginfo
kernel-debuginfo-4.6.3-300.fc24.x86_64
[root@jouet ~]# uname -r
4.6.4-301.fc24.x86_64
[root@jouet ~]#

So it must be using just /proc/kallsyms, since there is no matching vmlinux:

[root@jouet ~]# perf probe -d *:*
Removed event: probe:sys_epoll_wait
[root@jouet ~]# perf probe -v sys_epoll_wait
probe-definition(0): sys_epoll_wait
symbol:sys_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
symsrc__init: cannot get elf header.
Could not open debuginfo. Try to use symbols.
Looking at the vmlinux_path (8 entries long)
symsrc__init: cannot get elf header.
Using /proc/kcore for kernel object code
Using /proc/kallsyms for symbols
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:probe/sys_epoll_wait _text+2698592
Added new event:
probe:sys_epoll_wait (on sys_epoll_wait)

You can now use it in all perf tools, such as:

perf record -e probe:sys_epoll_wait -aR sleep 1

[root@jouet ~]#

Ok, installing a matching kernel vmlinux:

[root@jouet ~]# perf probe -d *:*
Removed event: probe:sys_epoll_wait
[root@jouet ~]# dnf debuginfo-install kernel

Trying again:

[root@jouet ~]# perf probe -v sys_epoll_wait
probe-definition(0): sys_epoll_wait
symbol:sys_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /usr/lib/debug/lib/modules/4.6.4-301.fc24.x86_64/vmlinux for symbols
Open Debuginfo file: /usr/lib/debug/lib/modules/4.6.4-301.fc24.x86_64/vmlinux
Try to find probe point from debuginfo.
Symbol sys_epoll_wait address found : ffffffff81292d60
Matched function: SyS_epoll_wait
found inline addr: 0xffffffff812930f3
Probe point found: compat_SyS_epoll_pwait+147
found inline addr: 0xffffffff81292ed6
Probe point found: SyS_epoll_pwait+134
found inline addr: 0xffffffff81292d60
Probe point found: SyS_epoll_wait+0
Found 3 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:probe/sys_epoll_wait _text+2699507
Writing event: p:probe/sys_epoll_wait_1 _text+2698966
Writing event: p:probe/sys_epoll_wait_2 _text+2698592
Added new events:
probe:sys_epoll_wait (on sys_epoll_wait)
probe:sys_epoll_wait_1 (on sys_epoll_wait)
probe:sys_epoll_wait_2 (on sys_epoll_wait)

You can now use it in all perf tools, such as:

perf record -e probe:sys_epoll_wait_2 -aR sleep 1

[root@jouet ~]#

So, a distro kernel works with or without vmlinux, but then...

[root@jouet ~]# grep RANDOMIZE_BASE /boot/config-4.6.4-301.fc24.x86_64
CONFIG_RANDOMIZE_BASE=y
CONFIG_RANDOMIZE_BASE_MAX_OFFSET=0x40000000
[root@jouet ~]#

It also has kaslr :-\ Rebooting with the kernel where the problem happened,
this time with nokaslr:

[root@jouet ~]# uname -r
4.7.0+
[root@jouet ~]# cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-4.7.0+ root=/dev/mapper/fedora-root ro rd.lvm.lv=fedora/root rd.lvm.lv=fedora/swap rhgb quiet LANG=en_US.UTF-8 nokaslr
[root@jouet ~]# perf probe -v sys_epoll_wait
probe-definition(0): sys_epoll_wait
symbol:sys_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.7.0+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.7.0+/build/vmlinux
Try to find probe point from debuginfo.
Symbol sys_epoll_wait address found : ffffffff81295b50
Matched function: SyS_epoll_wait
found inline addr: 0xffffffff81295ee7
Probe point found: compat_SyS_epoll_pwait+151
found inline addr: 0xffffffff81295cca
Probe point found: SyS_epoll_pwait+138
found inline addr: 0xffffffff81295b50
Probe point found: SyS_epoll_wait+0
Found 3 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:probe/sys_epoll_wait _text+2711271
Writing event: p:probe/sys_epoll_wait_1 _text+2710730
Writing event: p:probe/sys_epoll_wait_2 _text+2710352
Added new events:
probe:sys_epoll_wait (on sys_epoll_wait)
probe:sys_epoll_wait_1 (on sys_epoll_wait)
probe:sys_epoll_wait_2 (on sys_epoll_wait)

You can now use it in all perf tools, such as:

perf record -e probe:sys_epoll_wait_2 -aR sleep 1

[root@jouet ~]#

It works, but why was annotation worked when "nokaslr" wasn't used?

- Arnaldo

Masami Hiramatsu

unread,
Aug 6, 2016, 5:10:07 PM8/6/16
to
On Fri, 5 Aug 2016 11:35:32 -0300
I've found "commit f90acac75713cc6f18a4b2f1b9162bc1cd893c20"

Actually, I forgot to count the relocation at find_alternative_probe_point().
Could you try the patch below?

------
perf-probe: Adjust map->reloc offset when finding kernel symbol from map

From: Masami Hiramatsu <masami.h...@linaro.org>

Adjust map->reloc offset for the unmapped address when
finding alternative symbol address from map, because
KASLR can relocate the kernel symbol address.

The same adjustment has been done when finding appropriate
kernel symbol address from map which was introduced by
commit f90acac75713 ("perf probe: Find given address from offline dwarf")

Reported-by: Arnaldo Carvalho de Melo <ac...@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.h...@linaro.org>
---
tools/perf/util/probe-event.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index 953dc1a..d5ccb65 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -385,7 +385,7 @@ static int find_alternative_probe_point(struct debuginfo *dinfo,
if (uprobes)
address = sym->start;
else
- address = map->unmap_ip(map, sym->start);
+ address = map->unmap_ip(map, sym->start) - map->reloc;
break;
}
if (!address) {

Arnaldo Carvalho de Melo

unread,
Aug 8, 2016, 3:40:07 PM8/8/16
to
Does the trick, booting with a kernel built from what is in
tip/perf/urgent, and with CONFIG_RANDOMIZE_BASE=y, before this fix:

[root@jouet ~]# perf probe -v sys_epoll_wait
probe-definition(0): sys_epoll_wait
symbol:sys_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Failed to get sdt note: -9
Failed to update/scan SDT cache for (null)
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.7.0+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.7.0+/build/vmlinux
Try to find probe point from debuginfo.
Symbol sys_epoll_wait address found : ffffffffac296900
Failed to find debug information for address ffffffffac296900
Probe point 'sys_epoll_wait' not found.
Error: Failed to add events. Reason: No such file or directory (Code:
-2)
[root@jouet ~]#
[root@jouet ~]#

After the fix:

[root@jouet ~]# perf probe -v sys_epoll_wait
probe-definition(0): sys_epoll_wait
symbol:sys_epoll_wait file:(null) line:0 offset:0 return:0 lazy:(null)
0 arguments
Looking at the vmlinux_path (8 entries long)
Using /lib/modules/4.7.0+/build/vmlinux for symbols
Open Debuginfo file: /lib/modules/4.7.0+/build/vmlinux
Try to find probe point from debuginfo.
Symbol sys_epoll_wait address found : ffffffff81296900
Matched function: SyS_epoll_wait
found inline addr: 0xffffffff81296c97
Probe point found: compat_SyS_epoll_pwait+151
found inline addr: 0xffffffff81296a7a
Probe point found: SyS_epoll_pwait+138
found inline addr: 0xffffffff81296900
Probe point found: SyS_epoll_wait+0
Found 3 probe_trace_events.
Opening /sys/kernel/debug/tracing//kprobe_events write=1
Writing event: p:probe/sys_epoll_wait _text+2714775
Writing event: p:probe/sys_epoll_wait_1 _text+2714234
Writing event: p:probe/sys_epoll_wait_2 _text+2713856

tip-bot for Masami Hiramatsu

unread,
Aug 9, 2016, 3:20:06 PM8/9/16
to
Commit-ID: 8e34189b347d76acf48ce05831176582201b664d
Gitweb: http://git.kernel.org/tip/8e34189b347d76acf48ce05831176582201b664d
Author: Masami Hiramatsu <masami.h...@linaro.org>
AuthorDate: Sat, 6 Aug 2016 19:29:48 +0900
Committer: Arnaldo Carvalho de Melo <ac...@redhat.com>
CommitDate: Tue, 9 Aug 2016 10:47:43 -0300

perf probe: Adjust map->reloc offset when finding kernel symbol from map

Adjust map->reloc offset for the unmapped address when finding
alternative symbol address from map, because KASLR can relocate the
kernel symbol address.

The same adjustment has been done when finding appropriate kernel symbol
address from map which was introduced by commit f90acac75713 ("perf
probe: Find given address from offline dwarf")

Reported-by: Arnaldo Carvalho de Melo <ac...@kernel.org>
Signed-off-by: Masami Hiramatsu <masami.h...@linaro.org>
Cc: Alexei Starovoitov <alexei.st...@gmail.com>
Cc: Wang Nan <wang...@huawei.com>
Link: http://lkml.kernel.org/r/20160806192948.e366...@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <ac...@redhat.com>
0 new messages