KASAN: vmalloc-out-of-bounds Read in bpf_trace_run3

syzbot

Nov 2, 2020, 6:54:20 AM

to ak...@linux-foundation.org, and...@kernel.org, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, da...@davemloft.net, ha...@kernel.org, john.fa...@gmail.com, ka...@fb.com, kps...@chromium.org, ku...@kernel.org, linux-...@vger.kernel.org, mi...@elte.hu, mi...@redhat.com, mmul...@fb.com, net...@vger.kernel.org, pet...@infradead.org, ros...@goodmis.org, songliu...@fb.com, syzkall...@googlegroups.com, y...@fb.com

Hello,

syzbot found the following issue on:

HEAD commit: 080b6f40 bpf: Don't rely on GCC __attribute__((optimize)) ..
git tree: bpf
console output: https://syzkaller.appspot.com/x/log.txt?x=1089d37c500000
kernel config: https://syzkaller.appspot.com/x/.config?x=58a4ca757d776bfe
dashboard link: https://syzkaller.appspot.com/bug?extid=d29e58bb557324e55e5e
compiler: gcc (GCC) 10.1.0-syz 20200507
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10f4b032500000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1371a47c500000

The issue was bisected to:

commit 9df1c28bb75217b244257152ab7d788bb2a386d0
Author: Matt Mullins <mmul...@fb.com>
Date: Fri Apr 26 18:49:47 2019 +0000

bpf: add writable context for raw tracepoints

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=12b6c4da500000
final oops: https://syzkaller.appspot.com/x/report.txt?x=11b6c4da500000
console output: https://syzkaller.appspot.com/x/log.txt?x=16b6c4da500000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+d29e58...@syzkaller.appspotmail.com
Fixes: 9df1c28bb752 ("bpf: add writable context for raw tracepoints")

==================================================================
BUG: KASAN: vmalloc-out-of-bounds in __bpf_trace_run kernel/trace/bpf_trace.c:2045 [inline]
BUG: KASAN: vmalloc-out-of-bounds in bpf_trace_run3+0x3e0/0x3f0 kernel/trace/bpf_trace.c:2083
Read of size 8 at addr ffffc90000e6c030 by task kworker/0:3/3754

CPU: 0 PID: 3754 Comm: kworker/0:3 Not tainted 5.9.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: 0x0 (events)
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x107/0x163 lib/dump_stack.c:118
print_address_description.constprop.0.cold+0x5/0x4c8 mm/kasan/report.c:385
__kasan_report mm/kasan/report.c:545 [inline]
kasan_report.cold+0x1f/0x37 mm/kasan/report.c:562
__bpf_trace_run kernel/trace/bpf_trace.c:2045 [inline]
bpf_trace_run3+0x3e0/0x3f0 kernel/trace/bpf_trace.c:2083
__bpf_trace_sched_switch+0xdc/0x120 include/trace/events/sched.h:138
__traceiter_sched_switch+0x64/0xb0 include/trace/events/sched.h:138
trace_sched_switch include/trace/events/sched.h:138 [inline]
__schedule+0xeb8/0x2130 kernel/sched/core.c:4520
schedule+0xcf/0x270 kernel/sched/core.c:4601
worker_thread+0x14c/0x1120 kernel/workqueue.c:2439
kthread+0x3af/0x4a0 kernel/kthread.c:292
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

Memory state around the buggy address:
 ffffc90000e6bf00: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
 ffffc90000e6bf80: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
>ffffc90000e6c000: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
                                     ^
 ffffc90000e6c080: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
 ffffc90000e6c100: f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8 f8
==================================================================

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

Dmitry Vyukov

Nov 11, 2020, 9:58:03 AM

to syzbot, Andrew Morton, and...@kernel.org, Alexei Starovoitov, bpf, Daniel Borkmann, David Miller, Jesper Dangaard Brouer, John Fastabend, Martin KaFai Lau, KP Singh, Jakub Kicinski, LKML, Ingo Molnar, Ingo Molnar, mmul...@fb.com, netdev, Peter Zijlstra, Steven Rostedt, Song Liu, syzkaller-bugs, Yonghong Song

On Mon, Nov 2, 2020 at 12:54 PM syzbot
<syzbot+d29e58...@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 080b6f40 bpf: Don't rely on GCC __attribute__((optimize)) ..
> git tree: bpf
> console output: https://syzkaller.appspot.com/x/log.txt?x=1089d37c500000
> kernel config: https://syzkaller.appspot.com/x/.config?x=58a4ca757d776bfe
> dashboard link: https://syzkaller.appspot.com/bug?extid=d29e58bb557324e55e5e
> compiler: gcc (GCC) 10.1.0-syz 20200507
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10f4b032500000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1371a47c500000
>
> The issue was bisected to:
>
> commit 9df1c28bb75217b244257152ab7d788bb2a386d0
> Author: Matt Mullins <mmul...@fb.com>
> Date: Fri Apr 26 18:49:47 2019 +0000
>
> bpf: add writable context for raw tracepoints

We have a number of kernel memory corruptions related to bpf_trace_run now:
https://groups.google.com/g/syzkaller-bugs/search?q=kernel%2Ftrace%2Fbpf_trace.c

Can raw tracepoints "legally" corrupt kernel memory (a-la /dev/kmem)?
Or they shouldn't?

Looking at the description of Matt's commit, it seems that corruptions
should not be possible (bounded buffer, checked size, etc). Then it
means it's a real kernel bug?


Matt Mullins

Nov 13, 2020, 12:37:28 AM

to Dmitry Vyukov, syzbot, Andrew Morton, and...@kernel.org, Alexei Starovoitov, bpf, Daniel Borkmann, David Miller, Jesper Dangaard Brouer, John Fastabend, Martin KaFai Lau, KP Singh, Jakub Kicinski, LKML, Ingo Molnar, Ingo Molnar, mmul...@fb.com, netdev, Peter Zijlstra, Steven Rostedt, Song Liu, syzkaller-bugs, Yonghong Song

This bug doesn't seem to be related to the writability of the
tracepoint; it bisected to that commit simply because it used
BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE for the reproducer and it EINVAL's
before that program type was introduced. The BPF program it loads is
pretty much a no-op.

The problem here is a kmalloc failure injection into
tracepoint_probe_unregister, but the error is ignored -- so the bpf
program is freed even though the tracepoint is never unregistered.

I have a first pass at a patch to pipe through the error code, but it's
pretty ugly. It's also called from the file_operations ->release(), for
which errors are solidly ignored in __fput(), so I'm not sure what the
best way to handle ENOMEM is...
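
The failure mode described in this message (an unregister step that can fail on allocation, a caller that drops the error, and a program that is freed while the hook still points at it) can be modeled in a small user-space sketch. Every name below (`struct prog`, `struct hook`, `hook_unregister`, and so on) is hypothetical and is not kernel API; the `fail_alloc` flag merely stands in for fault injection, and "freeing" is modeled with a flag so the sketch never touches released memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in the kernel. */
struct prog { bool freed; };             /* models the BPF program   */
struct hook { struct prog *attached; };  /* models the tracepoint    */

/*
 * Models an unregister that needs an allocation to succeed.
 * When fail_alloc is set it returns -1 (standing in for -ENOMEM)
 * and leaves the hook pointing at the old program.
 */
static int hook_unregister(struct hook *h, bool fail_alloc)
{
	if (fail_alloc)
		return -1;
	h->attached = NULL;
	return 0;
}

/* Models dropping the last reference to the program. */
static void prog_put(struct prog *p)
{
	assert(!p->freed);
	p->freed = true;
}

/* Buggy shape: the unregister result is dropped, the program is freed. */
static void release_ignoring_error(struct hook *h, bool fail_alloc)
{
	struct prog *p = h->attached;

	(void)hook_unregister(h, fail_alloc);
	prog_put(p);
}

/* Checked shape: free the program only once the hook lets go of it. */
static int release_checking_error(struct hook *h, bool fail_alloc)
{
	struct prog *p = h->attached;
	int err = hook_unregister(h, fail_alloc);

	if (err)
		return err;  /* program stays alive; caller decides what next */
	prog_put(p);
	return 0;
}

/* True if the hook would still call into a freed program. */
static bool hook_is_dangling(const struct hook *h)
{
	return h->attached != NULL && h->attached->freed;
}
```

With `fail_alloc` set, `release_ignoring_error()` leaves the hook dangling, while `release_checking_error()` returns the error and keeps the program alive. The sketch does not answer the open question in the message: a caller such as a `->release()` handler, whose return value is itself ignored, has nowhere to send that error.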

Yonghong Song

Nov 13, 2020, 11:09:39 AM

to Dmitry Vyukov, syzbot, Andrew Morton, and...@kernel.org, Alexei Starovoitov, bpf, Daniel Borkmann, David Miller, Jesper Dangaard Brouer, John Fastabend, Martin KaFai Lau, KP Singh, Jakub Kicinski, LKML, Ingo Molnar, Ingo Molnar, mmul...@fb.com, netdev, Peter Zijlstra, Steven Rostedt, Song Liu, syzkaller-bugs

On 11/12/20 9:37 PM, Matt Mullins wrote:
> On Wed, Nov 11, 2020 at 03:57:50PM +0100, Dmitry Vyukov wrote:
>> On Mon, Nov 2, 2020 at 12:54 PM syzbot
>> <syzbot+d29e58...@syzkaller.appspotmail.com> wrote:
>>>
>>> Hello,
>>>
>>> syzbot found the following issue on:
>>>
>>> HEAD commit: 080b6f40 bpf: Don't rely on GCC __attribute__((optimize)) ..
>>> git tree: bpf
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1089d37c500000
>>> kernel config: https://syzkaller.appspot.com/x/.config?x=58a4ca757d776bfe
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=d29e58bb557324e55e5e
>>> compiler: gcc (GCC) 10.1.0-syz 20200507
>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10f4b032500000
>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1371a47c500000
>>>
>>> The issue was bisected to:
>>>
>>> commit 9df1c28bb75217b244257152ab7d788bb2a386d0
>>> Author: Matt Mullins <mmul...@fb.com>
>>> Date: Fri Apr 26 18:49:47 2019 +0000
>>>
>>> bpf: add writable context for raw tracepoints
>>
>>
>> We have a number of kernel memory corruptions related to bpf_trace_run now:
>> https://groups.google.com/g/syzkaller-bugs/search?q=kernel/trace/bpf_trace.c
>>
>> Can raw tracepoints "legally" corrupt kernel memory (a-la /dev/kmem)?
>> Or they shouldn't?
>>
>> Looking at the description of Matt's commit, it seems that corruptions
>> should not be possible (bounded buffer, checked size, etc). Then it
>> means it's a real kernel bug?
>
> This bug doesn't seem to be related to the writability of the
> tracepoint; it bisected to that commit simply because it used
> BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE for the reproducer and it EINVAL's
> before that program type was introduced. The BPF program it loads is
> pretty much a no-op.
>
> The problem here is a kmalloc failure injection into
> tracepoint_probe_unregister, but the error is ignored -- so the bpf
> program is freed even though the tracepoint is never unregistered.
>
> I have a first pass at a patch to pipe through the error code, but it's
> pretty ugly. It's also called from the file_operations ->release(), for

Maybe you can still post the patch, so people can review and make
suggestions which may lead to a *better* solution.

[...]

Eric Dumazet

Feb 10, 2021, 1:23:43 PM

to Yonghong Song, Dmitry Vyukov, syzbot, Andrew Morton, and...@kernel.org, Alexei Starovoitov, bpf, Daniel Borkmann, David Miller, Jesper Dangaard Brouer, John Fastabend, Martin KaFai Lau, KP Singh, Jakub Kicinski, LKML, Ingo Molnar, Ingo Molnar, mmul...@fb.com, netdev, Peter Zijlstra, Steven Rostedt, Song Liu, syzkaller-bugs

ping

This bug is still there.

Steven Rostedt

Feb 10, 2021, 2:52:12 PM

to Eric Dumazet, Yonghong Song, Dmitry Vyukov, syzbot, Andrew Morton, and...@kernel.org, Alexei Starovoitov, bpf, Daniel Borkmann, David Miller, Jesper Dangaard Brouer, John Fastabend, Martin KaFai Lau, KP Singh, Jakub Kicinski, LKML, Ingo Molnar, Ingo Molnar, mmul...@fb.com, netdev, Peter Zijlstra, Song Liu, syzkaller-bugs

On Wed, 10 Feb 2021 19:23:38 +0100
Eric Dumazet <eric.d...@gmail.com> wrote:

> >> The problem here is a kmalloc failure injection into
> >> tracepoint_probe_unregister, but the error is ignored -- so the bpf
> >> program is freed even though the tracepoint is never unregistered.
> >>
> >> I have a first pass at a patch to pipe through the error code, but it's
> >> pretty ugly. It's also called from the file_operations ->release(), for
> >
> > Maybe you can still post the patch, so people can review and make suggestions which may lead to a *better* solution.
>
>
> ping
>
> This bug is still there.

Is this a bug via syzkaller?

I have this fix in linux-next:

befe6d946551 ("tracepoint: Do not fail unregistering a probe due to memory failure")

But because it relies on undefined behavior (calling a sub return from a
call that has parameters, which Peter Zijlstra says is OK), I'm hesitant to
send it to Linus now or label it as stable.

Now this can only happen if kmalloc fails from here (called by func_remove).

static inline void *allocate_probes(int count)
{
	struct tp_probes *p = kmalloc(struct_size(p, probes, count),
				      GFP_KERNEL);
	return p == NULL ? NULL : p->probes;
}

As probes and count together is typically much less than a page (unless you
are doing fuzz testing and adding a ton of callbacks to a single
tracepoint), that kmalloc should always succeed.

The failure above can only happen if allocate_probes returns NULL, which is
extremely unlikely.
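
To make the "much less than a page" point concrete, here is a user-space sketch of the size arithmetic that a `struct_size()`-style helper performs: the size of the header struct plus `count` trailing elements, saturating on overflow. The struct layouts and the name `probe_block_size` are hypothetical stand-ins chosen for illustration; they are not copied from the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts: a small header followed by a flexible array. */
struct probe_entry {
	void *func;
	void *data;
	int prio;
};

struct probe_block {
	void *header_placeholder[2];   /* stands in for bookkeeping fields */
	struct probe_entry probes[];   /* flexible array member            */
};

/*
 * Bytes needed for a block holding `count` entries:
 * sizeof(header) + count * sizeof(entry), or SIZE_MAX if that
 * arithmetic would overflow (so an allocator would reject it).
 */
static size_t probe_block_size(size_t count)
{
	size_t hdr = sizeof(struct probe_block);
	size_t elem = sizeof(struct probe_entry);

	if (count > (SIZE_MAX - hdr) / elem)
		return SIZE_MAX;
	return hdr + count * elem;
}
```

Under these assumed layouts on a typical 64-bit build, eight entries come to a couple of hundred bytes, far below a 4096-byte page, which is the scale the message refers to.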

My question is, how is this triggered? And this should only be triggered by
root doing stupid crap. Is it that critical to have fixed?

-- Steve
