WARNING in mark_chain_precision


syzbot

Jul 8, 2019, 12:07:08 PM
to aaron....@intel.com, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, da...@davemloft.net, ha...@kernel.org, intel-w...@lists.osuosl.org, jakub.k...@netronome.com, jeffrey....@intel.com, john.fa...@gmail.com, ka...@fb.com, linux-...@vger.kernel.org, net...@vger.kernel.org, sasha....@intel.com, songliu...@fb.com, syzkall...@googlegroups.com, xdp-n...@vger.kernel.org, y...@fb.com
Hello,

syzbot found the following crash on:

HEAD commit: a51df9f8 gve: fix -ENOMEM null check on a page allocation
git tree: net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=17e64325a00000
kernel config: https://syzkaller.appspot.com/x/.config?x=6bb3e6e7997c14f9
dashboard link: https://syzkaller.appspot.com/bug?extid=f21251a7468cd46efc60
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=114f842da00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1630a5aba00000

The bug was bisected to:

commit 55fdbeaa2db8b271db767240fba24a60bd232528
Author: Sasha Neftin <sasha....@intel.com>
Date: Mon Jan 7 14:40:17 2019 +0000

igc: Remove unused code

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=15c205b9a00000
final crash: https://syzkaller.appspot.com/x/report.txt?x=17c205b9a00000
console output: https://syzkaller.appspot.com/x/log.txt?x=13c205b9a00000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+f21251...@syzkaller.appspotmail.com
Fixes: 55fdbeaa2db8 ("igc: Remove unused code")

------------[ cut here ]------------
verifier backtracking bug
WARNING: CPU: 0 PID: 8846 at kernel/bpf/verifier.c:1755
mark_chain_precision+0x15c2/0x18e0 kernel/bpf/verifier.c:1755
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 8846 Comm: syz-executor835 Not tainted 5.2.0-rc6+ #56
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
panic+0x2cb/0x744 kernel/panic.c:219
__warn.cold+0x20/0x4d kernel/panic.c:576
report_bug+0x263/0x2b0 lib/bug.c:186
fixup_bug arch/x86/kernel/traps.c:179 [inline]
fixup_bug arch/x86/kernel/traps.c:174 [inline]
do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:986
RIP: 0010:mark_chain_precision+0x15c2/0x18e0 kernel/bpf/verifier.c:1755
Code: e9 55 f2 ff ff 48 89 df e8 4b 0a 2c 00 e9 3a f3 ff ff e8 61 cb f2 ff
48 c7 c7 e0 43 91 87 c6 05 40 2b 1f 08 01 e8 1c 03 c5 ff <0f> 0b 41 be f2
ff ff ff e9 eb f7 ff ff e8 3c cb f2 ff 45 31 f6 e9
RSP: 0018:ffff8880a01ef378 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff815adb06 RDI: ffffed101403de61
RBP: ffff8880a01ef4d0 R08: ffff88808e26c400 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8880a14e8440 R14: 0000000000000001 R15: dffffc0000000000
check_cond_jmp_op+0xcce/0x3c20 kernel/bpf/verifier.c:5793
do_check+0x61cf/0x8930 kernel/bpf/verifier.c:7684
bpf_check+0x6f99/0x9950 kernel/bpf/verifier.c:9195
bpf_prog_load+0xec8/0x1670 kernel/bpf/syscall.c:1690
__do_sys_bpf+0xa20/0x42c0 kernel/bpf/syscall.c:2830
__se_sys_bpf kernel/bpf/syscall.c:2789 [inline]
__x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:2789
do_syscall_64+0xfd/0x680 arch/x86/entry/common.c:301
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x440369
Code: 18 89 d0 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 fb 13 fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007ffccb952af8 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 0000000000440369
RDX: 0000000000000048 RSI: 0000000020000200 RDI: 0000000000000005
RBP: 00000000006ca018 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000ffffffff R11: 0000000000000246 R12: 0000000000401bf0
R13: 0000000000401c80 R14: 0000000000000000 R15: 0000000000000000
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

Andrii Nakryiko

Jul 8, 2019, 11:49:37 PM
to syzbot, aaron....@intel.com, Alexei Starovoitov, bpf, Daniel Borkmann, David S. Miller, ha...@kernel.org, intel-w...@lists.osuosl.org, Jakub Kicinski, jeffrey....@intel.com, john fastabend, Martin Lau, open list, Networking, sasha....@intel.com, Song Liu, syzkall...@googlegroups.com, xdp-n...@vger.kernel.org, Yonghong Song
#syz test: https://github.com/anakryiko/linux bpf-fix-precise-bpf_st

syzbot

Jul 9, 2019, 12:25:01 AM
to aaron....@intel.com, andrii....@gmail.com, a...@kernel.org, b...@vger.kernel.org, dan...@iogearbox.net, da...@davemloft.net, ha...@kernel.org, intel-w...@lists.osuosl.org, jakub.k...@netronome.com, jeffrey....@intel.com, john.fa...@gmail.com, ka...@fb.com, linux-...@vger.kernel.org, net...@vger.kernel.org, sasha....@intel.com, songliu...@fb.com, syzkall...@googlegroups.com, xdp-n...@vger.kernel.org, y...@fb.com
Hello,

syzbot has tested the proposed patch, but the reproducer still triggered a
crash:
WARNING in bpf_jit_free

WARNING: CPU: 0 PID: 9077 at kernel/bpf/core.c:851 bpf_jit_free+0x157/0x1b0
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 9077 Comm: kworker/0:3 Not tainted 5.2.0-rc6+ #1
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Workqueue: events bpf_prog_free_deferred
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
panic+0x2cb/0x744 kernel/panic.c:219
__warn.cold+0x20/0x4d kernel/panic.c:576
report_bug+0x263/0x2b0 lib/bug.c:186
fixup_bug arch/x86/kernel/traps.c:179 [inline]
fixup_bug arch/x86/kernel/traps.c:174 [inline]
do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
invalid_op+0x14/0x20 arch/x86/entry/entry_64.S:986
RIP: 0010:bpf_jit_free+0x157/0x1b0
Code: 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5d 48 b8 00 02 00 00
00 00 ad de 48 39 43 70 0f 84 05 ff ff ff e8 09 7f f4 ff <0f> 0b e9 f9 fe
ff ff e8 2d 02 2e 00 e9 d9 fe ff ff 48 89 7d e0 e8
RSP: 0018:ffff888084affcb0 EFLAGS: 00010293
RAX: ffff88808a622100 RBX: ffff88809639d580 RCX: ffffffff817b0b0d
RDX: 0000000000000000 RSI: ffffffff817c4557 RDI: ffff88809639d5f0
RBP: ffff888084affcd0 R08: 1ffffffff150daa8 R09: fffffbfff150daa9
R10: fffffbfff150daa8 R11: ffffffff8a86d547 R12: ffffc90001921000
R13: ffff88809639d5e8 R14: ffff8880a0589800 R15: ffff8880ae834d40
bpf_prog_free_deferred+0x27a/0x350 kernel/bpf/core.c:1982
process_one_work+0x989/0x1790 kernel/workqueue.c:2269
worker_thread+0x98/0xe40 kernel/workqueue.c:2415
kthread+0x354/0x420 kernel/kthread.c:255
ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Kernel Offset: disabled
Rebooting in 86400 seconds..


Tested on:

commit: b9321614 bpf: fix precision bit propagation for BPF_ST ins..
git tree: https://github.com/anakryiko/linux bpf-fix-precise-bpf_st
console output: https://syzkaller.appspot.com/x/log.txt?x=112f0dfda00000
kernel config: https://syzkaller.appspot.com/x/.config?x=6bb3e6e7997c14f9

Andrii Nakryiko

Jul 9, 2019, 2:55:29 PM
to syzbot, aaron....@intel.com, Alexei Starovoitov, bpf, Daniel Borkmann, David S. Miller, ha...@kernel.org, intel-w...@lists.osuosl.org, Jakub Kicinski, jeffrey....@intel.com, john fastabend, Martin Lau, open list, Networking, sasha....@intel.com, Song Liu, syzkall...@googlegroups.com, xdp-n...@vger.kernel.org, Yonghong Song
The original reproducer is almost identical to the one that is fixed by
https://patchwork.ozlabs.org/patch/1129479/.

The bpf_prog_free_deferred bug that is nondeterministically exposed after
this fix seems to be the cause of a number of other bug reports and is
not related to verifier precision tracking.

#syz dup: WARNING in __mark_chain_precision

Andrii Nakryiko

Jul 16, 2019, 6:31:05 AM
to Hillf Danton, syzbot, aaron....@intel.com, Alexei Starovoitov, bpf, Daniel Borkmann, David S. Miller, ha...@kernel.org, intel-w...@lists.osuosl.org, Jakub Kicinski, jeffrey....@intel.com, john fastabend, Martin Lau, open list, Networking, sasha....@intel.com, Song Liu, syzkall...@googlegroups.com, xdp-n...@vger.kernel.org, Yonghong Song
On Tue, Jul 9, 2019 at 9:08 PM Hillf Danton <hda...@sina.com> wrote:
>
>
> Mon, 08 Jul 2019 21:25:00 -0700 (PDT)
> 1, currently, bpf_prog_put(), the put helper, deletes all kallsyms before
> invoking the free helper, bpf_prog_free(); the latter complains if it
> detects that not all kallsyms have been removed.
>
> 2, before commit 538950a1b752 ("soreuseport: setsockopt
> SO_ATTACH_REUSEPORT_[CE]BPF"), __bpf_prog_release() already called the put
> helper or the free one for a given prog depending on its type: put for
> BPF_PROG_TYPE_SOCKET_FILTER.
>
> In the commit we can see bpf_prog_destroy(), __bpf_prog_free(),
> bpf_prog_put(), here and there; note that __get_bpf() does the put for
> !BPF_PROG_TYPE_SOCKET_FILTER.
>
> 3, in commit 113214be7f6c ("bpf: refactor bpf_prog_get and type check into
> helper") bpf_prog_get_type() was added and put in __get_bpf().
>
> In the commit we can see other types like BPF_PROG_TYPE_SCHED_ACT and
> BPF_PROG_TYPE_SCHED_CLS.
>
> 4, in commit 8217ca653ec6 ("bpf: Enable BPF_PROG_TYPE_SK_REUSEPORT bpf prog
> in reuseport selection") sk_reuseport_prog_free() was added without a word
> in the log for it.
>
> 5, enrich prog type in __bpf_prog_release().

Thanks for the investigation!

I'm curious what makes BPF_PROG_TYPE_SOCKET_FILTER (and
BPF_PROG_TYPE_SK_REUSEPORT) special, compared to all the other types
of BPF programs. If it's something to do with sockets specifically,
there are a bunch of other programs that deal with sockets, so should
they be handled specially as well (e.g., BPF_PROG_TYPE_CGROUP_SOCK)?

Daniel, do you have any insight here?

>
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -1142,11 +1142,15 @@ static void bpf_release_orig_filter(stru
>
>  static void __bpf_prog_release(struct bpf_prog *prog)
>  {
> -	if (prog->type == BPF_PROG_TYPE_SOCKET_FILTER) {
> +	switch (prog->type) {
> +	case BPF_PROG_TYPE_SOCKET_FILTER:
> +	case BPF_PROG_TYPE_SK_REUSEPORT:
>  		bpf_prog_put(prog);
> -	} else {
> +		break;
> +	default:
>  		bpf_release_orig_filter(prog);
>  		bpf_prog_free(prog);
> +		break;
>  	}
>  }
>
> --
>
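
As a reading aid for the quoted analysis, here is a minimal user-space sketch of the put-versus-free dispatch it describes. It is not kernel code: every `model_` name is hypothetical, reference counting in the put path is ignored, and the `bpf_release_orig_filter()` step is left out. The only behavior it tries to capture is the claim in point 1 that the free helper complains when kallsyms are still registered, and that the proposed switch would send SK_REUSEPORT programs through the put path.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum model_prog_type {
    MODEL_SOCKET_FILTER,
    MODEL_SK_REUSEPORT,
    MODEL_OTHER,
};

struct model_prog {
    enum model_prog_type type;
    bool kallsyms_registered; /* stands in for the prog's kallsyms entry */
    bool warned;              /* set when free sees kallsyms still present */
    bool freed;
};

/* Models the free helper: complains if kallsyms were not removed first. */
static void model_prog_free(struct model_prog *prog)
{
    if (prog->kallsyms_registered)
        prog->warned = true;
    prog->freed = true;
}

/* Models the put helper: removes kallsyms, then frees (refcount ignored). */
static void model_prog_put(struct model_prog *prog)
{
    prog->kallsyms_registered = false;
    model_prog_free(prog);
}

/* Dispatch before the proposed patch: only socket filters use put. */
static void model_release_old(struct model_prog *prog)
{
    if (prog->type == MODEL_SOCKET_FILTER)
        model_prog_put(prog);
    else
        model_prog_free(prog);
}

/* Dispatch after the proposed patch: SK_REUSEPORT also uses put. */
static void model_release_new(struct model_prog *prog)
{
    switch (prog->type) {
    case MODEL_SOCKET_FILTER:
    case MODEL_SK_REUSEPORT:
        model_prog_put(prog);
        break;
    default:
        model_prog_free(prog);
        break;
    }
}
```

In this model, releasing a `MODEL_SK_REUSEPORT` program that still has `kallsyms_registered` set through `model_release_old()` sets `warned`, while `model_release_new()` does not. Whether the same reasoning holds for other program types in the real kernel is exactly the open question raised above.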