general protection fault in qdisc_put

24 views
Skip to first unread message

syzbot

unread,
Sep 8, 2019, 2:08:10 AM9/8/19
to akinob...@gmail.com, ak...@linux-foundation.org, da...@davemloft.net, dvy...@google.com, j...@mojatatu.com, ji...@resnulli.us, linux-...@vger.kernel.org, mho...@kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com, torv...@linux-foundation.org, xiyou.w...@gmail.com
Hello,

syzbot found the following crash on:

HEAD commit: 3b47fd5c Merge tag 'nfs-for-5.3-4' of git://git.linux-nfs...
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10244dd6600000
kernel config: https://syzkaller.appspot.com/x/.config?x=b89bb446a3faaba4
dashboard link: https://syzkaller.appspot.com/bug?extid=d5870a903591faaca4ae
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=174743fe600000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11f8c43e600000

The bug was bisected to:

commit e41d58185f1444368873d4d7422f7664a68be61d
Author: Dmitry Vyukov <dvy...@google.com>
Date: Wed Jul 12 21:34:35 2017 +0000

fault-inject: support systematic fault injection

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=13f66bc6600000
final crash: https://syzkaller.appspot.com/x/report.txt?x=100e6bc6600000
console output: https://syzkaller.appspot.com/x/log.txt?x=17f66bc6600000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+d5870a...@syzkaller.appspotmail.com
Fixes: e41d58185f14 ("fault-inject: support systematic fault injection")

RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000001bbbbbb
R10: 0000000000000000 R11: 0000000000000246 R12: ffffffffffffffff
R13: 0000000000000005 R14: 0000000000000000 R15: 0000000000000000
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 9699 Comm: syz-executor169 Not tainted 5.3.0-rc7+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
RIP: 0010:qdisc_put+0x25/0x90 net/sched/sch_generic.c:983
Code: 00 00 00 00 00 55 48 89 e5 41 54 49 89 fc 53 e8 c1 52 bf fb 49 8d 7c
24 10 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84
c0 74 04 3c 03 7e 54 41 8b 5c 24 10 31 ff 83 e3 01
RSP: 0018:ffff8880944c7488 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff8880945c8540 RCX: ffffffff85b49e8a
RDX: 0000000000000002 RSI: ffffffff85b3228f RDI: 0000000000000010
RBP: ffff8880944c7498 R08: ffff888099d50480 R09: ffffed1012898e45
R10: ffffed1012898e44 R11: 0000000000000003 R12: 0000000000000000
R13: ffff8880945c8540 R14: ffff888094894500 R15: ffff8880945c857c
FS: 0000555557553880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000610 CR3: 000000008c29d000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
sfb_destroy+0x61/0x80 net/sched/sch_sfb.c:468
qdisc_create+0xbc6/0x1210 net/sched/sch_api.c:1285
tc_modify_qdisc+0x524/0x1c50 net/sched/sch_api.c:1652
rtnetlink_rcv_msg+0x463/0xb00 net/core/rtnetlink.c:5223
netlink_rcv_skb+0x177/0x450 net/netlink/af_netlink.c:2477
rtnetlink_rcv+0x1d/0x30 net/core/rtnetlink.c:5241
netlink_unicast_kernel net/netlink/af_netlink.c:1302 [inline]
netlink_unicast+0x531/0x710 net/netlink/af_netlink.c:1328
netlink_sendmsg+0x8a5/0xd60 net/netlink/af_netlink.c:1917
sock_sendmsg_nosec net/socket.c:637 [inline]
sock_sendmsg+0xd7/0x130 net/socket.c:657
___sys_sendmsg+0x803/0x920 net/socket.c:2311
__sys_sendmsg+0x105/0x1d0 net/socket.c:2356
__do_sys_sendmsg net/socket.c:2365 [inline]
__se_sys_sendmsg net/socket.c:2363 [inline]
__x64_sys_sendmsg+0x78/0xb0 net/socket.c:2363
do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x4424f9
Code: e8 9c 07 03 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff
ff 0f 83 3b 0a fc ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fffed10bed8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00000000004424f9
RDX: 0000000000000000 RSI: 0000000020000240 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000001bbbbbb
R10: 0000000000000000 R11: 0000000000000246 R12: ffffffffffffffff
R13: 0000000000000005 R14: 0000000000000000 R15: 0000000000000000
Modules linked in:
---[ end trace 97e52c48ae7a3cc1 ]---
RIP: 0010:qdisc_put+0x25/0x90 net/sched/sch_generic.c:983
Code: 00 00 00 00 00 55 48 89 e5 41 54 49 89 fc 53 e8 c1 52 bf fb 49 8d 7c
24 10 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84
c0 74 04 3c 03 7e 54 41 8b 5c 24 10 31 ff 83 e3 01
RSP: 0018:ffff8880944c7488 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff8880945c8540 RCX: ffffffff85b49e8a
RDX: 0000000000000002 RSI: ffffffff85b3228f RDI: 0000000000000010
RBP: ffff8880944c7498 R08: ffff888099d50480 R09: ffffed1012898e45
R10: ffffed1012898e44 R11: 0000000000000003 R12: 0000000000000000
R13: ffff8880945c8540 R14: ffff888094894500 R15: ffff8880945c857c
FS: 0000555557553880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000610 CR3: 000000008c29d000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

Linus Torvalds

unread,
Sep 8, 2019, 1:19:01 PM9/8/19
to syzbot, akinob...@gmail.com, Andrew Morton, David Miller, Dmitry Vyukov, j...@mojatatu.com, ji...@resnulli.us, Linux List Kernel Mailing, Michal Hocko, Netdev, syzkaller-bugs, Cong Wang
On Sat, Sep 7, 2019 at 11:08 PM syzbot
<syzbot+d5870a...@syzkaller.appspotmail.com> wrote:
>
> The bug was bisected to:
>
> commit e41d58185f1444368873d4d7422f7664a68be61d
> Author: Dmitry Vyukov <dvy...@google.com>
> Date: Wed Jul 12 21:34:35 2017 +0000
>
> fault-inject: support systematic fault injection

That commit does seem a bit questionable, but not the cause of this
problem (just the trigger).

I think the questionable part is that the new code doesn't honor the
task filtering, and will fail even for protected tasks. Dmitry?

> kasan: GPF could be caused by NULL-ptr deref or user memory access
> general protection fault: 0000 [#1] PREEMPT SMP KASAN
> CPU: 1 PID: 9699 Comm: syz-executor169 Not tainted 5.3.0-rc7+ #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> RIP: 0010:qdisc_put+0x25/0x90 net/sched/sch_generic.c:983

Yes, looks like 'qdisc' is NULL.

This is the

qdisc_put(q->qdisc);

in sfb_destroy(), called from qdisc_create().

I think what is happening is this (in qdisc_create()):

if (ops->init) {
err = ops->init(sch, tca[TCA_OPTIONS], extack);
if (err != 0)
goto err_out5;
}
...
err_out5:
/* ops->init() failed, we call ->destroy() like qdisc_create_dflt() */
if (ops->destroy)
ops->destroy(sch);

and "ops->init" is sfb_init(), which will not initialize q->qdisc if
tcf_block_get() fails.

I see two solutions:

(a) move the

q->qdisc = &noop_qdisc;

up earlier in sfb_init(), so that qdisc is always initialized
after sfb_init(), even on failure.

(b) just make qdisc_put(NULL) just silently work as a no-op.

(c) change all the semantics to not call ->destroy if ->init failed.

Honestly, (a) seems very fragile - do all the other init routines do
this? And (c) sounds like a big change, and very fragile too.

So I'd suggest that qdisc_put() be made to just ignore a NULL pointer
(and maybe an error pointer too?).

But I'll leave it to the maintainers to sort out the proper fix.
Maybe people prefer (a)?

Linus

Dmitry Vyukov

unread,
Sep 9, 2019, 2:46:06 AM9/9/19
to Linus Torvalds, syzbot, Akinobu Mita, Andrew Morton, David Miller, Jamal Hadi Salim, Jiří Pírko, Linux List Kernel Mailing, Michal Hocko, Netdev, syzkaller-bugs, Cong Wang
On Sun, Sep 8, 2019 at 6:19 PM Linus Torvalds
<torv...@linux-foundation.org> wrote:
>
> On Sat, Sep 7, 2019 at 11:08 PM syzbot
> <syzbot+d5870a...@syzkaller.appspotmail.com> wrote:
> >
> > The bug was bisected to:
> >
> > commit e41d58185f1444368873d4d7422f7664a68be61d
> > Author: Dmitry Vyukov <dvy...@google.com>
> > Date: Wed Jul 12 21:34:35 2017 +0000
> >
> > fault-inject: support systematic fault injection
>
> That commit does seem a bit questionable, but not the cause of this
> problem (just the trigger).
>
> I think the questionable part is that the new code doesn't honor the
> task filtering, and will fail even for protected tasks. Dmitry?

That commit added a new fault injection mode with a new API that is
used by syzkaller to inject faults. Before that commit the fault
inject is not working for syzkaller at all. I think this bisection
result simply means "the GPF is related to an earlier failure".

Cong Wang

unread,
Sep 9, 2019, 7:14:49 PM9/9/19
to Linus Torvalds, syzbot, Akinobu Mita, Andrew Morton, David Miller, Dmitry Vyukov, Jamal Hadi Salim, Jiri Pirko, Linux List Kernel Mailing, Michal Hocko, Netdev, syzkaller-bugs
On Sun, Sep 8, 2019 at 10:19 AM Linus Torvalds
<torv...@linux-foundation.org> wrote:
> I see two solutions:
>
> (a) move the
>
> q->qdisc = &noop_qdisc;
>
> up earlier in sfb_init(), so that qdisc is always initialized
> after sfb_init(), even on failure.
>
> (b) just make qdisc_put(NULL) just silently work as a no-op.
>
> (c) change all the semantics to not call ->destroy if ->init failed.
>
> Honestly, (a) seems very fragile - do all the other init routines do
> this? And (c) sounds like a big change, and very fragile too.
>
> So I'd suggest that qdisc_put() be made to just ignore a NULL pointer
> (and maybe an error pointer too?).

I think (a) is the best solution here.

(c) changes too much, we already rely on this behavior.

(b) is not bad either, just very slightly more risky.

Alternatively, we can add a quick NULL check inside
sfb_destroy().

I can send out a patch if you don't.

Thanks for looking at this!
Reply all
Reply to author
Forward
0 new messages