WARNING in geneve_exit_batch_net (2)

syzbot

Feb 29, 2020, 4:01:12 AM
to da...@davemloft.net, jb...@redhat.com, linux-...@vger.kernel.org, mo...@mellanox.com, net...@vger.kernel.org, s...@queasysnail.net, syzkall...@googlegroups.com, tg...@linutronix.de
Hello,

syzbot found the following crash on:

HEAD commit: f8788d86 Linux 5.6-rc3
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=138dd22de00000
kernel config: https://syzkaller.appspot.com/x/.config?x=5d2e033af114153f
dashboard link: https://syzkaller.appspot.com/bug?extid=68a8ed58e3d17c700de5
compiler: clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16601d31e00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=14fdf8f9e00000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+68a8ed...@syzkaller.appspotmail.com

------------[ cut here ]------------
WARNING: CPU: 0 PID: 304 at drivers/net/geneve.c:1849 geneve_destroy_tunnels drivers/net/geneve.c:1849 [inline]
WARNING: CPU: 0 PID: 304 at drivers/net/geneve.c:1849 geneve_exit_batch_net+0x2b1/0x300 drivers/net/geneve.c:1859
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 304 Comm: kworker/u4:4 Not tainted 5.6.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1fb/0x318 lib/dump_stack.c:118
panic+0x264/0x7a9 kernel/panic.c:221
__warn+0x209/0x210 kernel/panic.c:582
report_bug+0x1b6/0x2f0 lib/bug.c:195
fixup_bug arch/x86/kernel/traps.c:174 [inline]
do_error_trap+0xcf/0x1c0 arch/x86/kernel/traps.c:267
do_invalid_op+0x36/0x40 arch/x86/kernel/traps.c:286
invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
RIP: 0010:geneve_destroy_tunnels drivers/net/geneve.c:1849 [inline]
RIP: 0010:geneve_exit_batch_net+0x2b1/0x300 drivers/net/geneve.c:1859
Code: 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 46 aa d1 fc 48 8b 1b 4c 39 fb 74 13 e8 c9 80 94 fc e9 f4 fd ff ff e8 bf 80 94 fc <0f> 0b eb cf e8 b6 80 94 fc eb 05 e8 af 80 94 fc 48 8d 7d c0 e8 c6
RSP: 0018:ffffc90001917c08 EFLAGS: 00010293
RAX: ffffffff84e288c1 RBX: ffff8880a7bc6120 RCX: ffff8880a88304c0
RDX: 0000000000000000 RSI: ffffc90001917c28 RDI: ffff8880a47da068
RBP: ffffc90001917c68 R08: ffffffff866be459 R09: fffffbfff12b21a9
R10: fffffbfff12b21a9 R11: 0000000000000000 R12: ffffc90001917c28
R13: dffffc0000000000 R14: ffff8880a1ca0dd0 R15: ffffc90001917c98
ops_exit_list net/core/net_namespace.c:175 [inline]
cleanup_net+0x78b/0xb80 net/core/net_namespace.c:589
process_one_work+0x7f5/0x10f0 kernel/workqueue.c:2264
worker_thread+0xbbc/0x1630 kernel/workqueue.c:2410
kthread+0x332/0x350 kernel/kthread.c:255
ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches

Hillf Danton

Feb 29, 2020, 8:52:34 AM
to syzbot, da...@davemloft.net, jb...@redhat.com, linux-...@vger.kernel.org, mo...@mellanox.com, net...@vger.kernel.org, s...@queasysnail.net, syzkall...@googlegroups.com, tg...@linutronix.de

On Sat, 29 Feb 2020 01:01:11 -0800
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -1846,6 +1846,9 @@ static void geneve_destroy_tunnels(struc
 			unregister_netdevice_queue(geneve->dev, head);
 	}

+	/* unregister the devices gathered */
+	unregister_netdevice_many(head);
+
 	WARN_ON_ONCE(!list_empty(&gn->sock_list));
 }


syzbot

Mar 12, 2020, 3:12:03 AM
to core...@netfilter.org, da...@davemloft.net, f...@strlen.de, hda...@sina.com, jb...@redhat.com, kad...@blackhole.kfki.hu, linux-...@vger.kernel.org, mo...@mellanox.com, net...@vger.kernel.org, netfilt...@vger.kernel.org, pa...@netfilter.org, s...@queasysnail.net, syzkall...@googlegroups.com, tg...@linutronix.de
syzbot has bisected this bug to:

commit 4e645b47c4f000a503b9c90163ad905786b9bc1d
Author: Florian Westphal <f...@strlen.de>
Date: Thu Nov 30 23:21:02 2017 +0000

netfilter: core: make nf_unregister_net_hooks simple wrapper again

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1594fbfde00000
start commit: 63623fd4 Merge tag 'for-linus' of git://git.kernel.org/pub..
git tree: upstream
final crash: https://syzkaller.appspot.com/x/report.txt?x=1794fbfde00000
console output: https://syzkaller.appspot.com/x/log.txt?x=1394fbfde00000
kernel config: https://syzkaller.appspot.com/x/.config?x=9833e26bab355358
dashboard link: https://syzkaller.appspot.com/bug?extid=68a8ed58e3d17c700de5
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14f08165e00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17902329e00000

Reported-by: syzbot+68a8ed...@syzkaller.appspotmail.com
Fixes: 4e645b47c4f0 ("netfilter: core: make nf_unregister_net_hooks simple wrapper again")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Florian Westphal

Mar 12, 2020, 9:45:05 PM
to syzbot, core...@netfilter.org, da...@davemloft.net, f...@strlen.de, hda...@sina.com, jb...@redhat.com, kad...@blackhole.kfki.hu, linux-...@vger.kernel.org, mo...@mellanox.com, net...@vger.kernel.org, netfilt...@vger.kernel.org, pa...@netfilter.org, s...@queasysnail.net, syzkall...@googlegroups.com, tg...@linutronix.de
syzbot <syzbot+68a8ed...@syzkaller.appspotmail.com> wrote:
> syzbot has bisected this bug to:
>
> commit 4e645b47c4f000a503b9c90163ad905786b9bc1d
> Author: Florian Westphal <f...@strlen.de>
> Date: Thu Nov 30 23:21:02 2017 +0000
>
> netfilter: core: make nf_unregister_net_hooks simple wrapper again

No idea why this turns up, the reproducer doesn't hit any of these code
paths.

The debug splat is a false-positive; ndo_stop/list_del hasn't run yet.
I will send a fix for net tree.

Dmitry Vyukov

Mar 13, 2020, 3:31:10 AM
to Florian Westphal, syzbot, core...@netfilter.org, David Miller, Hillf Danton, Jiri Benc, Jozsef Kadlecsik, LKML, mo...@mellanox.com, netdev, NetFilter, Pablo Neira Ayuso, Sabrina Dubroca, syzkaller-bugs, Thomas Gleixner
On Fri, Mar 13, 2020 at 2:45 AM Florian Westphal <f...@strlen.de> wrote:
>
> syzbot <syzbot+68a8ed...@syzkaller.appspotmail.com> wrote:
> > syzbot has bisected this bug to:
> >
> > commit 4e645b47c4f000a503b9c90163ad905786b9bc1d
> > Author: Florian Westphal <f...@strlen.de>
> > Date: Thu Nov 30 23:21:02 2017 +0000
> >
> > netfilter: core: make nf_unregister_net_hooks simple wrapper again
>
> No idea why this turns up, the reproducer doesn't hit any of these code
> paths.

The attached bisection log usually makes this reasonably transparent.
It seems that in this case another kernel bug gets in the way of bisection.

> The debug splat is a false-positive; ndo_stop/list_del hasn't run yet.
> I will send a fix for net tree.

syzbot

Mar 13, 2020, 10:57:04 PM
to f...@strlen.de, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger crash:

Reported-and-tested-by: syzbot+68a8ed...@syzkaller.appspotmail.com

Tested on:

commit: aedf4c9e geneve: move debug check after netdev unregister
git tree: git://git.breakpoint.cc/fw/net.git geneve_fp
kernel config: https://syzkaller.appspot.com/x/.config?x=8e8e51c36c1e1ca7
dashboard link: https://syzkaller.appspot.com/bug?extid=68a8ed58e3d17c700de5
compiler: gcc (GCC) 9.0.0 20181231 (experimental)

Note: testing is done by a robot and is best-effort only.

Florian Westphal

Mar 14, 2020, 3:20:05 AM
to net...@vger.kernel.org, syzkall...@googlegroups.com, Florian Westphal, syzbot+68a8ed...@syzkaller.appspotmail.com, Haishuang Yan
The debug check must be done after unregister_netdevice_many() call --
the list_del() for this is done inside .ndo_stop.

Fixes: 2843a25348f8 ("geneve: speedup geneve tunnels dismantle")
Reported-and-tested-by: <syzbot+68a8ed...@syzkaller.appspotmail.com>
Cc: Haishuang Yan <yanhai...@cmss.chinamobile.com>
Signed-off-by: Florian Westphal <f...@strlen.de>
---
drivers/net/geneve.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index 75757e9954ba..09f279c0182b 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -1845,8 +1845,6 @@ static void geneve_destroy_tunnels(struct net *net, struct list_head *head)
 		if (!net_eq(dev_net(geneve->dev), net))
 			unregister_netdevice_queue(geneve->dev, head);
 	}
-
-	WARN_ON_ONCE(!list_empty(&gn->sock_list));
 }

 static void __net_exit geneve_exit_batch_net(struct list_head *net_list)
@@ -1861,6 +1859,12 @@ static void __net_exit geneve_exit_batch_net(struct list_head *net_list)
 	/* unregister the devices gathered above */
 	unregister_netdevice_many(&list);
 	rtnl_unlock();
+
+	list_for_each_entry(net, net_list, exit_list) {
+		const struct geneve_net *gn = net_generic(net, geneve_net_id);
+
+		WARN_ON_ONCE(!list_empty(&gn->sock_list));
+	}
 }

 static struct pernet_operations geneve_net_ops = {
--
2.24.1
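
Editorial illustration (not part of the patch): the ordering problem described in the commit message above can be modelled in a few lines of userspace C. This is only a rough sketch under stated assumptions. Every model_* name is hypothetical and none of it is kernel code. It assumes, as the commit message says, that the socket list entry is removed only when the stop handler runs during the batched unregister, and that queueing a device for unregistration tears nothing down by itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct model_dev {
	bool queued;    /* marked for unregistration, nothing torn down yet */
	bool sock_open; /* stands in for an entry on gn->sock_list */
};

struct model_ns {
	struct model_dev devs[4];
	size_t ndevs;
};

/* Models unregister_netdevice_queue(): only records the request. */
static void model_queue_unregister(struct model_dev *dev)
{
	dev->queued = true;
}

/* Models unregister_netdevice_many(): the stop handler of each queued
 * device runs here, and that is where the socket entry goes away.
 */
static void model_unregister_many(struct model_ns *ns)
{
	for (size_t i = 0; i < ns->ndevs; i++) {
		struct model_dev *dev = &ns->devs[i];

		if (dev->queued) {
			dev->sock_open = false; /* models list_del() in .ndo_stop */
			dev->queued = false;
		}
	}
}

/* Models the condition !list_empty(&gn->sock_list). */
static bool model_sock_list_nonempty(const struct model_ns *ns)
{
	for (size_t i = 0; i < ns->ndevs; i++)
		if (ns->devs[i].sock_open)
			return true;
	return false;
}

/* Returns true if a debug check at the chosen position would warn. */
static bool model_check_warns(bool check_after_flush)
{
	struct model_ns ns = { .ndevs = 2 };
	bool warned;

	/* Assumption: both devices are up and hold a socket. */
	ns.devs[0].sock_open = true;
	ns.devs[1].sock_open = true;

	for (size_t i = 0; i < ns.ndevs; i++)
		model_queue_unregister(&ns.devs[i]);

	if (!check_after_flush) {
		warned = model_sock_list_nonempty(&ns); /* check before the flush */
		model_unregister_many(&ns);
	} else {
		model_unregister_many(&ns);
		warned = model_sock_list_nonempty(&ns); /* check after the flush */
	}
	return warned;
}
```

In this model, model_check_warns(false) corresponds to the old placement of the check (devices queued but not yet stopped) and reports a non-empty list, while model_check_warns(true) corresponds to the placement chosen by the patch and does not.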

Haishuang Yan

Mar 14, 2020, 11:14:41 PM
to Florian Westphal, net...@vger.kernel.org, syzkall...@googlegroups.com, syzbot+68a8ed...@syzkaller.appspotmail.com
LGTM, thanks for the fix.
Acked-by: Haishuang Yan <yanhai...@cmss.chinamobile.com>

David Miller

Mar 15, 2020, 3:43:42 AM
to f...@strlen.de, net...@vger.kernel.org, syzkall...@googlegroups.com, syzbot+68a8ed...@syzkaller.appspotmail.com, yanhai...@cmss.chinamobile.com
From: Florian Westphal <f...@strlen.de>
Date: Sat, 14 Mar 2020 08:18:42 +0100

> The debug check must be done after unregister_netdevice_many() call --
> the list_del() for this is done inside .ndo_stop.
>
> Fixes: 2843a25348f8 ("geneve: speedup geneve tunnels dismantle")
> Reported-and-tested-by: <syzbot+68a8ed...@syzkaller.appspotmail.com>
> Cc: Haishuang Yan <yanhai...@cmss.chinamobile.com>
> Signed-off-by: Florian Westphal <f...@strlen.de>

Applied and queued up for -stable, thanks Florian.