WARNING in ib_unregister_device_queued

syzbot

Apr 26, 2020, 9:43:14 AM
to dled...@redhat.com, j...@ziepe.ca, kamal...@gmail.com, le...@kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, net...@vger.kernel.org, pa...@mellanox.com, syzkall...@googlegroups.com
Hello,

syzbot found the following crash on:

HEAD commit: b9663b7c net: stmmac: Enable SERDES power up/down sequence
git tree: net
console output: https://syzkaller.appspot.com/x/log.txt?x=166bf717e00000
kernel config: https://syzkaller.appspot.com/x/.config?x=5d351a1019ed81a2
dashboard link: https://syzkaller.appspot.com/bug?extid=4088ed905e4ae2b0e13b
compiler: gcc (GCC) 9.0.0 20181231 (experimental)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+4088ed...@syzkaller.appspotmail.com

rdma_rxe: ignoring netdev event = 10 for netdevsim0
infiniband yz2: set down
------------[ cut here ]------------
WARNING: CPU: 0 PID: 22753 at drivers/infiniband/core/device.c:1565 ib_unregister_device_queued+0x122/0x160 drivers/infiniband/core/device.c:1565
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 22753 Comm: syz-executor.5 Not tainted 5.7.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x188/0x20d lib/dump_stack.c:118
panic+0x2e3/0x75c kernel/panic.c:221
__warn.cold+0x2f/0x35 kernel/panic.c:582
report_bug+0x27b/0x2f0 lib/bug.c:195
fixup_bug arch/x86/kernel/traps.c:175 [inline]
fixup_bug arch/x86/kernel/traps.c:170 [inline]
do_error_trap+0x12b/0x220 arch/x86/kernel/traps.c:267
do_invalid_op+0x32/0x40 arch/x86/kernel/traps.c:286
invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1027
RIP: 0010:ib_unregister_device_queued+0x122/0x160 drivers/infiniband/core/device.c:1565
Code: fb e8 72 e2 d4 fb 48 89 ef e8 2a c3 c1 fe 48 83 c4 08 5b 5d e9 5f e2 d4 fb e8 5a e2 d4 fb 0f 0b e9 46 ff ff ff e8 4e e2 d4 fb <0f> 0b e9 6f ff ff ff 48 89 ef e8 2f a9 12 fc e9 16 ff ff ff 48 c7
RSP: 0018:ffffc900072ef290 EFLAGS: 00010246
RAX: 0000000000040000 RBX: ffff8880a6a24000 RCX: ffffc90013201000
RDX: 0000000000040000 RSI: ffffffff859e51b2 RDI: ffff8880a6a24310
RBP: 0000000000000019 R08: ffff88808d21c280 R09: ffffed1014d449bb
R10: ffff8880a6a24dd3 R11: ffffed1014d449ba R12: 0000000000000006
R13: ffff88805988c000 R14: 0000000000000000 R15: ffffffff8a44f8c0
rxe_notify+0x77/0xd0 drivers/infiniband/sw/rxe/rxe_net.c:605
notifier_call_chain+0xc0/0x230 kernel/notifier.c:83
call_netdevice_notifiers_info net/core/dev.c:1948 [inline]
call_netdevice_notifiers_info+0xb5/0x130 net/core/dev.c:1933
call_netdevice_notifiers_extack net/core/dev.c:1960 [inline]
call_netdevice_notifiers net/core/dev.c:1974 [inline]
rollback_registered_many+0x75c/0xe70 net/core/dev.c:8828
rollback_registered+0xf2/0x1c0 net/core/dev.c:8873
unregister_netdevice_queue net/core/dev.c:9969 [inline]
unregister_netdevice_queue+0x1d7/0x2b0 net/core/dev.c:9962
unregister_netdevice include/linux/netdevice.h:2725 [inline]
nsim_destroy+0x35/0x60 drivers/net/netdevsim/netdev.c:330
__nsim_dev_port_del+0x144/0x1e0 drivers/net/netdevsim/dev.c:934
nsim_dev_port_del_all+0x86/0xe0 drivers/net/netdevsim/dev.c:947
nsim_dev_reload_destroy+0x77/0x110 drivers/net/netdevsim/dev.c:1123
nsim_dev_reload_down+0x6e/0xd0 drivers/net/netdevsim/dev.c:703
devlink_reload+0xbd/0x3b0 net/core/devlink.c:2797
devlink_nl_cmd_reload+0x2f7/0x7c0 net/core/devlink.c:2832
genl_family_rcv_msg_doit net/netlink/genetlink.c:673 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:718 [inline]
genl_rcv_msg+0x627/0xdf0 net/netlink/genetlink.c:735
netlink_rcv_skb+0x15a/0x410 net/netlink/af_netlink.c:2469
genl_rcv+0x24/0x40 net/netlink/genetlink.c:746
netlink_unicast_kernel net/netlink/af_netlink.c:1303 [inline]
netlink_unicast+0x537/0x740 net/netlink/af_netlink.c:1329
netlink_sendmsg+0x882/0xe10 net/netlink/af_netlink.c:1918
sock_sendmsg_nosec net/socket.c:652 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:672
____sys_sendmsg+0x6bf/0x7e0 net/socket.c:2362
___sys_sendmsg+0x100/0x170 net/socket.c:2416
__sys_sendmsg+0xec/0x1b0 net/socket.c:2449
do_syscall_64+0xf6/0x7d0 arch/x86/entry/common.c:295
entry_SYSCALL_64_after_hwframe+0x49/0xb3
RIP: 0033:0x45c829
Code: 0d b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 db b6 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007f6fae1cac78 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00000000004fcc00 RCX: 000000000045c829
RDX: 0000000000000000 RSI: 0000000020000800 RDI: 0000000000000006
RBP: 000000000078c040 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000ffffffff
R13: 0000000000000905 R14: 00000000004cbaab R15: 00007f6fae1cb6d4
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Jason Gunthorpe

Apr 27, 2020, 8:10:36 PM
to syzbot, dled...@redhat.com, kamal...@gmail.com, le...@kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, net...@vger.kernel.org, pa...@mellanox.com, syzkall...@googlegroups.com
On Sun, Apr 26, 2020 at 06:43:13AM -0700, syzbot wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit: b9663b7c net: stmmac: Enable SERDES power up/down sequence
> git tree: net
> console output: https://syzkaller.appspot.com/x/log.txt?x=166bf717e00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=5d351a1019ed81a2
> dashboard link: https://syzkaller.appspot.com/bug?extid=4088ed905e4ae2b0e13b
> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
>
> Unfortunately, I don't have any reproducer for this crash yet.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+4088ed...@syzkaller.appspotmail.com
>
> rdma_rxe: ignoring netdev event = 10 for netdevsim0
> infiniband yz2: set down
> WARNING: CPU: 0 PID: 22753 at drivers/infiniband/core/device.c:1565 ib_unregister_device_queued+0x122/0x160 drivers/infiniband/core/device.c:1565

The only thing I can think of for this is that ib_register_device()
is racing with __ib_unregister_device() and took the special error
unwind.

I suspect this is not a bug, but over-complexity triggering a
precondition WARN_ON.

Maybe the solution is to swap the dealloc_driver = NULL for some other flag.
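
To make the hypothesis above concrete, here is a minimal user-space sketch in C. It is not kernel code. It assumes the warning at device.c:1565 is a precondition check that ops.dealloc_driver is non-NULL (the thread does not quote that line), and it simulates the race by invoking the "concurrent" path from inside the error-unwind window. All names (model_device, unwind_in_progress, unwind_with_null, unwind_with_flag, and so on) are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_device;
typedef void (*dealloc_fn_t)(struct model_device *);

/* Hypothetical stand-in for the relevant parts of struct ib_device. */
struct model_device {
	dealloc_fn_t dealloc_driver; /* driver destructor, doubles as a flag */
	bool unwind_in_progress;     /* hypothetical explicit flag */
	int warnings;                /* counts simulated WARN_ON hits */
};

/* Stand-in for a driver-provided destructor. */
static void driver_dealloc(struct model_device *dev)
{
	(void)dev;
}

/* Assumed precondition: queued unregistration needs a destructor. */
static void model_unregister_queued(struct model_device *dev)
{
	if (dev->dealloc_driver == NULL)
		dev->warnings++;
}

/* Scheme under suspicion: park the pointer as NULL during unwind.
 * 'concurrent' stands in for a notifier racing with the unwind. */
static void unwind_with_null(struct model_device *dev,
			     void (*concurrent)(struct model_device *))
{
	dealloc_fn_t saved = dev->dealloc_driver;

	dev->dealloc_driver = NULL;
	concurrent(dev);
	dev->dealloc_driver = saved;
}

/* Alternative: leave the pointer alone and record the unwind elsewhere. */
static void unwind_with_flag(struct model_device *dev,
			     void (*concurrent)(struct model_device *))
{
	dev->unwind_in_progress = true;
	concurrent(dev);
	dev->unwind_in_progress = false;
}
```

In this model, passing model_unregister_queued as the concurrent callback trips the simulated warning once under unwind_with_null and never under unwind_with_flag, because the pointer is no longer overloaded to mean "unwind in progress".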

Jason

Hillf Danton

Apr 28, 2020, 12:20:17 AM
to syzbot, dled...@redhat.com, j...@ziepe.ca, kamal...@gmail.com, le...@kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, net...@vger.kernel.org, pa...@mellanox.com, syzkall...@googlegroups.com

Sun, 26 Apr 2020 06:43:13 -0700
Quiesce the warning by adding a dummy destruct function and using it in
the error path that is supposedly related to triggering the warning.

--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -1324,6 +1324,10 @@ out:
 	return ret;
 }
 
+static void ib_dummy_dealloc_fn(struct ib_device *ib_dev)
+{
+}
+
 /**
  * ib_register_device - Register an IB device with IB core
  * @device: Device to register
@@ -1393,11 +1397,10 @@ int ib_register_device(struct ib_device
 		 * possibility for a parallel unregistration along with this
 		 * error flow. Since we have a refcount here we know any
 		 * parallel flow is stopped in disable_device and will see the
-		 * NULL pointers, causing the responsibility to
-		 * ib_dealloc_device() to revert back to this thread.
+		 * updated pointer and leave things to our care.
 		 */
 		dealloc_fn = device->ops.dealloc_driver;
-		device->ops.dealloc_driver = NULL;
+		device->ops.dealloc_driver = ib_dummy_dealloc_fn;
 		ib_device_put(device);
 		__ib_unregister_device(device);
 		device->ops.dealloc_driver = dealloc_fn;
@@ -1445,7 +1448,8 @@ static void __ib_unregister_device(struc
 	 * Drivers using the new flow may not call ib_dealloc_device except
 	 * in error unwind prior to registration success.
 	 */
-	if (ib_dev->ops.dealloc_driver) {
+	if (ib_dev->ops.dealloc_driver &&
+	    ib_dev->ops.dealloc_driver != ib_dummy_dealloc_fn) {
 		WARN_ON(kref_read(&ib_dev->dev.kobj.kref) <= 1);
 		ib_dealloc_device(ib_dev);
 	}
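
The effect of the sentinel in the patch above can be sketched as a small user-space C model. It is a simplification, not the kernel implementation: the toy_* names are hypothetical, and the model calls the destructor directly where the kernel calls ib_dealloc_device(). The point is only that the pointer now distinguishes three states (no destructor, parked during error unwind, real destructor) while staying non-NULL for any check that merely tests for its presence.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_device;
typedef void (*toy_dealloc_fn)(struct toy_device *);

/* Hypothetical stand-in for the relevant parts of struct ib_device. */
struct toy_device {
	toy_dealloc_fn dealloc_driver;
	int freed; /* how many times the real destructor ran */
};

/* Sentinel: does nothing, mirrors ib_dummy_dealloc_fn in the patch. */
static void toy_dummy_dealloc(struct toy_device *dev)
{
	(void)dev;
}

/* Stand-in for a real driver destructor. */
static void toy_real_dealloc(struct toy_device *dev)
{
	dev->freed++;
}

/* Presence check, as a non-NULL precondition test would see it. */
static bool toy_has_destructor(const struct toy_device *dev)
{
	return dev->dealloc_driver != NULL;
}

/* Mirrors the patched condition in __ib_unregister_device(). */
static bool toy_unregister_should_free(const struct toy_device *dev)
{
	return dev->dealloc_driver != NULL &&
	       dev->dealloc_driver != toy_dummy_dealloc;
}

/* Simplified unregister: free only when a real destructor is set. */
static void toy_unregister(struct toy_device *dev)
{
	if (toy_unregister_should_free(dev))
		dev->dealloc_driver(dev);
}
```

With the sentinel installed, toy_has_destructor() still reports true while toy_unregister() skips the free, which is the combination the patch relies on during the error unwind.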

Jason Gunthorpe

Apr 28, 2020, 8:31:05 AM
to Hillf Danton, syzbot, dled...@redhat.com, kamal...@gmail.com, le...@kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, net...@vger.kernel.org, pa...@mellanox.com, syzkall...@googlegroups.com
On Tue, Apr 28, 2020 at 12:19:55PM +0800, Hillf Danton wrote:
>
> Sun, 26 Apr 2020 06:43:13 -0700
> > syzbot found the following crash on:
> >
> > HEAD commit: b9663b7c net: stmmac: Enable SERDES power up/down sequence
> > git tree: net
> > console output: https://syzkaller.appspot.com/x/log.txt?x=166bf717e00000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=5d351a1019ed81a2
> > dashboard link: https://syzkaller.appspot.com/bug?extid=4088ed905e4ae2b0e13b
> > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+4088ed...@syzkaller.appspotmail.com
> >
> > rdma_rxe: ignoring netdev event = 10 for netdevsim0
> > infiniband yz2: set down

Yeah, something like that might work if this is the source of the bug.

Thanks,
Jason