[syzbot] [rdma?] WARNING in ib_uverbs_release_dev


syzbot

Jun 19, 2024, 2:37:21 AM
to j...@ziepe.ca, le...@kernel.org, linux-...@vger.kernel.org, linux...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 2ccbdf43d5e7 Merge tag 'for-linus' of git://git.kernel.org..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=179e93fe980000
kernel config: https://syzkaller.appspot.com/x/.config?x=fa0ce06dcc735711
dashboard link: https://syzkaller.appspot.com/bug?extid=19ec7595e3aa1a45f623
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/27e64d7472ce/disk-2ccbdf43.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/e1c494bb5c9c/vmlinux-2ccbdf43.xz
kernel image: https://storage.googleapis.com/syzbot-assets/752498985a5e/bzImage-2ccbdf43.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+19ec75...@syzkaller.appspotmail.com

smc: removing ib device syz0
------------[ cut here ]------------
WARNING: CPU: 0 PID: 51 at kernel/rcu/srcutree.c:653 cleanup_srcu_struct+0x404/0x4d0 kernel/rcu/srcutree.c:653
Modules linked in:
CPU: 0 PID: 51 Comm: kworker/u8:3 Not tainted 6.10.0-rc3-syzkaller-00044-g2ccbdf43d5e7 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/07/2024
Workqueue: ib-unreg-wq ib_unregister_work
RIP: 0010:cleanup_srcu_struct+0x404/0x4d0 kernel/rcu/srcutree.c:653
Code: 12 80 00 48 c7 03 00 00 00 00 48 83 c4 48 5b 41 5c 41 5d 41 5e 41 5f 5d e9 14 67 34 0a 90 0f 0b 90 eb e7 90 0f 0b 90 eb e1 90 <0f> 0b 90 eb db 90 0f 0b 90 eb 0a 90 0f 0b 90 eb 04 90 0f 0b 90 48
RSP: 0018:ffffc90000bb7970 EFLAGS: 00010202
RAX: 0000000000000001 RBX: ffff88802a1bc980 RCX: 0000000000000002
RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffe8ffffd74c58
RBP: 0000000000000001 R08: ffffe8ffffd74c5f R09: 1ffffd1ffffae98b
R10: dffffc0000000000 R11: fffff91ffffae98c R12: dffffc0000000000
R13: ffff88802285b5f0 R14: ffff88802285b000 R15: ffff88802a1bc800
FS: 0000000000000000(0000) GS:ffff8880b9400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa3852cae10 CR3: 000000000e132000 CR4: 0000000000350ef0
Call Trace:
<TASK>
ib_uverbs_release_dev+0x4e/0x80 drivers/infiniband/core/uverbs_main.c:136
device_release+0x9b/0x1c0
kobject_cleanup lib/kobject.c:689 [inline]
kobject_release lib/kobject.c:720 [inline]
kref_put include/linux/kref.h:65 [inline]
kobject_put+0x231/0x480 lib/kobject.c:737
remove_client_context+0xb9/0x1e0 drivers/infiniband/core/device.c:776
disable_device+0x13b/0x360 drivers/infiniband/core/device.c:1282
__ib_unregister_device+0x6d/0x170 drivers/infiniband/core/device.c:1475
ib_unregister_work+0x19/0x30 drivers/infiniband/core/device.c:1586
process_one_work kernel/workqueue.c:3231 [inline]
process_scheduled_works+0xa2e/0x1830 kernel/workqueue.c:3312
worker_thread+0x86d/0xd70 kernel/workqueue.c:3393
kthread+0x2f2/0x390 kernel/kthread.c:389
ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Leon Romanovsky

Jun 19, 2024, 5:16:04 AM
to syzbot, j...@ziepe.ca, linux-...@vger.kernel.org, linux...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com
I see that this is caused by a call to ib_unregister_device_queued() in
response to the NETDEV_UNREGISTER event, but we don't flush anything
beforehand. How can we be sure that the ib_device is not used anymore?

Thanks

Zhu Yanjun

Jun 19, 2024, 10:16:34 AM
to Leon Romanovsky, syzbot, j...@ziepe.ca, linux-...@vger.kernel.org, linux...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com
Hi, Leon

This is the console output:

https://syzkaller.appspot.com/x/log.txt?x=179e93fe980000

From the above link, it seems that other devices or subsystems failed
first, which then caused this call trace to appear. Once another problem
has occurred, the whole kernel is in a messy state, so it is not
surprising that further problems show up.

In short, the root cause is not in the RDMA subsystem.

I will continue to delve into this problem.

Zhu Yanjun

Leon Romanovsky

Jun 19, 2024, 1:48:11 PM
to Zhu Yanjun, syzbot, j...@ziepe.ca, linux-...@vger.kernel.org, linux...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com
Which devices/subsystems failed? I grepped the log and don't see
anything suspicious before the first "------------[ cut here ]------------"
line.
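
For anyone repeating this check, the grep described above can be sketched in Python roughly as follows. This is only an illustration, not the command that was actually run: the function names and the keyword list are assumptions chosen for the example, and the keywords should be adjusted to whatever counts as suspicious for the log at hand.

```python
# Marker the kernel prints at the start of a WARN/BUG splat.
CUT_HERE = "------------[ cut here ]------------"

# Hypothetical keyword list for illustration only; not taken from the thread.
SUSPICIOUS = ("BUG:", "WARNING:", "Oops", "general protection fault", "INFO: task")


def lines_before_first_splat(log_text):
    """Return the log lines that precede the first cut-here marker.

    If the marker never appears, every line of the log is returned.
    """
    head = []
    for line in log_text.splitlines():
        if CUT_HERE in line:
            break
        head.append(line)
    return head


def suspicious_lines(lines, keywords=SUSPICIOUS):
    """Return the lines that contain at least one of the given keywords."""
    return [line for line in lines if any(k in line for k in keywords)]


if __name__ == "__main__":
    # Tiny sample built from lines quoted in the report above.
    sample = (
        "smc: removing ib device syz0\n"
        "------------[ cut here ]------------\n"
        "WARNING: CPU: 0 PID: 51 at kernel/rcu/srcutree.c:653\n"
    )
    head = lines_before_first_splat(sample)
    print(len(head), "line(s) before the first splat")
    for line in suspicious_lines(head):
        print(line)
```

To apply it to the real console output, save the log linked in the report to a local file and pass its contents to lines_before_first_splat(); an empty result from suspicious_lines() would match the observation that nothing stands out before the first warning.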

Zhu Yanjun

Jun 20, 2024, 5:05:37 AM
to Leon Romanovsky, syzbot, j...@ziepe.ca, linux-...@vger.kernel.org, linux...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com
We need a script to check this problem. It is an interesting one.

Zhu Yanjun