[syzbot] WARNING: ODEBUG bug in blk_mq_hw_sysfs_release

5 views
Skip to first unread message

syzbot

unread,
Sep 20, 2021, 11:15:20ā€ÆPM9/20/21
to ax...@kernel.dk, linux...@vger.kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 85c698863c15 net/ipv4/tcp_minisocks.c: remove superfluous ..
git tree: net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=13d9d3e7300000
kernel config: https://syzkaller.appspot.com/x/.config?x=6d93fe4341f98704
dashboard link: https://syzkaller.appspot.com/bug?extid=a10a3d280be23be45d04
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11bd98f7300000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+a10a3d...@syzkaller.appspotmail.com

------------[ cut here ]------------
ODEBUG: assert_init not available (active state 0) object type: timer_list hint: 0x0
WARNING: CPU: 0 PID: 3816 at lib/debugobjects.c:505 debug_print_object+0x16e/0x250 lib/debugobjects.c:505
Modules linked in:
CPU: 0 PID: 3816 Comm: syz-executor.0 Not tainted 5.15.0-rc1-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:debug_print_object+0x16e/0x250 lib/debugobjects.c:505
Code: ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 af 00 00 00 48 8b 14 dd c0 39 e4 89 4c 89 ee 48 c7 c7 c0 2d e4 89 e8 ef e3 14 05 <0f> 0b 83 05 35 09 91 09 01 48 83 c4 18 5b 5d 41 5c 41 5d 41 5e c3
RSP: 0018:ffffc9000a6770f0 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000005 RCX: 0000000000000000
RDX: ffff88805b3db900 RSI: ffffffff815dbd88 RDI: fffff520014cee10
RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
R10: ffffffff815d5b2e R11: 0000000000000000 R12: ffffffff898de180
R13: ffffffff89e43440 R14: ffffffff8164bae0 R15: 1ffff920014cee29
FS: 00007fee66b3a700(0000) GS:ffff8880b9c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f362471f3a0 CR3: 00000000272af000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
debug_object_assert_init lib/debugobjects.c:895 [inline]
debug_object_assert_init+0x1f4/0x2e0 lib/debugobjects.c:866
debug_timer_assert_init kernel/time/timer.c:739 [inline]
debug_assert_init kernel/time/timer.c:784 [inline]
try_to_del_timer_sync+0x6d/0x110 kernel/time/timer.c:1229
del_timer_sync+0x138/0x1b0 kernel/time/timer.c:1382
cleanup_srcu_struct+0x14c/0x2f0 kernel/rcu/srcutree.c:379
blk_mq_hw_sysfs_release+0x147/0x190 block/blk-mq-sysfs.c:40
kobject_cleanup lib/kobject.c:705 [inline]
kobject_release lib/kobject.c:736 [inline]
kref_put include/linux/kref.h:65 [inline]
kobject_put+0x1c8/0x540 lib/kobject.c:753
blk_mq_release+0x259/0x430 block/blk-mq.c:3094
blk_release_queue+0x2a7/0x4a0 block/blk-sysfs.c:823
kobject_cleanup lib/kobject.c:705 [inline]
kobject_release lib/kobject.c:736 [inline]
kref_put include/linux/kref.h:65 [inline]
kobject_put+0x1c8/0x540 lib/kobject.c:753
__blk_mq_alloc_disk+0x12e/0x160 block/blk-mq.c:3142
nbd_dev_add+0x3be/0xb90 drivers/block/nbd.c:1716
nbd_genl_connect+0x11f3/0x1930 drivers/block/nbd.c:1884
genl_family_rcv_msg_doit+0x228/0x320 net/netlink/genetlink.c:731
genl_family_rcv_msg net/netlink/genetlink.c:775 [inline]
genl_rcv_msg+0x328/0x580 net/netlink/genetlink.c:792
netlink_rcv_skb+0x153/0x420 net/netlink/af_netlink.c:2504
genl_rcv+0x24/0x40 net/netlink/genetlink.c:803
netlink_unicast_kernel net/netlink/af_netlink.c:1314 [inline]
netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1340
netlink_sendmsg+0x86d/0xdb0 net/netlink/af_netlink.c:1929
sock_sendmsg_nosec net/socket.c:704 [inline]
sock_sendmsg+0xcf/0x120 net/socket.c:724
____sys_sendmsg+0x6e8/0x810 net/socket.c:2409
___sys_sendmsg+0xf3/0x170 net/socket.c:2463
__sys_sendmsg+0xf3/0x1c0 net/socket.c:2492
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fee673e4739
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fee66b3a188 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fee674e9038 RCX: 00007fee673e4739
RDX: 0000000000000000 RSI: 0000000020000000 RDI: 0000000000000007
RBP: 00007fee6743ecc4 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fee674e9038
R13: 00007ffe3695106f R14: 00007fee66b3a300 R15: 0000000000022000


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches

Thomas Gleixner

unread,
Sep 22, 2021, 6:56:41ā€ÆPM9/22/21
to syzbot, ax...@kernel.dk, linux...@vger.kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com, Paul E. McKenney
On Mon, Sep 20 2021 at 20:15, syzbot wrote:

Cc+: paulmck
So cleanup_srcu_struct() tries to delete a timer which was never initialized....

Paul E. McKenney

unread,
Sep 22, 2021, 7:27:03ā€ÆPM9/22/21
to Thomas Gleixner, syzbot, ax...@kernel.dk, linux...@vger.kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com
That does not sound like a way to keep a kernel running...

For dynamically allocated srcu_struct structures, what is supposed to
happen is that init_srcu_struct() is invoked at some point before the
first use. Then init_srcu_struct() invokes init_srcu_struct_fields()
which invokes init_srcu_struct_nodes() which initializes that timer.
Unless the memory allocations in init_srcu_struct_fields() failed,
in which case init_srcu_struct() should have handed you back a -ENOMEM.

OK, there is a call to init_srcu_struct() in blk_mq_alloc_hctx().
It does ignore the init_srcu_struct() return value, so maybe a
WARN_ON_ONCE(init_srcu_struct(hctx->srcu)) would be good.

The ->srcu field is declared as follows:

struct srcu_struct srcu[];

The blk_mq_hw_ctx_size() function adjusts the size. All of this is
controlled by a BLK_MQ_F_BLOCKING flag. If this flag were to change,
clearly bad things could happen. The places where it is set look to me
to be initialization time, but it might be good to have someone familiar
with this code double-check this. Or have a separate bit that records
the state of BLK_MQ_F_BLOCKING at blk_mq_alloc_hctx() time and complain
bitterly if there was a change at blk_mq_hw_sysfs_release() time.

Other thoughts?

Thanx, Paul

Ming Lei

unread,
Sep 22, 2021, 11:03:26ā€ÆPM9/22/21
to Paul E. McKenney, Thomas Gleixner, syzbot, ax...@kernel.dk, linux...@vger.kernel.org, linux-...@vger.kernel.org, net...@vger.kernel.org, syzkall...@googlegroups.com
Yeah.

That should be the issue because del_timer_sync() from flush_delayed_work() in
cleanup_srcu_struct() isn't complained from the stack trace log, and
just the timer in perpcu field of ->sda isn't initialized, and ->sda is
allocated via alloc_percpu().

>
> The ->srcu field is declared as follows:
>
> struct srcu_struct srcu[];
>
> The blk_mq_hw_ctx_size() function adjusts the size. All of this is
> controlled by a BLK_MQ_F_BLOCKING flag. If this flag were to change,
> clearly bad things could happen. The places where it is set look to me
> to be initialization time, but it might be good to have someone familiar
> with this code double-check this. Or have a separate bit that records
> the state of BLK_MQ_F_BLOCKING at blk_mq_alloc_hctx() time and complain
> bitterly if there was a change at blk_mq_hw_sysfs_release() time.

BLK_MQ_F_BLOCKING is set from the beginning and never changed.


Thanks,
Ming

syzbot

unread,
Sep 16, 2022, 10:32:28ā€ÆPM9/16/22
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
No recent activity, existing reproducers are no longer triggering the issue.
Reply all
Reply to author
Forward
0 new messages