[v5.15] INFO: task hung in ext4_stop_mmpd (3)


syzbot

Mar 3, 2025, 12:11:30 PM
to syzkaller...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: c16c81c81336 Linux 5.15.178
git tree: linux-5.15.y
console output: https://syzkaller.appspot.com/x/log.txt?x=15106464580000
kernel config: https://syzkaller.appspot.com/x/.config?x=d302c69e93fb6774
dashboard link: https://syzkaller.appspot.com/bug?extid=13b8de76bda8c4760b9a
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/267e46ee7273/disk-c16c81c8.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/944e289206cf/vmlinux-c16c81c8.xz
kernel image: https://storage.googleapis.com/syzbot-assets/f8cadf62458e/bzImage-c16c81c8.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+13b8de...@syzkaller.appspotmail.com

INFO: task syz.0.346:5651 blocked for more than 143 seconds.
Not tainted 5.15.178-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz.0.346 state:D stack:25440 pid: 5651 ppid: 4165 flags:0x00004004
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5027 [inline]
__schedule+0x12c4/0x45b0 kernel/sched/core.c:6373
schedule+0x11b/0x1f0 kernel/sched/core.c:6456
schedule_timeout+0xac/0x300 kernel/time/timer.c:1890
do_wait_for_common+0x2d9/0x480 kernel/sched/completion.c:85
__wait_for_common kernel/sched/completion.c:106 [inline]
wait_for_common kernel/sched/completion.c:117 [inline]
wait_for_completion+0x48/0x60 kernel/sched/completion.c:138
kthread_stop+0x178/0x580 kernel/kthread.c:666
ext4_stop_mmpd+0x43/0xb0 fs/ext4/mmp.c:263
ext4_fill_super+0x6d67/0xa110 fs/ext4/super.c:5089
mount_bdev+0x2c9/0x3f0 fs/super.c:1400
legacy_get_tree+0xeb/0x180 fs/fs_context.c:611
vfs_get_tree+0x88/0x270 fs/super.c:1530
do_new_mount+0x2ba/0xb40 fs/namespace.c:3012
do_mount fs/namespace.c:3355 [inline]
__do_sys_mount fs/namespace.c:3563 [inline]
__se_sys_mount+0x2d5/0x3c0 fs/namespace.c:3540
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3b/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x66/0xd0
RIP: 0033:0x7f0d9899a90a
RSP: 002b:00007f0d96801e68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007f0d96801ef0 RCX: 00007f0d9899a90a
RDX: 0000400000000040 RSI: 0000400000000000 RDI: 00007f0d96801eb0
RBP: 0000400000000040 R08: 00007f0d96801ef0 R09: 0000000003010008
R10: 0000000003010008 R11: 0000000000000246 R12: 0000400000000000
R13: 00007f0d96801eb0 R14: 0000000000000538 R15: 0000400000000340
</TASK>

Showing all locks held in the system:
4 locks held by kworker/u4:0/9:
#0: ffff8880175cd938 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work+0x78a/0x10c0 kernel/workqueue.c:2283
#1: ffffc90000ce7d20 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work+0x7d0/0x10c0 kernel/workqueue.c:2285
#2: ffffffff8dc36b10 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x166/0xc90 net/core/net_namespace.c:572
#3: ffffffff8cb242a8 (rcu_state.exp_mutex){+.+.}-{3:3}, at: exp_funnel_lock kernel/rcu/tree_exp.h:290 [inline]
#3: ffffffff8cb242a8 (rcu_state.exp_mutex){+.+.}-{3:3}, at: synchronize_rcu_expedited+0x280/0x740 kernel/rcu/tree_exp.h:845
1 lock held by khungtaskd/27:
#0: ffffffff8cb1fce0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x0/0x30
1 lock held by udevd/3545:
#0: ffff88802081bd18 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_get_by_dev+0x14d/0xa50 block/bdev.c:817
1 lock held by dhcpcd/3837:
#0: ffffffff8dc42788 (rtnl_mutex){+.+.}-{3:3}, at: __netlink_dump_start+0x12e/0x6d0 net/netlink/af_netlink.c:2334
2 locks held by getty/3922:
#0: ffff88814c724098 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x21/0x70 drivers/tty/tty_ldisc.c:252
#1: ffffc90002cd62e8 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0x6af/0x1db0 drivers/tty/n_tty.c:2158
3 locks held by kworker/1:4/4211:
#0: ffff888017470938 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x78a/0x10c0 kernel/workqueue.c:2283
#1: ffffc9000307fd20 ((work_completion)(&data->fib_event_work)){+.+.}-{0:0}, at: process_one_work+0x7d0/0x10c0 kernel/workqueue.c:2285
#2: ffff888077b5e240 (&data->fib_lock){+.+.}-{3:3}, at: nsim_fib_event_work+0x2cd/0x4120 drivers/net/netdevsim/fib.c:1480
2 locks held by kworker/1:15/4472:
#0: ffff888017472138 ((wq_completion)rcu_gp){+.+.}-{0:0}, at: process_one_work+0x78a/0x10c0 kernel/workqueue.c:2283
#1: ffffc9000381fd20 ((work_completion)(&rew.rew_work)){+.+.}-{0:0}, at: process_one_work+0x7d0/0x10c0 kernel/workqueue.c:2285
1 lock held by syz.0.346/5651:
#0: ffff88807dc1e0e0 (&type->s_umount_key#28/1){+.+.}-{3:3}, at: alloc_super+0x210/0x940 fs/super.c:229
1 lock held by syz.7.738/7834:
#0: ffff88807dc1e0e0 (&type->s_umount_key#32){++++}-{3:3}, at: iterate_supers+0xac/0x1e0 fs/super.c:716
2 locks held by syz-executor/7861:
#0: ffffffff8dc42788 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:72 [inline]
#0: ffffffff8dc42788 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x94c/0xee0 net/core/rtnetlink.c:5644
#1: ffffffff8cb242a8 (rcu_state.exp_mutex){+.+.}-{3:3}, at: exp_funnel_lock kernel/rcu/tree_exp.h:322 [inline]
#1: ffffffff8cb242a8 (rcu_state.exp_mutex){+.+.}-{3:3}, at: synchronize_rcu_expedited+0x350/0x740 kernel/rcu/tree_exp.h:845
1 lock held by syz.2.747/7925:
#0: ffffffff8dc42788 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:72 [inline]
#0: ffffffff8dc42788 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x94c/0xee0 net/core/rtnetlink.c:5644
2 locks held by syz.6.748/7952:
#0: ffff88802081bd18 (&disk->open_mutex){+.+.}-{3:3}, at: blkdev_put+0xfb/0x790 block/bdev.c:912
#1: ffff88814788e468 (&lo->lo_mutex){+.+.}-{3:3}, at: lo_release+0x4d/0x1f0 drivers/block/loop.c:2070
1 lock held by dhcpcd/7970:
#0: ffff88807a96a120 (sk_lock-AF_PACKET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1684 [inline]
#0: ffff88807a96a120 (sk_lock-AF_PACKET){+.+.}-{0:0}, at: packet_do_bind+0x33/0xd50 net/packet/af_packet.c:3213

=============================================

NMI backtrace for cpu 0
CPU: 0 PID: 27 Comm: khungtaskd Not tainted 5.15.178-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e3/0x2d0 lib/dump_stack.c:106
nmi_cpu_backtrace+0x46a/0x4a0 lib/nmi_backtrace.c:111
nmi_trigger_cpumask_backtrace+0x181/0x2a0 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:148 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:210 [inline]
watchdog+0xe72/0xeb0 kernel/hung_task.c:295
kthread+0x3f6/0x4f0 kernel/kthread.c:334
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:287
</TASK>
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 PID: 4207 Comm: kworker/u4:4 Not tainted 5.15.178-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Workqueue: loop6 loop_rootcg_workfn
RIP: 0010:__bfs kernel/locking/lockdep.c:1698 [inline]
RIP: 0010:__bfs_backwards kernel/locking/lockdep.c:1805 [inline]
RIP: 0010:check_irq_usage kernel/locking/lockdep.c:2745 [inline]
RIP: 0010:check_prev_add kernel/locking/lockdep.c:3057 [inline]
RIP: 0010:check_prevs_add kernel/locking/lockdep.c:3172 [inline]
RIP: 0010:validate_chain+0x1a38/0x5930 kernel/locking/lockdep.c:3788
Code: 3a 11 4d 85 e4 0f 84 f9 04 00 00 4d 8d 74 24 10 4c 89 f1 48 c1 e9 03 48 b8 00 00 00 00 00 fc ff df 48 89 4c 24 50 80 3c 01 00 <74> 08 4c 89 f7 e8 1e cb 66 00 49 8b 1e 48 85 db 0f 84 99 26 00 00
RSP: 0018:ffffc9000302f020 EFLAGS: 00000046
RAX: dffffc0000000000 RBX: 1ffff92000605e28 RCX: 1ffffffff267670f
RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffc9000302f140
RBP: ffffc9000302f2d0 R08: dffffc0000000000 R09: fffffbfff2131821
R10: 0000000000000000 R11: dffffc0000000001 R12: ffffffff933b3868
R13: ffff8880626bc730 R14: ffffffff933b3878 R15: ffffffff933b3868
FS: 0000000000000000(0000) GS:ffff8880b8f00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 000000007711e000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<NMI>
</NMI>
<TASK>
__lock_acquire+0x1295/0x1ff0 kernel/locking/lockdep.c:5012
lock_acquire+0x1db/0x4f0 kernel/locking/lockdep.c:5623
do_write_seqcount_begin_nested include/linux/seqlock.h:519 [inline]
do_write_seqcount_begin include/linux/seqlock.h:545 [inline]
psi_group_change+0x11b/0x1180 kernel/sched/psi.c:710
psi_enqueue kernel/sched/stats.h:104 [inline]
enqueue_task+0x2fe/0x3a0 kernel/sched/core.c:1970
activate_task kernel/sched/core.c:2002 [inline]
ttwu_do_activate+0x1cf/0x430 kernel/sched/core.c:3611
ttwu_queue kernel/sched/core.c:3820 [inline]
try_to_wake_up+0x795/0x1300 kernel/sched/core.c:4143
wakeup_softirqd kernel/softirq.c:80 [inline]
raise_softirq_irqoff kernel/softirq.c:688 [inline]
raise_softirq+0xf1/0x1a0 kernel/softirq.c:696
blk_mq_raise_softirq block/blk-mq.c:652 [inline]
blk_mq_complete_request_remote+0x1a5/0x640 block/blk-mq.c:673
blk_mq_complete_request+0x15/0xa0 block/blk-mq.c:689
loop_handle_cmd drivers/block/loop.c:2251 [inline]
loop_process_work+0x19f1/0x2af0 drivers/block/loop.c:2274
process_one_work+0x8a1/0x10c0 kernel/workqueue.c:2310
worker_thread+0xaca/0x1280 kernel/workqueue.c:2457
kthread+0x3f6/0x4f0 kernel/kthread.c:334
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:287
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

syzbot

Mar 3, 2025, 3:46:34 PM
to syzkaller...@googlegroups.com
syzbot has found a reproducer for the following issue on:

HEAD commit: c16c81c81336 Linux 5.15.178
git tree: linux-5.15.y
console output: https://syzkaller.appspot.com/x/log.txt?x=15c9afb8580000
kernel config: https://syzkaller.appspot.com/x/.config?x=d302c69e93fb6774
dashboard link: https://syzkaller.appspot.com/bug?extid=13b8de76bda8c4760b9a
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=133e6464580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16b467a0580000
mounted in repro: https://storage.googleapis.com/syzbot-assets/cb4a64d181a7/mount_0.gz
fsck result: failed (log: https://syzkaller.appspot.com/x/fsck.log?x=153e6464580000)

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+13b8de...@syzkaller.appspotmail.com

INFO: task syz-executor344:4365 blocked for more than 143 seconds.
Not tainted 5.15.178-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor344 state:D stack:25920 pid: 4365 ppid: 4180 flags:0x00004006
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5027 [inline]
__schedule+0x12c4/0x45b0 kernel/sched/core.c:6373
schedule+0x11b/0x1f0 kernel/sched/core.c:6456
schedule_timeout+0xac/0x300 kernel/time/timer.c:1890
do_wait_for_common+0x2d9/0x480 kernel/sched/completion.c:85
__wait_for_common kernel/sched/completion.c:106 [inline]
wait_for_common kernel/sched/completion.c:117 [inline]
wait_for_completion+0x48/0x60 kernel/sched/completion.c:138
kthread_stop+0x178/0x580 kernel/kthread.c:666
ext4_stop_mmpd+0x43/0xb0 fs/ext4/mmp.c:263
ext4_fill_super+0x6d67/0xa110 fs/ext4/super.c:5089
mount_bdev+0x2c9/0x3f0 fs/super.c:1400
legacy_get_tree+0xeb/0x180 fs/fs_context.c:611
vfs_get_tree+0x88/0x270 fs/super.c:1530
do_new_mount+0x2ba/0xb40 fs/namespace.c:3012
do_mount fs/namespace.c:3355 [inline]
__do_sys_mount fs/namespace.c:3563 [inline]
__se_sys_mount+0x2d5/0x3c0 fs/namespace.c:3540
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3b/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x66/0xd0
RIP: 0033:0x7eff800690fa
RSP: 002b:00007fff3e3e36e8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007fff3e3e3700 RCX: 00007eff800690fa
RDX: 0000400000000040 RSI: 0000400000000000 RDI: 00007fff3e3e3700
RBP: 0000400000000040 R08: 00007fff3e3e3740 R09: 00007fff3e3e3740
R10: 0000000003010008 R11: 0000000000000202 R12: 0000400000000000
R13: 00007fff3e3e3740 R14: 0000000000000003 R15: 0000000003010008
</TASK>

Showing all locks held in the system:
1 lock held by khungtaskd/27:
#0: ffffffff8cb1fce0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire+0x0/0x30
2 locks held by getty/3927:
#0: ffff88802be76098 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x21/0x70 drivers/tty/tty_ldisc.c:252
#1: ffffc900025c62e8 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0x6af/0x1db0 drivers/tty/n_tty.c:2158
1 lock held by syz-executor344/4365:
#0: ffff888019bb60e0 (&type->s_umount_key#28/1){+.+.}-{3:3}, at: alloc_super+0x210/0x940 fs/super.c:229

=============================================

NMI backtrace for cpu 0
CPU: 0 PID: 27 Comm: khungtaskd Not tainted 5.15.178-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e3/0x2d0 lib/dump_stack.c:106
nmi_cpu_backtrace+0x46a/0x4a0 lib/nmi_backtrace.c:111
nmi_trigger_cpumask_backtrace+0x181/0x2a0 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:148 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:210 [inline]
watchdog+0xe72/0xeb0 kernel/hung_task.c:295
kthread+0x3f6/0x4f0 kernel/kthread.c:334
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:287
</TASK>
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1 skipped: idling at native_safe_halt arch/x86/include/asm/irqflags.h:51 [inline]
NMI backtrace for cpu 1 skipped: idling at arch_safe_halt arch/x86/include/asm/irqflags.h:89 [inline]
NMI backtrace for cpu 1 skipped: idling at acpi_safe_halt drivers/acpi/processor_idle.c:108 [inline]
NMI backtrace for cpu 1 skipped: idling at acpi_idle_do_entry+0x10f/0x340 drivers/acpi/processor_idle.c:562


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

syzbot

Apr 13, 2025, 8:37:07 AM
to syzkaller...@googlegroups.com
syzbot suspects this issue could be fixed by backporting the following commit:

commit a7c01fa93aeb03ab76cd3cb2107990dd160498e6
git tree: upstream
Author: Jason A. Donenfeld <Ja...@zx2c4.com>
Date: Mon Jul 11 23:21:23 2022 +0000

signal: break out of wait loops on kthread_stop()

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=14a23398580000
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11fffa97980000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11be98b7980000


Please keep in mind that other backports might be required as well.

For information about bisection process see: https://goo.gl/tpsmEJ#bisection