Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue:
INFO: task hung in synchronize_rcu
INFO: task kworker/u4:3:112 blocked for more than 143 seconds.
Not tainted 5.9.0-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u4:3 state:D stack:26528 pid: 112 ppid: 2 flags:0x00004000
Workqueue: events_unbound fsnotify_mark_destroy_workfn
Call Trace:
context_switch kernel/sched/core.c:3778 [inline]
__schedule+0xec9/0x2280 kernel/sched/core.c:4527
schedule+0xd0/0x2a0 kernel/sched/core.c:4602
schedule_timeout+0x1d8/0x250 kernel/time/timer.c:1855
do_wait_for_common kernel/sched/completion.c:85 [inline]
__wait_for_common kernel/sched/completion.c:106 [inline]
wait_for_common kernel/sched/completion.c:117 [inline]
wait_for_completion+0x163/0x260 kernel/sched/completion.c:138
__synchronize_srcu+0x132/0x220 kernel/rcu/srcutree.c:935
fsnotify_mark_destroy_workfn+0xfd/0x340 fs/notify/mark.c:836
process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
kthread+0x3b5/0x4a0 kernel/kthread.c:292
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
INFO: task kworker/u4:5:204 blocked for more than 144 seconds.
Not tainted 5.9.0-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/u4:5 state:D stack:26288 pid: 204 ppid: 2 flags:0x00004000
Workqueue: events_unbound fsnotify_connector_destroy_workfn
Call Trace:
context_switch kernel/sched/core.c:3778 [inline]
__schedule+0xec9/0x2280 kernel/sched/core.c:4527
schedule+0xd0/0x2a0 kernel/sched/core.c:4602
schedule_timeout+0x1d8/0x250 kernel/time/timer.c:1855
do_wait_for_common kernel/sched/completion.c:85 [inline]
__wait_for_common kernel/sched/completion.c:106 [inline]
wait_for_common kernel/sched/completion.c:117 [inline]
wait_for_completion+0x163/0x260 kernel/sched/completion.c:138
__synchronize_srcu+0x132/0x220 kernel/rcu/srcutree.c:935
fsnotify_connector_destroy_workfn+0x49/0xa0 fs/notify/mark.c:164
process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
kthread+0x3b5/0x4a0 kernel/kthread.c:292
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
INFO: task syz-executor.0:6991 blocked for more than 144 seconds.
Not tainted 5.9.0-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor.0 state:D stack:23480 pid: 6991 ppid: 1 flags:0x00004004
Call Trace:
context_switch kernel/sched/core.c:3778 [inline]
__schedule+0xec9/0x2280 kernel/sched/core.c:4527
schedule+0xd0/0x2a0 kernel/sched/core.c:4602
synchronize_rcu_expedited+0x44b/0x610 kernel/rcu/tree_exp.h:854
namespace_unlock+0x1af/0x410 fs/namespace.c:1425
do_umount fs/namespace.c:1648 [inline]
path_umount+0x7aa/0x12a0 fs/namespace.c:1735
ksys_umount fs/namespace.c:1754 [inline]
__do_sys_umount fs/namespace.c:1759 [inline]
__se_sys_umount fs/namespace.c:1757 [inline]
__x64_sys_umount+0xfb/0x150 fs/namespace.c:1757
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x460ba7
Code: Bad RIP value.
RSP: 002b:00007ffd3f1ebec8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000460ba7
RDX: 00000000004031b8 RSI: 0000000000000002 RDI: 00007ffd3f1ebf70
RBP: 0000000000000049 R08: 0000000000000000 R09: 000000000000000a
R10: 0000000000000005 R11: 0000000000000246 R12: 00007ffd3f1ed000
R13: 0000000001dcaa60 R14: 0000000000000000 R15: 00007ffd3f1ed000
INFO: task syz-executor.2:6996 blocked for more than 145 seconds.
Not tainted 5.9.0-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor.2 state:D stack:24248 pid: 6996 ppid: 1 flags:0x00000004
Call Trace:
context_switch kernel/sched/core.c:3778 [inline]
__schedule+0xec9/0x2280 kernel/sched/core.c:4527
schedule+0xd0/0x2a0 kernel/sched/core.c:4602
schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:4661
__mutex_lock_common kernel/locking/mutex.c:1033 [inline]
__mutex_lock+0x3e2/0x10e0 kernel/locking/mutex.c:1103
exp_funnel_lock kernel/rcu/tree_exp.h:322 [inline]
synchronize_rcu_expedited+0x27e/0x610 kernel/rcu/tree_exp.h:838
namespace_unlock+0x1af/0x410 fs/namespace.c:1425
do_umount fs/namespace.c:1648 [inline]
path_umount+0x7aa/0x12a0 fs/namespace.c:1735
ksys_umount fs/namespace.c:1754 [inline]
__do_sys_umount fs/namespace.c:1759 [inline]
__se_sys_umount fs/namespace.c:1757 [inline]
__x64_sys_umount+0xfb/0x150 fs/namespace.c:1757
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x460ba7
Code: Bad RIP value.
RSP: 002b:00007fff82b87c98 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000460ba7
RDX: 00000000004031b8 RSI: 0000000000000002 RDI: 00007fff82b87d40
RBP: 0000000000000047 R08: 0000000000000000 R09: 000000000000000a
R10: 0000000000000005 R11: 0000000000000246 R12: 00007fff82b88dd0
R13: 00000000028f6a60 R14: 0000000000000000 R15: 00007fff82b88dd0
INFO: task syz-executor.4:6997 blocked for more than 146 seconds.
Not tainted 5.9.0-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor.4 state:D stack:24248 pid: 6997 ppid: 1 flags:0x00004004
Call Trace:
context_switch kernel/sched/core.c:3778 [inline]
__schedule+0xec9/0x2280 kernel/sched/core.c:4527
schedule+0xd0/0x2a0 kernel/sched/core.c:4602
exp_funnel_lock kernel/rcu/tree_exp.h:313 [inline]
synchronize_rcu_expedited+0x59d/0x610 kernel/rcu/tree_exp.h:838
bdi_remove_from_list mm/backing-dev.c:865 [inline]
bdi_unregister+0x16b/0x590 mm/backing-dev.c:871
release_bdi+0xa1/0xc0 mm/backing-dev.c:893
kref_put include/linux/kref.h:65 [inline]
bdi_put+0x72/0xa0 mm/backing-dev.c:901
generic_shutdown_super+0x2aa/0x370 fs/super.c:478
kill_anon_super+0x36/0x60 fs/super.c:1108
afs_kill_super+0xdf/0x180 fs/afs/super.c:544
deactivate_locked_super+0x94/0x160 fs/super.c:335
deactivate_super+0xad/0xd0 fs/super.c:366
cleanup_mnt+0x3a3/0x530 fs/namespace.c:1118
task_work_run+0xdd/0x190 kernel/task_work.c:141
tracehook_notify_resume include/linux/tracehook.h:188 [inline]
exit_to_user_mode_loop kernel/entry/common.c:165 [inline]
exit_to_user_mode_prepare+0x1e1/0x200 kernel/entry/common.c:192
syscall_exit_to_user_mode+0x7e/0x2e0 kernel/entry/common.c:267
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x460ba7
Code: Bad RIP value.
RSP: 002b:00007ffd0373e9c8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000460ba7
RDX: 00000000004031b8 RSI: 0000000000000002 RDI: 00007ffd0373ea70
RBP: 0000000000000049 R08: 0000000000000000 R09: 000000000000000a
R10: 0000000000000005 R11: 0000000000000246 R12: 00007ffd0373fb00
R13: 0000000001a2da60 R14: 0000000000000000 R15: 00007ffd0373fb00
INFO: task syz-executor.1:6999 blocked for more than 147 seconds.
Not tainted 5.9.0-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor.1 state:D stack:24248 pid: 6999 ppid: 1 flags:0x00000004
Call Trace:
context_switch kernel/sched/core.c:3778 [inline]
__schedule+0xec9/0x2280 kernel/sched/core.c:4527
schedule+0xd0/0x2a0 kernel/sched/core.c:4602
exp_funnel_lock kernel/rcu/tree_exp.h:313 [inline]
synchronize_rcu_expedited+0x59d/0x610 kernel/rcu/tree_exp.h:838
namespace_unlock+0x1af/0x410 fs/namespace.c:1425
do_umount fs/namespace.c:1648 [inline]
path_umount+0x7aa/0x12a0 fs/namespace.c:1735
ksys_umount fs/namespace.c:1754 [inline]
__do_sys_umount fs/namespace.c:1759 [inline]
__se_sys_umount fs/namespace.c:1757 [inline]
__x64_sys_umount+0xfb/0x150 fs/namespace.c:1757
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x460ba7
Code: Bad RIP value.
RSP: 002b:00007ffdedc5b388 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000460ba7
RDX: 00000000004031b8 RSI: 0000000000000002 RDI: 00007ffdedc5b430
RBP: 000000000000004d R08: 0000000000000000 R09: 000000000000000a
R10: 0000000000000005 R11: 0000000000000246 R12: 00007ffdedc5c4c0
R13: 000000000173ba60 R14: 0000000000000000 R15: 00007ffdedc5c4c0
INFO: task syz-executor.3:7001 blocked for more than 147 seconds.
Not tainted 5.9.0-syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor.3 state:D stack:24248 pid: 7001 ppid: 1 flags:0x00000004
Call Trace:
context_switch kernel/sched/core.c:3778 [inline]
__schedule+0xec9/0x2280 kernel/sched/core.c:4527
schedule+0xd0/0x2a0 kernel/sched/core.c:4602
exp_funnel_lock kernel/rcu/tree_exp.h:313 [inline]
synchronize_rcu_expedited+0x59d/0x610 kernel/rcu/tree_exp.h:838
namespace_unlock+0x1af/0x410 fs/namespace.c:1425
do_umount fs/namespace.c:1648 [inline]
path_umount+0x7aa/0x12a0 fs/namespace.c:1735
ksys_umount fs/namespace.c:1754 [inline]
__do_sys_umount fs/namespace.c:1759 [inline]
__se_sys_umount fs/namespace.c:1757 [inline]
__x64_sys_umount+0xfb/0x150 fs/namespace.c:1757
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x460ba7
Code: Bad RIP value.
RSP: 002b:00007fff577ad1d8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000460ba7
RDX: 00000000004031b8 RSI: 0000000000000002 RDI: 00007fff577ad280
RBP: 0000000000000040 R08: 0000000000000000 R09: 000000000000000a
R10: 0000000000000005 R11: 0000000000000246 R12: 00007fff577ae310
R13: 0000000002716a60 R14: 0000000000000000 R15: 00007fff577ae310
Showing all locks held in the system:
5 locks held by kworker/0:0/5:
2 locks held by kworker/u4:3/112:
#0: ffff8880aa071138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
#0: ffff8880aa071138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
#0: ffff8880aa071138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
#0: ffff8880aa071138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
#0: ffff8880aa071138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
#0: ffff8880aa071138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x82b/0x1670 kernel/workqueue.c:2240
#1: ffffc900012e7da8 ((reaper_work).work){+.+.}-{0:0}, at: process_one_work+0x85f/0x1670 kernel/workqueue.c:2244
2 locks held by kworker/u4:5/204:
#0: ffff8880aa071138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
#0: ffff8880aa071138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
#0: ffff8880aa071138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
#0: ffff8880aa071138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
#0: ffff8880aa071138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
#0: ffff8880aa071138 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x82b/0x1670 kernel/workqueue.c:2240
#1: ffffc900017b7da8 (connector_reaper_work){+.+.}-{0:0}, at: process_one_work+0x85f/0x1670 kernel/workqueue.c:2244
1 lock held by khungtaskd/1175:
#0: ffffffff8a067ec0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x53/0x260 kernel/locking/lockdep.c:5852
1 lock held by in:imklog/6559:
1 lock held by syz-executor.0/6991:
#0: ffffffff8a071028 (rcu_state.exp_mutex){+.+.}-{3:3}, at: exp_funnel_lock kernel/rcu/tree_exp.h:322 [inline]
#0: ffffffff8a071028 (rcu_state.exp_mutex){+.+.}-{3:3}, at: synchronize_rcu_expedited+0x27e/0x610 kernel/rcu/tree_exp.h:838
1 lock held by syz-executor.2/6996:
#0: ffffffff8a071028 (rcu_state.exp_mutex){+.+.}-{3:3}, at: exp_funnel_lock kernel/rcu/tree_exp.h:322 [inline]
#0: ffffffff8a071028 (rcu_state.exp_mutex){+.+.}-{3:3}, at: synchronize_rcu_expedited+0x27e/0x610 kernel/rcu/tree_exp.h:838
2 locks held by kworker/0:5/8449:
#0: ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
#0: ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic64_set include/asm-generic/atomic-instrumented.h:856 [inline]
#0: ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: atomic_long_set include/asm-generic/atomic-long.h:41 [inline]
#0: ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_data kernel/workqueue.c:616 [inline]
#0: ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: set_work_pool_and_clear_pending kernel/workqueue.c:643 [inline]
#0: ffff8880aa063d38 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x82b/0x1670 kernel/workqueue.c:2240
#1: ffffc9000a75fda8 (key_gc_work){+.+.}-{0:0}, at: process_one_work+0x85f/0x1670 kernel/workqueue.c:2244
=============================================
NMI backtrace for cpu 1
CPU: 1 PID: 1175 Comm: khungtaskd Not tainted 5.9.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x198/0x1fd lib/dump_stack.c:118
nmi_cpu_backtrace.cold+0x70/0xb1 lib/nmi_backtrace.c:101
nmi_trigger_cpumask_backtrace+0x1b3/0x223 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:209 [inline]
watchdog+0xd7d/0x1000 kernel/hung_task.c:295
kthread+0x3b5/0x4a0 kernel/kthread.c:292
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
Sending NMI from CPU 1 to CPUs 0:
NMI backtrace for cpu 0
CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 5.9.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: afs afs_manage_cell_work
RIP: 0010:io_serial_in+0x60/0x80 drivers/tty/serial/8250/8250_port.c:448
Code: 0f b6 8d f1 00 00 00 48 8d 7d 40 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 d3 e3 80 3c 02 00 75 13 03 5d 40 89 da ec <5b> 0f b6 c0 5d c3 e8 d5 2a 9c fd eb c9 e8 fe 2a 9c fd eb e6 66 90
RSP: 0018:ffffc90000cbf738 EFLAGS: 00000002
RAX: dffffc0000000000 RBX: 00000000000003fd RCX: 0000000000000000
RDX: 00000000000003fd RSI: ffffffff841aec6c RDI: ffffffff8df44880
RBP: ffffffff8df44840 R08: 0000000000000001 R09: 0000000000000003
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000020
R13: fffffbfff1be895b R14: fffffbfff1be8912 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff8880ae400000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f80d1b15000 CR3: 000000009e500000 CR4: 00000000001506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
serial_in drivers/tty/serial/8250/8250.h:113 [inline]
wait_for_xmitr+0x9a/0x210 drivers/tty/serial/8250/8250_port.c:2061
serial8250_console_putchar+0x1b/0x50 drivers/tty/serial/8250/8250_port.c:3241
uart_console_write+0x59/0x100 drivers/tty/serial/serial_core.c:1942
serial8250_console_write+0x92e/0xb30 drivers/tty/serial/8250/8250_port.c:3313
call_console_drivers kernel/printk/printk.c:1824 [inline]
console_unlock+0xadb/0xe60 kernel/printk/printk.c:2501
vprintk_emit+0x2ff/0x740 kernel/printk/printk.c:2029
vprintk_func+0x8d/0x1d4 kernel/printk/printk_safe.c:393
printk+0xba/0xed kernel/printk/printk.c:2078
fscache_print_cookie+0x15b/0x200 fs/fscache/cookie.c:40
fscache_hash_cookie+0x94b/0xa10 fs/fscache/cookie.c:224
__fscache_acquire_cookie+0x187/0x610 fs/fscache/cookie.c:298
fscache_acquire_cookie include/linux/fscache.h:334 [inline]
afs_activate_cell fs/afs/cell.c:678 [inline]
afs_manage_cell fs/afs/cell.c:768 [inline]
afs_manage_cell_work+0x4c0/0x11c0 fs/afs/cell.c:832
process_one_work+0x94c/0x1670 kernel/workqueue.c:2269
worker_thread+0x64c/0x1120 kernel/workqueue.c:2415
kthread+0x3b5/0x4a0 kernel/kthread.c:292
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:294
Tested on:
commit: f8eb8d1c afs: Add tracing for cell refcount and active use..
git tree: git://
git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git
console output:
https://syzkaller.appspot.com/x/log.txt?x=128500a0500000
kernel config:
https://syzkaller.appspot.com/x/.config?x=302928762dfb5528