[syzbot] [xfs?] INFO: task hung in xfs_buf_item_unpin (2)

syzbot

Aug 16, 2024, 1:17:34 AM
to chanda...@oracle.com, djw...@kernel.org, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 1fb918967b56 Merge tag 'for-6.11-rc3-tag' of git://git.ker..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=115e1429980000
kernel config: https://syzkaller.appspot.com/x/.config?x=92c0312151c4e32e
dashboard link: https://syzkaller.appspot.com/bug?extid=837bcd54843dd6262f2f
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/7b0e9f1a37aa/disk-1fb91896.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/d554edb47a8e/vmlinux-1fb91896.xz
kernel image: https://storage.googleapis.com/syzbot-assets/d1ec77d87b65/bzImage-1fb91896.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+837bcd...@syzkaller.appspotmail.com

INFO: task kworker/1:1H:43 blocked for more than 143 seconds.
Not tainted 6.11.0-rc3-syzkaller-00066-g1fb918967b56 #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/1:1H state:D stack:25320 pid:43 tgid:43 ppid:2 flags:0x00004000
Workqueue: xfs-log/loop4 xlog_ioend_work
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5188 [inline]
__schedule+0x17ae/0x4a10 kernel/sched/core.c:6529
__schedule_loop kernel/sched/core.c:6606 [inline]
schedule+0x14b/0x320 kernel/sched/core.c:6621
schedule_timeout+0xb0/0x310 kernel/time/timer.c:2557
___down_common kernel/locking/semaphore.c:225 [inline]
__down_common+0x343/0x7f0 kernel/locking/semaphore.c:246
down+0x84/0xc0 kernel/locking/semaphore.c:63
xfs_buf_lock+0x164/0x510 fs/xfs/xfs_buf.c:1196
xfs_buf_item_unpin+0x1dd/0x710 fs/xfs/xfs_buf_item.c:582
xlog_cil_committed+0x82f/0xf00 fs/xfs/xfs_log_cil.c:910
xlog_cil_process_committed+0x15c/0x1b0 fs/xfs/xfs_log_cil.c:941
xlog_state_shutdown_callbacks+0x2ba/0x3b0 fs/xfs/xfs_log.c:487
xlog_force_shutdown+0x32c/0x390 fs/xfs/xfs_log.c:3530
xlog_ioend_work+0xad/0x100 fs/xfs/xfs_log.c:1244
process_one_work kernel/workqueue.c:3231 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
worker_thread+0x86d/0xd40 kernel/workqueue.c:3390
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
INFO: task syz.4.26:5406 blocked for more than 144 seconds.
Not tainted 6.11.0-rc3-syzkaller-00066-g1fb918967b56 #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz.4.26 state:D stack:21208 pid:5406 tgid:5405 ppid:5216 flags:0x00004004
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5188 [inline]
__schedule+0x17ae/0x4a10 kernel/sched/core.c:6529
__schedule_loop kernel/sched/core.c:6606 [inline]
schedule+0x14b/0x320 kernel/sched/core.c:6621
xlog_wait fs/xfs/xfs_log_priv.h:587 [inline]
xlog_wait_on_iclog+0x501/0x770 fs/xfs/xfs_log.c:840
xlog_force_lsn+0x523/0x9e0 fs/xfs/xfs_log.c:3066
xfs_log_force_seq+0x1da/0x450 fs/xfs/xfs_log.c:3103
__xfs_trans_commit+0xb98/0x1290 fs/xfs/xfs_trans.c:900
xfs_sync_sb_buf+0x2dc/0x370 fs/xfs/libxfs/xfs_sb.c:1178
xfs_ioc_setlabel fs/xfs/xfs_ioctl.c:1143 [inline]
xfs_file_ioctl+0x165b/0x19e0 fs/xfs/xfs_ioctl.c:1298
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:907 [inline]
__se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f5e315799b9
RSP: 002b:00007f5e32318038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f5e31715f80 RCX: 00007f5e315799b9
RDX: 0000000020000040 RSI: 0000000041009432 RDI: 0000000000000008
RBP: 00007f5e315e78d8 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f5e31715f80 R15: 00007ffda968eb98
</TASK>

Showing all locks held in the system:
4 locks held by kworker/u8:0/11:
#0: ffff8880b923e9d8 (&rq->__lock){-.-.}-{2:2}, at: raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:560
#1: ffff8880b9228948 (&per_cpu_ptr(group->pcpu, cpu)->seq){-.-.}-{0:0}, at: psi_task_switch+0x441/0x770 kernel/sched/psi.c:989
#2: ffffffff8e7382e0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline]
#2: ffffffff8e7382e0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
#2: ffffffff8e7382e0 (rcu_read_lock){....}-{1:2}, at: batadv_nc_purge_orig_hash net/batman-adv/network-coding.c:408 [inline]
#2: ffffffff8e7382e0 (rcu_read_lock){....}-{1:2}, at: batadv_nc_worker+0xcb/0x610 net/batman-adv/network-coding.c:719
#3: ffffffff94ebb578 (&obj_hash[i].lock){-.-.}-{2:2}, at: debug_object_activate+0x16d/0x510 lib/debugobjects.c:709
3 locks held by kworker/u8:1/12:
#0: ffff888015489148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3206 [inline]
#0: ffff888015489148 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_scheduled_works+0x90a/0x1830 kernel/workqueue.c:3312
#1: ffffc90000117d00 ((linkwatch_work).work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3207 [inline]
#1: ffffc90000117d00 ((linkwatch_work).work){+.+.}-{0:0}, at: process_scheduled_works+0x945/0x1830 kernel/workqueue.c:3312
#2: ffffffff8fa702c8 (rtnl_mutex){+.+.}-{3:3}, at: linkwatch_event+0xe/0x60 net/core/link_watch.c:276
1 lock held by khungtaskd/30:
#0: ffffffff8e7382e0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline]
#0: ffffffff8e7382e0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
#0: ffffffff8e7382e0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x55/0x2a0 kernel/locking/lockdep.c:6626
2 locks held by kworker/1:1H/43:
#0: ffff88807dc5a148 ((wq_completion)xfs-log/loop4){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3206 [inline]
#0: ffff88807dc5a148 ((wq_completion)xfs-log/loop4){+.+.}-{0:0}, at: process_scheduled_works+0x90a/0x1830 kernel/workqueue.c:3312
#1: ffffc90000b37d00 ((work_completion)(&iclog->ic_end_io_work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3207 [inline]
#1: ffffc90000b37d00 ((work_completion)(&iclog->ic_end_io_work)){+.+.}-{0:0}, at: process_scheduled_works+0x945/0x1830 kernel/workqueue.c:3312
5 locks held by kworker/u8:4/63:
#0: ffff8880162e3148 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3206 [inline]
#0: ffff8880162e3148 ((wq_completion)netns){+.+.}-{0:0}, at: process_scheduled_works+0x90a/0x1830 kernel/workqueue.c:3312
#1: ffffc900015e7d00 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3207 [inline]
#1: ffffc900015e7d00 (net_cleanup_work){+.+.}-{0:0}, at: process_scheduled_works+0x945/0x1830 kernel/workqueue.c:3312
#2: ffffffff8fa63750 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x16a/0xcc0 net/core/net_namespace.c:594
#3: ffffffff8fa702c8 (rtnl_mutex){+.+.}-{3:3}, at: default_device_exit_batch+0xe9/0xa90 net/core/dev.c:11875
#4: ffffffff8e73d6b8 (rcu_state.exp_mutex){+.+.}-{3:3}, at: exp_funnel_lock kernel/rcu/tree_exp.h:328 [inline]
#4: ffffffff8e73d6b8 (rcu_state.exp_mutex){+.+.}-{3:3}, at: synchronize_rcu_expedited+0x451/0x830 kernel/rcu/tree_exp.h:958
3 locks held by kworker/u8:5/83:
#0: ffff88802a205148 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3206 [inline]
#0: ffff88802a205148 ((wq_completion)ipv6_addrconf){+.+.}-{0:0}, at: process_scheduled_works+0x90a/0x1830 kernel/workqueue.c:3312
#1: ffffc9000233fd00 ((work_completion)(&(&ifa->dad_work)->work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3207 [inline]
#1: ffffc9000233fd00 ((work_completion)(&(&ifa->dad_work)->work)){+.+.}-{0:0}, at: process_scheduled_works+0x945/0x1830 kernel/workqueue.c:3312
#2: ffffffff8fa702c8 (rtnl_mutex){+.+.}-{3:3}, at: addrconf_dad_work+0xd0/0x16f0 net/ipv6/addrconf.c:4194
2 locks held by dhcpcd/4881:
#0: ffff88807c593678 (nlk_cb_mutex-ROUTE){+.+.}-{3:3}, at: __netlink_dump_start+0x119/0x780 net/netlink/af_netlink.c:2404
#1: ffffffff8fa702c8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock net/core/rtnetlink.c:79 [inline]
#1: ffffffff8fa702c8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_dumpit+0x99/0x200 net/core/rtnetlink.c:6506
2 locks held by getty/4971:
#0: ffff88802f8080a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:243
#1: ffffc900031332f0 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0x6ac/0x1e00 drivers/tty/n_tty.c:2211
3 locks held by kworker/1:5/5272:
#0: ffff888015480948 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3206 [inline]
#0: ffff888015480948 ((wq_completion)events){+.+.}-{0:0}, at: process_scheduled_works+0x90a/0x1830 kernel/workqueue.c:3312
#1: ffffc900044f7d00 ((work_completion)(&(&devlink->rwork)->work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3207 [inline]
#1: ffffc900044f7d00 ((work_completion)(&(&devlink->rwork)->work)){+.+.}-{0:0}, at: process_scheduled_works+0x945/0x1830 kernel/workqueue.c:3312
#2: ffffffff8e73d6b8 (rcu_state.exp_mutex){+.+.}-{3:3}, at: exp_funnel_lock kernel/rcu/tree_exp.h:328 [inline]
#2: ffffffff8e73d6b8 (rcu_state.exp_mutex){+.+.}-{3:3}, at: synchronize_rcu_expedited+0x451/0x830 kernel/rcu/tree_exp.h:958
1 lock held by syz.4.26/5406:
#0: ffff88806c67c420 (sb_writers#19){.+.+}-{0:0}, at: mnt_want_write_file+0x61/0x200 fs/namespace.c:559
3 locks held by syz-executor/7304:
#0: ffffffff8fad5a30 (cb_lock){++++}-{3:3}, at: genl_rcv+0x19/0x40 net/netlink/genetlink.c:1218
#1: ffffffff8fad58e8 (genl_mutex){+.+.}-{3:3}, at: genl_lock net/netlink/genetlink.c:35 [inline]
#1: ffffffff8fad58e8 (genl_mutex){+.+.}-{3:3}, at: genl_op_lock net/netlink/genetlink.c:60 [inline]
#1: ffffffff8fad58e8 (genl_mutex){+.+.}-{3:3}, at: genl_rcv_msg+0x121/0xec0 net/netlink/genetlink.c:1209
#2: ffffffff8fa702c8 (rtnl_mutex){+.+.}-{3:3}, at: ieee80211_register_hw+0x2c4e/0x3e10 net/mac80211/main.c:1518
7 locks held by syz-executor/7537:
#0: ffff88806d07a420 (sb_writers#8){.+.+}-{0:0}, at: file_start_write include/linux/fs.h:2881 [inline]
#0: ffff88806d07a420 (sb_writers#8){.+.+}-{0:0}, at: vfs_write+0x227/0xc90 fs/read_write.c:586
#1: ffff88805dc05c88 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x1eb/0x500 fs/kernfs/file.c:325
#2: ffff8880226511e8 (kn->active#51){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x20f/0x500 fs/kernfs/file.c:326
#3: ffffffff8f30cda8 (nsim_bus_dev_list_lock){+.+.}-{3:3}, at: new_device_store+0x1b4/0x890 drivers/net/netdevsim/bus.c:166
#4: ffff8880605da0e8 (&dev->mutex){....}-{3:3}, at: device_lock include/linux/device.h:1009 [inline]
#4: ffff8880605da0e8 (&dev->mutex){....}-{3:3}, at: __device_attach+0x8e/0x520 drivers/base/dd.c:1004
#5: ffff88804e059250 (&devlink->lock_key#27){+.+.}-{3:3}, at: nsim_drv_probe+0xcb/0xb80 drivers/net/netdevsim/dev.c:1534
#6: ffffffff8fa702c8 (rtnl_mutex){+.+.}-{3:3}, at: register_nexthop_notifier+0x84/0x290 net/ipv4/nexthop.c:3872
3 locks held by syz.1.317/7709:

=============================================

NMI backtrace for cpu 0
CPU: 0 UID: 0 PID: 30 Comm: khungtaskd Not tainted 6.11.0-rc3-syzkaller-00066-g1fb918967b56 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/27/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:93 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
nmi_cpu_backtrace+0x49c/0x4d0 lib/nmi_backtrace.c:113
nmi_trigger_cpumask_backtrace+0x198/0x320 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:162 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:223 [inline]
watchdog+0xfee/0x1030 kernel/hung_task.c:379
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 UID: 0 PID: 141 Comm: kworker/u8:6 Not tainted 6.11.0-rc3-syzkaller-00066-g1fb918967b56 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/27/2024
Workqueue: events_unbound cfg80211_wiphy_work
RIP: 0010:unwind_next_frame+0x796/0x2a00 arch/x86/kernel/unwind_orc.c:512
Code: 48 c1 e8 03 42 0f b6 04 28 84 c0 4d 89 f7 0f 85 5b 1a 00 00 0f b7 5d 00 c1 eb 0b 80 e3 01 48 8b 44 24 78 42 0f b6 04 28 84 c0 <4c> 8b 74 24 48 0f 85 5e 1a 00 00 88 1a 48 89 6c 24 28 0f b7 6d 00
RSP: 0018:ffffc90002d6e928 EFLAGS: 00000246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88801b7a3c00
RDX: ffffc90002d6ea35 RSI: ffffffff8e5a3d80 RDI: 0000000000000002
RBP: ffffffff9087a06e R08: 0000000000000003 R09: ffffffff81412e12
R10: 0000000000000002 R11: ffff88801b7a3c00 R12: ffffffff900f0c7c
R13: dffffc0000000000 R14: 1ffff920005add40 R15: 1ffff920005add40
FS: 0000000000000000(0000) GS:ffff8880b9300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555569aa15c8 CR3: 0000000012890000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<NMI>
</NMI>
<TASK>
arch_stack_walk+0x151/0x1b0 arch/x86/kernel/stacktrace.c:25
stack_trace_save+0x118/0x1d0 kernel/stacktrace.c:122
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
poison_slab_object+0xe0/0x150 mm/kasan/common.c:240
__kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
kasan_slab_free include/linux/kasan.h:184 [inline]
slab_free_hook mm/slub.c:2252 [inline]
slab_free mm/slub.c:4473 [inline]
kfree+0x149/0x360 mm/slub.c:4594
ieee80211_inform_bss+0xbb2/0x1080 net/mac80211/scan.c:160
rdev_inform_bss net/wireless/rdev-ops.h:418 [inline]
cfg80211_inform_single_bss_data+0xe93/0x2030 net/wireless/scan.c:2335
cfg80211_inform_bss_data+0x3dd/0x5a70 net/wireless/scan.c:3159
cfg80211_inform_bss_frame_data+0x3b8/0x720 net/wireless/scan.c:3254
ieee80211_bss_info_update+0x8a7/0xbc0 net/mac80211/scan.c:226
ieee80211_rx_bss_info net/mac80211/ibss.c:1100 [inline]
ieee80211_rx_mgmt_probe_beacon net/mac80211/ibss.c:1579 [inline]
ieee80211_ibss_rx_queued_mgmt+0x1962/0x2d70 net/mac80211/ibss.c:1606
ieee80211_iface_process_skb net/mac80211/iface.c:1588 [inline]
ieee80211_iface_work+0x8a5/0xf20 net/mac80211/iface.c:1642
cfg80211_wiphy_work+0x2db/0x490 net/wireless/core.c:440
process_one_work kernel/workqueue.c:3231 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
worker_thread+0x86d/0xd40 kernel/workqueue.c:3390
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

syzbot

Aug 17, 2024, 5:19:27 AM
to chanda...@oracle.com, djw...@kernel.org, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com
syzbot has found a reproducer for the following issue on:

HEAD commit: 85652baa895b Merge tag 'block-6.11-20240824' of git://git...
git tree: upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=1245bf13980000
kernel config: https://syzkaller.appspot.com/x/.config?x=92c0312151c4e32e
dashboard link: https://syzkaller.appspot.com/bug?extid=837bcd54843dd6262f2f
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=12350ad5980000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=147927c5980000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/f48bb8a1e215/disk-85652baa.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/a4eced38e143/vmlinux-85652baa.xz
kernel image: https://storage.googleapis.com/syzbot-assets/3d21ddc8de52/bzImage-85652baa.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/5578e92bb7fe/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+837bcd...@syzkaller.appspotmail.com

INFO: task kworker/0:1H:43 blocked for more than 143 seconds.
Not tainted 6.11.0-rc3-syzkaller-00271-g85652baa895b #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:kworker/0:1H state:D stack:26736 pid:43 tgid:43 ppid:2 flags:0x00004000
Workqueue: xfs-log/loop0 xlog_ioend_work
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5188 [inline]
__schedule+0x17ae/0x4a10 kernel/sched/core.c:6529
__schedule_loop kernel/sched/core.c:6606 [inline]
schedule+0x14b/0x320 kernel/sched/core.c:6621
schedule_timeout+0xb0/0x310 kernel/time/timer.c:2557
___down_common kernel/locking/semaphore.c:225 [inline]
__down_common+0x343/0x7f0 kernel/locking/semaphore.c:246
down+0x84/0xc0 kernel/locking/semaphore.c:63
xfs_buf_lock+0x164/0x510 fs/xfs/xfs_buf.c:1196
xfs_buf_item_unpin+0x1dd/0x710 fs/xfs/xfs_buf_item.c:582
xlog_cil_committed+0x82f/0xf00 fs/xfs/xfs_log_cil.c:910
xlog_cil_process_committed+0x15c/0x1b0 fs/xfs/xfs_log_cil.c:941
xlog_state_shutdown_callbacks+0x2ba/0x3b0 fs/xfs/xfs_log.c:487
xlog_force_shutdown+0x32c/0x390 fs/xfs/xfs_log.c:3530
xlog_ioend_work+0xad/0x100 fs/xfs/xfs_log.c:1244
process_one_work kernel/workqueue.c:3231 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
worker_thread+0x86d/0xd40 kernel/workqueue.c:3390
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
INFO: task syz-executor224:5234 blocked for more than 143 seconds.
Not tainted 6.11.0-rc3-syzkaller-00271-g85652baa895b #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz-executor224 state:D stack:22224 pid:5234 tgid:5234 ppid:5232 flags:0x00004006
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5188 [inline]
__schedule+0x17ae/0x4a10 kernel/sched/core.c:6529
__schedule_loop kernel/sched/core.c:6606 [inline]
schedule+0x14b/0x320 kernel/sched/core.c:6621
xlog_wait fs/xfs/xfs_log_priv.h:587 [inline]
xlog_wait_on_iclog+0x501/0x770 fs/xfs/xfs_log.c:840
xlog_force_lsn+0x523/0x9e0 fs/xfs/xfs_log.c:3066
xfs_log_force_seq+0x1da/0x450 fs/xfs/xfs_log.c:3103
__xfs_trans_commit+0xb98/0x1290 fs/xfs/xfs_trans.c:900
xfs_sync_sb_buf+0x2dc/0x370 fs/xfs/libxfs/xfs_sb.c:1178
xfs_ioc_setlabel fs/xfs/xfs_ioctl.c:1143 [inline]
xfs_file_ioctl+0x165b/0x19e0 fs/xfs/xfs_ioctl.c:1298
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:907 [inline]
__se_sys_ioctl+0xfc/0x170 fs/ioctl.c:893
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f816deedc99
RSP: 002b:00007fff94625058 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000020000100 RCX: 00007f816deedc99
RDX: 0000000020000340 RSI: 0000000041009432 RDI: 0000000000000006
RBP: 00007f816df6a5f0 R08: 00005555629df4c0 R09: 00005555629df4c0
R10: 00005555629df4c0 R11: 0000000000000246 R12: 00007fff94625080
R13: 00007fff946252a8 R14: 431bde82d7b634db R15: 00007f816df3603b
</TASK>

Showing all locks held in the system:
1 lock held by khungtaskd/30:
#0: ffffffff8e7382e0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:326 [inline]
#0: ffffffff8e7382e0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:838 [inline]
#0: ffffffff8e7382e0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x55/0x2a0 kernel/locking/lockdep.c:6626
2 locks held by kworker/0:1H/43:
#0: ffff888022723148 ((wq_completion)xfs-log/loop0){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3206 [inline]
#0: ffff888022723148 ((wq_completion)xfs-log/loop0){+.+.}-{0:0}, at: process_scheduled_works+0x90a/0x1830 kernel/workqueue.c:3312
#1: ffffc90000b37d00 ((work_completion)(&iclog->ic_end_io_work)){+.+.}-{0:0}, at: process_one_work kernel/workqueue.c:3207 [inline]
#1: ffffc90000b37d00 ((work_completion)(&iclog->ic_end_io_work)){+.+.}-{0:0}, at: process_scheduled_works+0x945/0x1830 kernel/workqueue.c:3312
2 locks held by getty/4972:
#0: ffff8880303d20a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x25/0x70 drivers/tty/tty_ldisc.c:243
#1: ffffc900031332f0 (&ldata->atomic_read_lock){+.+.}-{3:3}, at: n_tty_read+0x6ac/0x1e00 drivers/tty/n_tty.c:2211
1 lock held by syz-executor224/5234:
#0: ffff8880118c2420 (sb_writers#9){.+.+}-{0:0}, at: mnt_want_write_file+0x61/0x200 fs/namespace.c:559

=============================================

NMI backtrace for cpu 0
CPU: 0 UID: 0 PID: 30 Comm: khungtaskd Not tainted 6.11.0-rc3-syzkaller-00271-g85652baa895b #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:93 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119
nmi_cpu_backtrace+0x49c/0x4d0 lib/nmi_backtrace.c:113
nmi_trigger_cpumask_backtrace+0x198/0x320 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:162 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:223 [inline]
watchdog+0xff4/0x1040 kernel/hung_task.c:379
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1
CPU: 1 UID: 0 PID: 1057 Comm: kworker/u8:5 Not tainted 6.11.0-rc3-syzkaller-00271-g85652baa895b #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024
Workqueue: events_unbound toggle_allocation_gate
RIP: 0010:preempt_count_add+0x6/0x190 kernel/sched/core.c:5643
Code: 48 8b 3c 24 e8 cb aa 94 00 eb 8c e9 ee fe ff ff 0f 1f 40 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 41 57 <41> 56 53 89 fb 48 c7 c0 c0 a9 ce 94 48 c1 e8 03 49 bf 00 00 00 00
RSP: 0018:ffffc90003b77498 EFLAGS: 00000002
RAX: 0000000000000002 RBX: ffff8880b933e9c0 RCX: ffffffff94cea903
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
RBP: 0000000000000000 R08: ffffffff813578d1 R09: 1ffff11017267d38
R10: dffffc0000000000 R11: ffffed1017267d39 R12: dffffc0000000000
R13: 00000000fffffabc R14: ffffc90003b77600 R15: ffffffff8e080a48
FS: 0000000000000000(0000) GS:ffff8880b9300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005628b5ca0680 CR3: 000000000e534000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<NMI>
</NMI>
<TASK>
_raw_spin_lock_nested+0x15/0x40 kernel/locking/spinlock.c:377
raw_spin_rq_lock_nested+0x2a/0x140 kernel/sched/core.c:560
raw_spin_rq_lock kernel/sched/sched.h:1415 [inline]
_raw_spin_rq_lock_irqsave kernel/sched/sched.h:1435 [inline]
rq_lock_irqsave kernel/sched/sched.h:1700 [inline]
sched_balance_update_blocked_averages+0x140/0x16a0 kernel/sched/fair.c:9490
sched_balance_newidle+0x49a/0xfd0 kernel/sched/fair.c:12412
pick_next_task_fair+0x27a/0xde0 kernel/sched/fair.c:8582
__pick_next_task+0xb0/0x2c0 kernel/sched/core.c:5812
__schedule+0x728/0x4a10 kernel/sched/core.c:6484
__schedule_loop kernel/sched/core.c:6606 [inline]
schedule+0x14b/0x320 kernel/sched/core.c:6621
toggle_allocation_gate+0x16a/0x250 mm/kfence/core.c:835
process_one_work kernel/workqueue.c:3231 [inline]
process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312
worker_thread+0x86d/0xd40 kernel/workqueue.c:3390
kthread+0x2f0/0x390 kernel/kthread.c:389
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 1.477 msecs


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

syzbot

Aug 18, 2024, 12:36:03 AM
to ax...@kernel.dk, chanda...@oracle.com, djw...@kernel.org, h...@lst.de, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com
syzbot has bisected this issue to:

commit 3eb96946f0be6bf447cbdf219aba22bc42672f92
Author: Christoph Hellwig <h...@lst.de>
Date: Wed May 24 06:05:38 2023 +0000

block: make bio_check_eod work for zero sized devices

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=178bfdcb980000
start commit: 85652baa895b Merge tag 'block-6.11-20240824' of git://git...
git tree: upstream
final oops: https://syzkaller.appspot.com/x/report.txt?x=144bfdcb980000
console output: https://syzkaller.appspot.com/x/log.txt?x=104bfdcb980000
Reported-by: syzbot+837bcd...@syzkaller.appspotmail.com
Fixes: 3eb96946f0be ("block: make bio_check_eod work for zero sized devices")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Dave Chinner

Aug 19, 2024, 6:16:49 PM
to syzbot, chanda...@oracle.com, djw...@kernel.org, linux-...@vger.kernel.org, linu...@vger.kernel.org, syzkall...@googlegroups.com
That XFS deadlock looks to have been there for a while.

The test case is setting the loop dev to zero size immediately after
mount and then clearing the filesystem label. xfs_sync_sb_buf()
calls xfs_trans_bhold() so the transaction commit does not unlock
the buffer. The transaction is synchronous, so it issues the journal
IO then goes to sleep waiting for journal IO completion with the
superblock buffer locked.

The journal IO errors out and triggers a shutdown, which runs error
handling on the log items to error them out. That requires taking
the buffer lock on buffer items, and that lock is already held by
the task stuck in the log force waiting for IO completion...

I think it can be fixed by doing something like this:

static void
xlog_state_shutdown_callbacks(
	struct xlog		*log)
{
	struct xlog_in_core	*iclog;
	LIST_HEAD(cb_list);

	iclog = log->l_iclog;
	do {
		if (atomic_read(&iclog->ic_refcnt)) {
			/* Reference holder will re-run iclog callbacks. */
			continue;
		}
		list_splice_init(&iclog->ic_callbacks, &cb_list);
+
+		/*
+		 * Wake waiters before processing callbacks as synchronous
+		 * transactions might be waiting on completion whilst holding
+		 * locks we need to error out log items here.
+		 */
+		wake_up_all(&iclog->ic_write_wait);
+		wake_up_all(&iclog->ic_force_wait);
		spin_unlock(&log->l_icloglock);

		xlog_cil_process_committed(&cb_list);

		spin_lock(&log->l_icloglock);
-		wake_up_all(&iclog->ic_write_wait);
-		wake_up_all(&iclog->ic_force_wait);
	} while ((iclog = iclog->ic_next) != log->l_iclog);

	wake_up_all(&log->l_flush_wait);
}

The log is marked as shutdown before xlog_state_shutdown_callbacks()
is run, and xlog_wait_on_iclog() checks whether the log is shut down
after it is woken. Hence if we wake the waiters first, they'll see
the shutdown, get -EIO returned from xfs_trans_commit(), and unlock
and release all the objects they have locked.

Once they do this, the xfs_buf_item_unpin() call that is stuck
waiting for the buffer lock to be released will gain the lock and
make progress. This will allow the journal shutdown to complete, and
the hang will go away.
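
As a toy illustration of why the wakeup ordering matters, here is a
hedged userspace model in plain pthreads; all names are illustrative
and none of this is kernel code. The "committer" thread stands in for
the task holding the superblock buffer lock across the log force, and
the "ioend" thread for the shutdown path that needs that lock to error
out the buffer item:

#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

static pthread_mutex_t buf_lock = PTHREAD_MUTEX_INITIALIZER;   /* the held buffer lock */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER; /* l_icloglock stand-in */
static pthread_cond_t iclog_wait = PTHREAD_COND_INITIALIZER;   /* ic_force_wait stand-in */
static bool shutdown_flag;

static void *committer(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&buf_lock);           /* held across the commit, as with xfs_trans_bhold */
	pthread_mutex_lock(&state_lock);
	while (!shutdown_flag)                   /* wait for "journal IO completion" */
		pthread_cond_wait(&iclog_wait, &state_lock);
	pthread_mutex_unlock(&state_lock);
	pthread_mutex_unlock(&buf_lock);         /* sees the shutdown, releases the buffer */
	return NULL;
}

static void *ioend(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&state_lock);
	shutdown_flag = true;
	pthread_cond_broadcast(&iclog_wait);     /* wake the waiters first ... */
	pthread_mutex_unlock(&state_lock);

	pthread_mutex_lock(&buf_lock);           /* ... then the callback work can take the lock */
	pthread_mutex_unlock(&buf_lock);
	return NULL;
}

int main(void)
{
	pthread_t a, b;

	pthread_create(&a, NULL, committer, NULL);
	pthread_create(&b, NULL, ioend, NULL);
	pthread_join(a, NULL);
	pthread_join(b, NULL);
	puts("no deadlock");
	return 0;
}

Taking buf_lock before the broadcast in ioend() instead models the
hang seen in the report.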

-Dave.
--
Dave Chinner
da...@fromorbit.com