[syzbot] [ntfs3?] possible deadlock in ntfs_mark_rec_free (2)


syzbot

Apr 29, 2024, 11:29:30 PM
to almaz.ale...@paragon-software.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nt...@lists.linux.dev, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: e33c4963bf53 Merge tag 'nfsd-6.9-5' of git://git.kernel.or..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=12a032a0980000
kernel config: https://syzkaller.appspot.com/x/.config?x=545d4b3e07d6ccbc
dashboard link: https://syzkaller.appspot.com/bug?extid=016b09736213e65d106e
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/e0ce27d8874a/disk-e33c4963.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/b4f35a65c416/vmlinux-e33c4963.xz
kernel image: https://storage.googleapis.com/syzbot-assets/f1c3abd538d5/bzImage-e33c4963.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+016b09...@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
6.9.0-rc5-syzkaller-00053-ge33c4963bf53 #0 Not tainted
------------------------------------------------------
kswapd0/87 is trying to acquire lock:
ffff88820b27c128 (&wnd->rw_lock/1){+.+.}-{3:3}, at: ntfs_mark_rec_free+0x2f4/0x400 fs/ntfs3/fsntfs.c:742

but task is already holding lock:
ffffffff8d9304a0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x166/0x19a0 mm/vmscan.c:6782

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (fs_reclaim){+.+.}-{0:0}:
__fs_reclaim_acquire mm/page_alloc.c:3698 [inline]
fs_reclaim_acquire+0x102/0x160 mm/page_alloc.c:3712
might_alloc include/linux/sched/mm.h:312 [inline]
slab_pre_alloc_hook mm/slub.c:3746 [inline]
slab_alloc_node mm/slub.c:3827 [inline]
__do_kmalloc_node mm/slub.c:3965 [inline]
__kmalloc_node+0xbb/0x480 mm/slub.c:3973
kmalloc_node include/linux/slab.h:648 [inline]
kvmalloc_node+0x9d/0x1a0 mm/util.c:634
kvmalloc include/linux/slab.h:766 [inline]
run_add_entry+0x759/0xbe0 fs/ntfs3/run.c:389
attr_allocate_clusters+0x213/0x720 fs/ntfs3/attrib.c:181
attr_set_size+0x1514/0x2c60 fs/ntfs3/attrib.c:572
ntfs_set_size+0x13d/0x220 fs/ntfs3/inode.c:839
ntfs_extend+0x401/0x570 fs/ntfs3/file.c:335
ntfs_file_write_iter+0x433/0x2050 fs/ntfs3/file.c:1115
call_write_iter include/linux/fs.h:2110 [inline]
new_sync_write fs/read_write.c:497 [inline]
vfs_write+0x6db/0x1100 fs/read_write.c:590
ksys_write+0x12f/0x260 fs/read_write.c:643
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcf/0x260 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #1 (&ni->file.run_lock#2){++++}-{3:3}:
down_write+0x3a/0x50 kernel/locking/rwsem.c:1579
ntfs_extend_mft+0x138/0x430 fs/ntfs3/fsntfs.c:511
ntfs_look_free_mft+0x661/0xdd0 fs/ntfs3/fsntfs.c:709
ntfs_create_inode+0x3a7/0x42c0 fs/ntfs3/inode.c:1329
ntfs_atomic_open+0x4d6/0x650 fs/ntfs3/namei.c:434
atomic_open fs/namei.c:3360 [inline]
lookup_open.isra.0+0xc98/0x13c0 fs/namei.c:3468
open_last_lookups fs/namei.c:3566 [inline]
path_openat+0x92f/0x2990 fs/namei.c:3796
do_filp_open+0x1dc/0x430 fs/namei.c:3826
do_sys_openat2+0x17a/0x1e0 fs/open.c:1406
do_sys_open fs/open.c:1421 [inline]
__do_sys_openat fs/open.c:1437 [inline]
__se_sys_openat fs/open.c:1432 [inline]
__x64_sys_openat+0x175/0x210 fs/open.c:1432
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcf/0x260 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

-> #0 (&wnd->rw_lock/1){+.+.}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain kernel/locking/lockdep.c:3869 [inline]
__lock_acquire+0x2478/0x3b30 kernel/locking/lockdep.c:5137
lock_acquire kernel/locking/lockdep.c:5754 [inline]
lock_acquire+0x1b1/0x560 kernel/locking/lockdep.c:5719
down_write_nested+0x3d/0x50 kernel/locking/rwsem.c:1695
ntfs_mark_rec_free+0x2f4/0x400 fs/ntfs3/fsntfs.c:742
ni_delete_all+0x6ad/0x880 fs/ntfs3/frecord.c:1637
ni_clear+0x519/0x6a0 fs/ntfs3/frecord.c:106
evict+0x2ed/0x6c0 fs/inode.c:667
iput_final fs/inode.c:1741 [inline]
iput.part.0+0x5a8/0x7f0 fs/inode.c:1767
iput+0x5c/0x80 fs/inode.c:1757
dentry_unlink_inode+0x295/0x440 fs/dcache.c:400
__dentry_kill+0x1d0/0x600 fs/dcache.c:603
shrink_kill fs/dcache.c:1048 [inline]
shrink_dentry_list+0x140/0x5d0 fs/dcache.c:1075
prune_dcache_sb+0xeb/0x150 fs/dcache.c:1156
super_cache_scan+0x32a/0x550 fs/super.c:221
do_shrink_slab+0x44f/0x11c0 mm/shrinker.c:435
shrink_slab_memcg mm/shrinker.c:548 [inline]
shrink_slab+0xa87/0x1310 mm/shrinker.c:626
shrink_one+0x493/0x7c0 mm/vmscan.c:4774
shrink_many mm/vmscan.c:4835 [inline]
lru_gen_shrink_node mm/vmscan.c:4935 [inline]
shrink_node+0x231f/0x3a80 mm/vmscan.c:5894
kswapd_shrink_node mm/vmscan.c:6704 [inline]
balance_pgdat+0x9a0/0x19a0 mm/vmscan.c:6895
kswapd+0x5ea/0xbf0 mm/vmscan.c:7164
kthread+0x2c1/0x3a0 kernel/kthread.c:388
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

Chain exists of:
&wnd->rw_lock/1 --> &ni->file.run_lock#2 --> fs_reclaim

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               lock(&ni->file.run_lock#2);
                               lock(fs_reclaim);
  lock(&wnd->rw_lock/1);

*** DEADLOCK ***

2 locks held by kswapd0/87:
#0: ffffffff8d9304a0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat+0x166/0x19a0 mm/vmscan.c:6782
#1: ffff88820b27e0e0 (&type->s_umount_key#89){++++}-{3:3}, at: super_trylock_shared fs/super.c:561 [inline]
#1: ffff88820b27e0e0 (&type->s_umount_key#89){++++}-{3:3}, at: super_cache_scan+0x96/0x550 fs/super.c:196

stack backtrace:
CPU: 0 PID: 87 Comm: kswapd0 Not tainted 6.9.0-rc5-syzkaller-00053-ge33c4963bf53 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:114
check_noncircular+0x31a/0x400 kernel/locking/lockdep.c:2187
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain kernel/locking/lockdep.c:3869 [inline]
__lock_acquire+0x2478/0x3b30 kernel/locking/lockdep.c:5137
lock_acquire kernel/locking/lockdep.c:5754 [inline]
lock_acquire+0x1b1/0x560 kernel/locking/lockdep.c:5719
down_write_nested+0x3d/0x50 kernel/locking/rwsem.c:1695
ntfs_mark_rec_free+0x2f4/0x400 fs/ntfs3/fsntfs.c:742
ni_delete_all+0x6ad/0x880 fs/ntfs3/frecord.c:1637
ni_clear+0x519/0x6a0 fs/ntfs3/frecord.c:106
evict+0x2ed/0x6c0 fs/inode.c:667
iput_final fs/inode.c:1741 [inline]
iput.part.0+0x5a8/0x7f0 fs/inode.c:1767
iput+0x5c/0x80 fs/inode.c:1757
dentry_unlink_inode+0x295/0x440 fs/dcache.c:400
__dentry_kill+0x1d0/0x600 fs/dcache.c:603
shrink_kill fs/dcache.c:1048 [inline]
shrink_dentry_list+0x140/0x5d0 fs/dcache.c:1075
prune_dcache_sb+0xeb/0x150 fs/dcache.c:1156
super_cache_scan+0x32a/0x550 fs/super.c:221
do_shrink_slab+0x44f/0x11c0 mm/shrinker.c:435
shrink_slab_memcg mm/shrinker.c:548 [inline]
shrink_slab+0xa87/0x1310 mm/shrinker.c:626
shrink_one+0x493/0x7c0 mm/vmscan.c:4774
shrink_many mm/vmscan.c:4835 [inline]
lru_gen_shrink_node mm/vmscan.c:4935 [inline]
shrink_node+0x231f/0x3a80 mm/vmscan.c:5894
kswapd_shrink_node mm/vmscan.c:6704 [inline]
balance_pgdat+0x9a0/0x19a0 mm/vmscan.c:6895
kswapd+0x5ea/0xbf0 mm/vmscan.c:7164
kthread+0x2c1/0x3a0 kernel/kthread.c:388
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

syzbot

May 17, 2024, 1:28:21 AM
to almaz.ale...@paragon-software.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nt...@lists.linux.dev, syzkall...@googlegroups.com
syzbot has found a reproducer for the following issue on:

HEAD commit: fda5695d692c Merge branch 'for-next/core' into for-kernelci
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=15248fb8980000
kernel config: https://syzkaller.appspot.com/x/.config?x=95dc1de8407c7270
dashboard link: https://syzkaller.appspot.com/bug?extid=016b09736213e65d106e
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
userspace arch: arm64
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13787684980000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=10a93c92980000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/07f3214ff0d9/disk-fda5695d.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/70e2e2c864e8/vmlinux-fda5695d.xz
kernel image: https://storage.googleapis.com/syzbot-assets/b259942a16dc/Image-fda5695d.gz.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/0c9ec56039c3/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+016b09...@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
6.9.0-rc7-syzkaller-gfda5695d692c #0 Not tainted
------------------------------------------------------
kworker/u8:7/652 is trying to acquire lock:
ffff0000d80fa128 (&wnd->rw_lock/1){+.+.}-{3:3}, at: ntfs_mark_rec_free+0x48/0x270 fs/ntfs3/fsntfs.c:742

but task is already holding lock:
ffff0000decb6fa0 (&ni->ni_lock#3){+.+.}-{3:3}, at: ni_trylock fs/ntfs3/ntfs_fs.h:1143 [inline]
ffff0000decb6fa0 (&ni->ni_lock#3){+.+.}-{3:3}, at: ni_write_inode+0x168/0xda4 fs/ntfs3/frecord.c:3265

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&ni->ni_lock#3){+.+.}-{3:3}:
__mutex_lock_common+0x190/0x21a0 kernel/locking/mutex.c:608
__mutex_lock kernel/locking/mutex.c:752 [inline]
mutex_lock_nested+0x2c/0x38 kernel/locking/mutex.c:804
ntfs_set_state+0x1a4/0x5c0 fs/ntfs3/fsntfs.c:947
mi_read+0x3e0/0x4d8 fs/ntfs3/record.c:185
mi_format_new+0x174/0x514 fs/ntfs3/record.c:420
ni_add_subrecord+0xd0/0x3c4 fs/ntfs3/frecord.c:372
ntfs_look_free_mft+0x4c8/0xd1c fs/ntfs3/fsntfs.c:715
ni_create_attr_list+0x764/0xf54 fs/ntfs3/frecord.c:876
ni_ins_attr_ext+0x300/0xa0c fs/ntfs3/frecord.c:974
ni_insert_attr fs/ntfs3/frecord.c:1141 [inline]
ni_insert_resident fs/ntfs3/frecord.c:1525 [inline]
ni_add_name+0x658/0xc14 fs/ntfs3/frecord.c:3047
ni_rename+0xc8/0x1d8 fs/ntfs3/frecord.c:3087
ntfs_rename+0x610/0xae0 fs/ntfs3/namei.c:334
vfs_rename+0x9bc/0xc84 fs/namei.c:4880
do_renameat2+0x9c8/0xe40 fs/namei.c:5037
__do_sys_renameat2 fs/namei.c:5071 [inline]
__se_sys_renameat2 fs/namei.c:5068 [inline]
__arm64_sys_renameat2+0xe0/0xfc fs/namei.c:5068
__invoke_syscall arch/arm64/kernel/syscall.c:34 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:48
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:133
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:152
el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712
el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730
el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598

-> #0 (&wnd->rw_lock/1){+.+.}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain kernel/locking/lockdep.c:3869 [inline]
__lock_acquire+0x3384/0x763c kernel/locking/lockdep.c:5137
lock_acquire+0x248/0x73c kernel/locking/lockdep.c:5754
down_write_nested+0x58/0xcc kernel/locking/rwsem.c:1695
ntfs_mark_rec_free+0x48/0x270 fs/ntfs3/fsntfs.c:742
ni_write_inode+0xa28/0xda4 fs/ntfs3/frecord.c:3365
ntfs3_write_inode+0x70/0x98 fs/ntfs3/inode.c:1046
write_inode fs/fs-writeback.c:1498 [inline]
__writeback_single_inode+0x5f0/0x1548 fs/fs-writeback.c:1715
writeback_sb_inodes+0x700/0x101c fs/fs-writeback.c:1941
wb_writeback+0x404/0x1048 fs/fs-writeback.c:2117
wb_do_writeback fs/fs-writeback.c:2264 [inline]
wb_workfn+0x394/0x104c fs/fs-writeback.c:2304
process_one_work+0x7b8/0x15d4 kernel/workqueue.c:3267
process_scheduled_works kernel/workqueue.c:3348 [inline]
worker_thread+0x938/0xef4 kernel/workqueue.c:3429
kthread+0x288/0x310 kernel/kthread.c:388
ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860

other info that might help us debug this:

Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&ni->ni_lock#3);
                               lock(&wnd->rw_lock/1);
                               lock(&ni->ni_lock#3);
  lock(&wnd->rw_lock/1);

*** DEADLOCK ***

3 locks held by kworker/u8:7/652:
#0: ffff0000c20c6948 ((wq_completion)writeback){+.+.}-{0:0}, at: process_one_work+0x668/0x15d4 kernel/workqueue.c:3241
#1: ffff800098d87c20 ((work_completion)(&(&wb->dwork)->work)){+.+.}-{0:0}, at: process_one_work+0x6b4/0x15d4 kernel/workqueue.c:3241
#2: ffff0000decb6fa0 (&ni->ni_lock#3){+.+.}-{3:3}, at: ni_trylock fs/ntfs3/ntfs_fs.h:1143 [inline]
#2: ffff0000decb6fa0 (&ni->ni_lock#3){+.+.}-{3:3}, at: ni_write_inode+0x168/0xda4 fs/ntfs3/frecord.c:3265

stack backtrace:
CPU: 1 PID: 652 Comm: kworker/u8:7 Not tainted 6.9.0-rc7-syzkaller-gfda5695d692c #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Workqueue: writeback wb_workfn (flush-7:0)
Call trace:
dump_backtrace+0x1b8/0x1e4 arch/arm64/kernel/stacktrace.c:317
show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:324
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xe4/0x150 lib/dump_stack.c:114
dump_stack+0x1c/0x28 lib/dump_stack.c:123
print_circular_bug+0x150/0x1b8 kernel/locking/lockdep.c:2060
check_noncircular+0x310/0x404 kernel/locking/lockdep.c:2187
check_prev_add kernel/locking/lockdep.c:3134 [inline]
check_prevs_add kernel/locking/lockdep.c:3253 [inline]
validate_chain kernel/locking/lockdep.c:3869 [inline]
__lock_acquire+0x3384/0x763c kernel/locking/lockdep.c:5137
lock_acquire+0x248/0x73c kernel/locking/lockdep.c:5754
down_write_nested+0x58/0xcc kernel/locking/rwsem.c:1695
ntfs_mark_rec_free+0x48/0x270 fs/ntfs3/fsntfs.c:742
ni_write_inode+0xa28/0xda4 fs/ntfs3/frecord.c:3365
ntfs3_write_inode+0x70/0x98 fs/ntfs3/inode.c:1046
write_inode fs/fs-writeback.c:1498 [inline]
__writeback_single_inode+0x5f0/0x1548 fs/fs-writeback.c:1715
writeback_sb_inodes+0x700/0x101c fs/fs-writeback.c:1941
wb_writeback+0x404/0x1048 fs/fs-writeback.c:2117
wb_do_writeback fs/fs-writeback.c:2264 [inline]
wb_workfn+0x394/0x104c fs/fs-writeback.c:2304
process_one_work+0x7b8/0x15d4 kernel/workqueue.c:3267
process_scheduled_works kernel/workqueue.c:3348 [inline]
worker_thread+0x938/0xef4 kernel/workqueue.c:3429
kthread+0x288/0x310 kernel/kthread.c:388
ret_from_fork+0x10/0x20 arch/arm64/kernel/entry.S:860


---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

syzbot

May 17, 2024, 12:29:06 PM
to almaz.ale...@paragon-software.com, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ll...@lists.linux.dev, nat...@kernel.org, ndesau...@google.com, nt...@lists.linux.dev, syzkall...@googlegroups.com, tr...@redhat.com
syzbot has bisected this issue to:

commit e0f363a98830e8d7d70fbaf91c07ae0b7c57aafe
Author: Konstantin Komarov <almaz.ale...@paragon-software.com>
Date: Mon May 8 07:36:28 2023 +0000

fs/ntfs3: Mark ntfs dirty when on-disk struct is corrupted

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1431b56c980000
start commit: ea5f6ad9ad96 Merge tag 'platform-drivers-x86-v6.10-1' of g..
git tree: upstream
final oops: https://syzkaller.appspot.com/x/report.txt?x=1631b56c980000
console output: https://syzkaller.appspot.com/x/log.txt?x=1231b56c980000
kernel config: https://syzkaller.appspot.com/x/.config?x=f59c50304274d557
dashboard link: https://syzkaller.appspot.com/bug?extid=016b09736213e65d106e
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16340168980000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=107e7f04980000

Reported-by: syzbot+016b09...@syzkaller.appspotmail.com
Fixes: e0f363a98830 ("fs/ntfs3: Mark ntfs dirty when on-disk struct is corrupted")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection