[syzbot] possible deadlock in ntfs_fiemap

11 views
Skip to first unread message

syzbot

unread,
Nov 30, 2022, 7:51:59 AM11/30/22
to almaz.ale...@paragon-software.com, linux-...@vger.kernel.org, ll...@lists.linux.dev, nat...@kernel.org, ndesau...@google.com, nt...@lists.linux.dev, syzkall...@googlegroups.com, tr...@redhat.com
Hello,

syzbot found the following issue on:

HEAD commit: 01f856ae6d0c Merge tag 'net-6.1-rc8-2' of git://git.kernel..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15bc5fc3880000
kernel config: https://syzkaller.appspot.com/x/.config?x=2325e409a9a893e1
dashboard link: https://syzkaller.appspot.com/bug?extid=96cee7d33ca3f87eee86
compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/5428d604f56a/disk-01f856ae.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/e953d290d254/vmlinux-01f856ae.xz
kernel image: https://storage.googleapis.com/syzbot-assets/3f71610a4904/bzImage-01f856ae.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+96cee7...@syzkaller.appspotmail.com

loop5: detected capacity change from 0 to 4096
ntfs3: loop5: Different NTFS' sector size (2048) and media sector size (512)
======================================================
WARNING: possible circular locking dependency detected
6.1.0-rc7-syzkaller-00101-g01f856ae6d0c #0 Not tainted
------------------------------------------------------
syz-executor.5/25213 is trying to acquire lock:
ffff88801d328fd8 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault+0x8f/0x110 mm/memory.c:5645

but task is already holding lock:
ffff88801ebd34a0 (&ni->ni_lock/4){+.+.}-{3:3}, at: ni_lock fs/ntfs3/ntfs_fs.h:1108 [inline]
ffff88801ebd34a0 (&ni->ni_lock/4){+.+.}-{3:3}, at: ntfs_fiemap+0x101/0x180 fs/ntfs3/file.c:1243

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&ni->ni_lock/4){+.+.}-{3:3}
:
lock_acquire+0x182/0x3c0 kernel/locking/lockdep.c:5668
__mutex_lock_common+0x1bd/0x26e0 kernel/locking/mutex.c:603
__mutex_lock kernel/locking/mutex.c:747 [inline]
mutex_lock_nested+0x17/0x20 kernel/locking/mutex.c:799
ni_lock fs/ntfs3/ntfs_fs.h:1108 [inline]
attr_data_get_block+0x301/0x2370 fs/ntfs3/attrib.c:917
ntfs_file_mmap+0x48c/0x730 fs/ntfs3/file.c:387
call_mmap include/linux/fs.h:2204 [inline]
mmap_region+0xfe6/0x1e20 mm/mmap.c:2625
do_mmap+0x8d9/0xf30 mm/mmap.c:1412
vm_mmap_pgoff+0x19e/0x2b0 mm/util.c:520
ksys_mmap_pgoff+0x48c/0x6d0 mm/mmap.c:1458
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #0 (&mm->mmap_lock#2){++++}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3097 [inline]
check_prevs_add kernel/locking/lockdep.c:3216 [inline]
validate_chain+0x1898/0x6ae0 kernel/locking/lockdep.c:3831
__lock_acquire+0x1292/0x1f60 kernel/locking/lockdep.c:5055
lock_acquire+0x182/0x3c0 kernel/locking/lockdep.c:5668
__might_fault+0xb2/0x110 mm/memory.c:5646
_copy_to_user+0x26/0x130 lib/usercopy.c:29
copy_to_user include/linux/uaccess.h:169 [inline]
fiemap_fill_next_extent+0x22e/0x410 fs/ioctl.c:144
ni_fiemap+0xf57/0x1130 fs/ntfs3/frecord.c:1934
ntfs_fiemap+0x134/0x180 fs/ntfs3/file.c:1245
ioctl_fiemap fs/ioctl.c:219 [inline]
do_vfs_ioctl+0x187f/0x29a0 fs/ioctl.c:810
__do_sys_ioctl fs/ioctl.c:868 [inline]
__se_sys_ioctl+0x83/0x170 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

other info that might help us debug this:

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(&ni->ni_lock/4);
lock(&mm->mmap_lock#2);
lock(&ni->ni_lock/4);
lock(&mm->mmap_lock#2
);

*** DEADLOCK ***

1 lock held by syz-executor.5/25213:
#0: ffff88801ebd34a0 (&ni->ni_lock
/4){+.+.}-{3:3}, at: ni_lock fs/ntfs3/ntfs_fs.h:1108 [inline]
/4){+.+.}-{3:3}, at: ntfs_fiemap+0x101/0x180 fs/ntfs3/file.c:1243

stack backtrace:
CPU: 0 PID: 25213 Comm: syz-executor.5 Not tainted 6.1.0-rc7-syzkaller-00101-g01f856ae6d0c #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1b1/0x28e lib/dump_stack.c:106
check_noncircular+0x2cc/0x390 kernel/locking/lockdep.c:2177
check_prev_add kernel/locking/lockdep.c:3097 [inline]
check_prevs_add kernel/locking/lockdep.c:3216 [inline]
validate_chain+0x1898/0x6ae0 kernel/locking/lockdep.c:3831
__lock_acquire+0x1292/0x1f60 kernel/locking/lockdep.c:5055
lock_acquire+0x182/0x3c0 kernel/locking/lockdep.c:5668
__might_fault+0xb2/0x110 mm/memory.c:5646
_copy_to_user+0x26/0x130 lib/usercopy.c:29
copy_to_user include/linux/uaccess.h:169 [inline]
fiemap_fill_next_extent+0x22e/0x410 fs/ioctl.c:144
ni_fiemap+0xf57/0x1130 fs/ntfs3/frecord.c:1934
ntfs_fiemap+0x134/0x180 fs/ntfs3/file.c:1245
ioctl_fiemap fs/ioctl.c:219 [inline]
do_vfs_ioctl+0x187f/0x29a0 fs/ioctl.c:810
__do_sys_ioctl fs/ioctl.c:868 [inline]
__se_sys_ioctl+0x83/0x170 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f692648c0d9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f6927202168 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f69265abf80 RCX: 00007f692648c0d9
RDX: 0000000020000240 RSI: 00000000c020660b RDI: 0000000000000004
RBP: 00007f69264e7ae9 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ffe980edc3f R14: 00007f6927202300 R15: 0000000000022000
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

syzbot

unread,
Dec 8, 2022, 11:00:54 PM12/8/22
to almaz.ale...@paragon-software.com, linux-...@vger.kernel.org, ll...@lists.linux.dev, nat...@kernel.org, ndesau...@google.com, nt...@lists.linux.dev, syzkall...@googlegroups.com, tr...@redhat.com
syzbot has found a reproducer for the following issue on:

HEAD commit: f3e8416619ce Merge tag 'soc-fixes-6.1-5' of git://git.kern..
git tree: upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=10e21467880000
kernel config: https://syzkaller.appspot.com/x/.config?x=d58e7fe7f9cf5e24
dashboard link: https://syzkaller.appspot.com/bug?extid=96cee7d33ca3f87eee86
compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=116f8843880000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11646857880000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/628abc27cbe7/disk-f3e84166.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/2f19ea836174/vmlinux-f3e84166.xz
kernel image: https://storage.googleapis.com/syzbot-assets/f2e1347e85a5/bzImage-f3e84166.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/d38a9877608a/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+96cee7...@syzkaller.appspotmail.com

======================================================
WARNING: possible circular locking dependency detected
6.1.0-rc8-syzkaller-00035-gf3e8416619ce #0 Not tainted
------------------------------------------------------
syz-executor145/3632 is trying to acquire lock:
ffff88801981bb58 (&mm->mmap_lock#2){++++}-{3:3}, at: __might_fault+0x8f/0x110 mm/memory.c:5644

but task is already holding lock:
ffff88807288d220 (&ni->ni_lock/4){+.+.}-{3:3}, at: ni_lock fs/ntfs3/ntfs_fs.h:1108 [inline]
ffff88807288d220 (&ni->ni_lock/4){+.+.}-{3:3}, at: ntfs_fiemap+0x101/0x180 fs/ntfs3/file.c:1243

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #1 (&ni->ni_lock/4){+.+.}-{3:3}:
lock_acquire+0x182/0x3c0 kernel/locking/lockdep.c:5668
__mutex_lock_common+0x1bd/0x26e0 kernel/locking/mutex.c:603
__mutex_lock kernel/locking/mutex.c:747 [inline]
mutex_lock_nested+0x17/0x20 kernel/locking/mutex.c:799
ni_lock fs/ntfs3/ntfs_fs.h:1108 [inline]
attr_data_get_block+0x301/0x2370 fs/ntfs3/attrib.c:917
ntfs_file_mmap+0x48c/0x730 fs/ntfs3/file.c:387
call_mmap include/linux/fs.h:2204 [inline]
mmap_region+0xfe6/0x1e20 mm/mmap.c:2622
do_mmap+0x8d9/0xf30 mm/mmap.c:1412
vm_mmap_pgoff+0x19e/0x2b0 mm/util.c:520
ksys_mmap_pgoff+0x48c/0x6d0 mm/mmap.c:1458
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

-> #0 (&mm->mmap_lock#2){++++}-{3:3}:
check_prev_add kernel/locking/lockdep.c:3097 [inline]
check_prevs_add kernel/locking/lockdep.c:3216 [inline]
validate_chain+0x1898/0x6ae0 kernel/locking/lockdep.c:3831
__lock_acquire+0x1292/0x1f60 kernel/locking/lockdep.c:5055
lock_acquire+0x182/0x3c0 kernel/locking/lockdep.c:5668
__might_fault+0xb2/0x110 mm/memory.c:5645
_copy_to_user+0x26/0x130 lib/usercopy.c:29
copy_to_user include/linux/uaccess.h:169 [inline]
fiemap_fill_next_extent+0x22e/0x410 fs/ioctl.c:144
ni_fiemap+0xf57/0x1130 fs/ntfs3/frecord.c:1934
ntfs_fiemap+0x134/0x180 fs/ntfs3/file.c:1245
ioctl_fiemap fs/ioctl.c:219 [inline]
do_vfs_ioctl+0x187f/0x29a0 fs/ioctl.c:810
__do_sys_ioctl fs/ioctl.c:868 [inline]
__se_sys_ioctl+0x83/0x170 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

other info that might help us debug this:

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(&ni->ni_lock/4);
lock(&mm->mmap_lock#2);
lock(&ni->ni_lock/4);
lock(&mm->mmap_lock#2);

*** DEADLOCK ***

1 lock held by syz-executor145/3632:
#0: ffff88807288d220 (&ni->ni_lock/4){+.+.}-{3:3}, at: ni_lock fs/ntfs3/ntfs_fs.h:1108 [inline]
#0: ffff88807288d220 (&ni->ni_lock/4){+.+.}-{3:3}, at: ntfs_fiemap+0x101/0x180 fs/ntfs3/file.c:1243

stack backtrace:
CPU: 0 PID: 3632 Comm: syz-executor145 Not tainted 6.1.0-rc8-syzkaller-00035-gf3e8416619ce #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1b1/0x28e lib/dump_stack.c:106
check_noncircular+0x2cc/0x390 kernel/locking/lockdep.c:2177
check_prev_add kernel/locking/lockdep.c:3097 [inline]
check_prevs_add kernel/locking/lockdep.c:3216 [inline]
validate_chain+0x1898/0x6ae0 kernel/locking/lockdep.c:3831
__lock_acquire+0x1292/0x1f60 kernel/locking/lockdep.c:5055
lock_acquire+0x182/0x3c0 kernel/locking/lockdep.c:5668
__might_fault+0xb2/0x110 mm/memory.c:5645
_copy_to_user+0x26/0x130 lib/usercopy.c:29
copy_to_user include/linux/uaccess.h:169 [inline]
fiemap_fill_next_extent+0x22e/0x410 fs/ioctl.c:144
ni_fiemap+0xf57/0x1130 fs/ntfs3/frecord.c:1934
ntfs_fiemap+0x134/0x180 fs/ntfs3/file.c:1245
ioctl_fiemap fs/ioctl.c:219 [inline]
do_vfs_ioctl+0x187f/0x29a0 fs/ioctl.c:810
__do_sys_ioctl fs/ioctl.c:868 [inline]
__se_sys_ioctl+0x83/0x170 fs/ioctl.c:856
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f84c755aca9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 14 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 c0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffdcb203e18 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f84c755aca9
RDX: 0000000020000040 RSI: 00000000c020660b RDI: 0000000000000006
RBP: 00007f84c751a2b0 R08: 0000000000000000 R09: 0000000000000000

Hillf Danton

unread,
Dec 9, 2022, 2:36:47 AM12/9/22
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On 08 Dec 2022 20:00:53 -0800
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: f3e8416619ce Merge tag 'soc-fixes-6.1-5' of git://git.kern..
> git tree: upstream
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11646857880000

mmap ioctl_fiemap
mmap_lock ni_lock
ni_lock run_lock
run_lock mmap_lock

To break the deadlock above, remove lock chains prior to mmap_lock in the ioctl path.

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master

--- x/fs/ntfs3/file.c
+++ y/fs/ntfs3/file.c
@@ -1240,12 +1240,8 @@ int ntfs_fiemap(struct inode *inode, str
if (err)
return err;

- ni_lock(ni);
-
err = ni_fiemap(ni, fieinfo, start, len);

- ni_unlock(ni);
-
return err;
}

--- x/fs/ntfs3/frecord.c
+++ y/fs/ntfs3/frecord.c
@@ -1906,6 +1906,8 @@ int ni_fiemap(struct ntfs_inode *ni, str
u32 flags;
bool ok;

+ ni_lock(ni);
+
if (S_ISDIR(ni->vfs_inode.i_mode)) {
run = &ni->dir.alloc_run;
attr = ni_find_attr(ni, NULL, NULL, ATTR_ALLOC, I30_NAME,
@@ -1917,6 +1919,7 @@ int ni_fiemap(struct ntfs_inode *ni, str
NULL);
if (!attr) {
err = -EINVAL;
+ ni_unlock(ni);
goto out;
}
if (is_attr_compressed(attr)) {
@@ -1925,17 +1928,19 @@ int ni_fiemap(struct ntfs_inode *ni, str
ntfs_inode_warn(
&ni->vfs_inode,
"fiemap is not supported for compressed file (cp -r)");
+ ni_unlock(ni);
goto out;
}
run_lock = &ni->file.run_lock;
}

if (!attr || !attr->non_res) {
- err = fiemap_fill_next_extent(
- fieinfo, 0, 0,
- attr ? le32_to_cpu(attr->res.data_size) : 0,
- FIEMAP_EXTENT_DATA_INLINE | FIEMAP_EXTENT_LAST |
- FIEMAP_EXTENT_MERGED);
+ u64 sz = attr ? le32_to_cpu(attr->res.data_size) : 0;
+ ni_unlock(ni);
+ err = fiemap_fill_next_extent(fieinfo, 0, 0, sz,
+ FIEMAP_EXTENT_DATA_INLINE |
+ FIEMAP_EXTENT_LAST |
+ FIEMAP_EXTENT_MERGED);
goto out;
}

@@ -1945,6 +1950,7 @@ int ni_fiemap(struct ntfs_inode *ni, str
end = alloc_size;

down_read(run_lock);
+ ni_unlock(ni);

while (vbo < end) {
if (idx == -1) {
@@ -2027,8 +2033,10 @@ int ni_fiemap(struct ntfs_inode *ni, str
if (vbo + dlen >= end)
flags |= FIEMAP_EXTENT_LAST;

+ up_read(run_lock);
err = fiemap_fill_next_extent(fieinfo, vbo, lbo, dlen,
flags);
+ down_read(run_lock);
if (err < 0)
break;
if (err == 1) {
@@ -2048,7 +2056,9 @@ int ni_fiemap(struct ntfs_inode *ni, str
if (vbo + bytes >= end)
flags |= FIEMAP_EXTENT_LAST;

+ up_read(run_lock);
err = fiemap_fill_next_extent(fieinfo, vbo, lbo, bytes, flags);
+ down_read(run_lock);
if (err < 0)
break;
if (err == 1) {
--

syzbot

unread,
Dec 9, 2022, 10:21:29 AM12/9/22
to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-and-tested-by: syzbot+96cee7...@syzkaller.appspotmail.com

Tested on:

commit: 0d1409e4 Merge tag 'drm-fixes-2022-12-09' of git://ano..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=12726a43880000
kernel config: https://syzkaller.appspot.com/x/.config?x=d58e7fe7f9cf5e24
dashboard link: https://syzkaller.appspot.com/bug?extid=96cee7d33ca3f87eee86
compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
patch: https://syzkaller.appspot.com/x/patch.diff?x=14dba623880000

Note: testing is done by a robot and is best-effort only.

Tetsuo Handa

unread,
Apr 12, 2023, 9:11:53 AM4/12/23
to syzbot, nt...@lists.linux.dev, syzkall...@googlegroups.com, Konstantin Komarov, Hillf Danton, linux-fsdevel, linux-mm, tr...@redhat.com, ndesau...@google.com, nat...@kernel.org
syzbot is reporting circular locking dependency between ntfs_file_mmap()
(which has mm->mmap_lock => ni->ni_lock dependency) and ntfs_fiemap()
(which has ni->ni_lock => mm->mmap_lock dependency).

Since ni_fiemap() is called by ioctl(FS_IOC_FIEMAP) via optional
"struct inode_operations"->fiemap callback, I assume that importance of
ni_fiemap() is lower than ntfs_file_mmap().

Also, since Documentation/filesystems/fiemap.rst says that "If an error
is encountered while copying the extent to user memory, -EFAULT will be
returned.", I assume that ioctl(FS_IOC_FIEMAP) users can handle -EFAULT
error.

Therefore, in order to eliminate possibility of deadlock, until

Assumed ni_lock.
TODO: Less aggressive locks.

comment in ni_fiemap() is removed, use ni_fiemap() with best-effort basis
(i.e. fail with -EFAULT when a page fault is inevitable).

Reported-by: syzbot <syzbot+96cee7...@syzkaller.appspotmail.com>
Link: https://syzkaller.appspot.com/bug?extid=96cee7d33ca3f87eee86
Fixes: 4342306f0f0d ("fs/ntfs3: Add file operations and implementation")
Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
---
fs/ntfs3/file.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
index e9bdc1ff08c9..a9e7204e1579 100644
--- a/fs/ntfs3/file.c
+++ b/fs/ntfs3/file.c
@@ -1146,9 +1146,11 @@ int ntfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
return err;

ni_lock(ni);
+ pagefault_disable();

err = ni_fiemap(ni, fieinfo, start, len);

+ pagefault_enable();
ni_unlock(ni);

return err;
--
2.34.1

Matthew Wilcox

unread,
Apr 12, 2023, 9:14:05 AM4/12/23
to Tetsuo Handa, syzbot, nt...@lists.linux.dev, syzkall...@googlegroups.com, Konstantin Komarov, Hillf Danton, linux-fsdevel, linux-mm, tr...@redhat.com, ndesau...@google.com, nat...@kernel.org
On Wed, Apr 12, 2023 at 10:11:08PM +0900, Tetsuo Handa wrote:
> syzbot is reporting circular locking dependency between ntfs_file_mmap()
> (which has mm->mmap_lock => ni->ni_lock dependency) and ntfs_fiemap()
> (which has ni->ni_lock => mm->mmap_lock dependency).
>
> Since ni_fiemap() is called by ioctl(FS_IOC_FIEMAP) via optional
> "struct inode_operations"->fiemap callback, I assume that importance of
> ni_fiemap() is lower than ntfs_file_mmap().
>
> Also, since Documentation/filesystems/fiemap.rst says that "If an error
> is encountered while copying the extent to user memory, -EFAULT will be
> returned.", I assume that ioctl(FS_IOC_FIEMAP) users can handle -EFAULT
> error.

What? No, that doesn't mean "You can return -EFAULT because random luck".
That means "If you pass it an invalid address, you'll get -EFAULT back".

NACK.

Tetsuo Handa

unread,
Apr 12, 2023, 9:30:46 AM4/12/23
to Matthew Wilcox, syzbot, nt...@lists.linux.dev, syzkall...@googlegroups.com, Konstantin Komarov, Hillf Danton, linux-fsdevel, linux-mm, tr...@redhat.com, ndesau...@google.com, nat...@kernel.org
On 2023/04/12 22:13, Matthew Wilcox wrote:
>> Also, since Documentation/filesystems/fiemap.rst says that "If an error
>> is encountered while copying the extent to user memory, -EFAULT will be
>> returned.", I assume that ioctl(FS_IOC_FIEMAP) users can handle -EFAULT
>> error.
>
> What? No, that doesn't mean "You can return -EFAULT because random luck".
> That means "If you pass it an invalid address, you'll get -EFAULT back".
>
> NACK.

Then, why does fiemap.rst say "If an error is encountered" rather than
"If an invalid address is passed" ?

Does the definition of -EFAULT limited to "the caller does not have permission
to access this address" ?

----------
int fiemap_fill_next_extent(struct fiemap_extent_info *fieinfo, u64 logical,
u64 phys, u64 len, u32 flags)
{
struct fiemap_extent extent;
struct fiemap_extent __user *dest = fieinfo->fi_extents_start;

/* only count the extents */
if (fieinfo->fi_extents_max == 0) {
fieinfo->fi_extents_mapped++;
return (flags & FIEMAP_EXTENT_LAST) ? 1 : 0;
}

if (fieinfo->fi_extents_mapped >= fieinfo->fi_extents_max)
return 1;

if (flags & SET_UNKNOWN_FLAGS)
flags |= FIEMAP_EXTENT_UNKNOWN;
if (flags & SET_NO_UNMOUNTED_IO_FLAGS)
flags |= FIEMAP_EXTENT_ENCODED;
if (flags & SET_NOT_ALIGNED_FLAGS)
flags |= FIEMAP_EXTENT_NOT_ALIGNED;

memset(&extent, 0, sizeof(extent));
extent.fe_logical = logical;
extent.fe_physical = phys;
extent.fe_length = len;
extent.fe_flags = flags;

dest += fieinfo->fi_extents_mapped;
if (copy_to_user(dest, &extent, sizeof(extent)))
return -EFAULT;

fieinfo->fi_extents_mapped++;
if (fieinfo->fi_extents_mapped == fieinfo->fi_extents_max)
return 1;
return (flags & FIEMAP_EXTENT_LAST) ? 1 : 0;
}
----------

If copy_to_user() must not fail other than "the caller does not have
permission to access this address", what should we do for now?
Just remove ntfs_fiemap() and return -EOPNOTSUPP ?

Matthew Wilcox

unread,
Apr 12, 2023, 10:24:27 AM4/12/23
to Tetsuo Handa, syzbot, nt...@lists.linux.dev, syzkall...@googlegroups.com, Konstantin Komarov, Hillf Danton, linux-fsdevel, linux-mm, tr...@redhat.com, ndesau...@google.com, nat...@kernel.org
On Wed, Apr 12, 2023 at 10:29:37PM +0900, Tetsuo Handa wrote:
> On 2023/04/12 22:13, Matthew Wilcox wrote:
> >> Also, since Documentation/filesystems/fiemap.rst says that "If an error
> >> is encountered while copying the extent to user memory, -EFAULT will be
> >> returned.", I assume that ioctl(FS_IOC_FIEMAP) users can handle -EFAULT
> >> error.
> >
> > What? No, that doesn't mean "You can return -EFAULT because random luck".
> > That means "If you pass it an invalid address, you'll get -EFAULT back".
> >
> > NACK.
>
> Then, why does fiemap.rst say "If an error is encountered" rather than
> "If an invalid address is passed" ?

Because people are bad at writing.

> Does the definition of -EFAULT limited to "the caller does not have permission
> to access this address" ?

Or the address isn't mapped.

> If copy_to_user() must not fail other than "the caller does not have
> permission to access this address", what should we do for now?
> Just remove ntfs_fiemap() and return -EOPNOTSUPP ?

No, fix it properly. Or don't fix it at all and let somebody else fix it.

Tetsuo Handa

unread,
Apr 17, 2023, 10:03:53 AM4/17/23
to syzbot, nt...@lists.linux.dev, syzkall...@googlegroups.com, Mark Fasheh, Alexander Viro, Christian Brauner, Konstantin Komarov, linux-fsdevel, Hillf Danton, linux-mm, tr...@redhat.com, ndesau...@google.com, nat...@kernel.org
syzbot is reporting circular locking dependency between ntfs_file_mmap()
(which has mm->mmap_lock => ni->ni_lock => ni->file.run_lock dependency)
and ntfs_fiemap() (which has ni->ni_lock => ni->file.run_lock =>
mm->mmap_lock dependency), for commit c4b929b85bdb ("vfs: vfs-level fiemap
interface") implemented fiemap_fill_next_extent() using copy_to_user()
where direct mm->mmap_lock dependency is inevitable.

Since ntfs3 does not want to release ni->ni_lock and/or ni->file.run_lock
in order to make sure that "struct ATTRIB" does not change during
ioctl(FS_IOC_FIEMAP) request, let's make it possible to call
fiemap_fill_next_extent() with filesystem locks held.

This patch adds fiemap_fill_next_kernel_extent() which spools
"struct fiemap_extent" to dynamically allocated kernel buffer, and
fiemap_copy_kernel_extent() which copies spooled "struct fiemap_extent"
to userspace buffer after filesystem locks are released.
Reported-by: syzbot <syzbot+c300ab...@syzkaller.appspotmail.com>
Link: https://syzkaller.appspot.com/bug?extid=c300ab283ba3bc072439
Fixes: 4342306f0f0d ("fs/ntfs3: Add file operations and implementation")
Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
---
fs/ioctl.c | 52 ++++++++++++++++++++++++++++++++++++------
fs/ntfs3/file.c | 4 ++++
fs/ntfs3/frecord.c | 10 ++++----
include/linux/fiemap.h | 24 +++++++++++++++++--
4 files changed, 76 insertions(+), 14 deletions(-)

diff --git a/fs/ioctl.c b/fs/ioctl.c
index 5b2481cd4750..60ddc2760932 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -112,11 +112,10 @@ static int ioctl_fibmap(struct file *filp, int __user *p)
#define SET_UNKNOWN_FLAGS (FIEMAP_EXTENT_DELALLOC)
#define SET_NO_UNMOUNTED_IO_FLAGS (FIEMAP_EXTENT_DATA_ENCRYPTED)
#define SET_NOT_ALIGNED_FLAGS (FIEMAP_EXTENT_DATA_TAIL|FIEMAP_EXTENT_DATA_INLINE)
-int fiemap_fill_next_extent(struct fiemap_extent_info *fieinfo, u64 logical,
- u64 phys, u64 len, u32 flags)
+int do_fiemap_fill_next_extent(struct fiemap_extent_info *fieinfo, u64 logical,
+ u64 phys, u64 len, u32 flags, bool is_kernel)
{
struct fiemap_extent extent;
- struct fiemap_extent __user *dest = fieinfo->fi_extents_start;

/* only count the extents */
if (fieinfo->fi_extents_max == 0) {
@@ -140,16 +139,55 @@ int fiemap_fill_next_extent(struct fiemap_extent_info *fieinfo, u64 logical,
extent.fe_length = len;
extent.fe_flags = flags;

- dest += fieinfo->fi_extents_mapped;
- if (copy_to_user(dest, &extent, sizeof(extent)))
- return -EFAULT;
+ if (!is_kernel) {
+ struct fiemap_extent __user *dest = fieinfo->fi_extents_start;
+
+ dest += fieinfo->fi_extents_mapped;
+ if (copy_to_user(dest, &extent, sizeof(extent)))
+ return -EFAULT;
+ } else {
+ struct fiemap_extent_list *entry = kmalloc(sizeof(*entry), GFP_NOFS);
+
+ if (!entry)
+ return -ENOMEM;
+ memmove(&entry->extent, &extent, sizeof(extent));
+ list_add_tail(&entry->list, &fieinfo->fi_extents_list);
+ }

fieinfo->fi_extents_mapped++;
if (fieinfo->fi_extents_mapped == fieinfo->fi_extents_max)
return 1;
return (flags & FIEMAP_EXTENT_LAST) ? 1 : 0;
}
-EXPORT_SYMBOL(fiemap_fill_next_extent);
+EXPORT_SYMBOL(do_fiemap_fill_next_extent);
+
+int fiemap_copy_kernel_extent(struct fiemap_extent_info *fieinfo, int err)
+{
+ struct fiemap_extent __user *dest;
+ struct fiemap_extent_list *entry, *tmp;
+ unsigned int len = 0;
+
+ list_for_each_entry(entry, &fieinfo->fi_extents_list, list)
+ len++;
+ if (!len)
+ return err;
+ fieinfo->fi_extents_mapped -= len;
+ dest = fieinfo->fi_extents_start + fieinfo->fi_extents_mapped;
+ list_for_each_entry(entry, &fieinfo->fi_extents_list, list) {
+ if (copy_to_user(dest, &entry->extent, sizeof(entry->extent))) {
+ err = -EFAULT;
+ break;
+ }
+ dest++;
+ fieinfo->fi_extents_mapped++;
+ }
+ list_for_each_entry_safe(entry, tmp, &fieinfo->fi_extents_list, list) {
+ list_del(&entry->list);
+ kfree(entry);
+ }
+ return err;
+}
+EXPORT_SYMBOL(fiemap_copy_kernel_extent);

/**
* fiemap_prep - check validity of requested flags for fiemap
diff --git a/fs/ntfs3/file.c b/fs/ntfs3/file.c
index e9bdc1ff08c9..1a3e28f71599 100644
--- a/fs/ntfs3/file.c
+++ b/fs/ntfs3/file.c
@@ -1145,12 +1145,16 @@ int ntfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
if (err)
return err;

+ INIT_LIST_HEAD(&fieinfo->fi_extents_list);
+
ni_lock(ni);

err = ni_fiemap(ni, fieinfo, start, len);

ni_unlock(ni);

+ err = fiemap_copy_kernel_extent(fieinfo, err);
+
return err;
}

diff --git a/fs/ntfs3/frecord.c b/fs/ntfs3/frecord.c
index f1df52dfab74..b70f9dfb71ab 100644
--- a/fs/ntfs3/frecord.c
+++ b/fs/ntfs3/frecord.c
@@ -1941,8 +1941,7 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo,
}

if (!attr || !attr->non_res) {
- err = fiemap_fill_next_extent(
- fieinfo, 0, 0,
+ err = fiemap_fill_next_kernel_extent(fieinfo, 0, 0,
attr ? le32_to_cpu(attr->res.data_size) : 0,
FIEMAP_EXTENT_DATA_INLINE | FIEMAP_EXTENT_LAST |
FIEMAP_EXTENT_MERGED);
@@ -2037,8 +2036,8 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo,
if (vbo + dlen >= end)
flags |= FIEMAP_EXTENT_LAST;

- err = fiemap_fill_next_extent(fieinfo, vbo, lbo, dlen,
- flags);
+ err = fiemap_fill_next_kernel_extent(fieinfo, vbo, lbo,
+ dlen, flags);
if (err < 0)
break;
if (err == 1) {
@@ -2058,7 +2057,8 @@ int ni_fiemap(struct ntfs_inode *ni, struct fiemap_extent_info *fieinfo,
if (vbo + bytes >= end)
flags |= FIEMAP_EXTENT_LAST;

- err = fiemap_fill_next_extent(fieinfo, vbo, lbo, bytes, flags);
+ err = fiemap_fill_next_kernel_extent(fieinfo, vbo, lbo, bytes,
+ flags);
if (err < 0)
break;
if (err == 1) {
diff --git a/include/linux/fiemap.h b/include/linux/fiemap.h
index c50882f19235..10cb33ed80a9 100644
--- a/include/linux/fiemap.h
+++ b/include/linux/fiemap.h
@@ -5,17 +5,37 @@
#include <uapi/linux/fiemap.h>
#include <linux/fs.h>

+struct fiemap_extent_list {
+ struct list_head list;
+ struct fiemap_extent extent;
+};
+
struct fiemap_extent_info {
unsigned int fi_flags; /* Flags as passed from user */
unsigned int fi_extents_mapped; /* Number of mapped extents */
unsigned int fi_extents_max; /* Size of fiemap_extent array */
struct fiemap_extent __user *fi_extents_start; /* Start of
fiemap_extent array */
+ struct list_head fi_extents_list; /* List of fiemap_extent_list */
};

int fiemap_prep(struct inode *inode, struct fiemap_extent_info *fieinfo,
u64 start, u64 *len, u32 supported_flags);
-int fiemap_fill_next_extent(struct fiemap_extent_info *info, u64 logical,
- u64 phys, u64 len, u32 flags);
+int do_fiemap_fill_next_extent(struct fiemap_extent_info *fieinfo, u64 logical,
+ u64 phys, u64 len, u32 flags, bool is_kernel);
+
+static inline int fiemap_fill_next_extent(struct fiemap_extent_info *info,
+ u64 logical, u64 phys, u64 len, u32 flags)
+{
+ return do_fiemap_fill_next_extent(info, logical, phys, len, flags, false);
+}
+
+static inline int fiemap_fill_next_kernel_extent(struct fiemap_extent_info *info,
+ u64 logical, u64 phys, u64 len, u32 flags)
+{
+ return do_fiemap_fill_next_extent(info, logical, phys, len, flags, true);
+}
+
+int fiemap_copy_kernel_extent(struct fiemap_extent_info *info, int err);

#endif /* _LINUX_FIEMAP_H 1 */
--
2.34.1

Al Viro

unread,
Apr 20, 2023, 5:00:59 PM4/20/23
to Tetsuo Handa, syzbot, nt...@lists.linux.dev, syzkall...@googlegroups.com, Mark Fasheh, Christian Brauner, Konstantin Komarov, linux-fsdevel, Hillf Danton, linux-mm, tr...@redhat.com, ndesau...@google.com, nat...@kernel.org
On Mon, Apr 17, 2023 at 11:03:26PM +0900, Tetsuo Handa wrote:
> syzbot is reporting circular locking dependency between ntfs_file_mmap()
> (which has mm->mmap_lock => ni->ni_lock => ni->file.run_lock dependency)
> and ntfs_fiemap() (which has ni->ni_lock => ni->file.run_lock =>
> mm->mmap_lock dependency), for commit c4b929b85bdb ("vfs: vfs-level fiemap
> interface") implemented fiemap_fill_next_extent() using copy_to_user()
> where direct mm->mmap_lock dependency is inevitable.
>
> Since ntfs3 does not want to release ni->ni_lock and/or ni->file.run_lock
> in order to make sure that "struct ATTRIB" does not change during
> ioctl(FS_IOC_FIEMAP) request, let's make it possible to call
> fiemap_fill_next_extent() with filesystem locks held.
>
> This patch adds fiemap_fill_next_kernel_extent() which spools
> "struct fiemap_extent" to dynamically allocated kernel buffer, and
> fiemap_copy_kernel_extent() which copies spooled "struct fiemap_extent"
> to userspace buffer after filesystem locks are released.

Ugh... I'm pretty certain that this is a wrong approach. What is going
on in ntfs_file_mmap() that requires that kind of locking?

AFAICS, that's the part that looks odd... Details, please.

Al Viro

unread,
Apr 20, 2023, 5:11:12 PM4/20/23
to Tetsuo Handa, syzbot, nt...@lists.linux.dev, syzkall...@googlegroups.com, Mark Fasheh, Christian Brauner, Konstantin Komarov, linux-fsdevel, Hillf Danton, linux-mm, tr...@redhat.com, ndesau...@google.com, nat...@kernel.org
if (ni->i_valid < to) {
if (!inode_trylock(inode)) {
err = -EAGAIN;
goto out;
}
err = ntfs_extend_initialized_size(file, ni,
ni->i_valid, to);
inode_unlock(inode);
if (err)
goto out;
}

See that inode_trylock() there? That's another sign of the same
problem; it's just that their internal locks (taken by
ntfs_extend_initialized_size()) are taken without the same
kind of papering over the problem.

'to' here is guaranteed to be under the i_size_read(inode); is
that some kind of delayed allocation? Or lazy extending
truncate(), perhaps? I'm not familiar with ntfs codebase (or
ntfs layout, for that matter), so could somebody describe what
exactly is going on in that code?

Frankly, my impression is that this stuff is done in the wrong
place...
Reply all
Reply to author
Forward
0 new messages