[syzbot] [ntfs3?] general protection fault in pick_link

37 views
Skip to first unread message

syzbot

unread,
Nov 14, 2024, 1:08:25 PM11/14/24
to almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nt...@lists.linux.dev, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
Hello,

syzbot found the following issue on:

HEAD commit: a9cda7c0ffed Merge tag 'irq_urgent_for_v6.12_rc7' of git:/..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11e55ea7980000
kernel config: https://syzkaller.appspot.com/x/.config?x=64aa0d9945bd5c1
dashboard link: https://syzkaller.appspot.com/bug?extid=73d8fc29ec7cba8286fa
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=133fe35f980000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13e93e30580000

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7feb34a89c2a/non_bootable_disk-a9cda7c0.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/29b6746965e0/vmlinux-a9cda7c0.xz
kernel image: https://storage.googleapis.com/syzbot-assets/455959afbe37/bzImage-a9cda7c0.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/d478326f6d12/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+73d8fc...@syzkaller.appspotmail.com

loop0: detected capacity change from 0 to 4096
ntfs3(loop0): Different NTFS sector size (4096) and media sector size (512).
ntfs3(loop0): ino=1b, "file0" attr_set_size
ntfs3(loop0): Mark volume as dirty due to NTFS errors
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 UID: 0 PID: 5310 Comm: syz-executor255 Not tainted 6.12.0-rc6-syzkaller-00318-ga9cda7c0ffed #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:pick_link+0x51c/0xd50 fs/namei.c:1864
Code: c1 e8 03 42 80 3c 38 00 74 08 48 89 df e8 fc 00 e9 ff 48 8b 2b 48 85 ed 0f 84 92 00 00 00 e8 7b 36 7f ff 48 89 e8 48 c1 e8 03 <42> 0f b6 04 38 84 c0 0f 85 a2 05 00 00 0f b6 5d 00 bf 2f 00 00 00
RSP: 0018:ffffc9000d147998 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88804558dec8 RCX: ffff88801ec7a440
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000002 R08: ffffffff8215a35f R09: 1ffffffff203a13d
R10: dffffc0000000000 R11: fffffbfff203a13e R12: 1ffff92001a28f93
R13: ffffc9000d147af8 R14: 1ffff92001a28f5f R15: dffffc0000000000
FS: 0000555577611380(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fcc0a595ed8 CR3: 0000000035760000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
step_into+0xca9/0x1080 fs/namei.c:1923
lookup_last fs/namei.c:2556 [inline]
path_lookupat+0x16f/0x450 fs/namei.c:2580
filename_lookup+0x256/0x610 fs/namei.c:2609
user_path_at+0x3a/0x60 fs/namei.c:3016
do_mount fs/namespace.c:3844 [inline]
__do_sys_mount fs/namespace.c:4057 [inline]
__se_sys_mount+0x297/0x3c0 fs/namespace.c:4034
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f4b18ad5b19
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 17 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffc2e486c48 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 0030656c69662f2e RCX: 00007f4b18ad5b19
RDX: 0000000000000000 RSI: 00000000200000c0 RDI: 0000000000000000
RBP: 00007f4b18b685f0 R08: 0000000000000000 R09: 00005555776124c0
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc2e486c70
R13: 00007ffc2e486e98 R14: 431bde82d7b634db R15: 00007f4b18b1e03b
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:pick_link+0x51c/0xd50 fs/namei.c:1864
Code: c1 e8 03 42 80 3c 38 00 74 08 48 89 df e8 fc 00 e9 ff 48 8b 2b 48 85 ed 0f 84 92 00 00 00 e8 7b 36 7f ff 48 89 e8 48 c1 e8 03 <42> 0f b6 04 38 84 c0 0f 85 a2 05 00 00 0f b6 5d 00 bf 2f 00 00 00
RSP: 0018:ffffc9000d147998 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88804558dec8 RCX: ffff88801ec7a440
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000002 R08: ffffffff8215a35f R09: 1ffffffff203a13d
R10: dffffc0000000000 R11: fffffbfff203a13e R12: 1ffff92001a28f93
R13: ffffc9000d147af8 R14: 1ffff92001a28f5f R15: dffffc0000000000
FS: 0000555577611380(0000) GS:ffff88801fc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fcc0a595ed8 CR3: 0000000035760000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
0: c1 e8 03 shr $0x3,%eax
3: 42 80 3c 38 00 cmpb $0x0,(%rax,%r15,1)
8: 74 08 je 0x12
a: 48 89 df mov %rbx,%rdi
d: e8 fc 00 e9 ff call 0xffe9010e
12: 48 8b 2b mov (%rbx),%rbp
15: 48 85 ed test %rbp,%rbp
18: 0f 84 92 00 00 00 je 0xb0
1e: e8 7b 36 7f ff call 0xff7f369e
23: 48 89 e8 mov %rbp,%rax
26: 48 c1 e8 03 shr $0x3,%rax
* 2a: 42 0f b6 04 38 movzbl (%rax,%r15,1),%eax <-- trapping instruction
2f: 84 c0 test %al,%al
31: 0f 85 a2 05 00 00 jne 0x5d9
37: 0f b6 5d 00 movzbl 0x0(%rbp),%ebx
3b: bf 2f 00 00 00 mov $0x2f,%edi


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Lizhi Xu

unread,
Nov 15, 2024, 3:26:34 AM11/15/24
to syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
the symlink inode is corrupted.

#syz test

diff --git a/fs/namei.c b/fs/namei.c
index 4a4a22a08ac2..f5dbccb3aafc 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1844,6 +1844,9 @@ static const char *pick_link(struct nameidata *nd, struct path *link,
if (unlikely(error))
return ERR_PTR(error);

+ if (!S_ISLNK(inode->i_mode))
+ return ERR_PTR(-EINVAL);
+
res = READ_ONCE(inode->i_link);
if (!res) {
const char * (*get)(struct dentry *, struct inode *,

syzbot

unread,
Nov 15, 2024, 3:47:05 AM11/15/24
to linux-...@vger.kernel.org, lizh...@windriver.com, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+73d8fc...@syzkaller.appspotmail.com
Tested-by: syzbot+73d8fc...@syzkaller.appspotmail.com

Tested on:

commit: cfaaa7d0 Merge tag 'net-6.12-rc8' of git://git.kernel...
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=17ea9130580000
kernel config: https://syzkaller.appspot.com/x/.config?x=d2aeec8c0b2e420c
dashboard link: https://syzkaller.appspot.com/bug?extid=73d8fc29ec7cba8286fa
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=140931a7980000

Note: testing is done by a robot and is best-effort only.

Lizhi Xu

unread,
Nov 15, 2024, 4:49:16 AM11/15/24
to syzbot+73d8fc...@syzkaller.appspotmail.com, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nt...@lists.linux.dev, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
syzbot reported a null-ptr-deref in pick_link. [1]
When symlink's inode is corrupted, the value of the i_link is 2 in this case,
it will trigger null pointer deref when accessing *res in pick_link().

To avoid this issue, add a check for inode mode, return -EINVAL when it's
not symlink.

[1]
Reported-and-tested-by: syzbot+73d8fc...@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=73d8fc29ec7cba8286fa
Signed-off-by: Lizhi Xu <lizh...@windriver.com>
---
fs/namei.c | 3 +++
1 file changed, 3 insertions(+)
--
2.43.0

Jan Kara

unread,
Nov 15, 2024, 6:43:18 AM11/15/24
to Lizhi Xu, syzbot+73d8fc...@syzkaller.appspotmail.com, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nt...@lists.linux.dev, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
On Fri 15-11-24 17:49:08, Lizhi Xu wrote:
> syzbot reported a null-ptr-deref in pick_link. [1]
> When symlink's inode is corrupted, the value of the i_link is 2 in this case,
> it will trigger null pointer deref when accessing *res in pick_link().
>
> To avoid this issue, add a check for inode mode, return -EINVAL when it's
> not symlink.
>
> [1]
> KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
> CPU: 0 UID: 0 PID: 5310 Comm: syz-executor255 Not tainted 6.12.0-rc6-syzkaller-00318-ga9cda7c0ffed #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
> RIP: 0010:pick_link+0x51c/0xd50 fs/namei.c:1864

Hum, based on line number is:

if (*res == '/') { <<<< HERE
error = nd_jump_root(nd);
if (unlikely(error))

So res would be non-zero but a small number.
So I don't see how we can get here without inode being a symlink.
pick_link() is called from step_into() which has among other things:

if (likely(!d_is_symlink(path.dentry)) || ...)
do something and return

so we are checking whether the inode is a symlink before calling
pick_link(). And yes, the d_is_symlink() is using cached type in
dentry->d_flags so they could mismatch. But inode is not supposed to change
its type during its lifetime so if there is a mismatch that is the problem
that needs to be fixed.

Honza
--
Jan Kara <ja...@suse.com>
SUSE Labs, CR

Al Viro

unread,
Nov 15, 2024, 8:06:22 AM11/15/24
to Lizhi Xu, syzbot+73d8fc...@syzkaller.appspotmail.com, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nt...@lists.linux.dev, syzkall...@googlegroups.com
On Fri, Nov 15, 2024 at 05:49:08PM +0800, Lizhi Xu wrote:
> syzbot reported a null-ptr-deref in pick_link. [1]
> When symlink's inode is corrupted, the value of the i_link is 2 in this case,
> it will trigger null pointer deref when accessing *res in pick_link().
>
> To avoid this issue, add a check for inode mode, return -EINVAL when it's
> not symlink.

NAK. Don't paper over filesystem bugs at pathwalk time - it's the wrong
place for that. Fix it at in-core inode creation time.

Al Viro

unread,
Nov 15, 2024, 8:25:01 AM11/15/24
to Lizhi Xu, syzbot+73d8fc...@syzkaller.appspotmail.com, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nt...@lists.linux.dev, syzkall...@googlegroups.com
BTW, seeing that ntfs doesn't even touch ->i_link, you are dealing
with aftermath of memory corruption, so it's definitely papering over
the actual bug here.

Lizhi Xu

unread,
Nov 15, 2024, 8:02:16 PM11/15/24
to ja...@suse.cz, almaz.ale...@paragon-software.com, bra...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, lizh...@windriver.com, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
Our idea is the same. Because d_is_symlink() has confirmed the mode of
symlink in step_into(), I will confirm whether the mode of symlink's inode
has changed when the value of i_link is 2 in pick_link().
> do something and return
>
> so we are checking whether the inode is a symlink before calling
> pick_link(). And yes, the d_is_symlink() is using cached type in
> dentry->d_flags so they could mismatch. But inode is not supposed to change
> its type during its lifetime so if there is a mismatch that is the problem
> that needs to be fixed.
I think syzbot executed the following two syscalls when triggering this problem:

link(&(0x7f0000000200)='./file0\x00', &(0x7f0000000240)='./bus\x00')
mount$overlay(0x0, &(0x7f00000000c0)='./bus\x00', 0x0, 0x0, 0x0)

Obviously, this is to mount a link. Whether the mount operation itself will
change or corrupt the i_link value and mode value of the symlink is not
clear to me yet.

BR,
Lizhi

Al Viro

unread,
Nov 15, 2024, 8:25:48 PM11/15/24
to Lizhi Xu, ja...@suse.cz, almaz.ale...@paragon-software.com, bra...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
On Sat, Nov 16, 2024 at 09:02:07AM +0800, Lizhi Xu wrote:

> Our idea is the same. Because d_is_symlink() has confirmed the mode of
> symlink in step_into(), I will confirm whether the mode of symlink's inode
> has changed when the value of i_link is 2 in pick_link().
> > do something and return
> >
> > so we are checking whether the inode is a symlink before calling
> > pick_link(). And yes, the d_is_symlink() is using cached type in
> > dentry->d_flags so they could mismatch. But inode is not supposed to change
> > its type during its lifetime so if there is a mismatch that is the problem
> > that needs to be fixed.
> I think syzbot executed the following two syscalls when triggering this problem:
>
> link(&(0x7f0000000200)='./file0\x00', &(0x7f0000000240)='./bus\x00')
> mount$overlay(0x0, &(0x7f00000000c0)='./bus\x00', 0x0, 0x0, 0x0)
>
> Obviously, this is to mount a link. Whether the mount operation itself will
> change or corrupt the i_link value and mode value of the symlink is not
> clear to me yet.

Odds are, it's not a valid struct inode instance in the first place.
It's not inode->i_link that is a problem (*nothing* should ever store
that value in there and ntfs doesn't even try that - grep and you'll see);
it's inode itself.

Have you tried KASAN-enabled build? Might be interesting to see if
it catches anything...

Lizhi Xu

unread,
Nov 15, 2024, 8:39:57 PM11/15/24
to vi...@zeniv.linux.org.uk, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, lizh...@windriver.com, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
On Fri, 15 Nov 2024 13:24:55 +0000, Al Viro wrote:
> On Fri, Nov 15, 2024 at 01:06:15PM +0000, Al Viro wrote:
> > On Fri, Nov 15, 2024 at 05:49:08PM +0800, Lizhi Xu wrote:
> > > syzbot reported a null-ptr-deref in pick_link. [1]
> > > When symlink's inode is corrupted, the value of the i_link is 2 in this case,
> > > it will trigger null pointer deref when accessing *res in pick_link().
> > >
> > > To avoid this issue, add a check for inode mode, return -EINVAL when it's
> > > not symlink.
> >
> > NAK. Don't paper over filesystem bugs at pathwalk time - it's the wrong
> > place for that. Fix it at in-core inode creation time.
>
> BTW, seeing that ntfs doesn't even touch ->i_link, you are dealing
Yes, ntfs3 does not handle the relevant code of i_link.
> with aftermath of memory corruption, so it's definitely papering over
> the actual bug here.
I see that finding out how the value of i_link becomes 2 is the key.

Al Viro

unread,
Nov 15, 2024, 9:32:48 PM11/15/24
to Lizhi Xu, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
How about 'how the memory currently pointed to by inode had come to be
available for use by something that stored 2 at that particular offset'?

Lizhi Xu

unread,
Nov 19, 2024, 6:29:51 AM11/19/24
to vi...@zeniv.linux.org.uk, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, lizh...@windriver.com, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
syzbot reported a null-ptr-deref in pick_link. [1]

First, i_link and i_dir_seq are in the same union, they share the same memory
address, and i_dir_seq will be updated during the execution of walk_component,
which makes the value of i_link equal to i_dir_seq.
In this case, setting i_dir_seq is triggered by move_mount, and the calltrace
is as follows:
move_mount()->
user_path_at()->
filename_lookup()->
path_lookupat()->
lookup_last()->
walk_component()->
__lookup_slow()->
ntfs_lookup()->
d_splice_alias()->
__d_add()->
end_dir_add()

In pick_link(), the simple "if (!i_link)" is used to determine whether i_link
has been set, which is not rigorous enough.

On the other hand, the mode value of the symlink inode becomes REG because
attr_set_size() fails to set the attribute and calls ntfs_bad_inode().
By confirming that the i_link pointer value is valid, the null-ptr-deref
problem in pick_link can be avoided.

[1]
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 0 UID: 0 PID: 5310 Comm: syz-executor255 Not tainted 6.12.0-rc6-syzkaller-00318-ga9cda7c0ffed #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:pick_link+0x51c/0xd50 fs/namei.c:1864
Fixes: ad6cc4c338f4 ("finally fold get_link() into pick_link()")
Reported-by: syzbot+73d8fc...@syzkaller.appspotmail.com
V1 --> V2: add the root cause of the i_link not set issue and imporve the check

fs/namei.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/namei.c b/fs/namei.c
index 4a4a22a08ac2..49bd5ac16137 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1845,7 +1845,7 @@ static const char *pick_link(struct nameidata *nd, struct path *link,
return ERR_PTR(error);

res = READ_ONCE(inode->i_link);
- if (!res) {
+ if (!virt_addr_valid(res)) {
const char * (*get)(struct dentry *, struct inode *,
struct delayed_call *);
get = inode->i_op->get_link;
--
2.43.0

Al Viro

unread,
Nov 19, 2024, 11:36:54 AM11/19/24
to Lizhi Xu, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
On Tue, Nov 19, 2024 at 07:29:45PM +0800, Lizhi Xu wrote:
> syzbot reported a null-ptr-deref in pick_link. [1]
>
> First, i_link and i_dir_seq are in the same union, they share the same memory
> address, and i_dir_seq will be updated during the execution of walk_component,
> which makes the value of i_link equal to i_dir_seq.
> In this case, setting i_dir_seq is triggered by move_mount, and the calltrace
> is as follows:
> move_mount()->
> user_path_at()->
> filename_lookup()->
> path_lookupat()->
> lookup_last()->
> walk_component()->
> __lookup_slow()->
> ntfs_lookup()->
> d_splice_alias()->
> __d_add()->
> end_dir_add()
>
> In pick_link(), the simple "if (!i_link)" is used to determine whether i_link
> has been set, which is not rigorous enough.
>
> On the other hand, the mode value of the symlink inode becomes REG because
> attr_set_size() fails to set the attribute and calls ntfs_bad_inode().
> By confirming that the i_link pointer value is valid, the null-ptr-deref
> problem in pick_link can be avoided.

So basically your theory is that make_bad_inode() is called on a live directory
inode (already reachable from dcache and remaining there), whereas the sucker
somehow gets a new dentry alias which looks like a symlink. Right?

NAK on the "mitigation", just in case anyone decides to pick that - no matter
how we deal with the problem, sprinkling virt_addr_valid() is *NOT* a solution.

Lizhi Xu

unread,
Nov 19, 2024, 10:04:52 PM11/19/24
to vi...@zeniv.linux.org.uk, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, lizh...@windriver.com, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
syzbot reported a null-ptr-deref in pick_link. [1]

First, i_link and i_dir_seq are in the same union, they share the same memory
address, and i_dir_seq will be updated during the execution of walk_component,
which makes the value of i_link equal to i_dir_seq.

Secondly, the chmod execution failed, which resulted in setting the mode value
of file0's inode to REG when executing ntfs_bad_inode.

Third, when creating a symbolic link using the file0 whose inode has been marked
as bad, it is not determined whether its inode is bad, which ultimately leads to
null-ptr-deref when performing a mount operation on the symbolic link bus because
the i_link value is equal to i_dir_seq=2.

Note: ("file0, bus" are defined in reproducer [2])

To avoid null-ptr-deref in pick_link, when creating a symbolic link, first check
whether the inode of file is already bad.
[2]
move_mount(0xffffffffffffff9c, &(0x7f00000003c0)='./file0\x00', 0xffffffffffffff9c, &(0x7f0000000400)='./file0/file0\x00', 0x140)
chmod(&(0x7f0000000080)='./file0\x00', 0x0)
link(&(0x7f0000000200)='./file0\x00', &(0x7f0000000240)='./bus\x00')
mount$overlay(0x0, &(0x7f00000000c0)='./bus\x00', 0x0, 0x0, 0x0)

Reported-by: syzbot+73d8fc...@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=73d8fc29ec7cba8286fa
Signed-off-by: Lizhi Xu <lizh...@windriver.com>
---
V1 --> V2: add the root cause of the i_link not set issue and imporve the check
V2 --> V3: when creating a symbolic link, first check whether the inode of file is bad.

fs/ntfs3/inode.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c
index be04d2845bb7..fefbdcf75016 100644
--- a/fs/ntfs3/inode.c
+++ b/fs/ntfs3/inode.c
@@ -1719,6 +1719,9 @@ int ntfs_link_inode(struct inode *inode, struct dentry *dentry)
struct ntfs_sb_info *sbi = inode->i_sb->s_fs_info;
struct NTFS_DE *de;

+ if (is_bad_inode(inode))
+ return -EIO;
+
/* Allocate PATH_MAX bytes. */
de = __getname();
if (!de)
--
2.43.0

Al Viro

unread,
Nov 20, 2024, 11:10:53 AM11/20/24
to Lizhi Xu, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
On Wed, Nov 20, 2024 at 11:04:43AM +0800, Lizhi Xu wrote:
> syzbot reported a null-ptr-deref in pick_link. [1]
>
> First, i_link and i_dir_seq are in the same union, they share the same memory
> address, and i_dir_seq will be updated during the execution of walk_component,
> which makes the value of i_link equal to i_dir_seq.
>
> Secondly, the chmod execution failed, which resulted in setting the mode value
> of file0's inode to REG when executing ntfs_bad_inode.
>
> Third, when creating a symbolic link using the file0 whose inode has been marked
> as bad, it is not determined whether its inode is bad, which ultimately leads to
> null-ptr-deref when performing a mount operation on the symbolic link bus because
> the i_link value is equal to i_dir_seq=2.
>
> Note: ("file0, bus" are defined in reproducer [2])
>
> To avoid null-ptr-deref in pick_link, when creating a symbolic link, first check
> whether the inode of file is already bad.

I would really like to understand how the hell did that bad inode end up passed
to d_splice_alias()/d_instantiate()/whatever it had been.

That's the root cause - and it looks like ntfs is too free with make_bad_inode()
in general, which might cause other problems.

Lizhi Xu

unread,
Nov 20, 2024, 10:13:36 PM11/20/24
to vi...@zeniv.linux.org.uk, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, lizh...@windriver.com, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
On Wed, 20 Nov 2024 16:10:45 +0000, Al Viro wrote:
> On Wed, Nov 20, 2024 at 11:04:43AM +0800, Lizhi Xu wrote:
> > syzbot reported a null-ptr-deref in pick_link. [1]
> >
> > First, i_link and i_dir_seq are in the same union, they share the same memory
> > address, and i_dir_seq will be updated during the execution of walk_component,
> > which makes the value of i_link equal to i_dir_seq.
> >
> > Secondly, the chmod execution failed, which resulted in setting the mode value
> > of file0's inode to REG when executing ntfs_bad_inode.
> >
> > Third, when creating a symbolic link using the file0 whose inode has been marked
> > as bad, it is not determined whether its inode is bad, which ultimately leads to
> > null-ptr-deref when performing a mount operation on the symbolic link bus because
> > the i_link value is equal to i_dir_seq=2.
> >
> > Note: ("file0, bus" are defined in reproducer [2])
> >
> > To avoid null-ptr-deref in pick_link, when creating a symbolic link, first check
> > whether the inode of file is already bad.
>
> I would really like to understand how the hell did that bad inode end up passed
> to d_splice_alias()/d_instantiate()/whatever it had been.
1. In the move_mount() process, the inode is created by ntfs_alloc_inode() and enters d_splice_alias() by ntfs_lookup(), at this time inode is good, as shown below:
move_mount()->
user_path_at()->
filename_lookup()->
path_lookupat()->
lookup_last()->
walk_component()->
__lookup_slow()->
ntfs_lookup()->
d_splice_alias()->

2. The subsequent chmod fails, causing the inode to be set to bad.
3. During the link operation, d_instantiate() is executed in ntfs_link() to associate the bad inode with the dentry.
4. During the mount operation, walk_component executes pick_link, triggering null-ptr-deref.

Reproducer:
move_mount(0xffffffffffffff9c, &(0x7f00000003c0)='./file0\x00', 0xffffffffffffff9c, &(0x7f0000000400)='./file0/file0\x00', 0x140)
chmod(&(0x7f0000000080)='./file0\x00', 0x0)
link(&(0x7f0000000200)='./file0\x00', &(0x7f0000000240)='./bus\x00')
mount$overlay(0x0, &(0x7f00000000c0)='./bus\x00', 0x0, 0x0, 0x0)
>

Al Viro

unread,
Nov 20, 2024, 10:55:36 PM11/20/24
to Lizhi Xu, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
On Thu, Nov 21, 2024 at 11:13:29AM +0800, Lizhi Xu wrote:
> user_path_at()->
> filename_lookup()->
> path_lookupat()->
> lookup_last()->
> walk_component()->
> __lookup_slow()->
> ntfs_lookup()->
> d_splice_alias()->
>
> 2. The subsequent chmod fails, causing the inode to be set to bad.

What's wrong with "return an error"?

> 3. During the link operation, d_instantiate() is executed in ntfs_link() to associate the bad inode with the dentry.

Yecchhh... If nothing else, check for is_bad_inode() should be there
for as long as make_bad_inode() is done on live inodes.

Lizhi Xu

unread,
Nov 21, 2024, 12:28:07 AM11/21/24
to vi...@zeniv.linux.org.uk, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, lizh...@windriver.com, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
On Thu, 21 Nov 2024 03:55:29 +0000, Al Viro wrote:
> > user_path_at()->
> > filename_lookup()->
> > path_lookupat()->
> > lookup_last()->
> > walk_component()->
> > __lookup_slow()->
> > ntfs_lookup()->
> > d_splice_alias()->
> >
> > 2. The subsequent chmod fails, causing the inode to be set to bad.
>
> What's wrong with "return an error"?
make_bad_inode() is executed in attr_set_size() and the error code -ENOENT is returned;
attr_set_size() defined in fs/ntfs3/attrib.c
>
> > 3. During the link operation, d_instantiate() is executed in ntfs_link() to associate the bad inode with the dentry.
>
> Yecchhh... If nothing else, check for is_bad_inode() should be there
> for as long as make_bad_inode() is done on live inodes.
I checked is_bad_inode() in ntfs_link(), see line 146 of ntfs_link(), please see my V3 of the patch [2].

[1]
fs/ntfs3/namei.c
19 static int ntfs_link(struct dentry *ode, struct inode *dir, struct dentry *de)
18 {
17 int err;
16 struct inode *inode = d_inode(ode);
15 struct ntfs_inode *ni = ntfs_i(inode);
14
13 if (S_ISDIR(inode->i_mode))
12 return -EPERM;
11
10 if (inode->i_nlink >= NTFS_LINK_MAX)
9 return -EMLINK;
8
7 ni_lock_dir(ntfs_i(dir));
6 if (inode != dir)
5 ni_lock(ni);
4
3 inc_nlink(inode);
2 ihold(inode);
1
146 err = ntfs_link_inode(inode, de);
1
2 if (!err) {
3 inode_set_ctime_current(inode);
4 inode_set_mtime_to_ts(dir, inode_set_ctime_current(dir));
5 mark_inode_dirty(inode);
6 mark_inode_dirty(dir);
7 d_instantiate(de, inode);

[2]

Lizhi Xu

unread,
Nov 22, 2024, 2:49:58 AM11/22/24
to vi...@zeniv.linux.org.uk, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, lizh...@windriver.com, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
I will release the patch of the v4 version and add root cause:
During the execution of the link command, it sets the inode of the symlink file to the
already bad inode of file0 by calling d_instantiate, which ultimately leads to
null-ptr-deref when performing a mount operation on the symbolic link bus
because it use bad inode's i_link and its value is equal to i_dir_seq=2.

BR,
Lizhi

Lizhi Xu

unread,
Nov 22, 2024, 3:10:45 AM11/22/24
to lizh...@windriver.com, vi...@zeniv.linux.org.uk, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
syzbot reported a null-ptr-deref in pick_link. [1]

First, i_link and i_dir_seq are in the same union, they share the same memory
address, and i_dir_seq will be updated during the execution of walk_component,
which makes the value of i_link equal to i_dir_seq.

Secondly, the chmod execution failed, which resulted in setting the mode value
of file0's inode to REG when executing ntfs_bad_inode.

Third, during the execution of the link command, it sets the inode of the
symlink file to the already bad inode of file0 by calling d_instantiate, which
ultimately leads to null-ptr-deref when performing a mount operation on the
symbolic link bus because it use bad inode's i_link and its value is equal to
i_dir_seq=2.

Note: ("file0, bus" are defined in reproducer [2])

To avoid null-ptr-deref in pick_link, when creating a symbolic link, first check
whether the inode of file is already bad.

move_mount(0xffffffffffffff9c, &(0x7f00000003c0)='./file0\x00', 0xffffffffffffff9c, &(0x7f0000000400)='./file0/file0\x00', 0x140)
chmod(&(0x7f0000000080)='./file0\x00', 0x0)
link(&(0x7f0000000200)='./file0\x00', &(0x7f0000000240)='./bus\x00')
mount$overlay(0x0, &(0x7f00000000c0)='./bus\x00', 0x0, 0x0, 0x0)

Reported-by: syzbot+73d8fc...@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=73d8fc29ec7cba8286fa
Signed-off-by: Lizhi Xu <lizh...@windriver.com>
---
V1 --> V2: add the root cause of the i_link not set issue and imporve the check
V2 --> V3: when creating a symbolic link, first check whether the inode of file is bad.
V3 --> V4: add comments for symlink use bad inode, it is the root cause

fs/ntfs3/inode.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c
index be04d2845bb7..fefbdcf75016 100644
--- a/fs/ntfs3/inode.c
+++ b/fs/ntfs3/inode.c
@@ -1719,6 +1719,9 @@ int ntfs_link_inode(struct inode *inode, struct dentry *dentry)
struct ntfs_sb_info *sbi = inode->i_sb->s_fs_info;
struct NTFS_DE *de;

+ if (is_bad_inode(inode))
+ return -EIO;
+
/* Allocate PATH_MAX bytes. */
de = __getname();
if (!de)
--
2.43.0

Jan Kara

unread,
Nov 22, 2024, 6:50:54 AM11/22/24
to Lizhi Xu, vi...@zeniv.linux.org.uk, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
On Fri 22-11-24 16:10:25, Lizhi Xu wrote:
> syzbot reported a null-ptr-deref in pick_link. [1]
>
> First, i_link and i_dir_seq are in the same union, they share the same memory
> address, and i_dir_seq will be updated during the execution of walk_component,
> which makes the value of i_link equal to i_dir_seq.
>
> Secondly, the chmod execution failed, which resulted in setting the mode value
> of file0's inode to REG when executing ntfs_bad_inode.
>
> Third, during the execution of the link command, it sets the inode of the
> symlink file to the already bad inode of file0 by calling d_instantiate, which
> ultimately leads to null-ptr-deref when performing a mount operation on the
> symbolic link bus because it use bad inode's i_link and its value is equal to
> i_dir_seq=2.
>
> Note: ("file0, bus" are defined in reproducer [2])
>
> To avoid null-ptr-deref in pick_link, when creating a symbolic link, first check
> whether the inode of file is already bad.

So actually there's no symbolic link involved here at all (which what was
confusing me all the time).

> move_mount(0xffffffffffffff9c, &(0x7f00000003c0)='./file0\x00', 0xffffffffffffff9c, &(0x7f0000000400)='./file0/file0\x00', 0x140)
> chmod(&(0x7f0000000080)='./file0\x00', 0x0)
> link(&(0x7f0000000200)='./file0\x00', &(0x7f0000000240)='./bus\x00')
> mount$overlay(0x0, &(0x7f00000000c0)='./bus\x00', 0x0, 0x0, 0x0)

This creates only a hardlink. And in fact the creation of the link seems to
be totally irrelevant for this problem. I believe:

move_mount(0xffffffffffffff9c, &(0x7f00000003c0)='./file0\x00', 0xffffffffffffff9c, &(0x7f0000000400)='./file0/file0\x00', 0x140)
chmod(&(0x7f0000000080)='./file0\x00', 0x0)
mount$overlay(0x0, &(0x7f00000000c0)='./file0\x00', 0x0, 0x0, 0x0)

would be as good reproducer of the problem. The core of the problem is that
NTFS3 calls make_bad_inode() on inode that is accessible to userspace and
is something else than a regular file. As long as that happens, some
variant of this NULL-ptr-dereference can happen as well, just the
reproducers will be somewhat different.

So I don't think patching ntfs_link_inode() makes a lot of sense. If
anything, I'd patch NTFS3 to not mark the inode as bad somewhere inside
ntfs_setattr() and deal with the error in a better way.

Honza

>
> Reported-by: syzbot+73d8fc...@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=73d8fc29ec7cba8286fa
> Signed-off-by: Lizhi Xu <lizh...@windriver.com>
> ---
> V1 --> V2: add the root cause of the i_link not set issue and imporve the check
> V2 --> V3: when creating a symbolic link, first check whether the inode of file is bad.
> V3 --> V4: add comments for symlink use bad inode, it is the root cause
>
> fs/ntfs3/inode.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/fs/ntfs3/inode.c b/fs/ntfs3/inode.c
> index be04d2845bb7..fefbdcf75016 100644
> --- a/fs/ntfs3/inode.c
> +++ b/fs/ntfs3/inode.c
> @@ -1719,6 +1719,9 @@ int ntfs_link_inode(struct inode *inode, struct dentry *dentry)
> struct ntfs_sb_info *sbi = inode->i_sb->s_fs_info;
> struct NTFS_DE *de;
>
> + if (is_bad_inode(inode))
> + return -EIO;
> +
> /* Allocate PATH_MAX bytes. */
> de = __getname();
> if (!de)
> --
> 2.43.0
>

Lizhi Xu

unread,
Nov 22, 2024, 8:10:02 PM11/22/24
to lizh...@windriver.com, vi...@zeniv.linux.org.uk, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
syzbot reported a null-ptr-deref in pick_link. [1]

First, i_link and i_dir_seq are in the same union, they share the same memory
address, and i_dir_seq will be updated during the execution of walk_component,
which makes the value of i_link equal to i_dir_seq.

Secondly, the chmod execution failed, which resulted in setting the mode value
of file0's inode to REG when executing ntfs_bad_inode.

Third, during the execution of the link command, it sets the inode of the
hardlink file to the already bad inode of file0 by calling d_instantiate, which
ultimately leads to null-ptr-deref when performing a mount operation on the
hardlink bus because it use bad inode's i_link and its value is equal to
i_dir_seq=2.

Note: ("file0, bus" are defined in reproducer [2])

To avoid null-ptr-deref in pick_link, when creating a hardlink, first check
whether the inode of file is already bad.

move_mount(0xffffffffffffff9c, &(0x7f00000003c0)='./file0\x00', 0xffffffffffffff9c, &(0x7f0000000400)='./file0/file0\x00', 0x140)
chmod(&(0x7f0000000080)='./file0\x00', 0x0)
link(&(0x7f0000000200)='./file0\x00', &(0x7f0000000240)='./bus\x00')
mount$overlay(0x0, &(0x7f00000000c0)='./bus\x00', 0x0, 0x0, 0x0)

Reported-by: syzbot+73d8fc...@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=73d8fc29ec7cba8286fa
Signed-off-by: Lizhi Xu <lizh...@windriver.com>
---
V1 --> V2: add the root cause of the i_link not set issue and imporve the check
V2 --> V3: when creating a symbolic link, first check whether the inode of file is bad.
V3 --> V4: add comments for symlink use bad inode, it is the root cause
V4 --> V5: update comments for link command, it is creating hardlink

Al Viro

unread,
Nov 22, 2024, 8:32:59 PM11/22/24
to Lizhi Xu, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
On Sat, Nov 23, 2024 at 09:09:56AM +0800, Lizhi Xu wrote:

[snip]

hardlink, surely?

Lizhi Xu

unread,
Nov 23, 2024, 11:43:09 PM11/23/24
to vi...@zeniv.linux.org.uk, almaz.ale...@paragon-software.com, bra...@kernel.org, ja...@suse.cz, linux-...@vger.kernel.org, linux-...@vger.kernel.org, lizh...@windriver.com, nt...@lists.linux.dev, syzbot+73d8fc...@syzkaller.appspotmail.com, syzkall...@googlegroups.com
syzbot reported a null-ptr-deref in pick_link. [1]

First, i_link and i_dir_seq are in the same union, they share the same memory
address, and i_dir_seq will be updated during the execution of walk_component,
which makes the value of i_link equal to i_dir_seq.

Secondly, the chmod execution failed, which resulted in setting the mode value
of file0's inode to REG when executing ntfs_bad_inode.

Third, during the execution of the link syscall, two things are done in
d_instantiate. One is to set the bad inode to the dentry of the bus. More
importantly, the symlink flag obtained in d_flags_for_inode(bad inode) is used
to update the d_flags of the dentry of the bus through __d_set_inode_and_type,
which will cause the dentry flag of the created bus to have symlink. The bus
should have been a hardlink, but it eventually became a symlink.
These two points constitute the root cause.

When executing the mount syscall, in walk_component(), when executing to
step_into() and using d_is_symlink() to judge, because the flag of the dentry
of the bus is symlink, it leads to the wrong process, and trigger null-ptr-deref
in pick_link().
V5 --> V6: add comments for link syscall return a symlink and update subject
Reply all
Reply to author
Forward
0 new messages