[syzbot] [exfat?] [ocfs2?] kernel BUG in link_path_walk


syzbot

Dec 3, 2025, 7:07:29 PM
to bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
Hello,

syzbot found the following issue on:

HEAD commit: 7d31f578f323 Add linux-next specific files for 20251128
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=1612b912580000
kernel config: https://syzkaller.appspot.com/x/.config?x=6336d8e94a7c517d
dashboard link: https://syzkaller.appspot.com/bug?extid=d222f4b7129379c3d5bc
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=172c8192580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16c3b0c2580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/6b49d8ad90de/disk-7d31f578.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/dbe2d4988ca7/vmlinux-7d31f578.xz
kernel image: https://storage.googleapis.com/syzbot-assets/fc0448ab2411/bzImage-7d31f578.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/ec39deb2cf11/mount_0.gz
fsck result: OK (log: https://syzkaller.appspot.com/x/fsck.log?x=12c3b0c2580000)

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+d222f4...@syzkaller.appspotmail.com

VFS_BUG_ON_INODE(!S_ISDIR(inode->i_mode)) encountered for inode ffff88805618b338
fs ocfs2 mode 100000 opflags 0x2 flags 0x20 state 0x0 count 2
------------[ cut here ]------------
kernel BUG at fs/namei.c:630!
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 6303 Comm: syz.0.92 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
RIP: 0010:lookup_inode_permission_may_exec fs/namei.c:630 [inline]
RIP: 0010:may_lookup fs/namei.c:1900 [inline]
RIP: 0010:link_path_walk+0x18cb/0x18d0 fs/namei.c:2537
Code: e8 5a 1f ea fe 90 0f 0b e8 b2 96 83 ff 44 89 fd e9 6a fd ff ff e8 a5 96 83 ff 48 89 ef 48 c7 c6 40 d8 79 8b e8 36 1f ea fe 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 55
RSP: 0018:ffffc900046ef8a0 EFLAGS: 00010282
RAX: 000000000000008e RBX: ffffc900046efc58 RCX: f91f6529a96d0200
RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
RBP: ffff88805618b338 R08: ffffc900046ef567 R09: 1ffff920008ddeac
R10: dffffc0000000000 R11: fffff520008ddead R12: 0000000000008000
R13: ffffc900046efc20 R14: 0000000000008000 R15: ffff88802509b320
FS: 000055555cffa500(0000) GS:ffff888125e4f000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc32730f000 CR3: 0000000072f4e000 CR4: 00000000003526f0
Call Trace:
<TASK>
path_openat+0x2b3/0x3dd0 fs/namei.c:4783
do_filp_open+0x1fa/0x410 fs/namei.c:4814
do_sys_openat2+0x121/0x200 fs/open.c:1430
do_sys_open fs/open.c:1436 [inline]
__do_sys_open fs/open.c:1444 [inline]
__se_sys_open fs/open.c:1440 [inline]
__x64_sys_open+0x11e/0x150 fs/open.c:1440
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f4644d8f749
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe02ccf2f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007f4644fe5fa0 RCX: 00007f4644d8f749
RDX: 0000000000000000 RSI: 0000000000145142 RDI: 0000200000000240
RBP: 00007f4644e13f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f4644fe5fa0 R14: 00007f4644fe5fa0 R15: 0000000000000003
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:lookup_inode_permission_may_exec fs/namei.c:630 [inline]
RIP: 0010:may_lookup fs/namei.c:1900 [inline]
RIP: 0010:link_path_walk+0x18cb/0x18d0 fs/namei.c:2537
Code: e8 5a 1f ea fe 90 0f 0b e8 b2 96 83 ff 44 89 fd e9 6a fd ff ff e8 a5 96 83 ff 48 89 ef 48 c7 c6 40 d8 79 8b e8 36 1f ea fe 90 <0f> 0b 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 55
RSP: 0018:ffffc900046ef8a0 EFLAGS: 00010282
RAX: 000000000000008e RBX: ffffc900046efc58 RCX: f91f6529a96d0200
RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000
RBP: ffff88805618b338 R08: ffffc900046ef567 R09: 1ffff920008ddeac
R10: dffffc0000000000 R11: fffff520008ddead R12: 0000000000008000
R13: ffffc900046efc20 R14: 0000000000008000 R15: ffff88802509b320
FS: 000055555cffa500(0000) GS:ffff888125e4f000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fc32730f000 CR3: 0000000072f4e000 CR4: 00000000003526f0


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Mateusz Guzik

Dec 3, 2025, 7:46:32 PM
to syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
this is probably mine, but first some extra debug:


#syz test

diff --git a/fs/namei.c b/fs/namei.c
index bf0f66f0e9b9..0df3bd2b947d 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1896,6 +1896,7 @@ static inline int may_lookup(struct mnt_idmap *idmap,
 {
 	int err, mask;

+	VFS_BUG_ON(!d_can_lookup(nd->path.dentry));
 	mask = nd->flags & LOOKUP_RCU ? MAY_NOT_BLOCK : 0;
 	err = lookup_inode_permission_may_exec(idmap, nd->inode, mask);
 	if (likely(!err))
@@ -2527,6 +2528,9 @@ static int link_path_walk(const char *name, struct nameidata *nd)
 		return 0;
 	}

+	VFS_BUG_ON(!d_can_lookup(nd->path.dentry));
+	VFS_BUG_ON(!S_ISDIR(nd->path.dentry->d_inode->i_mode));
+
 	/* At this point we know we have a real path component. */
 	for(;;) {
 		struct mnt_idmap *idmap;

syzbot

Dec 3, 2025, 8:21:06 PM
to bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, mjg...@gmail.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
kernel BUG in link_path_walk

(syz.0.73,6964,1):ocfs2_find_entry_id:420 ERROR: status = -30
------------[ cut here ]------------
kernel BUG at fs/namei.c:2532!
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 1 UID: 0 PID: 6964 Comm: syz.0.73 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
RIP: 0010:link_path_walk+0x1a57/0x1a90 fs/namei.c:2532
Code: 89 e9 80 e1 07 fe c1 38 c1 0f 8c be fd ff ff 4c 89 ef e8 2c e5 e9 ff e9 b1 fd ff ff e8 62 90 83 ff 90 0f 0b e8 5a 90 83 ff 90 <0f> 0b e8 52 90 83 ff 90 0f 0b e8 4a 90 83 ff 4c 89 ff 48 c7 c6 40
RSP: 0018:ffffc9000491f8a0 EFLAGS: 00010293
RAX: ffffffff823e22d6 RBX: dffffc0000000000 RCX: ffff8880250f3d00
RDX: 0000000000000000 RSI: 0000000000008000 RDI: 0000000000004000
RBP: ffff888079181120 R08: ffff8880299ef520 R09: ffff88807acd2000
R10: ffff8880299ef520 R11: ffff88807acd2000 R12: ffffc9000491fc58
R13: ffffc9000491fc28 R14: 0000000000008000 R15: 0000000000100000
FS: 00007ff92e2e36c0(0000) GS:ffff888125f49000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555575344808 CR3: 0000000025c92000 CR4: 00000000003526f0
Call Trace:
<TASK>
path_openat+0x2b3/0x3dd0 fs/namei.c:4787
do_filp_open+0x1fa/0x410 fs/namei.c:4818
do_sys_openat2+0x121/0x200 fs/open.c:1430
do_sys_open fs/open.c:1436 [inline]
__do_sys_open fs/open.c:1444 [inline]
__se_sys_open fs/open.c:1440 [inline]
__x64_sys_open+0x11e/0x150 fs/open.c:1440
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff92d38f749
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ff92e2e3038 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007ff92d5e5fa0 RCX: 00007ff92d38f749
RDX: 0000000000000000 RSI: 0000000000145142 RDI: 0000200000000240
RBP: 00007ff92d413f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007ff92d5e6038 R14: 00007ff92d5e5fa0 R15: 00007ffda4bd0278
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:link_path_walk+0x1a57/0x1a90 fs/namei.c:2532
Code: 89 e9 80 e1 07 fe c1 38 c1 0f 8c be fd ff ff 4c 89 ef e8 2c e5 e9 ff e9 b1 fd ff ff e8 62 90 83 ff 90 0f 0b e8 5a 90 83 ff 90 <0f> 0b e8 52 90 83 ff 90 0f 0b e8 4a 90 83 ff 4c 89 ff 48 c7 c6 40
RSP: 0018:ffffc9000491f8a0 EFLAGS: 00010293
RAX: ffffffff823e22d6 RBX: dffffc0000000000 RCX: ffff8880250f3d00
RDX: 0000000000000000 RSI: 0000000000008000 RDI: 0000000000004000
RBP: ffff888079181120 R08: ffff8880299ef520 R09: ffff88807acd2000
R10: ffff8880299ef520 R11: ffff88807acd2000 R12: ffffc9000491fc58
R13: ffffc9000491fc28 R14: 0000000000008000 R15: 0000000000100000
FS: 00007ff92e2e36c0(0000) GS:ffff888125e49000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f1310116e9c CR3: 0000000025c92000 CR4: 00000000003526f0


Tested on:

commit: b2c27842 Add linux-next specific files for 20251203
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=15d7801a580000
kernel config: https://syzkaller.appspot.com/x/.config?x=caadf525b0ab8d17
dashboard link: https://syzkaller.appspot.com/bug?extid=d222f4b7129379c3d5bc
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
patch: https://syzkaller.appspot.com/x/patch.diff?x=1281d4c2580000

Mateusz Guzik

Dec 4, 2025, 2:45:24 AM
to syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
On Thu, Dec 4, 2025 at 2:21 AM syzbot
<syzbot+d222f4...@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> kernel BUG in link_path_walk
>
> (syz.0.73,6964,1):ocfs2_find_entry_id:420 ERROR: status = -30
> ------------[ cut here ]------------
> kernel BUG at fs/namei.c:2532!

On the commit syzbot is testing on (b2c27842) and with the patch, the
triggered assert is the second one on S_ISDIR:
VFS_BUG_ON(!d_can_lookup(nd->path.dentry));
VFS_BUG_ON(!S_ISDIR(nd->path.dentry->d_inode->i_mode));

d_can_lookup is __d_entry_type(dentry) == DCACHE_DIRECTORY_TYPE;

Or to put it differently, lookup got entered with a bogus state of a
dentry claiming it is a directory, with an inode which is not. Per the
i_mode reported in the opening mail it is a regular file instead.

While I don't see how this can happen, I don't think it is *my* bug
either -- it is merely that nothing else asserted that the two are in
sync.

syzbot likes to operate on corrupted filesystems, so I'm going to
assume things are going haywire in ocfs2 until proven otherwise.

Al Viro

Dec 4, 2025, 3:21:50 AM
to Mateusz Guzik, syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com
On Thu, Dec 04, 2025 at 08:45:08AM +0100, Mateusz Guzik wrote:

> Or to put it differently, lookup got entered with a bogus state of a
> dentry claiming it is a directory, with an inode which is not. Per the
> i_mode reported in the opening mail it is a regular file instead.
>
> While I don't see how this can happen,

->i_op set to something with ->lookup != NULL, ->i_mode - to regular.
Which is to say, bogus ->i_mode change somewhere.

Theoretically it should bail out, having detected the type change
(on inode_wrong_type()). I'd suggest slapping
BUG_ON(inode_wrong_type(inode, new_i_mode_value));
in front of all reassignments (ocfs2_populate_inode() is the initialization
and thus exempt; all other stores to ->i_mode of struct inode in there
are, in principle, suspect). Something like inode->i_mode &= ~S_ISUID
doesn't need checking - we obviously can't change the type there.
Unpleasant part is that struct ocfs2_dinode also has a member called
i_mode (__le16, that one), so stores to that clutter the grep results...

Mateusz Guzik

Dec 4, 2025, 3:40:17 AM
to Al Viro, syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com
Now that I wrote this I suspect there is at least one way, regardless
of whether ocfs2 is the culprit.

Suppose you are in rcu-walk and someone continuously issues mkdir,
rmdir, creat, unlink on the same pathname. Affected dentry will keep
flipping between directory, negative entry and regular.

While such fuckery will be caught with seq changes, perhaps the
intermediate state can indeed result in finding such a mismatch but
only because of a race.

I'm going to have to chew on it, I don't know if I'll have time today
to deal with it. Worst case the fix will be to check if this is a dir
in lookup_inode_permission_may_exec instead of merely asserting on it.

Mateusz Guzik

Dec 4, 2025, 4:10:00 AM
to syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
On Wed, Dec 03, 2025 at 04:07:27PM -0800, syzbot wrote:
#syz test

diff --git a/fs/namei.c b/fs/namei.c
index bf0f66f0e9b9..87c99149a152 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1896,6 +1896,14 @@ static inline int may_lookup(struct mnt_idmap *idmap,
 {
 	int err, mask;

+	struct dentry *_dentry = nd->path.dentry;
+	struct inode *_inode = READ_ONCE(_dentry->d_inode);
+	if (!d_can_lookup(_dentry) || !_inode || !S_ISDIR(_inode->i_mode)) {
+		spin_lock(&_dentry->d_lock);
+		VFS_BUG_ON_INODE(d_can_lookup(_dentry) && !S_ISDIR(_dentry->d_inode->i_mode), _dentry->d_inode);
+		spin_unlock(&_dentry->d_lock);
+	}
+
 	mask = nd->flags & LOOKUP_RCU ? MAY_NOT_BLOCK : 0;
 	err = lookup_inode_permission_may_exec(idmap, nd->inode, mask);
 	if (likely(!err))
@@ -2527,6 +2535,14 @@ static int link_path_walk(const char *name, struct nameidata *nd)
 		return 0;
 	}

+	struct dentry *_dentry = nd->path.dentry;
+	struct inode *_inode = READ_ONCE(_dentry->d_inode);
+	if (!d_can_lookup(_dentry) || !_inode || !S_ISDIR(_inode->i_mode)) {
+		spin_lock(&_dentry->d_lock);
+		VFS_BUG_ON_INODE(d_can_lookup(_dentry) && !S_ISDIR(_dentry->d_inode->i_mode), _dentry->d_inode);
+		spin_unlock(&_dentry->d_lock);
+	}

syzbot

Dec 4, 2025, 5:13:05 AM
to bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, mjg...@gmail.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
Hello,

syzbot tried to test the proposed patch but the build/boot failed:

SYZFAIL: failed to recv rpc

SYZFAIL: failed to recv rpc


Warning: Permanently added '10.128.0.177' (ED25519) to the list of known hosts.
2025/12/04 10:11:47 parsed 1 programs
[ 78.910789][ T5830] cgroup: Unknown subsys name 'net'
[ 79.061524][ T5830] cgroup: Unknown subsys name 'cpuset'
[ 79.071037][ T5830] cgroup: Unknown subsys name 'rlimit'
Setting up swapspace version 1, size = 127995904 bytes
[ 80.470098][ T5830] Adding 124996k swap on ./swap-file. Priority:0 extents:1 across:124996k
[ 83.462586][ T5842] soft_limit_in_bytes is deprecated and will be removed. Please report your usecase to linu...@kvack.org if you depend on this functionality.
[ 83.768296][ T1303] wlan0: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[ 83.776268][ T1303] wlan0: Creating new IBSS network, BSSID 50:50:50:50:50:50
[ 84.018160][ T36] wlan1: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[ 84.026332][ T36] wlan1: Creating new IBSS network, BSSID 50:50:50:50:50:50
[ 84.079691][ T5149] Bluetooth: hci0: unexpected cc 0x0c03 length: 249 > 1
[ 84.089721][ T5149] Bluetooth: hci0: unexpected cc 0x1003 length: 249 > 9
[ 84.097886][ T5149] Bluetooth: hci0: unexpected cc 0x1001 length: 249 > 9
[ 84.120020][ T5149] Bluetooth: hci0: unexpected cc 0x0c23 length: 249 > 4
[ 84.128054][ T5149] Bluetooth: hci0: unexpected cc 0x0c38 length: 249 > 2
[ 86.764273][ T5914] chnl_net:caif_netlink_parms(): no params data found
[ 86.952176][ T10] cfg80211: failed to load regulatory.db
[ 86.970822][ T5914] bridge0: port 1(bridge_slave_0) entered blocking state
[ 86.987663][ T5914] bridge0: port 1(bridge_slave_0) entered disabled state
[ 86.995387][ T5914] bridge_slave_0: entered allmulticast mode
[ 87.004680][ T5914] bridge_slave_0: entered promiscuous mode
[ 87.019597][ T5914] bridge0: port 2(bridge_slave_1) entered blocking state
[ 87.038879][ T5914] bridge0: port 2(bridge_slave_1) entered disabled state
[ 87.046589][ T5914] bridge_slave_1: entered allmulticast mode
[ 87.055691][ T5914] bridge_slave_1: entered promiscuous mode
[ 87.132629][ T5914] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
[ 87.165237][ T5914] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link
[ 87.263236][ T5914] team0: Port device team_slave_0 added
[ 87.273397][ T5914] team0: Port device team_slave_1 added
[ 87.303206][ T5914] batman_adv: batadv0: Adding interface: batadv_slave_0
[ 87.311015][ T5914] batman_adv: batadv0: The MTU of interface batadv_slave_0 is too small (1500) to handle the transport of batman-adv packets. Packets going over this interface will be fragmented on layer2 which could impact the performance. Setting the MTU to 1532 would solve the problem.
[ 87.339107][ T5914] batman_adv: batadv0: Not using interface batadv_slave_0 (retrying later): interface not active
[ 87.352524][ T5914] batman_adv: batadv0: Adding interface: batadv_slave_1
[ 87.360592][ T5914] batman_adv: batadv0: The MTU of interface batadv_slave_1 is too small (1500) to handle the transport of batman-adv packets. Packets going over this interface will be fragmented on layer2 which could impact the performance. Setting the MTU to 1532 would solve the problem.
[ 87.387771][ T5914] batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
[ 87.429283][ T5914] hsr_slave_0: entered promiscuous mode
[ 87.436383][ T5914] hsr_slave_1: entered promiscuous mode
[ 87.589866][ T5914] netdevsim netdevsim2 netdevsim0: renamed from eth0
[ 87.602532][ T5914] netdevsim netdevsim2 netdevsim1: renamed from eth1
[ 87.612793][ T5914] netdevsim netdevsim2 netdevsim2: renamed from eth2
[ 87.624712][ T5914] netdevsim netdevsim2 netdevsim3: renamed from eth3
[ 87.702119][ T5914] 8021q: adding VLAN 0 to HW filter on device bond0
[ 87.727060][ T5914] 8021q: adding VLAN 0 to HW filter on device team0
[ 87.742924][ T3460] bridge0: port 1(bridge_slave_0) entered blocking state
[ 87.750338][ T3460] bridge0: port 1(bridge_slave_0) entered forwarding state
[ 87.767840][ T13] bridge0: port 2(bridge_slave_1) entered blocking state
[ 87.775679][ T13] bridge0: port 2(bridge_slave_1) entered forwarding state
[ 87.951069][ T5914] 8021q: adding VLAN 0 to HW filter on device batadv0
[ 87.997784][ T5914] veth0_vlan: entered promiscuous mode
[ 88.011677][ T5914] veth1_vlan: entered promiscuous mode
[ 88.040488][ T5914] veth0_macvtap: entered promiscuous mode
[ 88.050521][ T5914] veth1_macvtap: entered promiscuous mode
[ 88.070672][ T5914] batman_adv: batadv0: Interface activated: batadv_slave_0
[ 88.086679][ T5914] batman_adv: batadv0: Interface activated: batadv_slave_1
[ 88.102451][ T1303] netdevsim netdevsim2 netdevsim0: set [1, 0] type 2 family 0 port 6081 - 0
[ 88.112929][ T1303] netdevsim netdevsim2 netdevsim1: set [1, 0] type 2 family 0 port 6081 - 0
[ 88.125269][ T1303] netdevsim netdevsim2 netdevsim2: set [1, 0] type 2 family 0 port 6081 - 0
[ 88.138936][ T1303] netdevsim netdevsim2 netdevsim3: set [1, 0] type 2 family 0 port 6081 - 0
2025/12/04 10:11:58 executed programs: 0
[ 88.265933][ T5149] Bluetooth: hci0: unexpected cc 0x0c03 length: 249 > 1
[ 88.275000][ T5149] Bluetooth: hci0: unexpected cc 0x1003 length: 249 > 9
[ 88.283928][ T5149] Bluetooth: hci0: unexpected cc 0x1001 length: 249 > 9
[ 88.294110][ T5149] Bluetooth: hci0: unexpected cc 0x0c23 length: 249 > 4
[ 88.303601][ T5149] Bluetooth: hci0: unexpected cc 0x0c38 length: 249 > 2
[ 88.516655][ T5945] chnl_net:caif_netlink_parms(): no params data found
[ 88.589071][ T5945] bridge0: port 1(bridge_slave_0) entered blocking state
[ 88.597183][ T5945] bridge0: port 1(bridge_slave_0) entered disabled state
[ 88.605206][ T5945] bridge_slave_0: entered allmulticast mode
[ 88.613430][ T5945] bridge_slave_0: entered promiscuous mode
[ 88.621792][ T5945] bridge0: port 2(bridge_slave_1) entered blocking state
[ 88.629190][ T5945] bridge0: port 2(bridge_slave_1) entered disabled state
[ 88.636388][ T5945] bridge_slave_1: entered allmulticast mode
[ 88.644502][ T5945] bridge_slave_1: entered promiscuous mode
[ 88.679834][ T5945] bond0: (slave bond_slave_0): Enslaving as an active interface with an up link
[ 88.693258][ T5945] bond0: (slave bond_slave_1): Enslaving as an active interface with an up link
[ 88.729876][ T5945] team0: Port device team_slave_0 added
[ 88.739741][ T5945] team0: Port device team_slave_1 added
[ 88.770447][ T5945] batman_adv: batadv0: Adding interface: batadv_slave_0
[ 88.778982][ T5945] batman_adv: batadv0: The MTU of interface batadv_slave_0 is too small (1500) to handle the transport of batman-adv packets. Packets going over this interface will be fragmented on layer2 which could impact the performance. Setting the MTU to 1532 would solve the problem.
[ 88.806119][ T5945] batman_adv: batadv0: Not using interface batadv_slave_0 (retrying later): interface not active
[ 88.819400][ T5945] batman_adv: batadv0: Adding interface: batadv_slave_1
[ 88.826898][ T5945] batman_adv: batadv0: The MTU of interface batadv_slave_1 is too small (1500) to handle the transport of batman-adv packets. Packets going over this interface will be fragmented on layer2 which could impact the performance. Setting the MTU to 1532 would solve the problem.
[ 88.854775][ T5945] batman_adv: batadv0: Not using interface batadv_slave_1 (retrying later): interface not active
[ 88.906117][ T5945] hsr_slave_0: entered promiscuous mode
[ 88.913653][ T5945] hsr_slave_1: entered promiscuous mode
[ 88.920136][ T5945] debugfs: 'hsr0' already exists in 'hsr'
[ 88.926073][ T5945] Cannot create hsr debugfs directory
[ 89.100742][ T5945] netdevsim netdevsim0 netdevsim0: renamed from eth0
[ 89.113003][ T5945] netdevsim netdevsim0 netdevsim1: renamed from eth1
[ 89.123282][ T5945] netdevsim netdevsim0 netdevsim2: renamed from eth2
[ 89.134611][ T5945] netdevsim netdevsim0 netdevsim3: renamed from eth3
[ 89.164880][ T5945] bridge0: port 2(bridge_slave_1) entered blocking state
[ 89.172220][ T5945] bridge0: port 2(bridge_slave_1) entered forwarding state
[ 89.180595][ T5945] bridge0: port 1(bridge_slave_0) entered blocking state
[ 89.188151][ T5945] bridge0: port 1(bridge_slave_0) entered forwarding state
[ 89.201467][ T13] bridge0: port 1(bridge_slave_0) entered disabled state
[ 89.210194][ T13] bridge0: port 2(bridge_slave_1) entered disabled state
[ 89.271133][ T5945] 8021q: adding VLAN 0 to HW filter on device bond0
[ 89.292385][ T5945] 8021q: adding VLAN 0 to HW filter on device team0
[ 89.304472][ T13] bridge0: port 1(bridge_slave_0) entered blocking state
[ 89.311731][ T13] bridge0: port 1(bridge_slave_0) entered forwarding state
[ 89.328520][ T1303] bridge0: port 2(bridge_slave_1) entered blocking state
[ 89.336026][ T1303] bridge0: port 2(bridge_slave_1) entered forwarding state
[ 89.510299][ T5945] 8021q: adding VLAN 0 to HW filter on device batadv0
[ 89.555268][ T5945] veth0_vlan: entered promiscuous mode
[ 89.567506][ T5945] veth1_vlan: entered promiscuous mode
[ 89.602029][ T5945] veth0_macvtap: entered promiscuous mode
[ 89.611643][ T5945] veth1_macvtap: entered promiscuous mode
[ 89.630992][ T5945] batman_adv: batadv0: Interface activated: batadv_slave_0
[ 89.645744][ T5945] batman_adv: batadv0: Interface activated: batadv_slave_1
[ 89.661459][ T13] netdevsim netdevsim0 netdevsim0: set [1, 0] type 2 family 0 port 6081 - 0
[ 89.675047][ T13] netdevsim netdevsim0 netdevsim1: set [1, 0] type 2 family 0 port 6081 - 0
[ 89.685863][ T13] netdevsim netdevsim0 netdevsim2: set [1, 0] type 2 family 0 port 6081 - 0
[ 89.695952][ T13] netdevsim netdevsim0 netdevsim3: set [1, 0] type 2 family 0 port 6081 - 0
[ 89.763899][ T3460] wlan0: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[ 89.773242][ T3460] wlan0: Creating new IBSS network, BSSID 50:50:50:50:50:50
[ 89.806188][ T13] wlan1: Created IBSS using preconfigured BSSID 50:50:50:50:50:50
[ 89.814827][ T13] wlan1: Creating new IBSS network, BSSID 50:50:50:50:50:50
SYZFAIL: failed to recv rpc
[ 90.239026][ T13] netdevsim netdevsim2 netdevsim3 (unregistering): unset [1, 0] type 2 family 0 port 6081 - 0


syzkaller build log:
go env (err=<nil>)
AR='ar'
CC='gcc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='g++'
GCCGO='gccgo'
GO111MODULE='auto'
GOAMD64='v1'
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/syzkaller/.cache/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/syzkaller/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build2089224975=/tmp/go-build -gno-record-gcc-switches'
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMOD='/syzkaller/jobs-2/linux/gopath/src/github.com/google/syzkaller/go.mod'
GOMODCACHE='/syzkaller/jobs-2/linux/gopath/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/syzkaller/jobs-2/linux/gopath'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/syzkaller/.config/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.24.4'
GOWORK=''
PKG_CONFIG='pkg-config'

git status (err=<nil>)
HEAD detached at d6526ea3e
nothing to commit, working tree clean


tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
go list -f '{{.Stale}}' -ldflags="-s -w -X github.com/google/syzkaller/prog.GitRevision=d6526ea3e6ad9081c902859bbb80f9f840377cb4 -X github.com/google/syzkaller/prog.gitRevisionDate=20251126-113115" ./sys/syz-sysgen | grep -q false || go install -ldflags="-s -w -X github.com/google/syzkaller/prog.GitRevision=d6526ea3e6ad9081c902859bbb80f9f840377cb4 -X github.com/google/syzkaller/prog.gitRevisionDate=20251126-113115" ./sys/syz-sysgen
make .descriptions
tput: No value for $TERM and no -T specified
tput: No value for $TERM and no -T specified
Makefile:31: run command via tools/syz-env for best compatibility, see:
Makefile:32: https://github.com/google/syzkaller/blob/master/docs/contributing.md#using-syz-env
bin/syz-sysgen
touch .descriptions
GOOS=linux GOARCH=amd64 go build -ldflags="-s -w -X github.com/google/syzkaller/prog.GitRevision=d6526ea3e6ad9081c902859bbb80f9f840377cb4 -X github.com/google/syzkaller/prog.gitRevisionDate=20251126-113115" -o ./bin/linux_amd64/syz-execprog github.com/google/syzkaller/tools/syz-execprog
mkdir -p ./bin/linux_amd64
g++ -o ./bin/linux_amd64/syz-executor executor/executor.cc \
-m64 -O2 -pthread -Wall -Werror -Wparentheses -Wunused-const-variable -Wframe-larger-than=16384 -Wno-stringop-overflow -Wno-array-bounds -Wno-format-overflow -Wno-unused-but-set-variable -Wno-unused-command-line-argument -static-pie -std=c++17 -I. -Iexecutor/_include -DGOOS_linux=1 -DGOARCH_amd64=1 \
-DHOSTGOOS_linux=1 -DGIT_REVISION=\"d6526ea3e6ad9081c902859bbb80f9f840377cb4\"
/usr/bin/ld: /tmp/cc9mWJPn.o: in function `Connection::Connect(char const*, char const*)':
executor.cc:(.text._ZN10Connection7ConnectEPKcS1_[_ZN10Connection7ConnectEPKcS1_]+0x104): warning: Using 'gethostbyname' in statically linked applications requires at runtime the shared libraries from the glibc version used for linking
./tools/check-syzos.sh 2>/dev/null



Tested on:

commit: bc04acf4 Add linux-next specific files for 20251204
git tree: linux-next
kernel config: https://syzkaller.appspot.com/x/.config?x=a94030c847137a18
dashboard link: https://syzkaller.appspot.com/bug?extid=d222f4b7129379c3d5bc
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
patch: https://syzkaller.appspot.com/x/patch.diff?x=1731d01a580000

Mateusz Guzik

Dec 4, 2025, 5:15:30 AM
to syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
syzbot had an internal failure, so let's try again

syzbot

Dec 4, 2025, 6:56:03 AM
to bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, mjg...@gmail.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
kernel BUG in link_path_walk

VFS_BUG_ON_INODE(d_can_lookup(_dentry) && !S_ISDIR(_dentry->d_inode->i_mode)) encountered for inode ffff888074eca4f8
fs ocfs2 mode 100000 opflags 0x2 flags 0x20 state 0x0 count 2
------------[ cut here ]------------
kernel BUG at fs/namei.c:2542!
Oops: invalid opcode: 0000 [#1] SMP KASAN PTI
CPU: 1 UID: 0 PID: 7668 Comm: syz.0.211 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
RIP: 0010:link_path_walk+0x1d7f/0x1d90 fs/namei.c:2542
Code: e8 a6 16 ea fe 90 0f 0b e8 de 8c 83 ff 41 89 ef e9 d2 fc ff ff e8 d1 8c 83 ff 4c 89 ff 48 c7 c6 40 d6 79 8b e8 82 16 ea fe 90 <0f> 0b 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90
RSP: 0018:ffffc9000c5a78a0 EFLAGS: 00010282
RAX: 00000000000000b2 RBX: ffffc9000c5a7c20 RCX: 68650f632580b300
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffff888011640020 R08: 0000000000000003 R09: 0000000000000004
R10: dffffc0000000000 R11: fffffbfff1bbae20 R12: 0000000000008000
R13: ffffc9000c5a7c28 R14: ffff888074f0a0b8 R15: ffff888074eca4f8
FS: 00007f56a8b4f6c0(0000) GS:ffff888125f3a000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f56a73fdf98 CR3: 00000000742f6000 CR4: 00000000003526f0
Call Trace:
<TASK>
path_openat+0x2b3/0x3dd0 fs/namei.c:4799
do_filp_open+0x1fa/0x410 fs/namei.c:4830
do_sys_openat2+0x121/0x200 fs/open.c:1430
do_sys_open fs/open.c:1436 [inline]
__do_sys_open fs/open.c:1444 [inline]
__se_sys_open fs/open.c:1440 [inline]
__x64_sys_open+0x11e/0x150 fs/open.c:1440
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f56a7d8f749
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f56a8b4f038 EFLAGS: 00000246 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 00007f56a7fe5fa0 RCX: 00007f56a7d8f749
RDX: 0000000000000000 RSI: 0000000000145142 RDI: 0000200000000240
RBP: 00007f56a7e13f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f56a7fe6038 R14: 00007f56a7fe5fa0 R15: 00007ffd61dc1878
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:link_path_walk+0x1d7f/0x1d90 fs/namei.c:2542
Code: e8 a6 16 ea fe 90 0f 0b e8 de 8c 83 ff 41 89 ef e9 d2 fc ff ff e8 d1 8c 83 ff 4c 89 ff 48 c7 c6 40 d6 79 8b e8 82 16 ea fe 90 <0f> 0b 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90
RSP: 0018:ffffc9000c5a78a0 EFLAGS: 00010282
RAX: 00000000000000b2 RBX: ffffc9000c5a7c20 RCX: 68650f632580b300
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffff888011640020 R08: 0000000000000003 R09: 0000000000000004
R10: dffffc0000000000 R11: fffffbfff1bbae20 R12: 0000000000008000
R13: ffffc9000c5a7c28 R14: ffff888074f0a0b8 R15: ffff888074eca4f8
FS: 00007f56a8b4f6c0(0000) GS:ffff888125f3a000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f56a73fdf98 CR3: 00000000742f6000 CR4: 00000000003526f0


Tested on:

commit: bc04acf4 Add linux-next specific files for 20251204
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=107bd01a580000
kernel config: https://syzkaller.appspot.com/x/.config?x=a94030c847137a18
dashboard link: https://syzkaller.appspot.com/bug?extid=d222f4b7129379c3d5bc
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
patch: https://syzkaller.appspot.com/x/patch.diff?x=1377401a580000

Mateusz Guzik

Dec 4, 2025, 6:58:44 AM
to syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
On Thu, Dec 4, 2025 at 12:56 PM syzbot
<syzbot+d222f4...@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> kernel BUG in link_path_walk
>
> VFS_BUG_ON_INODE(d_can_lookup(_dentry) && !S_ISDIR(_dentry->d_inode->i_mode)) encountered for inode ffff888074eca4f8
> fs ocfs2 mode 100000 opflags 0x2 flags 0x20 state 0x0 count 2

note the patch at hand made sure to avoid transient states by taking a
lock on the dentry:
+	struct dentry *_dentry = nd->path.dentry;
+	struct inode *_inode = READ_ONCE(_dentry->d_inode);
+	if (!d_can_lookup(_dentry) || !_inode || !S_ISDIR(_inode->i_mode)) {
+		spin_lock(&_dentry->d_lock);
+		VFS_BUG_ON_INODE(d_can_lookup(_dentry) &&
+			!S_ISDIR(_dentry->d_inode->i_mode), _dentry->d_inode);
+		spin_unlock(&_dentry->d_lock);
+	}

So the state *is* indeed bogus and this is most likely something ocfs2-internal.

I'm buggering off this report.

Tetsuo Handa

Dec 10, 2025, 4:45:48 AM
to Mateusz Guzik, Al Viro, syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, Chuck Lever
syzbot is hitting VFS_BUG_ON_INODE(!S_ISDIR(inode->i_mode)) check
introduced by commit e631df89cd5d ("fs: speed up path lookup with cheaper
handling of MAY_EXEC"), for make_bad_inode() is blindly changing file type
to S_IFREG. Since make_bad_inode() might be called after an inode is fully
constructed, make_bad_inode() should not needlessly change file type.

Reported-by: syzbot+d222f4...@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=d222f4b7129379c3d5bc
Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
---
Should we implement all callbacks (except get_offset_ctx callback which is
currently used by only tmpfs which does not call make_bad_inode()) within
bad_inode_ops, for there might be a callback which is expected to be non-NULL
for !S_IFREG types? Implementing missing callbacks is good for eliminating
possibility of NULL function pointer call. Since VFS is using

	if (!inode->i_op->foo)
		return error;
	inode->i_op->foo();

pattern instead of

	pFoo = READ_ONCE(inode->i_op->foo);
	if (!pFoo)
		return error;
	pFoo();

pattern, suddenly replacing "one i_op with i_op->foo != NULL" with "another
i_op with i_op->foo == NULL" has possibility of NULL pointer function call
(e.g. https://lkml.kernel.org/r/18a58415-4aa9-4cba...@I-love.SAKURA.ne.jp ).
If we implement missing callbacks, e.g. vfs_fileattr_get() will start
calling security_inode_file_getattr() on bad inode, but we can eliminate
possibility of inode->i_op->fileattr_get == NULL when make_bad_inode() is
called from security_inode_file_getattr() for some reason.
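
[Editorial sketch, not part of the patch: the difference between the two patterns above can be shown in userspace C. Everything here (demo_ops, demo_obj, call_foo_snapshot) is a hypothetical stand-in, and C11 atomics are assumed in place of the kernel's READ_ONCE(). The point is only that the callback pointer is loaded once into a local, so the NULL check and the call cannot see two different values.]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical operations table; stands in for something like i_op. */
struct demo_ops {
    int (*foo)(void);
};

/* Hypothetical object whose ops table can be swapped at runtime. */
struct demo_obj {
    _Atomic(const struct demo_ops *) ops;
};

static int demo_foo_impl(void)
{
    return 42;
}

static const struct demo_ops good_ops = { .foo = demo_foo_impl };
static const struct demo_ops bad_ops = { .foo = NULL };

/*
 * Load the ops table once, copy the callback into a local, then test and
 * call the local copy. A concurrent swap of obj->ops can no longer slip in
 * between the NULL check and the call.
 * Returns -1 when the callback is absent.
 */
static int call_foo_snapshot(struct demo_obj *obj)
{
    const struct demo_ops *ops =
        atomic_load_explicit(&obj->ops, memory_order_acquire);
    int (*fn)(void) = ops->foo;

    if (!fn)
        return -1;
    return fn();
}
```

[A caller would initialise obj.ops with atomic_init() and call call_foo_snapshot(&obj). Whether the kernel should adopt this pattern everywhere is the open question raised above; the sketch takes no position on it.]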

fs/bad_inode.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/bad_inode.c b/fs/bad_inode.c
index 0ef9bcb744dd..ff6c2daecd1c 100644
--- a/fs/bad_inode.c
+++ b/fs/bad_inode.c
@@ -207,7 +207,19 @@ void make_bad_inode(struct inode *inode)
 {
 	remove_inode_hash(inode);

-	inode->i_mode = S_IFREG;
+	switch (inode->i_mode & S_IFMT) {
+	case S_IFREG:
+	case S_IFDIR:
+	case S_IFLNK:
+	case S_IFCHR:
+	case S_IFBLK:
+	case S_IFIFO:
+	case S_IFSOCK:
+		inode->i_mode &= S_IFMT;
+		break;
+	default:
+		inode->i_mode = S_IFREG;
+	}
 	simple_inode_init_ts(inode);
 	inode->i_op = &bad_inode_ops;
 	inode->i_opflags &= ~IOP_XATTR;
--
2.47.3


Jan Kara

Dec 10, 2025, 5:09:19 AM
to Tetsuo Handa, Mateusz Guzik, Al Viro, syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, Chuck Lever
On Wed 10-12-25 18:45:26, Tetsuo Handa wrote:
> syzbot is hitting VFS_BUG_ON_INODE(!S_ISDIR(inode->i_mode)) check
> introduced by commit e631df89cd5d ("fs: speed up path lookup with cheaper
> handling of MAY_EXEC"), for make_bad_inode() is blindly changing file type
> to S_IFREG. Since make_bad_inode() might be called after an inode is fully
> constructed, make_bad_inode() should not needlessly change file type.
>
> Reported-by: syzbot+d222f4...@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=d222f4b7129379c3d5bc
> Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>

No. make_bad_inode() must not be called once the inode is fully visible
because that can cause all sorts of fun. That function is really only good
for handling a situation when read of an inode from the disk failed or
similar early error paths. It would be great if make_bad_inode() could do
something like:

VFS_BUG_ON_INODE(!(inode_state_read_once(inode) & I_NEW));

but sadly that is not currently possible because inodes start with i_state
set to 0 and some places do call make_bad_inode() before I_NEW is set in
i_state. Mateusz wanted to clean that up a bit AFAIK.

Until the cleanup is done, perhaps we could add:

VFS_BUG_ON_INODE(inode->i_dentry->first);

to make_bad_inode() and watch the fireworks from syzbot. But at least the
bugs would be attributed to the place where they are happening.

Honza
--
Jan Kara <ja...@suse.com>
SUSE Labs, CR

Mateusz Guzik

Dec 10, 2025, 5:09:40 AM
to Tetsuo Handa, Al Viro, syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, Chuck Lever
On Wed, Dec 10, 2025 at 10:45 AM Tetsuo Handa
<penguin...@i-love.sakura.ne.jp> wrote:
>
> syzbot is hitting VFS_BUG_ON_INODE(!S_ISDIR(inode->i_mode)) check
> introduced by commit e631df89cd5d ("fs: speed up path lookup with cheaper
> handling of MAY_EXEC"), for make_bad_inode() is blindly changing file type
> to S_IFREG. Since make_bad_inode() might be called after an inode is fully
> constructed, make_bad_inode() should not needlessly change file type.
>

ouch

So let's say calls to make_bad_inode *after* d_instantiate are unavoidable.

While screwing around with inode type for bogus inodes has merit,
switching things up from under dentry is bogus in its own right and
the choice of S_IFREG is questionable at best -- as is, the call
introduces internally inconsistent state: in the case of the original
report, a dentry claiming it's a directory pointing to a regular inode.

The non-screwed way of handling this would introduce a known BAD inode
type and patch up dentries as needed as well, but that might be a lot
of churn for not that much benefit.

As is, I don't know if leaving the type unchanged is safe -- there
might be nasty cases which depend on the change.

At the same time I claim the assert I introduced is mandatory.

Absent thorough audit, I think the pragmatic way forward for the time
being is to give code a chance to assert correctness and adjust the
assert in lookup code accordingly.

Right now that's not possible since the reassignment of type happens
without the inode spinlock held. Since make_bad_inode() calls into
remove_inode_hash which takes the spinlock, it should be safe to also
take it around the reassignment.

Then code wishing to assert on type race-free can take the spinlock to
stabilize it.
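
[Editorial sketch of the locking idea above, in userspace C. All names (demo_inode, demo_make_bad, demo_dir_or_bad) are hypothetical, and a C11 atomic_flag spinlock is assumed as a stand-in for inode->i_lock. The writer changes the type and the "bad" marker under the lock, so a checker that takes the same lock never observes one without the other.]

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum demo_type { DEMO_DIR, DEMO_REG };

/* Hypothetical inode: a tiny spinlock plus the two fields being checked. */
struct demo_inode {
    atomic_flag lock;
    enum demo_type type;
    bool bad;
};

static void demo_inode_init(struct demo_inode *inode, enum demo_type type)
{
    atomic_flag_clear(&inode->lock);
    inode->type = type;
    inode->bad = false;
}

static void demo_lock(struct demo_inode *inode)
{
    while (atomic_flag_test_and_set_explicit(&inode->lock,
                                             memory_order_acquire))
        ; /* spin until the flag is released */
}

static void demo_unlock(struct demo_inode *inode)
{
    atomic_flag_clear_explicit(&inode->lock, memory_order_release);
}

/* Writer: both fields change inside one critical section. */
static void demo_make_bad(struct demo_inode *inode)
{
    demo_lock(inode);
    inode->type = DEMO_REG;
    inode->bad = true;
    demo_unlock(inode);
}

/* Checker: true when the inode is a directory or was explicitly marked bad. */
static bool demo_dir_or_bad(struct demo_inode *inode)
{
    bool ok;

    demo_lock(inode);
    ok = (inode->type == DEMO_DIR) || inode->bad;
    demo_unlock(inode);
    return ok;
}
```

[An assertion built on demo_dir_or_bad() fires only for a non-directory that nobody marked bad, which is the case the lookup code wants to catch.]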

Mateusz Guzik

Dec 10, 2025, 5:24:56 AM
to Jan Kara, Tetsuo Handa, Al Viro, syzbot, bra...@kernel.org, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, Chuck Lever
On Wed, Dec 10, 2025 at 11:09 AM Jan Kara <ja...@suse.cz> wrote:
>
> On Wed 10-12-25 18:45:26, Tetsuo Handa wrote:
> > syzbot is hitting VFS_BUG_ON_INODE(!S_ISDIR(inode->i_mode)) check
> > introduced by commit e631df89cd5d ("fs: speed up path lookup with cheaper
> > handling of MAY_EXEC"), for make_bad_inode() is blindly changing file type
> > to S_IFREG. Since make_bad_inode() might be called after an inode is fully
> > constructed, make_bad_inode() should not needlessly change file type.
> >
> > Reported-by: syzbot+d222f4...@syzkaller.appspotmail.com
> > Closes: https://syzkaller.appspot.com/bug?extid=d222f4b7129379c3d5bc
> > Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
>
> No. make_bad_inode() must not be called once the inode is fully visible
> because that can cause all sorts of fun. That function is really only good
> for handling a situation when read of an inode from the disk failed or
> similar early error paths. It would be great if make_bad_inode() could do
> something like:
>
> VFS_BUG_ON_INODE(!(inode_state_read_once(inode) & I_NEW));
>
> but sadly that is not currently possible because inodes start with i_state
> set to 0 and some places do call make_bad_inode() before I_NEW is set in
> i_state. Mateusz wanted to clean that up a bit AFAIK.
>

[ most unfortunate timing, I just sent an e-mail with an assumption
that make_bad_inode() has to be callable after the inode+dentries got
published. :> ]

I'm delighted to see the call is considered bogus.

As for being able to assert on it, I noted the current flag handling
for lifecycle tracking is unhelpful.

Per your response, i_state == 0 is overloaded to mean the inode is
fully sorted out *and* that it is brand new.

Instead clear-cut indicators are needed to track where the inode is in
its lifecycle.

I proposed 2 ways: a dedicated enum or fucking around with flags.

Indeed the easiest stepping stone for the time being would be to push
up I_NEW to alloc_inode and assert on it in places which set the flag.
I'm going to cook it up.
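
[Editorial sketch of the "dedicated enum" option, in userspace C. The stage names and helpers (lc_stage, lc_inode, lc_make_bad) are hypothetical and do not correspond to existing kernel flags. It only illustrates how an explicit lifecycle field separates "brand new" from "fully set up", which a zero i_state cannot do.]

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lifecycle stages; each inode is in exactly one of them. */
enum lc_stage {
    LC_ALLOCATED,   /* freshly allocated, not visible to anyone else */
    LC_SETTING_UP,  /* reachable, but lookups must still wait for it */
    LC_LIVE,        /* fully set up and published */
    LC_BAD,         /* setup failed; every operation should error out */
};

struct lc_inode {
    enum lc_stage stage;
};

/*
 * Marking an inode bad is only allowed before it goes live. The function
 * reports a refusal instead of silently rewriting a published inode.
 */
static bool lc_make_bad(struct lc_inode *inode)
{
    if (inode->stage != LC_ALLOCATED && inode->stage != LC_SETTING_UP)
        return false;
    inode->stage = LC_BAD;
    return true;
}
```

[In a kernel setting the refusal branch would presumably be an assertion rather than a return value; that choice is an assumption of this sketch.]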

> Until the cleanup is done, perhaps we could add:
>
> VFS_BUG_ON_INODE(inode->i_dentry->first);
>
> to make_bad_inode() and watch the fireworks from syzbot. But at least the
> bugs would be attributed to the place where they are happening.
>

Note the assert which is currently tripping over is very much
necessary for correctness as the new routine skips checking the type
on its own.

Thus the issue needs to get solved for 6.19.

Trying to weed out all of the make_bad_inode callers is probably too
much for the release cycle.

So I stand by patching this up to a state where the lookup routine can
reliably check that this is what happened, should it find a non-dir
inode, and doing a proper fix for the next merge window.

Mateusz Guzik

Dec 10, 2025, 6:00:36 AM
to syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
the spin lock is needed because there are *two* fields being checked.
I am not adding explicit memory barriers for something like this.

#syz test

diff --git a/fs/bad_inode.c b/fs/bad_inode.c
index 0ef9bcb744dd..8e9127d4dcc1 100644
--- a/fs/bad_inode.c
+++ b/fs/bad_inode.c
@@ -207,11 +207,17 @@ void make_bad_inode(struct inode *inode)
 {
 	remove_inode_hash(inode);

+	/*
+	 * Taking the spinlock is a temporary hack to let lookup assert on the state,
+	 * see lookup_inode_permission_may_exec().
+	 */
+	spin_lock(&inode->i_lock);
 	inode->i_mode = S_IFREG;
 	simple_inode_init_ts(inode);
 	inode->i_op = &bad_inode_ops;
 	inode->i_opflags &= ~IOP_XATTR;
 	inode->i_fop = &bad_file_ops;
+	spin_unlock(&inode->i_lock);
 }
 EXPORT_SYMBOL(make_bad_inode);

diff --git a/fs/namei.c b/fs/namei.c
index bf0f66f0e9b9..f2a0f858b7d6 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -626,9 +626,26 @@ EXPORT_SYMBOL(inode_permission);
 static __always_inline int lookup_inode_permission_may_exec(struct mnt_idmap *idmap,
 	struct inode *inode, int mask)
 {
-	/* Lookup already checked this to return -ENOTDIR */
-	VFS_BUG_ON_INODE(!S_ISDIR(inode->i_mode), inode);
 	VFS_BUG_ON((mask & ~MAY_NOT_BLOCK) != 0);
+#ifdef CONFIG_DEBUG_VFS
+	/*
+	 * We skip the type check on the assumption this is a directory, which was
+	 * checked for by our caller.
+	 *
+	 * However, there are bogus consumers of make_bad_inode() which can mess this up,
+	 * to be fixed soon(tm).
+	 *
+	 * In the meantime make sure we are dealing with the expected state before tripping
+	 * over. If this *is* a "bad inode", the resulting state is bug-compatible with
+	 * historical behavior. See the previous remark about sorting this out.
+	 */
+	if (!S_ISDIR(inode->i_mode)) {
+		spin_lock(&inode->i_lock);
+		if (!is_bad_inode(inode))
+			VFS_BUG_ON_INODE(!S_ISDIR(inode->i_mode), inode);
+		spin_unlock(&inode->i_lock);
+	}
+#endif

 	mask |= MAY_EXEC;

syzbot

Dec 10, 2025, 6:19:05 AM
to bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, mjg...@gmail.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
kernel BUG in ocfs2_journal_toggle_dirty

(syz.0.554,7359,0):ocfs2_assign_bh:2417 ERROR: status = -30
(syz.0.554,7359,0):ocfs2_inode_lock_full_nested:2512 ERROR: status = -30
(syz.0.554,7359,0):ocfs2_shutdown_local_alloc:412 ERROR: status = -30
------------[ cut here ]------------
kernel BUG at fs/ocfs2/journal.c:1027!
Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI
CPU: 0 UID: 0 PID: 7359 Comm: syz.0.554 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
RIP: 0010:ocfs2_journal_toggle_dirty+0x33f/0x350 fs/ocfs2/journal.c:1027
Code: ff ff e8 44 bf b0 07 89 d9 80 e1 07 80 c1 03 38 c1 0f 8c 4e fe ff ff 48 89 df e8 fc 5a 7d fe e9 41 fe ff ff e8 f2 7e 15 fe 90 <0f> 0b 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90
RSP: 0018:ffffc9000d436fc0 EFLAGS: 00010293
RAX: ffffffff83ac415e RBX: 00000000ffffffff RCX: ffff88803ae90000
RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 0000000000000000
RBP: ffffc9000d437070 R08: ffffffff8fa21977 R09: 1ffffffff1f4432e
R10: dffffc0000000000 R11: fffffbfff1f4432f R12: 1ffff110024dca22
R13: ffff88804353b600 R14: ffff888011a24000 R15: ffff8880126e5110
FS: 00007ff6959f66c0(0000) GS:ffff88808d22f000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f3a33db2000 CR3: 000000005209e000 CR4: 0000000000352ef0
Call Trace:
<TASK>
ocfs2_journal_shutdown+0x524/0xab0 fs/ocfs2/journal.c:1109
ocfs2_mount_volume fs/ocfs2/super.c:1785 [inline]
ocfs2_fill_super+0x5574/0x63a0 fs/ocfs2/super.c:1083
get_tree_bdev_flags+0x40e/0x4d0 fs/super.c:1691
vfs_get_tree+0x92/0x2a0 fs/super.c:1751
fc_mount fs/namespace.c:1199 [inline]
do_new_mount_fc fs/namespace.c:3636 [inline]
do_new_mount+0x302/0xa10 fs/namespace.c:3712
do_mount fs/namespace.c:4035 [inline]
__do_sys_mount fs/namespace.c:4224 [inline]
__se_sys_mount+0x313/0x410 fs/namespace.c:4201
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7ff696390f6a
Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb a6 e8 de 1a 00 00 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ff6959f5e68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007ff6959f5ef0 RCX: 00007ff696390f6a
RDX: 0000200000004440 RSI: 0000200000000040 RDI: 00007ff6959f5eb0
RBP: 0000200000004440 R08: 00007ff6959f5ef0 R09: 00000000000008c0
R10: 00000000000008c0 R11: 0000000000000246 R12: 0000200000000040
R13: 00007ff6959f5eb0 R14: 0000000000004421 R15: 0000200000000080
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:ocfs2_journal_toggle_dirty+0x33f/0x350 fs/ocfs2/journal.c:1027
Code: ff ff e8 44 bf b0 07 89 d9 80 e1 07 80 c1 03 38 c1 0f 8c 4e fe ff ff 48 89 df e8 fc 5a 7d fe e9 41 fe ff ff e8 f2 7e 15 fe 90 <0f> 0b 66 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 90 90 90 90 90
RSP: 0018:ffffc9000d436fc0 EFLAGS: 00010293
RAX: ffffffff83ac415e RBX: 00000000ffffffff RCX: ffff88803ae90000
RDX: 0000000000000000 RSI: 00000000ffffffff RDI: 0000000000000000
RBP: ffffc9000d437070 R08: ffffffff8fa21977 R09: 1ffffffff1f4432e
R10: dffffc0000000000 R11: fffffbfff1f4432f R12: 1ffff110024dca22
R13: ffff88804353b600 R14: ffff888011a24000 R15: ffff8880126e5110
FS: 00007ff6959f66c0(0000) GS:ffff88808d22f000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa8e1fff000 CR3: 000000005209e000 CR4: 0000000000352ef0


Tested on:

commit: 0048fbb4 Merge tag 'locking-futex-2025-12-10' of git:/..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1477da1a580000
kernel config: https://syzkaller.appspot.com/x/.config?x=de48dccdf203ea90
dashboard link: https://syzkaller.appspot.com/bug?extid=d222f4b7129379c3d5bc
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
patch: https://syzkaller.appspot.com/x/patch.diff?x=11e2ea1a580000

Mateusz Guzik

Dec 10, 2025, 6:28:23 AM
to syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
On Wed, Dec 10, 2025 at 12:25 PM syzbot
<syzbot+d222f4...@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> kernel BUG in ocfs2_journal_toggle_dirty
>
> (syz.0.554,7359,0):ocfs2_assign_bh:2417 ERROR: status = -30
> (syz.0.554,7359,0):ocfs2_inode_lock_full_nested:2512 ERROR: status = -30
> (syz.0.554,7359,0):ocfs2_shutdown_local_alloc:412 ERROR: status = -30
> ------------[ cut here ]------------
> kernel BUG at fs/ocfs2/journal.c:1027!
> <TASK>
> ocfs2_journal_shutdown+0x524/0xab0 fs/ocfs2/journal.c:1109
> ocfs2_mount_volume fs/ocfs2/super.c:1785 [inline]
> ocfs2_fill_super+0x5574/0x63a0 fs/ocfs2/super.c:1083
> get_tree_bdev_flags+0x40e/0x4d0 fs/super.c:1691
> vfs_get_tree+0x92/0x2a0 fs/super.c:1751
> fc_mount fs/namespace.c:1199 [inline]
> do_new_mount_fc fs/namespace.c:3636 [inline]
> do_new_mount+0x302/0xa10 fs/namespace.c:3712
> do_mount fs/namespace.c:4035 [inline]
> __do_sys_mount fs/namespace.c:4224 [inline]
> __se_sys_mount+0x313/0x410 fs/namespace.c:4201
> do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
> do_syscall_64+0xfa/0xf80 arch/x86/entry/syscall_64.c:94
> entry_SYSCALL_64_after_hwframe+0x77/0x7f

That's a different bug.

Al Viro

Dec 10, 2025, 10:35:18 AM
to Mateusz Guzik, Tetsuo Handa, syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, Chuck Lever
On Wed, Dec 10, 2025 at 11:09:24AM +0100, Mateusz Guzik wrote:
> On Wed, Dec 10, 2025 at 10:45 AM Tetsuo Handa
> <penguin...@i-love.sakura.ne.jp> wrote:
> >
> > syzbot is hitting VFS_BUG_ON_INODE(!S_ISDIR(inode->i_mode)) check
> > introduced by commit e631df89cd5d ("fs: speed up path lookup with cheaper
> > handling of MAY_EXEC"), for make_bad_inode() is blindly changing file type
> > to S_IFREG. Since make_bad_inode() might be called after an inode is fully
> > constructed, make_bad_inode() should not needlessly change file type.
> >
>
> ouch
>
> So let's say calls to make_bad_inode *after* d_instantiate are unavoidable.

... and each one is a bug.

Mateusz Guzik

Dec 10, 2025, 2:44:28 PM
to syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
Justin Case suggested the following:

#syz test

diff --git a/fs/bad_inode.c b/fs/bad_inode.c
index 0ef9bcb744dd..8e9127d4dcc1 100644
--- a/fs/bad_inode.c
+++ b/fs/bad_inode.c
@@ -207,11 +207,17 @@ void make_bad_inode(struct inode *inode)
 {
 	remove_inode_hash(inode);

+	/*
+	 * Taking the spinlock is a temporary hack to let lookup assert on the state,
+	 * see lookup_inode_permission_may_exec().
+	 */
+	spin_lock(&inode->i_lock);
 	inode->i_mode = S_IFREG;
 	simple_inode_init_ts(inode);
 	inode->i_op = &bad_inode_ops;
 	inode->i_opflags &= ~IOP_XATTR;
 	inode->i_fop = &bad_file_ops;
+	spin_unlock(&inode->i_lock);
 }
 EXPORT_SYMBOL(make_bad_inode);

diff --git a/fs/namei.c b/fs/namei.c
index bf0f66f0e9b9..79cc14d635b5 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -626,12 +626,31 @@ EXPORT_SYMBOL(inode_permission);
+ return inode_permission(idmap, inode, mask);
+#if 0
if (unlikely(!(inode->i_opflags & (IOP_FASTPERM | IOP_FASTPERM_MAY_EXEC))))
return inode_permission(idmap, inode, mask);

@@ -639,6 +658,7 @@ static __always_inline int lookup_inode_permission_may_exec(struct mnt_idmap *id
return inode_permission(idmap, inode, mask);

return security_inode_permission(inode, mask);
+#endif
}

/**

syzbot

Dec 10, 2025, 3:06:06 PM
to bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, mjg...@gmail.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+d222f4...@syzkaller.appspotmail.com
Tested-by: syzbot+d222f4...@syzkaller.appspotmail.com

Tested on:

commit: 0048fbb4 Merge tag 'locking-futex-2025-12-10' of git:/..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15607992580000
kernel config: https://syzkaller.appspot.com/x/.config?x=de48dccdf203ea90
dashboard link: https://syzkaller.appspot.com/bug?extid=d222f4b7129379c3d5bc
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
patch: https://syzkaller.appspot.com/x/patch.diff?x=14261a1a580000

Note: testing is done by a robot and is best-effort only.

Al Viro

Dec 10, 2025, 3:43:24 PM
to Mateusz Guzik, Tetsuo Handa, syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, Chuck Lever
FWIW, I'm very tempted to fold make_bad_inode() into iget_failed(). Other
callers tend to be either pointless (e.g. ext2_new_inode() after reaching
fail: label - we only get there if inode has never reached inode hash
table; make_bad_inode() in there should've been gone for a long time)
or outright broken.

There's not a lot of callers, thankfully; I'm going through those at the
moment, but so far the impression is that we should be able to simply bury
the damn thing.

Al Viro

Dec 10, 2025, 3:54:59 PM
to Mateusz Guzik, Tetsuo Handa, syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, Chuck Lever
While we are at it, 73861970938a "minixfs: Verify inode mode when loading from
disk" that introduced one of those is seriously misguided - sanity check belongs
in V1_minix_iget/V2_minix_iget, and should be handled there the same way we
deal with zero i_nlink.

We really ought to take that function out - as it is, it's an attractive
nuisance...

Al Viro

Dec 10, 2025, 4:13:47 PM
to Mateusz Guzik, Jan Kara, Tetsuo Handa, syzbot, bra...@kernel.org, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, Chuck Lever
On Wed, Dec 10, 2025 at 11:24:40AM +0100, Mateusz Guzik wrote:

> I'm delighted to see the call is considered bogus.
>
> As for being able to assert on it, I noted the current flag handling
> for lifecycle tracking is unhelpful.
>
> Per your response, i_state == 0 is overloaded to mean the inode is
> fully sorted out *and* that it is brand new.
>
> Instead clear-cut indicators are needed to track where the inode is in
> its lifecycle.
>
> I proposed 2 ways: a dedicated enum or fucking around with flags.
>
> Indeed the easiest stepping stone for the time being would be to push
> up I_NEW to alloc_inode and assert on it in places which set the flag.
> I'm going to cook it up.

You are misinterpreting what I_NEW is about - it is badly named, no
arguments here, but it's _not_ "inode is new".

It's "it's in inode hash, but if you find it on lookup, you'll need to wait -
it's not entirely set up".

Plenty of inodes never enter that state at all. Hell, consider pipes.
Or sockets. Or anything on procfs. Or sysfs, or...

We never look those up by inumber and there'd be no sane way to do that
anyway. They never get hashed, nor should they.

Al Viro

Dec 10, 2025, 4:32:45 PM
to Mateusz Guzik, Tetsuo Handa, syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, Chuck Lever
In this case I strongly suspect that it had been introduced in

commit 58b6fcd2ab34399258dc509f701d0986a8e0bcaa
Author: Ahmet Eray Karadag <eray...@gmail.com>
Date: Tue Nov 18 03:18:34 2025 +0300

ocfs2: mark inode bad upon validation failure during read

Folks, make_bad_inode() is *NOT* magic and having it anywhere in "hardening"
patch is a major red flag. Please, don't do it, and I would recommend
reverting that commit, possibly along with the rest of the series.

Al Viro

Dec 10, 2025, 4:47:12 PM
to Mateusz Guzik, syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com
#syz test

commit 9c7d3d572d0a67484e9cbe178184cfd9a89aa430
Author: Al Viro <vi...@zeniv.linux.org.uk>
Date: Wed Dec 10 16:44:53 2025 -0500

Revert "ocfs2: mark inode bad upon validation failure during read"

This reverts commit 58b6fcd2ab34399258dc509f701d0986a8e0bcaa.

You can't use make_bad_inode() on live inodes.

diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index 8340525e5589..53d649436017 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -1708,8 +1708,6 @@ int ocfs2_read_inode_block_full(struct inode *inode, struct buffer_head **bh,
 	rc = ocfs2_read_blocks(INODE_CACHE(inode), OCFS2_I(inode)->ip_blkno,
 			       1, &tmp, flags, ocfs2_validate_inode_block);

-	if (rc < 0)
-		make_bad_inode(inode);
 	/* If ocfs2_read_blocks() got us a new bh, pass it up. */
 	if (!rc && !*bh)
 		*bh = tmp;

syzbot

Dec 10, 2025, 5:09:05 PM
to bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, mjg...@gmail.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, vi...@zeniv.linux.org.uk
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+d222f4...@syzkaller.appspotmail.com
Tested-by: syzbot+d222f4...@syzkaller.appspotmail.com

Tested on:

commit: 0048fbb4 Merge tag 'locking-futex-2025-12-10' of git:/..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=107c7992580000
kernel config: https://syzkaller.appspot.com/x/.config?x=de48dccdf203ea90
dashboard link: https://syzkaller.appspot.com/bug?extid=d222f4b7129379c3d5bc
compiler: Debian clang version 20.1.8 (++20250708063551+0c9f909b7976-1~exp1~20250708183702.136), Debian LLD 20.1.8
patch: https://syzkaller.appspot.com/x/patch.diff?x=10c51a1a580000

Mateusz Guzik

Dec 10, 2025, 6:27:21 PM
to Al Viro, Jan Kara, Tetsuo Handa, syzbot, bra...@kernel.org, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, Chuck Lever
On Wed, Dec 10, 2025 at 10:13 PM Al Viro <vi...@zeniv.linux.org.uk> wrote:
>
> On Wed, Dec 10, 2025 at 11:24:40AM +0100, Mateusz Guzik wrote:
>
> > I'm delighted to see the call is considered bogus.
> >
> > As for being able to assert on it, I noted the current flag handling
> > for lifecycle tracking is unhelpful.
> >
> > Per your response, i_state == 0 is overloaded to mean the inode is
> > fully sorted out *and* that it is brand new.
> >
> > Instead clear-cut indicators are needed to track where the inode is in
> > its lifecycle.
> >
> > I proposed 2 ways: a dedicated enum or fucking around with flags.
> >
> > Indeed the easiest stepping stone for the time being would be to push
> > up I_NEW to alloc_inode and assert on it in places which set the flag.
> > I'm going to cook it up.
>
> You are misinterpreting what I_NEW is about - it is badly named, no
> arguments here, but it's _not_ "inode is new".
>
> It's "it's in inode hash, but if you find it on lookup, you'll need to wait -
> it's not entirely set up".
>

Comments in the hash code make it pretty clear. The above is a part of
a bigger picture, which I already talked about in the 'light refcount'
patchset or whatever the name was.

The general problem statement is that the VFS layer suffers from a
chronic lack of assertions, which in turn helps people add latent bugs
(case in point: make_bad_inode() armed with asserts on state would
have blown up immediately and this entire thread would have been
avoided).

One of the things missing to create good coverage is reliable inode
lifecycle tracking. Currently *some of it* can be determined from some
of the flags in ->i_state, but even there important states are
straight up missing. Notably, ->i_state cannot distinguish an inode
which the filesystem claims is ready to use from one which is not
ready at all (not even in the hash). Trying to figure it out by other
means is avoidable tech debt. At a bare minimum the denoted states
have to distinguish between just allocated, creation aborted, ready to
use, in teardown and finally torn down.
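The five states listed above could be sketched roughly as follows. This is a standalone userspace model, not kernel code: the enum, the struct and every function name here are hypothetical, and the set of legal transitions is an assumption made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lifecycle states; the names are illustrative, not kernel API. */
enum inode_lifecycle {
	INODE_ALLOCATED,	/* just allocated, not yet set up */
	INODE_CREATE_ABORTED,	/* setup failed before the inode became usable */
	INODE_READY,		/* filesystem claims the inode is ready to use */
	INODE_TEARDOWN,		/* being torn down */
	INODE_TORN_DOWN,	/* teardown finished */
};

struct sketch_inode {
	enum inode_lifecycle state;
};

/* Returns true if from -> to is a legal step in this (assumed) model. */
static bool lifecycle_transition_ok(enum inode_lifecycle from,
				    enum inode_lifecycle to)
{
	switch (from) {
	case INODE_ALLOCATED:
		return to == INODE_READY || to == INODE_CREATE_ABORTED;
	case INODE_CREATE_ABORTED:
	case INODE_READY:
		return to == INODE_TEARDOWN;
	case INODE_TEARDOWN:
		return to == INODE_TORN_DOWN;
	case INODE_TORN_DOWN:
		return false;
	}
	return false;
}

/* State change guarded by an assertion on the transition. */
static void lifecycle_set(struct sketch_inode *inode, enum inode_lifecycle to)
{
	assert(lifecycle_transition_ok(inode->state, to));
	inode->state = to;
}

/* In this model, marking an inode bad is only legal before it is ready. */
static bool may_mark_bad(const struct sketch_inode *inode)
{
	return inode->state == INODE_ALLOCATED;
}
```

With something along these lines, a make_bad_inode()-style helper could assert may_mark_bad() and trip right at the offending call site instead of much later in path lookup.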

An important part is validating whether the inode at hand adheres to the
API contract when the filesystem claims it is ready. For example
->i_mode has to have a valid type set, but syzkaller convinced ntfs to
let an inode with an invalid mode get out, which later resulted in a warn
in execve code because may_open() did not apply any of the checks to
it. This is the kind of problem which can and should be checked for
before the inode is allowed to be used.
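A check of that kind might look like the sketch below. It is plain userspace C on top of <sys/stat.h>; the helper name mode_has_valid_type() is made up for this example, and where such a check would be hooked into the kernel is left open.

```c
#define _XOPEN_SOURCE 700
#include <stdbool.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: returns true if the type bits of 'mode' name one of
 * the known file types. A mode with no type bits, or with an unknown
 * combination, is rejected.
 */
static bool mode_has_valid_type(mode_t mode)
{
	switch (mode & S_IFMT) {
	case S_IFREG:
	case S_IFDIR:
	case S_IFLNK:
	case S_IFCHR:
	case S_IFBLK:
	case S_IFIFO:
	case S_IFSOCK:
		return true;
	default:
		return false;
	}
}
```

Run once at the point where the filesystem declares the inode ready, such a check would catch a bogus mode before any consumer gets to see it.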

Another benefit is that some of the state can be pre-computed. For example this:
static inline int do_inode_permission(struct mnt_idmap *idmap,
				      struct inode *inode, int mask)
{
	if (unlikely(!(inode->i_opflags & IOP_FASTPERM))) {
		if (likely(inode->i_op->permission))
			return inode->i_op->permission(idmap, inode,
						       mask);

		/* This gets set once for the inode lifetime */
		spin_lock(&inode->i_lock);
		inode->i_opflags |= IOP_FASTPERM;
		spin_unlock(&inode->i_lock);
	}
	return generic_permission(idmap, inode, mask);
}

... will stop setting the flag as this aspect will be already sorted
out. There is another crapper of the sort in vfs_readlink and one can
suspect more will show up over time.
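To illustrate the pre-computation idea, here is a rough userspace sketch. All sk_-prefixed names are hypothetical stand-ins for the kernel structures, and sk_inode_mark_ready() is an assumed hook that runs once when the filesystem declares the inode ready; locking is left out entirely.

```c
#include <stddef.h>

#define SK_IOP_FASTPERM 0x1u	/* hypothetical stand-in for IOP_FASTPERM */

struct sk_inode;

struct sk_inode_ops {
	int (*permission)(struct sk_inode *inode, int mask);
};

struct sk_inode {
	unsigned int i_opflags;
	const struct sk_inode_ops *i_op;
};

/* Assumed hook: called once, when the inode is declared ready for use. */
static void sk_inode_mark_ready(struct sk_inode *inode)
{
	/* Decide up front whether the generic permission path applies. */
	if (!inode->i_op || !inode->i_op->permission)
		inode->i_opflags |= SK_IOP_FASTPERM;
}

/* Stand-in for generic_permission(); always grants access in this model. */
static int sk_generic_permission(struct sk_inode *inode, int mask)
{
	(void)inode;
	(void)mask;
	return 0;
}

/* Example ->permission op used to exercise the slow path. */
static int sk_deny_all(struct sk_inode *inode, int mask)
{
	(void)inode;
	(void)mask;
	return -1;
}

/*
 * The hot path no longer sets the flag lazily. It relies on
 * sk_inode_mark_ready() having run, so a clear flag implies a
 * non-NULL ->permission.
 */
static int sk_do_inode_permission(struct sk_inode *inode, int mask)
{
	if (!(inode->i_opflags & SK_IOP_FASTPERM))
		return inode->i_op->permission(inode, mask);
	return sk_generic_permission(inode, mask);
}
```

The point of the sketch is only that the lock/set/unlock sequence drops out of the permission check once there is a well-defined "inode is ready" step to hang the computation on.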

Back to lifecycle tracking, I_NEW could change semantics to mean "this
inode *is not* ready for use yet". unlock_new_inode() already serves
as a "this inode *is* ready for use" indicator, it just happens to not
be mandatory to call. Another routine could be added for filesystems
which don't use the hash to cover that gap.

Then for filesystems which *do* use the hash the entire thing is transparent.

> A plenty of inodes never enter that state at all. Hell, consider pipes.
> Or sockets. Or anything on procfs. Or sysfs, or...
>

So whether I change the meaning of I_NEW or add another flag, I will
still need to patch these suckers to do *something* on the inodes they
create.

That's not the whole story, but should be enough to convey what I'm gunning for.

This will also have a side effect of giving I_NEW a more fitting use.

Jan Kara

Dec 11, 2025, 4:00:51 AM
to Al Viro, Mateusz Guzik, syzbot, bra...@kernel.org, ja...@suse.cz, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, Ahmet Eray Karadag, Albin Babu Varghese, Heming Zhao
On Wed 10-12-25 21:47:30, Al Viro wrote:
> #syz test
>
> commit 9c7d3d572d0a67484e9cbe178184cfd9a89aa430
> Author: Al Viro <vi...@zeniv.linux.org.uk>
> Date: Wed Dec 10 16:44:53 2025 -0500
>
> Revert "ocfs2: mark inode bad upon validation failure during read"
>
> This reverts commit 58b6fcd2ab34399258dc509f701d0986a8e0bcaa.
>
> You can't use make_bad_inode() on live inodes.

At first I was confused because ocfs2_read_inode_block_full() gets called
when loading a new inode into memory and that's a place for which
make_bad_inode() is safe. But then I noticed ocfs2 does reread the inode
in many places through ocfs2_read_inode_block() and that could be marking
a fully alive inode as bad. So this commit is indeed buggy. Adding relevant
people to CC.
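The failure mode can be modelled in a few lines of userspace C. This is only a sketch: struct sketch_inode and both functions are hypothetical, and sketch_make_bad_inode() models nothing but the i_mode reset to S_IFREG reported for this bug (the "mode 100000" in the syzbot dump).

```c
#define _XOPEN_SOURCE 700
#include <stdbool.h>
#include <sys/stat.h>

/* Hypothetical, minimal stand-in for struct inode. */
struct sketch_inode {
	mode_t i_mode;
};

/* Models only the file type reset; the ops swap is not modelled. */
static void sketch_make_bad_inode(struct sketch_inode *inode)
{
	inode->i_mode = S_IFREG;
}

/* Models the S_ISDIR() condition asserted during path walk. */
static bool sketch_walk_check(const struct sketch_inode *dir)
{
	return S_ISDIR(dir->i_mode);
}
```

A directory inode that passes sketch_walk_check() stops passing it as soon as sketch_make_bad_inode() runs on it, which is the situation a reread failure on a live directory creates.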

Guys, maybe I'm misunderstanding the changelog of 58b6fcd2ab34 but the
justification:

The VFS open(O_DIRECT) operation appears to incorrectly clear the inode's
I_DIRTY flag without ensuring the dirty metadata (reflecting the earlier
buffered write, e.g., an updated i_size) is flushed to disk.

looks bogus. Combinations of direct and buffered IO work perfectly fine for
other filesystems (definitely not corrupting them). VFS definitely does not
clear dirty flags without writing back the inode.

The particular syzbot reproducers mentioned in 58b6fcd2ab34 are likely
confusing ocfs2 by calling LOOP_SET_STATUS(64) on the loopback device with
a mounted ocfs2 filesystem, which may effectively corrupt the filesystem
underneath. So I suspect the proper fix for your issues is actually
https://lore.kernel.org/all/20251114144204.240...@gmail.com/.

Perhaps we should ping Jens to pick it up.

Honza

>
> diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
> index 8340525e5589..53d649436017 100644
> --- a/fs/ocfs2/inode.c
> +++ b/fs/ocfs2/inode.c
> @@ -1708,8 +1708,6 @@ int ocfs2_read_inode_block_full(struct inode *inode, struct buffer_head **bh,
> rc = ocfs2_read_blocks(INODE_CACHE(inode), OCFS2_I(inode)->ip_blkno,
> 1, &tmp, flags, ocfs2_validate_inode_block);
>
> - if (rc < 0)
> - make_bad_inode(inode);
> /* If ocfs2_read_blocks() got us a new bh, pass it up. */
> if (!rc && !*bh)
> *bh = tmp;

Tetsuo Handa

Jan 6, 2026, 5:11:11 AM
to Jan Kara, Mateusz Guzik, Al Viro, syzbot, bra...@kernel.org, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, Chuck Lever
On 2025/12/10 19:09, Jan Kara wrote:
> On Wed 10-12-25 18:45:26, Tetsuo Handa wrote:
>> syzbot is hitting VFS_BUG_ON_INODE(!S_ISDIR(inode->i_mode)) check
>> introduced by commit e631df89cd5d ("fs: speed up path lookup with cheaper
>> handling of MAY_EXEC"), for make_bad_inode() is blindly changing file type
>> to S_IFREG. Since make_bad_inode() might be called after an inode is fully
>> constructed, make_bad_inode() should not needlessly change file type.
>>
>> Reported-by: syzbot+d222f4...@syzkaller.appspotmail.com
>> Closes: https://syzkaller.appspot.com/bug?extid=d222f4b7129379c3d5bc
>> Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
>
> No. make_bad_inode() must not be called once the inode is fully visible
> because that can cause all sorts of fun. That function is really only good
> for handling a situation when read of an inode from the disk failed or
> similar early error paths.
I'm surprised to hear that.

But since commit 58b6fcd2ab34 ("ocfs2: mark inode bad upon validation failure
during read") is a bug fix, we want to somehow prevent this bug from re-opening.

Minimal change for this release cycle might look like

----------
diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index b5fcc2725a29..2c97c8b4013f 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -1715,8 +1715,13 @@ int ocfs2_read_inode_block_full(struct inode *inode, struct buffer_head **bh,
 	rc = ocfs2_read_blocks(INODE_CACHE(inode), OCFS2_I(inode)->ip_blkno,
 			       1, &tmp, flags, ocfs2_validate_inode_block);
 
-	if (rc < 0)
+	if (rc < 0) {
+		/* Preserve file type while making operations no-op. */
+		umode_t mode = inode->i_mode & S_IFMT;
+
 		make_bad_inode(inode);
+		inode->i_mode = mode;
+	}
 	/* If ocfs2_read_blocks() got us a new bh, pass it up. */
 	if (!rc && !*bh)
 		*bh = tmp;
----------

but what approach do you prefer?

Introduce a copy of bad_{inode,file}_ops for ocfs2 and replace
a call to make_bad_inode() with updating only {inode,file}_ops ?

Or, modify existing {inode,file}_ops for ocfs2 to check whether
an I/O error has occurred in the past?

Jan Kara

Jan 7, 2026, 4:36:43 AM
to Tetsuo Handa, Jan Kara, Mateusz Guzik, Al Viro, syzbot, bra...@kernel.org, jl...@evilplan.org, jose...@linux.alibaba.com, linki...@kernel.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, ma...@fasheh.com, ocfs2...@lists.linux.dev, sj155...@samsung.com, syzkall...@googlegroups.com, Chuck Lever
On Tue 06-01-26 19:10:41, Tetsuo Handa wrote:
> On 2025/12/10 19:09, Jan Kara wrote:
> > On Wed 10-12-25 18:45:26, Tetsuo Handa wrote:
> >> syzbot is hitting VFS_BUG_ON_INODE(!S_ISDIR(inode->i_mode)) check
> >> introduced by commit e631df89cd5d ("fs: speed up path lookup with cheaper
> >> handling of MAY_EXEC"), for make_bad_inode() is blindly changing file type
> >> to S_IFREG. Since make_bad_inode() might be called after an inode is fully
> >> constructed, make_bad_inode() should not needlessly change file type.
> >>
> >> Reported-by: syzbot+d222f4...@syzkaller.appspotmail.com
> >> Closes: https://syzkaller.appspot.com/bug?extid=d222f4b7129379c3d5bc
> >> Signed-off-by: Tetsuo Handa <penguin...@I-love.SAKURA.ne.jp>
> >
> > No. make_bad_inode() must not be called once the inode is fully visible
> > because that can cause all sorts of fun. That function is really only good
> > for handling a situation when read of an inode from the disk failed or
> > similar early error paths.
> I'm surprised to hear that.
>
> But since commit 58b6fcd2ab34 ("ocfs2: mark inode bad upon validation
> failure during read") is a bug fix, we want to somehow prevent this bug
> from re-opening.

Since Jens picked up
https://lore.kernel.org/all/20251217190040.49...@gmail.com/
yesterday, I suspect the original reproducer for OCFS2 will not cause an issue
anymore even without 58b6fcd2ab34 because, as far as I could see, the
original problem was caused by the loop device getting messed up under a
mounted OCFS2 filesystem. It would be good to verify my analysis is correct
but I think just reverting 58b6fcd2ab34 might be the best option at this
point.

Honza