[syzbot] [kernfs?] WARNING: suspicious RCU usage in kernfs_node_dentry

4 views
Skip to first unread message

syzbot

unread,
Feb 18, 2025, 4:27:21 AM2/18/25
to big...@linutronix.de, gre...@linuxfoundation.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, t...@kernel.org
Hello,

syzbot found the following issue on:

HEAD commit: 253c82b3a2ce Add linux-next specific files for 20250217
git tree: linux-next
console+strace: https://syzkaller.appspot.com/x/log.txt?x=15141898580000
kernel config: https://syzkaller.appspot.com/x/.config?x=f984ff1d92cb4017
dashboard link: https://syzkaller.appspot.com/bug?extid=ecccecbc636b455f9084
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=175b7da4580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=17141898580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/af8b0f79abb4/disk-253c82b3.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/463f7d416e09/vmlinux-253c82b3.xz
kernel image: https://storage.googleapis.com/syzbot-assets/07101e97dd04/bzImage-253c82b3.xz

The issue was bisected to:

commit 5b2fabf7fe8f745ff214ff003e6067b64f172271
Author: Sebastian Andrzej Siewior <big...@linutronix.de>
Date: Thu Feb 13 14:50:20 2025 +0000

kernfs: Acquire kernfs_rwsem in kernfs_node_dentry().

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=15a9fda4580000
final oops: https://syzkaller.appspot.com/x/report.txt?x=17a9fda4580000
console output: https://syzkaller.appspot.com/x/log.txt?x=13a9fda4580000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+ecccec...@syzkaller.appspotmail.com
Fixes: 5b2fabf7fe8f ("kernfs: Acquire kernfs_rwsem in kernfs_node_dentry().")

=============================
WARNING: suspicious RCU usage
6.14.0-rc2-next-20250217-syzkaller #0 Not tainted
-----------------------------
fs/kernfs/mount.c:243 suspicious rcu_dereference_check() usage!

other info that might help us debug this:


rcu_scheduler_active = 2, debug_locks = 1
3 locks held by syz-executor215/5837:
#0: ffff88802ab0b870 (&fc->uapi_mutex){+.+.}-{4:4}, at: __do_sys_fsconfig fs/fsopen.c:465 [inline]
#0: ffff88802ab0b870 (&fc->uapi_mutex){+.+.}-{4:4}, at: __se_sys_fsconfig+0x9b2/0xf60 fs/fsopen.c:344
#1: ffff8880355560e0 (&type->s_umount_key#46){+.+.}-{4:4}, at: __super_lock fs/super.c:56 [inline]
#1: ffff8880355560e0 (&type->s_umount_key#46){+.+.}-{4:4}, at: super_lock+0x196/0x400 fs/super.c:120
#2: ffff88801bef7148 (&root->kernfs_rwsem){++++}-{4:4}, at: class_rwsem_read_constructor include/linux/rwsem.h:241 [inline]
#2: ffff88801bef7148 (&root->kernfs_rwsem){++++}-{4:4}, at: kernfs_node_dentry+0xc3/0x2d0 fs/kernfs/mount.c:223

stack backtrace:
CPU: 0 UID: 0 PID: 5837 Comm: syz-executor215 Not tainted 6.14.0-rc2-next-20250217-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
lockdep_rcu_suspicious+0x226/0x340 kernel/locking/lockdep.c:6847
kernfs_node_dentry+0x24b/0x2d0 fs/kernfs/mount.c:243
cgroup_do_get_tree+0x248/0x390 kernel/cgroup/cgroup.c:2233
cgroup_get_tree+0xbb/0x230 kernel/cgroup/cgroup.c:2272
vfs_get_tree+0x90/0x2b0 fs/super.c:1759
vfs_cmd_create+0xa0/0x1f0 fs/fsopen.c:225
__do_sys_fsconfig fs/fsopen.c:467 [inline]
__se_sys_fsconfig+0xa33/0xf60 fs/fsopen.c:344
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f09e41a2de9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 61 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe4f61eba8 EFLAGS: 00000246 ORIG_RAX: 00000000000001af
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f09e41a2de9
RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f09e41ec036
R13: 00007ffe4f61ebe0 R14: 00007ffe4f61ec20 R15: 0000000000000000
</TASK>

============================================
WARNING: possible recursive locking detected
6.14.0-rc2-next-20250217-syzkaller #0 Not tainted
--------------------------------------------
syz-executor215/5837 is trying to acquire lock:
ffff88801bef7148 (&root->kernfs_rwsem){++++}-{4:4}, at: kernfs_dop_revalidate+0xa2/0x5d0 fs/kernfs/dir.c:1178

but task is already holding lock:
ffff88801bef7148 (&root->kernfs_rwsem){++++}-{4:4}, at: class_rwsem_read_constructor include/linux/rwsem.h:241 [inline]
ffff88801bef7148 (&root->kernfs_rwsem){++++}-{4:4}, at: kernfs_node_dentry+0xc3/0x2d0 fs/kernfs/mount.c:223

other info that might help us debug this:
Possible unsafe locking scenario:

CPU0
----
lock(&root->kernfs_rwsem);
lock(&root->kernfs_rwsem);

*** DEADLOCK ***

May be due to missing lock nesting notation

3 locks held by syz-executor215/5837:
#0: ffff88802ab0b870 (&fc->uapi_mutex){+.+.}-{4:4}, at: __do_sys_fsconfig fs/fsopen.c:465 [inline]
#0: ffff88802ab0b870 (&fc->uapi_mutex){+.+.}-{4:4}, at: __se_sys_fsconfig+0x9b2/0xf60 fs/fsopen.c:344
#1: ffff8880355560e0 (&type->s_umount_key#46){+.+.}-{4:4}, at: __super_lock fs/super.c:56 [inline]
#1: ffff8880355560e0 (&type->s_umount_key#46){+.+.}-{4:4}, at: super_lock+0x196/0x400 fs/super.c:120
#2: ffff88801bef7148 (&root->kernfs_rwsem){++++}-{4:4}, at: class_rwsem_read_constructor include/linux/rwsem.h:241 [inline]
#2: ffff88801bef7148 (&root->kernfs_rwsem){++++}-{4:4}, at: kernfs_node_dentry+0xc3/0x2d0 fs/kernfs/mount.c:223

stack backtrace:
CPU: 1 UID: 0 PID: 5837 Comm: syz-executor215 Not tainted 6.14.0-rc2-next-20250217-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 12/27/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
print_deadlock_bug+0x483/0x620 kernel/locking/lockdep.c:3039
check_deadlock kernel/locking/lockdep.c:3091 [inline]
validate_chain+0x15e2/0x5920 kernel/locking/lockdep.c:3893
__lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5228
lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5851
down_read+0xb1/0xa40 kernel/locking/rwsem.c:1524
kernfs_dop_revalidate+0xa2/0x5d0 fs/kernfs/dir.c:1178
d_revalidate fs/namei.c:923 [inline]
lookup_dcache fs/namei.c:1651 [inline]
lookup_one_unlocked+0x23b/0x2d0 fs/namei.c:2972
lookup_one_positive_unlocked fs/namei.c:3003 [inline]
lookup_positive_unlocked+0x2b/0xb0 fs/namei.c:3043
kernfs_node_dentry+0x139/0x2d0 fs/kernfs/mount.c:244
cgroup_do_get_tree+0x248/0x390 kernel/cgroup/cgroup.c:2233
cgroup_get_tree+0xbb/0x230 kernel/cgroup/cgroup.c:2272
vfs_get_tree+0x90/0x2b0 fs/super.c:1759
vfs_cmd_create+0xa0/0x1f0 fs/fsopen.c:225
__do_sys_fsconfig fs/fsopen.c:467 [inline]
__se_sys_fsconfig+0xa33/0xf60 fs/fsopen.c:344
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f09e41a2de9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 61 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe4f61eba8 EFLAGS: 00000246 ORIG_RAX: 00000000000001af
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f09e41a2de9
RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000003
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f09e41ec036
R13: 00007ffe4f61ebe0 R14: 00007ffe4f61ec20 R15: 0000000000000000
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Sebastian Andrzej Siewior

unread,
Feb 18, 2025, 11:39:44 AM2/18/25
to syzbot, gre...@linuxfoundation.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, t...@kernel.org, Thomas Gleixner
syzbot reported two warnings:
- kernfs_node::name was accessed outside of a RCU section so it created
warning. The kernfs_rwsem was held so it was okay but it wasn't seen.

- While kernfs_rwsem was held invoked lookup_positive_unlocked()->
kernfs_dop_revalidate() which acquired kernfs_rwsem.

kernfs_rwsem was both acquired as a read lock so it can be acquired
twice. However if a writer acquires the lock after the first reader then
neither the writer nor the second reader can obtain the lock so it
deadlocks.

The reason for the lock is to ensure that kernfs_node::name remain
stable during lookup_positive_unlocked()'s invocation. The function can
not be invoked within a RCU section because it may sleep.

Make a temporary copy of the kernfs_node::name under the lock so
GFP_KERNEL can be used and use this instead.

Reported-by: syzbot+ecccec...@syzkaller.appspotmail.com
Fixes: 5b2fabf7fe8f ("kernfs: Acquire kernfs_rwsem in kernfs_node_dentry().")
Signed-off-by: Sebastian Andrzej Siewior <big...@linutronix.de>
---
fs/kernfs/mount.c | 35 +++++++++++++++++++++++++----------
1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index d1f512b7bf867..f1cea282aae32 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -220,12 +220,19 @@ struct dentry *kernfs_node_dentry(struct kernfs_node *kn,
return dentry;

root = kernfs_root(kn);
- guard(rwsem_read)(&root->kernfs_rwsem);
-
- knparent = find_next_ancestor(kn, NULL);
- if (WARN_ON(!knparent)) {
- dput(dentry);
+ /*
+ * As long as kn is valid, its parent can not vanish. This is cgroup's
+ * kn so it not have its parent replaced. Therefore it is safe to use
+ * the ancestor node outside of the RCU or locked section.
+ */
+ if (WARN_ON_ONCE(!(root->flags & KERNFS_ROOT_INVARIANT_PARENT)))
return ERR_PTR(-EINVAL);
+ scoped_guard(rcu) {
+ knparent = find_next_ancestor(kn, NULL);
+ if (WARN_ON(!knparent)) {
+ dput(dentry);
+ return ERR_PTR(-EINVAL);
+ }
}

do {
@@ -235,14 +242,22 @@ struct dentry *kernfs_node_dentry(struct kernfs_node *kn,

if (kn == knparent)
return dentry;
- kntmp = find_next_ancestor(kn, knparent);
- if (WARN_ON(!kntmp)) {
- dput(dentry);
- return ERR_PTR(-EINVAL);
+
+ scoped_guard(rwsem_read, &root->kernfs_rwsem) {
+ kntmp = find_next_ancestor(kn, knparent);
+ if (WARN_ON(!kntmp)) {
+ dput(dentry);
+ return ERR_PTR(-EINVAL);
+ }
+ name = kstrdup(kernfs_rcu_name(kntmp), GFP_KERNEL);
+ }
+ if (!name) {
+ dput(dentry);
+ return ERR_PTR(-ENOMEM);
}
- name = rcu_dereference(kntmp->name);
dtmp = lookup_positive_unlocked(name, dentry, strlen(name));
dput(dentry);
+ kfree(name);
if (IS_ERR(dtmp))
return dtmp;
knparent = kntmp;
--
2.47.2

Tejun Heo

unread,
Feb 18, 2025, 2:21:22 PM2/18/25
to Sebastian Andrzej Siewior, syzbot, gre...@linuxfoundation.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, Thomas Gleixner
On Tue, Feb 18, 2025 at 05:39:38PM +0100, Sebastian Andrzej Siewior wrote:
> syzbot reported two warnings:
> - kernfs_node::name was accessed outside of a RCU section so it created
> warning. The kernfs_rwsem was held so it was okay but it wasn't seen.
>
> - While kernfs_rwsem was held invoked lookup_positive_unlocked()->
> kernfs_dop_revalidate() which acquired kernfs_rwsem.
>
> kernfs_rwsem was both acquired as a read lock so it can be acquired
> twice. However if a writer acquires the lock after the first reader then
> neither the writer nor the second reader can obtain the lock so it
> deadlocks.
>
> The reason for the lock is to ensure that kernfs_node::name remain
> stable during lookup_positive_unlocked()'s invocation. The function can
> not be invoked within a RCU section because it may sleep.
>
> Make a temporary copy of the kernfs_node::name under the lock so
> GFP_KERNEL can be used and use this instead.
>
> Reported-by: syzbot+ecccec...@syzkaller.appspotmail.com
> Fixes: 5b2fabf7fe8f ("kernfs: Acquire kernfs_rwsem in kernfs_node_dentry().")
> Signed-off-by: Sebastian Andrzej Siewior <big...@linutronix.de>

Acked-by: Tejun Heo <t...@kernel.org>

Thanks.

--
tejun

Al Viro

unread,
Feb 20, 2025, 3:39:28 PM2/20/25
to Sebastian Andrzej Siewior, syzbot, gre...@linuxfoundation.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, t...@kernel.org, Thomas Gleixner
On Tue, Feb 18, 2025 at 05:39:38PM +0100, Sebastian Andrzej Siewior wrote:

> + scoped_guard(rcu) {
> + knparent = find_next_ancestor(kn, NULL);
> + if (WARN_ON(!knparent)) {
> + dput(dentry);
> + return ERR_PTR(-EINVAL);
> + }

NAK. dput() is blocking; you *can't* do that under rcu_read_lock().

Sebastian Andrzej Siewior

unread,
Feb 21, 2025, 2:50:02 AM2/21/25
to Al Viro, syzbot, gre...@linuxfoundation.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, t...@kernel.org, Thomas Gleixner
Thank you.
Since it got merged, let me do another one on top.

Sebastian

Sebastian Andrzej Siewior

unread,
Feb 21, 2025, 3:42:37 AM2/21/25
to Al Viro, syzbot, gre...@linuxfoundation.org, linux-...@vger.kernel.org, syzkall...@googlegroups.com, t...@kernel.org, Thomas Gleixner
Al Viro pointed out that dput() might sleep and must not be invoked
within an RCU section.

Keep only find_next_ancestor() winthin the RCU section.
Correct the wording in the comment.

Fixes: 6ef5b6fae3040 ("kernfs: Drop kernfs_rwsem while invoking lookup_positive_unlocked().")
Reported-by: Al Viro <vi...@zeniv.linux.org.uk>
Signed-off-by: Sebastian Andrzej Siewior <big...@linutronix.de>
---
fs/kernfs/mount.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/fs/kernfs/mount.c b/fs/kernfs/mount.c
index f1cea282aae32..5124e196c2bfd 100644
--- a/fs/kernfs/mount.c
+++ b/fs/kernfs/mount.c
@@ -222,17 +222,17 @@ struct dentry *kernfs_node_dentry(struct kernfs_node *kn,
root = kernfs_root(kn);
/*
* As long as kn is valid, its parent can not vanish. This is cgroup's
- * kn so it not have its parent replaced. Therefore it is safe to use
+ * kn so it can't have its parent replaced. Therefore it is safe to use
* the ancestor node outside of the RCU or locked section.
*/
if (WARN_ON_ONCE(!(root->flags & KERNFS_ROOT_INVARIANT_PARENT)))
return ERR_PTR(-EINVAL);
scoped_guard(rcu) {
knparent = find_next_ancestor(kn, NULL);
- if (WARN_ON(!knparent)) {
- dput(dentry);
- return ERR_PTR(-EINVAL);
- }
+ }
+ if (WARN_ON(!knparent)) {
+ dput(dentry);
+ return ERR_PTR(-EINVAL);
}

do {
--
2.47.2

Reply all
Reply to author
Forward
0 new messages