[syzbot] [fs?] BUG: sleeping function called from invalid context in vfree (2)


syzbot

Aug 18, 2025, 4:05:35 AM
to andre...@igalia.com, da...@stgolabs.net, dvh...@infradead.org, linux-...@vger.kernel.org, linux-...@vger.kernel.org, mi...@redhat.com, pet...@infradead.org, syzkall...@googlegroups.com, tg...@linutronix.de
Hello,

syzbot found the following issue on:

HEAD commit: 8f5ae30d69d7 Linux 6.17-rc1
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
console output: https://syzkaller.appspot.com/x/log.txt?x=15232442580000
kernel config: https://syzkaller.appspot.com/x/.config?x=8c5ac3d8b8abfcb
dashboard link: https://syzkaller.appspot.com/bug?extid=f65a2014305525a9f816
compiler: Debian clang version 20.1.7 (++20250616065708+6146a88f6049-1~exp1~20250616065826.132), Debian LLD 20.1.7
userspace arch: arm64
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14cbaba2580000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1157faf0580000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/18a2e4bd0c4a/disk-8f5ae30d.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/3b5395881b25/vmlinux-8f5ae30d.xz
kernel image: https://storage.googleapis.com/syzbot-assets/e875f4e3b7ff/Image-8f5ae30d.gz.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+f65a20...@syzkaller.appspotmail.com

BUG: sleeping function called from invalid context at mm/vmalloc.c:3409
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 6664, name: syz-executor
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
no locks held by syz-executor/6664.
Preemption disabled at:
[<ffff80008b015088>] __schedule_loop kernel/sched/core.c:7042 [inline]
[<ffff80008b015088>] schedule+0xac/0x230 kernel/sched/core.c:7058
CPU: 0 UID: 0 PID: 6664 Comm: syz-executor Not tainted 6.17.0-rc1-syzkaller-g8f5ae30d69d7 #0 PREEMPT
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 06/30/2025
Call trace:
show_stack+0x2c/0x3c arch/arm64/kernel/stacktrace.c:499 (C)
__dump_stack+0x30/0x40 lib/dump_stack.c:94
dump_stack_lvl+0xd8/0x12c lib/dump_stack.c:120
dump_stack+0x1c/0x28 lib/dump_stack.c:129
__might_resched+0x348/0x4c4 kernel/sched/core.c:8957
__might_sleep+0x94/0x110 kernel/sched/core.c:8886
vfree+0xa0/0x3dc mm/vmalloc.c:3409
kvfree+0x24/0x40 mm/slub.c:5093
futex_hash_free+0x84/0x9c kernel/futex/core.c:1742
__mmdrop+0x2c0/0x4ec kernel/fork.c:692
mmdrop include/linux/sched/mm.h:55 [inline]
mmdrop_sched include/linux/sched/mm.h:83 [inline]
mmdrop_lazy_tlb_sched include/linux/sched/mm.h:110 [inline]
finish_task_switch+0x4a0/0x5a4 kernel/sched/core.c:5250
context_switch kernel/sched/core.c:5360 [inline]
__schedule+0x13b4/0x2864 kernel/sched/core.c:6961
__schedule_loop kernel/sched/core.c:7043 [inline]
schedule+0xb4/0x230 kernel/sched/core.c:7058
do_nanosleep+0x174/0x508 kernel/time/hrtimer.c:2100
hrtimer_nanosleep+0x154/0x2a4 kernel/time/hrtimer.c:2147
common_nsleep+0xa0/0xb8 kernel/time/posix-timers.c:1353
__do_sys_clock_nanosleep kernel/time/posix-timers.c:1399 [inline]
__se_sys_clock_nanosleep kernel/time/posix-timers.c:1376 [inline]
__arm64_sys_clock_nanosleep+0x328/0x364 kernel/time/posix-timers.c:1376
__invoke_syscall arch/arm64/kernel/syscall.c:35 [inline]
invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:49
el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:132
do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:151
el0_svc+0x58/0x180 arch/arm64/kernel/entry-common.c:879
el0t_64_sync_handler+0x84/0x12c arch/arm64/kernel/entry-common.c:898
el0t_64_sync+0x198/0x19c arch/arm64/kernel/entry.S:596
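
The trace above boils down to a might_sleep() check firing inside vfree() while preempt_count is 1, because the final mmdrop happens in finish_task_switch() with preemption disabled. A minimal userspace sketch of that logic follows. It is only a model: every model_* name is a hypothetical stand-in invented for illustration, and none of it is the kernel's actual implementation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's definitions. */
static int model_preempt_count;  /* models preempt_count() */
static int model_violations;     /* counts "sleeping function" reports */

static void model_preempt_disable(void)
{
	model_preempt_count++;
}

static void model_preempt_enable(void)
{
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/* Models the might_sleep() check: complain if preemption is disabled. */
static void model_might_sleep(const char *where)
{
	if (model_preempt_count != 0) {
		model_violations++;
		fprintf(stderr,
			"BUG: sleeping function called from invalid context at %s\n"
			"preempt_count: %d, expected: 0\n",
			where, model_preempt_count);
	}
}

/* Models vfree(): it may sleep, so it runs the check first. */
static void model_vfree(void)
{
	model_might_sleep("model_vfree");
}

/*
 * Models the tail of a context switch: the last mm reference is dropped
 * while preemption is still disabled, so the sleeping free is reached in
 * atomic context (__mmdrop -> futex_hash_free -> kvfree -> vfree in the trace).
 */
static void model_finish_task_switch(void)
{
	model_preempt_disable();
	model_vfree();
	model_preempt_enable();
}
```

Calling model_finish_task_switch() records one violation, while calling model_vfree() directly with the counter at zero records none, which mirrors why the same free is harmless from ordinary process context.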


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Hillf Danton

Aug 18, 2025, 6:08:15 AM
to syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
> Date: Mon, 18 Aug 2025 01:05:33 -0700 [thread overview]
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 8f5ae30d69d7 Linux 6.17-rc1
> git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
> console output: https://syzkaller.appspot.com/x/log.txt?x=15232442580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=8c5ac3d8b8abfcb
> dashboard link: https://syzkaller.appspot.com/bug?extid=f65a2014305525a9f816
#syz test upstream master

--- x/include/linux/mm_types.h
+++ y/include/linux/mm_types.h
@@ -1166,6 +1166,7 @@ struct mm_struct {
 #ifdef CONFIG_PREEMPT_RT
 		struct rcu_head delayed_drop;
 #endif
+		struct work_struct drop_work;
 #ifdef CONFIG_HUGETLB_PAGE
 		atomic_long_t hugetlb_usage;
 #endif
--- x/kernel/fork.c
+++ y/kernel/fork.c
@@ -666,6 +666,14 @@ static void cleanup_lazy_tlbs(struct mm_
 	on_each_cpu(do_check_lazy_tlb, (void *)mm, 1);
 }

+static void mmdrop_workfn(struct work_struct *work)
+{
+	struct mm_struct *mm;
+
+	mm = container_of(work, struct mm_struct, drop_work);
+	futex_hash_free(mm);
+	free_mm(mm);
+}
 /*
  * Called when the last reference to the mm
  * is dropped: either by a lazy thread or by
@@ -689,9 +697,8 @@ void __mmdrop(struct mm_struct *mm)
 	mm_pasid_drop(mm);
 	mm_destroy_cid(mm);
 	percpu_counter_destroy_many(mm->rss_stat, NR_MM_COUNTERS);
-	futex_hash_free(mm);
-
-	free_mm(mm);
+	INIT_WORK(&mm->drop_work, mmdrop_workfn);
+	schedule_work(&mm->drop_work);
 }
 EXPORT_SYMBOL_GPL(__mmdrop);

--
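
The patch above moves the two frees out of __mmdrop() into a work item embedded in the object, which the worker recovers with container_of(). A minimal userspace sketch of that deferral pattern follows. It assumes a single-threaded model with a plain linked list in place of the kernel workqueue, and all model_* names are hypothetical, invented for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of(); simplified. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct model_work {
	void (*fn)(struct model_work *work);
	struct model_work *next;
};

/* Hypothetical object with an embedded work item, like drop_work in the patch. */
struct model_mm {
	int *hash_table;              /* stands in for the futex private hash */
	struct model_work drop_work;
};

static struct model_work *model_pending;  /* list of queued work items */
static int model_freed;                   /* objects actually freed so far */

/* Queueing only links a node, so it is fine in "atomic" context. */
static void model_schedule_work(struct model_work *work)
{
	work->next = model_pending;
	model_pending = work;
}

/* Runs later, where sleeping is allowed; mirrors mmdrop_workfn(). */
static void model_drop_workfn(struct model_work *work)
{
	struct model_mm *mm = model_container_of(work, struct model_mm, drop_work);

	free(mm->hash_table);  /* the potentially sleeping free in the real code */
	free(mm);
	model_freed++;
}

/* Last reference dropped: defer the frees instead of doing them here. */
static void model_mmdrop(struct model_mm *mm)
{
	mm->drop_work.fn = model_drop_workfn;
	model_schedule_work(&mm->drop_work);
}

/* Stand-in for a worker thread draining the queue. */
static void model_run_pending(void)
{
	while (model_pending) {
		struct model_work *work = model_pending;

		assert(work->fn != NULL);
		model_pending = work->next;  /* unlink before fn() frees the object */
		work->fn(work);
	}
}
```

In this model nothing is freed at model_mmdrop() time; the object stays alive until model_run_pending() runs, which is the cost of the deferral: memory is released later and every drop pays for queueing a work item.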

syzbot

Aug 18, 2025, 7:42:05 AM
to hda...@sina.com, linux-...@vger.kernel.org, syzkall...@googlegroups.com
Hello,

syzbot has tested the proposed patch and the reproducer did not trigger any issue:

Reported-by: syzbot+f65a20...@syzkaller.appspotmail.com
Tested-by: syzbot+f65a20...@syzkaller.appspotmail.com

Tested on:

commit: c17b750b Linux 6.17-rc2
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16a85234580000
kernel config: https://syzkaller.appspot.com/x/.config?x=cc86c8eaeb53db06
dashboard link: https://syzkaller.appspot.com/bug?extid=f65a2014305525a9f816
compiler: Debian clang version 20.1.7 (++20250616065708+6146a88f6049-1~exp1~20250616065826.132), Debian LLD 20.1.7
userspace arch: arm64
patch: https://syzkaller.appspot.com/x/patch.diff?x=15bba442580000

Note: testing is done by a robot and is best-effort only.

Breno Leitao

Aug 18, 2025, 8:41:05 AM
to Hillf Danton, syzbot, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On Mon, Aug 18, 2025 at 06:07:57PM +0800, Hillf Danton wrote:
> > Date: Mon, 18 Aug 2025 01:05:33 -0700 [thread overview]
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 8f5ae30d69d7 Linux 6.17-rc1
> > git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
> > console output: https://syzkaller.appspot.com/x/log.txt?x=15232442580000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=8c5ac3d8b8abfcb
> > dashboard link: https://syzkaller.appspot.com/bug?extid=f65a2014305525a9f816
> > userspace arch: arm64
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14cbaba2580000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1157faf0580000
>
> #syz test upstream master

I was hitting this issue and I've tested it and the BUG is not there any
more.

Do you know which commit caused this "regression" ?

> --- x/include/linux/mm_types.h
> +++ y/include/linux/mm_types.h
>
> @@ -689,9 +697,8 @@ void __mmdrop(struct mm_struct *mm)
> mm_pasid_drop(mm);
> mm_destroy_cid(mm);
> percpu_counter_destroy_many(mm->rss_stat, NR_MM_COUNTERS);
> - futex_hash_free(mm);
> -
> - free_mm(mm);
> + INIT_WORK(&mm->drop_work, mmdrop_workfn);

Should INIT_WORK() be called once at setup time rather than on every
__mmdrop()?

Also, is the scheduling overhead a concern here?

Hillf Danton

Aug 18, 2025, 9:19:23 AM
to Breno Leitao, syzbot, Thomas Gleixner, linux-...@vger.kernel.org, syzkall...@googlegroups.com
On Mon, 18 Aug 2025 05:40:58 -0700 Breno Leitao wrote:
> On Mon, Aug 18, 2025 at 06:07:57PM +0800, Hillf Danton wrote:
> > > Date: Mon, 18 Aug 2025 01:05:33 -0700 [thread overview]
> > > Hello,
> > >
> > > syzbot found the following issue on:
> > >
> > > HEAD commit: 8f5ae30d69d7 Linux 6.17-rc1
> > > git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=15232442580000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=8c5ac3d8b8abfcb
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=f65a2014305525a9f816
> > > userspace arch: arm64
> > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14cbaba2580000
> > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1157faf0580000
> >
> > #syz test upstream master
>
> I was hitting this issue and I've tested it and the BUG is not there any
> more.
>
> Do you know which commit caused this "regression" ?
>
Looks like the tglx dude's work [1]

[1] Subject: [tip: locking/urgent] futex: Move futex cleanup to __mmdrop()
https://lore.kernel.org/lkml/175414093081.1420.8088049602488588887.tip-bot2@tip-bot2/

> > --- x/include/linux/mm_types.h
> > +++ y/include/linux/mm_types.h
> >
> > @@ -689,9 +697,8 @@ void __mmdrop(struct mm_struct *mm)
> > mm_pasid_drop(mm);
> > mm_destroy_cid(mm);
> > percpu_counter_destroy_many(mm->rss_stat, NR_MM_COUNTERS);
> > - futex_hash_free(mm);
> > -
> > - free_mm(mm);
> > + INIT_WORK(&mm->drop_work, mmdrop_workfn);
>
> Should INIT_WORK() be called once at setup time rather than on every
> __mmdrop()?
>
> Also, is the scheduling overhead a concern here?
>
Feel free to forget/ignore technical details like your concerns
here, because the diff is only meant to cut the added vfree in atomic
context from a square skull.

syzbot

Nov 16, 2025, 10:12:14 AM
to syzkall...@googlegroups.com
Auto-closing this bug as obsolete.
No recent activity, existing reproducers are no longer triggering the issue.