Re: [patch V2 3/4] sched/mmcid: Drop per CPU CID immediately when switching to per task mode


Thomas Gleixner

Feb 10, 2026, 5:44:13 AM
to Shinichiro Kawasaki, LKML, Ihor Solodrai, Shrikanth Hegde, Peter Zijlstra, Mathieu Desnoyers, Michael Jeanson, Andrey Ryabinin, Alexander Potapenko, kasa...@googlegroups.com
On Tue, Feb 10 2026 at 07:33, Shinichiro Kawasaki wrote:
> On Feb 02, 2026 / 10:39, Thomas Gleixner wrote:
>> When an exiting task initiates the switch from per CPU back to per task
>> mode, it has already dropped its CID and marked itself inactive. But a
>> leftover from an earlier iteration of the rework then reassigns the per
>> CPU CID to the exiting task with the transition bit set.
>>
>> That's wrong as the task is already marked CID inactive, which means it is in
>> an inconsistent state. It's harmless because the CID is marked in transit and
>> therefore dropped back into the pool when the exiting task schedules out
>> either through preemption or the final schedule().
>>
>> Simply drop the per CPU CID when the exiting task triggered the transition.
>>
>> Fixes: fbd0e71dc370 ("sched/mmcid: Provide CID ownership mode fixup functions")
>> Signed-off-by: Thomas Gleixner <tg...@kernel.org>
>> Reviewed-by: Mathieu Desnoyers <mathieu....@efficios.com>
>
> Hello all,
>
> While evaluating the v6.19 kernel, I observed a KASAN BUG. It reproduces
> reliably when running the test case zbd/013 of blktests [1] on some of my
> test systems. I bisected and found that this patch, as commit 007d84287c74,
> triggers the KASAN report. When I reverted this patch from the v6.19 kernel,
> the report disappeared. Of note is that the KASAN symptom varies slightly
> between runs. I observed KASAN slab-use-after-free [2], use-after-free [3]
> and slab-out-of-bounds [4]. All of these reports were raised in
> sched_mm_cid_exit.

And none of them make any sense. The patch does:

- mm_cid_transit_to_task(current, this_cpu_ptr(mm->mm_cid.pcpu));
+ mm_drop_cid_on_cpu(mm, this_cpu_ptr(mm->mm_cid.pcpu));

Both access mm->mm_cid and mm->mm_cid.pcpu. mm is valid at that point as
this is way before the task disconnects from the mm.

The new code also accesses the CID bitmap which is at the end of
mm_struct. But the subsequent mm_cid_fixup_cpus_to_tasks(mm) touches all
of those too. So none of this makes any sense at all.

> [ 65.768341] [ T1296] BUG: KASAN: slab-use-after-free in sched_mm_cid_exit+0x298/0x500

Can you please decode these symbols (file/line) so that we actually see
which access is flagged by KASAN?

Also .config and compiler version would be helpful.

Keeping the splats below for the KASAN folks to digest.

Thanks,

tglx

> A fix would be appreciated. If I can help by trying some patches
> on my test systems, please let me know.
>
> [1] https://github.com/linux-blktests/blktests
>
> [2] KASAN slab-use-after-free
>
> [ 64.540760] [ T1234] run blktests zbd/013 at 2026-02-10 11:06:48
> [ 64.638773] [ T1252] null_blk: disk nullb1 created
> [ 64.749061] [ T1252] null_blk: nullb2: using native zone append
> [ 64.764569] [ T1252] null_blk: disk nullb2 created
> [ 65.767294] [ T1296] ==================================================================
> [ 65.768341] [ T1296] BUG: KASAN: slab-use-after-free in sched_mm_cid_exit+0x298/0x500
> [ 65.769378] [ T1296] Write of size 8 at addr ffff888149792410 by task cryptsetup/1296
>
> [ 65.770700] [ T1296] CPU: 1 UID: 0 PID: 1296 Comm: cryptsetup Not tainted 6.19.0 #571 PREEMPT(voluntary)
> [ 65.770705] [ T1296] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-4.fc42 04/01/2014
> [ 65.770709] [ T1296] Call Trace:
> [ 65.770711] [ T1296] <TASK>
> [ 65.770713] [ T1296] dump_stack_lvl+0x6a/0x90
> [ 65.770718] [ T1296] ? sched_mm_cid_exit+0x298/0x500
> [ 65.770721] [ T1296] print_report+0x170/0x4f3
> [ 65.770725] [ T1296] ? __virt_addr_valid+0x22e/0x4e0
> [ 65.770729] [ T1296] ? sched_mm_cid_exit+0x298/0x500
> [ 65.770732] [ T1296] kasan_report+0xad/0x150
> [ 65.770737] [ T1296] ? sched_mm_cid_exit+0x298/0x500
> [ 65.770742] [ T1296] kasan_check_range+0x115/0x1f0
> [ 65.770745] [ T1296] sched_mm_cid_exit+0x298/0x500
> [ 65.770750] [ T1296] do_exit+0x25e/0x24c0
> [ 65.770755] [ T1296] ? __pfx_do_exit+0x10/0x10
> [ 65.770758] [ T1296] ? lockdep_hardirqs_on+0x88/0x130
> [ 65.770761] [ T1296] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 65.770764] [ T1296] ? do_syscall_64+0x1d7/0x540
> [ 65.770766] [ T1296] ? do_raw_spin_lock+0x124/0x260
> [ 65.770769] [ T1296] ? lock_acquire+0x180/0x300
> [ 65.770771] [ T1296] ? find_held_lock+0x2b/0x80
> [ 65.770775] [ T1296] __x64_sys_exit+0x3e/0x50
> [ 65.770780] [ T1296] x64_sys_call+0x14fe/0x1500
> [ 65.770784] [ T1296] do_syscall_64+0x95/0x540
> [ 65.770787] [ T1296] ? lockdep_hardirqs_on+0x88/0x130
> [ 65.770790] [ T1296] ? _raw_spin_unlock_irq+0x24/0x50
> [ 65.770792] [ T1296] ? _raw_spin_unlock_irq+0x34/0x50
> [ 65.770795] [ T1296] ? __x64_sys_rt_sigprocmask+0x23d/0x400
> [ 65.770798] [ T1296] ? __pfx___x64_sys_rt_sigprocmask+0x10/0x10
> [ 65.770800] [ T1296] ? rcu_nocb_unlock_irqrestore+0x87/0xb0
> [ 65.770804] [ T1296] ? rcu_do_batch+0x867/0xd90
> [ 65.770809] [ T1296] ? lockdep_hardirqs_on+0x88/0x130
> [ 65.770811] [ T1296] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 65.770813] [ T1296] ? do_syscall_64+0x1d7/0x540
> [ 65.770816] [ T1296] ? __pfx_sched_clock_cpu+0x10/0x10
> [ 65.770819] [ T1296] ? lock_is_held_type+0xd5/0x140
> [ 65.770824] [ T1296] ? irqtime_account_irq+0xe4/0x330
> [ 65.770827] [ T1296] ? lockdep_softirqs_on+0xc3/0x140
> [ 65.770829] [ T1296] ? __irq_exit_rcu+0x126/0x240
> [ 65.770832] [ T1296] ? handle_softirqs+0x6c5/0x790
> [ 65.770836] [ T1296] ? __pfx_handle_softirqs+0x10/0x10
> [ 65.770839] [ T1296] ? irqtime_account_irq+0x1a2/0x330
> [ 65.770842] [ T1296] ? lockdep_hardirqs_on_prepare+0xce/0x1b0
> [ 65.770844] [ T1296] ? irqentry_exit+0xe2/0x6a0
> [ 65.770848] [ T1296] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 65.770850] [ T1296] RIP: 0033:0x7f96978fef89
> [ 65.770854] [ T1296] Code: ff 31 c9 48 89 88 20 06 00 00 31 c0 87 07 83 e8 01 7f 19 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 31 ff b8 3c 00 00 00 0f 05 <eb> f5 89 95 74 ff ff ff e8 9a d0 ff ff 83 bd 74 ff ff ff 01 0f 85
> [ 65.770856] [ T1296] RSP: 002b:00007f9691de0d30 EFLAGS: 00000246 ORIG_RAX: 000000000000003c
> [ 65.770861] [ T1296] RAX: ffffffffffffffda RBX: 00007f9691de16c0 RCX: 00007f96978fef89
> [ 65.770863] [ T1296] RDX: 0000000000000000 RSI: 0000000000800000 RDI: 0000000000000000
> [ 65.770865] [ T1296] RBP: 00007f9691de0df0 R08: 0000000015fc5864 R09: 0000000000000000
> [ 65.770866] [ T1296] R10: 0000000000000008 R11: 0000000000000246 R12: 00007f9691de16c0
> [ 65.770867] [ T1296] R13: 00007fff8d18af10 R14: 00007f9691de1cdc R15: 00007fff8d18b017
> [ 65.770875] [ T1296] </TASK>
>
> [ 65.805902] [ T1296] Allocated by task 668:
> [ 65.806662] [ T1296] kasan_save_stack+0x2c/0x50
> [ 65.807400] [ T1296] kasan_save_track+0x10/0x30
> [ 65.808130] [ T1296] __kasan_slab_alloc+0x7a/0x90
> [ 65.808842] [ T1296] kmem_cache_alloc_noprof+0x238/0x7a0
> [ 65.809569] [ T1296] getname_flags.part.0+0x48/0x4d0
> [ 65.810280] [ T1296] do_sys_openat2+0xa8/0x180
> [ 65.810972] [ T1296] __x64_sys_openat+0x10a/0x200
> [ 65.811637] [ T1296] do_syscall_64+0x95/0x540
> [ 65.812267] [ T1296] entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> [ 65.813538] [ T1296] Freed by task 668:
> [ 65.814189] [ T1296] kasan_save_stack+0x2c/0x50
> [ 65.814884] [ T1296] kasan_save_track+0x10/0x30
> [ 65.815545] [ T1296] kasan_save_free_info+0x37/0x70
> [ 65.816318] [ T1296] __kasan_slab_free+0x67/0x80
> [ 65.817002] [ T1296] kmem_cache_free+0x1ae/0x6d0
> [ 65.817700] [ T1296] audit_reset_context+0x3c7/0xeb0
> [ 65.818401] [ T1296] syscall_exit_work+0x17f/0x1b0
> [ 65.819124] [ T1296] do_syscall_64+0x2fe/0x540
> [ 65.819812] [ T1296] entry_SYSCALL_64_after_hwframe+0x76/0x7e
>
> [ 65.821100] [ T1296] The buggy address belongs to the object at ffff888149792200
> which belongs to the cache names_cache of size 4096
> [ 65.822824] [ T1296] The buggy address is located 528 bytes inside of
> freed 4096-byte region [ffff888149792200, ffff888149793200)
>
> [ 65.825027] [ T1296] The buggy address belongs to the physical page:
> [ 65.825856] [ T1296] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x149790
> [ 65.826846] [ T1296] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> [ 65.827840] [ T1296] flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff)
> [ 65.828768] [ T1296] page_type: f5(slab)
> [ 65.829405] [ T1296] raw: 0017ffffc0000040 ffff888100902b40 ffffea0005314600 dead000000000002
> [ 65.830402] [ T1296] raw: 0000000000000000 0000000000070007 00000000f5000000 0000000000000000
> [ 65.831493] [ T1296] head: 0017ffffc0000040 ffff888100902b40 ffffea0005314600 dead000000000002
> [ 65.832644] [ T1296] head: 0000000000000000 0000000000070007 00000000f5000000 0000000000000000
> [ 65.833723] [ T1296] head: 0017ffffc0000003 ffffea000525e401 00000000ffffffff 00000000ffffffff
> [ 65.834798] [ T1296] head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008
> [ 65.835827] [ T1296] page dumped because: kasan: bad access detected
>
> [ 65.837253] [ T1296] Memory state around the buggy address:
> [ 65.838039] [ T1296] ffff888149792300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [ 65.838991] [ T1296] ffff888149792380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [ 65.839939] [ T1296] >ffff888149792400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [ 65.840894] [ T1296] ^
> [ 65.841569] [ T1296] ffff888149792480: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [ 65.842554] [ T1296] ffff888149792500: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> [ 65.843504] [ T1296] ==================================================================
> [ 65.844500] [ T1296] Disabling lock debugging due to kernel taint
> [ 71.925834] [ T1650] device-mapper: zone: dm-0 using emulated zone append
> [ 72.474170] [ C1] hrtimer: interrupt took 1119829 ns
>
> [3] KASAN use-after-free
>
> [ 145.885127] [ T1246] run blktests zbd/013 at 2026-02-10 10:57:04
> [ 145.985394] [ T1264] null_blk: disk nullb1 created
> [ 146.091908] [ T1264] null_blk: nullb2: using native zone append
> [ 146.106425] [ T1264] null_blk: disk nullb2 created
> [ 147.822863] [ T1479] ==================================================================
> [ 147.823592] [ T1479] BUG: KASAN: use-after-free in sched_mm_cid_exit+0x298/0x500
> [ 147.824479] [ T1479] Write of size 8 at addr ffff8881185cb050 by task cryptsetup/1479
>
> [ 147.825468] [ T1479] CPU: 2 UID: 0 PID: 1479 Comm: cryptsetup Not tainted 6.19.0 #571 PREEMPT(voluntary)
> [ 147.825472] [ T1479] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-4.fc42 04/01/2014
> [ 147.825476] [ T1479] Call Trace:
> [ 147.825478] [ T1479] <TASK>
> [ 147.825480] [ T1479] dump_stack_lvl+0x6a/0x90
> [ 147.825484] [ T1479] ? sched_mm_cid_exit+0x298/0x500
> [ 147.825487] [ T1479] print_report+0x170/0x4f3
> [ 147.825490] [ T1479] ? __virt_addr_valid+0x22e/0x4e0
> [ 147.825494] [ T1479] ? sched_mm_cid_exit+0x298/0x500
> [ 147.825496] [ T1479] kasan_report+0xad/0x150
> [ 147.825500] [ T1479] ? sched_mm_cid_exit+0x298/0x500
> [ 147.825504] [ T1479] kasan_check_range+0x115/0x1f0
> [ 147.825507] [ T1479] sched_mm_cid_exit+0x298/0x500
> [ 147.825510] [ T1479] do_exit+0x25e/0x24c0
> [ 147.825514] [ T1479] ? lockdep_hardirqs_on+0x88/0x130
> [ 147.825517] [ T1479] ? __pfx_do_exit+0x10/0x10
> [ 147.825520] [ T1479] ? irqtime_account_irq+0xe4/0x330
> [ 147.825524] [ T1479] __x64_sys_exit+0x3e/0x50
> [ 147.825526] [ T1479] x64_sys_call+0x14fe/0x1500
> [ 147.825529] [ T1479] do_syscall_64+0x95/0x540
> [ 147.825531] [ T1479] ? __pfx_handle_softirqs+0x10/0x10
> [ 147.825534] [ T1479] ? irqtime_account_irq+0x1a2/0x330
> [ 147.825536] [ T1479] ? lockdep_hardirqs_on_prepare+0xce/0x1b0
> [ 147.825539] [ T1479] ? irqentry_exit+0xe2/0x6a0
> [ 147.825542] [ T1479] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> [ 147.825544] [ T1479] RIP: 0033:0x7f505e211f89
> [ 147.825547] [ T1479] Code: ff 31 c9 48 89 88 20 06 00 00 31 c0 87 07 83 e8 01 7f 19 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 31 ff b8 3c 00 00 00 0f 05 <eb> f5 89 95 74 ff ff ff e8 9a d0 ff ff 83 bd 74 ff ff ff 01 0f 85
> [ 147.825549] [ T1479] RSP: 002b:00007f50585fbd30 EFLAGS: 00000246 ORIG_RAX: 000000000000003c
> [ 147.825553] [ T1479] RAX: ffffffffffffffda RBX: 00007f50585fc6c0 RCX: 00007f505e211f89
> [ 147.825555] [ T1479] RDX: 0000000000000000 RSI: 0000000000800000 RDI: 0000000000000000
> [ 147.825556] [ T1479] RBP: 00007f50585fbdf0 R08: 00005566eb14ea20 R09: 00005566eb14ea38
> [ 147.825558] [ T1479] R10: 0000000000000008 R11: 0000000000000246 R12: 00007f50585fc6c0
> [ 147.825559] [ T1479] R13: 00007fff4289e220 R14: 00007f50585fccdc R15: 00007fff4289e327
> [ 147.825564] [ T1479] </TASK>
>
> [ 147.844213] [ T1479] The buggy address belongs to the physical page:
> [ 147.845137] [ T1479] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x10 pfn:0x1185cb
> [ 147.846323] [ T1479] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
> [ 147.847389] [ T1479] raw: 0017ffffc0000000 dead000000000100 dead000000000122 0000000000000000
> [ 147.848662] [ T1479] raw: 0000000000000010 0000000000000000 00000000ffffffff 0000000000000000
> [ 147.849887] [ T1479] page dumped because: kasan: bad access detected
>
> [ 147.851495] [ T1479] Memory state around the buggy address:
> [ 147.852479] [ T1479] ffff8881185caf00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> [ 147.853600] [ T1479] ffff8881185caf80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> [ 147.854690] [ T1479] >ffff8881185cb000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> [ 147.855852] [ T1479] ^
> [ 147.856798] [ T1479] ffff8881185cb080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> [ 147.857855] [ T1479] ffff8881185cb100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
> [ 147.858857] [ T1479] ==================================================================
> [ 147.859888] [ T1479] Disabling lock debugging due to kernel taint
> [ 153.349607] [ T1982] device-mapper: zone: dm-0 using emulated zone append
> [ 153.715923] [ C3] hrtimer: interrupt took 475570 ns
> [ 282.408372] [ T3034] null_blk: disk nullb0 created
> [ 282.409360] [ T3034] null_blk: module loaded
>
> [4] KASAN slab-out-of-bounds
>
> Feb 09 15:14:28 testnode2 unknown: run blktests zbd/013 at 2026-02-09 15:14:28
> Feb 09 15:14:28 testnode2 kernel: null_blk: disk nullb1 created
> Feb 09 15:14:28 testnode2 kernel: null_blk: nullb2: using native zone append
> Feb 09 15:14:28 testnode2 kernel: null_blk: disk nullb2 created
> Feb 09 15:14:29 testnode2 kernel: ==================================================================
> Feb 09 15:14:29 testnode2 kernel: BUG: KASAN: slab-out-of-bounds in sched_mm_cid_exit+0x298/0x500
> Feb 09 15:14:29 testnode2 kernel: Write of size 8 at addr ffff8881580db050 by task cryptsetup/136938
> Feb 09 15:14:29 testnode2 kernel:
> Feb 09 15:14:29 testnode2 kernel: CPU: 3 UID: 0 PID: 136938 Comm: cryptsetup Not tainted 6.19.0 #571 PREEMPT(voluntary)
> Feb 09 15:14:29 testnode2 kernel: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-4.fc42 04/01/2014
> Feb 09 15:14:29 testnode2 kernel: Call Trace:
> Feb 09 15:14:29 testnode2 kernel: <TASK>
> Feb 09 15:14:29 testnode2 kernel: dump_stack_lvl+0x6a/0x90
> Feb 09 15:14:29 testnode2 kernel: ? sched_mm_cid_exit+0x298/0x500
> Feb 09 15:14:29 testnode2 kernel: print_report+0x170/0x4f3
> Feb 09 15:14:29 testnode2 kernel: ? __virt_addr_valid+0x22e/0x4e0
> Feb 09 15:14:29 testnode2 kernel: ? sched_mm_cid_exit+0x298/0x500
> Feb 09 15:14:29 testnode2 kernel: kasan_report+0xad/0x150
> Feb 09 15:14:29 testnode2 kernel: ? sched_mm_cid_exit+0x298/0x500
> Feb 09 15:14:29 testnode2 kernel: kasan_check_range+0x115/0x1f0
> Feb 09 15:14:29 testnode2 kernel: sched_mm_cid_exit+0x298/0x500
> Feb 09 15:14:29 testnode2 kernel: do_exit+0x25e/0x24c0
> Feb 09 15:14:29 testnode2 kernel: ? __pfx_do_exit+0x10/0x10
> Feb 09 15:14:29 testnode2 kernel: ? rcu_is_watching+0x11/0xb0
> Feb 09 15:14:29 testnode2 kernel: __x64_sys_exit+0x3e/0x50
> Feb 09 15:14:29 testnode2 kernel: x64_sys_call+0x14fe/0x1500
> Feb 09 15:14:29 testnode2 kernel: do_syscall_64+0x95/0x540
> Feb 09 15:14:29 testnode2 kernel: ? sched_tick+0x330/0x960
> Feb 09 15:14:29 testnode2 kernel: ? rcu_is_watching+0x11/0xb0
> Feb 09 15:14:29 testnode2 kernel: ? trace_hardirqs_on_prepare+0xfd/0x130
> Feb 09 15:14:29 testnode2 kernel: ? do_syscall_64+0x1d7/0x540
> Feb 09 15:14:29 testnode2 kernel: ? do_futex+0x1bf/0x210
> Feb 09 15:14:29 testnode2 kernel: ? __pfx_do_futex+0x10/0x10
> Feb 09 15:14:29 testnode2 kernel: ? rcu_is_watching+0x11/0xb0
> Feb 09 15:14:29 testnode2 kernel: ? profile_tick+0x18/0x90
> Feb 09 15:14:29 testnode2 kernel: ? __x64_sys_futex+0x22f/0x4a0
> Feb 09 15:14:29 testnode2 kernel: ? __pfx_do_raw_spin_lock+0x10/0x10
> Feb 09 15:14:29 testnode2 kernel: ? lock_release+0x242/0x2f0
> Feb 09 15:14:29 testnode2 kernel: ? __pfx___x64_sys_futex+0x10/0x10
> Feb 09 15:14:29 testnode2 kernel: ? timerqueue_add+0x207/0x3c0
> Feb 09 15:14:29 testnode2 kernel: ? enqueue_hrtimer+0x1f0/0x290
> Feb 09 15:14:29 testnode2 kernel: ? sched_clock_cpu+0x65/0x5c0
> Feb 09 15:14:29 testnode2 kernel: ? rcu_is_watching+0x11/0xb0
> Feb 09 15:14:29 testnode2 kernel: ? trace_hardirqs_on_prepare+0xfd/0x130
> Feb 09 15:14:29 testnode2 kernel: ? do_syscall_64+0x1d7/0x540
> Feb 09 15:14:29 testnode2 kernel: ? lock_release+0x242/0x2f0
> Feb 09 15:14:29 testnode2 kernel: ? rcu_is_watching+0x11/0xb0
> Feb 09 15:14:29 testnode2 kernel: ? trace_hardirqs_on+0x14/0x140
> Feb 09 15:14:29 testnode2 kernel: ? kvm_sched_clock_read+0xd/0x20
> Feb 09 15:14:29 testnode2 kernel: ? sched_clock+0xc/0x30
> Feb 09 15:14:29 testnode2 kernel: ? sched_clock_cpu+0x65/0x5c0
> Feb 09 15:14:29 testnode2 kernel: ? irqtime_account_irq+0xe4/0x330
> Feb 09 15:14:29 testnode2 kernel: ? kvm_sched_clock_read+0xd/0x20
> Feb 09 15:14:29 testnode2 kernel: ? sched_clock+0xc/0x30
> Feb 09 15:14:29 testnode2 kernel: ? sched_clock_cpu+0x65/0x5c0
> Feb 09 15:14:29 testnode2 kernel: ? __pfx_sched_clock_cpu+0x10/0x10
> Feb 09 15:14:29 testnode2 kernel: ? flush_tlb_func+0xb5/0x760
> Feb 09 15:14:29 testnode2 kernel: ? irqtime_account_irq+0x1a2/0x330
> Feb 09 15:14:29 testnode2 kernel: ? rcu_is_watching+0x11/0xb0
> Feb 09 15:14:29 testnode2 kernel: ? trace_hardirqs_on_prepare+0xfd/0x130
> Feb 09 15:14:29 testnode2 kernel: ? irqentry_exit+0xe2/0x6a0
> Feb 09 15:14:29 testnode2 kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e
> Feb 09 15:14:29 testnode2 kernel: RIP: 0033:0x7fca4fbf5f89
> Feb 09 15:14:29 testnode2 kernel: Code: ff 31 c9 48 89 88 20 06 00 00 31 c0 87 07 83 e8 01 7f 19 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 31 ff b8 3c 00 00 00 0f 05 <eb> f5 89 95 74 ff ff ff e8 9a d0 ff ff 83 bd 74 ff ff ff 01 0f 85
> Feb 09 15:14:29 testnode2 kernel: RSP: 002b:00007fca497fad30 EFLAGS: 00000246 ORIG_RAX: 000000000000003c
> Feb 09 15:14:29 testnode2 kernel: RAX: ffffffffffffffda RBX: 00007fca497fb6c0 RCX: 00007fca4fbf5f89
> Feb 09 15:14:29 testnode2 kernel: RDX: 0000000000000000 RSI: 0000000000800000 RDI: 0000000000000000
> Feb 09 15:14:29 testnode2 kernel: RBP: 00007fca497fadf0 R08: 0000557abe711cb0 R09: 0000557abe711cc8
> Feb 09 15:14:29 testnode2 kernel: R10: 0000000000000008 R11: 0000000000000246 R12: 00007fca497fb6c0
> Feb 09 15:14:29 testnode2 kernel: R13: 00007ffc5119c9c0 R14: 00007fca497fbcdc R15: 00007ffc5119cac7
> Feb 09 15:14:29 testnode2 kernel: </TASK>
> Feb 09 15:14:29 testnode2 kernel:
> Feb 09 15:14:29 testnode2 kernel: Allocated by task 136663:
> Feb 09 15:14:29 testnode2 kernel: kasan_save_stack+0x2c/0x50
> Feb 09 15:14:29 testnode2 kernel: kasan_save_track+0x10/0x30
> Feb 09 15:14:29 testnode2 kernel: __kasan_slab_alloc+0x7a/0x90
> Feb 09 15:14:29 testnode2 kernel: kmem_cache_alloc_noprof+0x238/0x7a0
> Feb 09 15:14:29 testnode2 kernel: mempool_alloc_noprof+0x150/0x250
> Feb 09 15:14:29 testnode2 kernel: bio_alloc_bioset+0x1d7/0x720
> Feb 09 15:14:29 testnode2 kernel: blkdev_direct_IO+0x3a7/0x1f40
> Feb 09 15:14:29 testnode2 kernel: blkdev_write_iter+0x52b/0xba0
> Feb 09 15:14:29 testnode2 kernel: aio_write+0x33a/0x7c0
> Feb 09 15:14:29 testnode2 kernel: io_submit_one+0xd97/0x1a00
> Feb 09 15:14:29 testnode2 kernel: __x64_sys_io_submit+0x15d/0x2b0
> Feb 09 15:14:29 testnode2 kernel: do_syscall_64+0x95/0x540
> Feb 09 15:14:29 testnode2 kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e
> Feb 09 15:14:29 testnode2 kernel:
> Feb 09 15:14:29 testnode2 kernel: Freed by task 37:
> Feb 09 15:14:29 testnode2 kernel: kasan_save_stack+0x2c/0x50
> Feb 09 15:14:29 testnode2 kernel: kasan_save_track+0x10/0x30
> Feb 09 15:14:29 testnode2 kernel: kasan_save_free_info+0x37/0x70
> Feb 09 15:14:29 testnode2 kernel: __kasan_slab_free+0x67/0x80
> Feb 09 15:14:29 testnode2 kernel: slab_free_after_rcu_debug+0xf5/0x200
> Feb 09 15:14:29 testnode2 kernel: rcu_do_batch+0x37a/0xd90
> Feb 09 15:14:29 testnode2 kernel: rcu_core+0x6f1/0xad0
> Feb 09 15:14:29 testnode2 kernel: handle_softirqs+0x1ee/0x790
> Feb 09 15:14:29 testnode2 kernel: run_ksoftirqd+0x3b/0x60
> Feb 09 15:14:29 testnode2 kernel: smpboot_thread_fn+0x2fd/0x9a0
> Feb 09 15:14:29 testnode2 kernel: kthread+0x3af/0x770
> Feb 09 15:14:29 testnode2 kernel: ret_from_fork+0x55c/0x810
> Feb 09 15:14:29 testnode2 kernel: ret_from_fork_asm+0x1a/0x30
> Feb 09 15:14:29 testnode2 kernel:
> Feb 09 15:14:29 testnode2 kernel: Last potentially related work creation:
> Feb 09 15:14:29 testnode2 kernel: kasan_save_stack+0x2c/0x50
> Feb 09 15:14:29 testnode2 kernel: kasan_record_aux_stack+0xac/0xc0
> Feb 09 15:14:29 testnode2 kernel: kmem_cache_free+0x4af/0x6d0
> Feb 09 15:14:29 testnode2 kernel: mempool_free+0xbe/0x110
> Feb 09 15:14:29 testnode2 kernel: blk_update_request+0x443/0x1190
> Feb 09 15:14:29 testnode2 kernel: scsi_end_request+0x70/0x7b0
> Feb 09 15:14:29 testnode2 kernel: scsi_io_completion+0xea/0x1440
> Feb 09 15:14:29 testnode2 kernel: blk_complete_reqs+0xa8/0x120
> Feb 09 15:14:29 testnode2 kernel: handle_softirqs+0x1ee/0x790
> Feb 09 15:14:29 testnode2 kernel: run_ksoftirqd+0x3b/0x60
> Feb 09 15:14:29 testnode2 kernel: smpboot_thread_fn+0x2fd/0x9a0
> Feb 09 15:14:29 testnode2 kernel: kthread+0x3af/0x770
> Feb 09 15:14:29 testnode2 kernel: ret_from_fork+0x55c/0x810
> Feb 09 15:14:29 testnode2 kernel: ret_from_fork_asm+0x1a/0x30
> Feb 09 15:14:29 testnode2 kernel:
> Feb 09 15:14:29 testnode2 kernel: The buggy address belongs to the object at ffff8881580daf00
> which belongs to the cache bio-264 of size 264
> Feb 09 15:14:29 testnode2 kernel: The buggy address is located 72 bytes to the right of
> allocated 264-byte region [ffff8881580daf00, ffff8881580db008)
> Feb 09 15:14:29 testnode2 kernel:
> Feb 09 15:14:29 testnode2 kernel: The buggy address belongs to the physical page:
> Feb 09 15:14:29 testnode2 kernel: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1580da
> Feb 09 15:14:29 testnode2 kernel: head: order:1 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> Feb 09 15:14:29 testnode2 kernel: flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff)
> Feb 09 15:14:29 testnode2 kernel: page_type: f5(slab)
> Feb 09 15:14:29 testnode2 kernel: raw: 0017ffffc0000040 ffff88810536c500 dead000000000122 0000000000000000
> Feb 09 15:14:29 testnode2 kernel: raw: 0000000000000000 0000000000150015 00000000f5000000 0000000000000000
> Feb 09 15:14:29 testnode2 kernel: head: 0017ffffc0000040 ffff88810536c500 dead000000000122 0000000000000000
> Feb 09 15:14:29 testnode2 kernel: head: 0000000000000000 0000000000150015 00000000f5000000 0000000000000000
> Feb 09 15:14:29 testnode2 kernel: head: 0017ffffc0000001 ffffea0005603681 00000000ffffffff 00000000ffffffff
> Feb 09 15:14:29 testnode2 kernel: head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000002
> Feb 09 15:14:29 testnode2 kernel: page dumped because: kasan: bad access detected
> Feb 09 15:14:29 testnode2 kernel:
> Feb 09 15:14:29 testnode2 kernel: Memory state around the buggy address:
> Feb 09 15:14:29 testnode2 kernel: ffff8881580daf00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> Feb 09 15:14:29 testnode2 kernel: ffff8881580daf80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> Feb 09 15:14:29 testnode2 kernel: >ffff8881580db000: fb fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
> Feb 09 15:14:29 testnode2 kernel: ^
> Feb 09 15:14:29 testnode2 kernel: ffff8881580db080: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> Feb 09 15:14:29 testnode2 kernel: ffff8881580db100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> Feb 09 15:14:29 testnode2 kernel: ==================================================================
> Feb 09 15:14:34 testnode2 kernel: device-mapper: zone: dm-0 using emulated zone append
> Feb 09 15:16:09 testnode2 kernel: null_blk: disk nullb0 created
> Feb 09 15:16:09 testnode2 kernel: null_blk: module loaded

Shinichiro Kawasaki

Feb 10, 2026, 6:51:20 AM
to Thomas Gleixner, LKML, Ihor Solodrai, Shrikanth Hegde, Peter Zijlstra, Mathieu Desnoyers, Michael Jeanson, Andrey Ryabinin, Alexander Potapenko, kasa...@googlegroups.com
On Feb 10, 2026 / 11:44, Thomas Gleixner wrote:
> On Tue, Feb 10 2026 at 07:33, Shinichiro Kawasaki wrote:
[...]
> > [ 65.768341] [ T1296] BUG: KASAN: slab-use-after-free in sched_mm_cid_exit+0x298/0x500
>
> Can you please decode these symbols (file/line) so that we actually see
> which access is flagged by KASAN?

Sure, faddr2line points to the line the patch touched:

$ ./scripts/faddr2line vmlinux sched_mm_cid_exit+0x298/0x500
sched_mm_cid_exit+0x298/0x500:
arch_clear_bit at arch/x86/include/asm/bitops.h:79
(inlined by) clear_bit at include/asm-generic/bitops/instrumented-atomic.h:42
(inlined by) mm_drop_cid at kernel/sched/sched.h:3746
(inlined by) mm_drop_cid_on_cpu at kernel/sched/sched.h:3762
(inlined by) sched_mm_cid_exit at kernel/sched/core.c:10737

Quote from kernel/sched/core.c:
--------------------------------------------------------------------------------
10735 * mm::mm_cid::lock. Drop it.
10736 */
10737 mm_drop_cid_on_cpu(mm, this_cpu_ptr(mm->mm_cid.pcpu));
10738 }
10739 mm_cid_fixup_cpus_to_tasks(mm);
--------------------------------------------------------------------------------

> Also .config and compiler version would be helpful.

Please find the attached config file. Its first line records the compiler
version:

CONFIG_CC_VERSION_TEXT="gcc (GCC) 15.1.1 20250425 (Red Hat 15.1.1-1)"

If I can provide any other clues, please let me know.
_config_KASAN_in_schd_mm_cid_exit

Peter Zijlstra

Feb 10, 2026, 8:03:15 AM
to Shinichiro Kawasaki, Thomas Gleixner, LKML, Ihor Solodrai, Shrikanth Hegde, Mathieu Desnoyers, Michael Jeanson, Andrey Ryabinin, Alexander Potapenko, kasa...@googlegroups.com
On Tue, Feb 10, 2026 at 11:51:10AM +0000, Shinichiro Kawasaki wrote:
> On Feb 10, 2026 / 11:44, Thomas Gleixner wrote:
> > On Tue, Feb 10 2026 at 07:33, Shinichiro Kawasaki wrote:
> [...]
> > > [ 65.768341] [ T1296] BUG: KASAN: slab-use-after-free in sched_mm_cid_exit+0x298/0x500
> >
> > Can you please decode these symbols (file/line) so that we actually see
> > which access is flagged by KASAN?
>
> Sure, faddr2line points to the line the patch touched:
>
> $ ./scripts/faddr2line vmlinux sched_mm_cid_exit+0x298/0x500
> sched_mm_cid_exit+0x298/0x500:
> arch_clear_bit at arch/x86/include/asm/bitops.h:79
> (inlined by) clear_bit at include/asm-generic/bitops/instrumented-atomic.h:42
> (inlined by) mm_drop_cid at kernel/sched/sched.h:3746
> (inlined by) mm_drop_cid_on_cpu at kernel/sched/sched.h:3762
> (inlined by) sched_mm_cid_exit at kernel/sched/core.c:10737

Could you please reproduce with the below added?

Just to double check that that cid value isn't out of bounds.

---
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index bd350e40859d..dadfd6abc1fa 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3743,6 +3743,7 @@ static __always_inline bool cid_on_task(unsigned int cid)

 static __always_inline void mm_drop_cid(struct mm_struct *mm, unsigned int cid)
 {
+	WARN_ONCE(cid >= nr_cpu_ids, "XXX cid(%x) out of range(%x)\n", cid, nr_cpu_ids);
 	clear_bit(cid, mm_cidmask(mm));
 }

Thomas Gleixner

Feb 10, 2026, 8:33:55 AM
to Shinichiro Kawasaki, LKML, Ihor Solodrai, Shrikanth Hegde, Peter Zijlstra, Mathieu Desnoyers, Michael Jeanson, Andrey Ryabinin, Alexander Potapenko, kasa...@googlegroups.com
On Tue, Feb 10 2026 at 11:51, Shinichiro Kawasaki wrote:
> On Feb 10, 2026 / 11:44, Thomas Gleixner wrote:
>> > [ 65.768341] [ T1296] BUG: KASAN: slab-use-after-free in sched_mm_cid_exit+0x298/0x500
>>
>> Can you please decode these symbols (file/line) so that we actually see
>> which access is flagged by KASAN?
>
> Sure, faddr2line points to the line the patch touched:
>
> $ ./scripts/faddr2line vmlinux sched_mm_cid_exit+0x298/0x500
> sched_mm_cid_exit+0x298/0x500:
> arch_clear_bit at arch/x86/include/asm/bitops.h:79
> (inlined by) clear_bit at include/asm-generic/bitops/instrumented-atomic.h:42
> (inlined by) mm_drop_cid at kernel/sched/sched.h:3746
> (inlined by) mm_drop_cid_on_cpu at kernel/sched/sched.h:3762
> (inlined by) sched_mm_cid_exit at kernel/sched/core.c:10737

Ok. That's useful and I think I know what's going on.

fork() switches to per CPU mode and sets the TRANSIT bit on the task and
the CPU.

While the task is out in user space and therefore not scheduling, other
tasks are exiting and when this task exits it hits the mode change.

It still has the transit bit set in both task::mm::mm_cid:cid and in the
per CPU cid store. sched_mm_cid_remove_user() clears the TRANSIT bit in
the task and drops the CID, but it does not touch the per CPU storage.

That's functionally correct because a CID is only owned by the CPU when
the ONCPU bit is set, which is mutually exclusive with the TRANSIT flag.

Now mm_drop_cid_on_cpu() assumes for the wrong reason that the CID is
CPU owned because the prior mode was per CPU. So it clears the (not set)
ONCPU bit and then invokes clear_bit() with an insanely large bit
number because TRANSIT is set (bit 29). Duh.
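
To make the size of that bit number concrete, here is a small userspace sketch (not kernel code). It assumes 64-bit longs as on x86-64, and bit_to_byte_offset() and bit_in_range() are hypothetical helpers written for this illustration. Only the TRANSIT position (bit 29) is taken from the analysis above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 29 is the TRANSIT flag, as stated in the analysis above. */
#define MM_CID_TRANSIT	(1u << 29)

/* Assumption: 64-bit longs, so a bitmap word is 8 bytes wide. */
#define BITS_PER_WORD	64u

/*
 * Byte offset, relative to the start of a bitmap, of the word that a
 * clear_bit(nr, bitmap) style operation would modify.
 */
static inline size_t bit_to_byte_offset(unsigned int nr)
{
	return (size_t)(nr / BITS_PER_WORD) * (BITS_PER_WORD / 8u);
}

/* True if bit nr lies inside a bitmap that holds nr_bits bits. */
static inline bool bit_in_range(unsigned int nr, unsigned int nr_bits)
{
	assert(nr_bits > 0);
	return nr < nr_bits;
}
```

For a CID of 3 that still carries TRANSIT (0x20000003), bit_to_byte_offset() returns 0x4000000, so the 8-byte write would land 64 MiB past the start of the bitmap. That would be consistent with KASAN flagging unrelated memory that differs from run to run.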

Can you please try the fix below?

Thanks

tglx
---
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 854984967fe2..61c2d65156b5 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10729,10 +10729,9 @@ void sched_mm_cid_exit(struct task_struct *t)
return;
/*
* Mode change. The task has the CID unset
- * already. The CPU CID is still valid and
- * does not have MM_CID_TRANSIT set as the
- * mode change has just taken effect under
- * mm::mm_cid::lock. Drop it.
+ * already and dealt with an eventually set
+ * TRANSIT bit. If the CID is owned by the CPU
+ * then drop it.
*/
mm_drop_cid_on_cpu(mm, this_cpu_ptr(mm->mm_cid.pcpu));
}
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index bd350e40859d..1b4283e9edc3 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3758,8 +3758,10 @@ static __always_inline void mm_unset_cid_on_task(struct task_struct *t)
 static __always_inline void mm_drop_cid_on_cpu(struct mm_struct *mm, struct mm_cid_pcpu *pcp)
 {
 	/* Clear the ONCPU bit, but do not set UNSET in the per CPU storage */
-	pcp->cid = cpu_cid_to_cid(pcp->cid);
-	mm_drop_cid(mm, pcp->cid);
+	if (cid_on_cpu(pcp->cid)) {
+		pcp->cid = cpu_cid_to_cid(pcp->cid);
+		mm_drop_cid(mm, pcp->cid);
+	}
 }
 
 static inline unsigned int __mm_get_cid(struct mm_struct *mm, unsigned int max_cids)
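
The effect of the added guard can be modelled in userspace. This is a simplified sketch and not the kernel implementation: struct pcpu_model, struct drop_result, drop_unguarded() and drop_guarded() are hypothetical names, and the ONCPU bit position (bit 30) is an assumption made for the illustration. Only TRANSIT being bit 29 comes from the thread.

```c
#include <assert.h>
#include <stdbool.h>

#define MM_CID_TRANSIT	(1u << 29)	/* bit 29, per the thread */
#define MM_CID_ONCPU	(1u << 30)	/* assumption: position not stated in the thread */

/* Hypothetical stand-in for the per CPU CID storage. */
struct pcpu_model {
	unsigned int cid;
};

/* Records whether a bitmap bit would be cleared, and which one. */
struct drop_result {
	bool dropped;
	unsigned int bit;
};

/* Pre-fix model: strip ONCPU unconditionally and drop whatever remains. */
static inline struct drop_result drop_unguarded(struct pcpu_model *pcp)
{
	assert(pcp);
	pcp->cid &= ~MM_CID_ONCPU;
	return (struct drop_result){ .dropped = true, .bit = pcp->cid };
}

/* Post-fix model: drop only when the CPU actually owns the CID. */
static inline struct drop_result drop_guarded(struct pcpu_model *pcp)
{
	assert(pcp);
	if (!(pcp->cid & MM_CID_ONCPU))
		return (struct drop_result){ .dropped = false, .bit = 0 };
	pcp->cid &= ~MM_CID_ONCPU;
	return (struct drop_result){ .dropped = true, .bit = pcp->cid };
}
```

With the per CPU value set to MM_CID_TRANSIT | 3, the unguarded model hands 0x20000003 to the bitmap operation, while the guarded model leaves the storage untouched. A value of MM_CID_ONCPU | 2 is still dropped as bit 2 in both models.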




Shinichiro Kawasaki

Feb 10, 2026, 9:15:34 AM
to Peter Zijlstra, Thomas Gleixner, LKML, Ihor Solodrai, Shrikanth Hegde, Mathieu Desnoyers, Michael Jeanson, Andrey Ryabinin, Alexander Potapenko, kasa...@googlegroups.com
Thanks for looking into this. I applied the patch to the v6.19 kernel and
reproduced the KASAN report. The added WARN was printed as follows. (I am now
trying the candidate fix that Thomas shared in another post.)

[ 73.897104] [ T1031] run blktests zbd/013 at 2026-02-10 23:09:21
[ 73.987761] [ T1049] null_blk: disk nullb1 created
[ 74.417726] [ T1049] null_blk: nullb2: using native zone append
[ 74.436675] [ T1049] null_blk: disk nullb2 created
[ 75.983893] [ T1175] ------------[ cut here ]------------
[ 75.984939] [ T1175] XXX cid(20000003) out of range(4)
[ 75.985515] [ T1175] WARNING: kernel/sched/sched.h:3746 at sched_mm_cid_exit+0x37b/0x530, CPU#3: cryptsetup/1175
[ 75.986573] [ T1175] Modules linked in: dm_crypt null_blk nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables qrtr sunrpc 9pnet_virtio 9pnet pcspkr netfs i2c_piix4 i2c_smbus loop fuse dm_multipath nfnetlink vsock_loopback vmw_vsock_virtio_transport_common zram vsock xfs nvme bochs drm_client_lib drm_shmem_helper drm_kms_helper nvme_core drm nvme_keyring sym53c8xx nvme_auth scsi_transport_spi hkdf e1000 floppy serio_raw ata_generic pata_acpi i2c_dev qemu_fw_cfg
[ 75.992120] [ T1175] CPU: 3 UID: 0 PID: 1175 Comm: cryptsetup Not tainted 6.19.0+ #387 PREEMPT(voluntary)
[ 75.993151] [ T1175] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-4.fc42 04/01/2014
[ 75.994146] [ T1175] RIP: 0010:sched_mm_cid_exit+0x37e/0x530
[ 75.994773] [ T1175] Code: 01 00 00 e8 74 90 48 00 48 8d bd 30 01 00 00 48 83 c4 10 5b 5d 41 5c 41 5d 41 5e e9 5c 27 f9 ff 48 8d 3d 75 cf 85 04 44 89 e6 <67> 48 0f b9 3a 48 b8 00 00 00 00 00 fc ff df 48 89 da 83 e3 07 48
[ 75.996798] [ T1175] RSP: 0018:ffff888124bb7b20 EFLAGS: 00010016
[ 75.997442] [ T1175] RAX: 0000000000000003 RBX: ffffffff95e37da0 RCX: 1ffff110272ab021
[ 75.998296] [ T1175] RDX: 0000000000000004 RSI: 0000000020000003 RDI: ffffffff95e49f30
[ 75.999094] [ T1175] RBP: ffff888139558000 R08: ffff888139558108 R09: 0000000040000000
[ 75.999958] [ T1175] R10: 0000000000000003 R11: 0000000000000000 R12: 0000000020000003
[ 76.000812] [ T1175] R13: 0000000000000000 R14: ffff888139558178 R15: ffff88811d6baf80
[ 76.001632] [ T1175] FS: 00007f72777fc6c0(0000) GS:ffff888408490000(0000) knlGS:0000000000000000
[ 76.002579] [ T1175] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 76.003299] [ T1175] CR2: 00007f7276ff97d0 CR3: 0000000104970000 CR4: 00000000000006f0
[ 76.004088] [ T1175] Call Trace:
[ 76.004476] [ T1175] <TASK>
[ 76.004793] [ T1175] ? lockdep_hardirqs_on_prepare+0xce/0x1b0
[ 76.005431] [ T1175] do_exit+0x25e/0x24c0
[ 76.005870] [ T1175] ? __pfx___up_read+0x10/0x10
[ 76.006389] [ T1175] ? __pfx_do_exit+0x10/0x10
[ 76.006867] [ T1175] ? lock_release+0x1ab/0x2f0
[ 76.007401] [ T1175] __x64_sys_exit+0x3e/0x50
[ 76.007835] [ T1175] x64_sys_call+0x14fe/0x1500
[ 76.008355] [ T1175] do_syscall_64+0x95/0x540
[ 76.008790] [ T1175] ? __pfx_do_madvise+0x10/0x10
[ 76.009336] [ T1175] ? lockdep_hardirqs_on_prepare+0xce/0x1b0
[ 76.009900] [ T1175] ? trace_hardirqs_on+0x14/0x140
[ 76.010458] [ T1175] ? lockdep_hardirqs_on+0x88/0x130
[ 76.010969] [ T1175] ? kvm_sched_clock_read+0xd/0x20
[ 76.011534] [ T1175] ? sched_clock+0xc/0x30
[ 76.011980] [ T1175] ? sched_clock_cpu+0x65/0x5c0
[ 76.012998] [ T1175] ? __pfx_rcu_do_batch+0x10/0x10
[ 76.014067] [ T1175] ? lockdep_hardirqs_on+0x88/0x130
[ 76.015102] [ T1175] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 76.016316] [ T1175] ? do_syscall_64+0x1d7/0x540
[ 76.017297] [ T1175] ? irqtime_account_irq+0xe4/0x330
[ 76.018350] [ T1175] ? lockdep_softirqs_on+0xc3/0x140
[ 76.019355] [ T1175] ? __irq_exit_rcu+0x126/0x240
[ 76.020361] [ T1175] ? handle_softirqs+0x6c5/0x790
[ 76.021380] [ T1175] ? __pfx_handle_softirqs+0x10/0x10
[ 76.022421] [ T1175] ? irqtime_account_irq+0x1a2/0x330
[ 76.023426] [ T1175] ? lockdep_hardirqs_on_prepare+0xce/0x1b0
[ 76.024526] [ T1175] ? irqentry_exit+0xe2/0x6a0
[ 76.025475] [ T1175] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 76.026569] [ T1175] RIP: 0033:0x7f727d48df89
[ 76.027485] [ T1175] Code: ff 31 c9 48 89 88 20 06 00 00 31 c0 87 07 83 e8 01 7f 19 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 31 ff b8 3c 00 00 00 0f 05 <eb> f5 89 95 74 ff ff ff e8 9a d0 ff ff 83 bd 74 ff ff ff 01 0f 85
[ 76.030452] [ T1175] RSP: 002b:00007f72777fbd30 EFLAGS: 00000246 ORIG_RAX: 000000000000003c
[ 76.031767] [ T1175] RAX: ffffffffffffffda RBX: 00007f72777fc6c0 RCX: 00007f727d48df89
[ 76.033032] [ T1175] RDX: 0000000000000000 RSI: 0000000000800000 RDI: 0000000000000000
[ 76.034377] [ T1175] RBP: 00007f72777fbdf0 R08: 00000000dd4d2955 R09: 0000000000000000
[ 76.035605] [ T1175] R10: 0000000000000008 R11: 0000000000000246 R12: 00007f72777fc6c0
[ 76.036884] [ T1175] R13: 00007ffd89867320 R14: 00007f72777fccdc R15: 00007ffd89867427
[ 76.038169] [ T1175] </TASK>
[ 76.038894] [ T1175] irq event stamp: 116
[ 76.039771] [ T1175] hardirqs last enabled at (115): [<ffffffff941114d4>] _raw_spin_unlock_irq+0x24/0x50
[ 76.041167] [ T1175] hardirqs last disabled at (116): [<ffffffff941111e2>] _raw_spin_lock_irq+0x52/0x60
[ 76.042569] [ T1175] softirqs last enabled at (100): [<ffffffff9151adc6>] __irq_exit_rcu+0x126/0x240
[ 76.043945] [ T1175] softirqs last disabled at (63): [<ffffffff9151adc6>] __irq_exit_rcu+0x126/0x240
[ 76.045320] [ T1175] ---[ end trace 0000000000000000 ]---
[ 76.046319] [ T1175] ==================================================================
[ 76.047489] [ T1175] BUG: KASAN: use-after-free in sched_mm_cid_exit+0x27c/0x530
[ 76.048669] [ T1175] Write of size 8 at addr ffff88813d558b90 by task cryptsetup/1175

[ 76.050476] [ T1175] CPU: 3 UID: 0 PID: 1175 Comm: cryptsetup Tainted: G W 6.19.0+ #387 PREEMPT(voluntary)
[ 76.050480] [ T1175] Tainted: [W]=WARN
[ 76.050481] [ T1175] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-4.fc42 04/01/2014
[ 76.050483] [ T1175] Call Trace:
[ 76.050484] [ T1175] <TASK>
[ 76.050486] [ T1175] dump_stack_lvl+0x6a/0x90
[ 76.050490] [ T1175] ? sched_mm_cid_exit+0x27c/0x530
[ 76.050492] [ T1175] print_report+0x170/0x4f3
[ 76.050495] [ T1175] ? __virt_addr_valid+0x22e/0x4e0
[ 76.050499] [ T1175] ? sched_mm_cid_exit+0x27c/0x530
[ 76.050501] [ T1175] kasan_report+0xad/0x150
[ 76.050506] [ T1175] ? sched_mm_cid_exit+0x27c/0x530
[ 76.050510] [ T1175] kasan_check_range+0x115/0x1f0
[ 76.050512] [ T1175] sched_mm_cid_exit+0x27c/0x530
[ 76.050515] [ T1175] ? lockdep_hardirqs_on_prepare+0xce/0x1b0
[ 76.050518] [ T1175] do_exit+0x25e/0x24c0
[ 76.050521] [ T1175] ? __pfx___up_read+0x10/0x10
[ 76.050524] [ T1175] ? __pfx_do_exit+0x10/0x10
[ 76.050526] [ T1175] ? lock_release+0x1ab/0x2f0
[ 76.050530] [ T1175] __x64_sys_exit+0x3e/0x50
[ 76.050533] [ T1175] x64_sys_call+0x14fe/0x1500
[ 76.050535] [ T1175] do_syscall_64+0x95/0x540
[ 76.050537] [ T1175] ? __pfx_do_madvise+0x10/0x10
[ 76.050541] [ T1175] ? lockdep_hardirqs_on_prepare+0xce/0x1b0
[ 76.050544] [ T1175] ? trace_hardirqs_on+0x14/0x140
[ 76.050546] [ T1175] ? lockdep_hardirqs_on+0x88/0x130
[ 76.050551] [ T1175] ? kvm_sched_clock_read+0xd/0x20
[ 76.050553] [ T1175] ? sched_clock+0xc/0x30
[ 76.050554] [ T1175] ? sched_clock_cpu+0x65/0x5c0
[ 76.050556] [ T1175] ? __pfx_rcu_do_batch+0x10/0x10
[ 76.050560] [ T1175] ? lockdep_hardirqs_on+0x88/0x130
[ 76.050562] [ T1175] ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 76.050564] [ T1175] ? do_syscall_64+0x1d7/0x540
[ 76.050567] [ T1175] ? irqtime_account_irq+0xe4/0x330
[ 76.050569] [ T1175] ? lockdep_softirqs_on+0xc3/0x140
[ 76.050571] [ T1175] ? __irq_exit_rcu+0x126/0x240
[ 76.050573] [ T1175] ? handle_softirqs+0x6c5/0x790
[ 76.050577] [ T1175] ? __pfx_handle_softirqs+0x10/0x10
[ 76.050579] [ T1175] ? irqtime_account_irq+0x1a2/0x330
[ 76.050582] [ T1175] ? lockdep_hardirqs_on_prepare+0xce/0x1b0
[ 76.050584] [ T1175] ? irqentry_exit+0xe2/0x6a0
[ 76.050587] [ T1175] entry_SYSCALL_64_after_hwframe+0x76/0x7e
[ 76.050589] [ T1175] RIP: 0033:0x7f727d48df89
[ 76.050591] [ T1175] Code: ff 31 c9 48 89 88 20 06 00 00 31 c0 87 07 83 e8 01 7f 19 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 31 ff b8 3c 00 00 00 0f 05 <eb> f5 89 95 74 ff ff ff e8 9a d0 ff ff 83 bd 74 ff ff ff 01 0f 85
[ 76.050593] [ T1175] RSP: 002b:00007f72777fbd30 EFLAGS: 00000246 ORIG_RAX: 000000000000003c
[ 76.050596] [ T1175] RAX: ffffffffffffffda RBX: 00007f72777fc6c0 RCX: 00007f727d48df89
[ 76.050597] [ T1175] RDX: 0000000000000000 RSI: 0000000000800000 RDI: 0000000000000000
[ 76.050598] [ T1175] RBP: 00007f72777fbdf0 R08: 00000000dd4d2955 R09: 0000000000000000
[ 76.050600] [ T1175] R10: 0000000000000008 R11: 0000000000000246 R12: 00007f72777fc6c0
[ 76.050601] [ T1175] R13: 00007ffd89867320 R14: 00007f72777fccdc R15: 00007ffd89867427
[ 76.050606] [ T1175] </TASK>

[ 76.100141] [ T1175] The buggy address belongs to the physical page:
[ 76.101101] [ T1175] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0xffff88813d559100 pfn:0x13d558
[ 76.102440] [ T1175] flags: 0x17ffffc0000000(node=0|zone=2|lastcpupid=0x1fffff)
[ 76.103496] [ T1175] raw: 0017ffffc0000000 ffffea0004edd808 ffffea0004f85008 0000000000000000
[ 76.104692] [ T1175] raw: ffff88813d559100 0000000000070000 00000000ffffffff 0000000000000000
[ 76.105893] [ T1175] page dumped because: kasan: bad access detected

[ 76.107458] [ T1175] Memory state around the buggy address:
[ 76.108369] [ T1175] ffff88813d558a80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 76.109509] [ T1175] ffff88813d558b00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 76.110672] [ T1175] >ffff88813d558b80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 76.111823] [ T1175] ^
[ 76.112661] [ T1175] ffff88813d558c00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 76.113829] [ T1175] ffff88813d558c80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
[ 76.115000] [ T1175] ==================================================================
[ 76.116174] [ T1175] Disabling lock debugging due to kernel taint
[ 81.299309] [ T1577] device-mapper: zone: dm-0 using emulated zone append
[ 81.659065] [ C0] hrtimer: interrupt took 1305020 ns

Shinichiro Kawasaki

Feb 10, 2026, 9:55:26 AM
to Thomas Gleixner, LKML, Ihor Solodrai, Shrikanth Hegde, Peter Zijlstra, Mathieu Desnoyers, Michael Jeanson, Andrey Ryabinin, Alexander Potapenko, kasa...@googlegroups.com
On Feb 10, 2026 / 14:33, Thomas Gleixner wrote:
[...]
Thomas, the fix worked! I applied the patch on top of the v6.19 kernel, and
the KASAN report is no longer observed. I confirmed this on my two test nodes.
Thank you very much for the swift fix :)

In case the patch is posted formally,

Tested-by: Shin'ichiro Kawasaki <shinichir...@wdc.com>

P.S. I am done working for tonight. If a response from me is needed, I will
reply tomorrow.

Thomas Gleixner

Feb 10, 2026, 11:21:00 AM
to Shinichiro Kawasaki, Linus Torvalds, LKML, Ihor Solodrai, Shrikanth Hegde, Peter Zijlstra, Mathieu Desnoyers, Michael Jeanson, Andrey Ryabinin, Alexander Potapenko, kasa...@googlegroups.com
Shinichiro reported a KASAN UAF, which is actually an out of bounds access
in the MMCID management code.

   CPU0                                    CPU1
                                           T1 runs in userspace
   T0: fork(T4) -> Switch to per CPU CID mode
         fixup() set MM_CID_TRANSIT on T1/CPU1
   T4 exit()
   T3 exit()
   T2 exit()
                                           T1 exit() switch to per task mode
                                            ---> Out of bounds access.

As T1 has not scheduled after T0 set the TRANSIT bit, it exits with the
TRANSIT bit set. sched_mm_cid_remove_user() clears the TRANSIT bit in
the task and drops the CID, but it does not touch the per CPU storage.
That's functionally correct because a CID is only owned by the CPU when
the ONCPU bit is set, which is mutually exclusive with the TRANSIT flag.

Now sched_mm_cid_exit() assumes that the CID is CPU owned because the
prior mode was per CPU. It invokes mm_drop_cid_on_cpu() which clears the
not set ONCPU bit and then invokes clear_bit() with an insanely large
bit number because TRANSIT is set (bit 29).

Prevent that by actually validating that the CID is CPU owned in
mm_drop_cid_on_cpu().
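
The effect of that validation can be sketched in user space as follows. This is a toy model under stated assumptions, not the kernel implementation: `struct sketch_mm`, the 64-entry pool, and the `sketch_` names are hypothetical, and ONCPU as bit 30 is assumed (TRANSIT as bit 29 is stated above).

```c
#include <stdbool.h>

#define SKETCH_CID_ONCPU	(1U << 30)	/* assumed flag position */
#define SKETCH_CID_TRANSIT	(1U << 29)	/* bit 29, as stated above */
#define SKETCH_MAX_CIDS		64U		/* toy pool size */

/* Hypothetical stand-in for the CID pool: one bit per allocated CID */
struct sketch_mm {
	unsigned long long cid_mask;
};

/*
 * Guarded drop: only touch the pool when the CPU really owns the CID.
 * Returns true when a CID was released back into the pool.
 */
static bool sketch_drop_cid_on_cpu(struct sketch_mm *mm, unsigned int *pcpu_cid)
{
	if (!(*pcpu_cid & SKETCH_CID_ONCPU))
		return false;	/* e.g. TRANSIT set: leave storage and pool alone */

	*pcpu_cid &= ~SKETCH_CID_ONCPU;
	/* Bound check belongs to this toy bitmap only, not to the patch */
	if (*pcpu_cid < SKETCH_MAX_CIDS)
		mm->cid_mask &= ~(1ULL << *pcpu_cid);
	return true;
}
```

With a per CPU value of (TRANSIT | 3) the function returns false and neither the stored value nor the pool changes. With (ONCPU | 2) it clears bit 2 in the pool and leaves the plain CID 2 in the per CPU storage.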

Fixes: 007d84287c74 ("sched/mmcid: Drop per CPU CID immediately when switching to per task mode")
Reported-by: Shinichiro Kawasaki <shinichir...@wdc.com>
Signed-off-by: Thomas Gleixner <tg...@kernel.org>
Tested-by: Shinichiro Kawasaki <shinichir...@wdc.com>
Cc: sta...@vger.kernel.org
Closes: https://lore.kernel.org/aYsZrixn9b6s_2zL@shinmob
---

Linus, can you please take that directly?

---
kernel/sched/core.c | 7 +++----
kernel/sched/sched.h | 6 ++++--
2 files changed, 7 insertions(+), 6 deletions(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -10729,10 +10729,9 @@ void sched_mm_cid_exit(struct task_struc
return;
/*
* Mode change. The task has the CID unset
- * already. The CPU CID is still valid and
- * does not have MM_CID_TRANSIT set as the
- * mode change has just taken effect under
- * mm::mm_cid::lock. Drop it.
+ * already and dealt with an eventually set
+ * TRANSIT bit. If the CID is owned by the CPU
+ * then drop it.
*/
mm_drop_cid_on_cpu(mm, this_cpu_ptr(mm->mm_cid.pcpu));
}
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -3762,8 +3762,10 @@ static __always_inline void mm_unset_cid

Mathieu Desnoyers

Feb 10, 2026, 11:28:25 AM
to Thomas Gleixner, Shinichiro Kawasaki, Linus Torvalds, LKML, Ihor Solodrai, Shrikanth Hegde, Peter Zijlstra, Michael Jeanson, Andrey Ryabinin, Alexander Potapenko, kasa...@googlegroups.com
On 2026-02-10 11:20, Thomas Gleixner wrote:
> Shinichiro reported a KASAN UAF, which is actually an out of bounds access
> in the MMCID management code.
[...]
>
> Fixes: 007d84287c74 ("sched/mmcid: Drop per CPU CID immediately when switching to per task mode")
> Reported-by: Shinichiro Kawasaki <shinichir...@wdc.com>
> Signed-off-by: Thomas Gleixner <tg...@kernel.org>
> Tested-by: Shinichiro Kawasaki <shinichir...@wdc.com>
> Cc: sta...@vger.kernel.org
> Closes: https://lore.kernel.org/aYsZrixn9b6s_2zL@shinmob
> ---
>
> Linus, can you please take that directly?

Reviewed-by: Mathieu Desnoyers <mathieu....@efficios.com>

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com

Takashi Iwai

Feb 11, 2026, 5:33:09 AM
to Thomas Gleixner, Shinichiro Kawasaki, Linus Torvalds, LKML, Ihor Solodrai, Shrikanth Hegde, Peter Zijlstra, Mathieu Desnoyers, Michael Jeanson, Andrey Ryabinin, Alexander Potapenko, kasa...@googlegroups.com
FWIW, I actually hit this bug yesterday on my laptop with the 6.19 kernel,
so it's not only theoretical.

---- 8< ----
Feb 10 12:35:20 valkyrie kernel: BUG: unable to handle page fault for address: ffff8ec348b322d0
Feb 10 12:35:20 valkyrie kernel: #PF: supervisor write access in kernel mode
Feb 10 12:35:20 valkyrie kernel: #PF: error_code(0x0003) - permissions violation
Feb 10 12:35:20 valkyrie kernel: PGD 345801067 P4D 345801067 PUD 107790063 PMD 146391063 PTE 8000000148b32121
Feb 10 12:35:20 valkyrie kernel: Oops: Oops: 0003 [#1] SMP NOPTI
Feb 10 12:35:20 valkyrie kernel: CPU: 5 UID: 1000 PID: 17173 Comm: git Tainted: G E 6.19.0-test+ #679 PREEMPT(voluntary) 18755027502f5b378a0509f6d0a6ba52d8674d8b
Feb 10 12:35:20 valkyrie kernel: Tainted: [E]=UNSIGNED_MODULE
Feb 10 12:35:20 valkyrie kernel: Hardware name: LENOVO 21M2S03K00/21M2S03K00, BIOS R2NET42W (1.16 ) 10/10/2025
Feb 10 12:35:20 valkyrie kernel: RIP: 0010:sched_mm_cid_exit+0xdf/0x1f0
Feb 10 12:35:20 valkyrie kernel: Code: 48 03 05 8c e9 48 02 8b 08 81 e1 ff ff ff bf 89 08 8b 05 34 74 b7 01 83 c0 3f c1 e8 03 25 f8 ff ff 1f 48 8d 84 43 c0 06 00 00 <f0> 48 0f b3 08 48 81 fe ff ef ff ff 77 08 48 89 d7 e8 4b a7 cc 00
Feb 10 12:35:20 valkyrie kernel: RSP: 0018:ffffd4358bea3c08 EFLAGS: 00010002
Feb 10 12:35:20 valkyrie kernel: RAX: ffff8ec344b322d0 RBX: ffff8ec344b31c00 RCX: 0000000020000008
Feb 10 12:35:20 valkyrie kernel: RDX: ffff8ec344b31d10 RSI: ffff8ec344b31d0f RDI: 0000000000000007
Feb 10 12:35:20 valkyrie kernel: RBP: 0000000000000000 R08: 0000000000000010 R09: 0000000000000001
Feb 10 12:35:20 valkyrie kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Feb 10 12:35:20 valkyrie kernel: R13: ffff8ec370e76300 R14: ffff8ec30c100000 R15: 0000000000000000
Feb 10 12:35:20 valkyrie kernel: FS: 00007f7e95e956c0(0000) GS:ffff8ed292088000(0000) knlGS:0000000000000000
Feb 10 12:35:20 valkyrie kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 10 12:35:20 valkyrie kernel: CR2: ffff8ec348b322d0 CR3: 0000000108f7b000 CR4: 0000000000f50ef0
Feb 10 12:35:20 valkyrie kernel: PKRU: 55555554
Feb 10 12:35:20 valkyrie kernel: Call Trace:
Feb 10 12:35:20 valkyrie kernel: <TASK>
Feb 10 12:35:20 valkyrie kernel: do_exit+0xad/0xa70
Feb 10 12:35:20 valkyrie kernel: __x64_sys_exit+0x1b/0x20
Feb 10 12:35:20 valkyrie kernel: x64_sys_call+0x1502/0x1510
Feb 10 12:35:20 valkyrie kernel: do_syscall_64+0x81/0x650
Feb 10 12:35:20 valkyrie kernel: ? do_syscall_64+0x81/0x650
Feb 10 12:35:20 valkyrie kernel: ? __do_sys_newfstatat+0x32/0x60
Feb 10 12:35:20 valkyrie kernel: ? do_syscall_64+0x81/0x650
Feb 10 12:35:20 valkyrie kernel: ? do_syscall_64+0x81/0x650
Feb 10 12:35:20 valkyrie kernel: ? do_syscall_64+0x81/0x650
Feb 10 12:35:20 valkyrie kernel: ? do_syscall_64+0x81/0x650
Feb 10 12:35:20 valkyrie kernel: ? do_syscall_64+0x81/0x650
Feb 10 12:35:20 valkyrie kernel: ? do_syscall_64+0x81/0x650
Feb 10 12:35:20 valkyrie kernel: ? __irq_exit_rcu+0x3d/0xe0
Feb 10 12:35:20 valkyrie kernel: entry_SYSCALL_64_after_hwframe+0x76/0x7e
Feb 10 12:35:20 valkyrie kernel: RIP: 0033:0x7f7ea21de556
Feb 10 12:35:20 valkyrie kernel: Code: 8b 44 24 08 31 c9 48 89 88 20 06 00 00 31 c0 87 03 83 e8 01 7f 16 ba 3c 00 00 00 66 0f 1f 84 00 00 00 00 00 31 ff 89 d0 0f 05 <eb> f8 48 89 df e8 be cd ff ff 83 ed 01 0f 85 aa fd ff ff eb d7 48
Feb 10 12:35:20 valkyrie kernel: RSP: 002b:00007f7e95e94ee0 EFLAGS: 00000246 ORIG_RAX: 000000000000003c
Feb 10 12:35:20 valkyrie kernel: RAX: ffffffffffffffda RBX: 00007f7e95e95cdc RCX: 00007f7ea21de556
Feb 10 12:35:20 valkyrie kernel: RDX: 000000000000003c RSI: 0000000000800000 RDI: 0000000000000000
Feb 10 12:35:20 valkyrie kernel: RBP: 00007f7e95695000 R08: 00000000000000ca R09: 0000000000000007
Feb 10 12:35:20 valkyrie kernel: R10: 0000000000000008 R11: 0000000000000246 R12: 0000000000801000
Feb 10 12:35:20 valkyrie kernel: R13: 0000000000000000 R14: 00007ffcf5038cc0 R15: 00007f7e95695000
Feb 10 12:35:20 valkyrie kernel: </TASK>
Feb 10 12:35:20 valkyrie kernel: Modules linked in: tun(E) ccm(E) michael_mic(E) rfcomm(E) snd_seq_dummy(E) snd_hrtimer(E) snd_seq(E) snd_seq_device(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) af_packet(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) cmac(E) algif_hash(E) algif_skcipher(E) af_alg(E) ip_set(E) bnep(E) binfmt_misc(E) nls_iso8859_1(E) nls_cp437(E) vfat(E) fat(E) qrtr_mhi(E) snd_acp_legacy_mach(E) snd_acp_mach(E) snd_soc_nau8821(E) snd_acp3x_rn(E) snd_acp70(E) snd_acp_i2s(E) snd_acp_pdm(E) snd_soc_dmic(E) snd_acp_pcm(E) snd_sof_amd_acp70(E) snd_sof_amd_acp63(E) snd_sof_amd_vangogh(E) snd_sof_amd_rembrandt(E) snd_sof_amd_renoir(E) snd_sof_amd_acp(E) snd_sof_pci(E) snd_sof_xtensa_dsp(E) snd_ctl_led(E) snd_sof(E) snd_hda_codec_alc269(E) snd_sof_utils(E) snd_hda_scodec_component(E) snd_pci_ps(E) snd_hda_codec_realtek_lib(E) snd_soc_acpi_amd_match
(E) snd_soc_acpi_amd_sdca_quirks(E)
Feb 10 12:35:20 valkyrie kernel: snd_hda_codec_generic(E) snd_soc_sdca(E) snd_hda_codec_atihdmi(E) snd_hda_codec_hdmi(E) qrtr(E) snd_soc_core(E) intel_rapl_msr(E) amd_atl(E) snd_hda_intel(E) snd_compress(E) intel_rapl_common(E) btusb(E) snd_rpl_pci_acp6x(E) btrtl(E) ath12k(E) snd_hda_codec(E) snd_acp_pci(E) btintel(E) snd_intel_dspcfg(E) btbcm(E) uvcvideo(E) snd_amd_acpi_mach(E) mhi(E) snd_hda_core(E) btmtk(E) videobuf2_vmalloc(E) snd_acp_legacy_common(E) kvm_amd(E) videobuf2_memops(E) qmi_helpers(E) snd_pci_acp6x(E) spd5118(E) bluetooth(E) snd_hwdep(E) uvc(E) kvm(E) snd_pci_acp5x(E) videobuf2_v4l2(E) mac80211(E) amd_pmf(E) think_lmi(E) snd_pcm(E) thinkpad_acpi(E) videodev(E) snd_rn_pci_acp3x(E) irqbypass(E) amdtee(E) libarc4(E) snd_acp_config(E) sparse_keymap(E) snd_timer(E) snd_soc_acpi(E) i2c_piix4(E) videobuf2_common(E) platform_profile(E) pcspkr(E) mc(E) wmi_bmof(E) firmware_attributes_class(E) snd(E) tiny_power_button(E) cfg80211(E) snd_pci_acp3x(E) soundcore(E) k10temp(E) i2c
_smbus(E) thermal(E) battery(E) rfkill(E) ac(E)
Feb 10 12:35:20 valkyrie kernel: amd_sfh(E) fan(E) button(E) tee(E) joydev(E) amd_pmc(E) loop(E) fuse(E) dm_mod(E) efi_pstore(E) dmi_sysfs(E) ip_tables(E) x_tables(E) ext4(E) mbcache(E) jbd2(E) amdgpu(E) amdxcp(E) ucsi_acpi(E) i2c_algo_bit(E) drm_ttm_helper(E) typec_ucsi(E) ttm(E) roles(E) drm_exec(E) drm_panel_backlight_quirks(E) typec(E) drm_suballoc_helper(E) xhci_pci(E) nvme(E) drm_buddy(E) drm_display_helper(E) nvme_core(E) cec(E) hid_multitouch(E) nvme_keyring(E) xhci_hcd(E) video(E) rc_core(E) amdxdna(E) hid_generic(E) nvme_auth(E) ghash_clmulni_intel(E) sp5100_tco(E) gpu_sched(E) usbcore(E) ccp(E) crc16(E) hkdf(E) thunderbolt(E) wmi(E) i2c_hid_acpi(E) i2c_hid(E) serio_raw(E) br_netfilter(E) bridge(E) stp(E) llc(E) nf_tables(E) msr(E) nfnetlink(E) efivarfs(E) aesni_intel(E)
Feb 10 12:35:20 valkyrie kernel: CR2: ffff8ec348b322d0
Feb 10 12:35:20 valkyrie kernel: ---[ end trace 0000000000000000 ]---
Feb 10 12:35:20 valkyrie kernel: RIP: 0010:sched_mm_cid_exit+0xdf/0x1f0
Feb 10 12:35:20 valkyrie kernel: Code: 48 03 05 8c e9 48 02 8b 08 81 e1 ff ff ff bf 89 08 8b 05 34 74 b7 01 83 c0 3f c1 e8 03 25 f8 ff ff 1f 48 8d 84 43 c0 06 00 00 <f0> 48 0f b3 08 48 81 fe ff ef ff ff 77 08 48 89 d7 e8 4b a7 cc 00
Feb 10 12:35:20 valkyrie kernel: RSP: 0018:ffffd4358bea3c08 EFLAGS: 00010002
Feb 10 12:35:20 valkyrie kernel: RAX: ffff8ec344b322d0 RBX: ffff8ec344b31c00 RCX: 0000000020000008
Feb 10 12:35:20 valkyrie kernel: RDX: ffff8ec344b31d10 RSI: ffff8ec344b31d0f RDI: 0000000000000007
Feb 10 12:35:20 valkyrie kernel: RBP: 0000000000000000 R08: 0000000000000010 R09: 0000000000000001
Feb 10 12:35:20 valkyrie kernel: R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
Feb 10 12:35:20 valkyrie kernel: R13: ffff8ec370e76300 R14: ffff8ec30c100000 R15: 0000000000000000
Feb 10 12:35:20 valkyrie kernel: FS: 00007f7e95e956c0(0000) GS:ffff8ed292088000(0000) knlGS:0000000000000000
Feb 10 12:35:20 valkyrie kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 10 12:35:20 valkyrie kernel: CR2: ffff8ec348b322d0 CR3: 0000000108f7b000 CR4: 0000000000f50ef0
Feb 10 12:35:20 valkyrie kernel: PKRU: 55555554
Feb 10 12:35:20 valkyrie kernel: note: git[17173] exited with irqs disabled
Feb 10 12:35:20 valkyrie kernel: note: git[17173] exited with preempt_count 1
---- 8< ----

The stack decode showed the very same code path.

% scripts/faddr2line vmlinux 'sched_mm_cid_exit+0xdf'
sched_mm_cid_exit+0xdf/0x1f0:
arch_clear_bit at arch/x86/include/asm/bitops.h:79
(inlined by) clear_bit at include/asm-generic/bitops/instrumented-atomic.h:42
(inlined by) mm_drop_cid at kernel/sched/sched.h:3746
(inlined by) mm_drop_cid_on_cpu at kernel/sched/sched.h:3762
(inlined by) sched_mm_cid_exit at kernel/sched/core.c:10737
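
As a cross-check, the faulting address in the oops above can be reproduced from the register dump. This is a minimal sketch that assumes RAX holds the bitmap base and RCX the bit number, which the `lock btr` encoding (f0 48 0f b3 08) in the Code: line suggests. The helper name is hypothetical.

```c
#include <stdint.h>

/*
 * Address touched by a 64-bit bit operation on bit 'nr' of a bitmap
 * starting at 'base': the word index is nr / 64, eight bytes per word.
 */
static uint64_t sketch_bitop_addr(uint64_t base, uint64_t nr)
{
	return base + (nr / 64) * 8;
}
```

With base = 0xffff8ec344b322d0 (RAX) and nr = 0x20000008 (RCX, CID 8 with bit 29 set), the word offset is 0x4000000 bytes and the result is 0xffff8ec348b322d0, which matches CR2. The write therefore landed 64 MiB past the bitmap.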

This happened only once, and I have not been able to reproduce it since, though.
It must have been very bad luck yesterday.


Takashi

Linus Torvalds

Feb 11, 2026, 4:00:48 PM
to Thomas Gleixner, Shinichiro Kawasaki, LKML, Ihor Solodrai, Shrikanth Hegde, Peter Zijlstra, Mathieu Desnoyers, Michael Jeanson, Andrey Ryabinin, Alexander Potapenko, kasa...@googlegroups.com
On Tue, 10 Feb 2026 at 08:21, Thomas Gleixner <tg...@kernel.org> wrote:
>
> Linus, can you please take that directly?

Done. Thanks,

Linus