[syzbot] [mm?] [arch?] BUG: sleeping function called from invalid context in __tlb_batch_free_encoded_pages


syzbot

Apr 30, 2026, 3:21:37 AM

to ak...@linux-foundation.org, aneesh...@kernel.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, npi...@gmail.com, pet...@infradead.org, syzkall...@googlegroups.com, wi...@kernel.org

Hello,

syzbot found the following issue on:

HEAD commit: dca922e019dd Merge tag 'xsa48x-7.1-tag' of git://git.kerne..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11cd6b6c580000
kernel config: https://syzkaller.appspot.com/x/.config?x=59da38148f3a3d24
dashboard link: https://syzkaller.appspot.com/bug?extid=a169a27b0538ba43e5d3
compiler: gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-dca922e0.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/7b447b1b93a9/vmlinux-dca922e0.xz
kernel image: https://storage.googleapis.com/syzbot-assets/af7830f5dabf/bzImage-dca922e0.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+a169a2...@syzkaller.appspotmail.com

BUG: sleeping function called from invalid context at mm/mmu_gather.c:142
in_atomic(): 0, irqs_disabled(): 0, non_block: 0, pid: 5677, name: rm
preempt_count: 0, expected: 0
RCU nest depth: 1, expected: 0
2 locks held by rm/5677:
#0: ffff888022c20338 (&mm->mmap_lock){++++}-{4:4}, at: mmap_write_lock include/linux/mmap_lock.h:536 [inline]
#0: ffff888022c20338 (&mm->mmap_lock){++++}-{4:4}, at: exit_mmap+0x22c/0xa10 mm/mmap.c:1308
#1: ffffffff8e7e54e0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire.constprop.0+0x7/0x30 include/linux/rcupdate.h:300
CPU: 1 UID: 0 PID: 5677 Comm: rm Not tainted syzkaller #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x100/0x190 lib/dump_stack.c:120
__might_resched.cold+0x1ec/0x232 kernel/sched/core.c:9162
__tlb_batch_free_encoded_pages+0x11e/0x280 mm/mmu_gather.c:142
tlb_batch_pages_flush mm/mmu_gather.c:151 [inline]
tlb_flush_mmu_free mm/mmu_gather.c:417 [inline]
tlb_flush_mmu mm/mmu_gather.c:424 [inline]
tlb_finish_mmu+0x1b0/0x810 mm/mmu_gather.c:549
exit_mmap+0x454/0xa10 mm/mmap.c:1313
__mmput+0x12a/0x410 kernel/fork.c:1178
mmput+0x67/0x80 kernel/fork.c:1201
exit_mm kernel/exit.c:581 [inline]
do_exit+0x833/0x2a60 kernel/exit.c:963
do_group_exit+0xd5/0x2a0 kernel/exit.c:1117
__do_sys_exit_group kernel/exit.c:1128 [inline]
__se_sys_exit_group kernel/exit.c:1126 [inline]
__x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1126
x64_sys_call+0x102c/0x1530 arch/x86/include/generated/asm/syscalls_64.h:232
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x10b/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f9fc102d6c5
Code: Unable to access opcode bytes at 0x7f9fc102d69b.
RSP: 002b:00007ffd51104bf8 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007f9fc112efe8 RCX: 00007f9fc102d6c5
RDX: 00000000000000e7 RSI: ffffffffffffff88 RDI: 0000000000000000
RBP: 0000000000000001 R08: 00007ffd51104b88 R09: 0000000000000000
R10: 00007ffd51104a20 R11: 0000000000000202 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f9fc112d680 R15: 00007f9fc112f000
</TASK>

====================================
WARNING: rm/5677 still has locks held!
syzkaller #0 Tainted: G W
------------------------------------
1 lock held by rm/5677:
#0: ffffffff8e7e54e0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire.constprop.0+0x7/0x30 include/linux/rcupdate.h:300

stack backtrace:
CPU: 1 UID: 0 PID: 5677 Comm: rm Tainted: G W syzkaller #0 PREEMPT(full)
Tainted: [W]=WARN
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x100/0x190 lib/dump_stack.c:120
print_held_locks_bug kernel/locking/lockdep.c:6752 [inline]
debug_check_no_locks_held+0x90/0xa0 kernel/locking/lockdep.c:6760
do_exit+0x13ea/0x2a60 kernel/exit.c:997
do_group_exit+0xd5/0x2a0 kernel/exit.c:1117
__do_sys_exit_group kernel/exit.c:1128 [inline]
__se_sys_exit_group kernel/exit.c:1126 [inline]
__x64_sys_exit_group+0x3e/0x50 kernel/exit.c:1126
x64_sys_call+0x102c/0x1530 arch/x86/include/generated/asm/syscalls_64.h:232
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0x10b/0xf80 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f9fc102d6c5
Code: Unable to access opcode bytes at 0x7f9fc102d69b.
RSP: 002b:00007ffd51104bf8 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007f9fc112efe8 RCX: 00007f9fc102d6c5
RDX: 00000000000000e7 RSI: ffffffffffffff88 RDI: 0000000000000000
RBP: 0000000000000001 R08: 00007ffd51104b88 R09: 0000000000000000
R10: 00007ffd51104a20 R11: 0000000000000202 R12: 0000000000000000
R13: 0000000000000000 R14: 00007f9fc112d680 R15: 00007f9fc112f000
</TASK>

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Will Deacon

8:05 AM

to syzbot, ak...@linux-foundation.org, aneesh...@kernel.org, linux...@vger.kernel.org, linux-...@vger.kernel.org, linu...@kvack.org, npi...@gmail.com, pet...@infradead.org, syzkall...@googlegroups.com, da...@kernel.org, qi.z...@linux.dev

I finally got a chance to look at this one...

... so we're getting deep in the gather code tearing down an mm, where
we free the unmapped pages but we're somehow inside an RCU read-side
critical section.

The last report on the syzbot dashboard is from 30th April, which
coincides with 99ebc509eef5 ("mm: memcontrol: fix rcu unbalance in
get_non_dying_memcg_end()") landing upstream. Unfortunately, there's no
reproducer available to test that concretely but it looks like we can
end up in there via the page freeing path. So hopefully this is fixed.

As an aside, it's a bit of a pity that the rcu_read_lock() callsite is
identified only as the useless rcu_lock_acquire.constprop.0() function
in this backtrace.

Will
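
For readers unfamiliar with this failure mode, here is a minimal userspace sketch of how an unbalanced RCU read-side section produces the "RCU nest depth: 1, expected: 0" line in the report. It is a model under stated assumptions, not kernel code: the counter, model_helper() and model_teardown() are hypothetical names invented for illustration, and nothing here reflects the actual contents of get_non_dying_memcg_end() or of the fix commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Userspace model only: this counter stands in for the per-task RCU nesting depth. */
static int rcu_nest_depth;

static void model_rcu_read_lock(void)
{
    rcu_nest_depth++;
}

static void model_rcu_read_unlock(void)
{
    assert(rcu_nest_depth > 0); /* an unlock without a lock would be a different bug */
    rcu_nest_depth--;
}

/* Stand-in for the might_sleep() debug check; returns true if it would complain. */
static bool model_might_sleep(const char *where)
{
    if (rcu_nest_depth == 0)
        return false;
    printf("BUG: sleeping function called from invalid context at %s\n", where);
    printf("RCU nest depth: %d, expected: 0\n", rcu_nest_depth);
    return true;
}

/* Hypothetical helper: one return path forgets to leave the read-side section. */
static void model_helper(bool take_buggy_path)
{
    model_rcu_read_lock();
    if (take_buggy_path)
        return; /* bug: missing model_rcu_read_unlock() */
    model_rcu_read_unlock();
}

/* Hypothetical teardown: call the helper, then reach a point that may sleep. */
static bool model_teardown(bool take_buggy_path)
{
    rcu_nest_depth = 0;
    model_helper(take_buggy_path);
    return model_might_sleep("model_teardown");
}
```

Calling model_teardown(false) stays silent, while model_teardown(true) trips the check at the later call site rather than inside the helper that leaked the lock. The depth also remains at 1 afterwards, which in this model corresponds to the second "still has locks held" warning printed at exit.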
