[syzbot] [kvm?] INFO: task hung in kvm_swap_active_memslots (2)


syzbot

5:44 AM
to k...@vger.kernel.org, linux-...@vger.kernel.org, pbon...@redhat.com, syzkall...@googlegroups.com
Hello,

syzbot found the following issue on:

HEAD commit: 3a8660878839 Linux 6.18-rc1
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=160a05e2580000
kernel config: https://syzkaller.appspot.com/x/.config?x=e854293d7f44b5a5
dashboard link: https://syzkaller.appspot.com/bug?extid=5c566b850d6ab6f0427a
compiler: gcc (Debian 12.2.0-14+deb12u1) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/87a66406ce1a/disk-3a866087.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/7c3300da5269/vmlinux-3a866087.xz
kernel image: https://storage.googleapis.com/syzbot-assets/b4fcefdaf57b/bzImage-3a866087.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+5c566b...@syzkaller.appspotmail.com

INFO: task syz.2.1185:11790 blocked for more than 143 seconds.
Not tainted syzkaller #0
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
task:syz.2.1185 state:D stack:25976 pid:11790 tgid:11789 ppid:5836 task_flags:0x400140 flags:0x00080002
Call Trace:
<TASK>
context_switch kernel/sched/core.c:5325 [inline]
__schedule+0x1190/0x5de0 kernel/sched/core.c:6929
__schedule_loop kernel/sched/core.c:7011 [inline]
schedule+0xe7/0x3a0 kernel/sched/core.c:7026
kvm_swap_active_memslots+0x2ea/0x7d0 virt/kvm/kvm_main.c:1642
kvm_activate_memslot virt/kvm/kvm_main.c:1786 [inline]
kvm_create_memslot virt/kvm/kvm_main.c:1852 [inline]
kvm_set_memslot+0xd3b/0x1380 virt/kvm/kvm_main.c:1964
kvm_set_memory_region+0xe53/0x1610 virt/kvm/kvm_main.c:2120
kvm_set_internal_memslot+0x9f/0xe0 virt/kvm/kvm_main.c:2143
__x86_set_memory_region+0x2f6/0x740 arch/x86/kvm/x86.c:13242
kvm_alloc_apic_access_page+0xc5/0x140 arch/x86/kvm/lapic.c:2788
vmx_vcpu_create+0x503/0xbd0 arch/x86/kvm/vmx/vmx.c:7599
kvm_arch_vcpu_create+0x688/0xb20 arch/x86/kvm/x86.c:12706
kvm_vm_ioctl_create_vcpu virt/kvm/kvm_main.c:4207 [inline]
kvm_vm_ioctl+0xfec/0x3fd0 virt/kvm/kvm_main.c:5158
vfs_ioctl fs/ioctl.c:51 [inline]
__do_sys_ioctl fs/ioctl.c:597 [inline]
__se_sys_ioctl fs/ioctl.c:583 [inline]
__x64_sys_ioctl+0x18e/0x210 fs/ioctl.c:583
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xcd/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f3c9978eec9
RSP: 002b:00007f3c9a676038 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00007f3c999e5fa0 RCX: 00007f3c9978eec9
RDX: 0000000000000000 RSI: 000000000000ae41 RDI: 0000000000000003
RBP: 00007f3c99811f91 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007f3c999e6038 R14: 00007f3c999e5fa0 R15: 00007ffda33577e8
</TASK>

Showing all locks held in the system:
1 lock held by khungtaskd/31:
#0: ffffffff8e3c42e0 (rcu_read_lock){....}-{1:3}, at: rcu_lock_acquire include/linux/rcupdate.h:331 [inline]
#0: ffffffff8e3c42e0 (rcu_read_lock){....}-{1:3}, at: rcu_read_lock include/linux/rcupdate.h:867 [inline]
#0: ffffffff8e3c42e0 (rcu_read_lock){....}-{1:3}, at: debug_show_all_locks+0x36/0x1c0 kernel/locking/lockdep.c:6775
2 locks held by getty/8058:
#0: ffff88803440d0a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x24/0x80 drivers/tty/tty_ldisc.c:243
#1: ffffc9000e0cd2f0 (&ldata->atomic_read_lock){+.+.}-{4:4}, at: n_tty_read+0x41b/0x14f0 drivers/tty/n_tty.c:2222
2 locks held by syz.2.1185/11790:
#0: ffff888032d640a8 (&kvm->slots_lock){+.+.}-{4:4}, at: class_mutex_constructor include/linux/mutex.h:228 [inline]
#0: ffff888032d640a8 (&kvm->slots_lock){+.+.}-{4:4}, at: kvm_alloc_apic_access_page+0x27/0x140 arch/x86/kvm/lapic.c:2782
#1: ffff888032d64138 (&kvm->slots_arch_lock){+.+.}-{4:4}, at: kvm_set_memslot+0x34/0x1380 virt/kvm/kvm_main.c:1915
4 locks held by kworker/u8:44/13722:
#0: ffff88801ba9f148 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work+0x12a2/0x1b70 kernel/workqueue.c:3238
#1: ffffc9000b6cfd00 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work+0x929/0x1b70 kernel/workqueue.c:3239
#2: ffffffff900e8630 (pernet_ops_rwsem){++++}-{4:4}, at: cleanup_net+0xad/0x8b0 net/core/net_namespace.c:669
#3: ffffffff8e3cf878 (rcu_state.exp_mutex){+.+.}-{4:4}, at: exp_funnel_lock+0x1a3/0x3c0 kernel/rcu/tree_exp.h:343
4 locks held by syz-executor/14037:
#0: ffff888056654dc8 (&hdev->req_lock){+.+.}-{4:4}, at: hci_dev_do_close+0x26/0x90 net/bluetooth/hci_core.c:499
#1: ffff8880566540b8 (&hdev->lock){+.+.}-{4:4}, at: hci_dev_close_sync+0x3ae/0x11d0 net/bluetooth/hci_sync.c:5291
#2: ffffffff90371248 (hci_cb_list_lock){+.+.}-{4:4}, at: hci_disconn_cfm include/net/bluetooth/hci_core.h:2118 [inline]
#2: ffffffff90371248 (hci_cb_list_lock){+.+.}-{4:4}, at: hci_conn_hash_flush+0xbb/0x260 net/bluetooth/hci_conn.c:2602
#3: ffff88803179c338 (&conn->lock#2){+.+.}-{4:4}, at: l2cap_conn_del+0x80/0x730 net/bluetooth/l2cap_core.c:1762
1 lock held by syz.0.1905/15421:
#0: ffffffff900e8630 (pernet_ops_rwsem){++++}-{4:4}, at: copy_net_ns+0x2d6/0x690 net/core/net_namespace.c:576
1 lock held by syz.0.1905/15423:
#0: ffffffff900e8630 (pernet_ops_rwsem){++++}-{4:4}, at: copy_net_ns+0x2d6/0x690 net/core/net_namespace.c:576
2 locks held by syz.1.1908/15436:
#0: ffff88807a444dc8 (&hdev->req_lock){+.+.}-{4:4}, at: hci_dev_do_close+0x26/0x90 net/bluetooth/hci_core.c:499
#1: ffff88807a4440b8 (&hdev->lock){+.+.}-{4:4}, at: hci_dev_close_sync+0x3ae/0x11d0 net/bluetooth/hci_sync.c:5291

=============================================

NMI backtrace for cpu 1
CPU: 1 UID: 0 PID: 31 Comm: khungtaskd Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/02/2025
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:94 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120
nmi_cpu_backtrace+0x27b/0x390 lib/nmi_backtrace.c:113
nmi_trigger_cpumask_backtrace+0x29c/0x300 lib/nmi_backtrace.c:62
trigger_all_cpu_backtrace include/linux/nmi.h:160 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:332 [inline]
watchdog+0xf3f/0x1170 kernel/hung_task.c:495
kthread+0x3c5/0x780 kernel/kthread.c:463
ret_from_fork+0x675/0x7d0 arch/x86/kernel/process.c:158
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite the report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

Alexander Potapenko

6:03 AM
to syzbot, k...@vger.kernel.org, linux-...@vger.kernel.org, pbon...@redhat.com, syzkall...@googlegroups.com, Sean Christopherson
It's worth noting that, in addition to upstream, this bug has been reported by syzkaller on several other kernel branches.
In each report, the hung task was holding the same pair of locks: &kvm->slots_arch_lock and &kvm->slots_lock.

Alexander Potapenko

6:06 AM
to syzbot, k...@vger.kernel.org, linux-...@vger.kernel.org, pbon...@redhat.com, syzkall...@googlegroups.com
On Mon, Nov 17, 2025 at 11:44 AM syzbot
<syzbot+5c566b...@syzkaller.appspotmail.com> wrote:
>

Sean Christopherson

11:54 AM
to Alexander Potapenko, syzbot, k...@vger.kernel.org, linux-...@vger.kernel.org, pbon...@redhat.com, syzkall...@googlegroups.com
Ya, though that's not terribly interesting because kvm_swap_active_memslots()
holds those locks, and the issue is specifically that kvm_swap_active_memslots()
is waiting on kvm->mn_active_invalidate_count to go to zero.
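
For reference, the wait in question looks roughly like this in current upstream
virt/kvm/kvm_main.c (a simplified sketch, not verbatim; details may differ
across branches):

	spin_lock(&kvm->mn_invalidate_lock);
	prepare_to_rcuwait(&kvm->mn_memslots_update_rcuwait);
	/*
	 * Sleep, uninterruptibly and with slots_lock and slots_arch_lock
	 * held, until all in-flight mmu_notifier invalidations have drained.
	 * A steady stream of overlapping invalidations can keep the count
	 * non-zero indefinitely, which is the hang syzbot is hitting.
	 */
	while (kvm->mn_active_invalidate_count) {
		set_current_state(TASK_UNINTERRUPTIBLE);
		spin_unlock(&kvm->mn_invalidate_lock);
		schedule();
		spin_lock(&kvm->mn_invalidate_lock);
	}
	finish_rcuwait(&kvm->mn_memslots_update_rcuwait);
	rcu_assign_pointer(kvm->memslots[as_id], slots);
	spin_unlock(&kvm->mn_invalidate_lock);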

Paolo even called out this possibility in commit 52ac8b358b0c ("KVM: Block memslot
updates across range_start() and range_end()"):

: Losing the rwsem fairness does theoretically allow MMU notifiers to
: block install_new_memslots forever. Note that mm/mmu_notifier.c's own
: retry scheme in mmu_interval_read_begin also uses wait/wake_up
: and is likewise not fair.
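
The other side of that wait is the invalidation bookkeeping in KVM's
mmu_notifier hooks; again a rough sketch from my reading of upstream, not
verbatim:

	/* invalidate_range_start(): block memslot updates for the duration. */
	spin_lock(&kvm->mn_invalidate_lock);
	kvm->mn_active_invalidate_count++;
	spin_unlock(&kvm->mn_invalidate_lock);

	/*
	 * invalidate_range_end(): drop the count and wake the memslot updater
	 * only once *all* invalidations have drained.  If invalidations keep
	 * overlapping, the updater never sees a zero count.
	 */
	spin_lock(&kvm->mn_invalidate_lock);
	if (!WARN_ON_ONCE(!kvm->mn_active_invalidate_count))
		kvm->mn_active_invalidate_count--;
	wake = !kvm->mn_active_invalidate_count;
	spin_unlock(&kvm->mn_invalidate_lock);

	if (wake)
		rcuwait_wake_up(&kvm->mn_memslots_update_rcuwait);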

In every reproducer, the "VMM" process is either getting thrashed by reclaim, or
the process itself is generating a constant stream of mmu_notifier invalidations.

I don't see an easy, or even decent, solution for this. Forcing new invalidations
to wait isn't really an option because in-flight invalidations may be sleepable
(and KVM has zero visibility into the behavior of the invalidator), while new
invalidations may not be sleepable.

And on the KVM side, bailing from kvm_activate_memslot() on a pending signal
isn't an option, because kvm_activate_memslot() must not fail. Hmm, at least,
not without terminating the VM. I guess maybe that's an option? Add a timeout
(maybe with a module param) to the kvm_swap_active_memslots() loop, and then WARN
and kill the VM if the timeout is hit.
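
A very rough sketch of that idea (the module param name, default, and polling
interval below are made up purely for illustration, and bailing out with
invalidations still in flight would only be tolerable because the VM is being
killed anyway):

	/* Hypothetical knob, file scope in virt/kvm/kvm_main.c. */
	static unsigned int memslot_update_timeout_ms = 10000;
	module_param(memslot_update_timeout_ms, uint, 0644);

	/* In the kvm_swap_active_memslots() wait loop: */
	unsigned long deadline = jiffies +
				 msecs_to_jiffies(memslot_update_timeout_ms);

	spin_lock(&kvm->mn_invalidate_lock);
	prepare_to_rcuwait(&kvm->mn_memslots_update_rcuwait);
	while (kvm->mn_active_invalidate_count) {
		if (time_after(jiffies, deadline)) {
			/* WARN and mark the VM dead instead of hanging. */
			KVM_BUG_ON(1, kvm);
			break;
		}
		set_current_state(TASK_UNINTERRUPTIBLE);
		spin_unlock(&kvm->mn_invalidate_lock);
		schedule_timeout(msecs_to_jiffies(100));
		spin_lock(&kvm->mn_invalidate_lock);
	}
	finish_rcuwait(&kvm->mn_memslots_update_rcuwait);
	...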