[syzbot] WARNING in nested_vmx_vmexit

syzbot

Dec 5, 2021, 8:42:24 AM
to b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, jmat...@google.com, jo...@8bytes.org, k...@vger.kernel.org, linux-...@vger.kernel.org, mi...@redhat.com, pbon...@redhat.com, sea...@google.com, syzkall...@googlegroups.com, tg...@linutronix.de, vkuz...@redhat.com, wanp...@tencent.com, x...@kernel.org
Hello,

syzbot found the following issue on:

HEAD commit: 5f58da2befa5 Merge tag 'drm-fixes-2021-12-03-1' of git://a..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=14927309b00000
kernel config: https://syzkaller.appspot.com/x/.config?x=e9ea28d2c3c2c389
dashboard link: https://syzkaller.appspot.com/bug?extid=f1d2136db9c80d4733e8
compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.2

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+f1d213...@syzkaller.appspotmail.com

------------[ cut here ]------------
WARNING: CPU: 0 PID: 21158 at arch/x86/kvm/vmx/nested.c:4548 nested_vmx_vmexit+0x16bd/0x17e0 arch/x86/kvm/vmx/nested.c:4547
Modules linked in:
CPU: 0 PID: 21158 Comm: syz-executor.1 Not tainted 5.16.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:nested_vmx_vmexit+0x16bd/0x17e0 arch/x86/kvm/vmx/nested.c:4547
Code: df e8 17 88 a9 00 e9 b1 f7 ff ff 89 d9 80 e1 07 38 c1 0f 8c 51 eb ff ff 48 89 df e8 4d 87 a9 00 e9 44 eb ff ff e8 63 b3 5d 00 <0f> 0b e9 2e f8 ff ff e8 57 b3 5d 00 0f 0b e9 00 f1 ff ff 89 e9 80
RSP: 0018:ffffc9000439f6e8 EFLAGS: 00010293
RAX: ffffffff8126d4cd RBX: 0000000000000000 RCX: ffff888032290000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000007
RBP: 0000000000000001 R08: ffffffff8126ccf0 R09: ffffed1003cd9808
R10: ffffed1003cd9808 R11: 0000000000000000 R12: ffff88801e6cc000
R13: ffff88802f96e000 R14: dffffc0000000000 R15: 1ffff11005f2dc5d
FS: 0000000000000000(0000) GS:ffff8880b9b00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fb73aecedd8 CR3: 00000000143a4000 CR4: 00000000003526e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
vmx_leave_nested arch/x86/kvm/vmx/nested.c:6220 [inline]
nested_vmx_free_vcpu+0x83/0xc0 arch/x86/kvm/vmx/nested.c:330
vmx_free_vcpu+0x11f/0x2a0 arch/x86/kvm/vmx/vmx.c:6799
kvm_arch_vcpu_destroy+0x6b/0x240 arch/x86/kvm/x86.c:10989
kvm_vcpu_destroy+0x29/0x90 arch/x86/kvm/../../../virt/kvm/kvm_main.c:441
kvm_free_vcpus arch/x86/kvm/x86.c:11426 [inline]
kvm_arch_destroy_vm+0x3ef/0x6b0 arch/x86/kvm/x86.c:11545
kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:1189 [inline]
kvm_put_kvm+0x751/0xe40 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1220
kvm_vcpu_release+0x53/0x60 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3489
__fput+0x3fc/0x870 fs/file_table.c:280
task_work_run+0x146/0x1c0 kernel/task_work.c:164
exit_task_work include/linux/task_work.h:32 [inline]
do_exit+0x705/0x24f0 kernel/exit.c:832
do_group_exit+0x168/0x2d0 kernel/exit.c:929
get_signal+0x1740/0x2120 kernel/signal.c:2852
arch_do_signal_or_restart+0x9c/0x730 arch/x86/kernel/signal.c:868
handle_signal_work kernel/entry/common.c:148 [inline]
exit_to_user_mode_loop kernel/entry/common.c:172 [inline]
exit_to_user_mode_prepare+0x191/0x220 kernel/entry/common.c:207
__syscall_exit_to_user_mode_work kernel/entry/common.c:289 [inline]
syscall_exit_to_user_mode+0x2e/0x70 kernel/entry/common.c:300
do_syscall_64+0x53/0xd0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f3388806b19
Code: Unable to access opcode bytes at RIP 0x7f3388806aef.
RSP: 002b:00007f338773a218 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: fffffffffffffe00 RBX: 00007f338891a0e8 RCX: 00007f3388806b19
RDX: 0000000000000000 RSI: 0000000000000080 RDI: 00007f338891a0e8
RBP: 00007f338891a0e0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007f338891a0ec
R13: 00007fffbe0e838f R14: 00007f338773a300 R15: 0000000000022000
</TASK>


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Vitaly Kuznetsov

Dec 6, 2021, 4:16:39 AM
to k...@vger.kernel.org, jmat...@google.com, Sean Christopherson, syzbot, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, jo...@8bytes.org, linux-...@vger.kernel.org, mi...@redhat.com, pbon...@redhat.com, sea...@google.com, syzkall...@googlegroups.com, tg...@linutronix.de, wanp...@tencent.com, x...@kernel.org
syzbot <syzbot+f1d213...@syzkaller.appspotmail.com> writes:

> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 5f58da2befa5 Merge tag 'drm-fixes-2021-12-03-1' of git://a..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=14927309b00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=e9ea28d2c3c2c389
> dashboard link: https://syzkaller.appspot.com/bug?extid=f1d2136db9c80d4733e8
> compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.2
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+f1d213...@syzkaller.appspotmail.com
>
> ------------[ cut here ]------------
> WARNING: CPU: 0 PID: 21158 at arch/x86/kvm/vmx/nested.c:4548 nested_vmx_vmexit+0x16bd/0x17e0 arch/x86/kvm/vmx/nested.c:4547
> Modules linked in:
> CPU: 0 PID: 21158 Comm: syz-executor.1 Not tainted 5.16.0-rc3-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
> RIP: 0010:nested_vmx_vmexit+0x16bd/0x17e0 arch/x86/kvm/vmx/nested.c:4547

The comment above this WARN_ON_ONCE() says:

4541) /*
4542) * The only expected VM-instruction error is "VM entry with
4543) * invalid control field(s)." Anything else indicates a
4544) * problem with L0. And we should never get here with a
4545) * VMFail of any type if early consistency checks are enabled.
4546) */
4547) WARN_ON_ONCE(vmcs_read32(VM_INSTRUCTION_ERROR) !=
4548) VMXERR_ENTRY_INVALID_CONTROL_FIELD);

which I think should still be valid, so the problem needs to be
looked at in L0 (GCE infrastructure). Sean, Jim, your call :-)
--
Vitaly

Sean Christopherson

Dec 6, 2021, 11:05:11 AM
to Vitaly Kuznetsov, k...@vger.kernel.org, jmat...@google.com, syzbot, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, jo...@8bytes.org, linux-...@vger.kernel.org, mi...@redhat.com, pbon...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de, wanp...@tencent.com, x...@kernel.org
The assertion itself is still valid, but look at the call stack. This is firing
when KVM tears down the VM, i.e. vmx->fail is likely stale. I'll bet dollars to
donuts that commit c8607e4a086f ("KVM: x86: nVMX: don't fail nested VM entry on
invalid guest state if !from_vmentry") is to blame. L1 is running with
unrestricted_guest=Y, so the only way vmx->emulation_required should become true
is if L2 is active and is not an unrestricted guest.

I objected to the patch[*], but looking back at the dates, it appears that I did
so after the patch was queued and my comments were never addressed.
I'll see if I can reproduce this with a selftest. The fix is likely just:

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index dc4909b67c5c..927a7c43b73b 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6665,10 +6665,6 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
 	 * consistency check VM-Exit due to invalid guest state and bail.
 	 */
 	if (unlikely(vmx->emulation_required)) {
-
-		/* We don't emulate invalid state of a nested guest */
-		vmx->fail = is_guest_mode(vcpu);
-
 		vmx->exit_reason.full = EXIT_REASON_INVALID_STATE;
 		vmx->exit_reason.failed_vmentry = 1;
 		kvm_register_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_1);

[*] https://lore.kernel.org/all/YWDWPbgJ...@google.com/

Vitaly Kuznetsov

Dec 6, 2021, 11:16:13 AM
to Sean Christopherson, Maxim Levitsky, k...@vger.kernel.org, jmat...@google.com, syzbot, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, jo...@8bytes.org, linux-...@vger.kernel.org, mi...@redhat.com, pbon...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de, wanp...@tencent.com, x...@kernel.org
Oh, I see, true that!

> I'll bet dollars to
> donuts that commit c8607e4a086f ("KVM: x86: nVMX: don't fail nested VM entry on
> invalid guest state if !from_vmentry") is to blame. L1 is running with
> unrestricted_guest=Y, so the only way vmx->emulation_required should become true
> is if L2 is active and is not an unrestricted guest.
>
> I objected to the patch[*], but looking back at the dates, it appears that I did
> so after the patch was queued and my comments were never addressed.
> I'll see if I can reproduce this with a selftest. The fix is likely just:
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index dc4909b67c5c..927a7c43b73b 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -6665,10 +6665,6 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
> * consistency check VM-Exit due to invalid guest state and bail.
> */
> if (unlikely(vmx->emulation_required)) {
> -
> - /* We don't emulate invalid state of a nested guest */
> - vmx->fail = is_guest_mode(vcpu);
> -
> vmx->exit_reason.full = EXIT_REASON_INVALID_STATE;
> vmx->exit_reason.failed_vmentry = 1;
> kvm_register_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_1);
>
> [*] https://lore.kernel.org/all/YWDWPbgJ...@google.com/
>

Let's also summon Max to the discussion to get his thoughts.
--
Vitaly

Sean Christopherson

Dec 6, 2021, 11:45:12 AM
to Vitaly Kuznetsov, Maxim Levitsky, k...@vger.kernel.org, jmat...@google.com, syzbot, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, jo...@8bytes.org, linux-...@vger.kernel.org, mi...@redhat.com, pbon...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de, wanp...@tencent.com, x...@kernel.org
On Mon, Dec 06, 2021, Vitaly Kuznetsov wrote:
> Sean Christopherson <sea...@google.com> writes:
> > I objected to the patch[*], but looking back at the dates, it appears that I did
> > so after the patch was queued and my comments were never addressed.
> > I'll see if I can reproduce this with a selftest. The fix is likely just:
> >
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index dc4909b67c5c..927a7c43b73b 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -6665,10 +6665,6 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
> > * consistency check VM-Exit due to invalid guest state and bail.
> > */
> > if (unlikely(vmx->emulation_required)) {
> > -
> > - /* We don't emulate invalid state of a nested guest */
> > - vmx->fail = is_guest_mode(vcpu);
> > -
> > vmx->exit_reason.full = EXIT_REASON_INVALID_STATE;
> > vmx->exit_reason.failed_vmentry = 1;
> > kvm_register_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_1);
> >
> > [*] https://lore.kernel.org/all/YWDWPbgJ...@google.com/

Boom. VCPU_RUN exits with KVM_EXIT_INTERNAL_ERROR.

diff --git a/tools/testing/selftests/kvm/x86_64/vmx_close_while_nested_test.c b/tools/testing/selftests/kvm/x86_64/vmx_close_while_nested_test.c
index 2835a17f1b7a..4f77c5d7c7b9 100644
--- a/tools/testing/selftests/kvm/x86_64/vmx_close_while_nested_test.c
+++ b/tools/testing/selftests/kvm/x86_64/vmx_close_while_nested_test.c
@@ -27,6 +27,11 @@ enum {
 /* The virtual machine object. */
 static struct kvm_vm *vm;

+static void l2_guest_infinite_loop(void)
+{
+	while (1);
+}
+
 static void l2_guest_code(void)
 {
 	/* Exit to L0 */
@@ -53,6 +58,9 @@ static void l1_guest_code(struct vmx_pages *vmx_pages)
 int main(int argc, char *argv[])
 {
 	vm_vaddr_t vmx_pages_gva;
+	struct kvm_sregs sregs;
+	struct kvm_regs regs;
+	int r;

 	nested_vmx_check_supported();

@@ -83,4 +91,17 @@ int main(int argc, char *argv[])
 			TEST_FAIL("Unknown ucall %lu", uc.cmd);
 		}
 	}
+
+	memset(&regs, 0, sizeof(regs));
+	vcpu_regs_get(vm, VCPU_ID, &regs);
+	regs.rip = (u64)l2_guest_infinite_loop;
+	vcpu_regs_set(vm, VCPU_ID, &regs);
+
+	memset(&sregs, 0, sizeof(sregs));
+	vcpu_sregs_get(vm, VCPU_ID, &sregs);
+	sregs.tr.unusable = 1;
+	vcpu_sregs_set(vm, VCPU_ID, &sregs);
+
+	r = _vcpu_run(vm, VCPU_ID);
+	TEST_ASSERT(0, "Unexpected return from L2, r = %d, exit_reason = %d", r, vcpu_state(vm, VCPU_ID)->exit_reason);
 }

------------[ cut here ]------------
WARNING: CPU: 6 PID: 273926 at arch/x86/kvm/vmx/nested.c:4565 nested_vmx_vmexit+0xd59/0xdb0 [kvm_intel]
CPU: 6 PID: 273926 Comm: vmx_close_while Not tainted 5.15.2-7cc36c3e14ae-pop #279
Hardware name: ASUS Q87M-E/Q87M-E, BIOS 1102 03/03/2014
RIP: 0010:nested_vmx_vmexit+0xd59/0xdb0 [kvm_intel]
Call Trace:
vmx_leave_nested+0x30/0x40 [kvm_intel]
nested_vmx_free_vcpu+0x16/0x20 [kvm_intel]
vmx_free_vcpu+0x4b/0x60 [kvm_intel]
kvm_arch_vcpu_destroy+0x40/0x160 [kvm]
kvm_vcpu_destroy+0x1d/0x50 [kvm]
kvm_arch_destroy_vm+0xc1/0x1c0 [kvm]
kvm_put_kvm+0x187/0x2a0 [kvm]
kvm_vm_release+0x1d/0x30 [kvm]
__fput+0x95/0x250
task_work_run+0x5f/0x90
do_exit+0x3c8/0xab0
do_group_exit+0x47/0xb0
__x64_sys_exit_group+0x14/0x20
do_syscall_64+0x3b/0xc0
entry_SYSCALL_64_after_hwframe+0x44/0xae

Sean Christopherson

Dec 6, 2021, 12:21:55 PM
to Vitaly Kuznetsov, Maxim Levitsky, k...@vger.kernel.org, jmat...@google.com, syzbot, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, jo...@8bytes.org, linux-...@vger.kernel.org, mi...@redhat.com, pbon...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de, wanp...@tencent.com, x...@kernel.org
On Mon, Dec 06, 2021, Vitaly Kuznetsov wrote:
> Sean Christopherson <sea...@google.com> writes:
> > I objected to the patch[*], but looking back at the dates, it appears that I did
> > so after the patch was queued and my comments were never addressed.
> > I'll see if I can reproduce this with a selftest. The fix is likely just:
> >
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index dc4909b67c5c..927a7c43b73b 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -6665,10 +6665,6 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
> > * consistency check VM-Exit due to invalid guest state and bail.
> > */
> > if (unlikely(vmx->emulation_required)) {
> > -
> > - /* We don't emulate invalid state of a nested guest */
> > - vmx->fail = is_guest_mode(vcpu);
> > -
> > vmx->exit_reason.full = EXIT_REASON_INVALID_STATE;
> > vmx->exit_reason.failed_vmentry = 1;
> > kvm_register_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_1);
> >
> > [*] https://lore.kernel.org/all/YWDWPbgJ...@google.com/
>
> Let's also summon Max to the discussion to get his thoughts.

Thinking more on this, we should do three things:

1. Revert this part back to "vmx->fail = 0".

2. Override SS.RPL and CS.RPL on RSM for VMX. Not sure this is strictly
necessary, I'm struggling to remember how SS.RPL and SS.DPL can get out of
sync.

IF RFLAGS.VM = 0 AND (in VMX root operation OR the “unrestricted guest”
VM-execution control is 0)
THEN
CS.RPL := SS.DPL;
SS.RPL := SS.DPL;
FI;

3. Modify RSM to go into TRIPLE_FAULT if vmx->emulation_required is true after
loading state for RSM. On AMD, whose SMRAM KVM emulates, all segment state
is read-only, i.e. if it's modified to be invalid then KVM can essentially do
whatever it wants.

4. Reject KVM_RUN if is_guest_mode() and vmx->emulation_required are true. By
handling the RSM case explicitly, this means userspace has attempted to run
L2 with garbage, which KVM most definitely doesn't want to support.

5. Add KVM_BUG_ON(is_guest_mode(vcpu), ...) in the emulation_required path in
vmx_vcpu_run(), as reaching that point means KVM botched VM-Enter, RSM or
the #4 above.

Sean Christopherson

Dec 6, 2021, 12:22:50 PM
to Vitaly Kuznetsov, Maxim Levitsky, k...@vger.kernel.org, jmat...@google.com, syzbot, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, jo...@8bytes.org, linux-...@vger.kernel.org, mi...@redhat.com, pbon...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de, wanp...@tencent.com, x...@kernel.org
On Mon, Dec 06, 2021, Sean Christopherson wrote:
> On Mon, Dec 06, 2021, Vitaly Kuznetsov wrote:
> > Sean Christopherson <sea...@google.com> writes:
> > > I objected to the patch[*], but looking back at the dates, it appears that I did
> > > so after the patch was queued and my comments were never addressed.
> > > I'll see if I can reproduce this with a selftest. The fix is likely just:
> > >
> > > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > > index dc4909b67c5c..927a7c43b73b 100644
> > > --- a/arch/x86/kvm/vmx/vmx.c
> > > +++ b/arch/x86/kvm/vmx/vmx.c
> > > @@ -6665,10 +6665,6 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
> > > * consistency check VM-Exit due to invalid guest state and bail.
> > > */
> > > if (unlikely(vmx->emulation_required)) {
> > > -
> > > - /* We don't emulate invalid state of a nested guest */
> > > - vmx->fail = is_guest_mode(vcpu);
> > > -
> > > vmx->exit_reason.full = EXIT_REASON_INVALID_STATE;
> > > vmx->exit_reason.failed_vmentry = 1;
> > > kvm_register_mark_available(vcpu, VCPU_EXREG_EXIT_INFO_1);
> > >
> > > [*] https://lore.kernel.org/all/YWDWPbgJ...@google.com/
> >
> > Let's also summon Max to the discussion to get his thoughts.
>
> Thinking more on this, we should do three things:

Doh, hit send too soon. s/three/five, I'm not _that_ bad at math...

Maxim Levitsky

Dec 7, 2021, 3:21:12 AM
to Vitaly Kuznetsov, Sean Christopherson, k...@vger.kernel.org, jmat...@google.com, syzbot, b...@alien8.de, dave....@linux.intel.com, h...@zytor.com, jo...@8bytes.org, linux-...@vger.kernel.org, mi...@redhat.com, pbon...@redhat.com, syzkall...@googlegroups.com, tg...@linutronix.de, wanp...@tencent.com, x...@kernel.org
I'll take a look at that soon. Need to page in the reason why
this has to be done.


I remember that we have a weird case of L1 being in an invalid state,
because some of SMM's invalid state leaks to L1 after a migration;
otherwise L1, while nested, has to be in a valid state:

L1 really can't be in an invalid state while in VMXON (even if it could theoretically,
there is no way it can reach that point - I need to refresh my VMX knowledge to be sure about it),
and L2 has to have valid state since that is the definition of VMX
- if L2 were in an invalid state, then L1 would have to emulate L2 and not use VMX.

I'll take a more detailed look soon at this.

Best regards,
Maxim Levitsky

syzbot

Dec 7, 2021, 6:20:23 AM
to b...@alien8.de, dave....@linux.intel.com, fghee...@gmail.com, h...@zytor.com, jmat...@google.com, jo...@8bytes.org, k...@vger.kernel.org, linux-...@vger.kernel.org, mi...@redhat.com, mlev...@redhat.com, pbon...@redhat.com, sea...@google.com, syzkall...@googlegroups.com, tg...@linutronix.de, vkuz...@redhat.com, wanp...@tencent.com, x...@kernel.org
syzbot has found a reproducer for the following issue on:

HEAD commit: f80ef9e49fdf Merge tag 'docs-5.16-3' of git://git.lwn.net/..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15b11d89b00000
kernel config: https://syzkaller.appspot.com/x/.config?x=7d5e878e3399b6cc
dashboard link: https://syzkaller.appspot.com/bug?extid=f1d2136db9c80d4733e8
compiler: Debian clang version 11.0.1-2, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1603533ab00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=175b5f3db00000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+f1d213...@syzkaller.appspotmail.com

L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 6503 at arch/x86/kvm/vmx/nested.c:4550 nested_vmx_vmexit+0x16bd/0x17e0 arch/x86/kvm/vmx/nested.c:4549
Modules linked in:
CPU: 0 PID: 6503 Comm: syz-executor767 Not tainted 5.16.0-rc4-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:nested_vmx_vmexit+0x16bd/0x17e0 arch/x86/kvm/vmx/nested.c:4549
Code: df e8 07 8e a9 00 e9 b1 f7 ff ff 89 d9 80 e1 07 38 c1 0f 8c 51 eb ff ff 48 89 df e8 3d 8d a9 00 e9 44 eb ff ff e8 53 b9 5d 00 <0f> 0b e9 2e f8 ff ff e8 47 b9 5d 00 0f 0b e9 00 f1 ff ff 89 e9 80
RSP: 0018:ffffc90001a5fa50 EFLAGS: 00010293
RAX: ffffffff8126de2d RBX: 0000000000000000 RCX: ffff88807482d700
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000007
RBP: 0000000000000001 R08: ffffffff8126d650 R09: ffffed10041fb808
R10: ffffed10041fb808 R11: 0000000000000000 R12: ffff888020fdc000
R13: ffff8880797e8000 R14: dffffc0000000000 R15: 1ffff1100f2fd05d
FS: 0000000000000000(0000) GS:ffff8880b9a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020002000 CR3: 000000000c88e000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
vmx_leave_nested arch/x86/kvm/vmx/nested.c:6222 [inline]
nested_vmx_free_vcpu+0x83/0xc0 arch/x86/kvm/vmx/nested.c:330
vmx_free_vcpu+0x11f/0x2a0 arch/x86/kvm/vmx/vmx.c:6799
kvm_arch_vcpu_destroy+0x6b/0x240 arch/x86/kvm/x86.c:10990
kvm_vcpu_destroy+0x29/0x90 arch/x86/kvm/../../../virt/kvm/kvm_main.c:441
kvm_free_vcpus arch/x86/kvm/x86.c:11427 [inline]
kvm_arch_destroy_vm+0x3ef/0x6b0 arch/x86/kvm/x86.c:11546
kvm_destroy_vm arch/x86/kvm/../../../virt/kvm/kvm_main.c:1189 [inline]
kvm_put_kvm+0x751/0xe40 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1220
kvm_vm_release+0x42/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:1243
__fput+0x3fc/0x870 fs/file_table.c:280
task_work_run+0x146/0x1c0 kernel/task_work.c:164
exit_task_work include/linux/task_work.h:32 [inline]
do_exit+0x705/0x24f0 kernel/exit.c:832
do_group_exit+0x168/0x2d0 kernel/exit.c:929
__do_sys_exit_group+0x13/0x20 kernel/exit.c:940
__se_sys_exit_group+0x10/0x10 kernel/exit.c:938
__x64_sys_exit_group+0x37/0x40 kernel/exit.c:938
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x44/0xd0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7fe968c95c09
Code: Unable to access opcode bytes at RIP 0x7fe968c95bdf.
RSP: 002b:00007ffc762ba918 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
RAX: ffffffffffffffda RBX: 00007fe968d09270 RCX: 00007fe968c95c09
RDX: 000000000000003c RSI: 00000000000000e7 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffffffffffffffc0 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fe968d09270
R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000001
</TASK>

syzbot

Dec 7, 2021, 2:19:15 PM
to b...@alien8.de, dave....@linux.intel.com, fghee...@gmail.com, h...@zytor.com, jmat...@google.com, jo...@8bytes.org, k...@vger.kernel.org, linux-...@vger.kernel.org, mi...@redhat.com, mlev...@redhat.com, pbon...@redhat.com, sea...@google.com, syzkall...@googlegroups.com, tg...@linutronix.de, vkuz...@redhat.com, wanp...@tencent.com, x...@kernel.org
syzbot has bisected this issue to:

commit c8607e4a086fae05efe5bffb47c5199c65e7216e
Author: Maxim Levitsky <mlev...@redhat.com>
Date: Mon Sep 13 14:09:53 2021 +0000

KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=10f21e3ab00000
start commit: f80ef9e49fdf Merge tag 'docs-5.16-3' of git://git.lwn.net/..
git tree: upstream
final oops: https://syzkaller.appspot.com/x/report.txt?x=12f21e3ab00000
console output: https://syzkaller.appspot.com/x/log.txt?x=14f21e3ab00000
Reported-by: syzbot+f1d213...@syzkaller.appspotmail.com
Fixes: c8607e4a086f ("KVM: x86: nVMX: don't fail nested VM entry on invalid guest state if !from_vmentry")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection