BUG: unable to handle kernel NULL pointer dereference in handle_external_interrupt_irqoff

28 views
Skip to first unread message

syzbot

unread,
Mar 22, 2020, 2:43:14 AM3/22/20
to b...@alien8.de, h...@zytor.com, jmat...@google.com, jo...@8bytes.org, k...@vger.kernel.org, linux-...@vger.kernel.org, mi...@redhat.com, pbon...@redhat.com, sean.j.chr...@intel.com, syzkall...@googlegroups.com, tg...@linutronix.de, vkuz...@redhat.com, wanp...@tencent.com, x...@kernel.org
Hello,

syzbot found the following crash on:

HEAD commit: b74b991f Merge tag 'block-5.6-20200320' of git://git.kerne..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16403223e00000
kernel config: https://syzkaller.appspot.com/x/.config?x=6dfa02302d6db985
dashboard link: https://syzkaller.appspot.com/bug?extid=3f29ca2efb056a761e38
compiler: clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+3f29ca...@syzkaller.appspotmail.com

BUG: kernel NULL pointer dereference, address: 0000000000000086
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD a63a4067 P4D a63a4067 PUD a7627067 PMD 0
Oops: 0010 [#1] PREEMPT SMP KASAN
CPU: 0 PID: 9785 Comm: syz-executor.2 Not tainted 5.6.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:0x86
Code: Bad RIP value.
RSP: 0018:ffffc90001ac7998 EFLAGS: 00010086
RAX: ffffc90001ac79c8 RBX: fffffe0000000000 RCX: 0000000000040000
RDX: ffffc9000e20f000 RSI: 000000000000b452 RDI: 000000000000b453
RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca
R10: ffff8880a4e94200 R11: 0000000000000002 R12: dffffc0000000000
R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb
FS: 00007fb50e370700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000005c CR3: 0000000092fc7000 CR4: 00000000001426f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
handle_external_interrupt_irqoff+0x154/0x280 arch/x86/kvm/vmx/vmx.c:6274
kvm_before_interrupt arch/x86/kvm/x86.h:343 [inline]
handle_external_interrupt_irqoff+0x132/0x280 arch/x86/kvm/vmx/vmx.c:6272
__irqentry_text_start+0x8/0x8
vcpu_enter_guest+0x6c77/0x9290 arch/x86/kvm/x86.c:8405
save_stack mm/kasan/common.c:72 [inline]
set_track mm/kasan/common.c:80 [inline]
kasan_set_free_info mm/kasan/common.c:337 [inline]
__kasan_slab_free+0x12e/0x1e0 mm/kasan/common.c:476
__cache_free mm/slab.c:3426 [inline]
kfree+0x10a/0x220 mm/slab.c:3757
tomoyo_path_number_perm+0x525/0x690 security/tomoyo/file.c:736
security_file_ioctl+0x55/0xb0 security/security.c:1441
entry_SYSCALL_64_after_hwframe+0x49/0xbe
__lock_acquire+0xc5a/0x1bc0 kernel/locking/lockdep.c:3954
test_bit include/asm-generic/bitops/instrumented-non-atomic.h:110 [inline]
hlock_class kernel/locking/lockdep.c:163 [inline]
mark_lock+0x107/0x1650 kernel/locking/lockdep.c:3642
lock_acquire+0x154/0x250 kernel/locking/lockdep.c:4484
rcu_lock_acquire+0x9/0x30 include/linux/rcupdate.h:208
kvm_check_async_pf_completion+0x34e/0x360 arch/x86/kvm/../../../virt/kvm/async_pf.c:137
vcpu_run+0x3a3/0xd50 arch/x86/kvm/x86.c:8513
kvm_arch_vcpu_ioctl_run+0x419/0x880 arch/x86/kvm/x86.c:8735
kvm_vcpu_ioctl+0x67c/0xa80 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2932
kvm_vm_release+0x50/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:858
vfs_ioctl fs/ioctl.c:47 [inline]
ksys_ioctl fs/ioctl.c:763 [inline]
__do_sys_ioctl fs/ioctl.c:772 [inline]
__se_sys_ioctl+0xf9/0x160 fs/ioctl.c:770
do_syscall_64+0xf3/0x1b0 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Modules linked in:
CR2: 0000000000000086
---[ end trace 4da75c292cd7e3e8 ]---
RIP: 0010:0x86
Code: Bad RIP value.
RSP: 0018:ffffc90001ac7998 EFLAGS: 00010086
RAX: ffffc90001ac79c8 RBX: fffffe0000000000 RCX: 0000000000040000
RDX: ffffc9000e20f000 RSI: 000000000000b452 RDI: 000000000000b453
RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca
R10: ffff8880a4e94200 R11: 0000000000000002 R12: dffffc0000000000
R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb
FS: 00007fb50e370700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000005c CR3: 0000000092fc7000 CR4: 00000000001426f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzk...@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

Dmitry Vyukov

unread,
Mar 22, 2020, 2:59:48 AM3/22/20
to syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, Paolo Bonzini, Christopherson, Sean J, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, wanp...@tencent.com, the arch/x86 maintainers
On Sun, Mar 22, 2020 at 7:43 AM syzbot
<syzbot+3f29ca...@syzkaller.appspotmail.com> wrote:
>
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit: b74b991f Merge tag 'block-5.6-20200320' of git://git.kerne..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=16403223e00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=6dfa02302d6db985
> dashboard link: https://syzkaller.appspot.com/bug?extid=3f29ca2efb056a761e38
> compiler: clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
>
> Unfortunately, I don't have any reproducer for this crash yet.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+3f29ca...@syzkaller.appspotmail.com

+clang-built-linux

This only happens on the instance that uses clang. So potentially this
is related to clang. The instance also uses smack lsm, but it's less
likely to be involved I think.
This actually started happening around Mar 6, but the ORC unwinder
somehow fails to unwind stack and prints only questionable frames, so
the reports were classified as "corrupted" and all thrown in the
"corrupted reports" bucket:
https://syzkaller.appspot.com/bug?id=d5bc3e0c66d200d72216ab343a67c4327e4a3452

There is already some discussion about this on the clang-built-linux list:
https://groups.google.com/d/msg/clang-built-linux/Cm3VojRK69I/cfDGxIlTAwAJ

The handle_external_interrupt_irqoff has some inline asm and the
special STACK_FRAME_NON_STANDARD. So it has some potential for bad
interaction with compilers...

The commit range is presumably
fb279f4e238617417b132a550f24c1e86d922558..63849c8f410717eb2e6662f3953ff674727303e7
But I don't see anything that says "it's me". The only commit that
does non-trivial changes to x86/vmx seems to be "KVM: VMX: check
descriptor table exits on instruction emulation":

$ git log --oneline
fb279f4e238617417b132a550f24c1e86d922558..63849c8f410717eb2e6662f3953ff674727303e7
virt/kvm/ arch/x86/kvm/
86f7e90ce840a KVM: VMX: check descriptor table exits on instruction emulation
e951445f4d3b5 Merge tag 'kvmarm-fixes-5.6-1' of
git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
ef935c25fd648 kvm: x86: Limit the number of "kvm: disabled by bios" messages
aaec7c03de92c KVM: x86: avoid useless copy of cpufreq policy
4f337faf1c55e KVM: allow disabling -Werror
575b255c1663c KVM: x86: allow compiling as non-module with W=1
7943f4acea3ca KVM: SVM: allocate AVIC data structures based on kvm_amd
module parameter
b3f15ec3d809c kvm: arm/arm64: Fold VHE entry/exit work into kvm_vcpu_run_vhe()
51b2569402a38 KVM: arm/arm64: Fix up includes for trace.h
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bug...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/000000000000277a0405a16bd5c9%40google.com.

Dmitry Vyukov

unread,
Mar 22, 2020, 3:03:29 AM3/22/20
to syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, Paolo Bonzini, Christopherson, Sean J, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, wanp...@tencent.com, the arch/x86 maintainers
And the problem with this crash is that it happens all the time,
basically the only crash that now happens on the instance. So
effectively all kernel testing of all subsystems has stalled due to
this.

syzbot

unread,
Mar 22, 2020, 4:53:14 AM3/22/20
to b...@alien8.de, clang-bu...@googlegroups.com, dvy...@google.com, h...@zytor.com, jmat...@google.com, jo...@8bytes.org, k...@vger.kernel.org, linux-...@vger.kernel.org, mi...@redhat.com, pbon...@redhat.com, sean.j.chr...@intel.com, syzkall...@googlegroups.com, tg...@linutronix.de, vkuz...@redhat.com, wanp...@tencent.com, x...@kernel.org
syzbot has found a reproducer for the following crash on:

HEAD commit: b74b991f Merge tag 'block-5.6-20200320' of git://git.kerne..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=13059373e00000
kernel config: https://syzkaller.appspot.com/x/.config?x=6dfa02302d6db985
dashboard link: https://syzkaller.appspot.com/bug?extid=3f29ca2efb056a761e38
compiler: clang version 10.0.0 (https://github.com/llvm/llvm-project/ c2443155a0fb245c8f17f2c1c72b6ea391e86e81)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1199c0c5e00000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=15097373e00000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+3f29ca...@syzkaller.appspotmail.com

BUG: kernel NULL pointer dereference, address: 0000000000000086
#PF: supervisor instruction fetch in kernel mode
#PF: error_code(0x0010) - not-present page
PGD 9330b067 P4D 9330b067 PUD 9e66f067 PMD 0
Oops: 0010 [#1] PREEMPT SMP KASAN
CPU: 1 PID: 8439 Comm: syz-executor724 Not tainted 5.6.0-rc6-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
RIP: 0010:0x86
Code: Bad RIP value.
RSP: 0018:ffffc900022e7998 EFLAGS: 00010086
RAX: ffffc900022e79c8 RBX: fffffe0000000000 RCX: ffff88809dcf2500
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca
R10: ffff88809dcf2500 R11: 0000000000000002 R12: dffffc0000000000
R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb
FS: 0000000001d0d880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000005c CR3: 00000000978c3000 CR4: 00000000001426e0
Call Trace:
handle_external_interrupt_irqoff+0x154/0x280 arch/x86/kvm/vmx/vmx.c:6274
kvm_before_interrupt arch/x86/kvm/x86.h:343 [inline]
handle_external_interrupt_irqoff+0x132/0x280 arch/x86/kvm/vmx/vmx.c:6272
__irqentry_text_start+0x8/0x8
vcpu_enter_guest+0x6c77/0x9290 arch/x86/kvm/x86.c:8405
save_stack mm/kasan/common.c:72 [inline]
set_track mm/kasan/common.c:80 [inline]
kasan_set_free_info mm/kasan/common.c:337 [inline]
__kasan_slab_free+0x12e/0x1e0 mm/kasan/common.c:476
__cache_free mm/slab.c:3426 [inline]
kfree+0x10a/0x220 mm/slab.c:3757
tomoyo_path_number_perm+0x525/0x690 security/tomoyo/file.c:736
security_file_ioctl+0x55/0xb0 security/security.c:1441
entry_SYSCALL_64_after_hwframe+0x49/0xbe
__lock_acquire+0xc5a/0x1bc0 kernel/locking/lockdep.c:3954
paravirt_write_msr arch/x86/include/asm/paravirt.h:167 [inline]
wrmsrl arch/x86/include/asm/paravirt.h:200 [inline]
native_x2apic_icr_write arch/x86/include/asm/apic.h:249 [inline]
__x2apic_send_IPI_dest arch/x86/kernel/apic/x2apic_phys.c:112 [inline]
x2apic_send_IPI+0x96/0xc0 arch/x86/kernel/apic/x2apic_phys.c:41
test_bit include/asm-generic/bitops/instrumented-non-atomic.h:110 [inline]
hlock_class kernel/locking/lockdep.c:163 [inline]
mark_lock+0x107/0x1650 kernel/locking/lockdep.c:3642
lock_acquire+0x154/0x250 kernel/locking/lockdep.c:4484
rcu_lock_acquire+0x9/0x30 include/linux/rcupdate.h:208
vcpu_run+0x3a3/0xd50 arch/x86/kvm/x86.c:8513
kvm_arch_vcpu_ioctl_run+0x419/0x880 arch/x86/kvm/x86.c:8735
kvm_vcpu_ioctl+0x67c/0xa80 arch/x86/kvm/../../../virt/kvm/kvm_main.c:2932
lock_is_held include/linux/lockdep.h:361 [inline]
rcu_read_lock_sched_held+0x106/0x170 kernel/rcu/update.c:121
kvm_vm_release+0x50/0x50 arch/x86/kvm/../../../virt/kvm/kvm_main.c:858
vfs_ioctl fs/ioctl.c:47 [inline]
ksys_ioctl fs/ioctl.c:763 [inline]
__do_sys_ioctl fs/ioctl.c:772 [inline]
__se_sys_ioctl+0xf9/0x160 fs/ioctl.c:770
do_syscall_64+0xf3/0x1b0 arch/x86/entry/common.c:294
entry_SYSCALL_64_after_hwframe+0x49/0xbe
Modules linked in:
CR2: 0000000000000086
---[ end trace 480d6b60d5a9226d ]---
RIP: 0010:0x86
Code: Bad RIP value.
RSP: 0018:ffffc900022e7998 EFLAGS: 00010086
RAX: ffffc900022e79c8 RBX: fffffe0000000000 RCX: ffff88809dcf2500
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca
R10: ffff88809dcf2500 R11: 0000000000000002 R12: dffffc0000000000
R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb
FS: 0000000001d0d880(0000) GS:ffff8880ae900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000000000000005c CR3: 00000000978c3000 CR4: 00000000001426e0

syzbot

unread,
Mar 22, 2020, 9:29:09 AM3/22/20
to b...@alien8.de, clang-bu...@googlegroups.com, da...@davemloft.net, dhow...@redhat.com, dvy...@google.com, h...@zytor.com, jmat...@google.com, jo...@8bytes.org, ku...@kernel.org, k...@vger.kernel.org, linu...@lists.infradead.org, linux-...@vger.kernel.org, mi...@redhat.com, net...@vger.kernel.org, pbon...@redhat.com, sean.j.chr...@intel.com, syzkall...@googlegroups.com, tg...@linutronix.de, vkuz...@redhat.com, wanp...@tencent.com, x...@kernel.org
syzbot has bisected this bug to:

commit f71dbf2fb28489a79bde0dca1c8adfb9cdb20a6b
Author: David Howells <dhow...@redhat.com>
Date: Thu Jan 30 21:50:36 2020 +0000

rxrpc: Fix insufficient receive notification generation

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1483bb19e00000
start commit: b74b991f Merge tag 'block-5.6-20200320' of git://git.kerne..
git tree: upstream
final crash: https://syzkaller.appspot.com/x/report.txt?x=1683bb19e00000
console output: https://syzkaller.appspot.com/x/log.txt?x=1283bb19e00000
Reported-by: syzbot+3f29ca...@syzkaller.appspotmail.com
Fixes: f71dbf2fb284 ("rxrpc: Fix insufficient receive notification generation")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

Dmitry Vyukov

unread,
Mar 22, 2020, 9:43:52 AM3/22/20
to syzbot, Borislav Petkov, clang-built-linux, David Miller, David Howells, H. Peter Anvin, Jim Mattson, Joerg Roedel, ku...@kernel.org, KVM list, linu...@lists.infradead.org, LKML, Ingo Molnar, netdev, Paolo Bonzini, Christopherson, Sean J, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, wanp...@tencent.com, the arch/x86 maintainers
On Sun, Mar 22, 2020 at 2:29 PM syzbot
<syzbot+3f29ca...@syzkaller.appspotmail.com> wrote:
>
> syzbot has bisected this bug to:
>
> commit f71dbf2fb28489a79bde0dca1c8adfb9cdb20a6b
> Author: David Howells <dhow...@redhat.com>
> Date: Thu Jan 30 21:50:36 2020 +0000
>
> rxrpc: Fix insufficient receive notification generation

This is unrelated.
Somehow the crash wasn't reproduced again on the same commit. Can it
depend on host CPU type maybe?

Paolo Bonzini

unread,
Mar 23, 2020, 4:18:25 AM3/23/20
to Dmitry Vyukov, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, Christopherson, Sean J, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, wanp...@tencent.com, the arch/x86 maintainers
On 22/03/20 07:59, Dmitry Vyukov wrote:
>
> The commit range is presumably
> fb279f4e238617417b132a550f24c1e86d922558..63849c8f410717eb2e6662f3953ff674727303e7
> But I don't see anything that says "it's me". The only commit that
> does non-trivial changes to x86/vmx seems to be "KVM: VMX: check
> descriptor table exits on instruction emulation":

That seems unlikely, it's a completely different file and it would only
affect the outside (non-nested) environment rather than your own kernel.

The only instance of "0x86" in the registers is in the flags:

> RSP: 0018:ffffc90001ac7998 EFLAGS: 00010086
> RAX: ffffc90001ac79c8 RBX: fffffe0000000000 RCX: 0000000000040000
> RDX: ffffc9000e20f000 RSI: 000000000000b452 RDI: 000000000000b453
> RBP: 0000000000000ec0 R08: ffffffff83987523 R09: ffffffff811c7eca
> R10: ffff8880a4e94200 R11: 0000000000000002 R12: dffffc0000000000
> R13: fffffe0000000ec8 R14: ffffffff880016f0 R15: fffffe0000000ecb
> FS: 00007fb50e370700(0000) GS:ffff8880ae800000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 000000000000005c CR3: 0000000092fc7000 CR4: 00000000001426f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400

That would suggest a miscompilation of the inline assembly, which does
push the flags:

#ifdef CONFIG_X86_64
"mov %%" _ASM_SP ", %[sp]\n\t"
"and $0xfffffffffffffff0, %%" _ASM_SP "\n\t"
"push $%c[ss]\n\t"
"push %[sp]\n\t"
#endif
"pushf\n\t"
__ASM_SIZE(push) " $%c[cs]\n\t"
CALL_NOSPEC


It would not explain why it suddenly started to break, unless the clang
version also changed, but it would be easy to ascertain and fix (in
either KVM or clang). Dmitry, can you send me the vmx.o and
kvm-intel.ko files?

Thanks,

Paolo

Alexander Potapenko

unread,
Mar 23, 2020, 12:31:28 PM3/23/20
to Paolo Bonzini, Dmitry Vyukov, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, Christopherson, Sean J, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, wanp...@tencent.com, the arch/x86 maintainers
On a quick glance, Clang does not miscompile this part.
Attached is the disassembly of handle_external_interrupt_irqoff() from
v5.4 (where the problem seems to also reproduce) with Clang and GCC.
They do virtually the same (look for asm blob after kvm_before_interrupt()).
handle_external_interrupt_irqoff.gcc.txt
handle_external_interrupt_irqoff.clang.txt

Sean Christopherson

unread,
Mar 23, 2020, 12:39:28 PM3/23/20
to Alexander Potapenko, Paolo Bonzini, Dmitry Vyukov, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, wanp...@tencent.com, the arch/x86 maintainers
Clang definitely miscompiles the asm, the indirect call operates on the
EFLAGS value, not on @entry as expected. It looks like clang doesn't honor
ASM_CALL_CONSTRAINT, which effectively tells the compiler that %rsp is
getting clobbered, e.g. the "mov %r14,0x8(%rsp)" is loading @entry for
"callq *0x8(%rsp)", which breaks because of asm's pushes.

clang:

kvm_before_interrupt(vcpu);

asm volatile(
ffffffff811b798e: 4c 89 74 24 08 mov %r14,0x8(%rsp)
ffffffff811b7993: 48 89 e0 mov %rsp,%rax
ffffffff811b7996: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
ffffffff811b799a: 6a 18 pushq $0x18
ffffffff811b799c: 50 push %rax
ffffffff811b799d: 9c pushfq
ffffffff811b799e: 6a 10 pushq $0x10
ffffffff811b79a0: ff 54 24 08 callq *0x8(%rsp) <--------- calls the EFLAGS value
kvm_after_interrupt():


gcc:
kvm_before_interrupt(vcpu);

asm volatile(
ffffffff8118e17c: 48 89 e0 mov %rsp,%rax
ffffffff8118e17f: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
ffffffff8118e183: 6a 18 pushq $0x18
ffffffff8118e185: 50 push %rax
ffffffff8118e186: 9c pushfq
ffffffff8118e187: 6a 10 pushq $0x10
ffffffff8118e189: ff d3 callq *%rbx <-------- calls @entry
kvm_after_interrupt():

Alexander Potapenko

unread,
Mar 23, 2020, 12:43:24 PM3/23/20
to Sean Christopherson, Paolo Bonzini, Dmitry Vyukov, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, wanp...@tencent.com, the arch/x86 maintainers
Ugh, I completely overlooked this. Right, this is something to work
this on the Clang side.


> clang:
>
> kvm_before_interrupt(vcpu);
>
> asm volatile(
> ffffffff811b798e: 4c 89 74 24 08 mov %r14,0x8(%rsp)
> ffffffff811b7993: 48 89 e0 mov %rsp,%rax
> ffffffff811b7996: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
> ffffffff811b799a: 6a 18 pushq $0x18
> ffffffff811b799c: 50 push %rax
> ffffffff811b799d: 9c pushfq
> ffffffff811b799e: 6a 10 pushq $0x10
> ffffffff811b79a0: ff 54 24 08 callq *0x8(%rsp) <--------- calls the EFLAGS value
> kvm_after_interrupt():
>
>
> gcc:
> kvm_before_interrupt(vcpu);
>
> asm volatile(
> ffffffff8118e17c: 48 89 e0 mov %rsp,%rax
> ffffffff8118e17f: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
> ffffffff8118e183: 6a 18 pushq $0x18
> ffffffff8118e185: 50 push %rax
> ffffffff8118e186: 9c pushfq
> ffffffff8118e187: 6a 10 pushq $0x10
> ffffffff8118e189: ff d3 callq *%rbx <-------- calls @entry
> kvm_after_interrupt():
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bug...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/20200323163925.GP28711%40linux.intel.com.



--
Alexander Potapenko
Software Engineer

Google Germany GmbH
Erika-Mann-Straße, 33
80636 München

Geschäftsführer: Paul Manicle, Halimah DeLaine Prado
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg

Nick Desaulniers

unread,
Mar 23, 2020, 12:57:45 PM3/23/20
to Sean Christopherson, Alexander Potapenko, Paolo Bonzini, Dmitry Vyukov, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers
Thanks for this analysis, it looks like this is dependent on some
particular configuration; here's clang+defconfig+CONFIG_KVM_INTEL=y:

0x000000000000528f <+127>: pushq $0x18
0x0000000000005291 <+129>: push %rcx
0x0000000000005292 <+130>: pushfq
0x0000000000005293 <+131>: pushq $0x10
0x0000000000005295 <+133>: callq *%rax

--
Thanks,
~Nick Desaulniers

Nick Desaulniers

unread,
Mar 23, 2020, 1:29:06 PM3/23/20
to Dmitry Vyukov, Alexander Potapenko, Paolo Bonzini, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers, Sean Christopherson
I noticed that in the syzcaller config I have, that CONFIG_RETPOLINE
is not set. I'm more reliably able to reproduce this with
clang+defconfig+CONFIG_KVM=y+CONFIG_KVM_INTEL=y+CONFIG_RETPOLINE=n,
ie. by manually disabling retpoline.
--
Thanks,
~Nick Desaulniers

Alexander Potapenko

unread,
Mar 23, 2020, 1:55:19 PM3/23/20
to Nick Desaulniers, Dmitry Vyukov, Paolo Bonzini, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers, Sean Christopherson
I've reduced the faulty test case to the following code:

=================================
a;
long b;
register unsigned long current_stack_pointer asm("rsp");
handle_external_interrupt_irqoff() {
asm("and $0xfffffffffffffff0, %%rsp\n\tpush $%c[ss]\n\tpush "
"%[sp]\n\tpushf\n\tpushq $%c[cs]\n\tcall *%[thunk_target]\n"
: [ sp ] "=&r"(b), "+r" (current_stack_pointer)
: [ thunk_target ] "rm"(a), [ ss ] "i"(3 * 8), [ cs ] "i"(2 * 8) );
}
=================================
(in fact creduce even throws away current_stack_pointer, but we
probably want to keep it to prove the point).

Clang generates the following code for it:

$ clang vmx.i -O2 -c -w -o vmx.o
$ objdump -d vmx.o
...
0000000000000000 <handle_external_interrupt_irqoff>:
0: 8b 05 00 00 00 00 mov 0x0(%rip),%eax # 6
<handle_external_interrupt_irqoff+0x6>
6: 89 44 24 fc mov %eax,-0x4(%rsp)
a: 48 83 e4 f0 and $0xfffffffffffffff0,%rsp
e: 6a 18 pushq $0x18
10: 50 push %rax
11: 9c pushfq
12: 6a 10 pushq $0x10
14: ff 54 24 fc callq *-0x4(%rsp)
18: 48 89 05 00 00 00 00 mov %rax,0x0(%rip) # 1f
<handle_external_interrupt_irqoff+0x1f>
1f: c3 retq

The question is whether using current_stack_pointer as an output is
actually a valid way to tell the compiler it should not clobber RSP.
Intuitively it is, but explicitly adding RSP to the clobber list
sounds a bit more bulletproof.

Nick Desaulniers

unread,
Mar 23, 2020, 2:06:30 PM3/23/20
to Alexander Potapenko, Dmitry Vyukov, Paolo Bonzini, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers, Sean Christopherson
Ok, I think this reproducer demonstrates the issue:
https://godbolt.org/z/jAafjz
I *think* what's happening is that we're not specifying correctly that
the stack is being modified by inline asm, so using variable
references against the stack pointer is not correct.

commit f5caf621ee357 ("x86/asm: Fix inline asm call constraints for Clang")
has more context about ASM_CALL_CONSTRAINT.

It seems that specifying "rsp" in the clobber list is a -Wdeprecated
warning in GCC, and an error in Clang (unless you remove
current_stack_pointer as an output, but will get Clang to produce the
correct code).

--
Thanks,
~Nick Desaulniers

Alexander Potapenko

unread,
Mar 23, 2020, 2:06:50 PM3/23/20
to Nick Desaulniers, Dmitry Vyukov, Paolo Bonzini, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers, Sean Christopherson
Ok, I am wrong: according to
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html it's incorrect to
list RSP in the clobber list.

Nick Desaulniers

unread,
Mar 23, 2020, 2:16:26 PM3/23/20
to Alexander Potapenko, Dmitry Vyukov, Paolo Bonzini, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers, Sean Christopherson
You could force `entry` into a register:
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4d22b1b5e822..083a7e980bb5 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6277,7 +6277,7 @@ static void
handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
#endif
ASM_CALL_CONSTRAINT
:
- THUNK_TARGET(entry),
+ [thunk_target] "a"(entry),
[ss]"i"(__KERNEL_DS),
[cs]"i"(__KERNEL_CS)
);

(https://stackoverflow.com/a/48877683/1027966 had some interesting
feedback to this problem)
--
Thanks,
~Nick Desaulniers

Nick Desaulniers

unread,
Mar 23, 2020, 2:49:14 PM3/23/20
to Alexander Potapenko, Dmitry Vyukov, Paolo Bonzini, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers, Sean Christopherson
Sean said:
> It looks like clang doesn't honor
> ASM_CALL_CONSTRAINT, which effectively tells the compiler that %rsp is
> getting clobbered, e.g. the "mov %r14,0x8(%rsp)" is loading @entry for
> "callq *0x8(%rsp)", which breaks because of asm's pushes.

I'm not sure about this, I think ASM_CALL_CONSTRAINT may be a red
herring, based on the commit message that added it (commit
f5caf621ee357 ("x86/asm: Fix inline asm call constraints for Clang")).

Further, it seems the "m" in "rm" in THUNK_TARGET for
CONFIG_RETPOLINE=n is problematic.

THUNK_TARGET defines [thunk_target] as "rm" when CONFIG_RETPOLINE is
not set, which isn't constrained enough for this specific case; if
`entry` winds up at the bottom of the stack where rsp points to, then
`%rsp` is good enough to satisfy the constraints for using `entry` as
an input. For inline assembly that modifies the the stack pointer
before using this input, the underspecification of constraints is
dangerous, and results in an indirect call to a previously pushed
flags register.

So maybe we can find why
commit 76b043848fd2 ("x86/retpoline: Add initial retpoline support")
added THUNK_TARGET with and without "m" constraint, and either:
- remove "m" from THUNK_TARGET. (Maybe this doesn't compile somewhere)
or
- use my above recommendation locally avoiding THUNK_TARGET. We can
use "r" rather than "a" (what Clang would have picked) or "b (what GCC
would have picked) to give the compilers maximal flexibility.
--
Thanks,
~Nick Desaulniers

Nick Desaulniers

unread,
Mar 23, 2020, 3:20:06 PM3/23/20
to pbon...@redhat.com, sean.j.chr...@intel.com, ndesau...@google.com, b...@alien8.de, clang-bu...@googlegroups.com, dvy...@google.com, gli...@google.com, h...@zytor.com, jmat...@google.com, jo...@8bytes.org, k...@vger.kernel.org, linux-...@vger.kernel.org, mi...@redhat.com, syzbot+3f29ca...@syzkaller.appspotmail.com, syzkall...@googlegroups.com, tg...@linutronix.de, vkuz...@redhat.com, wanp...@tencent.com, x...@kernel.org
THUNK_TARGET defines [thunk_target] as having "rm" input constraints
when CONFIG_RETPOLINE is not set, which isn't constrained enough for
this specific case.

For inline assembly that modifies the stack pointer before using this
input, the underspecification of constraints is dangerous, and results
in an indirect call to a previously pushed flags register.

In this case `entry`'s stack slot is good enough to satisfy the "m"
constraint in "rm", but the inline assembly in
handle_external_interrupt_irqoff() modifies the stack pointer via
push+pushf before using this input, which in this case results in
calling what was the previous state of the flags register, rather than
`entry`.

Be more specific in the constraints by requiring `entry` be in a
register, and not a memory operand.

Reported-by: Dmitry Vyukov <dvy...@google.com>
Reported-by: syzbot+3f29ca...@syzkaller.appspotmail.com
Debugged-by: Alexander Potapenko <gli...@google.com>
Debugged-by: Paolo Bonzini <pbon...@redhat.com>
Debugged-by: Sean Christopherson <sean.j.chr...@intel.com>
Signed-off-by: Nick Desaulniers <ndesau...@google.com>
---
arch/x86/kvm/vmx/vmx.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 4d22b1b5e822..310e8c1169b8 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6277,7 +6277,7 @@ static void handle_external_interrupt_irqoff(struct kvm_vcpu *vcpu)
#endif
ASM_CALL_CONSTRAINT
:
- THUNK_TARGET(entry),
+ [thunk_target]"r"(entry),
[ss]"i"(__KERNEL_DS),
[cs]"i"(__KERNEL_CS)
);
--
2.25.1.696.g5e7596f4ac-goog

Nick Desaulniers

unread,
Mar 23, 2020, 3:31:08 PM3/23/20
to Alexander Potapenko, Dmitry Vyukov, Paolo Bonzini, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers, Sean Christopherson
So I've sent a patch for the latter; my reason for not pursuing the former is:
1. I assume that the thunk target could be spilled, or a pointer, and
we'd like to keep flexibility for the general case of inline asm that
doesn't modify the stack pointer.
2. `entry` is local to `handle_external_interrupt_irqoff`; it's not
being passed in via pointer as a function parameter.
3. register pressure is irrelevant if the resulting code is incorrect.
--
Thanks,
~Nick Desaulniers

Paolo Bonzini

unread,
Mar 23, 2020, 3:40:05 PM3/23/20
to Nick Desaulniers, Alexander Potapenko, Dmitry Vyukov, syzbot, clang-built-linux, Borislav Petkov, H. Peter Anvin, Jim Mattson, Joerg Roedel, KVM list, LKML, Ingo Molnar, syzkaller-bugs, Thomas Gleixner, Vitaly Kuznetsov, Wanpeng Li, the arch/x86 maintainers, Sean Christopherson
Yes, this is fair enough. I've queued your patch and will send it
shortly to Linus.

Paolo

Reply all
Reply to author
Forward
0 new messages