[PATCH] x86: kvm: use alternatives for VMCALL vs. VMMCALL if kernel text is read-only

Paolo Bonzini

Sep 22, 2014, 7:18:02 AM
to linux-...@vger.kernel.org, k...@vger.kernel.org, ch...@arachsys.com, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x...@kernel.org, Borislav Petkov
On x86_64, kernel text mappings are mapped read-only with CONFIG_DEBUG_RODATA.
In that case, KVM will fail to patch VMCALL instructions to VMMCALL
as required on AMD processors.

The failure mode is currently a divide-by-zero exception, which obviously
is a KVM bug that has to be fixed. However, picking the right instruction
between VMCALL and VMMCALL will be faster and will help if you cannot upgrade
the hypervisor.

Reported-by: Chris Webb <ch...@arachsys.com>
Cc: Thomas Gleixner <tg...@linutronix.de>
Cc: Ingo Molnar <mi...@redhat.com>
Cc: "H. Peter Anvin" <h...@zytor.com>
Cc: x...@kernel.org
Cc: Borislav Petkov <b...@suse.de>
Signed-off-by: Paolo Bonzini <pbon...@redhat.com>
---
arch/x86/include/asm/cpufeature.h | 1 +
arch/x86/include/asm/kvm_para.h | 10 ++++++++--
arch/x86/kernel/cpu/amd.c | 7 +++++++
3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index bb9b258d60e7..2075e6c34c78 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -202,6 +202,7 @@
#define X86_FEATURE_DECODEASSISTS ( 8*32+12) /* AMD Decode Assists support */
#define X86_FEATURE_PAUSEFILTER ( 8*32+13) /* AMD filtered pause intercept */
#define X86_FEATURE_PFTHRESHOLD ( 8*32+14) /* AMD pause filter threshold */
+#define X86_FEATURE_VMMCALL ( 8*32+15) /* Prefer vmmcall to vmcall */


/* Intel-defined CPU features, CPUID level 0x00000007:0 (ebx), word 9 */
diff --git a/arch/x86/include/asm/kvm_para.h b/arch/x86/include/asm/kvm_para.h
index c7678e43465b..e62cf897f781 100644
--- a/arch/x86/include/asm/kvm_para.h
+++ b/arch/x86/include/asm/kvm_para.h
@@ -2,6 +2,7 @@
#define _ASM_X86_KVM_PARA_H

#include <asm/processor.h>
+#include <asm/alternative.h>
#include <uapi/asm/kvm_para.h>

extern void kvmclock_init(void);
@@ -16,10 +17,15 @@ static inline bool kvm_check_and_clear_guest_paused(void)
}
#endif /* CONFIG_KVM_GUEST */

-/* This instruction is vmcall. On non-VT architectures, it will generate a
- * trap that we will then rewrite to the appropriate instruction.
+#ifdef CONFIG_DEBUG_RODATA
+#define KVM_HYPERCALL \
+ ALTERNATIVE(".byte 0x0f,0x01,0xc1", ".byte 0x0f,0x01,0xd9", X86_FEATURE_VMMCALL)
+#else
+/* On AMD processors, vmcall will generate a trap that we will
+ * then rewrite to the appropriate instruction.
*/
#define KVM_HYPERCALL ".byte 0x0f,0x01,0xc1"
+#endif

/* For KVM hypercalls, a three-byte sequence of either the vmcall or the vmmcall
* instruction. The hypervisor may replace it with something else but only the
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 60e5497681f5..813d29d00a17 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -525,6 +525,13 @@ static void early_init_amd(struct cpuinfo_x86 *c)
}
#endif

+ /*
+ * This is only needed to tell the kernel whether to use VMCALL
+ * or VMMCALL. VMMCALL is never executed except under virt, so
+ * we can set it unconditionally.
+ */
+ set_cpu_cap(c, X86_FEATURE_VMMCALL);
+
/* F16h erratum 793, CVE-2013-6885 */
if (c->x86 == 0x16 && c->x86_model <= 0xf)
msr_set_bit(MSR_AMD64_LS_CFG, 15);
--
1.8.3.1
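
For readers following the thread, here is a minimal user-space sketch of the idea behind the patch: the default instruction is emitted at build time, and the replacement is written in once at boot while the text is still writable, so nothing has to rewrite the call site after the text has been made read-only. The two byte sequences are taken from the patch; everything else (struct alt_site, struct text_image, apply_alternatives_sketch, hypercall_opcode_after_boot) is a hypothetical model for illustration and not the kernel's actual alternatives implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HYPERCALL_LEN 3

/* Encodings as given in the patch: vmcall = 0f 01 c1, vmmcall = 0f 01 d9. */
static const uint8_t vmcall_insn[HYPERCALL_LEN]  = { 0x0f, 0x01, 0xc1 };
static const uint8_t vmmcall_insn[HYPERCALL_LEN] = { 0x0f, 0x01, 0xd9 };

/* Hypothetical model of a patch site; not the kernel's struct alt_instr. */
struct alt_site {
	size_t offset;              /* position of the default instruction */
	const uint8_t *replacement; /* bytes to use when the feature is set */
	size_t len;
};

/* Hypothetical model of guest kernel text with a write-protect flag. */
struct text_image {
	uint8_t bytes[8];
	bool read_only;
};

/*
 * Patch every site if the feature is present. Returns the number of
 * sites patched, or -1 if the text is already read-only.
 */
static int apply_alternatives_sketch(struct text_image *text,
				     const struct alt_site *sites, size_t n,
				     bool feature_present)
{
	int patched = 0;

	if (text->read_only)
		return -1;
	for (size_t i = 0; i < n && feature_present; i++) {
		if (sites[i].offset + sites[i].len > sizeof(text->bytes))
			continue; /* ignore sites outside the image */
		memcpy(text->bytes + sites[i].offset,
		       sites[i].replacement, sites[i].len);
		patched++;
	}
	return patched;
}

/*
 * Boot-time flow: emit vmcall, patch while text is writable, then seal it.
 * Returns the last opcode byte left at the call site.
 */
uint8_t hypercall_opcode_after_boot(bool cpu_prefers_vmmcall)
{
	struct text_image text = { .read_only = false };
	const struct alt_site site = {
		.offset = 0, .replacement = vmmcall_insn, .len = HYPERCALL_LEN,
	};

	memcpy(text.bytes, vmcall_insn, HYPERCALL_LEN);
	apply_alternatives_sketch(&text, &site, 1, cpu_prefers_vmmcall);
	text.read_only = true; /* any later rewrite attempt would be refused */
	return text.bytes[HYPERCALL_LEN - 1];
}
```

In this model, hypercall_opcode_after_boot(true) leaves the vmmcall encoding at the call site and hypercall_opcode_after_boot(false) leaves vmcall untouched, which mirrors what setting X86_FEATURE_VMMCALL in early_init_amd() is meant to achieve.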


Borislav Petkov

Sep 22, 2014, 3:43:48 PM
to Paolo Bonzini, linux-...@vger.kernel.org, k...@vger.kernel.org, ch...@arachsys.com, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x...@kernel.org, Borislav Petkov
On Mon, Sep 22, 2014 at 01:17:48PM +0200, Paolo Bonzini wrote:
> On x86_64, kernel text mappings are mapped read-only with CONFIG_DEBUG_RODATA.

Hmm, that depends on DEBUG_KERNEL.

I think you're actually talking about distro kernels which enable
CONFIG_DEBUG_RODATA, right?

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

Paolo Bonzini

Sep 23, 2014, 4:00:29 AM
to Borislav Petkov, linux-...@vger.kernel.org, k...@vger.kernel.org, ch...@arachsys.com, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x...@kernel.org, Borislav Petkov
On 22/09/2014 21:43, Borislav Petkov wrote:
> > On x86_64, kernel text mappings are mapped read-only with CONFIG_DEBUG_RODATA.
> Hmm, that depends on DEBUG_KERNEL.
>
> I think you're actually talking about distro kernels which enable
> CONFIG_DEBUG_RODATA, right?

This is for guest kernels, so it's not necessarily distro kernels.
Anyone who compiles their kernel with CONFIG_DEBUG_RODATA + PV spinlocks
would not be able to run it on AMD.

Paolo

Borislav Petkov

Sep 23, 2014, 4:27:43 AM
to Paolo Bonzini, linux-...@vger.kernel.org, k...@vger.kernel.org, ch...@arachsys.com, Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x...@kernel.org, Borislav Petkov
On Tue, Sep 23, 2014 at 10:00:12AM +0200, Paolo Bonzini wrote:
> On 22/09/2014 21:43, Borislav Petkov wrote:
> > > On x86_64, kernel text mappings are mapped read-only with CONFIG_DEBUG_RODATA.
> > Hmm, that depends on DEBUG_KERNEL.
> >
> > I think you're actually talking about distro kernels which enable
> > CONFIG_DEBUG_RODATA, right?
>
> This is for guest kernels, so it's not necessarily distro kernels.
> Anyone who compiles their kernel with CONFIG_DEBUG_RODATA + PV spinlocks
> would not be able to run it on AMD.

I see. Yeah, so the patch makes sense to me:

Acked-by: Borislav Petkov <b...@suse.de>

Thanks.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

Thomas Gleixner

Sep 24, 2014, 3:39:53 PM
to Paolo Bonzini, linux-...@vger.kernel.org, k...@vger.kernel.org, ch...@arachsys.com, Ingo Molnar, H. Peter Anvin, x...@kernel.org, Borislav Petkov
On Mon, 22 Sep 2014, Paolo Bonzini wrote:

> On x86_64, kernel text mappings are mapped read-only with CONFIG_DEBUG_RODATA.
> In that case, KVM will fail to patch VMCALL instructions to VMMCALL
> as required on AMD processors.
>
> The failure mode is currently a divide-by-zero exception, which obviously
> is a KVM bug that has to be fixed. However, picking the right instruction
> between VMCALL and VMMCALL will be faster and will help if you cannot upgrade
> the hypervisor.
>
> -/* This instruction is vmcall. On non-VT architectures, it will generate a
> - * trap that we will then rewrite to the appropriate instruction.
> +#ifdef CONFIG_DEBUG_RODATA
> +#define KVM_HYPERCALL \
> + ALTERNATIVE(".byte 0x0f,0x01,0xc1", ".byte 0x0f,0x01,0xd9", X86_FEATURE_VMMCALL)

If we can do it via a feature bit and alternatives, then why do you
want to patch it manually if CONFIG_DEBUG_RODATA=n?

Just because more #ifdeffery makes the code more readable?

Thanks,

tglx
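
As an illustration of the question above, this is roughly what the macro in arch/x86/include/asm/kvm_para.h would look like with the #ifdef dropped and the alternative used unconditionally. It is a sketch of the suggestion only, not the patch as posted in this thread, and it assumes the X86_FEATURE_VMMCALL bit and the <asm/alternative.h> include added by the patch; it is a kernel header fragment and does not build outside the kernel tree.

```c
/*
 * Sketch only: vmcall is emitted by default, and the alternatives code
 * swaps in vmmcall at boot when X86_FEATURE_VMMCALL is set, regardless
 * of whether CONFIG_DEBUG_RODATA is enabled.
 */
#define KVM_HYPERCALL \
	ALTERNATIVE(".byte 0x0f,0x01,0xc1", ".byte 0x0f,0x01,0xd9", X86_FEATURE_VMMCALL)
```

With a single definition, the guest would no longer depend on the hypervisor's trap-and-rewrite path in either configuration.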