[PATCH] kvm: eoi msi documentation

Michael S. Tsirkin

May 13, 2012, 11:13:37 AM
to k...@vger.kernel.org, Rob Landley, Glauber Costa, Rik van Riel, Avi Kivity, gl...@redhat.com, linu...@vger.kernel.org, linux-...@vger.kernel.org
Document the new EOI MSR.

Signed-off-by: Michael S. Tsirkin <m...@redhat.com>
---

This documents my PV EOI patchset and applies on top.
Will make it part of the patchset on the next respin.

Documentation/virtual/kvm/msr.txt | 56 +++++++++++++++++++++++++++++++++++++
1 files changed, 56 insertions(+), 0 deletions(-)

diff --git a/Documentation/virtual/kvm/msr.txt b/Documentation/virtual/kvm/msr.txt
index 5031780..bdbd337 100644
--- a/Documentation/virtual/kvm/msr.txt
+++ b/Documentation/virtual/kvm/msr.txt
@@ -219,3 +219,59 @@ MSR_KVM_STEAL_TIME: 0x4b564d03
steal: the amount of time in which this vCPU did not run, in
nanoseconds. Time during which the vcpu is idle, will not be
reported as steal time.
+
+MSR_KVM_EOI_EN: 0x4b564d04
+ data: Bit 0 is 1 when PV end of interrupt is enabled on the vcpu; 0
+ when disabled. When enabled, bits 63-1 hold the 2-byte aligned physical
+ address of a 2-byte memory area, which must be in guest RAM and must be zeroed.
+
+ The least significant bit of the 2-byte memory location will be
+ written to by the hypervisor, typically at the time of interrupt
+ injection. A value of 1 means that the guest can skip writing EOI to the APIC
+ (using an MSR or MMIO write); instead, it is sufficient to signal
+ EOI by clearing the bit in guest memory - this location will
+ later be polled by the hypervisor.
+ A value of 0 means that the EOI write is required.
+
+ It is always safe for the guest to ignore the optimization and perform
+ the APIC EOI write anyway.
+
+ The hypervisor is guaranteed to only modify this least
+ significant bit while in the current VCPU context; this means that
+ the guest does not need to use either a lock prefix or memory ordering
+ primitives to synchronise with the hypervisor.
+
+ However, the hypervisor can set and clear this memory bit at any time.
+ The hypervisor could therefore interrupt the guest and clear the
+ least significant bit in the memory area in the window between
+ the guest testing the bit, to detect whether it can skip the
+ APIC EOI write, and the guest clearing the bit, to signal EOI to
+ the hypervisor. To close this window,
+ the guest must both read the least significant bit in the memory area and
+ clear it using a single CPU instruction, such as test and clear, or
+ compare and exchange.
+
+the page referred to by the page fault is not
+ present. Value 2 means that the page is now available. Disabling
+ interrupt inhibits APFs. Guest must not enable interrupt
+ before the reason is read, or it may be overwritten by another
+ APF. Since APF uses the same exception vector as regular page
+ fault guest must reset the reason to 0 before it does
+ something that can generate normal page fault. If during page
+ fault APF reason is 0 it means that this is regular page
+ fault.
+
+ During delivery of type 1 APF cr2 contains a token that will
+ be used to notify a guest when missing page becomes
+ available. When page becomes available type 2 APF is sent with
+ cr2 set to the token associated with the page. There is special
+ kind of token 0xffffffff which tells vcpu that it should wake
+ up all processes waiting for APFs and no individual type 2 APFs
+ will be sent.
+
+ If APF is disabled while there are outstanding APFs, they will
+ not be delivered.
+
+ Currently type 2 APF will be always delivered on the same vcpu as
+ type 1 was, but guest should not rely on that.
+
--
MST
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
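
The guest-side rule described in the patch (read bit 0 and clear it with a single instruction, no lock prefix needed) could be sketched roughly as below. This is only an illustration, not code from the patchset: the helper names pv_eoi_test_and_clear and guest_eoi_needs_apic_write are hypothetical, and the non-x86 branch exists only so the sketch builds elsewhere (it uses a compiler atomic, which is stronger than the text requires).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical guest-side helper (name is illustrative, not from the patch).
 * Tests bit 0 of the 2-byte PV EOI area and clears it in one instruction,
 * so the hypervisor cannot change the bit between the test and the clear.
 * Returns the value bit 0 had before it was cleared.
 */
static inline bool pv_eoi_test_and_clear(volatile uint16_t *pv_eoi)
{
#if defined(__x86_64__) || defined(__i386__)
	bool was_set;

	/* btrw copies bit 0 into CF and clears it; no lock prefix is used. */
	__asm__ volatile("btrw $0, %1\n\t"
			 "setc %0"
			 : "=q" (was_set), "+m" (*pv_eoi)
			 :
			 : "cc");
	return was_set;
#else
	/* Assumption: fallback for non-x86 builds of this sketch only. */
	return __atomic_fetch_and(pv_eoi, (uint16_t)~1u, __ATOMIC_RELAXED) & 1;
#endif
}

/*
 * Hypothetical wrapper: returns true when the guest still has to perform
 * the regular APIC EOI write (bit 0 was 0), false when clearing the bit
 * was enough to signal EOI.
 */
static inline bool guest_eoi_needs_apic_write(volatile uint16_t *pv_eoi)
{
	return !pv_eoi_test_and_clear(pv_eoi);
}
```

A guest EOI path would call guest_eoi_needs_apic_write() and fall back to the normal MSR or MMIO EOI write when it returns true; as the patch text notes, always doing the APIC write is also safe.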

Gleb Natapov

May 13, 2012, 11:56:31 AM
to Michael S. Tsirkin, k...@vger.kernel.org, Rob Landley, Glauber Costa, Rik van Riel, Avi Kivity, linu...@vger.kernel.org, linux-...@vger.kernel.org
Looks good, but everything below this is here by mistake. Are you still
going to resend the host-side patch to address my other comment?
Gleb.

Michael S. Tsirkin

May 13, 2012, 12:03:42 PM
to Gleb Natapov, k...@vger.kernel.org, Rob Landley, Glauber Costa, Rik van Riel, Avi Kivity, linu...@vger.kernel.org, linux-...@vger.kernel.org
On Sun, May 13, 2012 at 06:56:23PM +0300, Gleb Natapov wrote:
> > + However, hypervisor can set and clear this memory bit at any time:
> > + therefore to make sure hypervisor does not interrupt the
> > + guest and clear the least significant bit in the memory area
> > + in the window between guest testing it to detect
> > + whether it can skip EOI apic write and between guest
> > + clearing it to signal EOI to the hypervisor,
> > + guest must both read the least significant bit in the memory area and
> > + clear it using a single CPU instruction, such as test and clear, or
> > + compare and exchange.
> > +
> Looks good, but everything below this is here by mistake.

Ugh. Right. Good catch.

> Are You still
> going to resend host side patch to address my other comment?

Yes, like this.
I'll give more people a chance to review first though.


diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 77e0244..c7e6ffb 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1490,6 +1490,7 @@ int kvm_pv_enable_apic_eoi(struct kvm_vcpu *vcpu, u64 data)
 	if (eoi_enabled(vcpu))
 		eoi_clr_pending(vcpu);
 	vcpu->arch.eoi.msr_val = data;
-	kvm_gfn_to_hva_cache_init(vcpu->kvm, &vcpu->arch.eoi.data, addr);
-	return 0;
+	if (!eoi_enabled(vcpu))
+		return 0;
+	return kvm_gfn_to_hva_cache_init(vcpu->kvm, &vcpu->arch.eoi.data, addr);
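
For readers following the control flow, the change above can be modelled outside the kernel roughly as follows. This is a standalone sketch under stated assumptions, not the kernel code: struct pv_eoi_state, the pv_eoi_* helpers, and the cache_init callback are hypothetical stand-ins, and deriving the address by masking off bit 0 is my reading of "bits 63-1 hold 2-byte aligned physical address" in the documentation patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-vcpu PV EOI state. */
struct pv_eoi_state {
	uint64_t msr_val;
};

/* Bit 0 of the MSR value is the enable flag. */
static inline bool pv_eoi_msr_enabled(uint64_t msr_val)
{
	return msr_val & 1;
}

/* Assumption: the address is the MSR value with the enable bit masked off. */
static inline uint64_t pv_eoi_msr_addr(uint64_t msr_val)
{
	return msr_val & ~1ULL;
}

/* Stand-in for an address check that always fails, used for demonstration. */
static int failing_cache_init(uint64_t addr)
{
	(void)addr;
	return -1;
}

/*
 * Mirrors the shape of the fixed function: record the MSR value, return 0
 * without touching the address when the feature is being disabled, and
 * otherwise propagate the result of the (hypothetical) cache_init callback
 * instead of ignoring it.
 */
static int pv_eoi_enable(struct pv_eoi_state *s, uint64_t data,
			 int (*cache_init)(uint64_t addr))
{
	s->msr_val = data;
	if (!pv_eoi_msr_enabled(data))
		return 0;
	return cache_init(pv_eoi_msr_addr(data));
}
```

With the always-failing stub, a write with bit 0 clear still returns 0, while a write with bit 0 set surfaces the failure to the caller.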