[PATCH] vtd: ignore redir_hint if dest_mode != logical

15 views
Skip to first unread message

Veaceslav Falico

unread,
Sep 2, 2015, 5:40:32 AM9/2/15
to jailho...@googlegroups.com
On x86, Redirection Hint is only supported in physical Destination Mode,
and otherwise should be ignored.

This enables to use some hardware which sets DM 0/RH 1.

Signed-off-by: Veaceslav Falico <vfa...@gmail.com>
---

Notes:
This goes in line with:
https://software.intel.com/en-us/forums/topic/288883
official intel response, last comment

https://lists.gnu.org/archive/html/qemu-devel/2015-03/msg04949.html
QEMU implementation

Also, it finally permits to use my network card in
non-root-cell - as it seems to provide garbage in RH.

hypervisor/arch/x86/apic.c | 3 +--
hypervisor/arch/x86/vtd.c | 5 ++---
2 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/hypervisor/arch/x86/apic.c b/hypervisor/arch/x86/apic.c
index d3b4211..9556538 100644
--- a/hypervisor/arch/x86/apic.c
+++ b/hypervisor/arch/x86/apic.c
@@ -237,8 +237,7 @@ bool apic_filter_irq_dest(struct cell *cell, struct apic_irq_message *irq_msg)
if (dest != irq_msg->destination && cell != &root_cell)
return false;
irq_msg->destination = dest;
- } else if (dest > APIC_MAX_PHYS_ID ||
- !cell_owns_cpu(cell, apic_to_cpu_id[dest])) {
+ } else if (dest > APIC_MAX_PHYS_ID) {
return false;
}
return true;
diff --git a/hypervisor/arch/x86/vtd.c b/hypervisor/arch/x86/vtd.c
index db43d46..d469634 100644
--- a/hypervisor/arch/x86/vtd.c
+++ b/hypervisor/arch/x86/vtd.c
@@ -871,9 +871,8 @@ int iommu_map_interrupt(struct cell *cell, u16 device_id, unsigned int vector,
* Note that we do support redirection hint only in logical
* destination mode.
*/
- if ((irq_msg.delivery_mode != APIC_MSG_DLVR_FIXED &&
- irq_msg.delivery_mode != APIC_MSG_DLVR_LOWPRI) ||
- irq_msg.dest_logical != irq_msg.redir_hint)
+ if (irq_msg.delivery_mode != APIC_MSG_DLVR_FIXED &&
+ irq_msg.delivery_mode != APIC_MSG_DLVR_LOWPRI)
return -EINVAL;
if (!apic_filter_irq_dest(cell, &irq_msg))
return -EPERM;
--
2.4.3

Valentine Sinitsyn

unread,
Sep 2, 2015, 5:47:18 AM9/2/15
to Veaceslav Falico, jailho...@googlegroups.com
Hi Veaceslav,
Does this have anything to do with RH? Do we really want to deliver
interrupts to other cell CPUs, or am I missing anything?

> return false;
> }
> return true;
> diff --git a/hypervisor/arch/x86/vtd.c b/hypervisor/arch/x86/vtd.c
> index db43d46..d469634 100644
> --- a/hypervisor/arch/x86/vtd.c
> +++ b/hypervisor/arch/x86/vtd.c
> @@ -871,9 +871,8 @@ int iommu_map_interrupt(struct cell *cell, u16 device_id, unsigned int vector,
> * Note that we do support redirection hint only in logical
> * destination mode.
> */
> - if ((irq_msg.delivery_mode != APIC_MSG_DLVR_FIXED &&
> - irq_msg.delivery_mode != APIC_MSG_DLVR_LOWPRI) ||
> - irq_msg.dest_logical != irq_msg.redir_hint)
> + if (irq_msg.delivery_mode != APIC_MSG_DLVR_FIXED &&
> + irq_msg.delivery_mode != APIC_MSG_DLVR_LOWPRI)
> return -EINVAL;
> if (!apic_filter_irq_dest(cell, &irq_msg))
> return -EPERM;
>

Valentine

Veaceslav Falico

unread,
Sep 2, 2015, 6:02:51 AM9/2/15
to Valentine Sinitsyn, jailho...@googlegroups.com
Hm, good catch, you're right, I've misread the code here.

Hm #2 - then why my network card wants a physical delivery to CPU0, while
sitting on CPU2?.. Maybe the DM=logical is lost somewhere?..

Anyway, NAK for this patch, I should re-submit only the first chunk after
fixing all the issues...

Valentine Sinitsyn

unread,
Sep 2, 2015, 6:35:50 AM9/2/15
to Veaceslav Falico, jailho...@googlegroups.com
Couldn't it be the case Linux has programmed interrupt affinity this way
before the card was moved to a non-root cell? Just guessing, of course.

> Anyway, NAK for this patch, I should re-submit only the first chunk after
> fixing all the issues...
>
>>
>>> return false;
>>> }
>>> return true;
>>> diff --git a/hypervisor/arch/x86/vtd.c b/hypervisor/arch/x86/vtd.c
>>> index db43d46..d469634 100644
>>> --- a/hypervisor/arch/x86/vtd.c
>>> +++ b/hypervisor/arch/x86/vtd.c
>>> @@ -871,9 +871,8 @@ int iommu_map_interrupt(struct cell *cell, u16
>>> device_id, unsigned int vector,
>>> * Note that we do support redirection hint only in logical
>>> * destination mode.
>>> */
>>> - if ((irq_msg.delivery_mode != APIC_MSG_DLVR_FIXED &&
>>> - irq_msg.delivery_mode != APIC_MSG_DLVR_LOWPRI) ||
>>> - irq_msg.dest_logical != irq_msg.redir_hint)
>>> + if (irq_msg.delivery_mode != APIC_MSG_DLVR_FIXED &&
>>> + irq_msg.delivery_mode != APIC_MSG_DLVR_LOWPRI)
>>> return -EINVAL;
>>> if (!apic_filter_irq_dest(cell, &irq_msg))
>>> return -EPERM;
>>>
>>
>> Valentine

Valentine

Veaceslav Falico

unread,
Sep 2, 2015, 7:32:18 AM9/2/15
to Valentine Sinitsyn, jailho...@googlegroups.com

Yeah, that was my initial thought as well, however even after PCI reset (done by the jailhouse driver when starting the cell) it's the same.

BTW, did anyone succeed in using the network card in non root on x86?

Valentine Sinitsyn

unread,
Sep 2, 2015, 7:45:50 AM9/2/15
to Veaceslav Falico, jailho...@googlegroups.com
On 02.09.2015 16:32, Veaceslav Falico wrote:
..snip...
> > Couldn't it be the case Linux has programmed interrupt affinity this
> way before the card was moved to a non-root cell? Just guessing, of course.
>
> Yeah, that was my initial thought as well, however even after PCI reset
> (done by the jailhouse driver when starting the cell) it's the same.
>
> BTW, did anyone succeed in using the network card in non root on x86?
Jan did, I believe. See inmates/e1000.c; it's supposed to run non-root,
and he was sharing some ping benchmarks IIRC.

I have this in short-term plans as well, but I'm on AMD.

Valentine

Veaceslav Falico

unread,
Sep 2, 2015, 8:11:08 AM9/2/15
to Valentine Sinitsyn, jailho...@googlegroups.com
On Wed, Sep 02, 2015 at 04:45:48PM +0500, Valentine Sinitsyn wrote:
>On 02.09.2015 16:32, Veaceslav Falico wrote:
>..snip...
>> > Couldn't it be the case Linux has programmed interrupt affinity this
>>way before the card was moved to a non-root cell? Just guessing, of course.
>>
>>Yeah, that was my initial thought as well, however even after PCI reset
>>(done by the jailhouse driver when starting the cell) it's the same.
>>
>>BTW, did anyone succeed in using the network card in non root on x86?
>Jan did, I believe. See inmates/e1000.c; it's supposed to run
>non-root, and he was sharing some ping benchmarks IIRC.

Yeah, but not really, afaik - no interrupts/MSI involved, just busy loop.

>
>I have this in short-term plans as well, but I'm on AMD.

Ouch, good luck! :)

>
>Valentine

Valentine Sinitsyn

unread,
Sep 2, 2015, 8:14:41 AM9/2/15
to Veaceslav Falico, jailho...@googlegroups.com
On 02.09.2015 17:11, Veaceslav Falico wrote:
> On Wed, Sep 02, 2015 at 04:45:48PM +0500, Valentine Sinitsyn wrote:
>> On 02.09.2015 16:32, Veaceslav Falico wrote:
>> ..snip...
>>> > Couldn't it be the case Linux has programmed interrupt affinity this
>>> way before the card was moved to a non-root cell? Just guessing, of
>>> course.
>>>
>>> Yeah, that was my initial thought as well, however even after PCI reset
>>> (done by the jailhouse driver when starting the cell) it's the same.
>>>
>>> BTW, did anyone succeed in using the network card in non root on x86?
>> Jan did, I believe. See inmates/e1000.c; it's supposed to run
>> non-root, and he was sharing some ping benchmarks IIRC.
>
> Yeah, but not really, afaik - no interrupts/MSI involved, just busy loop.
Could be, I don't remember the details.

>> I have this in short-term plans as well, but I'm on AMD.
>
> Ouch, good luck! :)
Thanks!

Valentine

Jan Kiszka

unread,
Sep 2, 2015, 11:18:29 AM9/2/15
to Veaceslav Falico, Valentine Sinitsyn, jailho...@googlegroups.com
On 2015-09-02 13:32, Veaceslav Falico wrote:
>
> On 2 Sep 2015 12:35 pm, "Valentine Sinitsyn"
> <valentine...@gmail.com <mailto:valentine...@gmail.com>> wrote:
>>
>> On 02.09.2015 15:02, Veaceslav Falico wrote:
>>>
>>> On Wed, Sep 02, 2015 at 02:47:15PM +0500, Valentine Sinitsyn wrote:
>>>>
>>>> Hi Veaceslav,
>>>>
>>>> On 02.09.2015 14:40, Veaceslav Falico wrote:
>>>>>
>>>>> On x86, Redirection Hint is only supported in physical Destination
> Mode,
>>>>> and otherwise should be ignored.
>>>>>
>>>>> This enables to use some hardware which sets DM 0/RH 1.
>>>>>
>>>>> Signed-off-by: Veaceslav Falico <vfa...@gmail.com
> <mailto:vfa...@gmail.com>>
Could you post the values you found in the registers (address/data) that
were considered invalid by Jailhouse? Just in case, it may be helpful to
attach the CPU map (which APIC IDs the cell owned) and the APIC mode
(x2APIC or xAPIC).

>
> BTW, did anyone succeed in using the network card in non root on x86?
>

Yes, we successfully assigned individual ports of an ET2 quad port
adapter to non-root Linux cells.

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux

Veaceslav Falico

unread,
Sep 3, 2015, 6:38:53 AM9/3/15
to Jan Kiszka, Valentine Sinitsyn, jailho...@googlegroups.com
Sure!
address 0x00000000fee00498, data 0x0000000000000000
Only CPU#1 is used for the cell, in x2APIC mode.

Jan Kiszka

unread,
Sep 3, 2015, 6:52:08 AM9/3/15
to Veaceslav Falico, Valentine Sinitsyn, jailho...@googlegroups.com
But that must be a to-be-remapped entry (bits set that are reserved for
plain MSI messages). Do you also have the translated one, after
resolving it via the interrupt remapping table? Of does this translation
not take place, which could point us to a different problem area.

Jan

Veaceslav Falico

unread,
Sep 3, 2015, 7:48:25 AM9/3/15
to Jan Kiszka, Valentine Sinitsyn, jailho...@googlegroups.com
Yep, I guess it's the to-be-remapped one, these are the values that I get in:

@@ -336,7 +349,10 @@ int arch_pci_update_msi(struct pci_device *device,
if (vectors == 0)
return 0;

+ printk("Total MSI vectors: %d, address %p, data %p\n", vectors, msi.raw.address, msi.raw.data);
+

The cell has normal (non-emulated) interrupts, so that's also the data I
get for the irq message, which is checked in iommu_map_interrupt - and,
obviously, fails, as:

1) Destination Mode == Physical && Redirection Hint == 1
and, even if this one is fixed, then
2) in apic_filter_irq_dest() - as the physical DM is used - CPU#0 is owned
by RootCell, and, thus, fails (via cell_owns_cpu()).

So I don't have the translated ones, as the cell fails beforehand...

Jan Kiszka

unread,
Sep 3, 2015, 8:44:47 AM9/3/15
to Veaceslav Falico, Valentine Sinitsyn, jailho...@googlegroups.com
So let me recap the scenario again to ensure I'm looking at the right bits:

- device was in use (at least enabled with interrupts on) in root cell
- device gets assigned to non-root cell
- non-root cell starts to use device, likely just enables it
- Jailhouse processing (pre-existing?) MSI setups and complains

Given that the device was previously in use under Linux which was using
(emulated) interrupt remapping, values in the HW regs may still contain
such remappable parameters. They do not make sense in a non-root cell
context, both regarding remappability (non-root cells don't have an
emulated IR IOMMU, they believe to have native interrupt delivery) as
well as target APIC IDs.

So, could it be that the non-root cell just doesn't properly initializes
the device before enabling it? Or Jailhouse fails to properly reset it
during handover.

Veaceslav Falico

unread,
Sep 3, 2015, 8:51:19 AM9/3/15
to Jan Kiszka, Valentine Sinitsyn, jailho...@googlegroups.com
Exactly the case.

As a side-note - if I blacklist the driver on boot of root cell (i.e. leave
the device as uninitialized as possible) - everything works for the
first(!!!) boot of non-root-cell. For the second one - again, garbage (or,
rather, previous values) in the MSI mem.

>
>Given that the device was previously in use under Linux which was using
>(emulated) interrupt remapping, values in the HW regs may still contain
>such remappable parameters. They do not make sense in a non-root cell
>context, both regarding remappability (non-root cells don't have an
>emulated IR IOMMU, they believe to have native interrupt delivery) as
>well as target APIC IDs.
>
>So, could it be that the non-root cell just doesn't properly initializes
>the device before enabling it? Or Jailhouse fails to properly reset it
>during handover.

Yep, most probably it's either the failure of reseting it (btw, afaik -
it's not reset directly - but relies on the device_release_driver() - which
doesn't guarantee it) - or that the initialization on-start somehow goes
wrong. I think it's the former.

I'm currently debugging it (though I've had a tooth removed yesterday, so
my abilities are not that great :D ).
Reply all
Reply to author
Forward
0 new messages