[RFC] tools: config create: add -P, --no-pci option


hw.cl...@gmail.com

Mar 16, 2015, 10:27:23 AM
to Jan Kiszka, Claudio Fontana, jailho...@googlegroups.com
From: Claudio Fontana <claudio...@huawei.com>

this works around the lack of VT-d or ACPI DMAR table.
For debugging and early enablement of guests, it might be
useful to ignore devices.

Signed-off-by: Claudio Fontana <claudio...@huawei.com>
---
tools/jailhouse-config-create | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)

Does this make sense?
We are playing around with the idea to enable other non-linux guests
in Jailhouse, and for that it will be a long way before we need to
care about devices. We will first try to just print a hello world
from the guest and die.
For this we shouldn't need VT-d, or any remapping of PCI devices
I would think..

Thank you for your comments,

Claudio

diff --git a/tools/jailhouse-config-create b/tools/jailhouse-config-create
index c3eebbe..6ed490c 100755
--- a/tools/jailhouse-config-create
+++ b/tools/jailhouse-config-create
@@ -50,6 +50,9 @@ parser.add_argument('-t', '--template-dir',
                     default=template_default_dir,
                     action='store',
                     type=str)
+parser.add_argument('-P', '--no-pci',
+                    help="skip parsing of pci devices",
+                    action="store_true")

 memargs = [['--mem-inmates', '2M', 'inmate'],
            ['--mem-hv', '64M', 'hypervisor']]
@@ -613,6 +616,8 @@ def parse_dmar_devscope(f):
 # parsing of DMAR ACPI Table
 # see Intel VT-d Spec chapter 8
 def parse_dmar(pcidevices, ioapics):
+    if not os.path.exists("/sys/firmware/acpi/tables/DMAR"):
+        raise RuntimeError('DMAR: no DMAR table found in sysfs. Try --no-pci ?')
     f = input_open('/sys/firmware/acpi/tables/DMAR', 'rb')
     signature = f.read(4)
     if signature != b'DMAR':
@@ -778,7 +783,10 @@ if jh_enabled == '1':
           file=sys.stderr)
     sys.exit(1)

-(pcidevices, pcicaps) = parse_pcidevices()
+if (options.no_pci):
+    (pcidevices, pcicaps) = ([], [])
+else:
+    (pcidevices, pcicaps) = parse_pcidevices()

 product = [input_readline('/sys/class/dmi/id/sys_vendor',
                           True).rstrip(),
@@ -797,7 +805,7 @@ mmconfig = MMConfig.parse()

 ioapics = parse_madt()

-if get_cpu_vendor() == 'GenuineIntel':
+if get_cpu_vendor() == 'GenuineIntel' and not options.no_pci:
     (dmar_units, rmrr_regs) = parse_dmar(pcidevices, ioapics)
 else:
     (dmar_units, rmrr_regs) = [], []
--
1.8.5.3
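
For readers who want to try the control flow of the patch in isolation, here is a minimal, self-contained sketch. It assumes stand-in helpers: `build_parser`, `parse_pcidevices_stub` and `collect_pci` are hypothetical names and are not part of jailhouse-config-create; only the `-P`/`--no-pci` option itself is taken from the patch.

```python
import argparse


def build_parser():
    """Build a parser that carries only the option the patch adds."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-P', '--no-pci',
                        help="skip parsing of pci devices",
                        action="store_true")
    return parser


def parse_pcidevices_stub():
    """Hypothetical stand-in for the real parse_pcidevices()."""
    return (['00:02.0'], ['cap0'])


def collect_pci(options):
    """Return empty device and capability lists when --no-pci is given."""
    if options.no_pci:
        return ([], [])
    return parse_pcidevices_stub()


if __name__ == '__main__':
    opts = build_parser().parse_args(['--no-pci'])
    print(collect_pci(opts))
```

With `action="store_true"`, `options.no_pci` defaults to `False`, so existing invocations keep scanning PCI devices.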

Jan Kiszka

Mar 16, 2015, 10:40:49 AM
to hw.cl...@gmail.com, Claudio Fontana, jailho...@googlegroups.com
On 2015-03-16 15:29, hw.cl...@gmail.com wrote:
> From: Claudio Fontana <claudio...@huawei.com>
>
> this works around the lack of VT-d or ACPI DMAR table.
> For debugging and early enablement of guests, it might be
> useful to ignore devices.
>
> Signed-off-by: Claudio Fontana <claudio...@huawei.com>
> ---
> tools/jailhouse-config-create | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> Does this make sense?

Hmm, for the current situation, it may to some degree. But my secret
plan is actually to make IOMMUs mandatory, also in the hypervisor, once
we have them fully available in QEMU/KVM.

Did you try this path already? It can be much more convenient than using
real hw - unless you depend on something that only physical hw can provide.

> We are playing around with the idea to enable other non-linux guests
> in Jailhouse, and for that it will be a long way before we need to
> care about devices. We will first try to just print a hello world
> from the guest and die.
> For this we shouldn't need VT-d, or any remapping of PCI devices
> I would think..

Yes, functionally you can get along without it as Linux / the root cell
will have a 1:1 mapping anyway. It would even suffice to just ignore the
missing IOMMU support, still scanning and listing all PCI devices (this
is how AMD is supported right now).

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux
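
The alternative suggested above (ignore the missing IOMMU support but keep scanning PCI devices) could look roughly like the sketch below. This is only an illustration under stated assumptions: `parse_dmar_or_empty` is a hypothetical helper, it reads the path directly instead of going through the tool's `input_open()`, and it does not decode the table contents.

```python
import os
import sys


def parse_dmar_or_empty(path='/sys/firmware/acpi/tables/DMAR'):
    """Return (dmar_units, rmrr_regs); both empty if no DMAR table exists.

    Hypothetical helper: the real parse_dmar() decodes the remapping
    structures, which this sketch does not attempt.
    """
    if not os.path.exists(path):
        # Warn and carry on instead of aborting config creation.
        print('WARNING: no DMAR table found, continuing without IOMMU units',
              file=sys.stderr)
        return ([], [])
    with open(path, 'rb') as f:
        signature = f.read(4)
    if signature != b'DMAR':
        raise RuntimeError('DMAR: incorrect input file format')
    # Real decoding of the table would go here.
    return ([], [])
```

In this variant the PCI device list stays populated; only the IOMMU-related lists end up empty.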

hw.cl...@gmail.com

Mar 18, 2015, 5:24:22 AM
to jailho...@googlegroups.com, hw.cl...@gmail.com, claudio...@huawei.com
On Monday, 16 March 2015 15:40:49 UTC+1, J. Kiszka wrote:
> On 2015-03-16 15:29, hw.cl...@gmail.com wrote:
> > From: Claudio Fontana <claudio...@huawei.com>
> >
> > this works around the lack of VT-d or ACPI DMAR table.
> > For debugging and early enablement of guests, it might be
> > useful to ignore devices.
> >
> > Signed-off-by: Claudio Fontana <claudio...@huawei.com>
> > ---
> > tools/jailhouse-config-create | 12 ++++++++++--
> > 1 file changed, 10 insertions(+), 2 deletions(-)
> >
> > Does this make sense?
>
> Hmm, for the current situation, it may to some degree. But my secret
> plan is actually to make IOMMUs mandatory, also in the hypervisor, once
> we have them fully available in QEMU/KVM.
>
> Did you try this path already? It can be much more convenient than using
> real hw - unless you depend on something that only physical hw can provide.

We are trying both over here, qemu and physical hw.

>
> > We are playing around with the idea to enable other non-linux guests
> > in Jailhouse, and for that it will be a long way before we need to
> > care about devices. We will first try to just print a hello world
> > from the guest and die.
> > For this we shouldn't need VT-d, or any remapping of PCI devices
> > I would think..
>
> Yes, functionally you can get along without it as Linux / the root cell
> will have a 1:1 mapping anyway. It would even suffice to just ignore the
> missing IOMMU support, still scanning and listing all PCI devices (this
> is how AMD is supported right now).

I got it to work on the hardware by using the skeleton iommu_ functions for AMD on Intel as well, to survive iommu_init().

Also I needed to edit the my-machine.c file because of a region which was not page aligned:

/* MemRegion: 000ce800-000cf7ff : Adapter ROM */

which caused check_mem_regions() to fail.

>
> Jan
>

Ciao,

Claudio
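
To see why that region trips an alignment check, here is a small sketch. It assumes 4 KiB pages and that both the start address and the size must be page multiples; `is_page_aligned` is a hypothetical helper, not the hypervisor's check_mem_regions().

```python
PAGE_SIZE = 0x1000  # 4 KiB pages, as on x86


def is_page_aligned(start, end):
    """Check an inclusive [start, end] range for page-aligned start and size."""
    size = end - start + 1
    return start % PAGE_SIZE == 0 and size % PAGE_SIZE == 0


if __name__ == '__main__':
    # The Adapter ROM region from the generated config: starts at offset
    # 0x800 into a page, so the start address is not aligned.
    print(is_page_aligned(0x000ce800, 0x000cf7ff))
```

The size of the region is exactly 0x1000 bytes; it is only the start address that is off by 0x800.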

Jan Kiszka

Mar 18, 2015, 7:39:02 AM
to hw.cl...@gmail.com, jailho...@googlegroups.com, claudio...@huawei.com
If you are running on hardware that does expose VT-d but you don't
need it, just look at iommu_init, "WARNING: No VT-d support found!\n",
and make sure your machine takes that path. That's what happens on QEMU
as well when you leave iommu=off.

>
> Also I needed to edit the my-machine.c file because of a region which was not page aligned:
>
> /* MemRegion: 000ce800-000cf7ff : Adapter ROM */
>
> which caused check_mem_regions() to fail.

OK, that's a config creator issue. At least as long as we can safely
coalesce that region with the neighbors or have no sub-page dispatching
(the latter is on the to-do list, but will have performance implications).
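
One way a config creator could coalesce such a region with its neighbors is to widen it to full pages. The sketch below assumes 4 KiB pages and that the neighboring bytes are safe to include, which is exactly the condition stated above; `page_align_outward` is a hypothetical helper, not existing tool code.

```python
PAGE_SIZE = 0x1000  # 4 KiB pages, as on x86


def page_align_outward(start, end):
    """Widen an inclusive [start, end] range to whole pages."""
    aligned_start = start & ~(PAGE_SIZE - 1)  # round start down
    aligned_end = end | (PAGE_SIZE - 1)       # round end up to last byte of page
    return (aligned_start, aligned_end)


if __name__ == '__main__':
    start, end = page_align_outward(0x000ce800, 0x000cf7ff)
    print('%08x-%08x' % (start, end))
```

For the Adapter ROM region this yields two full pages, 000ce000-000cffff.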

Jan Kiszka

Mar 18, 2015, 7:41:30 AM
to hw.cl...@gmail.com, jailho...@googlegroups.com, claudio...@huawei.com
On 2015-03-18 12:38, Jan Kiszka wrote:
>>
>> Also I needed to edit the my-machine.c file because of a region which was not page aligned:
>>
>> /* MemRegion: 000ce800-000cf7ff : Adapter ROM */
>>
>> which caused check_mem_regions() to fail.
>
> OK, that's a config creator issue. At least as long as we can safely
> coalesce that region with the neighbors or have no sub-page dispatching
> (the latter is on the to-do list, but will have performance implications).

Oh, and could you share the collected state (config collect) of that
machine so that we can look into the issue?

Claudio Fontana

Mar 18, 2015, 9:07:27 AM
to Jan Kiszka, jailho...@googlegroups.com, Claudio Fontana
Sure I'll try to attach the tar file here. I generated it with

$ jailhouse config collect my-machine.tar

with the following tiny patch to jailhouse-config-collect.tmpl:


---------------------
diff --git a/tools/jailhouse-config-collect.tmpl b/tools/jailhouse-config-collect.tmpl
index d155b18..3790d5f 100644
--- a/tools/jailhouse-config-collect.tmpl
+++ b/tools/jailhouse-config-collect.tmpl
@@ -58,7 +58,9 @@ for f in $filelist; do
 done
 grep GenuineIntel /proc/cpuinfo > /dev/null &&
     for f in $filelist_intel; do
-        copy_file $f
+        if [ -f $f ]; then
+            copy_file $f
+        fi
     done
 for f in $filelist_opt; do
     if [ -f $f ]; then
--------------------------

By the way my understanding is that my processor _should_ support VT-d
(i7-2600), but my board and firmware do not.
Maybe that's the reason why I get in trouble with the jailhouse iommu code..

Thanks,

Claudio
my-machine.tar
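
The idea of the collector patch above (copy a file only if it exists) can be sketched in Python as follows. This is an illustration, not the collector itself: `copy_existing` is a hypothetical helper, and unlike the script's copy_file it flattens everything into one directory.

```python
import os
import shutil
import tempfile


def copy_existing(paths, dest_dir):
    """Copy each regular file in paths into dest_dir, skipping missing ones."""
    copied = []
    for path in paths:
        if os.path.isfile(path):  # same test as `[ -f $f ]` in the patch
            target = os.path.join(dest_dir, os.path.basename(path))
            shutil.copy(path, target)
            copied.append(path)
    return copied


if __name__ == '__main__':
    dest = tempfile.mkdtemp()
    # On a board without VT-d the DMAR table is absent and simply skipped.
    print(copy_existing(['/sys/firmware/acpi/tables/DMAR'], dest))
```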

Jan Kiszka

Mar 18, 2015, 10:50:09 AM
to Claudio Fontana, jailho...@googlegroups.com, Claudio Fontana
On 2015-03-18 14:07, Claudio Fontana wrote:
> Sure I'll try to attach the tar file here. I generated it with
>
> $ jailhouse config collect my-machine.tar
>

Thanks!

> with the following tiny patch to jailhouse-config-collect.tmpl:
>
>
> ---------------------
> diff --git a/tools/jailhouse-config-collect.tmpl
> b/tools/jailhouse-config-collect.tmpl
> index d155b18..3790d5f 100644
> --- a/tools/jailhouse-config-collect.tmpl
> +++ b/tools/jailhouse-config-collect.tmpl
> @@ -58,7 +58,9 @@ for f in $filelist; do
> done
> grep GenuineIntel /proc/cpuinfo > /dev/null &&
> for f in $filelist_intel; do
> - copy_file $f
> + if [ -f $f ]; then
> + copy_file $f
> + fi

Yeah, sure, the config collector demands VT-d (a DMAR table).

> done
> for f in $filelist_opt; do
> if [ -f $f ]; then
> --------------------------
>
> By the way my understanding is that my processor _should_ support VT-d
> (i7-2600), but my board and firmware do not.
> Maybe that's the reason why I get in trouble with the jailhouse iommu code..

VT-d is a chipset feature (maybe there is some internal dependency to
the CPU, but that's not visible for the programmer). If the board (ACPI)
does not report VT-d support, Jailhouse will complain but continue to
work, just like under QEMU.

Claudio Fontana

Mar 18, 2015, 11:05:11 AM
to Jan Kiszka, hw.cl...@gmail.com, jailho...@googlegroups.com
On 18.03.2015 12:41, Jan Kiszka wrote:
> On 2015-03-18 12:38, Jan Kiszka wrote:
>>>
>>> Also I needed to edit the my-machine.c file because of a region which was not page aligned:
>>>
>>> /* MemRegion: 000ce800-000cf7ff : Adapter ROM */
>>>
>>> which caused check_mem_regions() to fail.
>>
>> OK, that's a config creator issue. At least as long as we can safely
>> coalesce that region with the neighbors or have no sub-page dispatching
>> (the latter is on the to-do list, but will have performance implications).
>
> Oh, and could you share the collected state (config collect) of that
> machine so that we can look into the issue?
>
> Jan

Btw I removed the discrete nvidia card from the intel box, and that memory region
disappeared.

C.



--
Claudio Fontana
Server Virtualization Architect
Huawei Technologies Duesseldorf GmbH
Riesstraße 25 - 80992 München

office: +49 89 158834 4135
mobile: +49 15253060158

Jan Kiszka

Mar 18, 2015, 11:12:49 AM
to Claudio Fontana, hw.cl...@gmail.com, jailho...@googlegroups.com, Henning Schild
On 2015-03-18 16:05, Claudio Fontana wrote:
> On 18.03.2015 12:41, Jan Kiszka wrote:
>> On 2015-03-18 12:38, Jan Kiszka wrote:
>>>>
>>>> Also I needed to edit the my-machine.c file because of a region which was not page aligned:
>>>>
>>>> /* MemRegion: 000ce800-000cf7ff : Adapter ROM */
>>>>
>>>> which caused check_mem_regions() to fail.
>>>
>>> OK, that's a config creator issue. At least as long as we can safely
>>> coalesce that region with the neighbors or have no sub-page dispatching
>>> (the latter is on the to-do list, but will have performance implications).
>>
>> Oh, and could you share the collected state (config collect) of that
>> machine so that we can look into the issue?
>>
>> Jan
>
> Btw I removed the discrete nvidia card from the intel box, and that memory region
> disappeared.

Yeah, but I bet it will bite us in more cases in the future with other
option ROMs, e.g. of discrete storage controllers or network adapters.
This is the layout Linux reported in your case:

...
000a0000-000bffff : PCI Bus 0000:00
000c0000-000dffff : PCI Bus 0000:00
000c0000-000c7fff : Video ROM
000ce800-000cf7ff : Adapter ROM
000e0000-000fffff : reserved
000f0000-000fffff : System ROM
...

Maybe we should just take the ROM region (c0000..System ROM) as single
chunk and avoid breaking it down. In Jailhouse, we don't run the ROMs
during non-root cell boot or even just map them over, even if the
associated device is assigned to a non-root cell. Or what do you think,
Henning?
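
A rough sketch of the single-chunk idea, using the layout quoted above. It assumes the chunk runs from 0xc0000 through the end of the "System ROM" entry, which is one possible reading of "c0000..System ROM"; `parse_layout` and `rom_chunk` are hypothetical helpers, not part of jailhouse-config-create.

```python
import re

LAYOUT = """\
000a0000-000bffff : PCI Bus 0000:00
000c0000-000dffff : PCI Bus 0000:00
000c0000-000c7fff : Video ROM
000ce800-000cf7ff : Adapter ROM
000e0000-000fffff : reserved
000f0000-000fffff : System ROM
"""

ROM_START = 0xc0000  # start of the legacy ROM area named in the mail


def parse_layout(text):
    """Parse /proc/iomem-style lines into (start, end, name) tuples."""
    regions = []
    for line in text.splitlines():
        m = re.match(r'\s*([0-9a-f]+)-([0-9a-f]+) : (.*)', line)
        if m:
            regions.append((int(m.group(1), 16), int(m.group(2), 16),
                            m.group(3).strip()))
    return regions


def rom_chunk(regions):
    """Return one (start, end) chunk from ROM_START to the end of System ROM."""
    system_rom_end = max(end for (_, end, name) in regions
                         if name == 'System ROM')
    return (ROM_START, system_rom_end)


if __name__ == '__main__':
    start, end = rom_chunk(parse_layout(LAYOUT))
    print('%08x-%08x : ROMs' % (start, end))
```

Treating the area as one chunk sidesteps the unaligned Adapter ROM entry, since both ends of the combined range fall on page boundaries.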

Claudio Fontana

Mar 18, 2015, 11:21:27 AM
to Jan Kiszka, hw.cl...@gmail.com, jailho...@googlegroups.com
On 18.03.2015 16:05, Claudio Fontana wrote:
> On 18.03.2015 12:41, Jan Kiszka wrote:
>> On 2015-03-18 12:38, Jan Kiszka wrote:
>>>>
>>>> Also I needed to edit the my-machine.c file because of a region which was not page aligned:
>>>>
>>>> /* MemRegion: 000ce800-000cf7ff : Adapter ROM */
>>>>
>>>> which caused check_mem_regions() to fail.
>>>
>>> OK, that's a config creator issue. At least as long as we can safely
>>> coalesce that region with the neighbors or have no sub-page dispatching
>>> (the latter is on the to-do list, but will have performance implications).
>>
>> Oh, and could you share the collected state (config collect) of that
>> machine so that we can look into the issue?
>>
>> Jan
>
> Btw I removed the discrete nvidia card from the intel box, and that memory region
> disappeared.
>
> C.


However, now that I have removed the graphics card and use the integrated Intel one, Jailhouse locks up the machine when I run

$ jailhouse enable configs/my-machine.cell

Nothing interesting in the logs that I could see.

Claudio Fontana

Mar 18, 2015, 11:29:05 AM
to Jan Kiszka, Claudio Fontana, jailho...@googlegroups.com
Hmm, this "fallback" scenario does not seem to work for me on the physical hardware.
Jailhouse does not complain and continue to work; it returns error -EINVAL from enter_hypervisor on all CPUs
when I try jailhouse enable.

Would acpidump output be useful?

Thanks,

Claudio

Jan Kiszka

Mar 18, 2015, 11:34:54 AM
to Claudio Fontana, hw.cl...@gmail.com, jailho...@googlegroups.com
Do you have a serial cable attached that catches Jailhouse outputs? Will
be required for debugging.

Jan Kiszka

Mar 18, 2015, 11:37:09 AM
to Claudio Fontana, Claudio Fontana, jailho...@googlegroups.com
I've just pushed an instrumentation series to next that I happened to
prepare today. It will write out the file and line that raises -EINVAL
and can help us to track down the reason. Usage: create
hypervisor/include/jailhouse/config.h with "#define CONFIG_TRACE_ERROR
1", then recompile and run.

Valentine Sinitsyn

Mar 18, 2015, 11:38:26 AM
to Jan Kiszka, Claudio Fontana, jailho...@googlegroups.com, Claudio Fontana
Hi all,

On 18.03.2015 17:50, Jan Kiszka wrote:
> VT-d is a chipset feature (maybe there is some internal dependency to
> the CPU, but that's not visible for the programmer). If the board (ACPI)
> does not report VT-d support, Jailhouse will complain but continue to
> work, just like under QEMU.
I feel you won't be able to generate a config with 'jailhouse config
create' for an Intel system without VT-d support, however.

Valentine

Henning Schild

Mar 18, 2015, 12:53:43 PM
to Jan Kiszka, Claudio Fontana, hw.cl...@gmail.com, jailho...@googlegroups.com
That makes sense and I just sent a patch. For the other corner cases
from this thread I am not sure how to handle them elegantly. I guess
for now they can be ignored until we introduce sub-page "mappings".

Henning