virtio-mmio vs virtio-pci


Richard W.M. Jones

Dec 22, 2017, 4:07:25 AM
to Michael Clark, sw-...@groups.riscv.org
On the subject of virtio-mmio, ARM's -M virt model moved from
virtio-mmio to virtio-pci a couple of years ago.

There were several reasons for this:

(1) virtio-mmio has abysmal performance. (As in: an order of
magnitude slower, at least).

(2) virtio-pci supports hot-plugging.

(3) Enhanced features of PCI like advanced error reporting and so on.

(4) Ability to add more devices. By using PCI root ports and multiple
buses you can add (literally) thousands of devices, whereas
virtio-mmio was limited by address space to relatively few devices. I
believe this applies even on 64 bit because your address space
requirements are dictated by having to be compatible with 32 bit VMs.

(5) Single code path used by x86 and [your other architecture]. For
the same reason we removed a lot of arm-specific hacks from libvirt &
other tools that generate qemu command lines.
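
To make (4) concrete, attaching virtio devices behind PCIe root ports
looks roughly like this on the qemu command line (the machine type,
image path and id names here are just examples, not a tested RISC-V
invocation):

```shell
# Each pcie-root-port needs a unique chassis number; virtio-pci
# devices then plug into the ports via bus=<port-id>.
qemu-system-aarch64 -M virt -nographic \
    -device pcie-root-port,id=rp0,chassis=1 \
    -device pcie-root-port,id=rp1,chassis=2 \
    -drive if=none,id=hd0,file=disk.img \
    -device virtio-blk-pci,bus=rp0,drive=hd0 \
    -netdev user,id=n0 \
    -device virtio-net-pci,bus=rp1,netdev=n0
```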

So, I'm just saying really. If RISC-V -M virt is just a temporary
thing then you maybe don't care about the above, but if you're looking
to get an -M virt model into upstream qemu long term then I suspect
use of virtio-mmio is going to be an issue.

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html

Stefan O'Rear

Dec 22, 2017, 4:16:21 AM
to Richard W.M. Jones, Michael Clark, RISC-V SW Dev
On Fri, Dec 22, 2017 at 1:07 AM, Richard W.M. Jones <rjo...@redhat.com> wrote:
> On the subject of virtio-mmio, ARM's -M virt model moved from
> virtio-mmio to virtio-pci a couple of years ago.
>
> There were several reasons for this:
>
> (1) virtio-mmio has abysmal performance. (As in: an order of
> magnitude slower, at least).
>
> (2) virtio-pci supports hot-plugging.
>
> (3) Enhanced features of PCI like advanced error reporting and so on.
>
> (4) Ability to add more devices. By using PCI root ports and multiple
> buses you can add (literally) thousands of devices, whereas
> virtio-mmio was limited by address space to relatively few devices. I
> believe this applies even on 64 bit because your address space
> requirements are dictated by having to be compatible with 32 bit VMs.
>
> (5) Single code path used by x86 and [your other architecture]. For
> the same reason we removed a lot of arm-specific hacks from libvirt &
> other tools that generate qemu command lines.
>
> So, I'm just saying really. If RISC-V -M virt is just a temporary
> thing then you maybe don't care about the above, but if you're looking
> to get an -M virt model into upstream qemu long term then I suspect
> use of virtio-mmio is going to be an issue.

Hi, I wrote the first version of this code, although I've been largely
unavailable for several months for a variety of personal reasons and
I'm thankful Michael Clark took it over.

Work on this code predates the RISC-V Linux Device Tree migration.
The pre-upstreaming kernel port could only support devices which were
modified to support Config String, and the only PCIe root port for
which such a modification was made was the Xilinx AXI/PCIe bridge.
There is no model for that PCIe root port in QEMU, and thus supporting
virtio-pci prior to kernel upstreaming would have required either
developing a new PCIe root port model in QEMU, or adding a Linux
driver for a Config String version of a root port that QEMU already
models.

Now that the Linux port supports all existing drivers through device
tree, the reason to use virtio-mmio is largely obviated.

-s

Michael Clark

Dec 22, 2017, 6:05:41 AM
to Richard W.M. Jones, Stefan O'Rear, RISC-V SW Dev
We can certainly consider adding any of the QEMU virtual devices that are architecture neutral and are supported by device tree.

Adding virtio_mmio was the path of least resistance to get virtio-block and virtio-net up and running. I took a 4-line fragment of code from Stefan’s original RISCVEMU pull request and added device-tree nodes by reading the device-tree comments in the linux-kernel virtio code.

https://github.com/torvalds/linux/blob/ead68f216110170ec729e2c4dec0aad6d38259d7/drivers/virtio/virtio_mmio.c#L32-L36

I ended up dumping the dts from the arm virt board, since in my first patch I had “virtio_block@<address>” instead of “virtio_mmio@<address>”.

qemu-system-aarch64 -nographic -machine virt,dumpdtb=arm-virt.out
fdtdump arm-virt.out

That still didn’t work because RISC-V device-tree has a different interrupt controller topology and we needed an “interrupt-parent” node (thanks to Palmer) inside of each device. Adding it at the top-level like arm’s device tree doesn’t work on RISC-V as the CLINT irqs are from a different irq domain and we had crashing kernels.
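For reference, the resulting node looks roughly like the following (the address, reg size and interrupt number are placeholders; the point is the per-device interrupt-parent referencing the PLIC):

```dts
/* sketch only -- address and irq number are illustrative */
virtio_mmio@10001000 {
        compatible = "virtio,mmio";
        reg = <0x0 0x10001000 0x0 0x1000>;
        interrupt-parent = <&plic>;
        interrupts = <1>;
};
```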

Most of the focus at the time was on debugging the PLIC implementation, as we had a bug that led to spurious interrupts, and we uncovered another subtle bug in the pre-existing RISC-V QEMU interrupt code that led to lost interrupts and deadlocks. In any case it was a lot of fun!

Thanks to Stefan, Palmer, Karsten, Sagar, Daire and Ivan from Emdalo and other testers and previous contributors, we now have most of the privileged ISA v1.10 (still need mtval, but Linux is not yet using it), working virtio-block, virtio-net, a working PLIC and working WFI, i.e. QEMU no longer chews 100% CPU when idle. We can attach network and disk, and we are close to getting SMP working. I can get ~100MB/sec from virtio-block. Sure, it’s not the 1GB/sec I can get from the SSD on the host, but it’s usable.

I suspect virtio-scsi-pci will perform better due to tagged command queuing; I suspect virtio-block can only have one request outstanding. Unless we are talking about PCI passthrough, emulated PCI DMA is still going to require memcpy. Whether it’s virtio-pci or virtio-mmio, both are interrupt driven, so the fact that virtio-scsi is faster than virtio-block is likely down to driver quality. i.e. there is no magic DMA controller we can use when we have fully virtual devices; it’s memcpy.

That said, I’m currently focusing on forward-porting the RISC-V cpu and devices to current QEMU master. There have been a lot of changes in the year since the last re-sync with upstream, and we also need to implement the AON for power-off and reset. The SiFiveUART needs work too. There is no shortage of things to do…

We’re happy to receive patches and pull requests. It’s going to take time to become feature complete… I’m not sure whether that should be a barrier to the code going upstream? Does moxie have virtio-pci? Is it a requirement for QEMU upstreaming?

Richard W.M. Jones

Dec 22, 2017, 6:27:17 AM
to Michael Clark, Stefan O'Rear, RISC-V SW Dev, drj...@redhat.com
On Sat, Dec 23, 2017 at 12:05:32AM +1300, Michael Clark wrote:
[...]

I'm super-happy that we have virtio anything working, so don't take it
as in any way disparaging of the very hard work you've done.

> I’m not sure whether that should be a barrier to code going
> upstream? Does moxie have virtio-pci? is it a requirement for QEMU
> upstreaming?

I can't speak for the qemu community since I've only worked on
relatively tiny bits of qemu. It's only my suspicion that virtio-mmio
could become a sticking point when upstreaming, particularly (I think)
since use of the ‘virt’ machine model name is going to invite
comparisons to ARM.

CC-ing Drew
[Drew, for context, the whole thread starts here:
https://groups.google.com/a/groups.riscv.org/forum/#!topic/sw-dev/3MLIq6rxa9Q
]

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v

Michael Clark

Dec 22, 2017, 5:37:16 PM
to Andrew Jones, Richard W.M. Jones, Stefan O'Rear, RISC-V SW Dev


> On 23/12/2017, at 5:03 AM, Andrew Jones <drj...@redhat.com> wrote:
>
> On Fri, Dec 22, 2017 at 11:27:13AM +0000, Richard W.M. Jones wrote:
>> On Sat, Dec 23, 2017 at 12:05:32AM +1300, Michael Clark wrote:
>> [...]
>>
>> I'm super-happy that we have virtio anything working, so don't take it
>> as in any way disparaging of the very hard work you’ve done.

No worries. I just wanted to point out that it takes time as you most certainly would understand…

>>> I’m not sure whether that should be a barrier to code going
>>> upstream? Does moxie have virtio-pci? is it a requirement for QEMU
>>> upstreaming?
>>
>> I can't speak for the qemu community since I've only worked only
>> relatively tiny bits of qemu. It's only my suspicion that virtio-mmio
>> could become a sticking point when upstreaming, particularly (I think)
>> since use of the ‘virt’ machine model name is going to invite
>> comparisons to ARM.
>>
>
> I think virtio-mmio will be around for some time to come. The vexpress
> model, for example, is exclusively virtio-mmio. Also a new feature,
> virtio-iommu, requires virtio-mmio. That said, if there's nothing
> stopping one from targeting virtio-pci from the start, then that would
> be the recommended approach, for all the things Richard said.

The question is which PCI host controller to emulate, as there seem to be a few emulated PCI hosts in QEMU.

There is GPEX, the QEMU Generic PCI Express Bridge:

- https://github.com/qemu/qemu/blob/master/hw/pci-host/gpex.c
- https://www.kernel.org/doc/Documentation/devicetree/bindings/pci/host-generic-pci.txt
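Going by the host-generic-pci binding above, a GPEX host shows up in the device tree roughly like this (all addresses, window sizes and the ranges entry are placeholders, not a worked-out RISC-V memory map):

```dts
/* sketch only -- all addresses are illustrative */
pci@30000000 {
        compatible = "pci-host-ecam-generic";
        device_type = "pci";
        #address-cells = <3>;
        #size-cells = <2>;
        bus-range = <0x0 0xff>;
        reg = <0x0 0x30000000 0x0 0x10000000>;  /* ECAM window */
        /* one 32-bit non-prefetchable MMIO window */
        ranges = <0x2000000 0x0 0x40000000
                  0x0 0x40000000 0x0 0x20000000>;
};
```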

There is also xilinx-pcie.c which has the advantage that it would match some FPGA configurations:

- https://github.com/qemu/qemu/blob/master/hw/pci-host/xilinx-pcie.c

There is also the possibility that another PCI host may be emulated to match actual hardware.

It needs some consideration, and perhaps some testing of the various options…

Richard W.M. Jones

Dec 22, 2017, 5:48:15 PM
to Michael Clark, Andrew Jones, Stefan O'Rear, RISC-V SW Dev
On Sat, Dec 23, 2017 at 11:37:08AM +1300, Michael Clark wrote:
> The question is which PCI host controller to emulate, as there seem to be a few emulated PCI hosts in QEMU.
>
> There is GPEX, the QEMU Generic PCI Express Bridge:
[...]

TBH I'd just follow whatever ARM's virt model does. It seems to be
using GPEX from a quick scan of the code.

There were a few other mistakes in the initial ARM virt model which
Drew and others have been working on fixing. Off the top of my head I
think the main ones were:

- limited max RAM (30 GB initially, since extended to 255 GB)

- limits on the max number of vCPUs possible

It turns out that enterprise customers really like to create huge VMs
with loads of vCPUs, for reasons I have never understood. But hey,
the customer is (usually) right!

Rich.

--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines. Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top