‘probe of virtio0 failed with error -22’ with Linux 4.15 on qemu-upstream-v3

Richard W.M. Jones

Jan 30, 2018, 6:35:20 AM
to Michael Clark, RISC-V SW Dev, d...@redhat.com
Hi Michael,

I'm trying to boot Linux 4.15 (upstream commit d8a5b8056) on top of
riscv-qemu qemu-upstream-v3. However it fails to detect virtio-mmio
devices for unclear reasons (see full log below).

virtio_blk: probe of virtio0 failed with error -22
virtio_net: probe of virtio1 failed with error -22

errno -22 == -EINVAL. There are 3 places in the Linux virtio_mmio.c
file where it could return -EINVAL but I added debugging and it
doesn't seem to be any of those.
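
For reference, the negative numbers in these kernel messages are negated errno values. A minimal sketch for decoding them, assuming a Linux host whose Python errno table uses the same numbering as the kernel for these low values (the helper name decode_kernel_error is hypothetical):

```python
import errno
import os


def decode_kernel_error(code):
    """Map a negative kernel return code such as -22 to (name, description)."""
    number = abs(code)
    # errno.errorcode maps the numeric value to its symbolic name.
    name = errno.errorcode.get(number, "UNKNOWN")
    return name, os.strerror(number)


# -22 is the probe failure; -6 and -2 also appear in the boot log below.
for code in (-22, -6, -2):
    name, text = decode_kernel_error(code)
    print(f"{code}: {name} ({text})")
```

This only names the error; it does not say which call site returned it.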

Any ideas? Perhaps the command line is wrong for this qemu?
Is there a better branch of qemu to choose?

My Linux config is defconfig plus:
https://github.com/rwmjones/fedora-riscv-bootstrap/blob/master/kernel-config

Rich.

$ qemu-system-riscv64 \
-nographic -machine virt -m 2G \
-kernel host-tools/riscv64-unknown-elf/bin/bbl \
-append "console=ttyS0 ro root=/dev/vda init=/init" \
-device virtio-blk-device,drive=hd0 \
-drive file=stage3-disk.img,format=raw,id=hd0 \
-device virtio-net-device,netdev=usernet \
-netdev user,id=usernet${TELNET:+,hostfwd=tcp::10000-:23}

OF: fdt: Ignoring memory range 0x80000000 - 0x80200000
Linux version 4.15.0-dirty (rjo...@trick.home.annexia.org) (gcc version 7.3.1 20180129 (GCC)) #3 SMP Tue Jan 30 11:27:06 GMT 2018
Initial ramdisk at: 0x (ptrval) (512 bytes)
elf_hwcap is 0x112d
percpu: Embedded 15 pages/cpu @ (ptrval) s28824 r0 d32616 u61440
Built 1 zonelists, mobility grouping on. Total pages: 516615
Kernel command line: console=ttyS0 ro root=/dev/vda init=/init
Dentry cache hash table entries: 262144 (order: 9, 2097152 bytes)
Inode-cache hash table entries: 131072 (order: 8, 1048576 bytes)
Sorting __ex_table...
Memory: 2054992K/2095104K available (5380K kernel code, 334K rwdata, 1467K rodata, 184K init, 844K bss, 40112K reserved, 0K cma-reserved)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
Hierarchical RCU implementation.
RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=1.
RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
NR_IRQS: 0, nr_irqs: 0, preallocated irqs: 0
Console: colour dummy device 80x25
Calibrating delay loop (skipped), value calculated using timer frequency.. 20.00 BogoMIPS (lpj=40000)
pid_max: default: 32768 minimum: 301
Mount-cache hash table entries: 4096 (order: 3, 32768 bytes)
Mountpoint-cache hash table entries: 4096 (order: 3, 32768 bytes)
Hierarchical SRCU implementation.
smp: Bringing up secondary CPUs ...
smp: Brought up 1 node, 1 CPU
devtmpfs: initialized
cpu cpu0: Error -2 creating of_node link
random: get_random_u32 called from bucket_table_alloc+0xee/0x28e with crng_init=0
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
futex hash table entries: 256 (order: 2, 16384 bytes)
NET: Registered protocol family 16
vgaarb: loaded
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
NET: Registered protocol family 2
TCP established hash table entries: 16384 (order: 5, 131072 bytes)
TCP bind hash table entries: 16384 (order: 6, 262144 bytes)
TCP: Hash tables configured (established 16384 bind 16384)
UDP hash table entries: 1024 (order: 3, 32768 bytes)
UDP-Lite hash table entries: 1024 (order: 3, 32768 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
Unpacking initramfs...
Initialise system trusted keyrings
workingset: timestamp_bits=62 max_order=19 bucket_order=0
NFS: Registering the id_resolver key type
Key type id_resolver registered
Key type id_legacy registered
nfs4filelayout_init: NFSv4 File Layout Driver Registering...
nfs4flexfilelayout_init: NFSv4 Flexfile Layout Driver Registering...
random: fast init done
NET: Registered protocol family 38
Key type asymmetric registered
Asymmetric key parser 'x509' registered
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254)
io scheduler noop registered
io scheduler deadline registered
io scheduler cfq registered (default)
io scheduler mq-deadline registered
io scheduler kyber registered
Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
irq: no irq domain found for interrupt-controller@c000000 !
console [ttyS0] disabled
10000000.uart: ttyS0 at MMIO 0x10000000 (irq = 0, base_baud = 230400) is a 16550A
console [ttyS0] enabled
[drm] radeon kernel modesetting enabled.
loop: module loaded
virtio_blk: probe of virtio0 failed with error -22
libphy: Fixed MDIO Bus: probed
virtio_net: probe of virtio1 failed with error -22
e1000e: Intel(R) PRO/1000 Network Driver - 3.2.6-k
e1000e: Copyright(c) 1999 - 2015 Intel Corporation.
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci-pci: EHCI PCI platform driver
ehci-platform: EHCI generic platform driver
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
ohci-pci: OHCI PCI platform driver
ohci-platform: OHCI generic platform driver
usbcore: registered new interface driver uas
usbcore: registered new interface driver usb-storage
mousedev: PS/2 mouse device common for all mice
usbcore: registered new interface driver usbhid
usbhid: USB HID core driver
NET: Registered protocol family 10
Segment Routing with IPv6
sit: IPv6, IPv4 and MPLS over IPv4 tunneling driver
NET: Registered protocol family 17
Key type dns_resolver registered
Loading compiled-in X.509 certificates
VFS: Cannot open root device "vda" or unknown-block(0,0): error -6
Please append a correct "root=" boot option; here are the available partitions:
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.15.0-dirty #3
Call Trace:
[< (ptrval)>] walk_stackframe+0x0/0xa2
[< (ptrval)>] show_stack+0x26/0x34
[< (ptrval)>] dump_stack+0x5e/0x7c
[< (ptrval)>] panic+0xce/0x1e8
[< (ptrval)>] mount_block_root+0x17e/0x240
[< (ptrval)>] mount_root+0x108/0x124
[< (ptrval)>] prepare_namespace+0x116/0x15c
[< (ptrval)>] kernel_init_freeable+0x17e/0x1a2
[< (ptrval)>] kernel_init+0xe/0xf0
[< (ptrval)>] ret_from_syscall+0xa/0xe


--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines. Supports shell scripting,
bindings from many languages. http://libguestfs.org

Richard W.M. Jones

Jan 30, 2018, 9:09:30 AM
to Michael Clark, RISC-V SW Dev, d...@redhat.com
On Tue, Jan 30, 2018 at 11:35:17AM +0000, Richard W.M. Jones wrote:
> Hi Michael,
>
> I'm trying to boot Linux 4.15 (upstream commit d8a5b8056) on top of
> riscv-qemu qemu-upstream-v3. However it fails to detect virtio-mmio
> devices for unclear reasons (see full log below).
>
> virtio_blk: probe of virtio0 failed with error -22
> virtio_net: probe of virtio1 failed with error -22
>
> errno -22 == -EINVAL. There are 3 places in the Linux virtio_mmio.c
> file where it could return -EINVAL but I added debugging and it
> doesn't seem to be any of those.

I debugged this and it's "something" to do with interrupts, I've
no idea what though.

What happens is that it probes the virtio bus which probes the
device (eg. virtio-blk). At some point it tries to set up the
virtio queues, and a call to request_irq here returns -EINVAL:

https://github.com/torvalds/linux/blob/6304672b7f0a5c010002e63a075160856dc4f88d/drivers/virtio/virtio_mmio.c#L457

Looking at the log there does seem to be something wrong with
interrupts, eg:

> NR_IRQS: 0, nr_irqs: 0, preallocated irqs: 0
...
> irq: no irq domain found for interrupt-controller@c000000 !

Rich.

Richard W.M. Jones

Jan 30, 2018, 10:09:48 AM
to Michael Clark, RISC-V SW Dev, d...@redhat.com
Thanks to Stefan O'Rear we worked out what's going wrong.
Got the wrong kernel.

Is ‘riscv-all’ better than ‘riscv-next’?

Michael Clark

Jan 30, 2018, 2:42:45 PM
to Richard W.M. Jones, Palmer Dabbelt, RISC-V SW Dev, d...@redhat.com

> On 31/01/2018, at 4:09 AM, Richard W.M. Jones <rjo...@redhat.com> wrote:
>
> Thanks to Stefan O'Rear we worked out what's going wrong.
> Got the wrong kernel.
>
> Is ‘riscv-all’ better than ‘riscv-next’?

I’ve been performing QEMU regression testing with riscv-linux-4.14.

riscv-all would most likely be better than riscv-next. I should start performing regression testing with riscv-all.

As I understand it, riscv-all is the forward port of everything to the current Linux tree, whereas riscv-next just contains patches that are ready to be submitted upstream, i.e. not all of the RISC-V code is there yet.

Soon after 4.15 is released, I would expect there to be a riscv-linux-4.15 which is a forward port of riscv-all to the 4.15 release.

Palmer will correct me if I’m wrong.

Richard W.M. Jones

Jan 30, 2018, 2:51:22 PM
to Michael Clark, Palmer Dabbelt, RISC-V SW Dev, d...@redhat.com
OK thanks. I'm using riscv-all and it works, except ...

I'm still seeing this BUG_ON being reliably hit under load (probably
filesystem load):

https://github.com/torvalds/linux/blob/6304672b7f0a5c010002e63a075160856dc4f88d/fs/buffer.c#L1239

The kernel is unusable unless I comment out that line. There must be
something very different about our kernel configs or qemu command line
if you don't see it.

Rich.


Michael Clark

Jan 30, 2018, 2:54:53 PM
to Richard W.M. Jones, Palmer Dabbelt, RISC-V SW Dev, d...@redhat.com

> On 31/01/2018, at 8:51 AM, Richard W.M. Jones <rjo...@redhat.com> wrote:
>
> OK thanks. I'm using riscv-all and it works, except ...
>
> I'm still seeing this BUG_ON being reliably hit under load (probably
> filesystem load):
>
> https://github.com/torvalds/linux/blob/6304672b7f0a5c010002e63a075160856dc4f88d/fs/buffer.c#L1239
>
> The kernel is unusable unless I comment out that line. There must be
> something very different about our kernel configs or qemu command line
> if you don't see it.

Does it also happen with riscv-linux-4.15?

I can look into it… I’ll try to reproduce with riscv-all…

Michael Clark

Jan 30, 2018, 2:56:13 PM
to Richard W.M. Jones, Palmer Dabbelt, RISC-V SW Dev, d...@redhat.com
Sorry, I meant riscv-linux-4.14.

> I can look into it… I’ll try to reproduce with riscv-all…

Richard W.M. Jones

Jan 30, 2018, 3:10:23 PM
to Michael Clark, Palmer Dabbelt, RISC-V SW Dev, d...@redhat.com
On Wed, Jan 31, 2018 at 08:56:04AM +1300, Michael Clark wrote:
> Sorry I meant riscv-linux-4.14

I think that was the version I was using before.

In any case here's the stack trace:

https://groups.google.com/a/groups.riscv.org/d/msg/sw-dev/v05FjcGC1EI/atXXUAcsCgAJ

Rich.


Richard W.M. Jones

Jan 31, 2018, 2:48:42 PM
to Michael Clark, Palmer Dabbelt, RISC-V SW Dev, d...@redhat.com, sor...@gmail.com

The attached patch written by Stefan O'Rear does appear to
fix the issue for me.

Rich.

0001-riscv-Fix-enable-interrupts.patch

Andrew Waterman

Jan 31, 2018, 5:15:45 PM
to Richard W.M. Jones, Michael Clark, Palmer Dabbelt, RISC-V SW Dev, DJ Delorie, Stefan O'Rear
Thanks, Stefan and Rich. Before going with this fix, can we make sure
we understand why the kernel is entering user mode when sstatus.PIE=0?
It's not necessarily a bug, but it smells to me like sstatus isn't
being managed quite right, and so it's maybe better fixed a different
way.

Stefan O'Rear

Jan 31, 2018, 5:28:19 PM
to Andrew Waterman, Richard W.M. Jones, Michael Clark, Palmer Dabbelt, RISC-V SW Dev, DJ Delorie
On Wed, Jan 31, 2018 at 2:15 PM, Andrew Waterman
<wate...@eecs.berkeley.edu> wrote:
> Thanks, Stefan and Rich. Before going with this fix, can we make sure
> we understand why the kernel is entering user mode when sstatus.PIE=0?
> It's not necessarily a bug, but it smells to me like sstatus isn't
> being managed quite right, and so it's maybe better fixed a different
> way.

I agree, and I'm continuing to investigate that. So far I've ruled
out execve() but there's a lot left to cover.

-s

Stefan O'Rear

Jan 31, 2018, 6:48:49 PM
to Andrew Waterman, Richard W.M. Jones, Michael Clark, Palmer Dabbelt, RISC-V SW Dev, DJ Delorie
Found the root cause. It's a QEMU bug, or rather, something in QEMU
that never got updated from privileged spec 1.9.1 to 1.10.

1.9.1: When a trap is taken from privilege mode y into privilege mode
x, xPIE is set to the value of yIE; xIE is set to 0; and xPP is set
to y.

1.10: When a trap is taken from privilege mode y into privilege mode
x, xPIE is set to the value of xIE; xIE is set to 0; and xPP is set
to y.

QEMU: https://github.com/riscv/riscv-qemu/blob/qemu-upstream-v3/target/riscv/helper.c#L434
matches 1.9.1

Since UIE is hard-wired to 0 (N not implemented), SPIE is always 0
when the kernel is entered after a trap from U-mode, which the page
fault code misinterpreted as "S-mode interrupts disabled". Since the
page fault code is the only place SR_SPIE is used, the patch Richard
posted above is a sufficient workaround. Do we want a QEMU patch?
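
The difference between the two rules can be made concrete with a small model. This is a sketch only, not QEMU code: the bit positions follow the mstatus layout (UIE bit 0, SIE bit 1, SPIE bit 5, SPP bit 8), the helper names mirror the get_field/set_field calls in QEMU's helper.c, and the boolean priv_1_10 flag is a hypothetical stand-in for the spec version.

```python
MSTATUS_UIE = 1 << 0
MSTATUS_SIE = 1 << 1
MSTATUS_SPIE = 1 << 5
MSTATUS_SPP = 1 << 8
PRV_U, PRV_S = 0, 1


def get_field(reg, mask):
    # Shift the masked value down by the position of the mask's lowest set bit.
    return (reg & mask) // (mask & -mask)


def set_field(reg, mask, val):
    # Clear the field, then place val at the position of the lowest set bit.
    return (reg & ~mask) | ((val * (mask & -mask)) & mask)


def trap_to_s(mstatus, priv, priv_1_10):
    """Model a trap from mode `priv` into S-mode under either rule."""
    if priv_1_10:
        pie = get_field(mstatus, MSTATUS_SIE)           # 1.10: xPIE = xIE
    else:
        pie = get_field(mstatus, MSTATUS_UIE << priv)   # 1.9.1: xPIE = yIE
    s = set_field(mstatus, MSTATUS_SPIE, pie)
    s = set_field(s, MSTATUS_SPP, priv)
    return set_field(s, MSTATUS_SIE, 0)


# U-mode running with S-mode interrupts enabled and UIE hard-wired to 0.
for label, flag in (("1.9.1", False), ("1.10", True)):
    after = trap_to_s(MSTATUS_SIE, PRV_U, flag)
    print(label, "SPIE =", get_field(after, MSTATUS_SPIE))
# prints "1.9.1 SPIE = 0" then "1.10 SPIE = 1"
```

For a trap taken from S-mode the two rules pick the same bit, which is consistent with only traps from U-mode being affected.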

Should there be any paranoia code in Linux to diagnose this if it happens again?

-s

Andrew Waterman

Jan 31, 2018, 6:53:10 PM
to Stefan O'Rear, Richard W.M. Jones, Michael Clark, Palmer Dabbelt, RISC-V SW Dev, DJ Delorie
Nice catch.

I think rather than adding paranoia code to Linux, we should test this
in riscv-tests (and make sure QEMU runs them).

Michael Clark

Jan 31, 2018, 8:38:38 PM
to Stefan O'Rear, Andrew Waterman, Richard W.M. Jones, Palmer Dabbelt, RISC-V SW Dev, DJ Delorie
Does this look right? Let me know and I can commit it.

diff --git a/target/riscv/helper.c b/target/riscv/helper.c
index 9e4c5ef..a55e204 100644
--- a/target/riscv/helper.c
+++ b/target/riscv/helper.c
@@ -431,7 +431,8 @@ void riscv_cpu_do_interrupt(CPUState *cs)
}

target_ulong s = env->mstatus;
- s = set_field(s, MSTATUS_SPIE, get_field(s, MSTATUS_UIE << env->priv));
+ s = set_field(s, MSTATUS_SPIE, env->priv_ver >= PRIV_VERSION_1_10_0 ?
+ get_field(s, MSTATUS_SIE) : get_field(s, MSTATUS_UIE << env->priv));
s = set_field(s, MSTATUS_SPP, env->priv);
s = set_field(s, MSTATUS_SIE, 0);
csr_write_helper(env, s, CSR_MSTATUS);
@@ -451,7 +452,8 @@ void riscv_cpu_do_interrupt(CPUState *cs)
}

target_ulong s = env->mstatus;
- s = set_field(s, MSTATUS_MPIE, get_field(s, MSTATUS_UIE << env->priv));
+ s = set_field(s, MSTATUS_MPIE, env->priv_ver >= PRIV_VERSION_1_10_0 ?
+ get_field(s, MSTATUS_MIE) : get_field(s, MSTATUS_UIE << env->priv));
s = set_field(s, MSTATUS_MPP, env->priv);
s = set_field(s, MSTATUS_MIE, 0);
csr_write_helper(env, s, CSR_MSTATUS);


> On 1/02/2018, at 12:52 PM, Andrew Waterman <wate...@eecs.berkeley.edu> wrote:
>
> Nice catch.

Yes it was. I was unaware of that particular intricacy of priv 1.10.

Michael Clark

Jan 31, 2018, 8:58:50 PM
to Andrew Waterman, Stefan O'Rear, Richard W.M. Jones, Palmer Dabbelt, RISC-V SW Dev, DJ Delorie
Just want to make sure I’m getting this the right way around in sret/mret, i.e. that we restore the same field we stashed into xPIE in the trap.

diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c
index 00305b5..5d986dc 100644
--- a/target/riscv/op_helper.c
+++ b/target/riscv/op_helper.c
@@ -612,8 +612,10 @@ target_ulong helper_sret(CPURISCVState *env, target_ulong cpu_pc_deb)

target_ulong mstatus = env->mstatus;
target_ulong prev_priv = get_field(mstatus, MSTATUS_SPP);
- mstatus = set_field(mstatus, MSTATUS_UIE << prev_priv,
- get_field(mstatus, MSTATUS_SPIE));
+ mstatus = set_field(mstatus,
+ env->priv_ver >= PRIV_VERSION_1_10_0 ?
+ MSTATUS_SIE : MSTATUS_UIE << prev_priv,
+ get_field(mstatus, MSTATUS_SPIE));
mstatus = set_field(mstatus, MSTATUS_SPIE, 0);
mstatus = set_field(mstatus, MSTATUS_SPP, PRV_U);
riscv_set_mode(env, prev_priv);
@@ -635,8 +637,10 @@ target_ulong helper_mret(CPURISCVState *env, target_ulong cpu_pc_deb)

target_ulong mstatus = env->mstatus;
target_ulong prev_priv = get_field(mstatus, MSTATUS_MPP);
- mstatus = set_field(mstatus, MSTATUS_UIE << prev_priv,
- get_field(mstatus, MSTATUS_MPIE));
+ mstatus = set_field(mstatus,
+ env->priv_ver >= PRIV_VERSION_1_10_0 ?
+ MSTATUS_MIE : MSTATUS_UIE << prev_priv,
+ get_field(mstatus, MSTATUS_MPIE));
mstatus = set_field(mstatus, MSTATUS_MPIE, 0);
mstatus = set_field(mstatus, MSTATUS_MPP, PRV_U);
riscv_set_mode(env, prev_priv);
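
The property being checked here, that sret writes back the same interrupt-enable bit the trap saved into SPIE, can be sketched as a round trip through a small model. This is an illustration under assumptions, not QEMU code: ie_mask and the priv_1_10 flag are hypothetical names, and sret clears SPIE to 0 only because the posted diff does.

```python
MSTATUS_UIE, MSTATUS_SIE = 1 << 0, 1 << 1
MSTATUS_SPIE, MSTATUS_SPP = 1 << 5, 1 << 8
PRV_U, PRV_S = 0, 1


def get_field(reg, mask):
    return (reg & mask) // (mask & -mask)


def set_field(reg, mask, val):
    return (reg & ~mask) | ((val * (mask & -mask)) & mask)


def ie_mask(priv, priv_1_10):
    """Bit paired with SPIE: always SIE under 1.10, yIE under 1.9.1."""
    return MSTATUS_SIE if priv_1_10 else MSTATUS_UIE << priv


def trap_to_s(mstatus, priv, priv_1_10):
    # Stash the selected enable bit in SPIE, record the old mode, mask SIE.
    s = set_field(mstatus, MSTATUS_SPIE,
                  get_field(mstatus, ie_mask(priv, priv_1_10)))
    s = set_field(s, MSTATUS_SPP, priv)
    return set_field(s, MSTATUS_SIE, 0)


def sret(mstatus, priv_1_10):
    # Restore the enable bit from SPIE into the same field the trap read.
    prev_priv = get_field(mstatus, MSTATUS_SPP)
    s = set_field(mstatus, ie_mask(prev_priv, priv_1_10),
                  get_field(mstatus, MSTATUS_SPIE))
    s = set_field(s, MSTATUS_SPIE, 0)       # as in the posted diff
    s = set_field(s, MSTATUS_SPP, PRV_U)
    return s, prev_priv


before = MSTATUS_SIE                         # U-mode, S interrupts enabled
after, mode = sret(trap_to_s(before, PRV_U, True), True)
print(hex(before), hex(after), mode)
```

With the 1.10 rule the round trip from U-mode returns mstatus to its starting value in this model; the mret case has the same shape with the M-mode fields.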


> On 1/02/2018, at 2:41 PM, Andrew Waterman <wate...@eecs.berkeley.edu> wrote:
>
> Need the corresponding changes for MRET/SRET instructions, too, but it
> looks right so far to me.

Stefan O'Rear

Jan 31, 2018, 9:02:50 PM
to Michael Clark, Andrew Waterman, Richard W.M. Jones, Palmer Dabbelt, RISC-V SW Dev, DJ Delorie
Looks fine to me (although I'd have a slight preference for an
if-statement instead of a 4-line ternary).

-s

Michael Clark

Jan 31, 2018, 9:18:07 PM
to Stefan O'Rear, Andrew Waterman, Richard W.M. Jones, Palmer Dabbelt, RISC-V SW Dev, DJ Delorie
I started out with an if but it was a little verbose. The ternary worked quite well on the trap part of the patch style-wise, but unfortunately the 80-column rule prevents us from fitting the ternary expression onto fewer lines for mret/sret. I was trying to avoid introducing an “explicit” temporary.

I’ve pushed it to my qemu-devel branch which is where I stage commits before merging with the current upstream patch review working branch.

https://github.com/michaeljclark/riscv-qemu/commits/qemu-devel

I typically hold off promoting these working branches to riscv-next until I’ve submitted them for upstream review and am satisfied that they have had enough testing, i.e. have received testing feedback in addition to my own testing.

So in that case, we should probably backport this to riscv-next…

Stefan O'Rear

Jan 31, 2018, 9:26:44 PM
to Michael Clark, Andrew Waterman, Richard W.M. Jones, Palmer Dabbelt, RISC-V SW Dev, DJ Delorie
I can confirm that this QEMU patch fixes the crashes I was seeing
(boot the stage3 disk and attempt to install a RPM).

-s

Michael Clark

Feb 1, 2018, 1:29:13 AM
to Stefan O'Rear, Andrew Waterman, Richard W.M. Jones, Palmer Dabbelt, RISC-V SW Dev, DJ Delorie
Thanks. I’ve added it to qemu-upstream-v3 (for the v4 spin) and backported it to riscv-next.

Richard W.M. Jones

Feb 1, 2018, 3:53:42 AM
to Michael Clark, Stefan O'Rear, Andrew Waterman, Palmer Dabbelt, RISC-V SW Dev, DJ Delorie
I can also confirm that this patch to qemu works:

https://github.com/michaeljclark/riscv-qemu/commit/ad01d6e1ae7cf88dae72bc4f1fe31b240b52a3ed.patch

I have removed the various kernel patches I was using to
work around this and silence the BUG_ON.

Rich.


Michael Clark

Feb 3, 2018, 2:04:49 AM
to Richard W.M. Jones, Stefan O'Rear, Andrew Waterman, Palmer Dabbelt, RISC-V SW Dev, DJ Delorie
Hi Rich,

BTW - We now have poweroff in the riscv-qemu virt machine.

- https://github.com/riscv/riscv-qemu/commit/713f2c116481d568702759bcb1b7fed835a2d575

I was poking around in riscv-pk/bbl because whatever we implement, we need to expose it in device-tree for riscv-pk/bbl, and it would need a hook in sbi_shutdown in riscv-pk/bbl.

It turns out that using the test finisher device is simple: it is a pre-existing, non-HTIF way to do poweroff, and it is already detected by riscv-pk/bbl. I just needed to implement a simple MMIO device and expose it in the device tree. Poweroff should now work out of the box with current riscv-pk/bbl.

Ultimately we should probably be emulating GPIOs hooked up to power and reset, but given that the test finisher is already supported by riscv-pk/bbl, it seemed expedient. Now we can power off QEMU and, if we for example run riscv-tests in QEMU, we can return an exit code to the process running QEMU. I’d like to use this to start automating some of the QEMU testing…
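
Once the emulator process exits with a meaningful status, a test driver only needs to run it and inspect the return code. A minimal sketch, assuming the usual convention that exit status 0 means success (how the finisher value maps onto the QEMU exit status is not spelled out here, and run_and_classify is a hypothetical helper name):

```python
import subprocess
import sys


def run_and_classify(argv, timeout_s=60):
    """Run one command and map its exit status to a verdict."""
    try:
        done = subprocess.run(argv, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # A guest that never powers off would otherwise hang the test run.
        return "timeout", None
    verdict = "pass" if done.returncode == 0 else "fail"
    return verdict, done.returncode


# Stand-in child process so the sketch runs without QEMU installed; a real
# driver would pass the qemu-system-riscv64 command line instead.
print(run_and_classify([sys.executable, "-c", "raise SystemExit(3)"]))
```

The timeout matters for automation because a test image that fails to reach poweroff would block the whole run.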

Michael.

Stefan O'Rear

Feb 3, 2018, 2:19:35 AM
to Michael Clark, Richard W.M. Jones, Andrew Waterman, Palmer Dabbelt, RISC-V SW Dev, DJ Delorie
On Fri, Feb 2, 2018 at 11:04 PM, Michael Clark <m...@sifive.com> wrote:
> Hi Rich,
>
> BTW - We now have poweroff in the riscv-qemu virt machine.
>
> - https://github.com/riscv/riscv-qemu/commit/713f2c116481d568702759bcb1b7fed835a2d575
>
> I was poking around in riscv-pk/bbl because whatever we implement, we need to expose it in device-tree for riscv-pk/bbl, and it would need a hook in sbi_shutdown in riscv-pk/bbl
>
> It turns out using the test finisher device is simple and it is a pre-existing not-HTIF way to do poweroff, and it is already detected by riscv-pk/bbl I just needed to implement a simple mmio device and expose it in device tree. poweroff should now work out of the box with current riscv-pk/bbl.
>
> Ultimately we should be probably be emulating GPIOs hooked up to power and reset, but given the test finisher is already supported by riscv-pk/bbl, it seemed expedient. i.e. now we can power off QEMU and if we for example run riscv-tests in QEMU, we can return an exit code to the process running QEMU. I’d like to use this to start automating some of the QEMU testing…
>
> Michael.

Automated testing, with riscv-tests or otherwise, would be a most
excellent development. I'm working on reviewing Richard Henderson's
patchset ( https://github.com/rth7680/qemu/commits/tgt-riscv ; based
on qemu-upstream-v3 despite the timestamps ), which supports
multi-threaded TCG mode; it boots linux and works for a while, but
starts throwing RCU errors after 20 minutes under load (which do not
occur on the FPGA) and a testsuite would really help here.

Agreed on your assessments re. test-finisher and GPIOs. Once we have
Liviu Ionescu's semihosting protocol implemented in QEMU, we could
also use that for tests (4 magic instructions to return a
success/failure code from a test, already implemented in OpenOCD).

-s

Michael Clark

Feb 5, 2018, 3:45:53 AM
to Stefan O'Rear, Richard W.M. Jones, Andrew Waterman, Palmer Dabbelt, RISC-V SW Dev, DJ Delorie


> On 3/02/2018, at 8:19 PM, Stefan O'Rear <sor...@gmail.com> wrote:
>
> Automated testing, with riscv-tests or otherwise, would be a most
> excellent development. I'm working on reviewing Richard Henderson's
> patchset ( https://github.com/rth7680/qemu/commits/tgt-riscv ; based
> on qemu-upstream-v3 despite the timestamps ), which supports
> multi-threaded TCG mode; it boots linux and works for a while, but
> starts throwing RCU errors after 20 minutes under load (which do not
> occur on the FPGA) and a testsuite would really help here.

As you may have seen, I pulled in Richard’s changes; however, after running riscv-tests, I found that one of the changes broke FENCE.I. I’ve reverted the commit in the v3 branch (pre squash and rebase) and the v4 branch (post squash and rebase). We might need to spin a v5 patch series. I should have run riscv-tests beforehand, but now I’ll add it to my pre-release testing so I’ll catch any future regressions. Ideally we will automate this soon.

- https://github.com/riscv/riscv-qemu/tree/qemu-upstream-v3
- https://github.com/riscv/riscv-qemu/tree/qemu-upstream-v4

> Agreed on your assessments re. test-finisher and GPIOs. Once we have
> Liviu Ionescu's semihosting protocol implemented in QEMU, we could
> also use that for tests (4 magic instructions to return a
> success/failure code from a test, already implemented in OpenOCD).

The riscv-tests with virtual memory enabled run out of the box on the riscv-qemu spike_v1.10 machine. We just need to perform some result parsing and perhaps output XUnit XML format. I’d also like to build some automated tests that boot up different versions of Linux on the spike_v1.9.1, spike_v1.10 and virt machines (to test priv ISA 1.9.1 and priv ISA 1.10). We have all of the bits and pieces; we just have to tie them together. I’m going to script the build of the test images so the test set can easily be reproduced. It probably makes sense to store binaries somewhere, as checking out multiple versions of riscv-linux, busybox and bbl would take too much time to run in Travis. We are already hitting the 50 minute timeout for some of the current tests.
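
As a rough idea of what the XUnit output step could look like, here is a sketch that turns a name-to-result mapping into a testsuite document. The element and attribute names follow the common JUnit-style layout that CI tools tend to accept; the function name to_xunit and the sample data are hypothetical, not results from an actual run.

```python
import xml.etree.ElementTree as ET


def to_xunit(results, suite_name="riscv-tests"):
    """results maps test name -> None (passed) or a failure message string."""
    failures = sum(1 for msg in results.values() if msg is not None)
    suite = ET.Element("testsuite", name=suite_name,
                       tests=str(len(results)), failures=str(failures))
    for name in sorted(results):
        case = ET.SubElement(suite, "testcase", name=name)
        if results[name] is not None:
            # A <failure> child marks the test case as failed.
            ET.SubElement(case, "failure", message=results[name])
    return ET.tostring(suite, encoding="unicode")


# Hypothetical sample data, only to show the shape of the output.
sample = {"example-pass": None, "example-fail": "exit status 1"}
print(to_xunit(sample))
```

The mapping would be filled in by whatever drives the tests, for example from the exit status of each emulator run.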

I’d also like to add some deeper testing for virtual memory. i.e. checking that accessed and dirty are set correctly, and that SUM and MXR work. I have a starting point for some VM tests here:

- https://github.com/rv8-io/rv8/blob/master/src/test/test-m-sv39.S
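
To illustrate the kind of check meant by "accessed and dirty are set correctly", here is a small sketch of the expected page table entry after an access. The bit positions are the standard RISC-V PTE flag bits (V, R, W, X, U, G, A, D in bits 0 to 7); the function names are hypothetical, and the sketch assumes an implementation that updates A and D itself, whereas the privileged spec also allows raising a page fault instead. SUM and MXR are not modelled.

```python
PTE_V, PTE_R, PTE_W, PTE_X = 1 << 0, 1 << 1, 1 << 2, 1 << 3
PTE_U, PTE_G, PTE_A, PTE_D = 1 << 4, 1 << 5, 1 << 6, 1 << 7


def expected_pte_after_access(pte, is_store):
    """PTE value expected once a permitted access has completed."""
    expected = pte | PTE_A          # any access marks the page accessed
    if is_store:
        expected |= PTE_D           # a store also marks it dirty
    return expected


def check_ad_bits(pte_before, pte_after, is_store):
    """True if the observed PTE matches the expected A/D update."""
    return pte_after == expected_pte_after_access(pte_before, is_store)


leaf = PTE_V | PTE_R | PTE_W | PTE_U
print(hex(expected_pte_after_access(leaf, is_store=False)))
print(hex(expected_pte_after_access(leaf, is_store=True)))
```

A test would read the PTE back after a load and after a store and compare it against these expectations.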

It seems we have some floating point tests failing:

#!/bin/bash

QEMU=./riscv64-softmmu/qemu-system-riscv64
ALL_TESTS=$(find ../riscv-tools/riscv-tests/build/isa -name 'rv64*-v-*' -a ! -name '*.dump' | sort)

for i in ${ALL_TESTS}; do
    test=$(basename "$i")
    echo "${test}"
    "${QEMU}" -nographic -machine spike_v1.10 -kernel "$i"
done

rv64ua-v-amoadd_d
rv64ua-v-amoadd_w
rv64ua-v-amoand_d
rv64ua-v-amoand_w
rv64ua-v-amomax_d
rv64ua-v-amomax_w
rv64ua-v-amomaxu_d
rv64ua-v-amomaxu_w
rv64ua-v-amomin_d
rv64ua-v-amomin_w
rv64ua-v-amominu_d
rv64ua-v-amominu_w
rv64ua-v-amoor_d
rv64ua-v-amoor_w
rv64ua-v-amoswap_d
rv64ua-v-amoswap_w
rv64ua-v-amoxor_d
rv64ua-v-amoxor_w
rv64ua-v-lrsc
rv64uc-v-rvc
rv64ud-v-fadd
rv64ud-v-fclass
rv64ud-v-fcmp
rv64ud-v-fcvt
rv64ud-v-fcvt_w
rv64ud-v-fdiv
rv64ud-v-fmadd
rv64ud-v-fmin
*** FAILED *** (tohost = 20)rv64ud-v-ldst
*** FAILED *** (tohost = 6)rv64ud-v-move
*** FAILED *** (tohost = 40)rv64ud-v-recoding
rv64ud-v-structural
rv64uf-v-fadd
rv64uf-v-fclass
rv64uf-v-fcmp
rv64uf-v-fcvt
rv64uf-v-fcvt_w
rv64uf-v-fdiv
rv64uf-v-fmadd
rv64uf-v-fmin
*** FAILED *** (tohost = 20)rv64uf-v-ldst
rv64uf-v-move
rv64uf-v-recoding
rv64ui-v-add
rv64ui-v-addi
rv64ui-v-addiw
rv64ui-v-addw
rv64ui-v-and
rv64ui-v-andi
rv64ui-v-auipc
rv64ui-v-beq
rv64ui-v-bge
rv64ui-v-bgeu
rv64ui-v-blt
rv64ui-v-bltu
rv64ui-v-bne
rv64ui-v-fence_i
rv64ui-v-jal
rv64ui-v-jalr
rv64ui-v-lb
rv64ui-v-lbu
rv64ui-v-ld
rv64ui-v-lh
rv64ui-v-lhu
rv64ui-v-lui
rv64ui-v-lw
rv64ui-v-lwu
rv64ui-v-or
rv64ui-v-ori
rv64ui-v-sb
rv64ui-v-sd
rv64ui-v-sh
rv64ui-v-simple
rv64ui-v-sll
rv64ui-v-slli
rv64ui-v-slliw
rv64ui-v-sllw
rv64ui-v-slt
rv64ui-v-slti
rv64ui-v-sltiu
rv64ui-v-sltu
rv64ui-v-sra
rv64ui-v-srai
rv64ui-v-sraiw
rv64ui-v-sraw
rv64ui-v-srl
rv64ui-v-srli
rv64ui-v-srliw
rv64ui-v-srlw
rv64ui-v-sub
rv64ui-v-subw
rv64ui-v-sw
rv64ui-v-xor
rv64ui-v-xori
rv64um-v-div
rv64um-v-divu
rv64um-v-divuw
rv64um-v-divw
rv64um-v-mul
rv64um-v-mulh
rv64um-v-mulhsu
rv64um-v-mulhu
rv64um-v-mulw
rv64um-v-rem
rv64um-v-remu
rv64um-v-remuw
rv64um-v-remw
