I’ve done some more instrumentation and I’m seeing SEPC returning from an interrupt handler to some random pre-empted code, however the interrupt appears to stay pending as I see a stream of SRET to the same address. I then determined that it is the timer interrupt that is calling SRET, and then immediately trapping again. It’s a Timer interrupt. Note the log message just before the hang, just before the scheduler clock is set up.
[ 0.000188] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps every 4398046511100ns
I disabled timer interrupts and the boot gets a bit further and then the same thing happens on external interrupts. Likely the VirtIO interrupts via the PLIC. I don’t think it is a problem with the PLIC because it appears to be showing up both on Timer interrupts and on external interrupts. It could be a bug either in Linux or QEMU. I suspect Linux, BBL or QEMU is not clearing the asynchronous interrupt after it is delivered, because it goes into an SRET loop to the same address which appears to be a random address of some pre-empted code, however it keeps calling SRET to the same address. Synchronous traps are fine. Page faulting is working. It’s somewhere in interrupt delivery and acknowledgement. Obviously some code that is affected by TARGET_LONG_BITS == 32, as 64-bit is not affected.
Here is a boot with Timer interrupts disabled. Of course it gets further, but then the same thing happens with External interrupts.
At least we have isolated the are of code. The question is why after SRET do we see an SRET to the exact same address in a loop (the address of the code that was pre-empted by the interrupt). This suggest after SRET is not actually completing or the interrupt pending flags are still set. Somehow STIP is not being cleared (or SEIP in the case of external). STIP is dispatched by bbl and SEIP is dispatched by the PLIC in QEMU so it suggests its not a bug in BBL.
We have to look for differences that affect 32-bit in these possible places.
- QEMU interrupt handling
- Linux interrupt handling
- BBL interrupt handling
I don’t want to point-the-finger to early, but I am suspecting QEMU, given we essentially see a stream of SRET to the same address after an asynchronous interrupt which suggests QEMU is not reseting QEMUs internal interrupt state after an “asynchronous exception” i.e. interrupt. Regular traps are fine.
In any case we have isolated the pieces of code we need to look at:
$ grep interrupt riscv-qemu/target/riscv/*.c
Note: I had enabled additional log messages by enabling RISCV_DEBUG_INTERRUPT and adding a log message in helper_sret that prints retpc, to figure this out. I needed to fix some log messages because when we converted them from printf to qemu_log, I wasn’t aware the messages needed a carriage return. So we needed this patch (I will merge into riscv-all at some point soon as it is a logging bug fix):
$ git show
commit 506c2c977350aac4bd1c9128c3fe325df6d8eb6a
Author: Michael Clark <
m...@sifive.com>
Date: Mon Apr 23 05:57:26 2018 +1200
RISC-V: Clean up interrupt debug messages
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index d70548e..3d2ac4d 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -442,11 +442,13 @@ void riscv_cpu_do_interrupt(CPUState *cs)
if (RISCV_DEBUG_INTERRUPT) {
int log_cause = cs->exception_index & RISCV_EXCP_INT_MASK;
if (cs->exception_index & RISCV_EXCP_INT_FLAG) {
- qemu_log_mask(LOG_TRACE, "core 0: trap %s, epc 0x" TARGET_FMT_lx,
- riscv_intr_names[log_cause], env->pc);
+ qemu_log_mask(LOG_TRACE, "core "
+ TARGET_FMT_ld ": trap %s, epc 0x" TARGET_FMT_lx "\n",
+ env->mhartid, riscv_intr_names[log_cause], env->pc);
} else {
- qemu_log_mask(LOG_TRACE, "core 0: intr %s, epc 0x" TARGET_FMT_lx,
- riscv_excp_names[log_cause], env->pc);
+ qemu_log_mask(LOG_TRACE, "core "
+ TARGET_FMT_ld ": intr %s, epc 0x" TARGET_FMT_lx "\n",
+ env->mhartid, riscv_excp_names[log_cause], env->pc);
}
}
@@ -508,8 +510,8 @@ void riscv_cpu_do_interrupt(CPUState *cs)
if (hasbadaddr) {
if (RISCV_DEBUG_INTERRUPT) {
- qemu_log_mask(LOG_TRACE, "core " TARGET_FMT_ld
- ": badaddr 0x" TARGET_FMT_lx, env->mhartid, env->badaddr);
+ qemu_log_mask(LOG_TRACE, "core " TARGET_FMT_ld ": badaddr 0x"
+ TARGET_FMT_lx "\n", env->mhartid, env->badaddr);
}
env->sbadaddr = env->badaddr;
} else {
@@ -533,8 +535,8 @@ void riscv_cpu_do_interrupt(CPUState *cs)
if (hasbadaddr) {
if (RISCV_DEBUG_INTERRUPT) {
- qemu_log_mask(LOG_TRACE, "core " TARGET_FMT_ld
- ": badaddr 0x" TARGET_FMT_lx, env->mhartid, env->badaddr);
+ qemu_log_mask(LOG_TRACE, "core " TARGET_FMT_ld ": badaddr 0x"
+ TARGET_FMT_lx "\n", env->mhartid, env->badaddr);
}
env->mbadaddr = env->badaddr;
} else {
Here is the boot log when timer interrupts are suppressed.
qemu-system-riscv32 -nographic -machine virt -kernel /home/mclark/src/sifive/freedom-u-sdk/work/riscv-pk/bbl \
-drive file=/home/mclark/src/sifive/freedom-u-sdk/work/rootfs.bin,format=raw,id=hd0 -device virtio-blk-device,drive=hd0 \
[ 0.000000] Linux version 4.15.0-00044-g2b0aa1de45f6 (mclark@minty) (gcc version 7.2.0 (GCC)) #2 Sun Apr 22 17:02:53 NZST 2018
[ 0.000000] bootconsole [early0] enabled
[ 0.000000] Initial ramdisk at: 0x(ptrval) (512 bytes)
[ 0.000000] Zone ranges:
[ 0.000000] DMA32 empty
[ 0.000000] Normal [mem 0x0000000080400000-0x0000087fffffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000080400000-0x0000000087ffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000080400000-0x0000000087ffffff]
[ 0.000000] software IO TLB [mem 0x83f06000-0x87f06000] (64MB) mapped at [(ptrval)-(ptrval)]
[ 0.000000] elf_hwcap is 0x112d
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 31496
[ 0.000000] Kernel command line: earlyprintk
[ 0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
[ 0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
[ 0.000000] Sorting __ex_table...
[ 0.000000] Memory: 55052K/126976K available (3663K kernel code, 127K rwdata, 624K rodata, 140K init, 214K bss, 71924K reserved, 0K cma-reserved)
[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] NR_IRQS: 0, nr_irqs: 0, preallocated irqs: 0
[ 0.000000] riscv,cpu_intc,0: 32 local interrupts mapped
[ 0.000000] riscv,plic0,c000000: mapped 10 interrupts to 1/2 handlers
[ 0.000000] clocksource: riscv_clocksource: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 191126044627 ns
[ 0.000187] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps every 4398046511100ns
[ 0.002023] Calibrating delay loop (skipped), value calculated using timer frequency.. 20.00 BogoMIPS (lpj=100000)
[ 0.002609] pid_max: default: 32768 minimum: 301
[ 0.003526] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.003909] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.045399] devtmpfs: initialized
[ 0.050450] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[ 0.051034] futex hash table entries: 256 (order: -1, 3072 bytes)
[ 0.054321] random: get_random_u32 called from bucket_table_alloc+0xb4/0x2b4 with crng_init=0
[ 0.055188] NET: Registered protocol family 16
[ 0.068161] SCSI subsystem initialized
[ 0.069764] pps_core: LinuxPPS API ver. 1 registered
[ 0.070051] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <
giom...@linux.it>
[ 0.070586] PTP clock support registered
[ 0.074874] clocksource: Switched to clocksource riscv_clocksource
[ 0.079980] NET: Registered protocol family 2
[ 0.082503] TCP established hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.082933] TCP bind hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.083343] TCP: Hash tables configured (established 1024 bind 1024)
[ 0.084791] UDP hash table entries: 256 (order: 0, 4096 bytes)
[ 0.085190] UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
[ 0.086358] NET: Registered protocol family 1
[ 0.090816] Unpacking initramfs...
[ 0.093681] Initialise system trusted keyrings
[ 0.094936] workingset: timestamp_bits=30 max_order=14 bucket_order=0
[ 0.109118] random: fast init done
[ 0.129122] Key type asymmetric registered
[ 0.129420] Asymmetric key parser 'x509' registered
[ 0.129828] io scheduler noop registered
[ 0.130485] io scheduler cfq registered (default)
[ 0.130788] io scheduler mq-deadline registered
[ 0.131066] io scheduler kyber registered
[ 0.259751] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 0.264592] console [ttyS0] disabled
[ 0.265455] 10000000.uart: ttyS0 at MMIO 0x10000000 (irq = 1, base_baud = 230400) is a 16550A
[ 0.266600] console [ttyS0] enabled
[ 0.266600] console [ttyS0] enabled
[ 0.267157] bootconsole [early0] disabled
[ 0.267157] bootconsole [early0] disabled
> To view this discussion on the web visit
https://groups.google.com/a/groups.riscv.org/d/msgid/sw-dev/53498111-711E-4729-9185-EA253E3665BF%40mac.com.