The problem of write misa on QEMU and BBL

Zong Li

Apr 18, 2018, 10:40:08 PM

to Zong Li, and...@sifive.com, Palmer Dabbelt, m...@sifive.com, sw-...@groups.riscv.org

Hi all,

For the BBL part: fp_init in machine/minit.c clears the D and F bits of
the misa register and then asserts that the bits are cleared.
But misa is a WARL register, so an implementation may ignore the write,
and the assertion then fails.
So is it necessary to do this when the toolchain does not support the
D and F extensions?

For the QEMU part: writing misa triggers an illegal instruction
exception, but I think WARL should allow the write?

Michael Clark

Apr 19, 2018, 12:44:01 AM

to Zong Li, Zong Li, Andrew Waterman, Palmer Dabbelt, RISC-V SW Dev, QEMU Developers, Richard Henderson

Hi Zong,

QEMU in the riscv-all branch should have WARL behavior.

- https://github.com/riscv/riscv-qemu/commits/riscv-all

There is a bug in upstream. We have submitted patches to fix the issue for review on the qemu-devel mailing list. The patch series will be submitted for upstream review again shortly. We were holding off on the series as we didn’t classify it as a “critical bug” as QEMU was in soft-freeze for 2.12 and we weren’t able to get review in time to include this fix in the 2.12 release.

See “No traps on writes to misa,minstret,mcycle”

- https://github.com/riscv/riscv-qemu/commits/qemu-2.13-for-upstream

The history is that there were several unimplemented CSRs that had printf followed by exit. Richard Henderson said we should fix this. I changed several CSRs to cause illegal instruction traps instead of calling exit. That was a mistake, as CSRs that don’t support write are WARL (Write Any Read Legal). It was certainly better than having the simulation exit, as a CPU doesn’t typically have a way to “exit” like a C program; nevertheless, trapping was wrong. My mistake. See here for the history:

- https://github.com/riscv/riscv-qemu/blob/ff36f2f77ec3e6a6211c63bfe1707ec057b12f7d/target-riscv/op_helper.c

The implementation in the current tree is quite different. We have recently made the CSR system more modular so that with minor changes, custom CPUs will be able to hook their own control and status registers.

- https://github.com/riscv/riscv-qemu/blob/qemu-2.13-for-upstream/target/riscv/csr.c#L780-L867

See these changes:

- https://github.com/riscv/riscv-qemu/commit/9d9c1bfef911c520a35bd3f8c0ed2e14cc96bbb7
- https://github.com/riscv/riscv-qemu/commit/b5a4cd79ce6c7fbb65fdcf078fb9a8391da1d6b1

We now have a flexible system that will allow implementations to hook per-CPU control and status registers, and we have predicates that make CSRs appear on some processors but not on others, e.g. if misa.S is not present, then S-mode s* CSRs will trap. Sometimes WARL is the correct behaviour, but sometimes trapping is the correct behaviour, e.g. if the processor does not implement S-mode.
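As a rough illustration of the predicate idea, here is a stand-alone C sketch. It is not the QEMU implementation: `struct cpu_model`, `csr_table`, `csr_check` and the predicate names are hypothetical, and the only facts borrowed from the architecture are the CSR numbers (misa 0x301, sstatus 0x100, sepc 0x141) and the misa bit positions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* misa bit for a single-letter extension, e.g. EXT('S') is bit 18. */
#define EXT(c) (UINT32_C(1) << ((c) - 'A'))

/* Hypothetical CPU state: only the misa value matters for this sketch. */
struct cpu_model {
    uint32_t misa;
};

/* A predicate returns 0 if the CSR exists on this CPU, -1 if access traps. */
typedef int (*csr_predicate_fn)(const struct cpu_model *cpu);

static int csr_always(const struct cpu_model *cpu)
{
    (void)cpu;
    return 0;
}

static int csr_needs_smode(const struct cpu_model *cpu)
{
    return (cpu->misa & EXT('S')) ? 0 : -1;
}

struct csr_entry {
    uint16_t csrno;
    csr_predicate_fn predicate;
};

static const struct csr_entry csr_table[] = {
    { 0x301, csr_always },      /* misa */
    { 0x100, csr_needs_smode }, /* sstatus */
    { 0x141, csr_needs_smode }, /* sepc */
};

/* Returns 0 if the access may proceed, -1 for an illegal instruction trap. */
static int csr_check(const struct cpu_model *cpu, uint16_t csrno)
{
    for (size_t i = 0; i < sizeof csr_table / sizeof csr_table[0]; i++) {
        if (csr_table[i].csrno == csrno) {
            return csr_table[i].predicate(cpu);
        }
    }
    return -1; /* CSR not implemented at all in this sketch */
}
```

With this shape, a CPU model without misa.S gets a trap on sstatus, while misa itself stays accessible on every model.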

misa traps on write should only affect bootloaders, as supervisors like Linux don’t yet have access to the isa register. It’s not a major issue.

Michael.

Zong Li

Apr 19, 2018, 5:28:27 AM

to Michael Clark, Zong Li, Andrew Waterman, Palmer Dabbelt, RISC-V SW Dev, QEMU Developers, Richard Henderson

Hi Michael,

Thanks for the information. The new CSR system is helpful for custom
CPUs such as ours. Thanks.

In the future, maybe we can do something like this in BBL for flexible
custom platforms that have their own devices to control the timer, IPI
and so on.

Back to the misa problem in BBL: in fp_init, during BBL's initial phase,
the assertion will fail because the bits of misa will not be cleared.

The code looks like this:

uintptr_t fd_mask = (1 << ('F' - 'A')) | (1 << ('D' - 'A'));
clear_csr(misa, fd_mask);
assert(!(read_csr(misa) & fd_mask));

I think the assertion is unnecessary, and maybe even the clearing of misa.
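
To make the failure mode concrete, here is a small stand-alone C model of it. It is only a sketch: `struct misa_model`, `ext_bit`, `misa_clear` and `fd_cleared` are made-up names, not BBL or QEMU code, and the model assumes that a WARL write to a non-writable bit is silently ignored rather than trapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a WARL misa register: only bits set in
 * writable_mask can change; writes to other bits are silently dropped. */
struct misa_model {
    uint32_t value;         /* current contents of misa */
    uint32_t writable_mask; /* 0 means misa is effectively hardwired */
};

/* Bit position of a single-letter extension, e.g. 'F' is bit 5. */
static uint32_t ext_bit(char ext)
{
    return UINT32_C(1) << (ext - 'A');
}

/* Equivalent of clear_csr(misa, mask) under WARL rules: never traps. */
static void misa_clear(struct misa_model *m, uint32_t mask)
{
    m->value &= ~(mask & m->writable_mask);
}

/* True if the check made in fp_init would pass: F and D both read as 0. */
static bool fd_cleared(const struct misa_model *m)
{
    uint32_t fd_mask = ext_bit('F') | ext_bit('D');
    return (m->value & fd_mask) == 0;
}
```

With `writable_mask` set to 0 (misa hardwired), `misa_clear` leaves F and D set and `fd_cleared` returns false, which is the case where the BBL assertion fires. The mask itself is bit 5 ('F') plus bit 3 ('D'), i.e. 0x28.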

David H. Lynch Jr.

Apr 19, 2018, 8:52:43 AM

to Michael Clark, Palmer Dabbelt, RISC-V SW Dev

As is evident from other posts, I have been trying to get the current
32-bit RISC-V Linux booting from the freedom-u-sdk repository.

I was ultimately able to get everything to build completely for 32 bits,
with some relatively minor changes to the Makefile.
I tried to post those as a patch, but pushing patches is new to me and
the patch did not show up on the list.
I will try again later.

32-bit RISC-V Linux currently boots part way in QEMU with those 32-bit
Makefile patches.

I am working to get it fully built.

Among the useful advice I have received from many here was the suggestion
that 32-bit QEMU may be broken in freedom-u-sdk and that I should use the
version in the riscv-tools repository.

I have that building for 32 bits, but it is not producing a RISC-V
"system" version of QEMU.

I am hoping someone can help narrow the scope of what I am looking at.

There are messages on the lists suggesting that there is a more current
QEMU in the riscv-all repository.

I am just trying to get an environment where I can start working on the
32-bit RISC-V Linux source, to adapt it now to the hardware that I will
someday be dealing with.

This is where I am getting when I try to execute.

==============================================

/usr/local/dlasys/micron/software/risc-v/freedom-u-sdk.32/work/riscv-qemu/prefix/bin/qemu-system-riscv32 \
    -nographic -machine virt \
    -kernel /usr/local/dlasys/micron/software/risc-v/freedom-u-sdk.32/work/riscv-pk/bbl \
    -drive file=/usr/local/dlasys/micron/software/risc-v/freedom-u-sdk.32/work/rootfs.bin,format=raw,id=hd0 \
    -device virtio-blk-device,drive=hd0 \
    -netdev user,id=net0 -device virtio-net-device,netdev=net0
bbl loader

                SIFIVE, INC.

         5555555555555555555555555
        5555                   5555
       5555                     5555
      5555                       5555
     5555       5555555555555555555555
    5555       555555555555555555555555
   5555                             5555
  5555                               5555
 5555                                 5555
5555555555555555555555555555          55555
 55555           555555555           55555
   55555           55555           55555
     55555           5           55555
       55555                   55555
         55555               55555
           55555           55555
             55555       55555
               55555   55555
                 555555555
                   55555
                     5

           SiFive RISC-V Coreplex
[    0.000000] OF: fdt: Ignoring memory range 0x80000000 - 0x80400000
[    0.000000] Linux version 4.15.0-00044-g2b0aa1de45f6 (root@z370-dhli
i) (gcc version 7.2.0 (GCC)) #3 Sun Apr 15 11:10:50 EDT 2018
[    0.000000] bootconsole [early0] enabled
[    0.000000] Initial ramdisk at: 0x(ptrval) (9877504 bytes)
[    0.000000] Zone ranges:
[    0.000000]   DMA32    empty
[    0.000000]   Normal   [mem 0x0000000080400000-0x0000087fffffffff]
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000080400000-0x0000000087ffffff]
[    0.000000] Initmem setup node 0 [mem 0x0000000080400000-
0x0000000087ffffff]
[    0.000000] software IO TLB [mem 0x83f06000-0x87f06000] (64MB)
mapped at [(ptrval)-(ptrval)]
[    0.000000] elf_hwcap is 0x112d
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages:
31496
[    0.000000] Kernel command line: earlyprintk
[    0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536
bytes)
[    0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768
bytes)
[    0.000000] Sorting __ex_table...
[    0.000000] Memory: 46500K/126976K available (2590K kernel code,
124K rwdata, 632K rodata, 9760K init, 214K bss, 80476K reserved, 0K
cma-reserved)
[    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1,
Nodes=1
[    0.000000] NR_IRQS: 0, nr_irqs: 0, preallocated irqs: 0
[    0.000000] riscv,cpu_intc,0: 32 local interrupts mapped
[    0.000000] riscv,plic0,c000000: mapped 10 interrupts to 1/2
handlers
[    0.000000] clocksource: riscv_clocksource: mask: 0xffffffff
max_cycles: 0xffffffff, max_idle_ns: 191126044627 ns
[    0.000149] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps
every 4398046511100ns
[    0.001934] Calibrating delay loop (skipped), value calculated using
timer frequency.. 20.00 BogoMIPS (lpj=100000)
[    0.003016] pid_max: default: 32768 minimum: 301
[    0.003933] Mount-cache hash table entries: 1024 (order: 0, 4096
bytes)
[    0.004545] Mountpoint-cache hash table entries: 1024 (order: 0,
4096 bytes)
=======================================
Linux hangs at this point.

Going by a 64-bit RISC-V boot, the following is what should come next:

[ 0.018688] Hierarchical SRCU implementation.
[    0.023224] smp: Bringing up secondary CPUs ...
[    0.023681] smp: Brought up 1 node, 1 CPU

I have also tried using spike.
64 bits - linux boots fine
32 bits - no output at all. I am getting further with Qemu

Jim Wilson

Apr 19, 2018, 11:31:20 AM

to Dave Lynch, Michael Clark, Palmer Dabbelt, RISC-V SW Dev

On Thu, Apr 19, 2018 at 5:52 AM, David H. Lynch Jr. <dhly...@gmail.com> wrote:
> In the useful advice I have received from many here was the suggestion
> that Qemu/32 may be broke in freedom-u-sdk and that I should use the
> version in the riscv-tools repository.

There is a known problem with stat() and 32-bit user-mode qemu which
is perhaps a glibc bug. This fails with current qemu, but works with
old qemu in the riscv-gnu-toolchain repo. This probably also breaks
32-bit system mode qemu, but we don't know, as we don't test it.

> I have that building 32-bits, but it is not producing a riscv "system"
> version of qemu

There are multiple ways to work around this. You can go into
riscv-qemu and use a git checkout command to get the old version
present in the riscv-gnu-toolchain repo. You can rename riscv-qemu
and create a link pointing at the riscv-gnu-toolchain/riscv-qemu. You
can look at the $(qemu) rule in the top level makefile, and run those
same commands using the riscv-gnu-toolchain/riscv-qemu sources instead
of the top level riscv-qemu sources. Etc.

I have no idea if this will help. The 32-bit linux support is
untested, so you may have to debug problems yourself. There may be
kernel bugs that need to be fixed.

Proper 32-bit linux support is waiting for 32-bit glibc support to be
upstreamed. There is no target date for that as yet. Currently the
only 32-bit glibc support is in riscv-gnu-toolchain, and that is an
old obsolete unmaintained version.

Jim

David H. Lynch Jr.

Apr 19, 2018, 2:05:50 PM

to Jim Wilson, Michael Clark, Palmer Dabbelt, RISC-V SW Dev

On Thu, 2018-04-19 at 08:31 -0700, Jim Wilson wrote:
> On Thu, Apr 19, 2018 at 5:52 AM, David H. Lynch Jr. <dhly...@gmail.com> wrote:
> > In the useful advice I have received from many here was the suggestion
> > that Qemu/32 may be broke in freedom-u-sdk and that I should use the
> > version in the riscv-tools repository.
>
> There is a known problem with stat() and 32-bit user-mode qemu which
> is perhaps a glibc bug. This fails with current qemu, but works with
> old qemu in the riscv-gnu-toolchain repo. This probably also breaks
> 32-bit system mode qemu, but we don't know, as we don't test it.

Except that I need working 32-bit RISC-V tools, I do not care about
32-bit RISC-V QEMU user space. I only care about the Linux kernel,
which does not use glibc.

Am I understanding correctly - there is either a general glibc(32) bug
or an X86_32 glibc bug - because the failure is in an x86 tool
executing risc-v 32 code - correct ?

> > I have that building 32-bits, but it is not producing a riscv "system"
> > version of qemu
>
> There are multiple ways to work around this. You can go into
> riscv-qemu and use a git checkout command to get the old version
> present in the riscv-gnu-toolchain repo. You can rename riscv-qemu
> and create a link pointing at the riscv-gnu-toolchain/riscv-qemu. You
> can look at the $(qemu) rule in the top level makefile, and run those
> same commands using the riscv-gnu-toolchain/riscv-qemu sources instead
> of the top level riscv-qemu sources. Etc.

I am trying some of this.

>
> I have no idea if this will help. The 32-bit linux support is
> untested, so you may have to debug problems yourself. There may be
> kernel bugs that need to be fixed.

I can deal with kernel bugs.

It is much harder to deal with not knowing where to begin - when there
are half a dozen different programs and tools that could be the issue.

> Proper 32-bit linux support is waiting for 32-bit glibc support to be
> upstreamed. There is no target date for that as yet. Currently the
> only 32-bit glibc support is in riscv-gnu-toolchain, and that is an
> old obsolete unmaintained version.

Obsolete is fine - working is what matters.

Thanks

>
> Jim
>

Jim Wilson

Apr 19, 2018, 2:35:07 PM

to Dave Lynch, Michael Clark, Palmer Dabbelt, RISC-V SW Dev

On Thu, Apr 19, 2018 at 11:05 AM, David H. Lynch Jr. <dhly...@gmail.com> wrote:
> Am I understanding correctly - there is either a general glibc(32) bug
> or an X86_32 glibc bug - because the failure is in an x86 tool
> executing risc-v 32 code - correct ?

It is a RISC-V specific problem with the 32-bit stat structure which
is used by both glibc and the kernel, and they work only if they have
the same stat structure definition. qemu and glibc disagree on the
layout of the stat structure, and it is probably glibc that is wrong,
because the stat structure apparently changed at some point in the
RISC-V kernel port. But I'm not a kernel hacker, so I don't know the
details of what happened or what is wrong.

> It is much harder to deal with not knowing where to begin - when there
> are half a dozen different programs and tools that could be the issue.

I can't help with that. You will just have to look at all of them.

>> Proper 32-bit linux support is waiting for 32-bit glibc support to be
>> upstreamed. There is no target date for that as yet. Currently the
>> only 32-bit glibc support is in riscv-gnu-toolchain, and that is an
>> old obsolete unmaintained version.
> Obsolete is fine - working is what matters.

If I check out riscv-gnu-toolchain, build a rv32-linux toolchain, and
run the gcc testsuite on user-mode qemu, it works. But beyond that, I
have no idea what if anything works. So the glibc and qemu in
riscv-gnu-toolchain work if used together, but they are both old
versions. The glibc in riscv-gnu-toolchain does not work with current
qemu. There is no upstream 32-bit glibc support as yet, so there is
nowhere else to get the 32-bit glibc support that I know of. It may
also be the case that the glibc in riscv-gnu-toolchain does not work
with current linux kernel port, as that probably agrees with current
qemu on the stat structure layout. It is not clear if any of that
matters for a linux kernel boot time problem though.

Jim

Michael Clark

Apr 19, 2018, 8:05:45 PM

to Zong Li, Zong Li, Andrew Waterman, Palmer Dabbelt, RISC-V SW Dev, QEMU Developers, Richard Henderson

I agree. The specification makes no guarantee that misa writes are not ignored so it is legal for a processor that supports FD to drop misa writes and the assertion will trigger on legal RISC-V implementations. That code piece does not support legal RISC-V implementations that can't disable F and D. Disabling F and D should not be asserted because it is harmless if an unused extension is present.

This assertion will always trigger in QEMU until we support the 'optional' feature to allow changes to 'misa'.

Just noting this is not QEMU specific, so we should drop qemu-devel if we continue to discuss misa on RISC-V in bbl.

Nevertheless, we do plan to support 'misa' writes; however, we need to do some work in translate.c to make sure that cached translations match the current state of misa. We may want to perform a tb_flush when we implement writable misa. We also want writable misa to be a CPU feature so we can emulate CPUs that don't support writable misa, e.g. add this to the CPU model:

set_feature(env, RISCV_FEATURE_MISA_WRITABLE)

Thanks for raising this because the new modular CSR implementation only implemented 'existential' predicates for CSRs. We should add a write flag to the predicate. Or we can just return -1 in the write_misa function. e.g.

static int write_misa(CPURISCVState *env, int csrno, target_ulong val)
{
    if (!riscv_feature(env, RISCV_FEATURE_MISA_WRITABLE)) {
        return -1;
    }

    /* validate misa - must contain 'I' or 'E' */
    env->misa = val;

    /* drop cached translations so they match the new misa */
    tb_flush(CPU(riscv_env_get_cpu(env)));

    return 0;
}

tb_flush is pessimistic but conservative. Currently it's not common to write misa, so it would be acceptable.

There is a similar but somewhat more complex issue for disabling misa.C. The behaviour has been discussed on the isa-dev mailing list. Iirc, we have to ignore bit 1 in mepc/sepc in MRET/SRET if misa.C has been cleared and a 2-byte aligned address is present in mepc/sepc, so that MRET/SRET can only jump to 4-byte aligned code. So we drop bit 1 on writes to mepc/sepc while misa.C is clear and we ignore bit 1 on reads from mepc/sepc while misa.C is cleared. So the change needs slightly more work than just making 'misa' writable. We also have to enforce that 'I' or 'E' are set, and we currently don't have support for RVE emulation in RISC-V QEMU. This will require changes to validate registers in translate.c and cause illegal instructions if regno >= 16 is used.
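
A minimal C sketch of the mepc handling described above, under the assumptions stated there (from memory of the isa-dev discussion): bit 1 is dropped on writes and ignored on reads while misa.C is clear. `struct hart`, `write_mepc` and `read_mepc` are hypothetical names for illustration, not QEMU functions; bit 0 is always masked because instructions are at least 2-byte aligned.

```c
#include <assert.h>
#include <stdint.h>

#define MISA_C (UINT32_C(1) << ('C' - 'A'))

/* Hypothetical hart state for this sketch (RV32 widths). */
struct hart {
    uint32_t misa;
    uint32_t mepc;
};

/* Low bits that must read as zero for the current misa.C setting:
 * only bit 0 with C enabled, bits 1:0 with C disabled. */
static uint32_t epc_align_mask(const struct hart *h)
{
    return (h->misa & MISA_C) ? ~UINT32_C(1) : ~UINT32_C(3);
}

/* Write side: drop bit 1 while misa.C is clear. */
static void write_mepc(struct hart *h, uint32_t val)
{
    h->mepc = val & epc_align_mask(h);
}

/* Read side (also what MRET would use): ignore bit 1 while misa.C is
 * clear, even if it was stored earlier when C was still enabled. */
static uint32_t read_mepc(const struct hart *h)
{
    return h->mepc & epc_align_mask(h);
}
```

The read-side mask matters for the case where a 2-byte aligned address was stored while C was enabled and C is cleared afterwards; sepc would follow the same pattern.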

I'm also not sure exactly how we add misa to the translation cache index, but tb_flush seems like the conservative way to ensure the translation cache matches the currently set bits in misa.

We also have to audit translate.c to make sure that misa is checked for all allowable extensions. MAFDC. Currently it only checks 'C' so we will need to add checks for 'M' in mul/mulw/div/divw/divu/divuw/rem/remw/remu/remuw and 'A' for amos, 'F' and 'D' in floating point operations, etc. It's a fair amount of work...

$ grep -r has_ext target/riscv/
target/riscv//csr.c: return -!riscv_has_ext(env, RVS);
target/riscv//csr.c: (!riscv_has_ext(env, RVS) && mpp == PRV_S) ||
target/riscv//csr.c: (!riscv_has_ext(env, RVU) && mpp == PRV_U)) {
target/riscv//cpu.h:static inline int riscv_has_ext(CPURISCVState *env, target_ulong ext)
target/riscv//op_helper.c: if (!riscv_has_ext(env, RVC) && (retpc & 0x3)) {
target/riscv//translate.c: if (!riscv_has_ext(env, RVC)) {
target/riscv//translate.c: if (!riscv_has_ext(env, RVC) && ((ctx->pc + bimm) & 0x3)) {
target/riscv//translate.c: if (riscv_has_ext(env, RVS)) {
target/riscv//translate.c: if (!riscv_has_ext(env, RVC)) {

So it seems like writable misa is a fair amount of work

- RISCV_FEATURE_MISA_WRITABLE (easy)

- ISA extension validation rules in write_misa (easy)

- Extension checks in translate.c (time-consuming but easy)

- RVC instruction pointer alignment checking rules (needs some care)

- Make sure we have CPU models with and without writable 'misa' so we can test code to handle typical legal processor variants.

Michael

Michael Clark

Apr 19, 2018, 8:11:15 PM

to Zong Li, Zong Li, Andrew Waterman, Palmer Dabbelt, RISC-V SW Dev, QEMU Developers, Richard Henderson

- Check regno >= 16 && riscv_has_ext(env, RVE) then illegal instruction trap (time-consuming - needs care to not miss any register)

Are F & D legal with E? 16 register FPU? I guess not. We may need to mask and drop some extension writes in the validate misa logic (WARL).

Andrew Waterman

Apr 19, 2018, 8:11:57 PM

to Michael Clark, Zong Li, Zong Li, Palmer Dabbelt, RISC-V SW Dev, QEMU Developers, Richard Henderson

The problem is that BBL cannot cope with this inconsistent scenario. If pk is compiled to assume no floating-point, there had better be no floating-point. If you remove the assertion, it will break in other ways later during execution.

If you don't want the assertion to fire, compile BBL to match the ISA.

Andrew Waterman

Apr 19, 2018, 8:12:56 PM

to Michael Clark, Zong Li, Zong Li, Palmer Dabbelt, RISC-V SW Dev, QEMU Developers, Richard Henderson

No, the manual forbids this case. E should preclude FD.
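
The validation rules mentioned so far in this thread (a base ISA of 'I' or 'E' must be present, 'E' precludes 'F' and 'D', and registers 16 and up are illegal under 'E') could be sketched in C roughly as follows. The function names and the choice to keep the old value when no base ISA is requested are assumptions for illustration only; the sketch also does not try to resolve a request that sets both 'I' and 'E'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EXT(c) (UINT32_C(1) << ((c) - 'A'))

/* WARL-style legalisation of a requested misa value (hypothetical helper).
 * Returns the value that would actually be stored; it never traps. */
static uint32_t legalize_misa(uint32_t current, uint32_t requested)
{
    /* A base ISA is mandatory: ignore the write if neither I nor E is set. */
    if (!(requested & (EXT('I') | EXT('E')))) {
        return current;
    }

    /* E precludes F and D, so mask them out rather than reject the write. */
    if (requested & EXT('E')) {
        requested &= ~(EXT('F') | EXT('D'));
    }

    return requested;
}

/* With E selected only x0..x15 exist; regno >= 16 must raise an
 * illegal instruction trap. */
static bool reg_is_legal(uint32_t misa, unsigned regno)
{
    return !((misa & EXT('E')) && regno >= 16);
}
```

Masking instead of rejecting keeps the register WARL: any value can be written, and only a legal value is ever read back.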

Michael Clark

Apr 19, 2018, 8:31:44 PM

to Andrew Waterman, Zong Li, Zong Li, Palmer Dabbelt, RISC-V SW Dev, QEMU Developers, Richard Henderson, Bastian Koppelmann, Sagar Karandikar, Alistair Francis, Emilio G. Cota

Good to know.

In any case it seems we need some pretty major changes to translate.c before we can make misa writable in qemu-riscv. Almost every gen routine with the exception of RVI will need predication based on extensions. It makes one pause and think whether adding if statements is a good approach, or whether we should have extension metadata available in the decoder so that it can be done generically. Adding lots of riscv_has_ext checks would be nasty.

We will need to add misa to DisasContext so that we can remove CPURISCVState *env from gen methods.

It also doesn't make sense to start this until we have merged Emilio's DisasContextBase changes.

It would be nice if Emilio's changes can be merged early in the 2.13 cycle so that folk are able to make progress on extension checking in target/riscv/translate.c

I'll be happy if we can nuke CPURISCVState *env and perform the misa checks on DisasContext. CSR instructions end translation blocks so it's safe to read the 'misa' state at the start of a block.
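
One possible shape for the "extension metadata in the decoder" idea, as a self-contained C sketch. Everything here is hypothetical (`struct disas_ctx`, `insn_table`, `insn_allowed`) and much simpler than a real decoder; it only shows a single generic check against a misa snapshot taken at the start of a translation block, instead of per-instruction if statements.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXT(c) (UINT32_C(1) << ((c) - 'A'))

/* Hypothetical per-block translation context: misa is sampled once when
 * the block starts, so no CPU state pointer is needed while decoding. */
struct disas_ctx {
    uint32_t misa;
};

/* Decoder metadata: the extensions an instruction depends on. */
struct insn_info {
    const char *name;
    uint32_t required_ext; /* 0 means base ISA only */
};

static const struct insn_info insn_table[] = {
    { "add",      0 },
    { "mul",      EXT('M') },
    { "amoadd.w", EXT('A') },
    { "fadd.s",   EXT('F') },
    { "fadd.d",   EXT('F') | EXT('D') }, /* D depends on F */
    { "c.addi",   EXT('C') },
};

/* Generic check: every required extension must be set in the snapshot.
 * Unknown instructions are treated as illegal. */
static bool insn_allowed(const struct disas_ctx *ctx, const char *name)
{
    for (size_t i = 0; i < sizeof insn_table / sizeof insn_table[0]; i++) {
        if (strcmp(insn_table[i].name, name) == 0) {
            uint32_t need = insn_table[i].required_ext;
            return (ctx->misa & need) == need;
        }
    }
    return false;
}
```

A real decoder would key the table on opcode fields rather than mnemonic strings; the string lookup just keeps the sketch short.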

Michael

Zong Li

Apr 19, 2018, 9:33:14 PM

to Andrew Waterman, Michael Clark, Zong Li, Palmer Dabbelt, RISC-V SW Dev, QEMU Developers, Richard Henderson

That makes sense, but BBL supports three scenarios here:
1. The hardware does not support floating-point.
2. The hardware supports floating-point and BBL is built for an ISA with floating-point.
3. The hardware supports floating-point but BBL is built for an ISA without it.

Only the third scenario does not work for now, because of the problem we discussed.

static void fp_init()
{
  if (!supports_extension('D') && !supports_extension('F'))  /* (1) */
    return;

  assert(read_csr(mstatus) & MSTATUS_FS);

#ifdef __riscv_flen
  /* (2) */
  for (int i = 0; i < 32; i++)
    init_fp_reg(i);
  write_csr(fcsr, 0);
#else
  /* (3) */
  uintptr_t fd_mask = (1 << ('F' - 'A')) | (1 << ('D' - 'A'));
  clear_csr(misa, fd_mask);
  assert(!(read_csr(misa) & fd_mask));
#endif
}

So if BBL must match the ISA, maybe we should remove the code for
scenario 3, or just drop the assertion for implementations where misa
ignores all writes.

Zong Li

Apr 19, 2018, 9:40:02 PM

to Michael Clark, Zong Li, Andrew Waterman, Palmer Dabbelt, RISC-V SW Dev, QEMU Developers, Richard Henderson

Handling whether a CSR is writable or not takes some effort, but what
you plan to do looks good.

Michael Clark

Apr 20, 2018, 4:40:55 AM

to dh...@dlasys.net, Michael Clark, Palmer Dabbelt, RISC-V SW Dev

Hi David,

I don’t mind helping out. I’d be very keen to get riscv32 Linux running. It looks like you’ve made very good progress already. Last time I tried, there was code that wouldn’t compile.

Can you share your riscv-linux branch so I can have a go at helping to isolate the hang?

I would suggest to test with riscv-all. It’s certainly stable as we run regression tests before merging the qemu-2.13-upstream branch containing the pending fixes that have not yet landed upstream (work that is not yet complete we keep in wip- branches and these are not merged into riscv-all).

Note: if you use the ‘riscv-all’ branch, avoid doing a pull with rebase. It’s better to do: git stash; git fetch origin; git reset --hard origin/riscv-all ; git stash apply (substitute origin for whichever remote name you have configured for the g...@github.com:riscv/riscv-qemu.git repo).

The reason to use fetch is that we often rebase the branches against master so we can stay up-to-date and the pending patches often get updated based on review. e.g. we rebase and add Reviewed-by: tags to commit messages, make changes based on review and reorder patches in the series. We have been reordering the qemu-2.13-upstream series so that reviewed patches start linearly from the first commit in the series, meaning that once 2.12 is released, we can branch, tag and submit a subset of the patches sequentially to reduce our pending upstream branch.

There should be no major difference between the ‘virt’ machine in the riscv repo vs upstream QEMU, so either should work fine, if you want to use upstream QEMU. There are some additional bug fixes and ongoing development in the riscv repo.

I also have lots of remotes on my repos e.g.

$ git remote add riscv https://github.com/riscv/riscv-qemu.git
$ git remote add mjc https://github.com/michaeljclark/riscv-qemu.git
$ git remote add upstream https://git.qemu.org/qemu.git

It makes it easier to merge together branches from multiple sources. I never do “git pull”. It’s “git fetch” and “git merge” (for upstream) or “git stash; git reset; git stash apply” (for rebased downstream branches).

Just thought I’d add some notes on git workflow as I once struggled with conflicts using git pull on downstream branches that have been rebased. When we’re working on a downstream repo, we have no choice but to rebase our changes against upstream as ideally PRs should be from fresh branches...

I’d like to add a remote for your riscv-linux repo to mine so I can pull in your branches...

Thanks,
Michael

Michael Clark

Apr 20, 2018, 4:57:05 AM

to Jim Wilson, Dave Lynch, Michael Clark, Palmer Dabbelt, RISC-V SW Dev

Hi Jim,

I think the best plan is to get riscv32 Linux running as Linux is authoritative for stat. We can currently compile musl for riscv32 so we can use musl to compile the stat test. I need to find the test you sent me. I have it in my riscv-bugs directory.

It won’t be hard to fix but it just takes a little time and analysis. I went through this already with musl on the earlier QEMU version. We are /apparently/ using asm-generic in QEMU so I’m not yet certain if it’s glibc or QEMU. Quite possibly QEMU. I just haven’t had time to see what’s happening.

Regarding full system riscv32 Linux. We would really like this to be working. It has different issues. We need to make sure that the priv v1.10 sv32 support is working okay in QEMU. We can use tracing to see if it is hanging on a page fault.

I often use tracing with “-d in_asm,op,op_opt,out_asm” or some subset. There is also “-d all” but the output is huge. We can also attach GDB and find the kernel address that is faulting and look this up in the kernel symbol table to find the source file we need to add a printk to debug :-D

I’m looking forward to getting a riscv-linux branch that compiles... I haven’t had time to work on this... so this is a step forward.

It could be a number of things as we haven’t heavily tested S mode support in QEMU with sv32 paging in quite some time. Once we get it working we can add this to our regular tests. And also test stat inside Linux vs QEMU Linux-User emulation which are both completely different code paths.

Thanks,
Michael

> --
> You received this message because you are subscribed to the Google Groups "RISC-V SW Dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sw-dev+un...@groups.riscv.org.
> To post to this group, send email to sw-...@groups.riscv.org.
> Visit this group at https://groups.google.com/a/groups.riscv.org/group/sw-dev/.
> To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/sw-dev/CAFyWVaa%2B0_14Z2vKFiBd6M8uVGjb9FH6tBB5gTpOGU16daimRA%40mail.gmail.com.

Michael Clark

Apr 20, 2018, 5:29:53 AM

to Jim Wilson, Dave Lynch, Palmer Dabbelt, RISC-V SW Dev

BTW which glibc branch has riscv32 support?

- https://github.com/riscv/riscv-glibc branch=?

I could possibly test stat in qemu-riscv32 linux-user this weekend. I’ve raised an issue in the riscv-qemu issue tracker:

- https://github.com/riscv/riscv-qemu/issues/135

I didn’t see an open issue in riscv-qemu for this as it might have been raised somewhere else (riscv-glibc?, riscv-gcc?, riscv-gnu-toolchain?).

You’ve already mentioned this. The user-mode riscv32 stat issue doesn’t affect freedom-u-sdk because we only use QEMU as a full system emulator there.

Michael Clark

Apr 20, 2018, 6:08:30 AM

to Jim Wilson, Dave Lynch, Palmer Dabbelt, RISC-V SW Dev

Okay. It seems I’ve isolated the stat issue.

Linux asm-generic stat has 32-bit st_dev and st_ino, whereas musl and glibc for riscv32 have chosen to use 64-bit dev_t and ino_t.

See analysis here:

- https://github.com/riscv/riscv-qemu/issues/135

I think we should run the riscv32 stat test in the riscv-linux/riscv32 QEMU full system emulator vs QEMU linux-user.

It should be relatively easy to fix QEMU, but we might need to fix Linux as well. Based on my reading of riscv-linux/include/uapi/asm-generic/stat.h, it seems QEMU /might/ match Linux:

- https://github.com/riscv/riscv-linux/blob/master/include/uapi/asm-generic/stat.h

We won’t know until we have Linux running in qemu-system-riscv32.

> To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/sw-dev/E34B362E-64BD-47DF-A7A0-17C236A369E8%40mac.com.

David H. Lynch Jr.

Apr 20, 2018, 4:05:36 PM

to Michael Clark, Michael Clark, Palmer Dabbelt, RISC-V SW Dev

On Fri, 2018-04-20 at 20:40 +1200, Michael Clark wrote:
> Hi David,

>

> I don’t mind to help out. I’d be very keen to get riscv32 Linux
> running. It looks like you’ve made very good progress already. Last
> time I tried, there was code that wouldn’t compile.
>

Mostly I am looking to get to where I know I am dealing with a Linux
problem rather than a QEMU problem. I have done Linux kernel bringup
on a new board several times before, and I am comfortable with that
as long as I know I am not separately fighting RISC-V QEMU issues.

Not that I cannot try to deal with those, but I am not a QEMU expert
and my budget is 10 hours/week on this project.

Can you share your riscv-linux branch so I can have a go at helping
to isolate the hang?

I am using the github freedom-u-sdk repository master branch.

EXCEPT the Makefile, I have changed nothing.
BUT you will note the Makefile "patches" a variety of other configs
to get them to build 32 bits.

I tried to submit this as a "patch" but I must have got something wrong
as it never showed up.

I have attached my Makefile.

I am not offering this as a polished solution - just as something that
is working. There are not really that many changes to the Makefile.

I would also note that I am building a pretty minimal Linux.
My target is going to be very, very light on hardware: the bare minimum
necessary to run Linux. Early versions may not even have that.

We are working on something that is closer to a RISC-V32 equivalent of
an STM32: an SoC with the CPU, RAM, Flash and IO in a single chip.

> I would suggest to test with riscv-all. It’s certainly stable as we
> run regression tests before merging the qemu-2.13-upstream branch
> containing the pending fixes that have not yet landed upstream (work
> that is not yet complete we keep in wip- branches and these are not
> merged into riscv-all).
>
> Note: if you use the ‘riscv-all’ branch, avoid doing a pull with
> rebase. It’s better to do:
> git stash; git fetch origin; git reset --hard origin/riscv-all; git stash apply
> (substitute origin for whichever remote name you have configured for
> the g...@github.com:riscv/riscv-qemu.git repo).
>

So I am clear: clone riscv-qemu and use the riscv-all branch, correct?
I believe that freedom-u-sdk pulls riscv-qemu. I will try to change
the branch.

> The reason to use fetch is that we often rebase the branches against
> master so we can stay up-to-date and the pending patches often get
> updated based on review. e.g. we rebase and add Reviewed-by: tags to
> commit messages, make changes based on review and reorder patches in
> the series. We have been reordering the qemu-2.13-upstream series so
> that reviewed patches start linearly from the first commit in the
> series, meaning that once 2.12 is released, we can branch, tag and
> submit a subset of the patches sequentially to reduce our pending
> upstream branch.

You're way ahead of my git skills. I use git heavily, but almost
entirely on non-collaborative projects. Occasionally I share work with
clients using GitLab.

>
> There should be no major difference between the ‘virt’ machine in the
> riscv repo vs upstream QEMU, so either should work fine, if you want
> to use upstream QEMU. There are some additional bug fixes and ongoing
> development in the riscv repo.

I do not want to get deeply into QEMU if I do not have to.
I want to focus my efforts on the Linux kernel for my target.

> I also have lots of remotes on my repos e.g.
>
> $ git remote add riscv https://github.com/riscv/riscv-qemu.git
> $ git remote add mjc https://github.com/michaeljclark/riscv-qemu.git
> $ git remote add upstream https://git.qemu.org/qemu.git
>
> It makes it easier to merge together branches from multiple sources.
> I never do “git pull”. It’s “git fetch” and “git merge” (for
> upstream) or “git stash ; git reset ; git apply” (for rebased
> downstream branches).
>
> Just thought I’d add some notes on git workflow as I once struggled
> with conflicts using git pull on downstream branches that have been
> rebased. When we’re working on a downstream repo, we have no choice
> but to rebase our changes against upstream as ideally PRs should be
> from fresh branches...
>
> I’d like to add a remote for your riscv-linux repo to mine so I can
> pull in your branches...

I can put my work up on GitLab if you want and make it public, but at
this moment ALL the changes I have made are to the freedom-u-sdk
Makefile. Those are minimal and serve two purposes: allowing the
Makefile to work for both 32 and 64 bits, and minimizing the Linux config.

>
> Thanks,
> Michael

Thank you.

Makefile

Michael Clark

Apr 22, 2018, 12:12:04 AM

to dh...@dlasys.net, Palmer Dabbelt, RISC-V SW Dev

> On 21/04/2018, at 8:05 AM, David H. Lynch Jr. <dhly...@gmail.com> wrote:
>
> On Fri, 2018-04-20 at 20:40 +1200, Michael Clark wrote:
>> Hi David,
>>
>
>> I don’t mind to help out. I’d be very keen to get riscv32 Linux
>> running. It looks like you’ve made very good progress already. Last
>> time I tried, there was code that wouldn’t compile.
>>
>
>
>
> Mostly I am looking to get to where I know I am dealing with a Linux
> problem - rather than a Qemu problem. I have done Linux Kernel bringup
> on a new board several times before. I am comfortable with that.
> If I know I am not separately fighting with Risc-V Qemu issues.
>
> Not that I can not try to deal with those - but I am not a Qemu expert
> and my budget is 10 hours/week on this project.
>
> Can you can share your riscv-linux branch I can have a go at helping
> to isolate the hang?
>
> I am using the github freedom-u-sdk repository master branch.
>
> EXCEPT the Makefile, I have changed nothing.
> BUT you will note the Makefile "patches" a variety of other configs
> to get them to build 32 bits.
>
> I tried to submit this as a "patch" but I must have got something wrong
> as it never showed up.
>
> I have attached My Makefile
>
> I am not offering this as a polished solution - just as something that
> is working. There are not really that many changes to the Makefile.

The Makefile works. Thanks. I’m now able to build riscv32 linux and reproduce your failure. I had a couple of issues because my glibc and perl versions are too new:

- https://github.com/sifive/freedom-u-sdk/issues/52
- https://github.com/sifive/buildroot/pull/5

I’m looking through the symbol table where it hangs and, based on the last output from tracing with -d in_asm, it seems to be processing an interrupt.

The back trace seems to be:

<riscv_intc_irq>
<ret_from_exception>
<restore_all>

Then there is an SRET and the trace stops.

I’ll let you know what I find if I make further progress…

> <Makefile>

Michael Clark

Apr 22, 2018, 2:29:41 PM

to dh...@dlasys.net, Palmer Dabbelt, RISC-V SW Dev

I’ve done some more instrumentation and I’m seeing SEPC returning from an interrupt handler to some random pre-empted code. However, the interrupt appears to stay pending, as I see a stream of SRETs to the same address. I then determined that it is the timer interrupt handler that is calling SRET and then immediately trapping again. Note the last log message before the hang, which appears as the scheduler clock is being set up:

[ 0.000188] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps every 4398046511100ns

I disabled timer interrupts and the boot gets a bit further, and then the same thing happens on external interrupts, likely the VirtIO interrupts via the PLIC. I don’t think it is a problem with the PLIC because it shows up on both timer interrupts and external interrupts. It could be a bug in either Linux or QEMU. I suspect Linux, BBL or QEMU is not clearing the asynchronous interrupt after it is delivered, because it goes into an SRET loop to the same address, which appears to be a random address in some pre-empted code. Synchronous traps are fine. Page faulting is working. The problem is somewhere in interrupt delivery and acknowledgement, and obviously in some code that is affected by TARGET_LONG_BITS == 32, as 64-bit is not affected.

Here is a boot with Timer interrupts disabled. Of course it gets further, but then the same thing happens with External interrupts.

At least we have isolated the area of code. The question is why, after SRET, we see an SRET to the exact same address in a loop (the address of the code that was pre-empted by the interrupt). This suggests that either SRET is not actually completing or the interrupt pending flags are still set. Somehow STIP is not being cleared (or SEIP in the case of external interrupts). STIP is dispatched by BBL and SEIP is dispatched by the PLIC in QEMU, so it suggests it’s not a bug in BBL.

We have to look for differences that affect 32-bit in these possible places.

- QEMU interrupt handling
- Linux interrupt handling
- BBL interrupt handling

I don’t want to point the finger too early, but I am suspecting QEMU, given that we essentially see a stream of SRETs to the same address after an asynchronous interrupt. That suggests QEMU is not resetting its internal interrupt state after an “asynchronous exception”, i.e. an interrupt. Regular traps are fine.

In any case we have isolated the pieces of code we need to look at:

$ grep interrupt riscv-qemu/target/riscv/*.c

Note: to figure this out, I enabled additional log messages by enabling RISCV_DEBUG_INTERRUPT and added a log message in helper_sret that prints retpc. I needed to fix some log messages because, when we converted them from printf to qemu_log, I wasn’t aware the messages needed a trailing newline. So we needed this patch (I will merge it into riscv-all at some point soon as it is a logging bug fix):

$ git show
commit 506c2c977350aac4bd1c9128c3fe325df6d8eb6a
Author: Michael Clark <m...@sifive.com>
Date: Mon Apr 23 05:57:26 2018 +1200

RISC-V: Clean up interrupt debug messages

diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index d70548e..3d2ac4d 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -442,11 +442,13 @@ void riscv_cpu_do_interrupt(CPUState *cs)
     if (RISCV_DEBUG_INTERRUPT) {
         int log_cause = cs->exception_index & RISCV_EXCP_INT_MASK;
         if (cs->exception_index & RISCV_EXCP_INT_FLAG) {
-            qemu_log_mask(LOG_TRACE, "core 0: trap %s, epc 0x" TARGET_FMT_lx,
-                riscv_intr_names[log_cause], env->pc);
+            qemu_log_mask(LOG_TRACE, "core "
+                TARGET_FMT_ld ": trap %s, epc 0x" TARGET_FMT_lx "\n",
+                env->mhartid, riscv_intr_names[log_cause], env->pc);
         } else {
-            qemu_log_mask(LOG_TRACE, "core 0: intr %s, epc 0x" TARGET_FMT_lx,
-                riscv_excp_names[log_cause], env->pc);
+            qemu_log_mask(LOG_TRACE, "core "
+                TARGET_FMT_ld ": intr %s, epc 0x" TARGET_FMT_lx "\n",
+                env->mhartid, riscv_excp_names[log_cause], env->pc);
         }
     }

@@ -508,8 +510,8 @@ void riscv_cpu_do_interrupt(CPUState *cs)

         if (hasbadaddr) {
             if (RISCV_DEBUG_INTERRUPT) {
-                qemu_log_mask(LOG_TRACE, "core " TARGET_FMT_ld
-                    ": badaddr 0x" TARGET_FMT_lx, env->mhartid, env->badaddr);
+                qemu_log_mask(LOG_TRACE, "core " TARGET_FMT_ld ": badaddr 0x"
+                    TARGET_FMT_lx "\n", env->mhartid, env->badaddr);
             }
             env->sbadaddr = env->badaddr;
         } else {
@@ -533,8 +535,8 @@ void riscv_cpu_do_interrupt(CPUState *cs)

         if (hasbadaddr) {
             if (RISCV_DEBUG_INTERRUPT) {
-                qemu_log_mask(LOG_TRACE, "core " TARGET_FMT_ld
-                    ": badaddr 0x" TARGET_FMT_lx, env->mhartid, env->badaddr);
+                qemu_log_mask(LOG_TRACE, "core " TARGET_FMT_ld ": badaddr 0x"
+                    TARGET_FMT_lx "\n", env->mhartid, env->badaddr);
             }
             env->mbadaddr = env->badaddr;
         } else {

Here is the boot log when timer interrupts are suppressed.

qemu-system-riscv32 -nographic -machine virt -kernel /home/mclark/src/sifive/freedom-u-sdk/work/riscv-pk/bbl \
-drive file=/home/mclark/src/sifive/freedom-u-sdk/work/rootfs.bin,format=raw,id=hd0 -device virtio-blk-device,drive=hd0 \

[ 0.000000] Linux version 4.15.0-00044-g2b0aa1de45f6 (mclark@minty) (gcc version 7.2.0 (GCC)) #2 Sun Apr 22 17:02:53 NZST 2018
[ 0.000000] bootconsole [early0] enabled
[ 0.000000] Initial ramdisk at: 0x(ptrval) (512 bytes)

[ 0.000000] Zone ranges:
[ 0.000000] DMA32 empty
[ 0.000000] Normal [mem 0x0000000080400000-0x0000087fffffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000080400000-0x0000000087ffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000080400000-0x0000000087ffffff]
[ 0.000000] software IO TLB [mem 0x83f06000-0x87f06000] (64MB) mapped at [(ptrval)-(ptrval)]
[ 0.000000] elf_hwcap is 0x112d
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 31496
[ 0.000000] Kernel command line: earlyprintk
[ 0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
[ 0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
[ 0.000000] Sorting __ex_table...

[ 0.000000] Memory: 55052K/126976K available (3663K kernel code, 127K rwdata, 624K rodata, 140K init, 214K bss, 71924K reserved, 0K cma-reserved)

[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] NR_IRQS: 0, nr_irqs: 0, preallocated irqs: 0
[ 0.000000] riscv,cpu_intc,0: 32 local interrupts mapped
[ 0.000000] riscv,plic0,c000000: mapped 10 interrupts to 1/2 handlers
[ 0.000000] clocksource: riscv_clocksource: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 191126044627 ns

[ 0.000187] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps every 4398046511100ns
[ 0.002023] Calibrating delay loop (skipped), value calculated using timer frequency.. 20.00 BogoMIPS (lpj=100000)
[ 0.002609] pid_max: default: 32768 minimum: 301
[ 0.003526] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.003909] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.045399] devtmpfs: initialized
[ 0.050450] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[ 0.051034] futex hash table entries: 256 (order: -1, 3072 bytes)
[ 0.054321] random: get_random_u32 called from bucket_table_alloc+0xb4/0x2b4 with crng_init=0
[ 0.055188] NET: Registered protocol family 16
[ 0.068161] SCSI subsystem initialized
[ 0.069764] pps_core: LinuxPPS API ver. 1 registered
[ 0.070051] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giom...@linux.it>
[ 0.070586] PTP clock support registered
[ 0.074874] clocksource: Switched to clocksource riscv_clocksource
[ 0.079980] NET: Registered protocol family 2
[ 0.082503] TCP established hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.082933] TCP bind hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.083343] TCP: Hash tables configured (established 1024 bind 1024)
[ 0.084791] UDP hash table entries: 256 (order: 0, 4096 bytes)
[ 0.085190] UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
[ 0.086358] NET: Registered protocol family 1
[ 0.090816] Unpacking initramfs...
[ 0.093681] Initialise system trusted keyrings
[ 0.094936] workingset: timestamp_bits=30 max_order=14 bucket_order=0
[ 0.109118] random: fast init done
[ 0.129122] Key type asymmetric registered
[ 0.129420] Asymmetric key parser 'x509' registered
[ 0.129828] io scheduler noop registered
[ 0.130485] io scheduler cfq registered (default)
[ 0.130788] io scheduler mq-deadline registered
[ 0.131066] io scheduler kyber registered
[ 0.259751] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 0.264592] console [ttyS0] disabled
[ 0.265455] 10000000.uart: ttyS0 at MMIO 0x10000000 (irq = 1, base_baud = 230400) is a 16550A
[ 0.266600] console [ttyS0] enabled
[ 0.266600] console [ttyS0] enabled
[ 0.267157] bootconsole [early0] disabled
[ 0.267157] bootconsole [early0] disabled

> --
> You received this message because you are subscribed to the Google Groups "RISC-V SW Dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to sw-dev+un...@groups.riscv.org.
> To post to this group, send email to sw-...@groups.riscv.org.
> Visit this group at https://groups.google.com/a/groups.riscv.org/group/sw-dev/.


Michael Clark

Apr 22, 2018, 2:46:00 PM

to dh...@dlasys.net, Palmer Dabbelt, RISC-V SW Dev

I can possibly make a minimal bare metal reproducer that tests timer interrupts in rv32, because we need them to work for the E series boards. I think the sifive_e board, which currently runs some of the 32-bit HiFive1 examples in QEMU, might be using polled mode IO rather than interrupt driven IO. I’m not 100% ready to point the finger yet, but I think it is QEMU 32-bit interrupt handling, and that it is a general problem that affects timer, external and likely software interrupts. If that’s the case, it’s likely some target_long vs int64 conversion problem, like a missing sign extension (uint64). However, at this point it’s speculation until we find the exact line of code that is broken.


Michael Clark

Apr 24, 2018, 4:58:42 AM

to Dave Lynch, Palmer Dabbelt, RISC-V SW Dev

Hi Daryl, Palmer,

It turns out that the riscv32 interrupt bug was in riscv-linux. QEMU is fine.

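The riscv_intc_irq interrupt handler casts the interrupt cause to an int and then uses it in a switch statement. This works on 64-bit kernels because truncating the 64-bit scause to a 32-bit int discards the MSB (the interrupt flag). On 32-bit kernels the MSB survives the cast, so none of the cases match. It needs bit-width neutral code to clear the MSB before using the cause in the switch statement.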

Here is the change I made to get the kernel to boot up until trying to mount root:

diff --git a/drivers/irqchip/irq-riscv-intc.c b/drivers/irqchip/irq-riscv-intc.c
index 1f2c450efae8..58f211648072 100644
--- a/drivers/irqchip/irq-riscv-intc.c
+++ b/drivers/irqchip/irq-riscv-intc.c
@@ -56,7 +56,8 @@ void riscv_intc_irq(struct pt_regs *regs)
 {
 	struct pt_regs *old_regs = set_irq_regs(regs);
 	struct irq_domain *domain;
-	int cause = csr_read(scause);
+	long cause = csr_read(scause);
+	cause &= ~(1L << (PTR_BITS - 1));

 	irq_enter();

I’ll send a proper patch or pull request, as my MUA is likely to munge the patch since I’ve just pasted it from the terminal.

I’ve verified the kernel is receiving timer interrupts and it tries to mount the root disk. I think I need to rebuild buildroot as 32-bit.

Regards,
Michael.

work/riscv-qemu/prefix/bin/qemu-system-riscv32 -nographic -machine virt -kernel work/riscv-pk/bbl \
-drive file=work/rootfs.bin,format=raw,id=hd0 -device virtio-blk-device,drive=hd0 \

[ 0.000000] Linux version 4.15.0-00044-g2b0aa1de45f6-dirty (mclark@minty) (gcc version 7.2.0 (GCC)) #8 Tue Apr 24 20:51:10 NZST 2018
[ 0.000000] bootconsole [early0] enabled
[ 0.000000] Initial ramdisk at: 0x(ptrval) (9900032 bytes)

[ 0.000000] Zone ranges:
[ 0.000000] DMA32 empty
[ 0.000000] Normal [mem 0x0000000080400000-0x0000087fffffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000080400000-0x0000000087ffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000080400000-0x0000000087ffffff]
[ 0.000000] software IO TLB [mem 0x83f06000-0x87f06000] (64MB) mapped at [(ptrval)-(ptrval)]
[ 0.000000] elf_hwcap is 0x112d
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 31496
[ 0.000000] Kernel command line: earlyprintk
[ 0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536 bytes)
[ 0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768 bytes)
[ 0.000000] Sorting __ex_table...

[ 0.000000] Memory: 46484K/126976K available (2590K kernel code, 128K rwdata, 624K rodata, 9780K init, 214K bss, 80492K reserved, 0K cma-reserved)

[ 0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
[ 0.000000] NR_IRQS: 0, nr_irqs: 0, preallocated irqs: 0
[ 0.000000] riscv,cpu_intc,0: 32 local interrupts mapped
[ 0.000000] riscv,plic0,c000000: mapped 10 interrupts to 1/2 handlers
[ 0.000000] clocksource: riscv_clocksource: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 191126044627 ns

[ 0.000193] sched_clock: 64 bits at 10MHz, resolution 100ns, wraps every 4398046511100ns
[ 0.002094] Calibrating delay loop (skipped), value calculated using timer frequency.. 20.00 BogoMIPS (lpj=100000)
[ 0.002690] pid_max: default: 32768 minimum: 301
[ 0.003646] Mount-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.004023] Mountpoint-cache hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.049448] devtmpfs: initialized
[ 0.057999] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604462750000 ns
[ 0.058561] futex hash table entries: 256 (order: -1, 3072 bytes)
[ 0.061985] random: get_random_u32 called from bucket_table_alloc+0x7c/0x1c2 with crng_init=0
[ 0.062861] NET: Registered protocol family 16
[ 0.076543] SCSI subsystem initialized
[ 0.078342] pps_core: LinuxPPS API ver. 1 registered
[ 0.078623] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giom...@linux.it>
[ 0.079161] PTP clock support registered
[ 0.083599] clocksource: Switched to clocksource riscv_clocksource
[ 0.088975] NET: Registered protocol family 2
[ 0.091753] TCP established hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.092178] TCP bind hash table entries: 1024 (order: 0, 4096 bytes)
[ 0.092591] TCP: Hash tables configured (established 1024 bind 1024)
[ 0.094102] UDP hash table entries: 256 (order: 0, 4096 bytes)
[ 0.094497] UDP-Lite hash table entries: 256 (order: 0, 4096 bytes)
[ 0.095686] NET: Registered protocol family 1
[ 0.337889] Unpacking initramfs...
[ 0.560381] Initialise system trusted keyrings
[ 0.561738] workingset: timestamp_bits=30 max_order=14 bucket_order=0
[ 0.575816] random: fast init done
[ 0.596686] Key type asymmetric registered
[ 0.596984] Asymmetric key parser 'x509' registered
[ 0.597407] io scheduler noop registered
[ 0.598046] io scheduler cfq registered (default)
[ 0.598347] io scheduler mq-deadline registered
[ 0.598616] io scheduler kyber registered
[ 0.728605] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 0.732900] console [ttyS0] disabled
[ 0.734025] 10000000.uart: ttyS0 at MMIO 0x10000000 (irq = 1, base_baud = 230400) is a 16550A
[ 0.735201] console [ttyS0] enabled
[ 0.735201] console [ttyS0] enabled
[ 0.735778] bootconsole [early0] disabled
[ 0.735778] bootconsole [early0] disabled
[ 0.774728] libphy: Fixed MDIO Bus: probed
[ 0.779111] NET: Registered protocol family 17
[ 0.781545] Loading compiled-in X.509 certificates
[ 0.798769] Freeing unused kernel memory: 9780K
[ 0.799029] This architecture does not have kernel memory protection.
[ 0.804256] Failed to execute /init (error -8)
[ 0.805036] Starting init: /sbin/init exists but couldn't execute it (error -8)
[ 0.805939] Starting init: /bin/sh exists but couldn't execute it (error -8)
[ 0.806331] Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
[ 0.806979] CPU: 0 PID: 1 Comm: swapper Not tainted 4.15.0-00044-g2b0aa1de45f6-dirty #8
[ 0.807397] Call Trace:
[ 0.807614] [<(ptrval)>] walk_stackframe+0x0/0xa2
[ 0.807883] [<(ptrval)>] show_stack+0x24/0x32
[ 0.808109] [<(ptrval)>] dump_stack+0x20/0x2c
[ 0.808325] [<(ptrval)>] panic+0xc4/0x1d6
[ 0.808538] [<(ptrval)>] kernel_init+0xe4/0xe8
[ 0.808765] [<(ptrval)>] ret_from_syscall+0xa/0xe
[ 0.809228] ---[ end Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.

David H. Lynch Jr.

Apr 25, 2018, 12:58:07 AM

to Michael Clark, Palmer Dabbelt, RISC-V SW Dev

Hi Mike;

Thank you.
Great catch.

It would have taken me far longer to find that and I would have been
wandering blindly in Qemu.

I thought I had Buildroot building 32 bits, but I guess I didn't quite
patch Buildroot correctly in the Makefile.
I will have to look at that.

Dave

> [ 0.000000] Initmem setup node 0 [mem 0x0000000080400000-
> 0x0000000087ffffff]
> [ 0.000000] software IO TLB [mem 0x83f06000-0x87f06000] (64MB)

David H. Lynch Jr.

Apr 25, 2018, 9:18:31 PM

to Michael Clark, Palmer Dabbelt, RISC-V SW Dev

When I run readelf -a work/buildroot_rootfs/images/rootfs/bin/busybox

where work/buildroot_rootfs/images/rootfs.ext2 is mounted as rootfs
and sbin/init is linked to bin/busybox

I get
ELF Header:
Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class:                             ELF32
Data:                              2's complement, little endian
Version:                           1 (current)
OS/ABI:                            UNIX - System V
ABI Version:                       0
Type:                              EXEC (Executable file)
Machine:                           Intel 80386
Version:                           0x1
Entry point address:               0x804db0c
Start of program headers:          52 (bytes into file)
Start of section headers:          559484 (bytes into file)
Flags:                             0x0
Size of this header:               52 (bytes)
Size of program headers:           32 (bytes)
Number of program headers:         8
Size of section headers:           40 (bytes)
Number of section headers:         26
Section header string table index: 25

Am I looking in the right place?
What is the x86 version of readelf going to say the machine is for a
RISC-V binary?

On Tue, 2018-04-24 at 20:58 +1200, Michael Clark wrote:

> [ 0.000000] Initmem setup node 0 [mem 0x0000000080400000-
> 0x0000000087ffffff]
> [ 0.000000] software IO TLB [mem 0x83f06000-0x87f06000] (64MB)

Andrew Waterman

Apr 25, 2018, 9:26:16 PM

to dh...@dlasys.net, Michael Clark, Palmer Dabbelt, RISC-V SW Dev

Readelf understands other ISAs' ELF headers. So it should print RISC-V, or if you're using an old version of Binutils, it'll print the RISC-V magic number (243).

If it's printing i386, it really is an i386 binary (which you can confirm with objdump -d).


David H. Lynch Jr.

Apr 25, 2018, 11:08:02 PM

to Andrew Waterman, Michael Clark, Palmer Dabbelt, RISC-V SW Dev

Thank you.

I have confirmed that sometime after copying from the conf directory and being sed patched to the correct processor and architecture that