RISC-V disassembled program dump different from the cycle-accurate emulator output?


Dragoș

Sep 3, 2016, 1:13:18 PM
to RISC-V SW Dev
Good day!

I have a problem with running the RISC-V rocket-chip cycle-accurate emulator for the dummy_rocc_test.c program that I compile, located in the rocket-chip repository (riscv-tools/riscv-isa-sim/dummy_rocc).

I will list my steps below and describe the problem afterwards. The steps I'm taking:

1. git clone https://github.com/ucb-bar/rocket-chip.git
2. git submodule update --init --recursive in the rocket-chip and rocket-chip/riscv-tools directories
3. export RISCV=[a folder called riscv-toolchain]
4. export PATH=[...]/riscv-toolchain/bin:$PATH
5. . rocket-chip/riscv-tools/build.sh (without running build-rv32im.sh afterwards)
6. make CONFIG=RoccConfigExample in the rocket-chip/emulator directory
7. cd rocket-chip/riscv-tools/riscv-isa-sim/dummy_rocc
8. riscv64-unknown-elf-gcc dummy_rocc_test.c -o dummy_rocc_test.out
9. (while in the dummy_rocc folder)
rocket-chip/emulator/emulator-TestHarness-RoccExampleConfig +dramsim +max-cycles=10000 +verbose pk dummy_rocc_test.out 3>&1 1>&2 2>&3 | spike-dasm > dummy_rocc_test.log
10. riscv64-unknown-elf-objdump --disassemble-all dummy_rocc_test.out > dummy_rocc_test.dump

The output of `dummy_rocc_test.log` file (first 25 lines, full text here):
    C0:          0 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:          1 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:          2 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:          3 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:          4 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:          5 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:          6 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:          7 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:          8 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:          9 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:         10 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:         11 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:         12 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:         13 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:         14 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:         15 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:         16 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:         17 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:         18 [0] pc=[b90caeae11] W[r 0=0000000000000000][0] R[r19=0000000000000000] R[r 4=918250ce690efa03] inst=[024983dc] unknown
    C0:         19 [0] pc=[0000001000] W[r 0=0000000000000000][0] R[r13=0000000000000000] R[r 0=918250ce690efa03] inst=[0e06b783] ld      a5, 224(a3)
    C0:         20 [0] pc=[0000001002] W[r 0=0000000000000000][0] R[r 0=0000000000000000] R[r 0=918250ce690efa03] inst=[00006db7] lui     s11, 0x6
    C0:         21 [0] pc=[0000001002] W[r 0=0000000000000000][0] R[r 0=0000000000000000] R[r 0=918250ce690efa03] inst=[00006db7] lui     s11, 0x6
    C0:         22 [0] pc=[0000001002] W[r 0=0000000000000000][0] R[r 0=0000000000000000] R[r 0=918250ce690efa03] inst=[00006db7] lui     s11, 0x6
    C0:         23 [0] pc=[0000001002] W[r 0=0000000000000000][0] R[r 0=0000000000000000] R[r 0=918250ce690efa03] inst=[00006db7] lui     s11, 0x6
    C0:         24 [0] pc=[0000001002] W[r 0=0000000000000000][0] R[r 0=0000000000000000] R[r 0=918250ce690efa03] inst=[00006db7] lui     s11, 0x6
    C0:         25 [0] pc=[0000001002] W[r 0=0000000000000000][0] R[r 0=0000000000000000] R[r 0=918250ce690efa03] inst=[00006db7] lui     s11, 0x6
    ...

The output of `dummy_rocc_test.dump` (first 35 lines, full text here):


    dummy_rocc_test.out:     file format elf64-littleriscv


    Disassembly of section .text:

    0000000000010000 <_ftext>:
       10000:    00008197              auipc    gp,0x8
       10004:    d8018193              addi    gp,gp,-640 # 17d80 <_gp>
       10008:    00007297              auipc    t0,0x7
       1000c:    5d028293              addi    t0,t0,1488 # 175d8 <_PathLocale>
       10010:    00007317              auipc    t1,0x7
       10014:    67030313              addi    t1,t1,1648 # 17680 <_end>
       10018:    0002b023              sd    zero,0(t0)
       1001c:    00828293              addi    t0,t0,8
       10020:    fe62ece3              bltu    t0,t1,10018 <_ftext+0x18>
       10024:    00000517              auipc    a0,0x0
       10028:    2f050513              addi    a0,a0,752 # 10314 <__libc_fini_array>
       1002c:    2a4000ef              jal    102d0 <atexit>
       10030:    3c0000ef              jal    103f0 <__libc_init_array>
       10034:    00012503              lw    a0,0(sp)
       10038:    00810593              addi    a1,sp,8
       1003c:    00000613              li    a2,0
       10040:    124000ef              jal    10164 <main>
       10044:    2a00006f              j    102e4 <exit>

    0000000000010048 <_fini>:
       10048:    00008067              ret

    000000000001004c <deregister_tm_clones>:
       1004c:    00017537              lui    a0,0x17
       10050:    000177b7              lui    a5,0x17
       10054:    57850713              addi    a4,a0,1400 # 17578 <__TMC_END__>
       10058:    57f78793              addi    a5,a5,1407 # 1757f <__TMC_END__+0x7>
       1005c:    40e787b3              sub    a5,a5,a4

My main problem is that I cannot get the emulator to print SUCCESS! with a correct cycle count for the compiled program dummy_rocc_test.c. The same program runs fine with spike pk. The emulator output in dummy_rocc_test.log shows that no instruction from the disassembled program file dummy_rocc_test.dump is being executed, so the program itself isn't being run by the emulator.

I understand the emulator starts with randomized values in its registers, which explains the first value in pc. Afterwards, it settles at [0000001000] which, correct me if I'm wrong, I think is the same 0x10000 address where programs start executing when run with a proxy kernel (where the first instruction in dummy_rocc_test.dump should be).

I do believe that pc should first start at 0x200 / pc=[0000000200], though, to start the proxy kernel (0x200 being the address RISC-V chose for starting bare-metal programs). Even if I give a bigger number for +max-cycles in step 9, pc just loops through the same addresses, never going through [0000001000] again or through [0000000200], and the run prints the same FAILED! in the end because the program apparently didn't execute in time.

I would like to understand where I am going wrong, either in my reasoning or in the steps I'm taking. Thank you for your time!

I've also posted this question on StackOverflow here. I apologize in advance if this isn't the place to post this kind of question, but I'm just looking to understand and learn about the architecture and its tools.

Michael Clark

Sep 3, 2016, 8:13:46 PM
to Dragoș, RISC-V SW Dev
Hi Dragoș,

One of the RISC-V gurus should be able to answer this. I believe the ROCC is a custom extension and not part of the standard.

However injecting 2048 bits of entropy into the register state at boot is an absolutely brilliant idea!!! This solves early boot random. Excellent!

It does raise another orthogonal point. It seems that the foundation could /potentially/ maintain cycle and latency characterisation approximations as “polynomials” for each /standardised/ instruction for the certified models, if it’s not onerous. This will be a much harder problem to model for an out of order implementation like BOOM so it would need to be an approximation. Most would be O(1). load and store polynomial with address translation and a cache hierarchy would also be relatively onerous, but could be approximated (probabilistically for some workload based on cache hierarchy dimensions).

It seems that mtune=  could ideally use a flat file from the foundation’s records of certified models.

0x200 for PC after reset? Where does that come from? It’s not in section 3.3.

I guess it doesn’t matter if “pc” points to the start of some PIC code and “sp” is set to some scratch memory with a defined size?

"
3.3 Reset.

The pc is set to an implementation-defined reset vector.
"

I don’t think dummy_rocc is certified either so it’s unlikely that the cycle count would be correct.

SUCCESS! Does that answer your question?

Regards,
Michael. 

😎

--
You received this message because you are subscribed to the Google Groups "RISC-V SW Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sw-dev+un...@groups.riscv.org.
To post to this group, send email to sw-...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/sw-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/sw-dev/6a301ae6-e011-40fa-8a08-012524844ee6%40groups.riscv.org.

Stefan O'Rear

Sep 3, 2016, 8:28:14 PM
to Michael Clark, Dragoș, RISC-V SW Dev
On Sat, Sep 3, 2016 at 5:13 PM, Michael Clark <michae...@mac.com> wrote:
> Hi Dragoș,
>
> One of the RISC-V gurus should be able to answer this. I believe the ROCC is a custom extension and not part of the standard.
>
> However injecting 2048 bits of entropy into the register state at boot is an absolutely brilliant idea!!! This solves early boot random. Excellent!

This is factually incorrect.  Register state randomization does not happen on actual hardware, it's an artifact of the Chisel emulation process.

> It does raise another orthogonal point. It seems that the foundation could /potentially/ maintain cycle and latency characterisation approximations as “polynomials” for each /standardised/ instruction for the certified models, if it’s not onerous. This will be a much harder problem to model for an out of order implementation like BOOM so it would need to be an approximation. Most would be O(1). load and store polynomial with address translation and a cache hierarchy would also be relatively onerous, but could be approximated (probabilistically for some workload based on cache hierarchy dimensions).

This is largely meaningless.  I think you did not understand the OP, this isn't about validating cycle counts, this is about having the code work at all.
 
-s

Stefan O'Rear

Sep 3, 2016, 8:34:30 PM
to Dragoș, RISC-V SW Dev
Note that it's starting at 0x1000 (2**12), but the payload is linked
at 0x10000 (2**16), so it is trying to use a bootloader and not
jumping directly to the payload. I'm not sure where you got 0x200
from, although I have the same vague recollection so maybe it was in
one of the earlier specs. 0x1000 is on p.12 of
https://static.dev.sifive.com/SiFive-U5-Coreplex-v1.0.pdf ; the SiFive
stuff is pretty close to baseline rocket-chip and AFAICT the best
public documentation of it. It's obviously not running correct
instructions though (the first instruction of bbl is a 4-byte jump, not
a 2-byte non-jump). Until we get a more definitive answer from the
team, I suggest throwing a bunch of trace statements at dramsim and
the memory system. It'll be working when you can fetch a jump at
0x1000.
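
To make the address comparison concrete, here is a minimal Python sketch. The helper name `pc_field_to_int` is made up for illustration; the two values are the pc field from the trace and the `_ftext` address from the objdump output earlier in the thread.

```python
def pc_field_to_int(field):
    """Convert a bracketed hex field such as '[0000001000]' to an integer."""
    return int(field.strip("[]"), 16)


# pc value printed by the emulator trace once the random start-up value is gone
RESET_PC = pc_field_to_int("[0000001000]")

# address of _ftext in dummy_rocc_test.dump
LINK_ADDR = 0x10000

print(hex(RESET_PC), RESET_PC == 2**12)    # 0x1000 True
print(hex(LINK_ADDR), LINK_ADDR == 2**16)  # 0x10000 True
print(LINK_ADDR // RESET_PC)               # 16
```

The zero-padded trace field has one hex digit fewer than it might appear at a glance: it is 0x1000, a factor of 16 below the 0x10000 link address.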

-s

Christopher Celio

Sep 3, 2016, 9:07:24 PM
to Stefan O'Rear, Dragoș, RISC-V SW Dev
> It's obviously not running correct
> instructions though (the first instruction of bbl is a 4-byte jump, not
> a 2-byte non-jump).  Until we get a more definitive answer from the
> team, I suggest throwing a bunch of trace statements at dramsim and
> the memory system.  It'll be working when you can fetch a jump at
> 0x1000.

If he's using the latest rocket-chip stuff (priv-1.9, etc), if I recall correctly, reset begins at 0x1000, the built-in bootROM location, which is just a "jump 0x0" for the core to spin forever on.  The front-end server (riscv-fesvr) will then interrupt the core via the Debug interface, and force the core to execute code out of the uncacheable debug RAM (~0x400), in which the target binary is loaded one word at a time into rocket's memory space.  It will take ~2k instructions before the rocket core finally jumps to the machine-level binary, which will be located around ~0x8000_0000 (cacheable memory).  If you're using the pk, that's where it is. Then, the pk begins its own task of loading in the user-level program.

Naturally, this means just tracing the memory system isn't quite enough. ;)

Spike will simulate none of this startup overhead, since the target program will instantly be in its memory space.

> 6. make CONFIG=RoccConfigExample in the rocket-chip/emulator directory

I assumed this passed the normal ISA tests? ("make CONFIG=RoccConfigExample run").


-Chris



Dragoș

Sep 4, 2016, 2:59:12 AM
to RISC-V SW Dev
Hello! Thanks to everyone for answering in such a short time!

@michaeljclark: Unfortunately it doesn't answer my question. Although the information provided is really insightful, as @sorear2 says - I'm trying to make the code work. Thanks!

@michaeljclark, @sorear2: I've taken the 0x200 bit from here (StackOverflow). I now realise it might not have been a good idea to trust a year-old answer for such a fast-developing project. Thank you @sorear2 for the documentation though, I'll make sure to read!

@sorear2: I don't know how to throw trace statements at dramsim and the memory system. I'll look it up, but if you can point me to any documentation that shows how to do it, I would be really grateful. Thanks!

@Chris: I haven't been using priv-1.9 for riscv-tools inside the rocket-chip repository. I'm not sure what you mean by other rocket-chip latest stuff - I'd be happy to switch to them and try it out.
I've run make CONFIG=RoccConfigExample run with the priv-1.9 branch for riscv-tools and apparently I get the following error/test fail:

Makefile:46: recipe for target 'output/rv64uf-p-ldst.out' failed


I haven't tried the command for the master branch though. I'll try switching and recompiling later in the day, when I have more time. Thanks!

> If he's using the latest rocket-chip stuff (priv-1.9, etc), if I recall correctly, reset begins at 0x1000, the built-in bootROM location, which is just a "jump 0x0" for the core to spin forever on.  The front-end server (riscv-fesvr) will then interrupt the core via the Debug interface, and force the core to execute code out of the uncacheable debug RAM (~0x400), in which the target binary is loaded one word at a time into rocket's memory space.  It will take ~2k instructions before the rocket core finally jumps to the machine-level binary, which will be located around ~0x8000_0000 (cacheable memory).  If you're using the pk, that's where it is. Then, the pk begins its own task of loading in the user-level program.
 
Are you saying it would be pointless anyway to try to find my disassembled program in the emulator output file (provided the emulator runs the instructions correctly, which @sorear2 says it isn't doing)? I'm trying to make the emulator stop and show me a correct number of cycles, though. I started looking through the emulator output file because the run wasn't finishing, and then saw pc looping around.

Thank you again everyone!

Dragoș

Sep 5, 2016, 8:28:32 AM
to RISC-V SW Dev
I think I managed to make it work.

I've deleted the repository and the built riscv-tools, redownloaded rocket-chip, recompiled rocket-chip/riscv-tools and reran make run and make CONFIG=RoccExampleConfig run for each branch (master and priv-1.9) of rocket-chip/riscv-tools. Everything worked, and the run even got past the aforementioned error (Makefile:46: recipe for target 'output/rv64uf-p-ldst.out' failed).

Sorry for the delay: I ran into more errors yesterday and suspected I needed a full wipe, redownload and recompile of everything, which took more time.

I've reread more carefully what you wrote @Chris and checked the rocket-chip/emulator/output/*.out files to see that the actual instructions from the tests do indeed get executed much later, and that the start of the *.out files was similar to the start of my *.log file (front-end server taking time to call the proxy kernel, proxy kernel booting).

On the priv-1.9 branch of rocket-chip/riscv-tools, I've redone steps 7-10 from the OP (the other steps already being done) but with +max-cycles=100000000 instead of 10000 as before and, lo and behold, the emulator actually finished because of an actual error and not a timeout.

The following appeared on my console while running step 9 with +max-cycles=100000000:
z  0000000000000000 ra 0000000000010044 sp 000000000feeeb10 gp 0000000000017db0
tp 0000000000000000 t0 00000000000176b0 t1 00000000000176b0 t2 0000000000000000
s0 000000000feeeb40 s1 0000000000000000 a0 0000000000000001 a1 000000000feeeb48
a2 0000000000000000 a3 0000000000000000 a4 0000000000000000 a5 000000000000007b
a6 0000000000000000 a7 0000000000000000 s2 0000000000000000 s3 0000000000000000
s4 0000000000000000 s5 0000000000000000 s6 0000000000000000 s7 0000000000000000
s8 0000000000000000 s9 0000000000000000 sA 0000000000000000 sB 0000000000000000
t3 0000000000000000 t4 0000000000000000 t5 0000000000000000 t6 0000000000000000
pc 000000000001018c va 00000000000156c0 insn       0027e00b sr 8000000000006000
An illegal instruction was executed!

The end of the .log file being:
*** FAILED *** (code = -1, seed 1473082570) after 2561689 cycles
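
For reference, the `insn` word in the register dump can be split into fields by hand. The sketch below is plain Python bit slicing. It assumes the standard R-type layout plus the RoCC convention that the three funct3 bits carry the xd/xs1/xs2 flags, and the names `decode_rocc` and `CUSTOM_OPCODES` are invented for this example. It identifies the word but says nothing about why the core rejected it.

```python
# Major opcodes reserved for custom extensions in the base RISC-V opcode map.
CUSTOM_OPCODES = {0x0B: "custom-0", 0x2B: "custom-1", 0x5B: "custom-2", 0x7B: "custom-3"}


def decode_rocc(insn):
    """Split a 32-bit R-type word into the fields used by RoCC instructions."""
    return {
        "opcode": insn & 0x7F,           # bits 6:0
        "rd": (insn >> 7) & 0x1F,        # bits 11:7
        "xs2": (insn >> 12) & 1,         # bit 12: rs2 is a core register
        "xs1": (insn >> 13) & 1,         # bit 13: rs1 is a core register
        "xd": (insn >> 14) & 1,          # bit 14: a result is written to rd
        "rs1": (insn >> 15) & 0x1F,      # bits 19:15
        "rs2": (insn >> 20) & 0x1F,      # bits 24:20
        "funct7": (insn >> 25) & 0x7F,   # bits 31:25
    }


fields = decode_rocc(0x0027E00B)  # insn value from the register dump
print(CUSTOM_OPCODES.get(fields["opcode"], "not a custom opcode"))
print(fields)
```

The opcode comes out as 0x0b (custom-0), with rd = 0, rs1 = 15, rs2 = 2, funct7 = 0, xd = 1, xs1 = 1 and xs2 = 0. Register x15 is a5, which holds 0x7b in the dump. If the rebuilt binary matches the objdump listing from the first post, pc 0x1018c is 0x28 bytes past the start of main (0x10164), so the faulting word looks like the RoCC instruction in the test program rather than stray data.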

I'm thinking maybe the original dummy_rocc_test.c file is written wrong for the emulator, as it doesn't follow the format or use the macros from the test files (hence the code = -1).
I saw something tangentially related in a YouTube video of the 1st RISC-V workshop (link to the specific moment I'm talking about here).

So yeah, the initial question is solved. The answer (quite embarrassing for being so simple) was to do a complete wipe of the repository, redownload and rebuild everything, definitely run the tests (which I wasn't doing before), and set +max-cycles=100000000. I had redownloaded and rebuilt the repository a few times before, but interruptions happened (the disk suddenly ran out of space, or the internet went down); I just fixed them and reran the last command that was running, without cleaning up the partially done work. The tests I ran yesterday showed me that a generated C file had only been partially written, and therefore make run had compilation errors.

Also, the first instruction of dummy_rocc_test.c (00008197) starts at cycle 1058546. It's interesting.
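
A cycle number like this can be pulled out of the log with a few lines of Python. This is a sketch under assumptions: the regular expression is written against the trace lines quoted in the first post, the bracketed number after the cycle count is captured as an opaque flag without interpreting it, and `parse_trace_line` and `first_cycle_of` are hypothetical helper names, not part of any RISC-V tool.

```python
import re

# Matches one trace line, e.g.
#   C0:         19 [0] pc=[0000001000] W[...] R[...] R[...] inst=[0e06b783] ld a5, 224(a3)
TRACE_RE = re.compile(
    r"C(?P<core>\d+):\s+(?P<cycle>\d+)\s+\[(?P<flag>\d)\]\s+"
    r"pc=\[(?P<pc>[0-9a-fA-F]+)\].*?inst=\[(?P<inst>[0-9a-fA-F]+)\]"
)


def parse_trace_line(line):
    """Return the core, cycle, flag, pc and inst of a trace line, or None."""
    m = TRACE_RE.search(line)
    if m is None:
        return None  # not a trace line (console output, blank line, ...)
    return {
        "core": int(m.group("core")),
        "cycle": int(m.group("cycle")),
        "flag": int(m.group("flag")),
        "pc": int(m.group("pc"), 16),
        "inst": int(m.group("inst"), 16),
    }


def first_cycle_of(lines, inst, pc=None):
    """Cycle of the first trace line with the given instruction word.

    If pc is given, the line must also report that pc. Returns None when
    no line matches.
    """
    for line in lines:
        rec = parse_trace_line(line)
        if rec is None:
            continue
        if rec["inst"] == inst and (pc is None or rec["pc"] == pc):
            return rec["cycle"]
    return None
```

For example, `first_cycle_of(open("dummy_rocc_test.log"), 0x00008197, pc=0x10000)` would return the first cycle at which that word shows up at the program's entry address, or `None` if it never does.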

Now I just have to find out why I get this new error from the emulator. Thanks everyone, especially @Chris!