fesvr, dtm abstract command error when loading elf -- why?


Edmond Cote

Dec 19, 2017, 2:06:21 PM
to RISC-V HW Dev
Trying to wrap my head around this error.  Would love some debug pointers.

I have a "Hello, World!" program that runs on spike.

./bin/spike riscv64-unknown-elf/bin/pk `pwd`/hello
Hello, World!

I have generated hardware using a known-to-work configuration.  Top class within the Rocket Chip generator is freechips.rocketchip.system.  The configuration produces the following address map:

       0 -     1000 ARWX  debug-controller@0
    3000 -     4000 ARW C error-device@3000
   10000 -    20000  R XC rom@10000
 2000000 -  2010000 ARW   clint@2000000
 c000000 - 10000000 ARW   interrupt-controller@c000000
60000000 - 80000000  RWX  mmio@60000000
80000000 - 80004000 ARWX  dtim@80000000

Running the test on the C emulator yields:

ERROR: [..]riscv-fesvr/fesvr/dtm.cc:346, Debug Abstract Command Error #3 (EXCEPTION)
ERROR: [..]/fesvr/dtm.cc:347, Should die, but allowing simulation to continue and fail.

Here are my observations:

FESVR correctly receives command to load the ELF file.
memif_t::write is called w/ address of 0x10000
--> dtm_t::write_chunk
--> line 300 of dtm.cc causes the above mentioned exception

  command = AC_ACCESS_REGISTER_TRANSFER |
    AC_ACCESS_REGISTER_POSTEXEC |
    AC_ACCESS_REGISTER_WRITE | 
    AC_AR_SIZE(xlen) |
    AC_AR_REGNO(S1);
  RUN_AC_OR_DIE(command, 0, 0, data, xlen/(4*8));
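
For anyone following along, the command word built above can be reconstructed by hand. This is only a sketch: it assumes the Access Register field layout from the RISC-V Debug Specification 0.13 (aarsize in bits 22:20, postexec in bit 18, transfer in bit 17, write in bit 16, regno in bits 15:0, with GPR xN at 0x1000 + N). The function and constant names are made up for illustration and are not the macros in dtm.cc.

```cpp
#include <cstdint>

// Hypothetical helpers, not fesvr code. Field positions follow my reading of
// the "Access Register" abstract command in the RISC-V Debug Spec 0.13.
constexpr std::uint32_t kPostexec = 1u << 18;  // run the program buffer afterwards
constexpr std::uint32_t kTransfer = 1u << 17;  // actually move data
constexpr std::uint32_t kWrite    = 1u << 16;  // direction: data registers -> target register

// aarsize encoding: 2 = 32-bit, 3 = 64-bit, 4 = 128-bit.
constexpr std::uint32_t aarsize(unsigned xlen) {
    return (xlen == 128 ? 4u : xlen == 64 ? 3u : 2u) << 20;
}

// General-purpose register xN is abstract register number 0x1000 + N.
constexpr std::uint32_t gpr_regno(unsigned n) { return 0x1000u + n; }

// Same combination of flags as the snippet above: write a GPR, then execute.
constexpr std::uint32_t write_gpr_then_exec(unsigned xlen, unsigned gpr) {
    return kTransfer | kPostexec | kWrite | aarsize(xlen) | gpr_regno(gpr);
}

// Number of 32-bit data words the command consumes, i.e. xlen/(4*8).
constexpr unsigned data_words(unsigned xlen) { return xlen / 32; }

// s1 is x9; on RV64 the command word works out to 0x00371009.
static_assert(write_gpr_then_exec(64, 9) == 0x00371009u, "RV64 write to s1");
static_assert(data_words(64) == 2, "a 64-bit value spans two data words");
```

Under those assumptions, each RUN_AC_OR_DIE call here loads one xlen-sized chunk into s1 and then runs the program buffer once.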

--> dtm_t::dtm_run_abstract_command runs through the following steps

write(DMI_PROGBUF0 [..]) // OK
write(DMI_DATA0 [..]) // OK
write(DMI_COMMAND) // OK
// wait for not busy... OK
// check for error.. dies here
if ((get_field(command, AC_ACCESS_REGISTER_WRITE) == 0) &&
    get_field(command, AC_ACCESS_REGISTER_TRANSFER)) {
  for (size_t i = 0; i < data_n; i++) {
    data[i] = read(DMI_DATA0 + i);
  }
}
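
The "check for error" step comes down to reading the cmderr field of the abstractcs register. Here is a minimal sketch of that decode, assuming the Debug Spec 0.13 layout (busy in bit 12, cmderr in bits 10:8). The helper names are hypothetical and this is not the code in dtm.cc.

```cpp
#include <cstdint>

// Hypothetical decoder, not fesvr code. Bit positions are my reading of the
// abstractcs register in the RISC-V Debug Spec 0.13.
constexpr bool abstract_busy(std::uint32_t abstractcs) {
    return ((abstractcs >> 12) & 1u) != 0;   // busy: bit 12
}

constexpr unsigned abstract_cmderr(std::uint32_t abstractcs) {
    return (abstractcs >> 8) & 0x7u;         // cmderr: bits 10:8
}

// Human-readable name for the cmderr codes I am sure of; the rest are lumped
// together as "other or reserved".
inline const char* cmderr_name(unsigned cmderr) {
    switch (cmderr) {
        case 0: return "none";
        case 1: return "busy";
        case 2: return "not supported";
        case 3: return "exception";          // the "#3 (EXCEPTION)" in the log
        case 4: return "halt/resume";
        default: return "other or reserved";
    }
}

static_assert(abstract_cmderr(0x00000300u) == 3, "cmderr field extracts 3");
static_assert(!abstract_busy(0x00000300u), "busy is clear in this example value");
```

Error 3 means the hart took an exception while executing the command, which here includes the program buffer run requested by the postexec bit.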

Megan Wachs

Dec 19, 2017, 2:25:37 PM
to Edmond Cote, RISC-V HW Dev
You will get this error if you try to write to memory that is read-only (or completely inaccessible). Does your core have write permissions to the address 0x10000? What is at that address? You may want to check how your executable is being linked.

Megan

--
You received this message because you are subscribed to the Google Groups "RISC-V HW Dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to hw-dev+unsubscribe@groups.riscv.org.
To post to this group, send email to hw-...@groups.riscv.org.
Visit this group at https://groups.google.com/a/groups.riscv.org/group/hw-dev/.
To view this discussion on the web visit https://groups.google.com/a/groups.riscv.org/d/msgid/hw-dev/CALzVNFD%3DrTCfRZ6u%2BY9s_5WKw91Jk898LUnqV%2Byvg4GoZVTAbw%40mail.gmail.com.



--
Megan A. Wachs
Engineer | SiFive, Inc 
1875 South Grant Street
Suite 600
San Mateo, CA 94402

Edmond Cote

Dec 19, 2017, 2:30:30 PM
to Megan Wachs, RISC-V HW Dev
Yep, it was right there: I'm writing to ROM. I figured it couldn't be h/w related since it came from a regressed config (ExampleRocketSystem).

   10000 -    20000  R XC rom@10000
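
The check Megan suggests can be sketched as a lookup against the address map from the first message. This is a hand-written model, not generated code: the Region struct and function names are hypothetical, and it assumes each range in the map is [base, end) and that a missing W means stores are rejected.

```cpp
#include <cstdint>

// Hypothetical model of the address map printed earlier in the thread.
struct Region {
    std::uint64_t base;   // inclusive
    std::uint64_t end;    // exclusive (assumption about how the map is printed)
    bool writable;        // true when the map lists 'W'
    const char* name;
};

constexpr Region kMap[] = {
    {0x00000000ull, 0x00001000ull, true,  "debug-controller@0"},
    {0x00003000ull, 0x00004000ull, true,  "error-device@3000"},
    {0x00010000ull, 0x00020000ull, false, "rom@10000"},
    {0x02000000ull, 0x02010000ull, true,  "clint@2000000"},
    {0x0c000000ull, 0x10000000ull, true,  "interrupt-controller@c000000"},
    {0x60000000ull, 0x80000000ull, true,  "mmio@60000000"},
    {0x80000000ull, 0x80004000ull, true,  "dtim@80000000"},
};

// Returns the region containing addr, or nullptr if the address is unmapped.
inline const Region* find_region(std::uint64_t addr) {
    for (const Region& r : kMap) {
        if (addr >= r.base && addr < r.end) return &r;
    }
    return nullptr;
}

// A store can only succeed if the address is mapped and the region has 'W'.
inline bool can_store(std::uint64_t addr) {
    const Region* r = find_region(addr);
    return r != nullptr && r->writable;
}
```

With this model, can_store(0x10000) is false because the address falls in rom@10000, while can_store(0x80000000) is true because dtim@80000000 is writable.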

 

Edmond Cote

Jan 10, 2018, 6:07:30 PM
to RISC-V HW Dev, me...@sifive.com

Nope.  I bet we are wrong.

During ELF file loading.

In dtm.cc ->

void dtm_t::write_chunk(uint64_t taddr, size_t len, const void* src)

This code is written to the processor and executed.

  prog[0] = STORE(xlen, S1, S0, 0);
  prog[1] = ADDI(S0, S0, xlen/8);
  prog[2] = EBREAK; // the disassembly trace will show a store between prog[1] and prog[2]

Here is the disassembly trace.

C0:       1181 [0] pc=[000000340] inst=[00943023] sd      s1, 0(s0)
C0:       1182 [0] pc=[000000344] inst=[00840413] addi    s0, s0, 8
[..] ^^ repeat few times
C0:       1201 [0] pc=[000000344] inst=[00840413] addi    s0, s0, 8
C0:       1202 [1] pc=[000000808] inst=[0340006f] j       pc + 0x34
[..] ^^ repeat few times
C0:       1206 [1] pc=[00000083c] inst=[10002623] sw      zero, 268(zero)
C0:       1207 [0] pc=[000000840] inst=[00100073] ebreak
C0:       1208 [0] pc=[000000844] inst=[7b202473] csrr    s0, dscratch


The CPU takes a few exceptions and sends a store to 0x10c (the sw zero, 268(zero) in the trace), which causes the signal hartExceptionWrEn to go high and an assertion to trigger.

The exception is core_io_dmem_s2_xcpt_ae_st + 3 others.

It was difficult for me to trace this signal back to the Chisel source, so I'm throwing in the towel for now.

Any hints?  What is this processor exception?  Why is it going high during ELF file loading?

My question is ... help?

Megan Wachs

Jan 10, 2018, 6:29:57 PM
to Edmond Cote, RISC-V HW Dev
In this rocket-chip implementation, when the core gets an exception in debug mode, it jumps to the hard-coded exception address, 0x808. 


This does the store that you highlighted in red above (this stores to an address in the debug module to let it know that the Command got an Exception).

The exception occurred on your prog[0], which is why it appears to be "inserted" in the disassembler... you basically jumped to the exception handler when you tried to do the 
prog[0] = STORE(xlen, S1, S0, 0);

So the question is what was in S1, S0 at that time that gave you an exception when you tried to execute the store.


Edmond Cote

Jan 10, 2018, 6:57:39 PM
to Megan Wachs, RISC-V HW Dev
Yep, a whole bunch of garbage S1=340111731e80006f.

Thanks for the pointers to debug_rom.S and tips on what happens during an exception.

Hoping this thread can benefit others as well.


Edmond Cote

Jan 11, 2018, 7:57:08 PM
to RISC-V HW Dev, me...@sifive.com
Update.  Few other issues solved, but this remains.

prog[0] = STORE(xlen, S1, S0, 0); // S1 contains 0x80000000, what-should-be my physical DRAM base address; trying to match the behavior of spike.
prog[1] = ADDI(S0, S0, xlen/8);
prog[2] = EBREAK;

The store causes a DTLB address exception (core_io_dmem_s2_xcpt_ae_st).

My Rocket config has the TLB removed (useVM = false).  For no other reason than "keeping it simple".  There were other threads on this list that explained why TLB modules remain in the design (i.e., they are compiled out).  That confused me at first.

I also saw a statement that said "proxy kernel uses virtual addresses", and I was wondering whether someone could clarify that statement in the context of booting the system (in this case, loading a simple ELF module).

My question:  Why would I see a DTLB address exception when there is no TLB?

Also, the output of the TLB tlb_io_resp_paddr[30:0] is 0x0.  Bit 31 dropped.

The result of both observations is that no bus transaction ever goes out for the store instruction.

Any clues?  Appreciate the help with my ramp ;)

(I will be re-enabling the TLB to see what happens)

-Ed


Edmond Cote

Jan 12, 2018, 10:05:17 AM
to RISC-V HW Dev, me...@sifive.com
Done.  Closing the loop.  I'll plan a short write up on my blog.

My DRAM controller was at 0x800_0000 and not 0x8000_0000.

Thus an illegal address was reported by the TLB.

val legal_address = edge.manager.findSafe(mpu_physaddr).reduce(_||_) // mpu_physaddr OK, TL2 configuration was not

When setting TLManagerParameters for the target/slave device, i.e.:

private val params = TLManagerParameters(
   address = Seq(AddressSet(baseAddr, 0xFFFFFF)),

The base address was wrong.
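
To see why the wrong base turns the store into an access exception, here is a rough model of a base/mask address set. The containment rule used below, (addr ^ base) & ~mask == 0, is my assumption about how such a set behaves and is not quoted from the rocket-chip AddressSet source. AddrSet and contains are hypothetical names.

```cpp
#include <cstdint>

// Hypothetical stand-in for a base/mask address set.
struct AddrSet {
    std::uint64_t base;
    std::uint64_t mask;   // bits that are free to vary inside the set
};

// Assumed rule: an address belongs to the set when it agrees with the base on
// every bit outside the mask.
constexpr bool contains(AddrSet s, std::uint64_t addr) {
    return ((addr ^ s.base) & ~s.mask) == 0;
}

constexpr AddrSet kMistyped = {0x08000000ull, 0xFFFFFFull};  // 0x800_0000
constexpr AddrSet kIntended = {0x80000000ull, 0xFFFFFFull};  // 0x8000_0000

// The store targets 0x8000_0000. With the mistyped base no manager claims
// that address, so a legality check over all managers comes back false.
static_assert(!contains(kMistyped, 0x80000000ull), "store address not claimed");
static_assert(contains(kIntended, 0x80000000ull), "store address claimed");
```

Under that assumption, findSafe finds no manager for 0x8000_0000, legal_address is false, and the store is reported as an access fault before any bus transaction is issued.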

val baseAddr = 0x80000000 // original code: there is no UInt in Scala, so this Int literal is negative and became 0xFFFF_FFFF_8000_0000, which caused issues
val baseAddr = BigInt(Array(0, 0, 0, 0, 8, 0, 0, 0) map { _.toByte }) // first fix, using an Array[Byte], but it introduced a typo: this is 0x800_0000
val baseAddr = BigInt(Array(0, 0, 0, 0, 0x80, 0, 0, 0) map { _.toByte }) // OK: 0x8000_0000
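
The same arithmetic can be reproduced outside Scala. The sketch below is a C++ analogy, not a translation of the Chisel code: sign_extend32 mimics what happens when a signed 32-bit value with bit 31 set is widened, and from_big_endian mimics reading eight bytes most significant first, which is how BigInt(Array[Byte]) interprets its argument when the leading byte is zero. Both helper names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Widen a 32-bit two's-complement bit pattern to 64 bits, the way a signed
// 32-bit integer is widened. Written with unsigned arithmetic so the result
// is well defined.
constexpr std::uint64_t sign_extend32(std::uint32_t bits) {
    return (static_cast<std::uint64_t>(bits) ^ 0x80000000ull) - 0x80000000ull;
}

// Assemble eight big-endian bytes into one value, most significant byte first.
inline std::uint64_t from_big_endian(const std::uint8_t (&b)[8]) {
    std::uint64_t v = 0;
    for (std::size_t i = 0; i < 8; ++i) {
        v = (v << 8) | b[i];
    }
    return v;
}

// Bit 31 set: the widened value picks up 32 leading one bits.
static_assert(sign_extend32(0x80000000u) == 0xFFFFFFFF80000000ull,
              "0x80000000 as a signed 32-bit value widens to a negative number");
```

Feeding from_big_endian the bytes {0, 0, 0, 0, 8, 0, 0, 0} gives 0x0800_0000, and {0, 0, 0, 0, 0x80, 0, 0, 0} gives 0x8000_0000, which lines up with the typo and the fix in the comments above.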

Appreciate all the support!

Ed
