Debug WG 12/16/2016 Notes


Megan Wachs

Dec 16, 2016, 10:49:15 AM
to RISC-V Debug Group
Notes from the meeting are below. Again see the slides as well.

Intros (just the new folks):

John Eden — Retired ASIC designer, responsible for DFT logic. Wants groups across a chip to be talking together (there may be many groups working on the SAME ASIC that have DFT/debug requirements)

Cyril — Microsemi SoC — IP Engineering team. Wants to support different implementations of RISC-V on FPGA depending on customer requirements

Don Baile -- founder of LabMouse security, just heard about RISC-V, here to listen

Larry Maydar — SOC Architect at Google. Part of MIPI standards body for debug and trace.

Richard proposed a new Spec Variant:

Basic idea is that it’s still memory mapped into the System Bus (we all agree on this point)

Only difference — instead of directly mapping core internals, there is now a command and data register set that the debug module interprets in an implementation-specific way

Rationale: GDB uses the remote serial protocol (which we don't want to use), so its commands need to be translated. The debug unit knows what to do. A simple implementation could be a direct mapping into the core registers (internally CSRs). Others could do the instruction translation. It's more abstract.

Q: So are we actually implementing the GDB protocol in hardware?

A: No, “give me this register” —> Debug Unit does this command using a basic handshake communication protocol.
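To make the command/data idea concrete, here is a minimal sketch of a "give me this register" handshake. Everything in it is hypothetical: the class name, the command codes, and the busy flag are invented for illustration and come from neither proposal. The point is only that the debugger sees a command register, a data register, and a handshake, while the debug module is free to satisfy the command however it likes.

```python
# Hypothetical command/data register model. None of these names or
# encodings come from either spec proposal.
CMD_READ_REG = 1
CMD_WRITE_REG = 2


class DebugModule:
    """Toy debug module that interprets commands in its own way."""

    def __init__(self, num_regs=32):
        self._regs = [0] * num_regs   # stands in for the core's registers
        self.data = 0                 # data register seen by the debugger
        self.busy = False             # handshake flag seen by the debugger
        self._pending = None

    def write_command(self, op, regno):
        """Debugger side: post a command; the module raises busy."""
        if self.busy:
            raise RuntimeError("command posted while busy")
        self._pending = (op, regno)
        self.busy = True

    def tick(self):
        """Module side: carry out the pending command, then drop busy.

        A small core might mux the register out directly; a larger one
        might translate the command into instructions. The debugger
        cannot tell the difference.
        """
        if self._pending is None:
            return
        op, regno = self._pending
        if op == CMD_READ_REG:
            self.data = self._regs[regno]
        elif op == CMD_WRITE_REG:
            self._regs[regno] = self.data
        self._pending = None
        self.busy = False


def read_register(dm, regno):
    """Debugger side: post a read, poll busy, then collect the data."""
    dm.write_command(CMD_READ_REG, regno)
    while dm.busy:
        dm.tick()   # in hardware this would just be time passing
    return dm.data


def write_register(dm, regno, value):
    """Debugger side: load the data register, then post a write."""
    dm.data = value
    dm.write_command(CMD_WRITE_REG, regno)
    while dm.busy:
        dm.tick()
```
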

Q: from Tim: So what if the "give me this register" command is just the actual RISC-V opcode?

A: Well that’s kind of inefficient (more bits than you need)

Q: from Gadge -- so are there basically two different interfaces? How does the debug need this?

A: If you want to use the freedom and features of the RAM based debug, you can, or you can do it this other way.

Q: Concern is that the tools vendors won’t want to do two versions. 

All agree -- at the end of the day we want to abstract away what is implemented.

Discussion on Variable Length Instructions — 

Both techniques need to define how to do them as it is a corner case that needs to be specified. 

Q: Tim -- Those that do have variable length instructions *probably* are going to have a bit more space to deal with them, right?

A: Richard — well, customers want vectors and SIMD, but they don’t want floating point units, MMU, etc, etc. So yes and no.

Discussion on Non-Invasive Debugging -- 

Q from Larry: Is executing instructions non-invasive debugging? We should be able to just observe without changing things. If we are executing instructions then we are going to interfere with the running program.

A: That is a fair point. How important is it generally? How much interference are we willing to accept? It seems unlikely that you would *ever* be able to just do that. Note that halting the processor at any point is invasive.

Seems like you would *have* to do something to stop the processor. 

A: It’s just a mux that you can read out. 

Point from Cyril — whatever we do now, we shouldn’t prevent or make life difficult for things like tracing.

Is trace orthogonal to all this? Would trace counters notice a "jump to debug" and get confused/throw off the stats?

A: Tim — way back in the day when I had this specced, trace would be disabled during debug mode

PC sampling — could be more straightforward in Richard’s if there is just a command “sample the PC”, depending on the specific implementation.

Richard -- also has a design where the system is running in lockstep. It implements the direct-mapped debugger. In debug mode the second CPU just stalls. In real life the debug unit is turned off. If they have to execute code in lockstep, not sure how that would work.

Gadge:  if you’re doing stuff in lockstep (trace) that’s different than debug. You’d be taking actions based on what the processor is doing. It really depends on the use case.

Some focusing of discussion: What interface are we talking about?

Debug Block -> Processor Core

or 

Host-> Target

We’re not talking about JTAG. 

Some current implementations: Richard’s proposal and SiFive’s proposal, and Nexus implementation.

Really we want to satisfy the professional tool vendors so that we can have a defined. But based on our discussions with tool vendors, they are just waiting for us to figure out the spec

Maybe we’re defining Debug Module -> Core. But Tim’s latest spec has backed off that significantly.

We definitely want to define what happens between the Debugger (software) and Debug Module

Let's talk about tradeoffs (this discussion got pretty side-railed).

Ease of HW Implementation:

Depends on your core size. Seems like a small simple core would prefer the "direct", a larger/more complicated system would prefer the "instruction"

Statement: Direct and Instruction are the *external view*. It would be possible to implement instruction with direct.

Richard -- in a tiny system, if the debug is implemented as an exception (forcing the CPU to jump to some address space), it can be done. But this address space has to be flexible.

Cyril — it’s not the amount of Debug RAM. It’s the fact that there is RAM there or not. Even in a large system. For some reason, silicon guys want to know about the RAM blocks in the system. Have to build special MBIST. 

Megan — I don’t think we should be talking about RAMs. It’s no longer in Tim’s spec and it wasn’t even in the FE310 implementation. Let's not get hung up on instruction == extra RAM.

Q from Gadge — is 28 bytes really that big a deal?

Richard — no that’s not a big deal. Richard’s biggest concern is that stuffing instructions in is a problem. Why can’t we just get an exception?

Tim — yes, just an exception is fine. The “stuffing” approach is an alternative. Basically you can make the instructions happen however you want.

Statement from Larry— we don’t want to have to put instructions into the pipe to do single step and halt, etc. Tim’s case is more like exception or monitor case, it maps well to that. But we additionally want something to single step and halt. And then do non-invasive debugging.

Q — is Non invasive a REQUIREMENT?

A from Larry -- exception and monitor approach from Tim’s spec works. But to do non-invasive debugging for little cores, you don’t have to do anything: no support logic, nothing but a mux. So this is a base requirement.

Larry & Richard perspective — both direct mapped and debug monitor approach. 

Q -- But are the tool providers going to really support this? Do we need to get an answer from tool providers? How can we split this out?

Larry — I don’t think these two approaches are inconsistent. If we define an exception and we do a monitor portion, then we can also do direct, non-invasive debugging. The tool vendors will support either or. 

Gadge — ARM has both approaches (executing instructions and direct mapped)

Richard -- what version of the spec should I continue with? Leaning towards the second one. Can Larry & Richard work together on the spec, since they seem to share a perspective?

Tim -- request to Richard & Larry -- please do include a way to execute arbitrary instructions. A -- No problem: exception trap to monitor. That gives Tim what he is asking for, if the monitor is in RAM.

There must be some way to execute some arbitrary instruction.

Action Items:

Tim to send out an email with the latest versions of both specs.

Everyone read the latest versions of the specs as both have changed significantly. 

Next week’s meeting will be Wednesday 8am.








--
Megan A. Wachs
Engineer | SiFive, Inc 
300 Brannan St, Suite 403 
San Francisco, CA  94107 

Alex Bradbury

Dec 16, 2016, 11:09:23 AM
to Megan Wachs, RISC-V Debug Group
On 16 December 2016 at 15:49, Megan Wachs <me...@sifive.com> wrote:
> Notes from the meeting are below. Again see the slides as well.

Thanks Megan, I've updated https://riscv.github.io/debug-taskgroup/
with links to the slides+notes.

> Is trace orthogonal to all this? Would trace counters would notice a "jump
> to debug" and get confused/throw off the stats?
>
> A: Tim — way back in the day when I had this specced, trace would be
> disabled during debug mode

I do wonder if there are cases where this isn't what you want - e.g.
you want trace to capture every single load instruction that goes
through the CPU pipeline. How is this handled in existing SoC
trace+debug systems?

Alex

Tim Newsome

Dec 16, 2016, 1:58:44 PM
to Megan Wachs, RISC-V Debug Group
Thanks for taking notes, Megan!

Tim
> --
> You received this message because you are subscribed to the Google Groups
> "RISC-V Debug Group" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to debug+un...@groups.riscv.org.
> To post to this group, send email to de...@groups.riscv.org.
> Visit this group at
> https://groups.google.com/a/groups.riscv.org/group/debug/.
> To view this discussion on the web visit
> https://groups.google.com/a/groups.riscv.org/d/msgid/debug/CAKnTnFRenwo--tAKMvB7TKBCmaSyfj0X_WaypFdbF-r4Rjem9Q%40mail.gmail.com.

Sober Liu

Dec 18, 2016, 10:08:40 PM
to Megan Wachs, RISC-V Debug Group

I think GDB only expects a remote stub when working in “target remote <ipaddr>:<port>” mode over a socket connection.

The remote stub is not hard to implement, including protocol handling (code from gdb) and callback functions. These callback functions talk to the RISC-V debug interface for commands like step/run/register-access/memory-access/etc.

And I believe tool vendors will supply such a remote stub.
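
As a rough illustration of the stub structure described above, the sketch below frames GDB remote serial protocol packets (`$payload#checksum`, where the checksum is the modulo-256 sum of the payload written as two hex digits) and dispatches two commands to callbacks. `ToyTarget` and its methods are hypothetical stand-ins for whatever would actually drive the RISC-V debug interface, and the sketch leaves out acknowledgements, escaping, and most commands.

```python
def rsp_checksum(payload: str) -> int:
    """Modulo-256 sum of the payload bytes, as used by GDB's remote protocol."""
    return sum(payload.encode("ascii")) % 256


def rsp_frame(payload: str) -> str:
    """Wrap a payload as $<payload>#<two hex digits>."""
    return "${}#{:02x}".format(payload, rsp_checksum(payload))


def rsp_unframe(packet: str) -> str:
    """Check the framing and checksum, then return the payload."""
    if len(packet) < 4 or not packet.startswith("$") or packet[-3] != "#":
        raise ValueError("malformed packet")
    payload, received = packet[1:-3], int(packet[-2:], 16)
    if rsp_checksum(payload) != received:
        raise ValueError("bad checksum")
    return payload


class ToyTarget:
    """Hypothetical callbacks; a real stub would talk to the debug interface."""

    def __init__(self):
        self.memory = bytearray(64)
        self.steps = 0

    def read_memory(self, addr, length):
        return bytes(self.memory[addr:addr + length])

    def step(self):
        self.steps += 1


def handle_packet(target, packet):
    """Dispatch a couple of RSP commands to the target callbacks."""
    payload = rsp_unframe(packet)
    if payload.startswith("m"):          # m<addr>,<length>: read memory
        addr, length = (int(x, 16) for x in payload[1:].split(","))
        return rsp_frame(target.read_memory(addr, length).hex())
    if payload == "s":                   # single step
        target.step()
        return rsp_frame("S05")          # stop reply: signal 5 (SIGTRAP)
    return rsp_frame("")                 # empty reply: command not supported
```
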


Alex

Dec 20, 2016, 2:53:58 AM
to RISC-V Debug Group
Hi all,

My name is Alex Gruener and I am J-Link Product Manager at SEGGER Microcontroller GmbH & Co. KG.
SEGGER has been asked by Megan Wachs (SiFive) and Cyril Jean (Microsemi) to provide some feedback on the latest external debug specification.
Here are some points I noticed when reading the spec:

First of all, it looks really good!
Good work by all working on this.
We plan to add support for RISC-V to J-Link within Q1/2017. (We are currently in the process of getting ourselves RISC-V hardware etc.)

1)
It seems that RISC-V supports single stepping in hardware by default.
We recommend either removing this or making it optional, as it can be done by the debug probe without loss of functionality or speed.

2)
From what I understand, on RISC-V there is an optional bus access block in the Debug Module that can be used to access memory independently of the CPU,
in fact making it possible to read/write memory via the debug probe while the CPU is running.
This is similar to what ARM supports on their Cortex-M series via the AHB-AP.
This is a nice feature!
Further, I understand from the spec that if the debug probe tries to write new data etc. while the bus block indicates a busy state, the bus access block goes into an error state and sets the <buserror> bits in the dmcontrol register.
Would it be possible to have something like ARM has on the AHB-AP, where a write to the address/data register can return something like "WAIT" when the bus is busy, so no error bit gets set and the current access is just ignored?
This way, even on slower memories etc., an efficient fast write is doable, as the debug probe just needs to retry the current access a number of times until it is accepted or some software timeout in the probe is reached.
Going into the error state immediately forces the debug probe to change registers in order to check what happened (bad address, bus was busy, ...) and clear error bits (even if the bus was just reporting busy) before it can continue.
To stay compatible, we recommend adding a bit to a debug register that configures whether the bus access block goes into the error state when an access is made while it is busy, or just returns busy, allowing the debug probe to repeat the last access.
The default value of this bit can be 0, selecting the default behavior "go to error state", to keep things compatible.
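
The retry behaviour proposed in point 2 could look roughly like the sketch below on the probe side. This is a toy model under stated assumptions: `SlowBusBlock`, the `"WAIT"`/`"OK"` responses, and the retry limit are all hypothetical, and the block is simply modelled as busy for a fixed number of attempts after each accepted write.

```python
OK, WAIT = "OK", "WAIT"


class SlowBusBlock:
    """Hypothetical bus access block that answers WAIT instead of erroring."""

    def __init__(self, busy_cycles):
        self.busy_cycles = busy_cycles
        self._remaining = 0
        self.memory = {}
        self.addr = 0

    def write_data(self, value):
        """Accept the write, or ignore it and report WAIT while busy."""
        if self._remaining > 0:
            self._remaining -= 1      # each attempt stands in for time passing
            return WAIT
        self.memory[self.addr] = value
        self.addr += 4                # autoincrement to the next word
        self._remaining = self.busy_cycles
        return OK


def write_block(bus, start, words, max_retries=16):
    """Probe side: repeat each access until accepted or the retry limit is hit.

    Returns the total number of retries that were needed.
    """
    bus.addr = start
    retries = 0
    for word in words:
        attempts = 0
        while bus.write_data(word) == WAIT:
            attempts += 1
            retries += 1
            if attempts > max_retries:
                raise TimeoutError("bus stayed busy")
    return retries
```

No error bit is involved here, so the probe never has to switch to another register to recover; the cost is that every response has to be inspected before the next word is sent.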

Hope it helps :)

Best regards
Alex

Tim Newsome

Dec 22, 2016, 4:46:57 PM
to Alex, RISC-V Debug Group
Hi Alex,

Thank you for taking the time to comment.
Note that the spec you read is likely to change a lot, as we merge it with a different proposal written by Richard Herveille.

On Mon, Dec 19, 2016 at 11:53 PM, Alex <alex.g...@segger.com> wrote:
> 1)
> It seems that RISC-V supports single stepping in hardware by default.
> We recommend either removing this or making it optional, as it can be done by the debug probe without loss of functionality or speed.

My rationale for including it was that this is not that expensive to add in hardware, and makes implementing a debugger a lot easier. The logic to predict the next PC is not that straightforward, and may require reading register values. RISC-V is also intended to be extended with custom extensions, which don't have any limitations on them. Somebody could implement a custom branch instruction (branch depending on some vector state, for instance) and without hardware single step the debugger would have to be modified to know that this instruction exists on the hardware it's currently debugging.
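
To give a feel for what a debugger has to do without hardware single step, here is a rough sketch of next-PC prediction for the standard RV32I control-flow instructions (JAL, JALR, and the conditional branches). It assumes 32-bit instructions only and a `regs` list holding the current register values; the function names are made up for this example. Anything it does not recognise is treated as falling through to `pc + 4`, which is exactly where a custom branch instruction would be mispredicted.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    mask = 1 << (bits - 1)
    return (value ^ mask) - mask


def next_pc(pc, insn, regs):
    """Predict the PC after executing one 32-bit RV32I instruction.

    regs is a list of 32 register values, which the debugger would have
    to read from the target before it can resolve JALR or a branch.
    """
    opcode = insn & 0x7F
    funct3 = (insn >> 12) & 0x7
    rs1 = (insn >> 15) & 0x1F
    rs2 = (insn >> 20) & 0x1F

    if opcode == 0x6F:  # JAL: imm[20|10:1|11|19:12]
        imm = ((((insn >> 31) & 1) << 20) | (((insn >> 12) & 0xFF) << 12)
               | (((insn >> 20) & 1) << 11) | (((insn >> 21) & 0x3FF) << 1))
        return (pc + sign_extend(imm, 21)) & 0xFFFFFFFF

    if opcode == 0x67:  # JALR: target is (rs1 + imm) with bit 0 cleared
        imm = sign_extend((insn >> 20) & 0xFFF, 12)
        return (regs[rs1] + imm) & 0xFFFFFFFE

    if opcode == 0x63:  # conditional branches: imm[12|10:5] ... imm[4:1|11]
        imm = ((((insn >> 31) & 1) << 12) | (((insn >> 7) & 1) << 11)
               | (((insn >> 25) & 0x3F) << 5) | (((insn >> 8) & 0xF) << 1))
        a, b = regs[rs1] & 0xFFFFFFFF, regs[rs2] & 0xFFFFFFFF
        sa, sb = sign_extend(a, 32), sign_extend(b, 32)
        conditions = {0: a == b, 1: a != b, 4: sa < sb,
                      5: sa >= sb, 6: a < b, 7: a >= b}
        if funct3 not in conditions:
            raise ValueError("reserved branch encoding")
        offset = sign_extend(imm, 13) if conditions[funct3] else 4
        return (pc + offset) & 0xFFFFFFFF

    # Everything else, including any custom branch, is assumed to fall through.
    return (pc + 4) & 0xFFFFFFFF
```
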
 
> 2)
> From what I understand, on RISC-V there is an optional bus access block in the Debug Module that can be used to access memory independently of the CPU,
> in fact making it possible to read/write memory via the debug probe while the CPU is running.
> This is similar to what ARM supports on their Cortex-M series via the AHB-AP.
> This is a nice feature!
> Further, I understand from the spec that if the debug probe tries to write new data etc. while the bus block indicates a busy state, the bus access block goes into an error state and sets the <buserror> bits in the dmcontrol register.
> Would it be possible to have something like ARM has on the AHB-AP, where a write to the address/data register can return something like "WAIT" when the bus is busy, so no error bit gets set and the current access is just ignored?
> This way, even on slower memories etc., an efficient fast write is doable, as the debug probe just needs to retry the current access a number of times until it is accepted or some software timeout in the probe is reached.

A failed access instantly returns error on the Debug Bus (aka Debug Module Interface). This is reflected in the op field of dbus on the next JTAG scan, so you can retry quite quickly. You do still have to clear the status bit at a different register, but I don't see why that's a big deal. Note that these are registers in the Debug Module. The selected JTAG register remains the same.

The reason for specifying the sticky error bit is to accommodate dumb hardware USB debuggers (e.g. Olimex USB-JTAG adapter). In those setups every USB request adds a lot of overhead, so you want to combine as much JTAG scanning as possible into a single request. But you also don't want to continue writing if something failed (especially when autoincrement is used). Making the bit sticky allows a debugger to optimistically send scans for a lot of data to the debug device in a single USB transaction, and check at the end if it all worked (which will almost always be the case).
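
A toy model may help show why the sticky bit matters with autoincrement. In the sketch below, `StickyBusBlock` is hypothetical and assumes that once the error is set, every later write is dropped until the debugger clears it. That way the address never advances past the failed word, and a whole batch can be checked with a single look at the flag.

```python
class StickyBusBlock:
    """Hypothetical bus access block with a sticky error flag."""

    def __init__(self, busy_on_writes=()):
        self.memory = {}
        self.addr = 0
        self.error = False                  # sticky: only clear_error() resets it
        self._busy_on = set(busy_on_writes) # which write attempts hit a busy bus
        self._count = 0

    def write_data(self, value):
        index = self._count
        self._count += 1
        if self.error:
            return                          # drop everything until cleared
        if index in self._busy_on:
            self.error = True               # busy: flag it, do nothing else
            return
        self.memory[self.addr] = value
        self.addr += 4                      # autoincrement

    def clear_error(self):
        self.error = False


def send_batch(bus, start, words):
    """One 'USB transaction': queue every write, check the flag once at the end."""
    bus.addr = start
    for word in words:
        bus.write_data(word)
    return not bus.error
```

If the flag were not sticky, the writes after a failed one would land at the wrong addresses, since the address would not have advanced for the failed word.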

Tim
 


SEGGER - Alex Gruener

Dec 22, 2016, 5:52:41 PM
to Tim Newsome, RISC-V Debug Group
Hi Tim,

HW stepping:
I get the point regarding custom instructions; however, as it is software, it would always be modifiable. :)
HW not that expensive: More HW, more space, more cost, more power consumption :)

sticky bit:
Need to read the spec. again regarding this. Maybe I missed something.

Don't have it ready right now, but assuming a slow memory / system where "busy" happens, what would a correct write sequence look like that avoids setting the sticky error?
1) Set addr
2) Write word
3) Check if busy (don't know the register off the top of my head)
4) Send next word

I guess the busy bit is in a different register than the data register the word is written to?
So that would mean 2 registers always need to be accessed?
Having a "WAIT" / "OK" response in the TDO output while shifting new data into the data register would make it more efficient. Dumb probes could also simply check the TDO stream received during the long sequence for any WAITs, indicating that not all accesses were taken.


Best regards
Alex

Tim Newsome

Dec 22, 2016, 8:08:40 PM
to Alex, RISC-V Debug Group
On Thu, Dec 22, 2016 at 2:52 PM, SEGGER - Alex Gruener <alex.g...@segger.com> wrote:
> Hi Tim,
>
> HW stepping:
> I get the point regarding custom instructions; however, as it is software, it would always be modifiable. :)
> HW not that expensive: More HW, more space, more cost, more power consumption :)

That is the tradeoff. So far nobody who is planning on implementing RISC-V cores has brought this up as too expensive.
 
> sticky bit:
> Need to read the spec. again regarding this. Maybe I missed something.
>
> Don't have it ready right now, but assuming a slow memory / system where "busy" happens, what would a correct write sequence look like that avoids setting the sticky error?
> 1) Set addr
> 2) Write word
> 3) Check if busy (don't know the register off the top of my head)
> 4) Send next word
>
> I guess the busy bit is in a different register than the data register the word is written to?
> So that would mean 2 registers always need to be accessed?
> Having a "WAIT" / "OK" response in the TDO output while shifting new data into the data register would make it more efficient. Dumb probes could also simply check the TDO stream received during the long sequence for any WAITs, indicating that not all accesses were taken.

You're confusing Debug Module registers and JTAG registers.

The JTAG register is used to perform accesses to the Debug Module. In the Debug Module, writing to sbdata0 will cause a memory write to begin. "If the bus master is busy then accesses set buserror, return error, and don’t do anything else." That error is reflected in the JTAG register that's being used, so it's already in the TDO stream.

When the debugger sees the error, it needs to clear the condition by performing a write to a different Debug Module register (dmcontrol), but that does not require changing JTAG registers. The sequence to write a block of memory would look like this:
1. write word
2. In the TDO output, check whether the access before the last one failed. If so, deal with it.
3. goto 1

Using simple debug hardware, you write lots of words before looking at the TDO output to improve performance.
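
On the probe side, the batched variant of the sequence above might be sketched as follows. The layout is an assumption made for illustration: the probe sends one scan per word plus one trailing scan, the status captured during scan i describes the access started by scan i-1, and once an access fails every later status in the batch also reads as an error (the sticky behaviour). The function names and the `"ok"`/`"error"` values are hypothetical.

```python
OK, ERROR = "ok", "error"


def first_failed_write(tdo_status):
    """Return the index of the first write that failed, or None.

    tdo_status[0] describes whatever happened before this batch and is
    ignored; tdo_status[i + 1] is the status of write i.
    """
    for scan_index in range(1, len(tdo_status)):
        if tdo_status[scan_index] == ERROR:
            return scan_index - 1
    return None


def plan_resend(words, tdo_status):
    """Work out which words to send again after clearing the error."""
    if len(tdo_status) != len(words) + 1:
        raise ValueError("expected one status per write plus a trailing scan")
    failed = first_failed_write(tdo_status)
    return [] if failed is None else words[failed:]
```

For example, with four words and captured statuses `[OK, OK, OK, ERROR, ERROR]`, the third word is the first that failed, so the probe would clear the error and resend the last two words.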

Tim