Best practice for writing a BootROM for a Chipyard-based ASIC design

Jerry Ho

Sep 1, 2024, 11:43:32 PM

to Chipyard

Hey Community,

I have been digging into the default BootROM in Chipyard and found it pretty robust, but it seems to be suitable only for simulation. It depends on a master (the TSI interface or something else) issuing an MSIP write to make the hart jump to a custom address. In simulation, FESVR can write MSIP, but once the chip has been taped out, an off-chip client (a TSI or ChipLink master) must exist to issue that write. This seems less beneficial than a simple on-chip BootROM that jumps directly to the address of another on-chip memory, such as SPI flash, and continues booting the system from there, which is configurable. I think the following code (by Jerry) should be fine:

#define DRAM_BASE 0x80000000

// Dummy empty alternate boot rom.
// Immediately jumps to DRAM_BASE
.section .text.start, "ax", @progbits
.globl _start
_start:
li a0, DRAM_BASE
jalr a0

Our team is taping out a Chipyard-based design, and I want to know whether I am missing something by using a simple jump-elsewhere BootROM instead of the current one in Chipyard.

Also, the custom boot pin functionality seems to work fine even in post-silicon chips. It has an FSM that writes an address to BootAddrReg and then initiates the MSIP write. But I think it is only beneficial during simulation: the BootROM is hardened into the chip and is not configurable, so we cannot change the value stored in BootAddrReg on the fly, which makes the custom boot pin feature less useful.
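
The handshake described in this post can be modelled on a host machine. The sketch below is a minimal C model, not Chipyard code: `boot_regs_t`, `master_kick`, and `bootrom_poll` are hypothetical names, and the two fields merely stand in for hart 0's MSIP bit and BootAddrReg. It only illustrates the ordering (write the target address first, then MSIP) and why some master has to exist to perform the MSIP write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical host-side model of the two registers in the handshake. */
typedef struct {
    uint32_t msip;          /* stands in for hart 0's MSIP bit */
    uint64_t boot_addr_reg; /* stands in for BootAddrReg */
} boot_regs_t;

/* What the external master (or the custom boot pin FSM) does:
 * set the jump target first, then wake the hart with MSIP. */
void master_kick(boot_regs_t *r, uint64_t entry) {
    assert(r != NULL);
    r->boot_addr_reg = entry;
    r->msip = 1;
}

/* What the hart does in the BootROM. Returns false while it would still
 * be waiting; returns true and stores the jump target once MSIP is seen. */
bool bootrom_poll(boot_regs_t *r, uint64_t *jump_target) {
    assert(r != NULL && jump_target != NULL);
    if (r->msip == 0) {
        return false;               /* no master has written MSIP yet */
    }
    r->msip = 0;                    /* clear the software interrupt */
    *jump_target = r->boot_addr_reg;
    return true;
}
```

In this model the hart side is identical no matter which agent plays the master; only the value written to the BootAddrReg stand-in changes.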

Can anyone on the team shed some light? Thanks so much for the amazing work you have done.

Jerry Zhao

Sep 2, 2024, 5:53:08 PM

to chip...@googlegroups.com

Hi,

Your observations on the intended purpose of BootAddrReg and CustomBootPin are correct. The key is that CustomBootPin can be configured, at Chisel elaboration time, to write an alternate address (not 0x8000_0000) into BootAddrReg; see WithCustomBootPinAltAddr.

The boot scenarios I envisioned are:
- Tethered boot with a TSI FESVR frontend explicitly writing into MSIP
- Tethered boot where the TSI FESVR frontend writes an alternate address into BootAddrReg before writing MSIP
- JTAG boot where a JTAG frontend writes an alternate address into BootAddrReg before writing MSIP
- Self-FPGA-assisted-boot where CustomBootPin writes the address of an FPGA-side ROM to BootAddrReg. This allows you to swap out the second-stage bootloader by generating different FPGA bitstreams.
- Total-self-boot where CustomBootPin writes the address of an alternative on-chip second-stage-bootloader in ROM. For example, this second stage bootloader can be something that copies a binary from a SPI device to DRAM, before jumping into DRAM
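
As a rough illustration of the last scenario, the copy step of such a second-stage bootloader could look like the C sketch below. Everything in it is an assumption made for illustration: `spi_read_byte_fn`, `load_image`, and `fake_spi_read` are hypothetical names, the SPI device is reduced to a byte-read callback, and the image length is passed in rather than read from a header. It is written so that it also runs on a host machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback: returns the byte at `offset` in the SPI device. */
typedef uint8_t (*spi_read_byte_fn)(uint32_t offset);

/* Copy `len` bytes of the image from the SPI device into `dram` and
 * return the address the bootloader would jump to afterwards. */
uint8_t *load_image(spi_read_byte_fn read_byte, uint8_t *dram, size_t len) {
    assert(read_byte != NULL && dram != NULL);
    for (size_t i = 0; i < len; i++) {
        dram[i] = read_byte((uint32_t)i);
    }
    return dram; /* entry point is assumed to be the start of the image */
}

/* Stand-in device for host-side experiments: byte i holds (i XOR 0x5a). */
uint8_t fake_spi_read(uint32_t offset) {
    return (uint8_t)(offset ^ 0x5au);
}
```

On real hardware the caller would still need to make the copied instructions visible to instruction fetch (for example with `fence.i`) before jumping to the returned address; that step is outside this sketch.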

Of course, you are free to ignore all this and replace the default BootROM entirely. If you do, just be aware that CustomBootPin and BootAddrReg would be useless, and should probably be removed.

-Jerry

--
You received this message because you are subscribed to the Google Groups "Chipyard" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chipyard+u...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/chipyard/18a06050-3b16-434f-be70-def96e720451n%40googlegroups.com.

Jerry Ho

Sep 2, 2024, 9:56:40 PM

to chip...@googlegroups.com

Jerry, thanks, your reply is really helpful. I now think it's reasonable to leave the BootROM as it currently is, since it gives a lot of flexibility. We actually use SiFive's ChipLink as a substitute for TSI, so maybe we can use ChipLink to issue the BootAddrReg and MSIP writes that boot the system.

Jerry, I have one more question: are you aware of any performance differences between Chipyard's TSI and ChipLink? I know ChipLink's protocol is not open source, but it's actually very straightforward. I haven't had time to delve into the TSI code base yet. Our team chose ChipLink before we found that TSI in Chipyard does almost the same thing, and I wonder whether TSI has any advantages over ChipLink.

Thanks for your work.

On Tue, Sep 3, 2024 at 05:53, 'Jerry Zhao' via Chipyard <chip...@googlegroups.com> wrote:

Jerry Zhao

Sep 2, 2024, 10:19:43 PM

to chip...@googlegroups.com

Interesting that you guys are using Chiplink.

To be precise, the equivalent of ChipLink in testchipip is SerialTileLink, not TSI. TSI is a bringup protocol that maps down to reads and writes that are transferred over SerialTileLink, but can also be transferred over other hardware protocols.
SerialTileLink and ChipLink are both ways to serialize TileLink messages.

There are some disadvantages of ChipLink compared to SerialTileLink that I observed. Please correct me if I'm wrong about these; I didn't do a very thorough examination:
- ChipLink requires more pins than SerialTileLink. In pin-constrained test chips, we can go down to a 1-bit wide SerTL.
- ChipLink hardcodes a sourceID mapping, which may not match the actual mapping on the chip. While we can do the remapping, SerTL just gives us a bit more flexibility.
- ChipLink requires much deeper RX queues than SerTL, which can be a problem in area-constrained test chips
- ChipLink targets performance and throughput, while we use TSI-over-SerTL as a slow debug and bringup mechanism
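
The pin-count versus throughput trade-off in the list above can be made concrete with a generic serializer. The C sketch below is only an illustration of splitting a word across a narrow link; it is not SerTL's or ChipLink's actual framing, and the function names, the least-significant-first ordering, and the supported widths are all assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Split a 32-bit word into flits of `width` bits, least significant first.
 * `out` must have room for 32 entries. Returns the number of flits, which
 * is also the number of link cycles needed to move the word. */
unsigned serialize32(uint32_t word, unsigned width, uint8_t *out) {
    assert(width >= 1 && width <= 8 && 32 % width == 0);
    unsigned n = 32 / width;
    uint32_t mask = (1u << width) - 1u;
    for (unsigned i = 0; i < n; i++) {
        out[i] = (uint8_t)((word >> (i * width)) & mask);
    }
    return n;
}

/* Reassemble a word from flits produced by serialize32 with the same width. */
uint32_t deserialize32(const uint8_t *in, unsigned width) {
    assert(width >= 1 && width <= 8 && 32 % width == 0);
    uint32_t word = 0;
    for (unsigned i = 0; i < 32 / width; i++) {
        word |= (uint32_t)in[i] << (i * width);
    }
    return word;
}
```

With `width` set to 1 a word costs 32 link cycles and one data pin; with `width` set to 8 it costs 4 cycles and eight pins, which is the kind of choice a pin-constrained test chip has to make.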

Some recent changes to SerTL have addressed its main deficiencies compared to ChipLink:
- SerTL now uses source-synchronous clocking by default
- SerTL now uses credited flow control by default, as opposed to the old ready-valid style
- SerTL now supports TL-C as well
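
For readers unfamiliar with credited flow control, the idea can be sketched generically: the sender starts with one credit per receiver buffer slot, spends a credit for each flit it sends, and regains one whenever the receiver drains a slot. The C model below is a textbook illustration with hypothetical names, not SerTL's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Sender-side state: receiver buffer slots currently known to be free. */
typedef struct {
    unsigned credits;
} credit_tx_t;

/* Start with one credit per slot in the receiver's buffer. */
void credit_tx_init(credit_tx_t *tx, unsigned rx_depth) {
    assert(tx != NULL);
    tx->credits = rx_depth;
}

/* Try to send one flit: returns false (stall) when no credit is left. */
bool credit_tx_send(credit_tx_t *tx) {
    assert(tx != NULL);
    if (tx->credits == 0) {
        return false;
    }
    tx->credits--;
    return true;
}

/* The receiver drained one slot and returned a credit to the sender. */
void credit_tx_return(credit_tx_t *tx) {
    assert(tx != NULL);
    tx->credits++;
}
```

Because the sender decides locally whether it may send, it never needs to observe the receiver's ready signal in the same cycle, which is generally what makes this style easier to run across a chip boundary than ready-valid.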

-Jerry Zhao

Jerry Ho

Sep 24, 2024, 4:24:22 AM

to chip...@googlegroups.com

Jerry, sorry for the delayed reply; I didn't notice your message. The reason we use ChipLink is that a lot of functionality needs to be implemented on the FPGA side, so performance and throughput are important. But SerialTileLink does have better support in Chipyard. When I tried to make ChipLink work in our Chipyard-based design, I had the thought that maybe we could expose a CDE key letting users specify which kind of SerDes (ChipLink vs. SerialTileLink) they want to use. But as you said, this may not be necessary as SerialTileLink keeps getting better.

I really appreciate your help.

On Tue, Sep 3, 2024 at 10:19, 'Jerry Zhao' via Chipyard <chip...@googlegroups.com> wrote:

Scott Eckart

Sep 23, 2025, 1:26:24 PM

to Chipyard

Hi Jerry,

Not sure if you'll see this (being that this thread is now a year old) - but any help would be greatly appreciated.

My team is taping out a chip, and we have taken our Chipyard-generated RTL out into a fresh Verilator simulation environment and written a "lean" TestHarness.v (without the convenient DUT connection features like TSI and ChipLink) that more closely reflects the boot program/pattern we intend to have with our ASIC.

We would like our final boot scenario to be like the last one you described:

"- Total-self-boot where CustomBootPin writes the address of an alternative on-chip second-stage-bootloader in ROM. For example, this second stage bootloader can be something that copies a binary from a SPI device to DRAM, before jumping into DRAM"

But for now, we want to totally replace the BootROM with a simple program that writes text out of the UART with a program like this:

```
#define UART_BASE 0x54000000 // UART memory mapped registers base address

#define UART_TXFIFO 0x00 // Transmit FIFO register
#define UART_RXFIFO 0x04 // Receive FIFO register
#define UART_TXCTRL 0x08 // Transmit control (bit 0 = txen)
#define UART_TXMARK 0x0a // Transmit mark ( ?? )
#define UART_RXCTRL 0x0c // Receive control (bit 0 = rxen ??)
#define UART_RXMARK 0x0e // Receive mark ( ?? )

// define ie
#define UART_IP 0x14 // Interrupt pending (bit 0 = txwm_ip)
// define div
// define parity
// define wire4
// define either8or9 -> frame length?

.section .text.hang, "ax", @progbits
.global _hang
_hang:
li t0, UART_BASE
li t1, 'A' // ASCII 'A'

# Enable transmitter (txen = 1)
li t2, 1
sw t2, UART_TXCTRL(t0)

wait_uart_ready:
lw t2, UART_IP(t0) # Read interrupt pending
andi t2, t2, 0x1 # Check txwm_ip bit
beqz t2, wait_uart_ready
sw t1, UART_TXFIFO(t0) # Write to txfifo

done:
j done

```

But it is not working so far. According to the waveform dumps from the simulations we have viewed, the instructions being fetched and executed do not correlate with the ROM addresses being accessed.

Can you spot any problems? Any help would be greatly appreciated, even if you can only give general advice rather than specific debugging.
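
One thing that may be worth checking in the listing above, separate from the fetch-address symptom: assuming the UART is the sifive-blocks UART, the `txwm` pending bit is described as asserting only while the TX FIFO occupancy is strictly below the `txcnt` watermark held in `txctrl`. Writing 1 to `txctrl` sets `txen` but leaves that watermark at 0, so the poll on `UART_IP` might never succeed. A common alternative is to poll the full flag in bit 31 of the transmit register. The C sketch below shows that pattern under those assumptions; the function names are hypothetical, and the register block is passed as a pointer so the code can be exercised on a host against a plain array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Word indices, assuming the same register offsets as the listing above. */
#define UART_TXFIFO_W    (0x00 / 4)
#define UART_TXCTRL_W    (0x08 / 4)
#define UART_TXFIFO_FULL 0x80000000u /* assumed: bit 31 reads 1 when full */
#define UART_TXEN        0x1u        /* txctrl bit 0 = txen */

/* Enable the transmitter. `uart` points at the UART register block. */
void uart_init(volatile uint32_t *uart) {
    assert(uart != NULL);
    uart[UART_TXCTRL_W] = UART_TXEN;
}

/* Wait until the TX FIFO reports not-full, then enqueue one character. */
void uart_putc(volatile uint32_t *uart, char c) {
    assert(uart != NULL);
    while (uart[UART_TXFIFO_W] & UART_TXFIFO_FULL) {
        /* spin until there is room in the FIFO */
    }
    uart[UART_TXFIFO_W] = (uint8_t)c;
}
```

On the target, `uart` would be the UART base address (0x54000000 in the listing) cast to `volatile uint32_t *`. This does not address the mismatch between fetched instructions and ROM addresses described in the post.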
