[riscv-hw] clocks in Rocket

Prof. Michael Taylor

Dec 10, 2015, 10:52:38 AM

to hw-...@lists.riscv.org

Hi,

Are there any asynchronous clock boundaries in the Rocket core?

Besides in SlowIO.scala, are any clocks being generated internally by the source code?

Thanks!

M

--

Michael Taylor
Associate Professor
Department of Computer Science and Engineering
University of California, San Diego
http://www.cse.ucsd.edu/~mbtaylor

Andrew Waterman

Dec 10, 2015, 4:38:01 PM

to Prof. Michael Taylor, hw-dev

Hi Mike,

In the available RTL, there's only one clock domain. We have used the
rocket-chip generator to build chips with multiple clock domains,
though. The boundary between a Rocket tile and the uncore is a
natural demarcation, since all communication across that cut is
decoupled. The queues can just be replaced with async FIFOs.

At the moment, doing so requires manual post-processing of the Verilog
that Chisel spits out.
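
For readers wondering why an async FIFO is safe where a plain queue is not: a common design passes Gray-coded read and write pointers across the clock boundary, so only one bit changes per increment and a synchronizer can never sample a wildly wrong value. The following is a minimal behavioral sketch in Python of that pointer logic only. It is not the rocket-chip or Chisel implementation, and all function names are illustrative assumptions.

```python
def bin_to_gray(n: int) -> int:
    """Convert a binary count to its Gray-code equivalent."""
    return n ^ (n >> 1)


def gray_to_bin(g: int) -> int:
    """Invert bin_to_gray by XOR-ing together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def is_empty(rptr: int, wptr_gray_synced: int) -> bool:
    """Read side: empty when the read pointer equals the synchronized write pointer."""
    return bin_to_gray(rptr) == wptr_gray_synced


def is_full(wptr: int, rptr_gray_synced: int, ptr_bits: int) -> bool:
    """Write side: full when the write pointer is exactly one FIFO depth ahead.

    Pointers carry one extra wrap bit (ptr_bits = log2(depth) + 1). In Gray
    code, being one depth ahead shows up as the top two bits inverted and
    the remaining bits equal.
    """
    top_two = 0b11 << (ptr_bits - 2)
    return bin_to_gray(wptr) == (rptr_gray_synced ^ top_two)


if __name__ == "__main__":
    PTR_BITS = 3  # hypothetical depth-4 FIFO: 2 address bits + 1 wrap bit
    for i in range(1 << PTR_BITS):
        nxt = (i + 1) % (1 << PTR_BITS)
        print(f"{i} -> {nxt}: gray {bin_to_gray(i):03b} -> {bin_to_gray(nxt):03b}")
```

In real hardware each synchronized pointer lags the true pointer by a few cycles, which makes the full and empty flags pessimistic but never unsafe.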

Andrew

Ben Keller

Dec 10, 2015, 5:15:54 PM

to Andrew Waterman, Prof. Michael Taylor, hw-dev

Or you can hack up the Chisel; there is an AsyncFifo class defined, although I can't vouch for its correctness.

-Ben

Prof. Michael Taylor

Dec 10, 2015, 5:48:08 PM

to Ben Keller, Andrew Waterman, hw-dev

Thanks Andrew and Ben!

We have our own CDC FIFOs in the group, so we would probably use those, since we have the backend flow (placement and synchronizers) worked out for them.

Asking the broader question that Andrew alluded to: what kinds of manual post-processing do you guys typically do on the Chisel-generated Verilog before running it through the backend when you build a chip? Put another way, to what extent can I treat Chisel's Verilog output as a black box and just run it through the backend flow?

M

Ben Keller

Dec 10, 2015, 8:48:17 PM

to Prof. Michael Taylor, Andrew Waterman, hw-dev

It’s perfectly reasonable to pass the Chisel-generated Verilog directly through the VLSI flow if you are comfortable with the design residing within a single clock domain.

There are different approaches to building in all of the bells and whistles you need to actually tape out a chip, but the two broad approaches boil down to 1) embedding necessary Verilog modules via Chisel BlackBoxes or 2) instantiating the Chisel-generated Verilog in a handwritten (or script-written) top-level Verilog module.

Off the top of my head, some of the bits and pieces that Chisel doesn’t handle completely (and so we have to postprocess):

* Instantiating a pad frame

* Handling multiple clock domains and async FIFO insertion

* Muxing multiple off-chip clock sources

* Instantiating SRAM macros

* Instantiating analog IP

There’s probably more depending on your goals for the silicon.
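
As an illustration of approach 2 with a script-written top level, here is a minimal Python sketch that emits a Verilog wrapper around the Chisel-generated module. The module name `Top`, the port list, and the wrapper name `chip_top` are hypothetical placeholders, and the generated wrapper only passes ports straight through; the pad frame, clock muxing, SRAM macros, and analog IP from the list above would be added at the marked comment.

```python
def make_wrapper(core_module, ports, wrapper_name="chip_top"):
    """Return Verilog text for a top-level module that instantiates core_module.

    ports is a list of (direction, width, name) tuples, where direction is
    "input", "output", or "inout".
    """
    decls = []
    conns = []
    for direction, width, name in ports:
        vec = f"[{width - 1}:0] " if width > 1 else ""
        decls.append(f"  {direction} {vec}{name}")
        conns.append(f"    .{name}({name})")
    lines = [
        f"module {wrapper_name} (",
        ",\n".join(decls),
        ");",
        "  // Pad frame, clock muxing, SRAM macros and analog IP would go here.",
        f"  {core_module} core (",
        ",\n".join(conns),
        "  );",
        "endmodule",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical port list; in practice this would be read from the
    # Chisel-generated Verilog or from a config file.
    example_ports = [
        ("input", 1, "clk"),
        ("input", 1, "reset"),
        ("output", 8, "io_out"),
    ]
    print(make_wrapper("Top", example_ports))
```

Keeping the wrapper script-generated means it can be regenerated whenever the Chisel port list changes, rather than being patched by hand.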

-Ben

Ouabache Designworks

Dec 11, 2015, 12:50:35 PM

to Ben Keller, Prof. Michael Taylor, Andrew Waterman, hw-dev

On Thu, Dec 10, 2015 at 5:48 PM, Ben Keller <bke...@eecs.berkeley.edu> wrote:

> It’s perfectly reasonable to pass the Chisel-generated Verilog directly through the VLSI flow if you are comfortable with the design residing within a single clock domain.
>
> There are different approaches to building in all of the bells and whistles you need to actually tape out a chip, but the two broad approaches boil down to 1) embedding necessary Verilog modules via Chisel BlackBoxes or 2) instantiating the Chisel-generated Verilog in a handwritten (or script-written) top-level Verilog module.
>
> Off the top of my head, some of the bits and pieces that Chisel doesn’t handle completely (and so we have to postprocess):
>
> * Instantiating a pad frame
> * Handling multiple clock domains and async FIFO insertion
> * Muxing multiple off-chip clock sources
> * Instantiating SRAM macros
> * Instantiating analog IP
>
> There’s probably more depending on your goals for the silicon.
>
> -Ben

Chisel is a good language for component designers building leaf level blocks but it falls apart when architects try using it to configure and interconnect those blocks. You need to have chisel also generate IP-Xact descriptor files so that the architects can pull your designs into an IP-Xact enabled tool flow that was designed to solve all of those problems. Verilog suffers from many of the same issues.
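
To give a rough idea of what such a descriptor contains, here is a minimal Python sketch that writes an IP-XACT style component with a port list. It assumes the IEEE 1685-2014 namespace and element names, has not been validated against the schema, and uses hypothetical vendor, library, and port names. It is not an existing Chisel backend.

```python
import xml.etree.ElementTree as ET

# Assumed namespace for IEEE 1685-2014 IP-XACT documents.
NS = "http://www.accellera.org/XMLSchema/IPXACT/1685-2014"
ET.register_namespace("ipxact", NS)


def q(tag):
    """Qualify a tag name with the IP-XACT namespace."""
    return f"{{{NS}}}{tag}"


def make_component(vendor, library, name, version, ports):
    """Return XML text for a component; ports is a list of
    (direction, width, name) with direction in {"in", "out", "inout"}."""
    comp = ET.Element(q("component"))
    for tag, text in (("vendor", vendor), ("library", library),
                      ("name", name), ("version", version)):
        ET.SubElement(comp, q(tag)).text = text
    ports_el = ET.SubElement(ET.SubElement(comp, q("model")), q("ports"))
    for direction, width, pname in ports:
        port = ET.SubElement(ports_el, q("port"))
        ET.SubElement(port, q("name")).text = pname
        wire = ET.SubElement(port, q("wire"))
        ET.SubElement(wire, q("direction")).text = direction
        if width > 1:
            # Multi-bit ports carry an explicit [left:right] range.
            vector = ET.SubElement(ET.SubElement(wire, q("vectors")), q("vector"))
            ET.SubElement(vector, q("left")).text = str(width - 1)
            ET.SubElement(vector, q("right")).text = "0"
    return ET.tostring(comp, encoding="unicode")


if __name__ == "__main__":
    # All identifiers below are placeholders for illustration.
    print(make_component("example.org", "cores", "Top", "0.1",
                         [("in", 1, "clk"), ("out", 8, "io_out")]))
```

A real descriptor would also need bus interfaces, memory maps, and file sets before an IP-XACT tool flow could do much with it.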

John Eaton

kr...@berkeley.edu

Dec 13, 2015, 8:19:20 PM

to Ouabache Designworks, Ben Keller, Prof. Michael Taylor, Andrew Waterman, hw-dev

>>>>> On Fri, 11 Dec 2015 09:50:35 -0800, Ouabache Designworks <z3qm...@gmail.com> said:
| Chisel is a good language for component designers building leaf level blocks
| but it falls apart when architects try using it to configure and interconnect
| those blocks. You need to have chisel also generate IP-Xact descriptor files
| so that the architects can pull your designs into an IP-Xact enabled tool flow
| that was designed to solve all of those problems. Verilog suffers from many of
| the same issues.

| John Eaton

Adding an optional IP-Xact backend would be a great project for some
external Chisel contributor to work on (volunteers?), but this project
should wait until Chisel 3.0 is stable.

Krste

Prof. Michael Taylor

Dec 13, 2015, 11:43:04 PM

to Ben Keller, Andrew Waterman, hw-dev

Thanks Ben!

------

Sent from a phone without a UCSD GreenDroid processor.

Ben Keller

Oct 25, 2016, 3:42:26 PM

to RISC-V HW Dev

There have actually been a lot of improvements to multi-clock support in Rocket Chip since this thread began 10 months ago, including support for async FIFOs implemented as black-boxed Verilog (these are included in the repo, but you could easily swap in your own). It is now quite easy to build a design with per-core clock domains: instantiate a MultiClockCoreplex instead of a DefaultCoreplex and wire the clocks and resets for each core via the “tcr” signals (for Tile Clock and Reset) at the top level. Chisel then takes care of all of the async crossings for you.

To answer your specific question, all of the communication between the core and the rest of the world needs to be synchronized - take a look at the AsyncConnection trait in src/main/scala/coreplex/Coreplex.scala for an exhaustive list. (The hartid and the reset vector are assumed not to change during system operation, so they are just wired directly.)

Assuming your system organization consists of one clock for each core and one clock for everything else, then the uncore clock drives everything in the periphery and the rest of the coreplex, including the shared L2. (Because of the reorganization of the Rocket Chip hierarchy, we now refer to this as the system clock, not the uncore clock.)

-Ben

On Oct 25, 2016, at 11:07, Prof. Michael Taylor <prof....@gmail.com> wrote:

Hi Ben,
Quick question -- which FIFOs in the chisel code do you guys replace?
M
------
Apologies for any errors, an approximate neural network and a mobile phone synthesized this email.
