instantiating existing RTL blocks


muzaff...@gmail.com

Nov 20, 2022, 5:09:22 PM
to xls-dev
At some point there was this statement by Chris: "One of the things we have on our short list is "good FFI" for instantiating existing RTL blocks (and making their timing characteristics known to the compiler) and making import flows from Verilog/SystemVerilog types. The latter may be a bit your-Verilog-flow specific, but we think there are some universal components you can provide that folks can slot in their flows as appropriate.

Being able to re-time pipelines without a rewrite is a useful capability. Although it's still experimental and we're actively building out the capabilities, we have it in real designs that have important datapaths."

Is there any update on where we are on this flow? Is there any tutorial on how to use any of these features if they already exist?

Best


Leary, Chris

Nov 25, 2022, 8:03:52 PM
to muzaff...@gmail.com, Paul Rigge, xls-dev
+Paul Rigge made progress in this direction, in that there's now scaffolding for overriding particular operations for different environments, e.g. assertions or operand gates.


These hooks are exposed via codegen options which we documented here: https://google.github.io/xls/codegen_options/#format-strings
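As a rough illustration only (MY_ASSERT_MACRO is made up, and the exact flag names and supported placeholders are the ones documented on that page, so double-check against your XLS version), a flow could swap the default assertion emission for a project-specific macro at codegen time:

  codegen_main my_block.opt.ir \
    --generator=pipeline \
    --pipeline_stages=2 \
    --delay_model=unit \
    --assert_format='`MY_ASSERT_MACRO({condition}, "{message}")'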

Though that notably doesn't get us all the way to plugging in multi-cycle Verilog modules that the scheduler can reason about, which is the desired endpoint when we talk about FFI. (Paul took a detour from there and is now working on RAM support.) We took a bit of a shortcut on some FFI-like ops we wanted recently by making them built-in operations in the IR instead of making them user-defined; I'm hopeful we'll kick them out to be user-defined once we have a full FFI facility.

The napkin sketch is that in the frontend we'd hope for something like:

extern "verilog" my_great_op {verilog_name: "my_great_op_mod", latency: 2} (x: bits[32], y: bits[32]) -> bits[64] {
  ... polyfill here ...
}

Where the "polyfill" definition would allow us to still do simulation/verification at the XLS level and have something to compare between RTL-level simulation results and higher level simulation results (DSL level, unopt IR, opt IR).

/If/ the polyfill definition is bit-accurate (vs., say, one valid implementation among several possible implementations of an abstract function, which would be a little more complicated), then hypothetically this could also let the compiler trade off treating these "custom ops" as black boxes vs. optimizing against the provided definition -- it could be that optimizing against the user-provided polyfill definition gives a better result.

Hope that makes sense, let us know if you have questions/comments/thoughts.

- Chris Leary


muzaff...@gmail.com

Nov 26, 2022, 8:14:07 PM
to xls-dev
Thanks for the detailed response. We had lunch with Paul a couple of weeks ago, which is what triggered this re-examination of XLS. Frankly, I'd much rather have array->memory mapping à la Catapult than fully integrated FFI support.
My immediate goal is not reasoning about/optimizing XLS code together with (S)Verilog, but just easy integration/stitching. Previously we had a Python-based flow for this, and I'm leaning towards that in my current role too -- probably Amaranth/nMigen or Magma.

Best
-K

Leary, Chris

Nov 28, 2022, 7:25:03 PM
to muzaff...@gmail.com, xls-dev
Very cool -- please keep us in the loop as you finalize your thinking about your stitching solution and whether it will be developed in OSS under a license like Apache 2.0. It'd be great to have something canonically open source to point users at, and that we might be able to help with. It's been on our wish list -- in particular I was leaning towards trying more things in Amaranth, but hadn't done a detailed analysis or example design to really inform that preference -- but given the variety of solutions people use, the priorities have been elsewhere. For ASIC flows, I'm excited about the possibility of us all collaborating on tooling on top of open PDKs for open testing and CI, e.g. sky130; XLS already pulls in bazel_rules_hdl, which has support for synth/PnR to GDS on open PDKs.

Paul Rigge

Nov 28, 2022, 7:25:03 PM
to muzaff...@gmail.com, xls-dev
What Chris said is all very true; the main thing I'd add is that I see the work I'm doing with RAMs as the common case and a starting point for more generic Verilog FFI. For general Verilog FFI we'll need more flexible ways to specify interfacing and scheduling constraints, but my goal is to leverage the SRAM work for general Verilog FFI.

I'd also add that the op override stuff Chris mentioned is best suited for when the op you're overriding is generically useful. It's a pretty big task to add an op, so I wouldn't want to do it for a one-off block.

Re: supporting memories, the same doc Chris linked also talks about an experimental RAM feature. There's an example usage in DSLX here, but using it from C++ currently requires using ac_channel (it doesn't yet work with arrays).


Leary, Chris

Nov 28, 2022, 7:30:21 PM
to Paul Rigge, muzaff...@gmail.com, xls-dev
Yeah, great point -- there's a sense in which a RAM module looks like "just another black-box implementation": you send it a message, some corresponding message comes back with some (potentially known) latency and (known) pipelined-ness, and ideally we have a behavioral model of how that black box works for simulation at the XLS level. Those are all the components at the crux of what we'd think of as "foreign functions".
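To make that shape concrete (the names and types here are made up, and the real RAM support has its own request/response formats and codegen rewrites documented in the page linked earlier -- this is just a sketch of the message-passing pattern in DSLX):

  // Hypothetical request/response types for a single-port read path.
  struct ReadReq { addr: u10 }
  struct ReadResp { data: u32 }

  // A proc that treats the memory as a black box: issue a request on one
  // channel, get the reply back on another after some externally known
  // latency, and forward the data downstream.
  proc Reader {
    rd_req: chan<ReadReq> out;
    rd_resp: chan<ReadResp> in;
    data_out: chan<u32> out;

    config(rd_req: chan<ReadReq> out, rd_resp: chan<ReadResp> in,
           data_out: chan<u32> out) {
      (rd_req, rd_resp, data_out)
    }

    init { u10:0 }

    next(addr: u10) {
      let tok = send(join(), rd_req, ReadReq { addr: addr });
      let (tok, resp) = recv(tok, rd_resp);
      let _ = send(tok, data_out, resp.data);
      addr + u10:1
    }
  }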

muzaff...@gmail.com

Nov 30, 2022, 5:54:35 PM
to xls-dev
This sounds very promising. Right now I'm investigating a mix of (amaranth/magma + verible + something_like_sifive_duh), but it's quite early. I'll definitely push hard to make at least parts of what I come up with OSS.

muzaff...@gmail.com

Nov 30, 2022, 6:01:31 PM
to xls-dev
In terms of treating RAM as another process with which we communicate over channels, I'm somewhat conflicted, as it's a significant departure from C++. I really like how C++ HLS (à la Catapult) makes life easy by just inferring this under the hood and keeping the code very similar to where it came from (i.e. the research/dev code base). Converting all those local static arrays for line buffers etc. into separate processes would be a significant effort and a (one-way) change from the input.

Tim 'mithro' Ansell

Nov 30, 2022, 7:24:24 PM
to muzaff...@gmail.com, xls-dev, Johan Euphrosine, Chris Leary
Hi everyone,

I have personally never had the chance to use XLS, but Johan Euphrosine <pro...@google.com>, a developer relations engineer on our team, has used XLS with a number of other frameworks.

For one example, Johan used XLS to create an RGB<->HLV CPU opcode using the CFU Playground (http://cfu-playground.rtfd.io/), which is LiteX-based. He presented this work at WOSET 2021 and the information can be found at https://woset-workshop.github.io/WOSET2021.html#porting-software-to-hardware-using-xlsdslx (CFU Playground was also presented at WOSET 2021 and can be found at https://woset-workshop.github.io/WOSET2021.html#cfu-playground-build-your-own-ml-processor-using-open-source).

Johan also demonstrated an XLS adder using OpenLane (an open-source automated physical design ASIC toolchain) in SKY130 (an open-source 130nm PDK), all in a Jupyter notebook hosted on Google Colab at https://colab.research.google.com/github/proppy/silicon-notebooks/blob/xls/xls-adder-openlane.ipynb

LiteX is probably a great system to use as a wrapper / SoC stitcher, as it has wide support for a large number of FPGAs and IP cores.

As mentioned above, LiteX is used to provide the SoC in the CFU Playground. 

There are also multiple examples of Linux-capable SoCs using LiteX; see https://github.com/litex-hub/linux-on-litex-vexriscv and https://github.com/litex-hub/linux-on-litex-rocket -- this work has even been used by the LibreBMC project (an FPGA-based BMC soft SoC designed to run the OpenBMC stack) -- see https://www.prnewswire.com/news-releases/openpower-foundation-to-showcase-librebmc-a-fully-open-source-power-based-bmc-at-ocp-global-summit-301652496.html

LiteX is also used by the BeTrusted project (https://betrusted.io/) for their Precursor SoC. See this great post about their solution at https://www.crowdsupply.com/sutajio-kosagi/precursor/updates/a-guided-tour-of-the-precursor-system-on-chip-soc

The LiteX ecosystem is also used in Google's RowHammer test platform, which we have been developing with Antmicro, in large part because of the very powerful memory controller (LiteDRAM) included in the ecosystem -- https://opensource.googleblog.com/2021/11/Open%20source%20DDR%20controller%20framework%20for%20mitigating%20Rowhammer.html

I have an old diagram (from around 2018) of the ecosystem, which can be found at https://docs.google.com/drawings/d/15hSX1cHwz2_Lm-CxQxlelGUuVPP4AmEEljyRb1ncjjk/edit

If LiteX is not what you are looking for, you should also consider the FuseSoC project (https://github.com/olofk/fusesoc), which is used by the OpenTitan project (https://opentitan.org/). FuseSoC has pretty strong integration with multiple HDLs like Chisel, and many IP blocks out there already have FuseSoC support.

Hopefully this information is useful to you!

Tim 'mithro' Ansell

