Fast way to initialize RAM in tests

Øyvind Harboe

Mar 17, 2017, 9:09:19 AM
to chisel-users
For my tests, I am currently initializing RAM through the external interface of my top-level design.

This is *slow* (minutes), whereas the actual execution doesn't take very long.

I've tried to peek or poke signals outside of the IO() bundle, but poke refuses to do so.

Is there a way to directly initialize a RAM block deep within my design from a PeekPokeTester()?



Cheers,

Steve Burns

Mar 17, 2017, 9:53:14 AM
to chisel...@googlegroups.com
A couple of thoughts.
If you are always initializing the RAM to the same values, you should be able to refactor the RAM into a BlackBox, describe the RAM in Verilog, and initialize it using an "initial" block with a "$readmemh" statement. This will work with the Verilog-based simulations (VCS or Verilator). If you like using the firrtl-interpreter backend, you'll have to use a different mechanism to provide a Scala model for your black box (there is a reference on the Wiki).

Another option is to use the firrtl-interpreter directly (not through the PeekPokeTester interface), where you can poke state variables internal to the design.
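
For the "$readmemh" route, the test flow needs a hex image file per dataset. Below is a minimal C++ sketch of producing that text, assuming a RAM no wider than 32 bits and a file with no address markers or comments (so loading starts at the lowest address). The function name `to_readmemh` and the `width_bits` parameter are made up for illustration; they are not part of Chisel or Verilator.

```cpp
#include <cassert>
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Format words as text that $readmemh can parse: one hex word per line.
// width_bits is the assumed RAM word width; upper bits are masked off.
std::string to_readmemh(const std::vector<uint32_t>& words,
                        unsigned width_bits = 32) {
  assert(width_bits >= 1 && width_bits <= 32);
  const int digits = static_cast<int>((width_bits + 3) / 4);
  const uint32_t mask =
      width_bits == 32 ? 0xFFFFFFFFu : ((1u << width_bits) - 1u);

  std::ostringstream out;
  out << std::hex << std::setfill('0');
  for (uint32_t w : words) {
    // setw applies to the next value only, so it is set per word.
    out << std::setw(digits) << (w & mask) << '\n';
  }
  return out.str();
}
```

The returned string would be written to a file whose name the BlackBox's Verilog passes to "$readmemh"; how that file name is chosen per dataset is left open here.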

--
You received this message because you are subscribed to the Google Groups "chisel-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to chisel-users+unsubscribe@googlegroups.com.
To post to this group, send email to chisel...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/chisel-users/700428d3-45ee-4db3-a9e9-55ed07327e8f%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Øyvind Harboe

Mar 17, 2017, 10:53:02 AM
to chisel-users
I need to use the Verilator backend for performance because I need to launch the same simulation with hundreds of datasets. Startup time is hugely important. 

After some rummaging through options, I think I've landed on writing my own Verilator driver in C. Fortunately, I have the Chisel Verilator C driver to guide me; it doesn't look too bad, and the performance should be *great*. There are various reasons why this makes sense for me.

From the Verilator C driver, I have access to *all* signals, not only the top-level signals that are exported to PeekPokeTester().

Øyvind Harboe

Mar 22, 2017, 7:46:55 PM
to chisel-users
Writing my own harness for Verilator did the trick! I can do what I wanted and much more! 

I'm a bit disappointed by the speed of initialising RAM in Verilator, but it's good enough for now. I would have thought there was a way to write a bit of C++ to simply poke the values I wanted into the RAMs directly, but I ended up poking & stepping in my C++ harness too, just like in the Chisel PeekPokeTester.
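
The poke-and-step pattern described above can be sketched in C++ as follows. This is only an illustration: `MockRam` is a hypothetical stand-in for a Verilator-generated model (a real harness would include the generated header instead), and the signal names `clock`, `we`, `addr`, and `wdata` are assumptions, not names from the actual design.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a generated model: a synchronous RAM with one
// write port. Only the eval() call mirrors what a real model offers.
struct MockRam {
  uint8_t clock = 0, we = 0;
  uint32_t addr = 0, wdata = 0;
  std::vector<uint32_t> mem;
  uint8_t prev_clock = 0;
  std::size_t evals = 0;  // how many times the model was evaluated

  explicit MockRam(std::size_t depth) : mem(depth, 0) {}

  // Commit a write on the rising clock edge when write-enable is set.
  void eval() {
    ++evals;
    if (clock && !prev_clock && we) {
      assert(addr < mem.size());
      mem[addr] = wdata;
    }
    prev_clock = clock;
  }
};

// Advance one full clock cycle: two model evaluations.
template <typename Model>
void step(Model& dut) {
  dut.clock = 1;
  dut.eval();
  dut.clock = 0;
  dut.eval();
}

// Load an image through the write port, one word per clock cycle.
template <typename Model>
void load_by_poking(Model& dut, const std::vector<uint32_t>& image) {
  dut.we = 1;
  for (std::size_t i = 0; i < image.size(); ++i) {
    dut.addr = static_cast<uint32_t>(i);
    dut.wdata = image[i];
    step(dut);
  }
  dut.we = 0;
}
```

Every word costs two evaluations of the whole model, which is why this kind of loading scales with image size times design size rather than with image size alone.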

Schuyler Eldridge

Mar 23, 2017, 2:02:56 PM
to chisel-users
Steve's $readmemh approach should be pretty efficient. A non-BlackBox approach that I was using was to add a DPI function, `dpi_readmemh`, that takes a filename as input, inside the Chisel-emitted Verilog. From the C++ testbench I could then initialize that memory with whatever I wanted. At the time, I wrote a Perl script that would apply this instrumentation for a specific signal in a specific module. Note: this is horribly kludgy, and the best approach here, I expect, would be to enable this with a FIRRTL pass that exposes memories via the DPI/VPI.

Note: for the VPI/DPI, I've found Verilator to be horrendously lacking in documentation...
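
As a rough illustration of the C++ side of a DPI hookup, here is a sketch of a related variation, not the `dpi_readmemh` mechanism described above: instead of C++ calling into Verilog with a filename, the instrumented Verilog imports a C function and asks it for each word. All names (`g_ram_image`, `set_ram_image_from_hex`, `ram_image_word`) are hypothetical, and the parser handles only plain whitespace-separated hex words, with no comments or address markers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical buffer that the testbench fills before simulation starts.
static std::vector<uint32_t> g_ram_image;

// Parse whitespace-separated hex words into the buffer.
// std::stoul throws on tokens that are not valid hex.
void set_ram_image_from_hex(const std::string& text) {
  g_ram_image.clear();
  std::istringstream in(text);
  std::string token;
  while (in >> token) {
    g_ram_image.push_back(
        static_cast<uint32_t>(std::stoul(token, nullptr, 16)));
  }
}

// Assumed Verilog-side declaration (hypothetical name):
//   import "DPI-C" function int ram_image_word(input int addr);
// An initial block in the instrumented memory could call this per address.
// Addresses beyond the loaded image read as zero.
extern "C" int ram_image_word(int addr) {
  assert(addr >= 0);
  const std::size_t i = static_cast<std::size_t>(addr);
  return i < g_ram_image.size() ? static_cast<int>(g_ram_image[i]) : 0;
}
```

The instrumentation of the emitted Verilog (adding the import and the initial block) is still needed, so this does not remove the kludge; it only moves the file handling into the testbench.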

At a more philosophical level, this is one of the weird disconnects between what Chisel is doing and what you think Chisel should be capable of. Verilog/SystemVerilog gives you 100% signal visibility in a completely non-synthesizable way. If you want to peek/poke some signal way deep inside of a module that has no IO to do so, you can. Chisel, on the other hand, restricts you to a set of synthesizable constructs, as it's really a language for emitting circuit descriptions in FIRRTL. From the purely Chisel level, dynamically loading memory should be impossible unless you emit a circuit that allows you to do this.

Øyvind Harboe

Mar 23, 2017, 3:14:01 PM
to chisel-users
I think we need a few impure solutions to make this all work. My design
runs at 400 Hz in Verilator, whereas the fastest silicon runs at 4 GHz.
That's a 10^7 difference. If you want to simulate something non-trivial,
then there need to be some shortcuts.

I ended up creating a special top-level interface that's only used in
simulation (it exposes hundreds of pins and makes no physical sense) and
which the FPGA synthesis just sweeps away. It's clean in some ways (it doesn't
require knowing much about advanced Chisel concepts or Verilog) and a
kludge in others (it messes up the interface of my RAM module and any
intervening modules up to the top level). It gives me an amazing 5 kBytes/s RAM
loading speed, which is *way* faster than before! :-)