SonicBOOM (BOOMv3) Released

Jerry Zhao

Jun 1, 2020, 1:12:05 PM
to riscv-boom
Hi all,

We've just tagged the latest version of BOOM as SonicBOOM (BOOMv3). SonicBOOM can achieve 6.2 CoreMark/MHz, and higher IPC than the A72 on SPEC17 workloads. SonicBOOM provides several new features over BOOMv2, including

- Superscalar TAGE-based branch prediction algorithm
- Multi-level BTB, with repaired Return-Address-Stack
- Auto-internal-predication of short-forwards-branches
- Integration with Dromajo/Fromajo co-simulation tools
- Superscalar load/store unit (2 ld/cycle)
- Next-line-prefetcher into L1 line-fill-buffers
- Support for RoCC accelerators, such as the Gemmini NN accelerator (https://github.com/ucb-bar/gemmini)

Additionally, many structures were rewritten to be higher-performance in the OOO context, including the rename-structures, instruction-fetch-unit, branch predictors, load-store-unit, and L1 data cache. A more detailed description of major changes and performance numbers is available in our CARRV report: https://carrv.github.io/2020/papers/CARRV2020_paper_15_Zhao.pdf.

SonicBOOM is available to use through version 1.3 of the Chipyard SoC design framework, available here: https://github.com/ucb-bar/chipyard. Users interested in SonicBOOM should follow the Chipyard documentation to instantiate a SonicBOOM-based SoC.

While SonicBOOM v3.0.0 represents a large step forwards for BOOM development, we have even more optimizations sitting in the pipeline. A secondary set of performance/physical optimizations is currently being verified for a future v3.1.0 release. Additionally, we plan on updating the (very stale) docs for BOOM to reflect the SonicBOOM pipeline.

Best,
-Jerry Zhao


Erling Jellum

Jun 2, 2020, 3:14:17 AM
to Jerry Zhao, riscv-boom
Great news.

Just in time for us to merge it into our work and rerun benchmarks before our master's thesis deadline.

Keep up the good work.
--
Erling Rennemo Jellum

P: +47 465 15 653



--
You received this message because you are subscribed to the Google Groups "riscv-boom" group.
To unsubscribe from this group and stop receiving emails from it, send an email to riscv-boom+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/riscv-boom/CAC%2BpDSFsdQ68ue1Opb8O6DH1mQhNViB-ZKU-AANjX8ziMk3SsQ%40mail.gmail.com.

Christopher Celio

Jun 3, 2020, 12:37:36 AM
to riscv-boom
Good luck finishing up!

-Chris

赵夏

Jun 8, 2020, 12:32:27 AM
to riscv-boom
Hi Jerry,

   Great work!! Thanks!!

   BTW, does SonicBOOM have an L2 TLB and how many TLB misses can the PTW handle simultaneously? I did not find these things in the SonicBOOM paper.

Cheers,
Xia

Jerry Zhao

Jun 8, 2020, 12:39:55 AM
to 赵夏, riscv-boom
Hi Xia,

SonicBOOM supports an L2 TLB. See here for a link to code which configures it.
The PTW handles 1 TLB miss at a time. There is room for improvement here.

-Jerry

yujia liu

Jun 16, 2020, 5:47:13 AM
to riscv-boom

Hi Jerry,
I am new to BOOM. Last week I generated BOOM RTL with MegaBoomConfig and got 5.4 CoreMark/MHz; GigaBoomConfig gave 5.5 CoreMark/MHz.
How can I generate a BOOM core that reaches the stated 6.2 CoreMark/MHz? By the way, if I comment out some code in BoomConfig.scala, for example
the line "// new freechips.rocketchip.subsystem.WithNoMMIOPort ++", I get the error "Reference system is not fully initialized.  [error]    : system.mmio_axi4.0.w.ready <= VOID".
How can I fix it?

Bert Pieters

Jun 16, 2020, 6:41:23 AM
to riscv-boom
Hi,

As the core is configurable, I assume my question is difficult to answer. Could you also provide a rough estimate of the gate count for the 3 BOOM variants (v2, and the new v3)?

It would certainly complete the comparison table from the CARRV report, next to the CoreMark scores.

Thanks!

Best regards,
Bert

Jerry Zhao

Jun 16, 2020, 12:31:56 PM
to yujia liu, riscv-boom
Hi Yujia,

To reach >6 CM/MHz with MegaBoom, you must set the "enableSFBOpt" flag in boom/common/parameters.scala. I did not enable this optimization by default since it results in non-intuitive core and branch-predictor behavior.
Additionally, CoreMark is highly compiler-sensitive. What compiler flags are you using?

As for the issue, that seems to be a bug in Chipyard; I'll push a fix soon.

-Jerry
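
For readers comparing the scores quoted in this thread, here is a minimal Scala sketch of how a CoreMark/MHz figure follows from a run. It relies only on the usual definition (CoreMark iterations per second divided by the clock frequency in MHz, which simplifies to iterations * 1e6 / cycles). The object name, the function names, and the example run of 10 iterations in 2,000,000 cycles are made up for illustration and are not taken from BOOM or Chipyard.

```scala
object CoreMarkPerMHz {
  /** CoreMark/MHz = (iterations per second) / (frequency in MHz),
    * which simplifies to iterations * 1e6 / cycles. */
  def score(iterations: Long, cycles: Long): Double = {
    require(iterations > 0 && cycles > 0, "iterations and cycles must be positive")
    iterations.toDouble * 1e6 / cycles.toDouble
  }

  /** Average number of cycles spent per CoreMark iteration at a given score. */
  def cyclesPerIteration(coreMarkPerMHz: Double): Double = {
    require(coreMarkPerMHz > 0.0, "score must be positive")
    1e6 / coreMarkPerMHz
  }

  def main(args: Array[String]): Unit = {
    // Made-up run, for illustration only: 10 iterations in 2,000,000 cycles.
    println(f"${score(10L, 2000000L)}%.2f CoreMark/MHz")

    // Scores quoted in this thread, converted to cycles per iteration.
    for (s <- Seq(5.4, 6.2, 6.35)) {
      println(f"$s%.2f CoreMark/MHz is about ${cyclesPerIteration(s)}%.0f cycles/iteration")
    }
  }
}
```

Because the score depends only on the cycle count per iteration, anything that changes the instruction stream, including compiler version and flags, moves the number even when the core is unchanged.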

Jerry Zhao

Jun 16, 2020, 12:44:39 PM
to yujia liu, riscv-boom
Never mind, the error message you were seeing is the intended behavior. By deleting that line from the config, you specified a design with external AXI ports into the L2 FBus. However, that config does not specify what will be driving those ports in the elaborated design. You should add the `chipyard.iobinders.WithTieOffL2FBusAXI` fragment into your config, as this fragment specifies that top-level L2 FBus AXI ports should be tied off.
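
The reasoning above can be illustrated with a small, self-contained Scala toy model. It does not use Chipyard or rocket-chip. The `Port`, `NoPort`, and `TieOff` names are hypothetical stand-ins for a top-level AXI port, a `WithNo...Port` fragment, and a tie-off IOBinder fragment respectively. The model assumes that the MMIO port (as in `system.mmio_axi4` from the error message) and the L2 FBus port are separate top-level ports, and that a tie-off covers only the port it names; both are assumptions of this sketch, not statements about the real fragments.

```scala
object PortBindingSketch {
  // Hypothetical top-level ports, loosely named after identifiers in the thread.
  sealed trait Port
  case object MmioAxi extends Port   // cf. `system.mmio_axi4` in the error message
  case object L2FBusAxi extends Port // the L2 FBus AXI port

  // A fragment either removes a port from the design or supplies a driver for it.
  sealed trait Fragment
  final case class NoPort(port: Port) extends Fragment // stand-in for a WithNo...Port fragment
  final case class TieOff(port: Port) extends Fragment // stand-in for a tie-off IOBinder

  /** Ports that exist at the top level but have no driver. In this toy model,
    * these correspond to "not fully initialized" elaboration errors. */
  def undriven(fragments: Seq[Fragment]): Set[Port] = {
    val all: Set[Port] = Set(MmioAxi, L2FBusAxi)
    val removed: Set[Port] = fragments.collect { case NoPort(p) => p }.toSet
    val driven: Set[Port] = fragments.collect { case TieOff(p) => p }.toSet
    all -- removed -- driven
  }

  def main(args: Array[String]): Unit = {
    // Both ports removed: nothing is left undriven.
    println(undriven(Seq(NoPort(MmioAxi), NoPort(L2FBusAxi))))
    // FBus port exposed with no binder: it is reported as undriven.
    println(undriven(Seq(NoPort(MmioAxi))))
    // FBus port exposed and tied off: nothing is left undriven.
    println(undriven(Seq(NoPort(MmioAxi), TieOff(L2FBusAxi))))
    // MMIO port exposed, only the FBus tied off: MMIO is still undriven.
    println(undriven(Seq(TieOff(L2FBusAxi))))
  }
}
```

In this model, every port that a config exposes needs its own driver, so tying off one port does not help a different exposed port.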

yujia liu

Jun 24, 2020, 3:46:22 AM
to riscv-boom
Hi Jerry,
I set the "enableSFBOpt" flag and reached 6.35 CM/MHz with MegaBoom. However, when I delete the line "new freechips.rocketchip.subsystem.WithNoMMIOPort ++"
or "new freechips.rocketchip.subsystem.WithNoSlavePort ++" from the config and add the `chipyard.iobinders.WithTieOffL2FBusAXI` fragment to the config,
the error "Reference system is not fully initialized" still occurs.

Jerry Zhao

Jun 24, 2020, 3:57:50 AM
to Bert Pieters, riscv-boom
Hi Bert,

That's a good question. The challenge is, as you mentioned, matching the configurations across multiple BOOM versions. Additionally, our current synthesis infrastructure is not backwards compatible with old BOOM versions, so bringing that up will be a pain.

I'm currently optimizing various parts of the core to reduce physical design pain. I'll report new critical paths, area estimates, and gate count once I get to a comfortable point.

-Jerry

Bert Pieters

Jun 24, 2020, 4:01:33 AM
to riscv-boom
Gate counts that at least match the reported performance scores would be enough. I don't think the configurations need to match between BOOM versions.

The basic question is: yes, you increased performance, but at what cost (number of gates)?
