Announcing BOOMv2 (oh, and also BOOMv1 while we're at it).


Christopher Celio

Aug 16, 2017, 1:54:08 AM
to riscv-boom
Hey all,

I'm happy to announce BOOMv2, 3 months of work to improve the synthesis quality-of-result (QoR) of BOOM.

We felt the change list is qualitatively different enough to warrant a version number bump:

  * The unified issue window (which holds all uops) has now been split into three separate issue windows. This improves issue-select performance while increasing the total number of in-flight, unexecuted instructions.
    - integer issue window
    - memory issue window
    - floating point issue window
  * Likewise, the unified physical register file has also been split up, separating floating point registers from integer registers. This lowers the total register count within a single register file and allows for a slightly reduced register read port count.
  * Issue-select and register-read now (optionally) have their own dedicated stages.
  * Rename stage is now (optionally) split across two cycles (now decode+rename1 and rename2+dispatch).
  * BOOM now provides its own branch target buffer (BTB).
    - This requires changes to the Rocket source code to provide an override to the rocket.Frontend's own BTB.
    - BOOM's BTB is now set-associative, allowing for dramatically more entries, and fits into single-port SRAM.
    - BOOM's BTB now uses a simpler bimodal predictor to quickly drive BTB predictions as opposed to Rocket's gshare predictor.
  * The BTB is now used to inform the conditional branch predictor (BPD) of the branch info and target for F2 redirection.
    - The F2 redirection can be turned off if it proves too slow of a critical path.
  * The front-end adds a new stage, F3, from which branch prediction redirections can be performed based on the decoded instructions.
  * A utility-class of SeqMemTransformable has been added to transform tall, skinny memories into rectangular memories. This allows the branch predictor tables to target single-port SRAMs.
  * The h-bit in the branch predictor two-bit counter tables can now be shared across 2 p-bits.
  * The MediumBoomConfig has been tweaked to match a parameter set that we believe has reasonable synthesis QoR results.
  * A blackbox example is provided for an integer register file if you just so happen to want to take BOOM down to layout and prefer to not rely on fully synthesized flip-flop-based register files.
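
To illustrate the tall-and-skinny to rectangular memory transformation mentioned in the list above, here is a minimal Python sketch. It is not the SeqMemTransformable code from BOOM; the class name `ReshapedMemory`, the `fold` parameter, and the row/column address split are all assumptions made for illustration. The idea is that `fold` adjacent logical entries are packed side by side into one wider physical row, so a table of many narrow entries fits an SRAM with a more reasonable aspect ratio.

```python
class ReshapedMemory:
    """Hypothetical model: a tall, skinny logical memory stored in a
    wider, shallower physical array (not BOOM's actual implementation)."""

    def __init__(self, depth, width, fold):
        if depth % fold != 0:
            raise ValueError("depth must be a multiple of fold")
        self.depth = depth          # number of logical entries
        self.width = width          # bits per logical entry
        self.fold = fold            # logical entries packed per physical row
        self.mask = (1 << width) - 1
        # Physical array: depth // fold rows, each width * fold bits wide.
        self.rows = [0] * (depth // fold)

    def _locate(self, addr):
        # Assumed mapping: upper part of the address selects the row,
        # lower part selects the column within that row.
        if not 0 <= addr < self.depth:
            raise IndexError(addr)
        row, col = divmod(addr, self.fold)
        return row, col * self.width

    def read(self, addr):
        # Read the whole physical row, then select the requested entry.
        row, shift = self._locate(addr)
        return (self.rows[row] >> shift) & self.mask

    def write(self, addr, value):
        # Masked write: only the bits of the addressed entry change.
        row, shift = self._locate(addr)
        cleared = self.rows[row] & ~(self.mask << shift)
        self.rows[row] = cleared | ((value & self.mask) << shift)
```

For example, `ReshapedMemory(depth=1024, width=2, fold=16)` models 1024 two-bit counters stored as 64 rows of 32 bits. In hardware, the write path would need per-entry write masks (or a read-modify-write), which this sketch glosses over.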


BOOMv1 has been tagged as a release at (https://github.com/ucb-bar/riscv-boom/releases/tag/v1.0) so the older version lives forever.

BOOMv2 is now the master branch of boom and is marked as a release at (https://github.com/ucb-bar/riscv-boom/releases/tag/v2.0).


Warts and bugs are expected. BOOMv2 so far has mostly been an effort in improving physical QoR, and as such, a loss in Instructions per Cycle (IPC) is not surprising. Most of the IPC loss is due to the increased load-use delay caused by splitting issue-select and register-read into separate cycles (which can be reconfigured back to the old latency). 

We have more plans to improve BOOM and to take advantage of its new features. We will also be releasing a tech report on BOOMv2 in the near(ish) future.


-Chris