FPU implementation

Nasir Abbas

Oct 6, 2024, 9:56:52 PM
to RISC-V ISA Dev
Hi everyone, I want to implement a single-precision floating-point unit in a RISC-V processor. Could someone kindly help by telling me which core I should select, as well as which FPGA to use for the hardware implementation?

Tommy Murphy

Oct 7, 2024, 8:52:09 AM
to Nasir Abbas, RISC-V ISA Dev
There's no single implementation that is necessarily the right one in this context. If you do an internet search for, say, "RISC-V open-source FPGA implementation", you'll find many different implementations, some of which will include an FPU and some of which may be suitable for your needs or as pedagogical examples.

Al Martin

Oct 7, 2024, 10:39:20 AM
to RISC-V ISA Dev, Nasir Abbas
GitHub has many RISC-V implementations (in various design languages), some of which include the F extension, that can be used as starting points. There are also standalone FP implementations that could be used, though they need interface modifications to fit into an existing RISC-V implementation that lacks floating-point support. Their quality (that is, their compliance with IEEE 754) varies.

Al Martin

Oct 9, 2024, 7:31:30 PM
to Sean Halle, RISC-V ISA Dev, Nasir Abbas
Based on personal experience, you should be able to get close to your goals using HardFloat as your starting point. There are some fairly obvious places to add staging flops to accomplish pipelining. You should also be able to minimize power/energy with aggressive clock gating techniques.

The problem with a lot of the FP implementations I've looked at on GitHub is that they fail compliance in some of the hairy edges of IEEE 754 (issues with rounding, subnormal handling, etc.). HardFloat (which is pretty much a Verilog or Scala implementation of SoftFloat) has never failed anything we've thrown at it.
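
As a rough illustration of the kind of edge cases meant here, the sketch below lists a few binary32 rounding and subnormal corner cases with the bit patterns a compliant round-to-nearest-even conversion should produce. It is not taken from HardFloat or from any test suite mentioned in this thread; the names `f32_bits`, `EDGE_CASES`, and `failing_cases` are made up for the example, and the reference conversion relies on the host CPU's IEEE 754 behaviour via Python's `struct` module.

```python
import struct

def f32_bits(x: float) -> int:
    """Round a Python float (binary64) to binary32 and return its bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]

# (description, exactly representable binary64 input, expected binary32 bits)
EDGE_CASES = [
    ("tie between 1.0 and the next float rounds to even (down)",
     1.0 + 2.0**-24, 0x3F800000),
    ("tie whose lower neighbour is odd rounds to even (up)",
     1.0 + 3 * 2.0**-24, 0x3F800002),
    ("smallest normal number", 2.0**-126, 0x00800000),
    ("half of the smallest normal is a subnormal", 2.0**-127, 0x00400000),
    ("smallest subnormal", 2.0**-149, 0x00000001),
    ("half of the smallest subnormal ties to zero", 2.0**-150, 0x00000000),
]

def failing_cases(to_bits=f32_bits):
    """Return the cases where the conversion under test disagrees with the table."""
    return [
        (name, hex(to_bits(value)), hex(want))
        for name, value, want in EDGE_CASES
        if to_bits(value) != want
    ]

if __name__ == "__main__":
    # An empty list means every listed corner case matched.
    print(failing_cases())
```

To check a hardware model, one could pass a hypothetical wrapper around the design's simulation output as `to_bits`. A handful of vectors like this is only a smoke test; real compliance testing needs far broader coverage across all operations and rounding modes.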

Regarding alternate algorithms (e.g. high-radix Booth encoding for the multiplier arrays, or alternative div/sqrt methods), I too would be interested in any data regarding latency vs gate count vs power tradeoffs.

Al Martin


On Wed, Oct 9, 2024 at 3:21 PM Sean Halle <sean...@intensivate.com> wrote:
Hi Tommy, thanks for the pointers.

Nasir, if you end up doing a detailed study, would you be up for sharing your evaluation? 

Note: we are using HardFloat, which comes with Rocket-Chip, but are interested in other options as we turn towards deeper optimization. If there is interest, we can share what we discover as well.
 
FYI, the main criteria we're looking at are:
1. IEEE754 compliance
2. ability to pipeline it deeply (targeting 3.5GHz on 6nm)
3. energy per operation -- measured on a canonical node -- for open source work, an option would be the SkyWater PDK plus OpenLane tools -- they enable rough power estimation (note: energy varies by operation, but this is more of a back-of-the-envelope estimate, so we use 7% activation)
4. performance factors -- these are qualitative features, such as the type of divider (invert and multiply versus iterative, etc.) -- basically some notes about implementation choices that you believe affect performance.
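
To make the interplay between criteria 1 and 4 concrete, here is a minimal software sketch of why the divider style matters for compliance. It assumes a naive invert-and-multiply scheme in which the reciprocal is itself rounded to binary32 before the multiply. This is an illustration only, not how any particular core or HardFloat implements division, and the function names are hypothetical. Practical invert-and-multiply dividers generally carry extra precision or add a correction step to avoid the mismatch shown.

```python
import struct

def f32(x: float) -> float:
    """Round a binary64 value to the nearest binary32 value (ties to even)."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def div_direct(x: float, y: float) -> float:
    """Reference quotient for binary32 operands.

    Dividing in binary64 and rounding once more to binary32 yields the
    correctly rounded binary32 quotient when x and y are binary32 values.
    """
    return f32(x / y)

def div_via_reciprocal(x: float, y: float) -> float:
    """Naive invert-and-multiply: round 1/y to binary32, then multiply."""
    r = f32(1.0 / y)   # first rounding error enters here
    return f32(x * r)  # second rounding can land on the wrong neighbour

if __name__ == "__main__":
    x, y = 5.0, 3.0
    print(div_direct(x, y).hex(), div_via_reciprocal(x, y).hex())
```

For `5.0 / 3.0` the two results differ by one unit in the last place, while for `3.0 / 3.0` they agree, which is why this kind of error is easy to miss without targeted test vectors.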

Thanks Nasir and Tommy,

Sean

