Fault injection for ECC demonstration


alejandr...@gmail.com

Jul 6, 2022, 11:54:25 AM
to RISC-V HW Dev
Hi all,

I'm working on a project to demonstrate the fault tolerance that Rocket-chip provides through ECC. So far I have enabled ECC for the L1 caches and for an internal TLRAM. I've also checked that the ECC bits are present in the generated Verilog and that the ECC encoding works correctly in simulation.

My next step is to check whether software interrupts work correctly when an uncorrectable error is detected, and for that I need a way to inject faults into the architecture at run time. I found some traits and classes called ECCTest in the UnitTest directory, but I was not able to make them work with my architecture. So my questions are:

 - Is there a way to inject faults into the architecture at the Chisel level? If so, are the UnitTest traits the right way to do it?

 - Are there other recommended approaches, such as injecting them at the FIRRTL level or using third-party tools?
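For context on why the uncorrectable path needs its own test: assuming a SECDED code (an assumption; the thread does not say which code is configured), a single flipped bit is corrected silently, and only two flips in the same codeword are reported as uncorrectable. The sketch below is a toy extended Hamming(8,4) model in Python, not Rocket-chip's implementation, and the function names are hypothetical.

```python
DATA_POSITIONS = (3, 5, 6, 7)   # Hamming(7,4) data positions; 1, 2, 4 hold parity


def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming (SECDED) codeword."""
    bits = [0] * 8              # index 0 = overall parity, 1..7 = Hamming(7,4)
    for shift, pos in enumerate(DATA_POSITIONS):
        bits[pos] = (nibble >> shift) & 1
    for p in (1, 2, 4):         # parity bit p covers positions whose index has bit p set
        bits[p] = sum(bits[i] for i in range(1, 8) if i & p and i != p) % 2
    bits[0] = sum(bits[1:]) % 2  # make the parity of the whole word even
    return bits


def decode(bits):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    bits = list(bits)
    syndrome = 0
    for i in range(1, 8):
        if bits[i]:
            syndrome ^= i       # XOR of set positions is 0 for a valid codeword
    overall_even = sum(bits) % 2 == 0
    if syndrome == 0 and overall_even:
        status = "ok"
    elif not overall_even:
        bits[syndrome] ^= 1     # odd parity: treat as a single flip at index `syndrome`
        status = "corrected"
    else:
        return "uncorrectable", None   # even parity, non-zero syndrome: two flips
    nibble = sum(bits[pos] << shift for shift, pos in enumerate(DATA_POSITIONS))
    return status, nibble


def flip(bits, *positions):
    """Return a copy of the codeword with the given bit positions inverted."""
    out = list(bits)
    for pos in positions:
        out[pos] ^= 1
    return out
```

For example, `decode(flip(encode(0b1010), 5))` reports a corrected word with the original data, while `decode(flip(encode(0b1010), 2, 5))` reports an uncorrectable error. An injector that flips only one bit per codeword would therefore never exercise the interrupt path under this assumption.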

Many thanks in advance! Any help is really appreciated.
Alex. 

sp...@section5.ch

Jul 7, 2022, 7:45:52 AM
to RISC-V HW Dev, alejandr...@gmail.com
As far as my open-source experience with the generic V* goes, this could be achieved as follows: keep the original model as is and create a fault-injection wrapper that flips bits at random, controlled by a frequency parameter. If you need to tweak parameters at run time or trigger on a specific event in an interactive or non-deterministic way, this gets nastier and very simulator-specific; I know of only a handful of simulators that allow looping a software backdoor into a hardware entity. The other way, via FLI/VPI, is simpler, but it requires a master to drive the compiled simulation backend (which may be needed anyway when faults must occur at a specific time) and is probably too slow with most simulators (e.g. Icarus Verilog) for something as complex as an entire SoC.
I was playing with the Yosys CXXRTL backend lately, which allows running such model-in-the-loop scenarios with decent performance, but it requires a bit of C++ hacking and had some issues with combinatorial statements (obscure errors, which eventually pointed to bad coding practice on my side). I'm not sure, though, whether experimenting with or switching simulators is an option for you.
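The wrapper idea above can be sketched as a small behavioural model. This is a minimal Python illustration, not a Chisel or Verilog implementation; the class name, the per-read flip probability, and the seeded generator are all assumptions made for the example.

```python
import random


class FaultInjectingMemory:
    """Wraps an unmodified word store and may flip one bit of each value read."""

    def __init__(self, width_bits, flip_probability, seed=0):
        self.width_bits = width_bits
        self.flip_probability = flip_probability  # chance of one flip per read
        self.rng = random.Random(seed)            # seeded so runs are reproducible
        self.words = {}                           # the original model, left as is
        self.injected = []                        # log of (address, bit) flips

    def write(self, address, value):
        # Store the value unchanged, truncated to the word width.
        self.words[address] = value & ((1 << self.width_bits) - 1)

    def read(self, address):
        value = self.words.get(address, 0)
        if self.rng.random() < self.flip_probability:
            bit = self.rng.randrange(self.width_bits)
            value ^= 1 << bit                     # transient flip; stored word is untouched
            self.injected.append((address, bit))
        return value
```

With `flip_probability=1.0` every read returns a word that differs from the stored one in exactly one bit, and `injected` records which bit was hit so a test can correlate each fault with the observed ECC response. Hitting two bits in the same word, or triggering at a specific time, would need the kind of external control described above.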

alejandr...@gmail.com

Jul 13, 2022, 11:52:04 AM
to RISC-V HW Dev, sp...@section5.ch
Thanks for your help.

Since I'm developing on an UltraScale, I will first try to inject faults manually on the FPGA, into the BRAMs used for main memory. If that's not successful, I will go for a Chisel wrapper that randomly injects faults into the memory.

Thanks for your advice!

Alex.  