fault - an open source fault injection developer-centric tool


Sylvain Hellegouarch

Feb 11, 2026, 2:02:06 PM
to Chaos Community
Hey everybody,

I'm going to try something a bit out there and talk about a Chaos Engineering related project. Crazy me.

I built a developer-centric tool called "fault". Yeah, it's a fault injection tool as well. Naming is hard.

fault started as a way to have a dev-friendly network fault injection tool for myself. It was also an opportunity to learn Rust. In the end, it has grown into something a bit more fun than I anticipated.


The basics first.

fault acts as a network proxy. You start it with a fault pattern and point your client at it.

$ fault run --proxy 9090=example.com:443 --with-latency --latency-mean 300

You can of course have multiple fault types at once.

Then you can also create a schedule:

fault run \
    ... \
    --latency-sched "start:20s,duration:40s;start:80s,duration:30s" \
    ... \
    --bandwidth-sched "start:35s,duration:20s"


This means you can create rather complex network patterns from your CLI and get on with figuring out how your application copes with them.
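To make the schedule above easier to picture, here is a minimal Python sketch of one way to read those strings. It is not fault's own parser: the `Window` class, the `parse_schedule` and `active_faults` helpers, the seconds-only unit handling, and the half-open interval are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    start: float     # seconds after the proxy starts (assumed meaning)
    duration: float  # seconds the fault stays active

    def active_at(self, t: float) -> bool:
        # Assumption: the window is half-open, [start, start + duration).
        return self.start <= t < self.start + self.duration


def parse_schedule(spec: str) -> list[Window]:
    """Parse 'start:20s,duration:40s;start:80s,duration:30s' into windows.

    Assumption: every value is expressed in seconds with an 's' suffix.
    """
    windows = []
    for chunk in spec.split(";"):
        fields = dict(part.split(":", 1) for part in chunk.split(","))
        windows.append(
            Window(
                start=float(fields["start"].rstrip("s")),
                duration=float(fields["duration"].rstrip("s")),
            )
        )
    return windows


# The two schedules from the command above.
SCHEDULES = {
    "latency": parse_schedule("start:20s,duration:40s;start:80s,duration:30s"),
    "bandwidth": parse_schedule("start:35s,duration:20s"),
}


def active_faults(t: float) -> list[str]:
    """Return the names of the faults whose windows cover time t."""
    return [
        name
        for name, windows in SCHEDULES.items()
        if any(w.active_at(t) for w in windows)
    ]


if __name__ == "__main__":
    for t in (10, 25, 40, 70, 85):
        print(t, active_faults(t))
```

Read this way, latency is applied from 20s to 60s and again from 80s to 110s, while the bandwidth fault overlaps the first latency window between 35s and 55s.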

The fun part

When you find solid scenarios you need to verify continuously, you can record them in a YAML file and run them with "fault scenario".

To help you out, fault comes with a scenario generator that creates a bunch of cases from your OpenAPI spec.

Then you can ask a scenario case to run a mini load test and fault will generate a report of what happened.

Finally, you can specify SLOs in your scenario. They don't need to actually exist. You just set their definition and fault will tell you whether, under the scenario's conditions, they would be impacted for the given load test.
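The idea of checking a not-yet-existing SLO against load test results can be sketched in a few lines of Python. This is only an illustration of the principle, not how fault computes its report: the sample data, the `percentile` and `evaluate_slos` helpers, and the SLO names and thresholds are all hypothetical.

```python
def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile, with pct given as an integer from 1 to 100."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def evaluate_slos(samples: list[tuple[float, bool]], slos: dict) -> dict:
    """Return, per SLO name, True if it holds for the samples.

    samples: (latency in ms, request succeeded) pairs from a load test.
    slos: hypothetical definitions, e.g.
          {"p95_latency_ms": 450, "max_error_rate": 0.01}
    """
    latencies = [latency for latency, _ in samples]
    failures = sum(1 for _, ok in samples if not ok)
    error_rate = failures / len(samples)
    return {
        "p95_latency_ms": percentile(latencies, 95) <= slos["p95_latency_ms"],
        "max_error_rate": error_rate <= slos["max_error_rate"],
    }


# Made-up load test: 20 requests between 300 ms and 490 ms, one failure.
samples = [(300.0 + 10 * i, i != 7) for i in range(20)]
slos = {"p95_latency_ms": 450, "max_error_rate": 0.01}

if __name__ == "__main__":
    for name, holds in evaluate_slos(samples, slos).items():
        print(name, "ok" if holds else "would be impacted")
```

With these made-up numbers both objectives would be flagged as impacted, which is the kind of answer you want before the SLO ever reaches production.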


The extra-fun part

fault works great on your local machine, but what if I told you it could also inject itself into your cloud platform?

For instance, for AWS ECS:

fault inject aws \
    --region <region> \
    --cluster <cluster-name> \
    --service <service-name> \
    --duration 30s \
    --with-latency --latency-mean 800

It can also inject itself into Kubernetes or GCP Cloud Run.

The extra-cherry-on-the-cake-fun part

If you're keen on a pinch of LLM, fault can:

* Act as an MCP server to let your favourite agent run verification on its own
* Let your favourite LLM review the reports from scenario tests
* Let you scramble the conversation with an LLM so you can see how you handle that failure mode

The what's left part

* fault supports eBPF if you want to be in stealth mode
* fault comes with DNS errors
* fault can be extended via gRPC (and we have a neat showcase to play directly on the PostgreSQL protocol)

Overall, it's been fun to build and even more fun to use. I also plugged it into unfault, its sister project, so that you can click on a function in VS Code and it will play a fault case directly from there.

Oh and it's open source and free. 

Now we can go back to odd messages once more.

Thank you folks!

- Sylvain
