On 10/11/23 10:43, Yifei Ma wrote:
> My question about etcd-raft is whether the community has ever
> considered adding Fast Paxos to this library. If the community
> has worked on it (e.g. planning or high-level design), what was the
> outcome? Are there any technical concerns or challenges in implementing it?
> If the community has not worked on it, does the community believe it is
> worth giving it a try (e.g. a POC)?

As far as I know, nobody has considered it. Etcd originated as the
demonstration case of Raft. I don't remember at this point why the Raft
researchers rejected Fast Paxos, and it may not matter. I do know that
the folks at CitusDB adopted Fast Paxos instead of Raft; we had a long
argument about it.

The technical challenge would be demonstrating that an implementation
based on Fast Paxos is correct under real-world conditions. At this
point, Etcd has nine years of engineering behind consistency, data
retention, and crash recovery -- most of which is closely tied to how
etcd-raft works. So doing this would probably require extending Etcd's
test suite to make sure that the change hasn't introduced a new data
loss bug. We'd also need testing to show that real-world performance is
actually better.

So realistically you're looking at a year-long project here, during
which you'd become an expert in Etcd. You can decide whether that's
worth it for you. However it comes out, the results would be
interesting.
--
-- Josh Berkus
Kubernetes Community Architect
OSPO, OCTO