Raft and data stored in RAMCloud

Boaz Leskes

Feb 25, 2015, 3:21:19 PM

to raft...@googlegroups.com

Hi,

Reading up on Raft and LogCabin, you find many examples of how Raft is used in LogCabin to support cluster metadata, such as master election and other configuration options. It's not clear, though, whether Raft is also used in RAMCloud for data consistency. I tried to check the RAMCloud repo, but it seems to be down. Can someone here answer the question? If it's not used, is there a concrete reason why (other than that the work still needs to be done, which is perfectly valid)?

Cheers,

Boaz

Diego Ongaro

Feb 25, 2015, 8:57:58 PM

to Boaz Leskes, raft...@googlegroups.com

Hi Boaz,

The machine hosting the RAMCloud repo seems to have died today, so it
finally got moved over to GitHub:
https://github.com/PlatformLab/RAMCloud

While RAMCloud was the original motivation for developing Raft and
LogCabin, RAMCloud does not currently use Raft or LogCabin. RAMCloud's
coordinator relies on a pluggable ExternalStorage module
(src/ExternalStorage.h) for consensus on the cluster metadata, but
there is only a ZooKeeper implementation of that so far
(src/ZooStorage.cc). I think the RAMCloud team would welcome a
LogCabin implementation of ExternalStorage, but to my knowledge, no
one has done this yet.

For replicating its data, RAMCloud uses its own master-backup
replication. It waits for all of R (typically 3) backups to
acknowledge log data before considering it committed, and the
coordinator is involved when either the master fails or any of the
backups holding the master's head segment fails. See the
papers/dissertations that discuss the log and/or recovery mechanism at
https://ramcloud.atlassian.net/wiki/display/RAM/RAMCloud+Papers , or
src/ReplicaManager.cc and src/BackupClient.cc.
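
As a rough illustration of the commit rule described above, here is a
minimal sketch. The function name and parameters are hypothetical and
are not taken from RAMCloud's code; the real logic lives in
src/ReplicaManager.cc.

```python
def is_committed(acks_received: int, r: int = 3) -> bool:
    """Hypothetical helper: log data counts as committed only once
    all R backups have acknowledged it (R is typically 3)."""
    if acks_received < 0 or r < 1:
        raise ValueError("acks_received must be >= 0 and r must be >= 1")
    return acks_received >= r


# With R = 3, two acknowledgements are not enough; three are.
print(is_committed(2))
print(is_committed(3))
```

Note that this waits for every backup rather than a majority, which is
why a failed backup of the head segment needs the coordinator's help.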

Using Raft for replicating data in RAMCloud would have some benefits,
but the resulting system would tolerate fewer faults with the same
amount of hardware. I wrote a little about this in Section 11.7.2 (last
paragraph) of my dissertation:
https://github.com/ongardie/dissertation#readme . In short, RAMCloud
can survive the failure of R of its R+1 copies, but it needs to
involve the coordinator, which in turn invokes an ExternalStorage
module for consensus, to do so. Raft can only automatically survive
the failure of F of its 2F+1 copies, though it can do so autonomously.
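
To make the arithmetic concrete, here is a small sketch comparing the
two schemes for the same number of copies. The function names are
hypothetical, and the RAMCloud figure assumes the coordinator (and its
ExternalStorage) stays available to drive recovery.

```python
def ramcloud_faults_tolerated(copies: int) -> int:
    """R+1 copies (one master plus R backups) survive R failures,
    assuming the coordinator is available to handle them."""
    if copies < 1:
        raise ValueError("need at least one copy")
    return copies - 1


def raft_faults_tolerated(copies: int) -> int:
    """2F+1 copies survive F failures: a majority must stay alive."""
    if copies < 1:
        raise ValueError("need at least one copy")
    return (copies - 1) // 2


for copies in (3, 4, 5):
    print(copies, ramcloud_faults_tolerated(copies), raft_faults_tolerated(copies))
```

For example, with 4 copies (a master and R = 3 backups) the first
function gives 3, while a 4-server Raft cluster tolerates only 1
failure.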

Moreover, RAMCloud's data replication is asymmetric in that the master
copy is in memory and the backup copies are on disk. The data is
unavailable unless it's in memory and has been indexed. Each master
scatters its backup data across the entire cluster so that it can be
read quickly from many disks upon a crash. It'd be hard to map this
onto Raft and maintain the same (in-memory) availability and fast
crash recovery properties.

Finally, there are projects out there using Raft for data replication.
Facebook's HydraBase is probably the biggest, using Raft for
cross-datacenter data replication in HBase:
https://issues.apache.org/jira/browse/HBASE-12259

Hope that helps, or at least keeps you busy :)

BTW, this is a good place to continue Raft discussions, while the best
place for RAMCloud-only discussions is ramcloud-dev:
https://mailman.stanford.edu/mailman/listinfo/ramcloud-dev

Best,
Diego

> --
> You received this message because you are subscribed to the Google Groups
> "raft-dev" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to raft-dev+u...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Boaz Leskes

Feb 26, 2015, 2:51:11 PM

to Diego Ongaro, raft...@googlegroups.com

Thank you Diego.

That's very helpful.
