HA Characteristics for Applications that implement Raft

vs0522

Aug 3, 2016, 3:26:33 PM

to raft-dev

I have not read the code of any Raft implementation (LogCabin in particular), and I have the following question.

At a high level, in Raft:

1. The leader "commits" a log entry once it has been replicated to a majority of the servers.

2. The leader then applies the committed entry to its state machine. If the state machine is "Application A" (App-A), then App-A executes the operations in the entry. If App-A is a database (DB) or a KV store, and the entry specifies a sequence of DB or KV store operations (e.g., a transaction), then, if all goes well, the DB instance carries out the transaction successfully (e.g., commits it).

3. Followers now have the same entry in their logs, and each will (I assume) have its state machine (its App-A instance) apply the operations in that entry (e.g., a DB transaction).

Question:

For the client (a "Raft/App-A" client; I am assuming a properly written one), I assume it knows the leader, the Raft term, the state of the leader's log commit, etc., and that it can react properly if a leader election or leader change has happened or is in progress. The question is: is the client also aware of the "state" of the state machine (App-A)? For example, does it know whether the DB transactions have been successfully committed on the leader and the followers?

More generally, in Raft implementations (specifically LogCabin), beyond handling the Raft log, is there any coordination after App-A applies log entries: coordination that monitors the "state" of App-A across the leader and followers, and coordination between the servers (the Raft leader and followers, and their App-A instances) and the client?

The reason for asking is that the client ultimately needs to know the "state" of the application (App-A), and within the consensus algorithm it seems the leader (and followers) should also know something about this state on their respective state machines. Is this coordination implemented in LogCabin or other Raft implementations, or is it assumed to be the responsibility of the implementer (user) of the Raft protocol, including client-side logic?

Hope the question is not too convoluted and really appreciate a response.

Archie Cobbs

Aug 3, 2016, 3:55:37 PM

to raft-dev

This question falls into the general bucket of how you take the consensus primitive that Raft provides, namely a simple append-only log, and convert it into something that looks to clients more like a database. By "database" we mean some complicated piece of state with access via transactions.

There are probably lots of ways to do that, with a general trade-off between:

(a) Simpler server implementation, but client is required to be "Raft-aware", and it looks less like a database
(b) More complicated server implementation, but client is not required to be Raft-aware, and it looks more like a database

My implementation is of type (b), whereby the client simply sees a transactional key/value store, and database transaction mechanics and Raft mechanics are decoupled. For example, a transaction can span a leadership change without notice. For details see here.

-Archie

Philip Haynes

Aug 3, 2016, 11:28:06 PM

to raft-dev

Leading on from Archie's point, I think this is going to depend a bit on what you are designing for and how you think systems should be designed. In my case, that is finance and fraud. In real-world use cases you want to expose and execute the full DFSM of how money is processed, rather than just go along with an undergraduate textbook example of, say, a debit/credit working within a single transaction. In real-world scenarios, systems are built so that processing starts with an initial type that goes through a series of mutations, which are executed as part of the commit. To do this, the application has to be acutely aware of the "state" of the state machine, as the act of committing is the DB transaction. Once a state change commits on the leader, this is, of course, reflected back to followers with the next AppendEntries request.

I found it interesting on page 16 of a recent talk by Birman where he states:

"... with Paxos, it is actually more natural to just build a database in which the protocol state “is” the application state"

So, for a simplified implementation, the client registers a series of asynchronous callbacks with the Raft engine, which are then called as the DFSM executes on commit.
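The callback pattern described above could look roughly like the following. This is only a minimal sketch, not code from any real Raft library: `RaftEngine`, `on_commit`, and `commit` are hypothetical names, replication is not modelled at all, and the callbacks run synchronously for brevity where a real engine would likely dispatch them asynchronously.

```python
from typing import Callable, Dict, List

Callback = Callable[[dict], None]


class RaftEngine:
    """Hypothetical toy engine: it only tracks commits and fires callbacks."""

    def __init__(self) -> None:
        self._callbacks: Dict[str, List[Callback]] = {}
        self.commit_index = 0

    def on_commit(self, entry_type: str, callback: Callback) -> None:
        # The application registers interest in one kind of log entry.
        self._callbacks.setdefault(entry_type, []).append(callback)

    def commit(self, entry: dict) -> None:
        # Assumption: a real engine reaches this point only after a
        # majority of servers has stored the entry.
        self.commit_index += 1
        for callback in self._callbacks.get(entry["type"], []):
            callback(entry)


# Usage: each committed entry drives one transition of a payment's state.
payment_state: Dict[str, str] = {}

engine = RaftEngine()
engine.on_commit("authorize", lambda e: payment_state.update({e["payment"]: "authorized"}))
engine.on_commit("settle", lambda e: payment_state.update({e["payment"]: "settled"}))

engine.commit({"type": "authorize", "payment": "p1"})
engine.commit({"type": "settle", "payment": "p1"})
```

In this sketch the application state changes only inside commit callbacks, which is one way to read "the act of committing is the DB transaction".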

I hope my answer helps, given what is a complex topic without a common implementation to discuss in context.

Philip

vs0522

Aug 4, 2016, 2:28:46 AM

to raft-dev

Thanks to both Philip and Archie.

Is there a reference implementation (code perspective) that you, or anyone else, is aware of that one can inspect/study that can reveal some of the patterns?

Archie Cobbs

Aug 4, 2016, 11:00:51 AM

to raft-dev

On Thursday, August 4, 2016 at 1:28:46 AM UTC-5, vs0522 wrote:

> Is there a reference implementation (code perspective) that you, or anyone else, is aware of that one can inspect/study that can reveal some of the patterns?

Not that I'm aware of.

-Archie

Diego Ongaro

Aug 12, 2016, 5:30:04 PM

to raft...@googlegroups.com

> Hope the question is not too convoluted and really appreciate a response.

I'm not entirely sure I'm reading it right, but...

The typical flow for Raft is the one shown in Figure 1 of the paper, where the leader's state machine applies the command, then its output is returned to the client. So the client doesn't learn that the log entry committed until the log entry has also been applied to the state machine. This works on one log entry at a time, though a log entry can contain as complex an operation as you'd like. If you're trying to form "transactions" spanning across multiple log entries, that's when you get into the stuff that Archie and Philip mentioned.
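As a rough illustration of that flow (append, commit once a majority has the entry, apply, then reply), here is a minimal single-process sketch. Everything in it is an assumption made for illustration: the `Leader` and `KVStateMachine` classes, the key/value commands, and the `follower_acks` argument, which stands in for real replication. Terms, elections, and retries are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Command = Tuple[str, str, Optional[str]]  # (op, key, value)


@dataclass
class KVStateMachine:
    data: Dict[str, Optional[str]] = field(default_factory=dict)

    def apply(self, command: Command) -> Optional[str]:
        op, key, value = command
        if op == "set":
            self.data[key] = value
            return "OK"
        if op == "get":
            return self.data.get(key)
        raise ValueError(f"unknown op: {op}")


@dataclass
class Leader:
    cluster_size: int
    log: List[Command] = field(default_factory=list)
    commit_index: int = 0  # highest committed log index (1-based)
    last_applied: int = 0  # highest index applied to the state machine
    state_machine: KVStateMachine = field(default_factory=KVStateMachine)

    def handle_client_request(self, command: Command, follower_acks: int) -> Optional[str]:
        self.log.append(command)
        index = len(self.log)
        # The leader counts itself toward the majority.
        if follower_acks + 1 <= self.cluster_size // 2:
            raise TimeoutError("entry not committed: no majority")
        self.commit_index = index
        # Apply every committed-but-unapplied entry in log order; the output
        # of the last one (the client's own command) becomes the reply.
        output = None
        while self.last_applied < self.commit_index:
            self.last_applied += 1
            output = self.state_machine.apply(self.log[self.last_applied - 1])
        return output


leader = Leader(cluster_size=3)
set_reply = leader.handle_client_request(("set", "x", "1"), follower_acks=1)
get_reply = leader.handle_client_request(("get", "x", None), follower_acks=2)
```

The point of the sketch is the ordering: the reply is produced by `apply`, so a client that receives a reply knows the entry was both committed and applied on the leader.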

Raft clients and leaders usually aren't concerned with how many entries a follower's state machine has applied. They might try more or less aggressively to get the new commit index value over to followers, but in the implementations I've seen, follower state machines lagging behind won't have any effect on the leader making progress.
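One way to see why a lagging follower state machine does not hold the leader back: the leader's commit decision depends only on how much of the log each server has stored (the paper's matchIndex), not on how much each follower has applied. The sketch below is a simplification under stated assumptions: `leader_commit_index` and `Follower` are hypothetical names, and the rule that a leader only commits entries from its current term by counting replicas is omitted.

```python
from typing import List


def leader_commit_index(leader_last_index: int, follower_match_index: List[int]) -> int:
    """Highest log index stored on a majority of servers, leader included."""
    stored = sorted([leader_last_index] + follower_match_index, reverse=True)
    majority = len(stored) // 2 + 1
    return stored[majority - 1]


class Follower:
    def __init__(self) -> None:
        self.commit_index = 0
        self.last_applied = 0

    def receive_leader_commit(self, leader_commit: int, last_new_index: int) -> None:
        # A follower never commits past what it has actually stored.
        self.commit_index = max(self.commit_index, min(leader_commit, last_new_index))

    def apply_some(self, budget: int) -> None:
        # The state machine catches up at its own pace.
        self.last_applied = min(self.commit_index, self.last_applied + budget)


# Five servers: the leader has 10 entries, followers have stored 10, 9, 3, 0.
commit = leader_commit_index(10, [10, 9, 3, 0])

slow = Follower()
slow.receive_leader_commit(commit, last_new_index=9)
slow.apply_some(budget=2)  # applied only 2 of its 9 committed entries
```

Note that `slow.last_applied` never appears as an input to `leader_commit_index`, which mirrors the statement above that follower apply progress does not affect the leader.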
