Hi Dongbin Cheng,
I'm sure I could have easily answered this question in the past. Thankfully, I wrote a little bit about it in my dissertation on the pages physically numbered 35 and 36, in the context of single-server-at-a-time membership changes:
As stated above, servers always use the latest configuration in their logs, regardless of whether that configuration entry has been committed. This allows leaders to easily avoid overlapping configuration changes (the third item above), by not beginning a new change until the previous change’s entry has committed. It is only safe to start another membership change once a majority of the old cluster has moved to operating under the rules of Cnew. If servers adopted Cnew only when they learned that Cnew was committed, Raft leaders would have a difficult time knowing when a majority of the old cluster had adopted it. They would need to track which servers know of the entry’s commitment, and the servers would need to persist their commit index to disk; neither of these mechanisms is required in Raft. Instead, each server adopts Cnew as soon as that entry exists in its log, and the leader knows it’s safe to allow further configuration changes as soon as the Cnew entry has been committed. [...]
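The rule in that excerpt can be sketched in a few lines of Python. This is only an illustrative model, not code from any Raft implementation: the `Entry` and `Server` classes and their method names are hypothetical, and replication, terms, and persistence are left out. It shows a server taking its configuration from the newest configuration entry in its log, and a leader refusing to start another change until that entry is committed.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional


@dataclass
class Entry:
    term: int
    # Set only for configuration entries; None for ordinary commands.
    config: Optional[FrozenSet[str]] = None


@dataclass
class Server:
    initial_config: FrozenSet[str]
    log: List[Entry] = field(default_factory=list)
    commit_index: int = 0  # 1-based index of the last committed entry

    def latest_config_index(self) -> int:
        """Return the 1-based index of the newest config entry, or 0 if none."""
        for i in range(len(self.log), 0, -1):
            if self.log[i - 1].config is not None:
                return i
        return 0

    def current_config(self) -> FrozenSet[str]:
        """Use the latest configuration in the log, committed or not."""
        idx = self.latest_config_index()
        return self.log[idx - 1].config if idx else self.initial_config

    def can_start_config_change(self) -> bool:
        """A new change may begin only once the previous one has committed."""
        return self.latest_config_index() <= self.commit_index

    def propose_config(self, term: int, new_config) -> bool:
        """Append a config entry as leader; refuse overlapping changes."""
        if not self.can_start_config_change():
            return False
        self.log.append(Entry(term, frozenset(new_config)))
        return True
```

Note that the leader needs no extra bookkeeping here: the only state it consults is its own log and its own commit index.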
Your question was about joint consensus, which is the older version of Raft membership changes that we included in the paper and is a bit different. Similar issues apply to joint consensus when using the proposed rule of adopting a configuration only once it is committed:
1. Suppose a leader committed a configuration change and started using the new configuration. Then, the leader restarts or leadership changes to another server. Since Raft does not normally persist or replicate the commit index, the rebooted or new leader would probably revert to an older configuration, which could be unsafe.
2. In joint consensus, suppose a leader committed the Cold,new entry. The leader can't safely commit the Cnew entry until the leader knows that a majority of Cold and a majority of Cnew have adopted the Cold,new configuration -- so those followers must have marked Cold,new as committed and persisted that and informed the leader. (Otherwise, the leader could operate under the rules of Cnew while other members of the cluster operated under Cold.)
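The first issue above (a restart losing the commit index) can be illustrated with a small Python sketch. It is a toy model under stated assumptions: the log is persisted, the commit index is not, and a log is just a list whose items are either `None` (an ordinary command) or a configuration. The function names are hypothetical and do not come from any Raft library.

```python
from typing import FrozenSet, List, Optional, Tuple

# None stands for an ordinary command; a frozenset is a configuration entry.
LogEntry = Optional[FrozenSet[str]]


def config_on_append(initial: FrozenSet[str], log: List[LogEntry]) -> FrozenSet[str]:
    """Raft's rule: use the newest config entry in the log, committed or not."""
    for cfg in reversed(log):
        if cfg is not None:
            return cfg
    return initial


def config_on_commit(
    initial: FrozenSet[str], log: List[LogEntry], commit_index: int
) -> FrozenSet[str]:
    """Proposed rule: use the newest config entry at or below commit_index."""
    for cfg in reversed(log[:commit_index]):
        if cfg is not None:
            return cfg
    return initial


def restart(log: List[LogEntry], commit_index: int) -> Tuple[List[LogEntry], int]:
    """Model a reboot: the log survives, the volatile commit index resets to 0."""
    return list(log), 0


c_old = frozenset({"s1", "s2", "s3"})
c_new = frozenset({"s1", "s2", "s3", "s4"})

log = [None, c_new]   # one command, then the configuration change
commit_index = 2      # the leader has committed both entries

before = config_on_commit(c_old, log, commit_index)   # c_new
log, commit_index = restart(log, commit_index)
after = config_on_commit(c_old, log, commit_index)    # reverts to c_old
stable = config_on_append(c_old, log)                 # still c_new
```

Under the adopt-on-commit rule the rebooted server falls back to the old configuration until it relearns the commit index, whereas the adopt-on-append rule depends only on the persisted log.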
It may be possible to extend Raft to persist and replicate commit indexes and for leaders to track when followers persist new commit indexes. I don't believe I ever explored that because it seemed immediately harder.
etcd uses a different membership change protocol that changes a single server at a time but waits for configuration entries to be committed before using them. The doc comments at https://github.com/etcd-io/raft/blob/main/doc.go#L260 mention some issues with the approach:
1. This can fail when removing a server from a two-server cluster. I think this refers to when you try to remove the leader of a two-server cluster: the leader commits the new entry, so it steps down, but the other server doesn't know the entry committed, so it's stuck.
2. They restrict membership changes to happen only when the leader has committed everything to prevent overlapping changes. In a busy cluster, you may never reach this state, so I imagine they'll block the creation of new entries and cause some brief unavailability before starting a membership change.
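The two-server case in the first item can be sketched as a quorum calculation in Python. This models only the interpretation given above, not etcd's actual code; the helper names and the server names "A" and "B" are hypothetical.

```python
from typing import FrozenSet


def quorum(config: FrozenSet[str]) -> int:
    """Size of a majority of the given configuration."""
    return len(config) // 2 + 1


def can_win_election(
    candidate: str, config: FrozenSet[str], participating: FrozenSet[str]
) -> bool:
    """True if the candidate can collect a majority of votes from the members
    of `config` that are still participating (the candidate included)."""
    if candidate not in config:
        return False
    votes = len(config & participating)
    return votes >= quorum(config)


OLD = frozenset({"A", "B"})   # two-server cluster, A is the leader
NEW = frozenset({"B"})        # configuration after removing A

# A commits the entry that removes itself and steps down; only B remains.
participating = frozenset({"B"})

# Adopt-on-commit: B holds the entry but has not learned that it committed,
# so B still operates under OLD and needs two votes.
stuck = not can_win_election("B", OLD, participating)

# Adopt-on-append: B already operates under NEW and needs only its own vote.
recovers = can_win_election("B", NEW, participating)
```

In this model `stuck` and `recovers` are both true: B cannot make progress while it waits to learn of a commit that nobody is left to announce, but it can elect itself once it treats the entry in its log as its configuration.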
-Diego