Dear Yucheng,
At a certain level, yes, the leader is the bottleneck. But this is not the whole story.
In a replicated state machine, either the state machine or the result from the state machine can be replicated, with
the latter being more common. In the former case the amount of data processed by the consensus system
is not large (~256 bytes in our case), which is a useful system optimisation.
In parts of the high-performance Java community, a common theme is to highly optimise a single control thread,
an approach described in the Disruptor concurrency component. A strong leader makes it much more tractable
to do this, so that the consensus sub-system itself is not the net performance constraint for the system as a whole.
Kind Regards,
Philip