Hi, I'm diving into the implementation of the Rapid membership protocol, and I have a question about consensus liveness, described below:
There are 9 nodes in the cluster, so FastPaxos needs at least 7 votes. Let's assume that 3 nodes are unreachable, which means every consensus instance will fall back to classic Paxos. Now:
- One of the nodes (say node A) starts classic Paxos because the fast round timed out. Assume node A has the largest node index among the remaining 6 nodes.
- Node A sends phase1a messages to the other nodes.
- The other nodes handle the phase1a message and set their rnd to Rank(2, A.NodeIndex).
- Node A then crashes for some unknown reason, leaving 5 nodes in the cluster.
- The remaining 5 nodes continue with classic Paxos, but none of them accepts a phase1a message from the others, because every node already holds a larger rnd, namely Rank(2, A.NodeIndex). The consensus gets stuck.
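
To make the scenario concrete, here is a minimal sketch of how I understand it. It is a toy model, not the actual implementation: the `Rank` and `Acceptor` classes, the node indices 1 to 6, and the assumption that every node starts classic Paxos at round 2 and that ranks compare by (round, node index) are all mine.

```python
from dataclasses import dataclass

CLASSIC_ROUND = 2  # assumption: every node starts classic Paxos with round 2


@dataclass(frozen=True, order=True)
class Rank:
    # Hypothetical rank: compared by round first, then by node index.
    round_no: int
    node_index: int


class Acceptor:
    # Hypothetical acceptor that tracks only the highest rank it has promised.
    def __init__(self):
        self.rnd = Rank(0, 0)

    def on_phase1a(self, rank):
        # Promise only if the incoming rank is strictly larger than rnd.
        if rank > self.rnd:
            self.rnd = rank
            return True
        return False


def run_scenario():
    # Six reachable nodes with indices 1..6; node A has the largest index.
    acceptors = {i: Acceptor() for i in range(1, 7)}
    node_a = 6

    # Node A times out first and sends phase1a to every reachable node.
    for acceptor in acceptors.values():
        acceptor.on_phase1a(Rank(CLASSIC_ROUND, node_a))

    # Node A crashes.
    del acceptors[node_a]

    # Each survivor starts its own classic round and counts the promises it gets.
    promises = {}
    for i in acceptors:
        rank = Rank(CLASSIC_ROUND, i)
        promises[i] = sum(a.on_phase1a(rank) for a in acceptors.values())
    return promises
```

Under these assumptions, `run_scenario()` reports zero promises for each of the 5 surviving nodes, since each of them proposes a rank smaller than the one left behind by node A.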
Am I misunderstanding something?