consensus liveness of the implementation #36

@Lbqds

Hi, I'm digging into the implementation of the Rapid membership protocol, and I have a question about consensus liveness, described below.

Suppose there are 9 nodes in the cluster, so FastPaxos needs at least 7 votes. Assume that 3 nodes are unreachable, which means every consensus instance will fall back to classic Paxos. Now:

  1. One of the nodes (say node A) starts classic Paxos because the fast round timed out. Assume node A has the
    largest node index among the remaining 6 nodes.
  2. Node A sends phase1a messages to the other nodes.
  3. The other nodes handle the phase1a message and set their rnd to Rank(2, A.NodeIndex).
  4. Node A now crashes for an unknown reason, leaving 5 nodes in the cluster.
  5. The remaining 5 nodes continue with classic Paxos, but none of them accepts a phase1a message from another node, because every node already holds a larger rnd, so consensus gets stuck.
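
To make the steps above concrete, here is a minimal sketch of the scenario as I understand it. It is not the project's code: `Rank`, `Acceptor`, `on_phase1a`, and `run_phase1a` are hypothetical names, the node indices 0..5 are made up, and I am assuming that ranks compare by round first and node index second, and that every classic round uses round number 2. The final retry with round 3 is only an illustration of what would unblock the acceptors, not a claim about what the implementation does.

```python
from dataclasses import dataclass
from typing import List, NamedTuple


class Rank(NamedTuple):
    """Hypothetical rank: compared by round first, then by node index."""
    round: int
    node_index: int


@dataclass
class Acceptor:
    node_index: int
    rnd: Rank = Rank(1, 0)  # assumption: the fast round left rnd at round 1

    def on_phase1a(self, proposal: Rank) -> bool:
        """Promise only if the proposal outranks the highest rank seen so far."""
        if proposal > self.rnd:
            self.rnd = proposal
            return True
        return False


def run_phase1a(proposal: Rank, acceptors: List[Acceptor]) -> int:
    """Send one phase1a to every acceptor and count the promises."""
    return sum(a.on_phase1a(proposal) for a in acceptors)


# Six reachable nodes with made-up indices 0..5; node A has the largest index.
reachable = [Acceptor(i) for i in range(6)]
node_a = reachable[-1]

# Steps 1-3: A starts classic Paxos with Rank(2, A.NodeIndex) and all promise.
promises_for_a = run_phase1a(Rank(2, node_a.node_index), reachable)

# Step 4: A crashes, leaving five nodes.
survivors = reachable[:-1]

# Step 5: each survivor tries Rank(2, own index), which is below Rank(2, 5).
promises_per_survivor = [
    run_phase1a(Rank(2, s.node_index), survivors) for s in survivors
]

# Illustration only: a retry with a higher round number would be promised.
promises_for_retry = run_phase1a(Rank(3, 0), survivors)

print(promises_for_a)         # 6
print(promises_per_survivor)  # [0, 0, 0, 0, 0]
print(promises_for_retry)     # 5
```

Under these assumptions, no survivor can collect a single promise in round 2, which is the stuck state described in step 5.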

Am I misunderstanding something?
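
For reference, the vote counts in the scenario can be checked with a few lines. This is a sketch under an assumption: I am taking the fast-round threshold to be N - floor((N - 1) / 4), which gives the 7 votes mentioned above for N = 9, and the classic threshold to be a simple majority. The function names are hypothetical.

```python
import math


def fast_quorum(n: int) -> int:
    """Fast-round votes needed, assuming N - floor((N - 1) / 4)."""
    return n - math.floor((n - 1) / 4)


def classic_quorum(n: int) -> int:
    """Classic Paxos votes needed, assuming a simple majority."""
    return n // 2 + 1


n = 9
after_partition = n - 3          # 3 nodes unreachable
after_crash = after_partition - 1  # node A crashes as well

print(fast_quorum(n))                         # 7
print(after_partition >= fast_quorum(n))      # False
print(after_crash >= classic_quorum(n))       # True
```

So with these thresholds the 5 remaining nodes are still a classic majority of the 9-node cluster, which is why I would expect the classic round to be able to finish.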
