Example: Paxos run
ZooKeeper uses a protocol that we call Zab (ZooKeeper Atomic Broadcast). Even though Zab follows the abstract description of Paxos by Butler Lampson, it does not match the description of the Classic Paxos protocol by Lamport. In fact, in this page we show an example of a Paxos run that violates the primary order of messages, which is a property that Zab satisfies. The basic idea is that if we have three proposers over time and three acceptors, we can have a situation in which a proposer proposes A and B, but only B is committed. For ZooKeeper, operations are state changes, and such cases can lead to an inconsistent state. In this particular example, the state change B depends upon A, but A was not committed.
Note that we concentrate on instances 27, 28, and 29, for the sake of illustration. For simplicity of presentation, we assume in each figure that the single proposer is the leader for the corresponding part of the execution. We omit common optimizations like sending one 1a message for all instances or summarizing 1b messages. We also only show the values accepted along with their ballots as part of the state of an acceptor.
Notation: For the following figures, we use bold italics to represent the state of an acceptor, and regular font to represent messages.
Figure 1: Proposer 1 finishes phase 1, and it has two values to propose: A and B. It proposes A and B for instances 27 and 28, respectively. Proposer 1, however, fails before it can get both acceptors 1 and 2 to accept A and B, and only acceptor 1 accepts A and B.
Figure 2: Proposer 2 finishes phase 1, and it has one value to propose: C. It proposes C for instance 27. Proposer 2 makes acceptors 2 and 3 accept C and crashes. Value C consequently is anchored (using the terminology of Lampson) for instance 27.
Figure 3: Proposer 3 finishes phase 1, and it has one value to propose: D. Given the values in messages 1b, it proposes C, B, and D for instances 27, 28, and 29 respectively. Proposer 3 makes acceptors 1 and 3 accept C, B, and D. These three values are anchored for instances 27, 28, and 29, respectively.