Zab: Atomic Broadcast for ZooKeeper

We would like to modularize the Atomic Broadcast layer of ZooKeeper so that it can be used independently. It also help to be able to test more effectively. We have named the atomic broadcast layer Zab, after the river in the middle east http://en.wikipedia.org/wiki/Zab, to distinguish it from ZooKeeper as a whole. To help extract Zab from ZooKeeper we have opened the ZOOKEEPER-30 Jira and we have attached zookeeper-internals.pdf and zookeeper-impl.pdf which, although old, shows how we connect RequestProcessors together. The actual RequestProcessor chain is setup in the setupRequestProcessors method of ZooKeeperServer, FollowerZooKeeperServer, and LeaderZooKeeperServer.

Zab is a standard atomic broadcast layer, but there are somethings that ZooKeeper uses from Zab that is specific to its implementation:

  1. ZooKeeper forwards change requests to a leader. It uses the same leader that Zab uses for simplicity and efficiency, so we need a method to get the leader.

  2. ZooKeeper uses Zab's write-ahead logging for atomic broadcast to also do the write-ahead logging for database changes.

  3. Most applications of atomic broadcast involves a state transfer if a process gets too far behind, so we have merged state transfer into Zab.

Zab is currently buried into Zab. Here is a proposed interface for the extracted Zab:

class Zab {
    // This constructor takes a ZabCallback that will receive events from Zab.
    // After Zab is constructed cb, will get called with setState() and deliver() for
    // any state and transactions that have been logged locally
    Zab(ZabCallback cb);
    // This method propose the message and returns a zxid (Zab transaction identifier)
    long propose(byte message[]);
}

enum ZabStatus { LOOKING, FOLLOWING, LEADING };

interface ZabCallback {
    // This method is called when a message has been delivered by Zab
    void deliver(long zxid, byte message[]);
    // This method is called when Zab's status changes. If this instance
    // of Zab is FOLLOWING, leader will have the name of the leader.
    void status(ZabStatus status, String leader);
    // This method is called by Zab to get the current state of ZooKeeper.
    // It may be called to sync with leaders who are behind (a state transfer), or it
    // may be used to periodically get the state of ZooKeeper for recovery.
    void getState(OutputStream os);
    // This method is called by Zab to push a state transfer to ZooKeeper.
    void setState(InputStream is);
}

ZooKeeper/ZabProtocol (last edited 2009-09-20 23:54:10 by localhost)