Connection strategy design for ZooKeeper client API
A connection strategy controls how ZooKeeper clients connect to a serving ensemble (we would implement this for both the C and Java APIs). Today we have two strategies, randomized round robin (the default) and ordered round robin, both of which are hard-coded into the client implementation.
This page details a design that adds a connection "strategy" to the ZooKeeper client API. "Strategy" is meant in the sense of the strategy pattern (http://en.wikipedia.org/wiki/Strategy_pattern): developers using the ZooKeeper client API could reuse existing strategies or specify their own.
See ZOOKEEPER-779 (https://issues.apache.org/jira/browse/ZOOKEEPER-779) for some background on this.
Use cases
Today's "round robin" connection strategy works well in most cases, but there are some use cases where it fails:
- balancing the load on a particular server of the ensemble. Round robin does a decent job of this, but there are cases where it fails, for example when a single server is restarted.
- choosing the "closest" server. This is typically not an issue, except when the ensemble itself is distributed over high-latency (WAN) connections.
- exposing ensemble availability to the client code. In some cases the client would like to fall back to a hard-coded state if it cannot reach the ensemble.
Non Goals
- This document does not attempt to address how the client gets the list of servers making up the ensemble, only how it determines which server to connect to.
Goals
- allow the client to select a strategy at runtime
- allow reuse of strategy code
- allow users to create their own strategy, or extend existing strategies
- strategy API should allow
  - setStrategy(strategy) on the ZooKeeper client object; the default should be randomized round robin (RRR)
  - more here ... which will require real thought/consideration.
- provide some sort of context to the strategy (the list of servers, for example; what more?)
- the expectation is that the strategy makes the connection to the server and hands it back to the ZooKeeper client code
- the strategy may maintain multiple connections, and might want to force ZooKeeper to disconnect and reconnect to another server
  - this is the case, for example, when you connected to a "far away" server because the close-in server was down, but the close-in server later becomes available again and you prefer to use it. The same applies to balancing.
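To make the goals above more concrete, here is a minimal Java sketch of what a pluggable strategy might look like. Everything in it is an assumption made for illustration: `ConnectionStrategy`, `OrderedRoundRobin`, `RandomizedRoundRobin`, and `SketchClient` are hypothetical names, not part of the ZooKeeper client API. The sketch also simplifies one goal: the strategy only picks the next server address rather than making the connection itself and handing it back.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical strategy interface; not part of the real ZooKeeper API. */
interface ConnectionStrategy {
    /** Receives the context. In this sketch that is only the server list. */
    void init(List<InetSocketAddress> servers);

    /** Returns the next server the client should try to connect to. */
    InetSocketAddress next();
}

/** Ordered round robin: walks the server list in the order it was given. */
class OrderedRoundRobin implements ConnectionStrategy {
    private List<InetSocketAddress> servers = new ArrayList<>();
    private int index = -1;

    @Override
    public void init(List<InetSocketAddress> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("server list is empty");
        }
        this.servers = new ArrayList<>(servers); // defensive copy
        this.index = -1;
    }

    @Override
    public InetSocketAddress next() {
        if (servers.isEmpty()) {
            throw new IllegalStateException("init() has not been called");
        }
        index = (index + 1) % servers.size(); // wrap around at the end
        return servers.get(index);
    }
}

/** Randomized round robin: shuffles once, then reuses the ordered logic. */
class RandomizedRoundRobin extends OrderedRoundRobin {
    private final Random random;

    RandomizedRoundRobin() {
        this(new Random());
    }

    RandomizedRoundRobin(Random random) {
        this.random = random;
    }

    @Override
    public void init(List<InetSocketAddress> servers) {
        List<InetSocketAddress> shuffled = new ArrayList<>(servers);
        Collections.shuffle(shuffled, random);
        super.init(shuffled);
    }
}

/** Hypothetical stand-in for the ZooKeeper client object. */
class SketchClient {
    private final List<InetSocketAddress> servers;
    private ConnectionStrategy strategy;

    SketchClient(List<InetSocketAddress> servers) {
        this.servers = new ArrayList<>(servers);
        setStrategy(new RandomizedRoundRobin()); // default is RRR
    }

    /** Swaps the strategy at runtime and hands it the server list. */
    void setStrategy(ConnectionStrategy strategy) {
        strategy.init(servers);
        this.strategy = strategy;
    }

    InetSocketAddress nextServer() {
        return strategy.next();
    }
}
```

In this sketch, a caller would build a `SketchClient` from its server list and call `setStrategy(new OrderedRoundRobin())` to replace the default at runtime. `RandomizedRoundRobin` extends `OrderedRoundRobin` to show the kind of reuse the goals ask for. A "closest server" or load-aware strategy would be a further implementation of the same interface, and would probably need a richer context than a server list, which is the open question noted above.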