ZooKeeper Google Summer of Code 2010 Ideas Page

Please update this page with ideas you'd like to see a SoC student implement in ZooKeeper over the summer. A good project is about 3 months long and has well defined success criteria.

If you are willing and able to mentor a student in any project you see below that doesn't currently have a mentor, please update the idea with your name.

More details about Google's Summer of Code can be found here: http://socghop.appspot.com/document/show/gsoc_program/google/gsoc2010/faqs - the deadline for our applications is Friday, so get those ideas coming in!

(Note that the mentorship assignment is tentative in some cases - every project can be mentored, however!)


Optimizations for WAN Deployments

Possible Mentor

Henry Robinson (henry at apache dot org)

Requirements

Java, some networking familiarity

Description

ZK 3.3.0 added observers, which are non-voting members of a ZK ensemble. One use case for observers is as a proxy to a remote voting ensemble, say in a different data center. Since observers do not need to vote, the latency requirements on the delivery of messages to them are less strict. WAN traffic is also expensive. This project would investigate and implement batching of messages to observers, along with potential mechanisms for decreasing the number of messages that need to be sent. For example, when a znode is overwritten twice in a row, the first update does not theoretically need to be sent at all - although making this work correctly with ZAB will be a challenge.
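To make the coalescing idea concrete, here is a minimal sketch. It is not ZooKeeper code: the class and method names are hypothetical, every update is assumed to be a plain overwrite of a znode's data, and the hard parts (ZAB ordering guarantees, watches, creates and deletes) are ignored entirely.

```python
from collections import OrderedDict


class ObserverBatch:
    """Hypothetical buffer of overwrite-style updates bound for a remote observer."""

    def __init__(self):
        # Maps znode path -> (zxid, data). Order follows each path's latest write.
        self._pending = OrderedDict()
        self.received = 0  # number of updates handed to the batch

    def add(self, zxid, path, data):
        self.received += 1
        # A later overwrite of the same path supersedes the buffered one.
        self._pending.pop(path, None)
        self._pending[path] = (zxid, data)

    def flush(self):
        """Return the coalesced updates, ordered by each path's most recent write."""
        batch = [(zxid, path, data) for path, (zxid, data) in self._pending.items()]
        self._pending.clear()
        return batch
```

With three updates where the first and third hit the same path, flush() would return only two messages. Whether dropping the intermediate update is actually safe is exactly the question the project would need to answer.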


FUSE module for BookKeeper

Possible Mentor

Ben Reed (breed at apache dot org) & Patrick Hunt (phunt at apache dot org)

Requirements

C/Java, some networking familiarity

Description

BookKeeper is a distributed write-ahead log with client and server written in Java. The BookKeeper client and server also use ZooKeeper. There is a BookKeeper API that clients can use to integrate write-ahead logging into their application. It would be a lot easier if applications could use BK without changes to the client application, through a file system API (FUSE). The project would involve implementing a C interface for BookKeeper (a Java one already exists) and implementing the FUSE module.

Example use: the write-ahead logs in MySQL, called binlogs, are typically written to the local filesystem using the standard filesystem API. We could modify MySQL to use BookKeeper; however, if we had a BK FUSE module we could run MySQL without any modification and get the performance and reliability of a distributed write-ahead log.


Web-based Administrative Interface

Possible Mentor

Henry Robinson (henry at apache dot org)

Requirements

Modern web platform - e.g. Django. Some design or UI skills would help. Java for adding methods to ZooKeeper.

Description

ZooKeeper is a complex distributed system. Understanding how well it is running is tremendously important. Patrick Hunt has created a Django-based dashboard (see http://github.com/phunt/zookeeper_dashboard#readme) that allows some insight into how ZooKeeper is running. This is a great foundation on which to build; however, there are improvements that could be made! This project would capture much more information from ZooKeeper, adding hooks to retrieve it where necessary, and visualise it in an appealing and useful way. Integration with Ganglia would be a definite plus.


Failure Detector Module

Possible Mentor

Henry Robinson (henry at apache dot org)

Requirements

Java, some distributed systems knowledge, comfort implementing distributed systems protocols

Description

ZooKeeper servers detect the failure of other servers and clients by counting the number of 'ticks' for which they don't get a heartbeat from other machines. This is the 'timeout' method of failure detection and works very well; however, it is possible that it is too aggressive and not easily tuned for some more unusual ZooKeeper installations (such as in a wide-area network, or even in a mobile ad-hoc network).

This project would abstract the notion of failure detection to a dedicated Java module, and implement several failure detectors to compare and contrast their appropriateness for ZooKeeper. For example, Apache Cassandra uses a phi-accrual failure detector (http://ddsg.jaist.ac.jp/pub/HDY+04.pdf) which is much more tunable and has some very interesting properties. This is a great project if you are interested in distributed algorithms, or want to help re-factor some of ZooKeeper's internal code.
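As a rough illustration of what such an abstraction could look like, the sketch below defines a hypothetical detector interface with two implementations: a fixed timeout and a simplified phi-accrual detector. None of these names exist in ZooKeeper. The phi calculation assumes exponentially distributed heartbeat intervals, which is a simplification chosen here to keep the example short; a real module would need a more careful estimate of the arrival distribution.

```python
import math
from abc import ABC, abstractmethod
from collections import deque


class FailureDetector(ABC):
    """Hypothetical interface; not an existing ZooKeeper API."""

    @abstractmethod
    def heartbeat(self, now):
        """Record a heartbeat received at time `now` (seconds)."""

    @abstractmethod
    def is_suspected(self, now):
        """Return True if the peer should be considered failed at `now`."""


class TimeoutDetector(FailureDetector):
    """Fixed timeout, similar in spirit to counting missed ticks."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last = None

    def heartbeat(self, now):
        self.last = now

    def is_suspected(self, now):
        return self.last is not None and now - self.last > self.timeout


class PhiAccrualDetector(FailureDetector):
    """Simplified phi-accrual detector (exponential interval assumption)."""

    def __init__(self, threshold=8.0, window=100):
        self.threshold = threshold
        self.intervals = deque(maxlen=window)  # recent inter-arrival times
        self.last = None

    def heartbeat(self, now):
        if self.last is not None:
            self.intervals.append(now - self.last)
        self.last = now

    def phi(self, now):
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        # P(silence lasts t) = exp(-t / mean); phi is -log10 of that probability.
        return (now - self.last) / (mean * math.log(10))

    def is_suspected(self, now):
        return self.phi(now) > self.threshold
```

The tunable part is the threshold: instead of a yes/no answer after a fixed number of ticks, phi grows continuously with the length of the silence relative to the observed heartbeat rate, so a deployment with slow or jittery links adapts without changing the timeout by hand.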


Concurrent Primitives Library

Possible Mentor

Henry Robinson (henry at apache dot org)

Requirements

Java / C / Python / Ruby

Description

ZooKeeper is very powerful, but sometimes a bit hard to use. This project will create a library of concurrent primitives such as locks in many varieties (RW, MRSW, MRMW), concurrent queues, barriers, latches and semaphores that application developers can use to more easily take advantage of ZooKeeper's power. Writing distributed systems code is hard, and therefore should be done at most once!

This project would contribute a solid library in at least one of our supported languages. It is possible that improvements to the client library will be required - e.g. wrapper code to retry failed RPC calls. That falls under the scope of this project!
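The retry wrapper mentioned above might look something like the following sketch. ConnectionLossError and with_retries are hypothetical names invented for illustration, not part of any ZooKeeper client binding, and the linear backoff is just one possible policy.

```python
import time


class ConnectionLossError(Exception):
    """Hypothetical stand-in for a transient client-side error."""


def with_retries(operation, attempts=3, delay=0.1, sleep=time.sleep):
    """Call `operation` until it succeeds or `attempts` calls have failed."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionLossError:
            if attempt == attempts:
                raise  # out of attempts: surface the error to the caller
            sleep(delay * attempt)  # simple linear backoff between attempts
```

A wrapper like this is only safe for operations that can be repeated without side effects. Blindly retrying something like the creation of a sequential node could create it twice, which is one of the details a good primitives library would hide from application developers.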

This would be a very interesting project to work on, as it will directly influence the evolution of the ZooKeeper API when you discover things that the current API makes needlessly difficult.

See http://www.cloudera.com/blog/2009/05/building-a-distributed-concurrent-queue-with-apache-zookeeper/ for a detailed example of how to build a distributed queue.


ZooKeeper DNS Server

Possible Mentor

Henry Robinson (henry at apache dot org)

Requirements

Java or Python or C

Description

Although ZooKeeper is primarily used for co-ordination of distributed processes, its consistency semantics mean that it's a good candidate for serving small (key,value) records as well. The Domain Name System has similar requirements, raising the interesting question of whether ZooKeeper would be a capable DNS server for your local network. One intriguing possibility is having versioned DNS records, such that known-good configurations can be stored and rolled back to in the case of an issue. If this versioning primitive proves to be useful, it's easy to imagine other types of configuration that could be stored.

This project would involve designing and building an RFC-1035 compliant DNS server and performing a detailed performance study against an existing simple DNS server like tinydns.
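The versioned-record idea can be pictured with a small in-memory model. This is only a sketch with hypothetical names: it says nothing about how the records would be laid out in znodes, and it assumes a rollback is recorded as a new version so that the history is never rewritten.

```python
class VersionedRecords:
    """Hypothetical in-memory model of versioned (name, value) records."""

    def __init__(self):
        self._history = {}  # name -> list of values, oldest first

    def put(self, name, value):
        """Store a new value and return its version number (starting at 0)."""
        versions = self._history.setdefault(name, [])
        versions.append(value)
        return len(versions) - 1

    def get(self, name):
        """Return the most recent value for `name`."""
        return self._history[name][-1]

    def rollback(self, name, version):
        """Re-publish an earlier version as the newest one, keeping history."""
        return self.put(name, self._history[name][version])
```

For example, after publishing two addresses for the same name, rolling back to version 0 makes the first address current again while both earlier entries stay available for inspection.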


Read-only Mode

Possible Mentor

Henry Robinson (henry at apache dot org)

Requirements

Java and TCP/IP networking

Description

When a ZooKeeper server loses contact with over half of the other servers in an ensemble ('loses a quorum'), it stops responding to client requests because it cannot guarantee that writes will get processed correctly. For some applications, it would be beneficial if a server still responded to read requests when the quorum is lost, but caused an error condition when a write request was attempted.

This project would implement a 'read-only' mode for ZooKeeper servers (maybe only for Observers) that allowed read requests to be served as long as the client can contact a server.
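The intended behaviour can be modelled in a few lines. The sketch below is a toy, with hypothetical class and error names: a real server forwards writes to the leader rather than applying them locally, and the data it serves without a quorum may be stale.

```python
class WriteRejectedError(Exception):
    """Hypothetical error returned for writes while in read-only mode."""


class ReadOnlyServerSketch:
    """Toy model of a server that keeps serving reads after losing the quorum."""

    def __init__(self, snapshot):
        self._data = dict(snapshot)  # local copy of znode path -> data
        self.has_quorum = True

    def get_data(self, path):
        # Reads come from local state, which may be stale without a quorum.
        return self._data[path]

    def set_data(self, path, value):
        if not self.has_quorum:
            # Writes cannot be agreed on, so fail them explicitly.
            raise WriteRejectedError(path)
        self._data[path] = value
```

The interesting design questions are the ones this toy skips: how a client learns that it is talking to a read-only server, and what happens to its session when the quorum returns.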

This is a great project for getting really hands-on with the internals of ZooKeeper - you must be comfortable with Java and networking, or you'll have a hard time coming up to speed.


Load Balancing

Possible Mentor

Flavio Junqueira (fpj at apache dot org)

Requirements

Java

Description

Today we don't have a good way to balance the number of clients across available servers. As servers crash and recover, the number of clients connected to a server might be unbalanced with respect to others since clients don't automatically switch to servers based on load. The goal of this project is to implement at least one technique to balance the load of servers in an ensemble and evaluate it.
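As one very simple example of such a technique, the sketch below computes how many clients each server would need to hand off to reach an even spread. The function name and the idea of counting only connections are assumptions made for illustration; a real solution would also have to decide which clients move, how they are told to reconnect, and whether connection count is even the right measure of load.

```python
def clients_to_shed(load):
    """Map each server to the number of clients above its fair share.

    `load` maps a server name to its current client count.
    """
    if not load:
        return {}
    total = sum(load.values())
    base, extra = divmod(total, len(load))
    # Let the `extra` most loaded servers keep one additional client each.
    ranked = sorted(load, key=lambda s: (-load[s], s))
    shed = {}
    for i, server in enumerate(ranked):
        target = base + (1 if i < extra else 0)
        shed[server] = max(0, load[server] - target)
    return shed
```

For instance, with 10, 2 and 3 clients on three servers, the fair share is 5 each, so only the first server would shed clients (5 of them).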