Cassandra is a distributed storage system for managing structured/unstructured data while providing reliability at a massive scale.


Development of Cassandra started at Facebook in June 2007. It began as a system to solve the Inbox Search problem and has since matured to solve various storage problems associated with structured/unstructured data.


Cassandra is a distributed storage system for managing structured data. It is designed to scale to a very large size across many commodity servers, with no single point of failure. The storage layer is designed to satisfy the requirements of applications that need to store large amounts of structured data. Reliability at massive scale is a major challenge, and outages in the service can have a significant negative impact. Cassandra therefore aims to run on an infrastructure of hundreds of nodes, possibly spread across different datacenters. At this scale, components both small and large fail continuously; the way Cassandra manages persistent state in the face of these failures drives the reliability and scalability of the software systems that rely on it.

Initial Source

The initial source can be obtained from http://the-cassandra-project.googlecode.com/svn/branches/development/. The mailing list is currently maintained at the same site. We will move it over to Apache once this proposal has been accepted.

Source and Intellectual Property Submission Plan

External Dependencies



Current Status



Core Developers


Known Risks/Avoiding the Warning Signs

Orphaned Products

Homogeneous Developers

Reliance on Salaried Developers

Relationships with Other Apache Products

An excessive fascination with the Apache brand

Required Resources

Mailing lists

Subversion Directory

Issue Tracking




Sponsoring Entity

Cassandra (last edited 2009-09-20 23:06:06 by localhost)