Setting up Cassandra in the Cloud
If you have done work to optimize your Cassandra install in the cloud, please take a moment to contribute some of that knowledge to this page.
Amazon Web Services (AWS/EC2)
- There is an Ec2Snitch to make Cassandra rack-aware in the EC2 cloud.
- Chef install for Cassandra, including Ec2Snitch setup.
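To illustrate what rack-awareness means on EC2: the snitch treats the EC2 region as the datacenter and the availability zone as the rack. The sketch below is a simplified illustration of that idea, not the snitch's actual code. The function name is hypothetical, and the exact datacenter and rack names the snitch derives may differ between Cassandra versions.

```python
def split_availability_zone(zone):
    """Split an availability zone such as 'us-east-1a' into (datacenter, rack).

    Assumption for illustration: the datacenter is the region (the zone name
    without its trailing letter) and the rack is the full zone name.
    """
    if len(zone) < 2 or not zone[-1].isalpha() or not zone[-2].isdigit():
        raise ValueError(f"unexpected availability zone name: {zone!r}")
    region = zone[:-1]  # drop the trailing zone letter
    return region, zone


if __name__ == "__main__":
    for az in ["us-east-1a", "us-east-1b", "eu-west-1c"]:
        datacenter, rack = split_availability_zone(az)
        print(f"{az}: datacenter={datacenter} rack={rack}")
```

With nodes spread over several availability zones, a rack-aware replication strategy can then place replicas in different zones.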
Optimizing Volume Performance for a Transient Cluster
Depending on instance size and on how many EBS volumes are attached, most EC2 nodes will have several independent attached volumes.
- How should the Cassandra config be modified to take advantage of multiple attached volumes?
- What are the tradeoffs of EBS vs. local drives as the backing store for a persistent cluster?
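One possible starting point for the first question, offered as a sketch rather than a tuned recommendation: cassandra.yaml accepts a list of paths under data_file_directories and a separate commitlog_directory, so each attached volume can hold one data directory while the commit log gets a volume of its own. The helper name, the mount points (/mnt/vol0 and so on), and the directory layout below are assumptions made for illustration.

```python
def render_storage_config(mount_points):
    """Return a cassandra.yaml fragment that spreads storage over volumes.

    When more than one volume is given, the first is reserved for the commit
    log and each remaining volume gets one data directory. With a single
    volume, data and commit log share it.
    """
    if not mount_points:
        raise ValueError("at least one mount point is required")

    commitlog_volume = mount_points[0]
    if len(mount_points) == 1:
        data_volumes = list(mount_points)
    else:
        data_volumes = list(mount_points[1:])

    lines = ["data_file_directories:"]
    for volume in data_volumes:
        lines.append(f"    - {volume}/cassandra/data")
    lines.append(f"commitlog_directory: {commitlog_volume}/cassandra/commitlog")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical mount points for three attached volumes.
    print(render_storage_config(["/mnt/vol0", "/mnt/vol1", "/mnt/vol2"]))
```

Keeping the commit log on its own volume separates its sequential writes from data file reads and compaction traffic. Whether that pays off on a given instance type is something to measure.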
Cloud clusters are often transient: when all jobs are finished, the nodes are terminated. Any data held on EBS volumes persists until the next restart, but any data on the local scratch disks is lost for good. When the cluster restarts, each node will have a new IP address and identity.
- For a non-persistent cluster, can Cassandra take advantage of the scratch disks (assume they are fast but could disappear across the whole cluster at any time)?