Multi-tenancy with Cassandra
Work is currently being done to support multi-tenant capabilities in Cassandra. This page is meant to list tickets associated with that effort as well as provide some documentation on how multi-tenant clusters can be configured.
Open tickets having to do with multi-tenancy
Concurrent schema migrations (multiple users changing schema at once) - CASSANDRA-1391
For multi-tenant order preserved partitioning, auto/online load balancing is pretty important CASSANDRA-1418
Track per-user/per-keyspace statistics CASSANDRA-841
Improve load balancing to take into account load in terms of operations - probably add no. of operations performed on top of disk space used CASSANDRA-1037
Set query/resultset limits per CF to help prevent OOMing nodes CASSANDRA-2068
Completed tickets having to do with multi-tenancy
Provide relative memory usage settings for Memtables (and eventually caches) out of configurable totals CASSANDRA-2006
Ability to lock down schemas for Column Families - CASSANDRA-1554
Better authorization system - CASSANDRA-1271
Dynamic schema changes - CASSANDRA-44
Basic scheduling based on keyspace - CASSANDRA-1035
CASSANDRA-1237 - Store AccessLevels externally to IAuthenticator (better authorization system)
Implement a weighted scheduler - e.g. for dev/prod accounts on the same cluster CASSANDRA-1485
Ideas under consideration
- Compaction fairness: currently, compaction makes no attempt to balance attention across multiple CFs
Include memory usage in "load" considerations for load balancing: ColumnFamilies with higher cache requirements have memory to consider.
- Namespaces - in a multi-tenant use case, each user might like to have a keyspace XYZ for whatever reason. So it might be nice to have namespaces so that keyspace XYZ could be specific to their user. Ideally this would be an option that would not affect those that don't use namespaces.
- The distinction from keyspaces is that a namespace would be completely transparent to the user: the existence of namespaces would not be exposed. It might be returned by the authentication backend on login, and prefixed to keyspaces transparently.
- JMX authentication without breaking nodetool - currently if you try to use JMX authentication, some of the Cassandra tools won't connect properly.
Perhaps provide a way to throttle certain operations, maybe useful outside the scope of multi-tenancy. Voldemort, for example, uses a throttler for operations that quickly iterate over all the data of a node, such as a rebalance.
The Hector java client for Cassandra has experimental support for a “Shared Keyspace, Shared Column Families” multi-tenancy model that allows for application-level virtual keyspaces Virtual keyspaces in Hector.