Differences between revisions 1 and 2
Revision 1 as of 2011-08-18 07:00:28
Size: 26
Editor: NoblePaul
Comment:
Revision 2 as of 2011-08-18 07:02:56
Size: 2109
Editor: NoblePaul
Comment:
Deletions are marked like this. Additions are marked like this.
Line 2: Line 2:
(Work in progress)
== What is SolrCloud? ==

SolrCloud is an enhancement to the existing Solr to manage and operate Solr as a search service in a cloud.

== Glossary ==

 * Cluster : Cluster is a set of Solr nodes managed as a single unit. The entire cluster must have a single schema and solrconfig
 * Node : A JVM instance running Solr
 * Partition : A partition is a subset of the entire document collection. A partition is created in such a way that all its documents can be contained in a single index.
 * Shard : A Partition needs to be stored in multiple nodes as specified by the replication factor. All these nodes collectively form a shard. A node may be a part of multiple shards
 * Leader : Each Shard has one node identified as its leader. All the writes for documents belonging to a partition should be routed through the leader.
 * Replication Factor : Minimum number of copies of a document maintained by the cluster
 * Transaction Log : An append-only log of write operations maintained by each node
 * Partition version : This is a counter maintained with the leader of each shard and incremented on each write operation and sent to the peers
 * Cluster Lock : This is a global lock which must be acquired in order to change the range -> partition or the partition -> node mappings.

== Guiding Principles ==

 * Any operation can be invoked on any node in the cluster.
 * No non-recoverable single point of failures
 * Cluster should be elastic
 * Writes must never be lost i.e. durability is guaranteed
 * Order of writes should be preserved
 * If two clients send document "A" to two different replicas at the same time, one should consistently "win" on all replicas.
 * Cluster configuration should be managed centrally and can be updated through any node in the cluster. No per node configuration other than local values such as the port, index/logs storage locations should be required
 * Automatic failover of reads
 * Automatic failover of writes
 * Automatically honour the replication factor in the event of a node failure

New SolrCloud Design

(Work in progress)

What is SolrCloud?

SolrCloud is an enhancement to the existing Solr to manage and operate Solr as a search service in a cloud.

Glossary

  • Cluster : Cluster is a set of Solr nodes managed as a single unit. The entire cluster must have a single schema and solrconfig
  • Node : A JVM instance running Solr
  • Partition : A partition is a subset of the entire document collection. A partition is created in such a way that all its documents can be contained in a single index.
  • Shard : A Partition needs to be stored in multiple nodes as specified by the replication factor. All these nodes collectively form a shard. A node may be a part of multiple shards
  • Leader : Each Shard has one node identified as its leader. All the writes for documents belonging to a partition should be routed through the leader.
  • Replication Factor : Minimum number of copies of a document maintained by the cluster
  • Transaction Log : An append-only log of write operations maintained by each node
  • Partition version : This is a counter maintained with the leader of each shard and incremented on each write operation and sent to the peers
  • Cluster Lock : This is a global lock which must be acquired in order to change the range -> partition or the partition -> node mappings.

Guiding Principles

  • Any operation can be invoked on any node in the cluster.
  • No non-recoverable single point of failures
  • Cluster should be elastic
  • Writes must never be lost i.e. durability is guaranteed
  • Order of writes should be preserved
  • If two clients send document "A" to two different replicas at the same time, one should consistently "win" on all replicas.
  • Cluster configuration should be managed centrally and can be updated through any node in the cluster. No per node configuration other than local values such as the port, index/logs storage locations should be required
  • Automatic failover of reads
  • Automatic failover of writes
  • Automatically honour the replication factor in the event of a node failure

NewSolrCloudDesign (last edited 2012-05-19 19:45:07 by ool-457bec5f)