AntiEntropy means comparing all the replicas of each piece of data that exist (or are supposed to) and updating each replica to the newest version.

Cassandra's implementation is modeled on Dynamo's, with modifications to support the richer data model. Quoting from Amazon's Dynamo section 4.7,

The key difference in Cassandra's implementation of anti-entropy is that the Merkle trees are built per column family, and they are not maintained for longer than it takes to send them to neighboring nodes. Instead, the trees are generated as snapshots of the dataset during major compactions: this means that excess data might be sent across the network, but it saves local disk IO, and is preferable for very large datasets.


AntiEntropy (last edited 2013-11-12 23:13:17 by 50)