SSTable Overview
DRAFT. Notes on documenting how SSTables work in Cassandra (data format, indexing, serialization, searching)
Each SSTable consists of three separate files, and SSTables are created per column family.
- Bloom Filter
- Index
- Data
Adding a new key to an SSTable goes through the following steps. All keys are sorted before writing.
- Serialize Index
- Sort columns for key
- Serialize columns bloom filter
- Loop through the columns and subcolumns that make up the column family
- Sum each column's getObjectCount() into columnCount (this includes subcolumn counts for super columns)
- Create a bloom filter sized for columnCount
- Loop through the columns again and add each column name to the bloom filter
- If a column is a super column, loop through its subcolumns and add each subcolumn name
- Write bloom filter hash count (int)
- Write serialized bloom filter length (int)
- Write serialized bytes of bloom filter
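The column bloom filter steps above can be sketched roughly as follows. This is a minimal illustration, not Cassandra's code: `ColumnBloomFilterSketch`, its sizing rule (16 bits per name), its hash scheme, and the fixed `HASH_COUNT` of 4 are all assumptions made for the example. Super columns are assumed to have been flattened into the list of names by the caller. Only the output layout (hash count as int, serialized length as int, then the filter bytes) follows the notes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-in for Cassandra's bloom filter; sizing and hashing are assumptions.
final class ColumnBloomFilterSketch {
    static final int HASH_COUNT = 4; // assumed value, for illustration only

    final int hashCount;
    final int bitCount;
    final BitSet bits;

    ColumnBloomFilterSketch(int columnCount, int hashCount) {
        this.hashCount = hashCount;
        // Assumed sizing rule: 16 bits per expected name, never fewer than 64 bits.
        this.bitCount = Math.max(64, columnCount * 16);
        this.bits = new BitSet(bitCount);
    }

    // Derives the i-th bit position from two base hashes (double hashing).
    private int bitIndex(byte[] name, int i) {
        int h1 = Arrays.hashCode(name);
        int h2 = Integer.rotateLeft(h1, 16) ^ 0x9E3779B9;
        return Math.floorMod(h1 + i * h2, bitCount);
    }

    void add(byte[] name) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(bitIndex(name, i));
        }
    }

    boolean mightContain(byte[] name) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(bitIndex(name, i))) {
                return false;
            }
        }
        return true;
    }

    // Writes: hash count (int), serialized filter length (int), serialized filter bytes.
    static byte[] serialize(List<byte[]> columnNames) throws IOException {
        ColumnBloomFilterSketch filter =
                new ColumnBloomFilterSketch(columnNames.size(), HASH_COUNT);
        for (byte[] name : columnNames) {
            filter.add(name);
        }
        byte[] filterBytes = filter.bits.toByteArray();

        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeInt(filter.hashCount);
        out.writeInt(filterBytes.length);
        out.write(filterBytes);
        return buffer.toByteArray();
    }
}
```

A reader of this layout would consume the two ints first and then read exactly that many filter bytes.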
- Start indexing based on column family comparator
- If there are no columns, write integer zero and return
- Iterate over the columns until getColumnIndexSize() is exceeded (default is 64KB)
- Construct a new IndexInfo consisting of the last column before the threshold was exceeded, the current column name, startPosition, and endPosition - startPosition
- Write indexSizeInBytes (int)
- Serialize each IndexInfo object (firstname is the last column name before the threshold was exceeded, and lastname is the current column name)
- Write byte: (firstname length >> 8) & 0xFF
- Write byte: firstname length & 0xFF
- Write bytes of firstname
- Write byte: (lastname length >> 8) & 0xFF
- Write byte: lastname length & 0xFF
- Write bytes of lastname
- Write long startPosition
- Write long endPosition - startPosition
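As a rough illustration of the column index steps above, the sketch below groups columns into blocks and serializes one IndexInfo per block. The class `ColumnIndexSketch` and its method names are hypothetical. It assumes that firstname is the first column of each block and lastname the last, that a block closes once it reaches or passes the threshold, and that a trailing partial block is also indexed. These are one reading of the notes, not confirmed behavior.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of column index construction and serialization.
final class ColumnIndexSketch {
    static final class IndexInfo {
        final byte[] firstName;
        final byte[] lastName;
        final long startPosition;
        final long width; // endPosition - startPosition

        IndexInfo(byte[] firstName, byte[] lastName, long startPosition, long width) {
            this.firstName = firstName;
            this.lastName = lastName;
            this.startPosition = startPosition;
            this.width = width;
        }

        // Two length-prefixed names plus two longs.
        int serializedSize() {
            return 2 + firstName.length + 2 + lastName.length + 8 + 8;
        }

        void serialize(DataOutputStream out) throws IOException {
            writeName(out, firstName);
            writeName(out, lastName);
            out.writeLong(startPosition);
            out.writeLong(width);
        }
    }

    // Two-byte big-endian length prefix followed by the name bytes.
    static void writeName(DataOutputStream out, byte[] name) throws IOException {
        out.writeByte((name.length >> 8) & 0xFF);
        out.writeByte(name.length & 0xFF);
        out.write(name);
    }

    // names[i] and sizes[i] describe the i-th column and its serialized size in bytes.
    static List<IndexInfo> buildIndex(byte[][] names, int[] sizes, int thresholdBytes) {
        List<IndexInfo> index = new ArrayList<>();
        long start = 0;
        long end = 0;
        byte[] first = null;
        for (int i = 0; i < names.length; i++) {
            if (first == null) {
                first = names[i];
                start = end;
            }
            end += sizes[i];
            boolean lastColumn = (i == names.length - 1);
            // Assumption: close the block at or past the threshold, or at the final column.
            if (end - start >= thresholdBytes || lastColumn) {
                index.add(new IndexInfo(first, names[i], start, end - start));
                first = null;
            }
        }
        return index;
    }

    static byte[] serialize(byte[][] names, int[] sizes, int thresholdBytes) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        List<IndexInfo> index = buildIndex(names, sizes, thresholdBytes);
        if (index.isEmpty()) {
            out.writeInt(0); // no columns: write integer zero and return
            return buffer.toByteArray();
        }
        int indexSizeInBytes = 0;
        for (IndexInfo info : index) {
            indexSizeInBytes += info.serializedSize();
        }
        out.writeInt(indexSizeInBytes);
        for (IndexInfo info : index) {
            info.serialize(out);
        }
        return buffer.toByteArray();
    }
}
```

With the 64KB default, a row only gets more than one IndexInfo once its serialized columns pass that size, so small rows carry a single entry.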
- Serialize Data
- Write columnFamily localDeletionTime (int)
- Write columnFamily markedForDeleteAt (long)
- Sort columns
- Write the number of columns (int)
- Determine Column Serializer and Serialize Column
- Determine the length of the column name (length)
- Write byte: (length >> 8) & 0xFF
- Write byte: length & 0xFF
- Write bytes of column name
- Write boolean isMarkedForDelete
- Write long timestamp
- Write column value length (int)
- Write column value as bytes
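The data serialization steps above map fairly directly onto `java.io.DataOutputStream` calls. The sketch below is illustrative only: `RowDataSketch` and its nested `Column` class are hypothetical names, the columns are assumed to be already sorted by the caller, and only standard columns (not super columns) are covered.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.List;

// Hypothetical sketch of the row data layout described in the notes.
final class RowDataSketch {
    static final class Column {
        final byte[] name;
        final boolean markedForDelete;
        final long timestamp;
        final byte[] value;

        Column(byte[] name, boolean markedForDelete, long timestamp, byte[] value) {
            this.name = name;
            this.markedForDelete = markedForDelete;
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    static void writeColumn(DataOutputStream out, Column column) throws IOException {
        // Two-byte big-endian name length, then the name bytes.
        out.writeByte((column.name.length >> 8) & 0xFF);
        out.writeByte(column.name.length & 0xFF);
        out.write(column.name);
        out.writeBoolean(column.markedForDelete);
        out.writeLong(column.timestamp);
        out.writeInt(column.value.length);
        out.write(column.value);
    }

    // Assumes sortedColumns is already in comparator order.
    static byte[] serializeRow(int localDeletionTime, long markedForDeleteAt,
                               List<Column> sortedColumns) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buffer);
        out.writeInt(localDeletionTime);
        out.writeLong(markedForDeleteAt);
        out.writeInt(sortedColumns.size());
        for (Column column : sortedColumns) {
            writeColumn(out, column);
        }
        return buffer.toByteArray();
    }
}
```

Because the name length is written as two bytes, a column name in this layout cannot exceed 65535 bytes.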
- Write to SSTable Data File
- Write out the row key in UTF; the key format depends on the partitioner
- Random Partitioner
- key token + DELIMITER + key name
- Delimiter is colon
- Write the size of the row value (int)
- Write the bytes of the row value
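A minimal sketch of the data file write for the Random Partitioner case follows. The class `DataFileSketch` is hypothetical, and deriving the token as the absolute value of the key's MD5 digest is an assumption, since the notes only say the disk key is key token + DELIMITER + key name with a colon delimiter.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch of appending one row to the SSTable data file.
final class DataFileSketch {
    static final String DELIMITER = ":";

    // Assumption: the token is the absolute value of the MD5 digest of the key.
    static BigInteger token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    // Disk key: key token + DELIMITER + key name.
    static String diskKey(String key) {
        return token(key) + DELIMITER + key;
    }

    // Appends the row and returns the stream position recorded before the write.
    static int appendRow(DataOutputStream dataOut, String key, byte[] rowValue)
            throws IOException {
        int positionBefore = dataOut.size();
        dataOut.writeUTF(diskKey(key));   // row key in UTF
        dataOut.writeInt(rowValue.length); // size of row value
        dataOut.write(rowValue);           // bytes of row value
        return positionBefore;
    }
}
```

The returned position is the value the index step needs, which is why the sketch captures it before writing anything.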
- Write SSTable Bloom Filter and SSTable Index
- Add the disk key to the bloom filter; the disk key format depends on the partitioner
- Random Partitioner
- key token + DELIMITER + key name
- Delimiter is colon
- Write disk key to SSTable Index file (UTF)
- Write the data file position recorded before the "Write to SSTable Data File" step (int)
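The index file entries described above (disk key in UTF followed by the data file position as an int) could be written and scanned roughly like this. `SSTableIndexSketch` and its methods are hypothetical, the linear scan in `findPosition` is just one simple way to search such a file, and the bloom filter update is left out to keep the example short.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

// Hypothetical sketch of SSTable index entries: (disk key as UTF, data position as int).
final class SSTableIndexSketch {
    static void writeEntry(DataOutputStream indexOut, String diskKey, int dataPosition)
            throws IOException {
        indexOut.writeUTF(diskKey);
        indexOut.writeInt(dataPosition);
    }

    // Scans the index entries in order; returns the data position, or -1 if the key is absent.
    static int findPosition(byte[] indexBytes, String diskKey) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(indexBytes));
        while (in.available() > 0) {
            String candidate = in.readUTF();
            int position = in.readInt();
            if (candidate.equals(diskKey)) {
                return position;
            }
        }
        return -1;
    }
}
```

Since keys are sorted before writing, a real reader could stop scanning as soon as it passes the target key; the sketch does not attempt that optimization.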