Outline
0. Overview
1. Amplification
2. Size-tiered Compaction
3. Leveled Compaction
4. Summary
5. Lucene Merge Policy
6. Reference
Overview
Compaction operations are expensive in terms of CPU, memory, and disk I/O, yet because SSTables are immutable, compaction is indispensable in an LSM architecture.
Log Structured Merge (LSM)
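Incoming data is first written to an in-memory table (MemTable). When the MemTable is full, it is flushed to disk as an immutable Sorted String Table (SSTable). When there are too many SSTables, the OS also ends up holding too many open file handles, so multiple similar SSTables need to be merged into a single SSTable.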
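The write path described above can be sketched as a toy in-memory store. This is only an illustration under stated assumptions: the class `TinyLSM`, the `memtable_limit` threshold, and the list-of-tuples SSTable representation are all hypothetical and do not come from LevelDB or any other real engine.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical SSTable representation: sorted (key, value) pairs, never mutated after flush.
SSTable = List[Tuple[str, str]]


class TinyLSM:
    """Toy LSM store: one MemTable plus a list of flushed SSTables (newest last)."""

    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable_limit = memtable_limit  # assumed flush threshold, in entries
        self.memtable: Dict[str, str] = {}
        self.sstables: List[SSTable] = []

    def put(self, key: str, value: str) -> None:
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        """Write the MemTable out as one sorted, immutable SSTable."""
        if self.memtable:
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key: str) -> Optional[str]:
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # the newest SSTable wins
            for k, v in table:
                if k == key:
                    return v
        return None

    def compact(self) -> None:
        """Merge all SSTables into one, keeping the newest value per key."""
        merged: Dict[str, str] = {}
        for table in self.sstables:  # oldest first, so newer values overwrite
            merged.update(table)
        self.sstables = [sorted(merged.items())] if merged else []


db = TinyLSM(memtable_limit=2)
for k, v in [("a", "1"), ("b", "2"), ("a", "3"), ("c", "4")]:
    db.put(k, v)
tables_before = len(db.sstables)  # two flushes have happened
db.compact()                      # one SSTable remains, with the stale "a" dropped
```

Before compaction a lookup may have to scan every SSTable; after compaction there is a single file to open, which is the file-handle pressure the notes refer to.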
LevelDB architecture
Amplification
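- Write amplification: the same piece of data is written sequentially several times (it is still the same data, it just keeps flowing down the levels). It is first written to L0, and later compactions may write it once more at each of L1 to LN
- Read amplification: a single query may have to look in multiple SSTables, and those SSTables are spread across L0 to LN
- Space amplification: a piece of original data is copied/overwritten, so more than one copy exists on disk (one copy serves queries, while another is the snapshot that compaction works on)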
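The three amplification factors can be made concrete with some back-of-the-envelope arithmetic. The helper functions below are hypothetical, and the simplifying assumptions (exactly one rewrite per level, at most one candidate SSTable per deeper level) are mine, not measurements from a real system.

```python
def write_amplification(user_bytes: int, levels_rewritten: int) -> float:
    """Bytes written to disk per byte of user data.

    Assumes one write to L0 plus one rewrite in each of `levels_rewritten` deeper levels.
    """
    disk_bytes = user_bytes * (1 + levels_rewritten)
    return disk_bytes / user_bytes


def read_amplification(l0_sstables: int, deeper_levels: int) -> int:
    """Worst-case number of SSTables probed by one point query.

    Assumes every L0 SSTable may hold the key and each deeper level
    contributes at most one candidate SSTable.
    """
    return l0_sstables + deeper_levels


def space_amplification(live_bytes: int, compaction_input_bytes: int) -> float:
    """Disk bytes used per live byte while compaction still holds a copy of its input."""
    return (live_bytes + compaction_input_bytes) / live_bytes


wa = write_amplification(user_bytes=100, levels_rewritten=3)           # L0 + L1..L3
ra = read_amplification(l0_sstables=4, deeper_levels=3)
sa = space_amplification(live_bytes=100, compaction_input_bytes=100)   # full copy held
```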
Size-tiered Compaction
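- Triggered when the system has enough similarly sized SSTables, which are merged together to form one larger SSTable. A disadvantage of this strategy is that very large SSTables stay behind for a long time and use a lot of disk space (recommended for write-intensive workloads)
- The size of an individual SSTable grows from tier to tier, but every tier holds the same number of SSTables
- When a tier is full (i.e. its number of SSTables reaches the threshold), compaction runs: all data in that tier is merged into one SSTable, which is handed to the next tier as one of that tier's SSTables. During the merge, a snapshot copy of the original data is used as the merge input and is deleted once the merge finishes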
tiered (num same, size grow)
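The tier-promotion rule described in the bullets above can be sketched as follows. The function `add_sstable`, the dict-based SSTable, and the threshold of 4 are illustrative assumptions, not the behaviour of any specific database.

```python
from typing import Dict, List

SSTable = Dict[str, str]  # hypothetical stand-in for a sorted on-disk table


def add_sstable(tiers: List[List[SSTable]], sstable: SSTable,
                threshold: int = 4, tier: int = 0) -> None:
    """Add `sstable` to `tier`; when the tier is full, merge it into the next tier."""
    while len(tiers) <= tier:
        tiers.append([])
    tiers[tier].append(sstable)
    if len(tiers[tier]) >= threshold:
        merged: SSTable = {}
        for table in tiers[tier]:  # oldest first, so newer values overwrite
            merged.update(table)
        tiers[tier] = []  # the merge inputs are deleted after the merge
        add_sstable(tiers, merged, threshold, tier + 1)


tiers: List[List[SSTable]] = []
for i in range(5):  # five MemTable flushes, one key each
    add_sstable(tiers, {f"k{i}": f"v{i}"}, threshold=4)
```

After the fourth flush, tier 0 is merged into a single four-key SSTable in tier 1; the fifth flush starts filling tier 0 again. Each tier holds at most `threshold` SSTables, while the SSTables themselves get larger the deeper they sit.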
Leveled Compaction
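- Triggered when the number of unleveled SSTables (newly flushed SSTable files in Level 0) exceeds 4 (recommended for read-intensive workloads)
- Every SSTable within a level has the same size; what differs is the number of SSTables per level, which grows from one level to the next (by an order of magnitude)
- SSTables in level 1 are merged together with SSTables in level 2, and the result ends up as a sorted SSTable in level 2 (the level holding more SSTables)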
leveled (num grow, size same)
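A minimal sketch of merging one SSTable from an upper level into the next level is shown below. It assumes that only the next-level SSTables whose key range overlaps the incoming one are rewritten, and that the output is re-split into fixed-size SSTables; the names `compact_into_next_level` and `max_keys` are hypothetical, and SSTables are assumed to be non-empty.

```python
from typing import Dict, List, Tuple

SSTable = Dict[str, str]  # hypothetical stand-in for a sorted on-disk table


def key_range(table: SSTable) -> Tuple[str, str]:
    """Smallest and largest key of a non-empty SSTable."""
    return min(table), max(table)


def overlaps(a: SSTable, b: SSTable) -> bool:
    a_lo, a_hi = key_range(a)
    b_lo, b_hi = key_range(b)
    return a_lo <= b_hi and b_lo <= a_hi


def compact_into_next_level(upper: SSTable, next_level: List[SSTable],
                            max_keys: int = 2) -> List[SSTable]:
    """Merge `upper` with the overlapping SSTables of `next_level`; return the new level."""
    untouched = [t for t in next_level if not overlaps(upper, t)]
    merged: SSTable = {}
    for table in next_level:
        if overlaps(upper, table):
            merged.update(table)  # older data first
    merged.update(upper)          # newer data from the upper level wins
    keys = sorted(merged)
    # Re-split so every SSTable in the level keeps the same maximum size.
    rewritten = [{k: merged[k] for k in keys[i:i + max_keys]}
                 for i in range(0, len(keys), max_keys)]
    return sorted(untouched + rewritten, key=lambda t: min(t))


level2 = [{"a": "1", "b": "1"}, {"e": "1", "f": "1"}, {"x": "1", "y": "1"}]
level1_table = {"b": "2", "c": "2"}
new_level2 = compact_into_next_level(level1_table, level2, max_keys=2)
```

Only the SSTable covering `a` to `b` is rewritten here; the other two are left alone, and key ranges within the level stay non-overlapping, so a point query needs to check at most one SSTable in that level.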
Summary
Size-tiered Compaction vs. Leveled Compaction
- data in one SSTable which is later modified or deleted in another SSTable wastes space as both tables are present in the system
- when data is split across many SSTables, read requests are processed more slowly as many SSTables need to be read
Scylla compaction summary
Lucene Merge Policy
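The same three amplification problems appear here:
- Write amplification (docs are migrated between segments)
- Read amplification (different versions of a doc live in different segments, and opening each segment requires an indexReader)
- Space amplification (temporary space needed for segments)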
Lucene write-once vs. random write