Log-structured merge (LSM) stores have emerged as the technology of choice for building scalable write-intensive key- value storage systems. An LSM store replaces random I/O with sequential I/O by accumulating large batches of writes in a memory store prior to flushing them to log-structured disk storage; the latter is continuously re-organized in the background through a compaction process for efficiency of reads. Though inherent to the LSM design, frequent com- pactions are a major pain point because they slow down data store operations, primarily writes, and also increase disk wear. Another performance bottleneck in today’s state- of-the-art LSM stores, in particular ones that use managed languages like Java, is the fragmented memory layout of their dynamic memory store.
In this paper we show that these pain points may be mit- igated via better organization of the memory store. We present Accordion – an algorithm that addresses these prob- lems by re-applying the LSM design principles to memory management. Accordion is implemented in the production code of Apache HBase, where it was extensively evaluated. We demonstrate Accordion’s double-digit performance gains versus the baseline HBase implementation and discuss some unexpected lessons learned in the process.