tkdhmin/noblsm-project

NobLSM Project

This repository contains a modified Linux kernel together with a custom LSM-tree KV-Store, called NobLSM (DAC'22) (Paper).

Overview

A Key-Value Store (KV-Store) periodically triggers major compaction. Major compaction generally consists of (1) reading SSTables from disk, (2) merging them into a new SSTable, and (3) writing the result back to disk. During major compaction, the KV-Store relies on the underlying filesystem's sync operation for data consistency. This sync operation causes notable performance overhead. Specifically, RocksDB (one such KV-Store) uses fdatasync() to write to disk synchronously for consistency.

NobLSM was published at DAC'22. It claims that if the KV-Store leverages the Ext4 journaling infrastructure to write KV data without inconsistency issues, the sync overhead can be removed, because the data is then written in a non-blocking manner.
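
To make the blocking point concrete, here is a minimal sketch in C of a synchronous write path of the kind described above. It is not RocksDB code: the function name write_sstable_sync and the file layout are hypothetical, and only the POSIX calls open(), write(), fdatasync(), and close() are real. The caller cannot continue until fdatasync() returns, which is the wait that NobLSM aims to remove.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not part of RocksDB or NobLSM): write len bytes
 * from buf to a new file at path and block until the data is durable.
 * Returns 0 on success, -1 on failure. */
int write_sstable_sync(const char *path, const void *buf, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may write fewer bytes than requested, so loop. */
    const char *p = buf;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            close(fd);
            return -1;
        }
        p += n;
        left -= (size_t)n;
    }

    /* Blocking point: the calling thread waits here until the file
     * data has reached stable storage. */
    if (fdatasync(fd) != 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

A compaction thread that calls such a helper for every output SSTable pays the storage latency on each call, regardless of how fast the merge itself is.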

To validate this idea, NobLSM needs the following user-level and kernel-level data structures and components:

  • User space
    • NobLSM's SSTable Manager: tracks SSTable dependencies
    • Compaction thread: must call the new NobLSM syscalls (check_commit and is_committed)
  • Kernel space
    • Pending Table: manages the tracked SSTables
    • Committed Table: manages the inodes whose commit has completed
    • check_commit syscall: inode tracking (syscall 462)
    • is_committed syscall: checks the commit status
    • commit callback: when a commit completes, moves the entry from the Pending Table to the Committed Table
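
The state transition between the two tables can be illustrated with a small user-space model in C. This is only a sketch under assumptions, not the kernel implementation in this repository: the model_* function names, the fixed-size arrays, and the capacity MAX_TRACKED are all hypothetical, and the model moves every pending inode on each callback, whereas a real commit callback would presumably only handle the inodes of the transaction that committed.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_TRACKED 64  /* arbitrary capacity chosen for this sketch */

/* A table is modeled as a flat array of inode numbers. */
struct inode_table {
    unsigned long ino[MAX_TRACKED];
    size_t count;
};

static struct inode_table pending;    /* models the Pending Table */
static struct inode_table committed;  /* models the Committed Table */

static bool table_contains(const struct inode_table *t, unsigned long ino)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->ino[i] == ino)
            return true;
    return false;
}

/* Adds ino if absent; returns false only when the table is full. */
static bool table_add(struct inode_table *t, unsigned long ino)
{
    if (table_contains(t, ino))
        return true;
    if (t->count == MAX_TRACKED)
        return false;
    t->ino[t->count++] = ino;
    return true;
}

/* Models check_commit: start tracking an inode without blocking. */
bool model_check_commit(unsigned long ino)
{
    return table_add(&pending, ino);
}

/* Models the commit callback: Pending -> Committed transition.
 * Inodes that do not fit in the committed table stay pending. */
void model_commit_callback(void)
{
    size_t kept = 0;
    for (size_t i = 0; i < pending.count; i++) {
        if (!table_add(&committed, pending.ino[i]))
            pending.ino[kept++] = pending.ino[i];
    }
    pending.count = kept;
}

/* Models is_committed: query the commit status of an inode. */
bool model_is_committed(unsigned long ino)
{
    return table_contains(&committed, ino);
}
```

In this model, a compaction thread would call model_check_commit() right after writing a new SSTable and continue working, then poll model_is_committed() later instead of waiting on a sync call.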

In this repo, both user space (RocksDB) and kernel space (linux-6.8) have been modified to validate the idea of NobLSM. According to my experiments, NobLSM outperforms the original RocksDB in both throughput and latency.

Throughput increased by up to 59.5% and latency decreased by up to 37.4% for a fillrandom workload with 10 million KV pairs. This is because major compaction no longer relies on synchronous disk writes.

Setup

  • OS: Ubuntu 20.04 or 22.04
  • Kernel: 6.8
  • RocksDB: v8.11.4
  • Compiler: GCC 9+ / CMake 3.16+

This project integrates:

License:

  • This project is released under the GNU General Public License v2 (GPLv2).
  • Linux kernel retains its GPL-2.0 WITH Linux-syscall-note license.
  • RocksDB retains its Apache-2.0 license.

Programming Languages

This project combines multiple components:

  • C: Linux kernel modifications (submodule linux-6.8)
  • C++: RocksDB integration (submodule third_party/rocksdb)

cloc

RocksDB

    2067 text files.
    2061 unique files.                                          
    123 files ignored.

github.com/AlDanial/cloc v 1.82  T=1.54 s (1269.2 files/s, 590845.1 lines/s)
--------------------------------------------------------------------------------
Language                      files          blank        comment           code
--------------------------------------------------------------------------------
C++                             739          69386          68039         451710
C/C++ Header                    605          24986          52756         107115
Java                            330           8143          21693          43190
Python                           35           1263           1586           9223
JSON                              2              0              0           7757
Perl                              2            476           1709           5824
Markdown                         43           1580              0           5684
Bourne Shell                     43            803            935           4460
C                                 3            748            250           4108
make                              8            664            318           3351
CMake                            17            217            138           2624
Sass                             16            290             16           1781
YAML                             34             56            107           1581
HTML                             48             70              1            966
Bourne Again Shell                3            120            193            845
Assembly                          1            133            132            491
INI                               8             95              0            482
SVG                               6             22             27            339
PowerShell                        1             86            104            303
XML                               3              6             12            226
Maven                             1             15              4            119
Dockerfile                        4             18            109             95
Protocol Buffers                  1              5              8             15
DOS Batch                         1              0              0              1
--------------------------------------------------------------------------------
SUM:                           1954         109182         148137         652290
--------------------------------------------------------------------------------
