Back

Designing a Sharded, Lock-Striped, Allocation-Free LRU Key-Value Store in C++

lru kvstore image

GitHub Repo


Why I Built It


Design Goals


Shard Architecture

Keys are sharded across multiple independent stores, each with its own:

Why?


Hashing Scheme

A fast non-cryptographic hash (FNV-1a) is used to shard:

shard_index = hash(key) % num_shards;

This ensures even key distribution and low-cost routing. Shards never coordinate.


LRU Implementation


Avoiding Heap

Avoiding it gives full control over memory layout, performance profile, and peak usage.


Benchmarking

Benchmarked with Google Benchmark. Full data and graphs: Benchmark Results

System: AMD Ryzen 5 3550H · 16GB RAM · Fedora 42 · Kernel 6.15.5 Build Flags: -O3 -march=native -flto -DNDEBUG Store Capacity: 1024 entries

Summary Table (Real Time, ns)

Benchmark Time (ns) Description
BM_Insert_NoEvict 90,931 Insert N keys where N ≤ CAPACITY
BM_Insert_WithEvict 7,326,561 Insert 10× CAPACITY, triggers eviction
BM_Get_HotHit 72,836 100% cache hit
BM_Get_ColdMiss 204,845 100% miss, never-inserted keys
BM_Mixed_HotCold 3,815,285 90% hot reads, 10% cold writes
BM_Get_ParallelReaders/threads:8 154 8 threads concurrently reading
BM_Concurrent_ReadWrite/threads:5 270 4 readers + 1 writer under stress
BM_Write_Heavy_Parallel/threads:4 1,384 4 threads doing mostly writes
BM_ReadMostly_SharedStore/threads:8 765 8 threads, 95% reads / 5% writes

Observations

Full Benchmark Data


Future Work