# LRU Page Cache

## Architecture
The cache maps page IDs to fixed-size byte buffers. A doubly-linked list stored in a flat Vec tracks access order: the head is the least-recently-used page (eviction candidate) and the tail is the most-recently-used.
```rust
pub struct PageCache {
    page_size: PageSize,           // 4 KB or 8 KB
    capacity: usize,               // Max pages in cache
    index: HashMap<PageId, usize>, // O(1) page lookup
    entries: Vec<Entry>,           // Flat-vec doubly-linked list
    lru_head: Option<usize>,       // Least recently used (evict first)
    lru_tail: Option<usize>,       // Most recently used
}

struct Entry {
    page: Page,          // The cached page data
    prev: Option<usize>, // Previous in LRU order
    next: Option<usize>, // Next in LRU order
}
```

```
Read request: PageId
        |
        v
HashMap lookup ──── HIT ──── move_to_tail() ──── return &Page
        |
      MISS
        |
        v
Load from disk ──── insert() ──── cache full?
                                      |
                      YES: evict_lru() ──── flush dirty page to disk
                                      |
                      NO: append to tail
```

## Operations
| Operation | Complexity | Description |
|---|---|---|
| get(id) | O(1) | Lookup page by ID, promote to most-recently-used |
| get_mut(id) | O(1) | Mutable lookup, promotes to MRU, enables dirty tracking |
| insert(page) | O(1) | Add page to cache; evicts LRU if at capacity |
| remove(id) | O(1) | Remove specific page, unlink from LRU list |
| contains(id) | O(1) | Check if page is cached |
| dirty_pages() | O(n) | Iterate all dirty pages for checkpoint flushing |
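The semantics in the table can be exercised end to end. Below is a deliberately simplified, self-contained sketch of the get/insert/evict behavior; it tracks recency with a `VecDeque` of keys rather than the flat-vec list described later (so promotion here is O(n), unlike the real cache), and `MiniLru` is a hypothetical name for illustration only:

```rust
use std::collections::{HashMap, VecDeque};

/// Simplified LRU: front of `order` = least recently used.
struct MiniLru {
    capacity: usize,
    map: HashMap<u64, Vec<u8>>,
    order: VecDeque<u64>, // front = LRU, back = MRU
}

impl MiniLru {
    fn new(capacity: usize) -> Self {
        Self { capacity, map: HashMap::new(), order: VecDeque::new() }
    }

    /// Promote `id` to the most-recently-used position.
    fn promote(&mut self, id: u64) {
        if let Some(pos) = self.order.iter().position(|&k| k == id) {
            self.order.remove(pos);
            self.order.push_back(id);
        }
    }

    /// Lookup; a hit promotes the page to MRU.
    fn get(&mut self, id: u64) -> Option<&Vec<u8>> {
        if self.map.contains_key(&id) {
            self.promote(id);
            self.map.get(&id)
        } else {
            None
        }
    }

    /// Insert; evicts and returns the LRU entry when full.
    fn insert(&mut self, id: u64, data: Vec<u8>) -> Option<(u64, Vec<u8>)> {
        if self.map.contains_key(&id) {
            self.map.insert(id, data); // replace in place, no eviction
            self.promote(id);
            return None;
        }
        let evicted = if self.map.len() == self.capacity {
            let lru = self.order.pop_front().expect("full cache has entries");
            Some((lru, self.map.remove(&lru).expect("index in sync")))
        } else {
            None
        };
        self.map.insert(id, data);
        self.order.push_back(id);
        evicted
    }

    fn contains(&self, id: u64) -> bool {
        self.map.contains_key(&id)
    }
}
```

Note how a `get` between two inserts changes which page is evicted: accessing a page moves it away from the eviction end.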
## LRU Promotion
Every get() and get_mut() call promotes the accessed page to the tail (most-recently-used position) via move_to_tail(), which unlinks the entry from its current position and relinks it at the end. This ensures frequently accessed pages stay in cache while cold pages drift toward the head for eviction.
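The unlink/relink step can be sketched against the flat-vec layout from the Architecture section. The following is a minimal, self-contained illustration (payloads are omitted and `LruList` is a hypothetical stand-in for the real cache type):

```rust
struct Node {
    prev: Option<usize>,
    next: Option<usize>,
}

struct LruList {
    nodes: Vec<Node>,
    head: Option<usize>, // least recently used
    tail: Option<usize>, // most recently used
}

impl LruList {
    /// Build a list whose LRU order is 0, 1, ..., n-1.
    fn with_len(n: usize) -> Self {
        let nodes = (0..n)
            .map(|i| Node {
                prev: i.checked_sub(1),
                next: if i + 1 < n { Some(i + 1) } else { None },
            })
            .collect();
        Self {
            nodes,
            head: if n > 0 { Some(0) } else { None },
            tail: n.checked_sub(1),
        }
    }

    /// Unlink node `i` from its current position, then relink it at the tail.
    fn move_to_tail(&mut self, i: usize) {
        if self.tail == Some(i) {
            return; // already most-recently-used
        }
        // Unlink: bridge the neighbors around `i`.
        let (prev, next) = (self.nodes[i].prev, self.nodes[i].next);
        match prev {
            Some(p) => self.nodes[p].next = next,
            None => self.head = next, // `i` was the head
        }
        if let Some(n) = next {
            self.nodes[n].prev = prev;
        }
        // Relink at the tail (MRU position).
        self.nodes[i].prev = self.tail;
        self.nodes[i].next = None;
        if let Some(t) = self.tail {
            self.nodes[t].next = Some(i);
        }
        self.tail = Some(i);
    }
}
```

Both phases touch a constant number of links, which is what keeps promotion O(1).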
## Eviction
When insert() is called on a full cache, the head entry (least-recently-used) is evicted and returned as an Evicted struct. The caller is responsible for flushing the evicted page to disk if its dirty flag is set.
```rust
pub struct Evicted {
    pub page: Page, // Caller must flush if page.is_dirty()
}

// Insert returns the evicted page when cache is full
let evicted: Option<Evicted> = cache.insert(new_page);
if let Some(e) = evicted {
    if e.page.is_dirty() {
        disk.write_page(&e.page)?; // Flush to disk
    }
}
```

## Page Structure
Each page is a fixed-size byte buffer with an ID, dirty flag, and read/write accessors. Writing to a page automatically sets the dirty flag, which is checked during eviction and checkpoint flushing.
```rust
pub struct Page {
    id: PageId,      // u64 page identifier
    data: Box<[u8]>, // Heap-allocated byte buffer
    dirty: bool,     // Set on write, cleared on flush
}

// Writing automatically marks dirty
page.write(offset, &data); // dirty = true
page.clear_dirty();        // After flushing to disk
```

| Setting | Value | Description |
|---|---|---|
| PAGE_SIZE_4K | 4,096 bytes | Default page size for general use |
| PAGE_SIZE_8K | 8,192 bytes | Larger pages for bulk workloads |
| Minimum | 512 bytes | Must be a power of two |
| Dirty tracking | Boolean flag | Set on write(), cleared on flush |
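A minimal, runnable sketch of the dirty-flag behavior described above (the field names follow the struct, but the accessor set, bounds handling, and size validation shown here are illustrative assumptions, not the real implementation):

```rust
pub struct Page {
    id: u64,         // PageId
    data: Box<[u8]>, // Heap-allocated byte buffer
    dirty: bool,     // Set on write, cleared on flush
}

impl Page {
    /// Allocate a zeroed page; size must be a power of two, 512 B minimum.
    pub fn new(id: u64, size: usize) -> Page {
        assert!(size >= 512 && size.is_power_of_two(), "invalid page size");
        Page { id, data: vec![0u8; size].into_boxed_slice(), dirty: false }
    }

    pub fn id(&self) -> u64 {
        self.id
    }

    /// Copy `src` into the page at `offset`; marks the page dirty.
    pub fn write(&mut self, offset: usize, src: &[u8]) {
        self.data[offset..offset + src.len()].copy_from_slice(src);
        self.dirty = true;
    }

    /// Read `len` bytes starting at `offset`.
    pub fn read(&self, offset: usize, len: usize) -> &[u8] {
        &self.data[offset..offset + len]
    }

    pub fn is_dirty(&self) -> bool {
        self.dirty
    }

    /// Called after the page has been flushed to disk.
    pub fn clear_dirty(&mut self) {
        self.dirty = false;
    }
}
```

Because `write` is the only mutating accessor, the dirty flag cannot silently go stale: any modification passes through it.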
## Free Page Management
Deleted pages are recycled via a FreeList that maintains a LIFO stack of reusable page IDs and a high-water mark for sequential allocation. New pages are allocated from the free list first; only when the list is empty does the high-water mark advance.
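That allocation policy can be captured in a few lines. Here is a self-contained sketch (constructor and exact method signatures are assumptions for illustration):

```rust
type PageId = u64;

pub struct FreeList {
    free: Vec<PageId>,    // Recycled page IDs (LIFO)
    next_page_id: PageId, // High-water mark
}

impl FreeList {
    pub fn new() -> FreeList {
        FreeList { free: Vec::new(), next_page_id: 0 }
    }

    /// Prefer a recycled ID; only bump the high-water mark when none exist.
    pub fn allocate(&mut self) -> PageId {
        if let Some(id) = self.free.pop() {
            id
        } else {
            let id = self.next_page_id;
            self.next_page_id += 1;
            id
        }
    }

    /// Return a page ID to the free list for reuse.
    pub fn free(&mut self, id: PageId) {
        self.free.push(id);
    }
}
```

Using a `Vec` as a stack gives the LIFO reuse order: the most recently freed page is handed out first, which tends to reuse pages that are still warm in OS caches.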
```rust
pub struct FreeList {
    free: Vec<PageId>, // Recycled page IDs (LIFO)
    next_page_id: u64, // High-water mark
}

// Allocation: reuse a freed page or bump the high-water mark
let id = freelist.allocate();

// Deallocation: return the page ID for reuse
freelist.free(page_id);
```

## Implementation Details
### Flat-Vec Linked List
The LRU list stores its entries in a Vec and links them with prev/next indices instead of heap-allocated nodes. This keeps entries contiguous in memory (cache-friendly), avoids pointer chasing, and lets new entries be allocated with a simple Vec push rather than a per-node heap allocation.
### HashMap Index
A HashMap maps each PageId to its Vec index, giving O(1) page lookup. Combined with the O(1) LRU reordering, every cache operation except the O(n) dirty_pages() scan runs in constant time regardless of cache size.
### Dirty Page Tracking
Pages track their dirty state via a boolean flag set automatically on any write() call. The dirty_pages() iterator yields all modified pages for batch flushing during checkpoints.
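Such an iterator can be written as a filter over the flat entry vector. A minimal sketch (struct fields are reduced to what the example needs; this is illustrative, not the actual implementation):

```rust
struct Page {
    id: u64,
    dirty: bool,
}

struct Entry {
    page: Page,
}

struct PageCache {
    entries: Vec<Entry>,
}

impl PageCache {
    /// Yield references to every modified page, e.g. for checkpoint flushing.
    fn dirty_pages(&self) -> impl Iterator<Item = &Page> + '_ {
        self.entries
            .iter()
            .map(|e| &e.page)
            .filter(|p| p.dirty)
    }
}
```

Returning an iterator rather than a collected Vec lets the checkpoint writer stream pages to disk without an intermediate allocation.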
### No Unsafe Code
The entire page cache implementation uses safe Rust. Memory safety for page buffers is enforced by the type system, eliminating buffer overflows and use-after-free vulnerabilities.
## Test Coverage
The page cache and related components are covered by 28 tests across 3 modules.
| Module | Tests | Coverage |
|---|---|---|
| page_cache.rs | 10 | Insert, get, eviction, LRU promotion, dirty tracking, replace-in-place |
| page.rs | 12 | Page allocation, read/write, dirty flag, byte offset calculation, page size validation |
| freelist.rs | 6 | Sequential allocation, free/reuse LIFO, high-water mark, free page listing |
### Key Test Scenarios
| Test | Description |
|---|---|
| evicts_lru_when_full | Inserting into a full cache evicts the least-recently-used page |
| access_promotes_to_mru | get() promotes a page to most-recently-used, preventing eviction |
| get_mut_promotes | Mutable access also promotes, ensuring hot pages stay cached |
| insert_existing_replaces_in_place | Re-inserting same PageId replaces without eviction |
| dirty_pages_iterator | dirty_pages() correctly filters to only modified pages |
| evicted_dirty_page_is_reported | Evicted page preserves dirty flag for caller to flush |
| free_and_reuse | Freed pages are reused in LIFO order before allocating new ones |