B+ Tree Index
Architecture
The B+ tree is a balanced tree structure where all data resides in leaf nodes, and internal nodes contain only separator keys for navigation. Leaf nodes are linked in a forward chain, enabling efficient range scans without tree traversal.
pub struct BPlusTree<K: Ord + Clone, V: Clone> {
order: usize, // Branching factor (min 3)
root: Option<usize>, // Root node ID
nodes: Vec<Node<K, V>>, // All nodes in the tree
len: usize, // Total key-value pairs
}
enum Node<K, V> {
Internal(InternalNode<K>), // Separator keys + child pointers
Leaf(LeafNode<K, V>), // Data keys + values + next_leaf
}
struct LeafNode<K, V> {
keys: Vec<K>,
values: Vec<V>,
next_leaf: Option<usize>, // Forward pointer for range scans
}Operations
Point Lookup
Single-key lookup navigates from root to leaf using binary search at each internal node, then binary searches the leaf's key array. Complexity: O(log n).
Insert with Auto-Split
Insertion finds the correct leaf and adds the key-value pair. When a leaf exceeds capacity (order - 1 keys), it splits at the midpoint: the right half becomes a new leaf and the median key is promoted to the parent. Internal nodes split the same way, propagating up to the root if needed.
Insert key into leaf
↓ leaf.keys.len() > max_keys (order - 1)?
↓ YES: split at midpoint
Left leaf: keys[..mid], values[..mid]
Right leaf: keys[mid..], values[mid..]
Promote keys[mid] to parent
↓ Parent overflow? Split parent recursively
↓ Root splits? Create new root with single keyDelete with Merge/Redistribute
Deletion removes the key from its leaf. If the leaf underflows below the minimum (ceil((order-1)/2) keys), the tree first attempts to redistribute entries from a sibling. If redistribution isn't possible (sibling also at minimum), the nodes merge. Merges can cascade up to the root, reducing tree height.
Range Scans
Range queries navigate to the start leaf, then follow the next_leaf linked list until the upper bound is reached. No tree traversal is needed after finding the start position.
| Operation | Method | Description |
|---|---|---|
| Full scan | iter() | All entries in sorted order via leaf chain |
| From key | range_from(&K) | Entries where key >= start |
| To key | range_to(&K) | Entries where key <= end |
| Between | range(&K, &K) | Closed interval [start, end] |
| Prefix | prefix_scan(&K) | Keys sharing a common prefix (String, Vec<u8>) |
Bulk Loading
Pre-sorted data can be loaded bottom-up in a single pass, bypassing the insert-split cycle entirely. Leaf nodes are packed to capacity and linked, then internal levels are built from the leaf boundaries. This is significantly faster than repeated inserts for large datasets.
// Build tree from 10,000 sorted entries in one pass
let tree = BPlusTree::bulk_load(
64, // order
sorted_data.into_iter(),
);Capacity Rules
All capacity limits are derived from the configurable order parameter (minimum 3). The order determines the maximum fan-out of internal nodes.
| Parameter | Formula | Example (order=64) |
|---|---|---|
| Max keys per node | order - 1 | 63 |
| Min leaf keys | ceil((order - 1) / 2) | 32 |
| Min internal keys | ceil(order / 2) - 1 | 31 |
| Max children per internal | order | 64 |
| Root minimum | 1 key (no minimum) | 1 |
Concurrent Access
The ConcurrentBPlusTree wrapper provides thread-safe access via Arc<RwLock>. Multiple readers can access the tree concurrently; writers acquire an exclusive lock. Custom read/write closures are supported for complex operations.
pub struct ConcurrentBPlusTree<K, V> {
inner: Arc<RwLock<BPlusTree<K, V>>>,
}
// Multiple concurrent readers
tree.get(&key); // shared read lock
tree.range_to_vec(&a, &b); // shared read lock
// Exclusive writer
tree.insert(key, value); // exclusive write lock
tree.remove(&key); // exclusive write lockIndex Types Built on B+ Tree
The index manager uses B+ trees as the underlying structure for all index types in Anvil DB.
| Index Type | Key | Value | Use Case |
|---|---|---|---|
| Property | PropertyValue | Vec<NodeId> | Fast property lookups: WHERE n.name = 'Alice' |
| Unique | PropertyValue | NodeId | Uniqueness constraints: one node per key |
| Composite | (PropertyValue, PropertyValue) | Vec<NodeId> | Multi-property lookups |
| Full-text | String (term) | Vec<(NodeId, f32)> | Text search with TF-IDF scoring |
| Spatial | Morton code (u64) | Vec<NodeId> | Geospatial queries via Z-order curve |
| Label scan | LabelId | Vec<NodeId> | Fast label-based node retrieval |
| Relationship | (RelTypeId, NodeId) | Vec<RelId> | Relationship traversal acceleration |
Prefix Scanning
The B+ tree supports prefix-based range scans for String and Vec<u8> keys via the PrefixBound trait. The trait computes the exclusive upper bound of a prefix range by incrementing the last non-0xFF byte.
// Find all keys starting with "usr_"
for (key, val) in tree.prefix_scan(&"usr_".to_string()) {
// yields: "usr_001", "usr_002", "usr_alice", ...
}
// Internally computes: range("usr_", "usr]")
// where upper bound is "usr_" with last byte incrementedTest Coverage
The B+ tree implementation is covered by 51 tests across 2 modules, verifying correctness under sequential inserts, reverse inserts, large-scale operations (10,000 keys), and concurrent access.
| Module | Tests | Coverage |
|---|---|---|
| bptree.rs | 43 | Insert, delete, split, merge, range scan, prefix scan, bulk load, edge cases |
| concurrent_bptree.rs | 8 | Concurrent readers, writers, mixed access, closure-based operations |
Key Test Scenarios
| Test | Description |
|---|---|
| large_sequential_insert | 1,000 sequential keys with verified sorted iteration |
| large_reverse_insert | 1,000 reverse-order keys triggering repeated splits |
| bulk_load_large | 10,000 entries loaded bottom-up in one pass |
| delete_causes_height_reduction | Deletions collapse tree height via cascading merges |
| leaf_merge_on_delete | Deletion triggers leaf merge when below minimum |
| leaf_redistribute_on_delete | Deletion triggers redistribution from sibling |
| concurrent_readers_and_writer | 8 readers + 1 writer operating on 200 keys simultaneously |
| concurrent_writers | 4 writer threads inserting 1,000 keys in non-overlapping ranges |
| interleaved_insert_delete | 50 keys with alternating insert/delete operations |
| prefix_scan_strings | Prefix-based range scan on string keys |
Design Decisions
Generic Types
Keys require Ord + Clone, values require Clone. This allows the same B+ tree to back property indexes (PropertyValue keys), unique indexes (single NodeId values), and full-text indexes (term → score vectors).
Page-Ready Layout
Nodes are stored in a contiguous Vec with usize IDs, designed to map directly to a page-based storage manager. Each node corresponds to one fixed-size page on disk.
Leaf Linking
All leaves form a singly-linked list via next_leaf pointers. Range scans follow this chain after a single root-to-leaf traversal, avoiding repeated tree navigation.
No Unsafe Code
The entire implementation uses safe Rust with no unsafe blocks. Memory safety is guaranteed by the compiler, eliminating buffer overflow and use-after-free vulnerabilities common in C/C++ B+ tree implementations.