Write-Ahead Log (WAL)
Architecture
The WAL follows a multi-layer pipeline: mutations flow from the application through the WAL Manager, into segmented log files on disk, and are replayed on recovery to reconstruct the in-memory state.
```
Application (handlers.rs)
  ↓ log_wal() for metadata ops        save_snapshot() for graph ops
WAL Manager (wal_manager.rs)          Snapshot (graph.dfgs)
  ↓ append() + commit()
WAL Writer → Segments (wal/*.log, 64 MB each)
  ↓ CRC32 checksum per record
Recovery Engine (wal/recovery.rs)
  ↓ redo committed post-checkpoint mutations
WAL Replay (wal_replay.rs) → In-Memory Snapshot
```
Record Format
Each WAL record is a binary structure with a CRC32 checksum for corruption detection. Records are append-only and immutable once written.
```
Offset  Size  Field          Type     Description
──────  ────  ─────────────  ───────  ──────────────────────────────────────────
0       8     LSN            u64 LE   Log Sequence Number
8       8     TX_ID          u64 LE   Transaction ID
16      1     OP_TYPE        u8       Operation type (1-6)
17      4     BEFORE_LEN     u32 LE   Length of before-image
21      var   BEFORE_IMAGE   [u8]     Before image (unused, redo-only recovery)
+len    4     AFTER_LEN      u32 LE   Length of after-image
+len    var   AFTER_IMAGE    [u8]     After image (redo data)
+len    4     CHECKSUM       u32 LE   CRC32 of all preceding bytes
```
Operation Types
| Type | Value | Description |
|---|---|---|
| Insert | 1 | New record (after_image contains mutation data) |
| Update | 2 | Modify existing (both images supported, currently unused) |
| Delete | 3 | Remove record (before_image supported, currently unused) |
| Commit | 4 | Transaction commit marker |
| Abort | 5 | Transaction abort marker |
| Checkpoint | 6 | Checkpoint marker |
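The layout and operation types above can be sketched as follows. This is a simplified stand-in, not the actual implementation in wal/record.rs; the `WalRecord` field names and `encode` helper are assumptions for illustration.

```rust
// Illustrative sketch of the on-disk record layout; not the real API.
fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            crc = if crc & 1 != 0 { (crc >> 1) ^ 0xEDB8_8320 } else { crc >> 1 };
        }
    }
    crc ^ 0xFFFF_FFFF
}

struct WalRecord {
    lsn: u64,
    tx_id: u64,
    op_type: u8, // 1..=6, per the Operation Types table
    before_image: Vec<u8>,
    after_image: Vec<u8>,
}

impl WalRecord {
    /// Serialize in the documented layout: fixed header, length-prefixed
    /// images, then a CRC32 computed over all preceding bytes.
    fn encode(&self) -> Vec<u8> {
        let mut buf = Vec::new();
        buf.extend_from_slice(&self.lsn.to_le_bytes());
        buf.extend_from_slice(&self.tx_id.to_le_bytes());
        buf.push(self.op_type);
        buf.extend_from_slice(&(self.before_image.len() as u32).to_le_bytes());
        buf.extend_from_slice(&self.before_image);
        buf.extend_from_slice(&(self.after_image.len() as u32).to_le_bytes());
        buf.extend_from_slice(&self.after_image);
        let checksum = crc32(&buf); // CRC over everything written so far
        buf.extend_from_slice(&checksum.to_le_bytes());
        buf
    }
}

fn main() {
    let rec = WalRecord {
        lsn: 1,
        tx_id: 7,
        op_type: 1, // Insert
        before_image: Vec::new(),
        after_image: vec![0xAA, 0xBB, 0xCC],
    };
    let bytes = rec.encode();
    // Header (17) + two length prefixes (8) + images (0 + 3) + checksum (4)
    assert_eq!(bytes.len(), 17 + 8 + 3 + 4);
    // The trailing CRC must match a recomputation over everything before it.
    let stored = u32::from_le_bytes(bytes[bytes.len() - 4..].try_into().unwrap());
    assert_eq!(stored, crc32(&bytes[..bytes.len() - 4]));
}
```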
Mutation Types
The WAL captures 24 distinct mutation types covering all state changes across graph, documents, functions, triggers, and metadata.
Graph Mutations
| Mutation | Description |
|---|---|
| CreateNode(Node) | Insert a node with all properties and labels |
| DeleteNode { node_id } | Delete a node by ID |
| UpdateNodeProperties(Node) | Full node replacement after property update |
| CreateRelationship(Relationship) | Insert a relationship |
| DeleteRelationship { rel_id } | Delete a relationship by ID |
| DetachDeleteNode { node_id } | Remove node and all its relationships |
Document Mutations
| Mutation | Description |
|---|---|
| PutDocument(Document) | Insert or replace a document |
| DeleteDocument { collection_id, document_id } | Delete a document |
| CreateCollection(CollectionConfig) | Create a new collection |
| DropCollection { collection_id } | Drop a collection |
Schema & Metadata Mutations
| Mutation | Description |
|---|---|
| AddSyncRule / RemoveSyncRule | Graph-document sync rule changes |
| CreateFunction / DropFunction | Stored function changes |
| CreateTrigger / DropTrigger / Enable / Disable | Trigger lifecycle |
| RegisterLabel / RegisterRelType / RegisterPropKey / RegisterCollection | Registry entries |
| SetGraphIdCounters / SetDocIdCounters | ID sequence counters |
CRC32 Integrity Checks
Every WAL record includes a CRC32 checksum (the reflected IEEE 802.3 polynomial, 0xEDB88320) computed over all preceding bytes. On recovery, each record's checksum is verified during deserialization. If a corrupt record is encountered, reading stops and recovery proceeds with all valid records up to that point.
```rust
fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= byte as u32;
        for _ in 0..8 {
            if crc & 1 != 0 {
                crc = (crc >> 1) ^ 0xEDB8_8320;
            } else {
                crc >>= 1;
            }
        }
    }
    crc ^ 0xFFFF_FFFF
}
```
Segment Management
WAL data is stored in fixed-size segment files (default 64 MB each). When a segment fills, the writer automatically rotates to a new one. Old segments are purged after a successful checkpoint.
```
data/wal/
  wal_000000000000.log   ← oldest (purged after checkpoint)
  wal_000000000001.log
  wal_000000000002.log   ← current segment
```
| Setting | Default | Description |
|---|---|---|
| Segment size | 64 MB | Maximum size per segment file |
| Naming | wal_NNNNNNNNNNNN.log | 12-digit zero-padded sequence number |
| Rotation | Automatic | New segment created when current fills |
| Purge | After checkpoint | Old segments deleted safely |
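The naming and rotation rules above can be sketched with a couple of helpers. These are hypothetical functions mirroring the documented behavior, not the actual code in wal/segment.rs.

```rust
use std::path::PathBuf;

const SEGMENT_SIZE: u64 = 64 * 1024 * 1024; // 64 MB default

// Hypothetical helper: 12-digit zero-padded sequence number in the name.
fn segment_file_name(seq: u64) -> String {
    format!("wal_{:012}.log", seq)
}

fn segment_path(dir: &str, seq: u64) -> PathBuf {
    PathBuf::from(dir).join(segment_file_name(seq))
}

// Rotation decision: a new segment is opened once the current one
// cannot hold the next record.
fn needs_rotation(current_len: u64, next_record_len: u64) -> bool {
    current_len + next_record_len > SEGMENT_SIZE
}

fn main() {
    assert_eq!(segment_file_name(2), "wal_000000000002.log");
    assert!(segment_path("data/wal", 2).ends_with("wal_000000000002.log"));
    assert!(!needs_rotation(0, 1024));
    assert!(needs_rotation(SEGMENT_SIZE - 10, 100));
}
```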
Crash Recovery
Recovery follows an ARIES-inspired redo-only protocol: read all records, find the last checkpoint, classify transactions, and replay only committed post-checkpoint mutations. Incomplete transactions are silently discarded.
Recovery Algorithm
Read All Records
Read all WAL segment files in sequence order. Deserialize records until end-of-file or first corrupt record.
Find Last Checkpoint
Scan backward to find the most recent Checkpoint record. Recovery starts from this LSN. If no checkpoint exists, replay from the beginning.
Classify Transactions
After the checkpoint LSN, classify each transaction: committed (has Commit record), aborted (has Abort record), or incomplete (no Commit/Abort — crashed mid-transaction).
Redo Pass
Replay Insert/Update/Delete records from committed transactions only, in LSN order. Incomplete transactions are silently discarded.
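The four steps above can be sketched on a simplified record type of `(lsn, tx_id, op_type)`; this is an illustration of the classification logic, not the actual recovery code.

```rust
use std::collections::HashSet;

// Op values follow the Operation Types table:
// 1-3 = mutations, 4 = Commit, 5 = Abort, 6 = Checkpoint.
type Rec = (u64, u64, u8);

fn redo_records(log: &[Rec]) -> Vec<Rec> {
    // Step 2: find the last checkpoint; replay starts after it.
    let start = log.iter().rposition(|&(_, _, op)| op == 6).map_or(0, |i| i + 1);
    let tail = &log[start..];

    // Step 3: a transaction is committed iff it has a Commit marker.
    let committed: HashSet<u64> =
        tail.iter().filter(|&&(_, _, op)| op == 4).map(|&(_, tx, _)| tx).collect();

    // Step 4: redo mutations from committed transactions only, in LSN order.
    tail.iter()
        .filter(|&&(_, tx, op)| (1..=3).contains(&op) && committed.contains(&tx))
        .cloned()
        .collect()
}

fn main() {
    let log = vec![
        (1, 1, 1), (2, 1, 4), // tx 1: committed before the checkpoint
        (3, 0, 6),            // checkpoint
        (4, 2, 1), (5, 2, 4), // tx 2: committed after the checkpoint
        (6, 3, 1),            // tx 3: incomplete (crashed mid-transaction)
    ];
    // Only tx 2's mutation is replayed; tx 1 is already covered by the
    // snapshot, and tx 3 is silently discarded.
    assert_eq!(redo_records(&log), vec![(4, 2, 1)]);
}
```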
Why Redo-Only?
Traditional ARIES requires an undo pass because its buffer pool can flush dirty pages from uncommitted transactions to disk. Anvil does not have this problem. The on-disk snapshot is always a consistent point-in-time image, and incomplete transactions only ever existed in memory — they are lost on crash and never reach the persisted state. Recovery simply loads the last snapshot and replays committed WAL entries forward from there, so no undo pass is needed.
```rust
pub struct RecoveryResult {
    pub redo_records: Vec<WalRecord>,  // Committed mutations to replay
    pub checkpoint_lsn: Lsn,           // Last checkpoint position
    pub max_lsn: Lsn,                  // Highest LSN (for resuming)
    pub committed_txs: HashSet<TxId>,  // Successfully committed
    pub incomplete_txs: HashSet<TxId>, // Crashed mid-flight (discarded)
}
```
Checkpoint Strategy
Checkpoints bound the recovery window. A background task periodically checks if enough operations have accumulated, then flushes the in-memory state to disk and writes a checkpoint record to the WAL.
| Setting | Default | Environment Variable |
|---|---|---|
| Checkpoint interval | 300s (5 min) | ANVIL_CHECKPOINT_INTERVAL |
| Operations threshold | 1,000 ops | ANVIL_CHECKPOINT_OPS_THRESHOLD |
| WAL sync mode | fsync | ANVIL_WAL_SYNC_MODE |
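A minimal sketch of how the environment variables above could be read, falling back to the documented defaults; the helper names are hypothetical and the real parsing may differ.

```rust
use std::env;
use std::time::Duration;

// Hypothetical helpers; defaults match the table above.
fn checkpoint_interval() -> Duration {
    let secs = env::var("ANVIL_CHECKPOINT_INTERVAL")
        .ok()
        .and_then(|v| v.parse::<u64>().ok())
        .unwrap_or(300); // 5-minute default
    Duration::from_secs(secs)
}

fn checkpoint_ops_threshold() -> u64 {
    env::var("ANVIL_CHECKPOINT_OPS_THRESHOLD")
        .ok()
        .and_then(|v| v.parse::<u64>().ok())
        .unwrap_or(1_000)
}

fn main() {
    // With neither variable set, the documented defaults apply.
    println!("interval = {:?}", checkpoint_interval());
    println!("ops threshold = {}", checkpoint_ops_threshold());
}
```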
Checkpoint Process
1. Check if ops_since_checkpoint >= threshold
2. Save in-memory snapshot to disk (graph.dfgs)
3. Write Checkpoint record to WAL with fsync
4. Purge all WAL segments before the current one
5. Reset operation counter to 0

Graceful Shutdown
On Ctrl+C or anvil stop, the server saves a final snapshot, writes a checkpoint record, and purges old WAL segments before exiting. This ensures the next startup has minimal or no recovery work.
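Both the periodic background task and graceful shutdown drive the same checkpoint sequence. A minimal sketch, assuming an illustrative `Storage` trait and `Checkpointer` struct (not the actual API):

```rust
// Hypothetical abstraction over the operations the checkpoint needs.
trait Storage {
    fn save_snapshot(&mut self);           // flush in-memory state (graph.dfgs)
    fn write_checkpoint_record(&mut self); // Checkpoint record + fsync
    fn purge_old_segments(&mut self);      // drop pre-checkpoint WAL segments
}

struct Checkpointer {
    ops_since_checkpoint: u64,
    threshold: u64,
}

impl Checkpointer {
    /// Run the checkpoint sequence if enough operations accumulated.
    /// Returns true if a checkpoint was taken.
    fn maybe_checkpoint<S: Storage>(&mut self, storage: &mut S) -> bool {
        if self.ops_since_checkpoint < self.threshold {
            return false; // step 1 failed: not enough operations yet
        }
        storage.save_snapshot();           // step 2
        storage.write_checkpoint_record(); // step 3
        storage.purge_old_segments();      // step 4
        self.ops_since_checkpoint = 0;     // step 5
        true
    }
}

fn main() {
    struct Mock { calls: Vec<&'static str> }
    impl Storage for Mock {
        fn save_snapshot(&mut self) { self.calls.push("snapshot"); }
        fn write_checkpoint_record(&mut self) { self.calls.push("checkpoint"); }
        fn purge_old_segments(&mut self) { self.calls.push("purge"); }
    }
    let mut mock = Mock { calls: Vec::new() };
    let mut cp = Checkpointer { ops_since_checkpoint: 1_000, threshold: 1_000 };
    assert!(cp.maybe_checkpoint(&mut mock));
    assert_eq!(mock.calls, ["snapshot", "checkpoint", "purge"]);
    assert_eq!(cp.ops_since_checkpoint, 0);
    // Below the threshold, nothing happens.
    assert!(!cp.maybe_checkpoint(&mut mock));
}
```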
Configuration
```toml
# WAL and checkpoint settings
[storage]
wal_sync_mode = "fsync"          # fsync, fdatasync, or none
checkpoint_interval_secs = 300   # 5 minutes
checkpoint_ops_threshold = 1000  # checkpoint after 1000 operations
```
Test Coverage
The WAL implementation is covered by 54 tests across 8 modules, verifying record serialization, checksum integrity, segment rotation, recovery correctness, and replay idempotency.
| Module | Tests | Coverage |
|---|---|---|
| wal/record.rs | 12 | Serialization, checksums, corruption detection |
| wal/segment.rs | 9 | File creation, rotation, size limits, listing |
| wal_mutation.rs | 8 | Mutation round-trip serialization, buffer reading, invalid tag rejection |
| wal/writer.rs | 6 | Append, commit, auto-rotation, LSN allocation |
| wal/recovery.rs | 6 | Committed/incomplete/aborted tx recovery, checkpoint |
| wal_manager.rs | 5 | Open, log, batch, checkpoint, recovery integration |
| wal/checkpoint.rs | 4 | Dirty page flush, segment purge, recovery point |
| wal_replay.rs | 4 | Graph/document/registry mutation replay |
Durability Guarantees
Durable Persistence
Metadata mutations are logged to the WAL with fsync on commit. Graph mutations persist via snapshots. Checkpoints and graceful shutdown ensure committed state survives crashes.
Atomic Commits
Each transaction's mutations share a TX_ID. The Commit record marks the transaction as durable. Without it, the transaction is discarded on recovery.
Idempotent Replay
Records use full object replacement (after_image). Replaying the same mutation twice produces identical state. Safe for repeated recovery.
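A toy illustration of why full-object replacement is idempotent, using a simplified key-value state (the real after_image carries a serialized node, document, or registry entry):

```rust
use std::collections::HashMap;

// The after_image carries the complete new value, so applying it twice
// leaves the state identical to applying it once.
fn apply_after_image(state: &mut HashMap<u64, Vec<u8>>, key: u64, after_image: &[u8]) {
    state.insert(key, after_image.to_vec());
}

fn main() {
    let mut once = HashMap::new();
    let mut twice = HashMap::new();
    apply_after_image(&mut once, 42, b"node-v2");
    apply_after_image(&mut twice, 42, b"node-v2");
    apply_after_image(&mut twice, 42, b"node-v2"); // replayed again
    assert_eq!(once, twice); // identical state either way
}
```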
Bounded Recovery
Checkpoints bound the recovery window. Only mutations after the last checkpoint need to be replayed, keeping startup time predictable.