Technology

Volcano-Model Executor

Anvil DB uses a Volcano-model (iterator-model) query executor where each operator implements an open/next/close interface. Rows are pulled one at a time through a pipeline of 30+ physical operators, driven by a cost-based optimizer with predicate pushdown, index selection, and join reordering.

The Operator Trait

Every physical operator implements the same three-method interface. Operators compose by nesting: each operator holds a reference to its input operator and calls its next() to pull rows upstream.

operator interface (executor.rs)rust
pub trait Operator: std::fmt::Debug {
    /// Initialize the operator (called once before iteration)
    fn open(&mut self) -> ExecResult<()>;

    /// Pull the next row; returns None when exhausted
    fn next(&mut self) -> ExecResult<Option<Row>>;

    /// Clean up resources (called after iteration)
    fn close(&mut self) -> ExecResult<()>;
}

/// Rows flow between operators as variable-to-value maps
pub struct Row {
    columns: HashMap<String, Value>,
}
pipeline compositiontext
Query: MATCH (n:Person) WHERE n.age > 30 RETURN n.name LIMIT 10

Pipeline (inner → outer):
  NodeByLabelScan("Person")
    → FilterOp(n.age > 30)
      → ProjectOp(n.name)
        → LimitOp(10)

Execution: LimitOp.next()
  calls ProjectOp.next()
    calls FilterOp.next()
      calls NodeByLabelScan.next()
        returns Row{n: {name: "Alice", age: 32, ...}}
      evaluates predicate: 32 > 30 → true → passes row
    projects: Row{name: "Alice"}
  count < 10 → returns row to caller

Physical Operators (30+)

Scan Operators

OperatorDescription
EmptyRowOpProduces a single empty row (pipeline root without input)
InMemoryScanYields pre-loaded rows from a Vec
AllNodesScanScans all nodes in the graph, binds each to a variable
NodeByLabelScanScans nodes filtered by label
NodeByIndexSeekB+ tree index lookup returning matching nodes
NodeByIdSeekDirect node ID lookup (returns 0 or 1 row)

Transformation Operators

OperatorLazy/EagerDescription
FilterOpLazyEvaluates predicate per row, passes matching rows
ProjectOpLazyEvaluates expressions, restructures output columns
LimitOpLazyPasses first N rows, then returns None
SkipOpLazyDiscards first N rows, passes the rest
SortOpEagerMaterializes all input, sorts by keys, emits in order
DistinctOpLazy*Tracks seen rows via HashSet, removes duplicates
EagerOpEagerMaterializes all input to prevent read-your-own-writes

Graph Traversal

OperatorDescription
ExpandOpSingle-hop expansion from source node to neighbors
OptionalExpandOpLike Expand but emits nulls when no neighbors (LEFT JOIN semantics)
VarLengthExpandOpMulti-hop BFS with configurable min/max hops
ShortestPathOpFinds shortest path(s) between nodes using BFS

Join & Subquery Operators

OperatorDescription
ApplyOpFor each input row, executes subquery and merges results (correlated subquery)
SemiApplyOpKeeps input rows where subquery produces at least one result (EXISTS)
AntiSemiApplyOpKeeps input rows where subquery produces zero results (NOT EXISTS)
UnionOpCombines rows from two inputs (UNION ALL or UNION DISTINCT)

Aggregation

OperatorFunctionsDescription
AggregateOpCOUNT, SUM, AVG, MIN, MAX, COLLECTGroups input rows, computes aggregations, emits one row per group

Mutation Operators

OperatorDescription
CreateNodeOpCreates a node with labels and properties, binds to variable
DeleteOpDeletes nodes (with optional DETACH for relationships)
SetPropertyOpSets a property on a node
RemovePropertyOpRemoves a property from a node
MergeOpMatch-or-create with ON MATCH SET / ON CREATE SET

Control Operators

OperatorDescription
TimeoutOpWraps any operator with cancellation/timeout checks on every next() call
ResultStreamWraps pipeline with streaming interface, column metadata, and row counting

Cost-Based Optimizer

The query planner converts parsed Cypher AST into a logical plan tree, applies optimization passes, estimates costs using statistics, and selects the cheapest alternative.

Optimization Passes

PassDescription
Predicate PushdownSplits AND predicates and pushes filters below Project, Sort, Distinct, Expand, and Join operators
Index Lookup ConversionConverts Filter(NodeScan) to IndexScan when a matching B+ tree index exists
LIMIT 1 OptimizationPushes LIMIT 1 through Sort, Project, Filter for first-match semantics
Join ReorderingGreedy ordering by estimated cardinality (smallest first) to minimize intermediate results

Cost Model

cost weights (planner.rs)rust
pub struct CostModel {
    pub scan_cost: f64,           // 1.0  — full label scan
    pub index_lookup_cost: f64,   // 0.1  — B+ tree seek
    pub filter_cost: f64,         // 0.2  — predicate evaluation
    pub hash_join_cost: f64,      // 2.0  — hash join build+probe
    pub sort_cost_per_row: f64,   // 0.5  — O(n log n) sort
    pub project_cost: f64,        // 0.1  — expression evaluation
    pub expand_cost: f64,         // 1.5  — graph traversal
}

Cardinality Estimation

The planner uses per-label node counts, per-type relationship counts, average degree, and optional property histograms (with distinct count, min/max, bucket bounds) to estimate the number of rows each operator will produce.

statistics (planner.rs)rust
pub struct PlannerStats {
    pub total_nodes: u64,
    pub nodes_per_label: HashMap<String, u64>,
    pub total_relationships: u64,
    pub rels_per_type: HashMap<String, u64>,
    pub avg_degree: f64,
    pub property_histograms: HashMap<(String, String), Histogram>,
}

Runtime Controls

Memory Tracking

Eager operators (Sort, Aggregate, Eager) track memory consumption via a shared MemoryTracker. Each row buffered is estimated and allocated against a configurable limit. Exceeding the limit returns an error instead of consuming unbounded memory.

Query Cancellation & Timeout

The TimeoutOp wrapper checks a CancellationToken (atomic boolean) on every next() call. Queries can be cancelled from another thread or automatically timed out after a configurable duration.

timeout and cancellationrust
pub struct CancellationToken {
    cancelled: Arc<AtomicBool>,  // Thread-safe cancel flag
}

pub struct QueryContext {
    pub cancel_token: CancellationToken,
    pub timeout: Option<Duration>,
    pub start_time: Instant,
}

// TimeoutOp checks on every next() call
impl Operator for TimeoutOp {
    fn next(&mut self) -> ExecResult<Option<Row>> {
        self.context.check()?;  // Returns Err if cancelled/timed out
        self.input.next()
    }
}

Result Streaming

The ResultStream wrapper provides lazy initialization (calls open() on first pull), batch pulling, row counting, and column metadata access.

EXPLAIN & PROFILE

The planner provides EXPLAIN for estimated execution plans and PROFILE for actual runtime metrics per operator.

CommandOutput
EXPLAINPlan tree with estimated cardinality and cost per operator
PROFILEPlan tree with actual rows, execution time, cache hits/misses per operator

Test Coverage

The executor and planner are covered by 187 tests across 2 modules.

ModuleTestsCoverage
executor.rs101All 30+ operators, pipelines, memory tracking, cancellation, streaming
planner.rs86Plan generation, cost estimation, predicate pushdown, index optimization, join reordering

Design Decisions

Pull-Based Execution

Each operator pulls rows from its input on demand. This avoids materializing intermediate results for lazy operators (Filter, Project, Limit) and enables early termination when LIMIT is reached.

Lazy vs Eager

Most operators stream rows through without buffering. Only Sort, Aggregate, Union, and Eager materialize their input. The EagerOp is inserted before mutations to prevent read-your-own-writes anomalies.

Composable Operators

Every operator implements the same Operator trait, enabling arbitrary pipeline composition. TimeoutOp can wrap any operator. MutationSink is generic over storage backends.

No Unsafe Code

The entire executor uses safe Rust. Memory safety for row buffers, operator state, and concurrent cancellation tokens is enforced by the type system.