RFC 004: kdb-next SQL planner & executor
| Field | Value |
|---|---|
| Status | **Implemented** (retroactive) |
| Author(s) | Rodrigo (with Claude as scribe) |
| Date | 2026 |
| Target module | platform/kdb/next/crates/kdb-planner |
| Related | RFC |
1. Summary
`kdb-sql` (#055) gives us a `sqlparser-rs`-backed AST for PostgreSQL SQL. `kdb-planner` is the crate that compiles that AST into a tree of physical operators (`Plan` / `DmlPlan`) and executes it against a pluggable `RowSource`. The crate is **fully decoupled from `kdb-record`**: it operates on its own decoded `Value` type and a `Vec<Value>` row, so it can be unit-tested without touching async, the substrate, or any Protobuf machinery.
This RFC documents the architecture as built. SELECT (#056), DML (#057), JOINs (#058), and subqueries + non-recursive CTEs (#059) are landed. Aggregates / GROUP BY (#060), window functions (#061), and set operations (#062) are tracked separately.
2. Design principles
- **Decoupled from storage.** The planner does not know `Record`, `Schema`, or `prost`. A `RowSource` trait abstracts every read; a `MutableRowSource` subtrait abstracts every write. The default in-memory implementation (`InMemoryTables`) is what tests use; a real `RowSource for Record` adapter is a future ticket and is the only piece that needs to know about the substrate.
- **Decoded rows, not wire bytes.** Rows flowing through operators are `Vec<Value>`, where `Value` is an enum of the eight types kdb-next ships in v0.1 (`Text`, `Int64`, `Uint64`, `Bool`, `Bytes`, `Timestamp`, `Float64`, `Date`) plus `Null`. Encoding/decoding is the adapter's job; the executor never touches Protobuf.
- **Local schema, not `kdb_record::Schema`.** A small `TableSchema` / `Column` pair lives in the planner crate. It mirrors the relevant fields of `kdb_record::Schema`, but the planner can be compiled and tested standalone. Translating between the two belongs in the adapter.
- **One node per logical operator.** No sharing of `Plan` for DML: `DmlPlan` is a separate enum with `Insert` / `Update` / `Delete` variants because the executor needs a mutable row source and returns a row count plus an optional `RETURNING` projection.
- **Heuristic, not cost-based.** v1 is rule-based: column references resolve to positional indices at build time, and the executor walks the tree top-down with no rewrites. Pushdown to indexes is deferred to a follow-up RFC once the adapter exists.
3. Crate layout
crates/kdb-planner/
└── src/
    ├── lib.rs        # public re-exports
    ├── value.rs      # DataType + Value
    ├── schema.rs     # TableSchema, Column
    ├── plan.rs       # Plan, DmlPlan, Expr, JoinType, AggFn, …
    ├── build.rs      # AST → Plan/DmlPlan compiler
    ├── eval.rs       # Expr evaluator (with optional sub-source for subqueries)
    ├── exec.rs       # Plan walker + RowSource trait + InMemoryTables
    ├── aggregate.rs  # GROUP BY / aggregate execution (#060 wip)
    └── tests.rs      # 72 unit tests across all of the above

4. Core types
Value and DataType (value.rs)
pub enum DataType { Text, Int64, Uint64, Bool, Bytes, Timestamp, Float64, Date }

pub enum Value {
    Null,
    Text(String),
    Int64(i64), Uint64(u64),
    Bool(bool),
    Bytes(Vec<u8>),
    Timestamp(i64), // epoch ms UTC
    Float64(f64),
    Date(i32),      // days since 1970-01-01
}

A free function `cmp_values(&Value, &Value) -> Ordering` defines the SQL-style ordering used by `Sort` (NULLs first/last per `SortKey`).
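The ordering described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the crate's code: the `SortKey` fields (`column`, `descending`, `nulls_first`), the `variant_rank` fallback for mixed-type comparisons, the use of `f64::total_cmp` for floats, and the `sort_rows` helper are all hypothetical. Only the `cmp_values` signature and the `Value` variants come from this RFC.

```rust
use std::cmp::Ordering;

#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Text(String),
    Int64(i64),
    Uint64(u64),
    Bool(bool),
    Bytes(Vec<u8>),
    Timestamp(i64),
    Float64(f64),
    Date(i32),
}

/// Hypothetical rank, used only when the two values have different variants.
fn variant_rank(v: &Value) -> u8 {
    match v {
        Value::Null => 0,
        Value::Bool(_) => 1,
        Value::Int64(_) => 2,
        Value::Uint64(_) => 3,
        Value::Float64(_) => 4,
        Value::Text(_) => 5,
        Value::Bytes(_) => 6,
        Value::Date(_) => 7,
        Value::Timestamp(_) => 8,
    }
}

/// Total order over values. NULL sorts lowest here; the per-key NULL
/// placement is applied by `sort_rows` below.
pub fn cmp_values(a: &Value, b: &Value) -> Ordering {
    use Value::*;
    match (a, b) {
        (Null, Null) => Ordering::Equal,
        (Null, _) => Ordering::Less,
        (_, Null) => Ordering::Greater,
        (Text(x), Text(y)) => x.cmp(y),
        (Int64(x), Int64(y)) => x.cmp(y),
        (Uint64(x), Uint64(y)) => x.cmp(y),
        (Bool(x), Bool(y)) => x.cmp(y),
        (Bytes(x), Bytes(y)) => x.cmp(y),
        (Timestamp(x), Timestamp(y)) => x.cmp(y),
        (Float64(x), Float64(y)) => x.total_cmp(y),
        (Date(x), Date(y)) => x.cmp(y),
        _ => variant_rank(a).cmp(&variant_rank(b)),
    }
}

/// Hypothetical sort key: column position, direction, NULL placement.
pub struct SortKey {
    pub column: usize,
    pub descending: bool,
    pub nulls_first: bool,
}

/// Stable multi-key sort. `descending` flips only non-NULL comparisons;
/// `nulls_first` alone decides where NULLs land.
pub fn sort_rows(rows: &mut [Vec<Value>], keys: &[SortKey]) {
    rows.sort_by(|l, r| {
        for k in keys {
            let (a, b) = (&l[k.column], &r[k.column]);
            let ord = match (a, b) {
                (Value::Null, Value::Null) => Ordering::Equal,
                (Value::Null, _) => {
                    if k.nulls_first { Ordering::Less } else { Ordering::Greater }
                }
                (_, Value::Null) => {
                    if k.nulls_first { Ordering::Greater } else { Ordering::Less }
                }
                _ => {
                    let o = cmp_values(a, b);
                    if k.descending { o.reverse() } else { o }
                }
            };
            if ord != Ordering::Equal {
                return ord;
            }
        }
        Ordering::Equal
    });
}
```

Keeping NULL placement outside `cmp_values` lets one comparison function serve every `SortKey` configuration.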
Plan (plan.rs)
pub enum Plan {
    TableScan { table, output_columns },
    SubqueryScan { plan, output_columns },
    NestedLoopJoin { left, right, on, join_type, left_width, right_width },
    Filter { predicate, child },
    Sort { keys, child },
    Limit { limit, offset, child },
    Project { items, child },
    Aggregate { group_keys, aggs, child },
}

Column references inside `Expr` are **positional** (`Expr::Column(usize)`), resolved at build time against the child's output column list. Joins flatten left then right, and downstream operators index into that concatenation.
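To make positional resolution concrete, here is a small sketch of how a name could become an index at build time and how the executor could then read it. The helper names `resolve_column`, `eval`, and `join_output`, the `String` error type, and the trimmed-down `Value` / `Expr` enums are assumptions for illustration. Only `Expr::Column(usize)` and the left-then-right flattening come from the text above. The sketch handles unqualified names only and picks the first match.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Int64(i64),
    Text(String),
}

#[derive(Debug, Clone, PartialEq)]
pub enum Expr {
    Column(usize),
    Literal(Value),
}

/// Build time: turn a column name into a position in the child's output list.
pub fn resolve_column(output_columns: &[&str], name: &str) -> Result<Expr, String> {
    output_columns
        .iter()
        .position(|c| *c == name)
        .map(Expr::Column)
        .ok_or_else(|| format!("unknown column: {}", name))
}

/// Execution time: a column reference is a plain index into the row.
/// The index is assumed valid because it was produced by `resolve_column`.
pub fn eval(expr: &Expr, row: &[Value]) -> Value {
    match expr {
        Expr::Column(i) => row[*i].clone(),
        Expr::Literal(v) => v.clone(),
    }
}

/// A join's output column list is the left columns followed by the right ones.
pub fn join_output<'a>(left: &[&'a str], right: &[&'a str]) -> Vec<&'a str> {
    left.iter().chain(right.iter()).copied().collect()
}
```

With a left side of `(id, name)` and a right side of `(order_id, total)`, `total` resolves to `Expr::Column(3)`, so the executor never compares strings per row.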
DmlPlan (plan.rs)
pub enum DmlPlan {
    Insert { table, table_columns, column_indices, values: Vec<Vec<Expr>>, returning },
    Update { table, table_columns, assignments: Vec<(usize, Expr)>, predicate, returning },
    Delete { table, table_columns, predicate, returning },
}

`table_columns` is the full storage-order layout, so the executor can build a positionally correct `Vec<Value>` even when the SQL only mentions a subset of columns.
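As an illustration of the partial-column case, the following sketch places the supplied values into a full-width row. The function name `build_insert_row`, the `String` error type, and the choice to fill unmentioned columns with `Value::Null` are assumptions. This RFC does not say how the executor treats defaults.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Int64(i64),
    Text(String),
}

/// Place the values named in an INSERT into a storage-order row.
/// `table_width` is the number of columns in the table;
/// `column_indices[i]` is the storage position of the i-th supplied value.
/// Columns the statement does not mention are left as NULL (an assumption).
pub fn build_insert_row(
    table_width: usize,
    column_indices: &[usize],
    values: Vec<Value>,
) -> Result<Vec<Value>, String> {
    if column_indices.len() != values.len() {
        return Err(format!(
            "expected {} values, got {}",
            column_indices.len(),
            values.len()
        ));
    }
    let mut row = vec![Value::Null; table_width];
    for (&idx, v) in column_indices.iter().zip(values) {
        let slot = row
            .get_mut(idx)
            .ok_or_else(|| format!("column index {} out of range", idx))?;
        *slot = v;
    }
    Ok(row)
}
```

For a table `(id, name, age)`, `INSERT INTO t (age, id) VALUES (30, 1)` would map to `column_indices = [2, 0]` and yield a row with `id` in slot 0, `name` left NULL, and `age` in slot 2.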
RowSource (exec.rs)
pub trait RowSource {
    fn scan(&self, table: &str) -> ExecResult<Vec<Row>>;
}

pub trait MutableRowSource: RowSource {
    fn append(&mut self, table: &str, row: Row) -> ExecResult<()>;
    fn replace_all(&mut self, table: &str, rows: Vec<Row>) -> ExecResult<()>;
}

`InMemoryTables` is the default implementation. UPDATE/DELETE are currently full rewrites via `replace_all`; an indexed in-place mutation is a future optimization.
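The sketch below shows one way an in-memory source could satisfy both traits, and how a DELETE reduces to scan, filter, and `replace_all`. The trait signatures are taken from this section. The `HashMap` layout inside `InMemoryTables`, the `create_table` and `delete_where` helpers, the single-variant `ExecError`, and the trimmed `Value` enum are assumptions made for the example.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Int64(i64),
    Text(String),
}
pub type Row = Vec<Value>;

#[derive(Debug, PartialEq)]
pub enum ExecError {
    UnknownTable(String),
}
pub type ExecResult<T> = Result<T, ExecError>;

pub trait RowSource {
    fn scan(&self, table: &str) -> ExecResult<Vec<Row>>;
}

pub trait MutableRowSource: RowSource {
    fn append(&mut self, table: &str, row: Row) -> ExecResult<()>;
    fn replace_all(&mut self, table: &str, rows: Vec<Row>) -> ExecResult<()>;
}

/// Hypothetical layout: one row vector per table name.
#[derive(Default)]
pub struct InMemoryTables {
    tables: HashMap<String, Vec<Row>>,
}

impl InMemoryTables {
    pub fn create_table(&mut self, name: &str) {
        self.tables.entry(name.to_string()).or_default();
    }

    fn rows_mut(&mut self, table: &str) -> ExecResult<&mut Vec<Row>> {
        self.tables
            .get_mut(table)
            .ok_or_else(|| ExecError::UnknownTable(table.to_string()))
    }
}

impl RowSource for InMemoryTables {
    fn scan(&self, table: &str) -> ExecResult<Vec<Row>> {
        self.tables
            .get(table)
            .cloned()
            .ok_or_else(|| ExecError::UnknownTable(table.to_string()))
    }
}

impl MutableRowSource for InMemoryTables {
    fn append(&mut self, table: &str, row: Row) -> ExecResult<()> {
        self.rows_mut(table)?.push(row);
        Ok(())
    }

    fn replace_all(&mut self, table: &str, rows: Vec<Row>) -> ExecResult<()> {
        *self.rows_mut(table)? = rows;
        Ok(())
    }
}

/// DELETE as a full rewrite: scan, keep the non-matching rows, write back.
/// Returns the number of rows removed.
pub fn delete_where(
    src: &mut dyn MutableRowSource,
    table: &str,
    matches: impl Fn(&Row) -> bool,
) -> ExecResult<usize> {
    let rows = src.scan(table)?;
    let before = rows.len();
    let kept: Vec<Row> = rows.into_iter().filter(|r| !matches(r)).collect();
    let deleted = before - kept.len();
    src.replace_all(table, kept)?;
    Ok(deleted)
}
```

The full rewrite costs time proportional to the table size per statement, which is acceptable for tests and is the reason in-place mutation is listed as a later optimization.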
5. Build / compile flow
Statement (kdb-sql AST)
        │
        ▼
build() / build_with_catalog() / build_dml()
        │
        ▼
Plan / DmlPlan (positionally resolved)
        │
        ▼
execute() / execute_dml(source: &dyn RowSource)
        │
        ▼
Vec<Row> / DmlResult { rows_affected, returning }

`build_with_catalog` takes a `Catalog = HashMap<String, TableSchema>` to resolve multi-table FROM clauses, JOINs, and subqueries against more than one table. `build` is the single-table convenience wrapper.
Errors are typed (`PlanError`, `EvalError`, `ExecError`), derive `thiserror::Error`, and propagate to the caller as `Err` values. There is no panic path in normal operation; unknown table, unknown column, and type mismatch all return `Err`.
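A std-only sketch of this error style follows. It writes the `Display`, `Error`, and `From` impls by hand so the example has no dependencies; with `thiserror` those impls would be generated by the derive. The variant names, the nesting of `PlanError` inside `ExecError`, and the `resolve` / `column_index` helpers are hypothetical and not taken from the crate.

```rust
use std::fmt;

/// Build-time failures (hypothetical variants).
#[derive(Debug, PartialEq)]
pub enum PlanError {
    UnknownTable(String),
    UnknownColumn(String),
}

/// Run-time failures; wrapping `PlanError` is an assumption of this sketch.
#[derive(Debug, PartialEq)]
pub enum ExecError {
    Plan(PlanError),
    TypeMismatch { expected: &'static str, found: &'static str },
}

impl fmt::Display for PlanError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PlanError::UnknownTable(t) => write!(f, "unknown table: {}", t),
            PlanError::UnknownColumn(c) => write!(f, "unknown column: {}", c),
        }
    }
}

impl fmt::Display for ExecError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ExecError::Plan(e) => write!(f, "plan error: {}", e),
            ExecError::TypeMismatch { expected, found } => {
                write!(f, "type mismatch: expected {}, found {}", expected, found)
            }
        }
    }
}

impl std::error::Error for PlanError {}
impl std::error::Error for ExecError {}

impl From<PlanError> for ExecError {
    fn from(e: PlanError) -> Self {
        ExecError::Plan(e)
    }
}

/// Build step: resolve a column name or return a typed error, never panic.
pub fn resolve(columns: &[&str], name: &str) -> Result<usize, PlanError> {
    columns
        .iter()
        .position(|c| *c == name)
        .ok_or_else(|| PlanError::UnknownColumn(name.to_string()))
}

/// Caller: `?` converts `PlanError` into `ExecError` through the `From` impl.
pub fn column_index(columns: &[&str], name: &str) -> Result<usize, ExecError> {
    let idx = resolve(columns, name)?;
    Ok(idx)
}
```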
6. Execution semantics
- **Three-valued logic** for `WHERE` / `ON`: NULL predicates filter the row out (`predicate_matches` returns `false` for NULL).
- **Outer joins** NULL-pad the unmatched side using `right_width` / `left_width` from the join node.
- **Subqueries** are non-correlated in v1 and materialized once per `IN (...)` / `EXISTS (...)` evaluation. Correlated subqueries need a per-row context and are out of scope.
- **CTEs** are non-recursive and inlined as `SubqueryScan` at build time. Recursive CTEs are deferred.
- **Sort stability:** comparisons go through `cmp_values`; ties keep source order (Rust's `sort_by` is stable).
- **Division by zero** in arithmetic yields `Value::Null` rather than an error. This does **not** match PostgreSQL, which raises `division_by_zero`; a future ticket will add a strict mode.
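The first two rules above interact in joins, so a combined sketch may help. It assumes an equality `ON` condition over one column from each side. The `sql_eq` and `left_join` helpers and the trimmed `Value` enum are hypothetical; `predicate_matches` and `right_width` are the names used in this section, though their real signatures may differ.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Int64(i64),
}
pub type Row = Vec<Value>;

/// SQL equality under three-valued logic: any NULL operand yields NULL.
pub fn sql_eq(a: &Value, b: &Value) -> Value {
    match (a, b) {
        (Value::Null, _) | (_, Value::Null) => Value::Null,
        _ => Value::Bool(a == b),
    }
}

/// Only a definite TRUE keeps a row; FALSE and NULL both filter it out.
pub fn predicate_matches(v: &Value) -> bool {
    matches!(v, Value::Bool(true))
}

/// LEFT JOIN by nested loops on `left[left_key] = right[right_key]`.
/// A left row with no match is emitted once, padded with `right_width` NULLs.
pub fn left_join(
    left: &[Row],
    right: &[Row],
    left_key: usize,
    right_key: usize,
    right_width: usize,
) -> Vec<Row> {
    let mut out = Vec::new();
    for l in left {
        let mut matched = false;
        for r in right {
            if predicate_matches(&sql_eq(&l[left_key], &r[right_key])) {
                // Output layout: left columns first, then right columns.
                let mut row = l.clone();
                row.extend(r.iter().cloned());
                out.push(row);
                matched = true;
            }
        }
        if !matched {
            let mut row = l.clone();
            row.extend(std::iter::repeat(Value::Null).take(right_width));
            out.push(row);
        }
    }
    out
}
```

Note that a NULL join key never matches anything, including another NULL, so such a left row always takes the padding path.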
7. Test coverage
72 unit tests in tests.rs covering:
- SELECT: column projection, WHERE comparisons, ORDER BY, LIMIT/OFFSET, `AND` / `OR` / `NOT`, IS NULL, IN list, BETWEEN, arithmetic.
- DML: INSERT (full row, partial columns, multi-row), UPDATE (assignments, with/without WHERE, RETURNING), DELETE.
- JOINs: INNER / LEFT / RIGHT / CROSS, multi-table FROM, joins inside subqueries.
- Subqueries: `IN (subquery)`, `NOT IN (subquery)`, `EXISTS`, `NOT EXISTS`, derived-table FROM.
- CTEs: single CTE, multiple CTEs, CTE referenced in FROM and in JOIN.
- Errors: unknown table, unknown column, type mismatch.
`cargo test -p kdb-planner` is green at the time of writing.
8. What's intentionally not here
- **`RowSource for Record`.** The adapter that bridges `kdb-planner` and `kdb-record` lives in a follow-up ticket. The boundary is deliberately narrow: a single `impl` block plus a small schema-translation helper.
- **Index pushdown.** Filters that match a primary key or unique index should turn into `PrimaryKeyLookup` / `IndexLookup` operators. Not in v1; the executor scans every row.
- **Aggregate finalization (#060).** `aggregate.rs` exists with `execute_aggregate()`, and the `AggFn` / `AggKind` types are wired into `Plan::Aggregate`, but the build path from `GROUP BY` AST → planner is partial. Closing #060 will document the final shape.
- **Window functions, set operations, prepared statements, HTTP/JSON gateway.** Tickets #061, #062, #093, #091. Each gets its own RFC if it grows beyond a single-ticket implementation.
- **Cost-based optimizer.** Out of v1 entirely.
9. Open questions
- **Strict vs. lenient arithmetic.** Should `1/0` raise an error or return NULL? Current behavior is NULL. PostgreSQL raises. Decide when wiring `kdb-planner` into the HTTP gateway, where client error reporting matters.
- **Adapter ownership.** Does `RowSource for Record` live inside `kdb-planner` (gated behind a feature flag) or in a third crate (`kdb-sql-runtime`) that depends on both? Leaning toward a third crate to keep `kdb-planner` substrate-free.
- **Schema translation.** Today the planner has its own `Column` / `TableSchema`. The adapter will need a one-way conversion from `kdb_record::Schema`. Worth a `TryFrom` impl in a shared module.
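For the first question, one possible shape is an explicit mode passed to the evaluator. The sketch below is only an illustration of that option, not a decision: `ArithMode`, the `divide` function, and the `EvalError` variants shown are hypothetical, and integer division is the only operation covered.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Int64(i64),
}

#[derive(Debug, PartialEq)]
pub enum EvalError {
    DivisionByZero,
    Overflow,
}

/// Hypothetical switch between today's behavior and a PostgreSQL-like one.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ArithMode {
    /// Current behavior described in this RFC: x / 0 evaluates to NULL.
    Lenient,
    /// Raise an error on x / 0, as PostgreSQL does.
    Strict,
}

pub fn divide(a: &Value, b: &Value, mode: ArithMode) -> Result<Value, EvalError> {
    match (a, b) {
        // A NULL operand yields NULL in this sketch, in either mode.
        (Value::Null, _) | (_, Value::Null) => Ok(Value::Null),
        (Value::Int64(_), Value::Int64(0)) => match mode {
            ArithMode::Lenient => Ok(Value::Null),
            ArithMode::Strict => Err(EvalError::DivisionByZero),
        },
        // checked_div also catches i64::MIN / -1, which would overflow.
        (Value::Int64(x), Value::Int64(y)) => x
            .checked_div(*y)
            .map(Value::Int64)
            .ok_or(EvalError::Overflow),
    }
}
```

A per-call mode keeps the lenient default for existing tests while letting a gateway opt into strict errors.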
10. References
- RFC 001 §9: kdb-next phase plan
- `kdb-sql` crate (#055): sqlparser-rs wrapper, `Statement`, `analyze`, `TenantRewriter`
- Tickets #056 / #057 / #058 / #059: done
- Tickets #060 / #061 / #062: pending