RFC-004: kdb-next SQL planner & executor

| Field | Value |
|---|---|
| Status | **Implemented** (retroactive) |
| Author(s) | Rodrigo (with Claude as scribe) |
| Date | 2026-04-12 |
| Target module | platform/kdb/next/crates/kdb-planner |
| Related | RFC-001 §9, `kdb-sql` (#055), tickets #056–#059 (done), #060–#062 (pending) |

1. Summary

kdb-sql (#055) gives us a sqlparser-rs-backed AST for PostgreSQL SQL. kdb-planner is the crate that compiles that AST into a tree of physical operators (Plan / DmlPlan) and executes it against a pluggable RowSource. The crate is **fully decoupled from kdb-record**: it operates on its own decoded Value type and a Vec<Value> row, so it can be unit-tested without touching async, the substrate, or any Protobuf machinery.

This RFC documents the architecture as built. SELECT (#056), DML (#057), JOINs (#058), and subqueries + non-recursive CTEs (#059) have landed. Aggregates / GROUP BY (#060), window functions (#061), and set operations (#062) are tracked separately.

2. Design principles

  1. **Decoupled from storage.** The planner does not know Record, Schema, or prost. A RowSource trait abstracts every read; a MutableRowSource trait, which extends RowSource, abstracts every write. The default in-memory implementation (InMemoryTables) is what tests use; the real adapter that implements RowSource for Record is a future ticket and is the only piece that needs to know about the substrate.

  2. **Decoded rows, not wire bytes.** Rows flowing through operators are Vec<Value>, where Value is an enum of the eight types kdb-next ships in v0.1 (Text, Int64, Uint64, Bool, Bytes, Timestamp, Float64, Date) plus Null. Encoding/decoding is the adapter's job; the executor never touches Protobuf.

  3. **Local schema, not kdb_record::Schema.** A small TableSchema / Column pair lives in the planner crate. It mirrors the relevant fields of kdb_record::Schema, but the planner can be compiled and tested standalone. Translating between the two belongs in the adapter.

  4. **One node per logical operator.** No sharing of Plan for DML: DmlPlan is a separate enum with Insert / Update / Delete variants because the executor needs a mutable row source and returns a row count plus an optional RETURNING projection.

  5. **Heuristic, not cost-based.** v1 is rule-based: column references resolve to positional indices at build time, and the executor walks the tree top-down with no rewrites. Pushdown to indexes is deferred to a follow-up RFC once the adapter exists.

3. Crate layout

crates/kdb-planner/
└── src/
    ├── lib.rs        # public re-exports
    ├── value.rs      # DataType + Value
    ├── schema.rs     # TableSchema, Column
    ├── plan.rs       # Plan, DmlPlan, Expr, JoinType, AggFn, …
    ├── build.rs      # AST → Plan/DmlPlan compiler
    ├── eval.rs       # Expr evaluator (with optional sub-source for subqueries)
    ├── exec.rs       # Plan walker + RowSource trait + InMemoryTables
    ├── aggregate.rs  # GROUP BY / aggregate execution (#060 wip)
    └── tests.rs      # 72 unit tests across all of the above

4. Core types

Value and DataType (value.rs)

pub enum DataType { Text, Int64, Uint64, Bool, Bytes, Timestamp, Float64, Date }

pub enum Value {
    Null,
    Text(String),
    Int64(i64), Uint64(u64),
    Bool(bool),
    Bytes(Vec<u8>),
    Timestamp(i64),  // epoch ms UTC
    Float64(f64),
    Date(i32),       // days since 1970-01-01
}

A free function cmp_values(&Value, &Value) -> Ordering defines the SQL-style ordering used by Sort (NULLs first/last per SortKey).
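The ordering contract can be illustrated with a small, self-contained sketch. It assumes a trimmed-down Value and a hypothetical SortKey with `column`, `descending`, and `nulls_first` fields; the real SortKey layout and the cross-type ordering rule are not specified in this RFC, so treat both as placeholders.

```rust
use std::cmp::Ordering;

// Trimmed-down stand-in for the planner's Value (assumption: only three variants shown).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Int64(i64),
    Float64(f64),
    Text(String),
}

// Hypothetical sort key; the field names are illustrative, not the crate's actual layout.
pub struct SortKey {
    pub column: usize,
    pub descending: bool,
    pub nulls_first: bool,
}

// Fixed rank per variant so that mixed-type comparisons still yield a total order.
fn rank(v: &Value) -> u8 {
    match v {
        Value::Null => 0,
        Value::Int64(_) => 1,
        Value::Float64(_) => 2,
        Value::Text(_) => 3,
    }
}

// Total ordering over values. NULL sorts lowest here; cmp_rows overrides that per key.
pub fn cmp_values(a: &Value, b: &Value) -> Ordering {
    match (a, b) {
        (Value::Int64(x), Value::Int64(y)) => x.cmp(y),
        (Value::Float64(x), Value::Float64(y)) => x.total_cmp(y),
        (Value::Text(x), Value::Text(y)) => x.cmp(y),
        _ => rank(a).cmp(&rank(b)),
    }
}

// Compare two rows key by key, applying direction and NULL placement from each key.
pub fn cmp_rows(a: &[Value], b: &[Value], keys: &[SortKey]) -> Ordering {
    for k in keys {
        let (x, y) = (&a[k.column], &b[k.column]);
        let ord = match (x, y) {
            (Value::Null, Value::Null) => Ordering::Equal,
            (Value::Null, _) => if k.nulls_first { Ordering::Less } else { Ordering::Greater },
            (_, Value::Null) => if k.nulls_first { Ordering::Greater } else { Ordering::Less },
            _ => {
                let o = cmp_values(x, y);
                if k.descending { o.reverse() } else { o }
            }
        };
        if ord != Ordering::Equal {
            return ord;
        }
    }
    Ordering::Equal
}

// sort_by is a stable sort, so rows that compare equal keep their source order.
pub fn sort_rows(rows: &mut Vec<Vec<Value>>, keys: &[SortKey]) {
    rows.sort_by(|a, b| cmp_rows(a, b, keys));
}

fn main() {
    let mut rows = vec![
        vec![Value::Int64(2)],
        vec![Value::Null],
        vec![Value::Int64(1)],
    ];
    sort_rows(&mut rows, &[SortKey { column: 0, descending: false, nulls_first: false }]);
    println!("{:?}", rows);
}
```

Keeping NULL placement in the row comparator rather than in cmp_values means that a descending sort does not silently flip where NULLs land.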

Plan (plan.rs)

pub enum Plan {
    TableScan       { table, output_columns },
    SubqueryScan    { plan, output_columns },
    NestedLoopJoin  { left, right, on, join_type, left_width, right_width },
    Filter          { predicate, child },
    Sort            { keys, child },
    Limit           { limit, offset, child },
    Project         { items, child },
    Aggregate       { group_keys, aggs, child },
}

Column references inside Expr are **positional** (Expr::Column(usize)), resolved at build time against the child's output column list. Joins flatten left then right, and downstream operators index into that concatenation.
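As a rough illustration of the flattening rule, the sketch below runs a left outer nested-loop join over two in-memory row sets and compares one column from each side by its position in the concatenated row. The function name `left_join_eq` and its equality-only predicate are assumptions made for brevity; the real node carries a general `on` expression.

```rust
// Trimmed-down stand-in for the planner's Value (assumption: three variants only).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Int64(i64),
    Text(String),
}

pub type Row = Vec<Value>;

// Left outer nested-loop join on equality of two positions in the flattened row.
// `left_col` and `right_col` both index into left ++ right, so a right-side
// column at position p is addressed as left_width + p.
pub fn left_join_eq(
    left: &[Row],
    right: &[Row],
    left_col: usize,
    right_col: usize,
    left_width: usize,
    right_width: usize,
) -> Vec<Row> {
    let mut out = Vec::new();
    for l in left {
        debug_assert_eq!(l.len(), left_width);
        let mut matched = false;
        for r in right {
            // Flatten left then right; the predicate sees the concatenation.
            let mut joined = l.clone();
            joined.extend(r.iter().cloned());
            let a = &joined[left_col];
            let b = &joined[right_col];
            // NULL = NULL is not true in SQL, so a NULL key never matches.
            if *a != Value::Null && a == b {
                out.push(joined);
                matched = true;
            }
        }
        if !matched {
            // NULL-pad the unmatched right side using the width stored on the node.
            let mut padded = l.clone();
            padded.extend(std::iter::repeat(Value::Null).take(right_width));
            out.push(padded);
        }
    }
    out
}

fn main() {
    let left = vec![
        vec![Value::Int64(1), Value::Text("a".to_string())],
        vec![Value::Int64(2), Value::Text("b".to_string())],
    ];
    let right = vec![vec![Value::Int64(1), Value::Text("x".to_string())]];
    // Join left column 0 against right column 0, which sits at flattened index 2.
    for row in left_join_eq(&left, &right, 0, 2, 2, 2) {
        println!("{:?}", row);
    }
}
```

Carrying `right_width` on the node matters when the right input is empty: the padding width cannot be inferred from rows that do not exist.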

DmlPlan (plan.rs)

pub enum DmlPlan {
    Insert { table, table_columns, column_indices, values: Vec<Vec<Expr>>, returning },
    Update { table, table_columns, assignments: Vec<(usize, Expr)>, predicate, returning },
    Delete { table, table_columns, predicate, returning },
}

table_columns is the full storage-order layout, so the executor can build a positionally correct Vec<Value> even when the SQL only mentions a subset of columns.
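A minimal sketch of that row construction, assuming the VALUES expressions have already been evaluated and that unmentioned columns default to NULL (column defaults are not described in this RFC). The helper name `build_insert_row` is hypothetical.

```rust
// Trimmed-down stand-in for the planner's Value (assumption: three variants only).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Int64(i64),
    Text(String),
}

pub type Row = Vec<Value>;

// Place each evaluated value at its storage-order position.
// `table_columns` is the full layout; `column_indices[i]` says where `values[i]` goes.
pub fn build_insert_row(
    table_columns: &[&str],
    column_indices: &[usize],
    values: Vec<Value>,
) -> Result<Row, String> {
    if column_indices.len() != values.len() {
        return Err(format!(
            "INSERT has {} target columns but {} values",
            column_indices.len(),
            values.len()
        ));
    }
    // Start from an all-NULL row of full table width (assumed default for omitted columns).
    let mut row = vec![Value::Null; table_columns.len()];
    for (&idx, value) in column_indices.iter().zip(values) {
        let slot = row
            .get_mut(idx)
            .ok_or_else(|| format!("column index {} out of range", idx))?;
        *slot = value;
    }
    Ok(row)
}

fn main() {
    // INSERT INTO t (note, id) VALUES ('x', 7) against a table laid out as (id, name, note).
    let layout = ["id", "name", "note"];
    let row = build_insert_row(
        &layout,
        &[2, 0],
        vec![Value::Text("x".to_string()), Value::Int64(7)],
    );
    println!("{:?}", row);
}
```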

RowSource (exec.rs)

pub trait RowSource {
    fn scan(&self, table: &str) -> ExecResult<Vec<Row>>;
}

pub trait MutableRowSource: RowSource {
    fn append(&mut self, table: &str, row: Row) -> ExecResult<()>;
    fn replace_all(&mut self, table: &str, rows: Vec<Row>) -> ExecResult<()>;
}

InMemoryTables is the default implementation. UPDATE/DELETE are currently full rewrites via replace_all; an indexed in-place mutation is a future optimization.
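The full-rewrite strategy could look roughly like the sketch below. The trait signatures are taken from this section; the HashMap-backed storage, the `create_table` helper, the `ExecError::UnknownTable` variant, and the `delete_where` function are assumptions made to keep the example self-contained.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Int64(i64),
    Text(String),
}

pub type Row = Vec<Value>;

// Hypothetical error shape; the real ExecError variants are not listed in this RFC.
#[derive(Debug, PartialEq)]
pub enum ExecError {
    UnknownTable(String),
}

pub type ExecResult<T> = Result<T, ExecError>;

pub trait RowSource {
    fn scan(&self, table: &str) -> ExecResult<Vec<Row>>;
}

pub trait MutableRowSource: RowSource {
    fn append(&mut self, table: &str, row: Row) -> ExecResult<()>;
    fn replace_all(&mut self, table: &str, rows: Vec<Row>) -> ExecResult<()>;
}

// Assumed storage layout: one Vec<Row> per table name.
#[derive(Default)]
pub struct InMemoryTables {
    tables: HashMap<String, Vec<Row>>,
}

impl InMemoryTables {
    // Illustrative helper for registering an empty table.
    pub fn create_table(&mut self, name: &str) {
        self.tables.entry(name.to_string()).or_default();
    }

    fn rows_mut(&mut self, table: &str) -> ExecResult<&mut Vec<Row>> {
        self.tables
            .get_mut(table)
            .ok_or_else(|| ExecError::UnknownTable(table.to_string()))
    }
}

impl RowSource for InMemoryTables {
    fn scan(&self, table: &str) -> ExecResult<Vec<Row>> {
        self.tables
            .get(table)
            .cloned()
            .ok_or_else(|| ExecError::UnknownTable(table.to_string()))
    }
}

impl MutableRowSource for InMemoryTables {
    fn append(&mut self, table: &str, row: Row) -> ExecResult<()> {
        self.rows_mut(table)?.push(row);
        Ok(())
    }

    fn replace_all(&mut self, table: &str, rows: Vec<Row>) -> ExecResult<()> {
        *self.rows_mut(table)? = rows;
        Ok(())
    }
}

// DELETE as a full rewrite: scan, drop matching rows, write the remainder back.
// Returns the number of rows removed.
pub fn delete_where<S: MutableRowSource>(
    source: &mut S,
    table: &str,
    matches: impl Fn(&Row) -> bool,
) -> ExecResult<usize> {
    let rows = source.scan(table)?;
    let before = rows.len();
    let kept: Vec<Row> = rows.into_iter().filter(|r| !matches(r)).collect();
    let affected = before - kept.len();
    source.replace_all(table, kept)?;
    Ok(affected)
}

fn main() {
    let mut db = InMemoryTables::default();
    db.create_table("t");
    for i in 1..=3 {
        db.append("t", vec![Value::Int64(i)]).unwrap();
    }
    let removed = delete_where(&mut db, "t", |r| r[0] == Value::Int64(2)).unwrap();
    println!("removed {} row(s), {} left", removed, db.scan("t").unwrap().len());
}
```

The cost is O(table size) per statement regardless of how many rows match, which is the trade-off the "future optimization" note refers to.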

5. Build / compile flow

Statement (kdb-sql AST)
        │
        ▼
build()  / build_with_catalog()  / build_dml()
        │
        ▼
Plan / DmlPlan       (positionally resolved)
        │
        ▼
execute() / execute_dml(source: &dyn RowSource)
        │
        ▼
Vec<Row>  /  DmlResult { rows_affected, returning }

build_with_catalog takes a Catalog = HashMap<String, TableSchema> to resolve multi-table FROM clauses, JOINs, and subqueries against more than one table. build is the single-table convenience wrapper.
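To make the catalog's role concrete, here is a sketch of how a multi-table FROM clause might be turned into a flat scope and how a column name could then be resolved to a positional index. The `from_scope` and `resolve` helpers, the `table` constructor, and the `AmbiguousColumn` variant are hypothetical; only the Catalog alias and the unknown-table / unknown-column errors come from this RFC.

```rust
use std::collections::HashMap;

pub struct Column {
    pub name: String,
}

pub struct TableSchema {
    pub name: String,
    pub columns: Vec<Column>,
}

pub type Catalog = HashMap<String, TableSchema>;

// Hypothetical variants; AmbiguousColumn in particular is an assumption.
#[derive(Debug, PartialEq)]
pub enum PlanError {
    UnknownTable(String),
    UnknownColumn(String),
    AmbiguousColumn(String),
}

// Illustrative constructor for a schema with untyped columns.
pub fn table(name: &str, columns: &[&str]) -> TableSchema {
    TableSchema {
        name: name.to_string(),
        columns: columns.iter().map(|c| Column { name: c.to_string() }).collect(),
    }
}

// Flatten the FROM tables, left to right, into (table, column) pairs.
// The position of a pair in this list is the index used by Expr::Column.
pub fn from_scope(catalog: &Catalog, tables: &[&str]) -> Result<Vec<(String, String)>, PlanError> {
    let mut scope = Vec::new();
    for &t in tables {
        let schema = catalog
            .get(t)
            .ok_or_else(|| PlanError::UnknownTable(t.to_string()))?;
        for c in &schema.columns {
            scope.push((schema.name.clone(), c.name.clone()));
        }
    }
    Ok(scope)
}

// Resolve an optionally qualified column name to its position in the scope.
pub fn resolve(
    scope: &[(String, String)],
    qualifier: Option<&str>,
    name: &str,
) -> Result<usize, PlanError> {
    let mut found: Option<usize> = None;
    for (i, (table, column)) in scope.iter().enumerate() {
        let table_ok = qualifier.map_or(true, |q| q == table.as_str());
        if table_ok && column.as_str() == name {
            if found.is_some() {
                return Err(PlanError::AmbiguousColumn(name.to_string()));
            }
            found = Some(i);
        }
    }
    found.ok_or_else(|| PlanError::UnknownColumn(name.to_string()))
}

fn main() {
    let mut catalog = Catalog::new();
    catalog.insert("users".to_string(), table("users", &["id", "name"]));
    catalog.insert("orders".to_string(), table("orders", &["id", "user_id"]));

    let scope = from_scope(&catalog, &["users", "orders"]).unwrap();
    println!("{:?}", resolve(&scope, Some("orders"), "user_id"));
    println!("{:?}", resolve(&scope, None, "id"));
}
```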

Errors are typed (PlanError, EvalError, ExecError), derive thiserror::Error, and propagate to the caller as Err values. There is no panic path in normal operation: unknown table, unknown column, and type mismatch all return Err.

6. Execution semantics

  • **Three-valued logic** for WHERE / ON: NULL predicates filter the row out (predicate_matches returns false for NULL).

  • **Outer joins** NULL-pad the unmatched side using right_width / left_width from the join node.

  • **Subqueries** are non-correlated in v1 and materialized once per IN (...) / EXISTS (...) evaluation. Correlated subqueries need a per-row context and are out of scope.

  • **CTEs** are non-recursive and inlined as SubqueryScan at build time. Recursive CTEs are deferred.

  • **Sort stability:** comparisons go through cmp_values; ties keep source order (Rust's sort_by is stable).

  • **Division by zero** in arithmetic yields Value::Null. This deviates from PostgreSQL, which raises a division_by_zero error; a future ticket will add a strict mode.
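Two of the rules above, NULL-rejecting predicates and lenient division, are small enough to sketch directly. The function `predicate_matches` is named in this RFC; `and3`, `div`, and the String error type are illustrative assumptions standing in for the real evaluator and EvalError.

```rust
// Trimmed-down stand-in for the planner's Value (assumption: three variants only).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Bool(bool),
    Int64(i64),
}

// Decode a value into SQL's three truth states: Some(true), Some(false), None (unknown).
fn tri(v: &Value) -> Result<Option<bool>, String> {
    match v {
        Value::Bool(b) => Ok(Some(*b)),
        Value::Null => Ok(None),
        other => Err(format!("type mismatch: expected boolean, got {:?}", other)),
    }
}

// Three-valued AND: false dominates, otherwise any unknown makes the result unknown.
pub fn and3(a: &Value, b: &Value) -> Result<Value, String> {
    let result = match (tri(a)?, tri(b)?) {
        (Some(false), _) | (_, Some(false)) => Some(false),
        (Some(true), Some(true)) => Some(true),
        _ => None,
    };
    Ok(result.map(Value::Bool).unwrap_or(Value::Null))
}

// A row passes WHERE / ON only when the predicate is definitely true.
pub fn predicate_matches(v: &Value) -> bool {
    matches!(v, Value::Bool(true))
}

// Lenient integer division: NULL operands and a zero divisor both yield NULL.
pub fn div(a: &Value, b: &Value) -> Result<Value, String> {
    match (a, b) {
        (Value::Null, _) | (_, Value::Null) => Ok(Value::Null),
        // checked_div returns None for a zero divisor (and for i64::MIN / -1 overflow).
        (Value::Int64(x), Value::Int64(y)) => {
            Ok(x.checked_div(*y).map(Value::Int64).unwrap_or(Value::Null))
        }
        _ => Err(format!("type mismatch: {:?} / {:?}", a, b)),
    }
}

fn main() {
    // WHERE (1 / 0) IS-like check: the NULL result never lets a row through.
    let unknown = and3(&Value::Null, &Value::Bool(true)).unwrap();
    println!("{:?} passes filter: {}", unknown, predicate_matches(&unknown));
    println!("{:?}", div(&Value::Int64(1), &Value::Int64(0)));
}
```

A strict mode would only need to change the zero-divisor arm of `div` to return an error instead of NULL.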

7. Test coverage

72 unit tests in tests.rs covering:

  • SELECT: column projection, WHERE comparisons, ORDER BY, LIMIT/OFFSET, AND/OR/NOT, IS NULL, IN list, BETWEEN, arithmetic.

  • DML: INSERT (full row, partial columns, multi-row), UPDATE (assignments, with/without WHERE, RETURNING), DELETE.

  • JOINs: INNER / LEFT / RIGHT / CROSS, multi-table FROM, joins inside subqueries.

  • Subqueries: IN (subquery), NOT IN (subquery), EXISTS, NOT EXISTS, derived-table FROM.

  • CTEs: single CTE, multiple CTEs, CTE referenced in FROM and in JOIN.
  • Errors: unknown table, unknown column, type mismatch.

cargo test -p kdb-planner is green at the time of writing.

8. What's intentionally not here

  • **RowSource for Record.** The adapter that bridges kdb-planner and kdb-record lives in a follow-up ticket. The boundary is deliberately narrow: a single impl block plus a small schema-translation helper.

  • **Index pushdown.** Filters that match a primary key or unique index should turn into PrimaryKeyLookup / IndexLookup operators. Not in v1; the executor scans every row.

  • **Aggregate finalization (#060).** aggregate.rs exists with execute_aggregate(), and the AggFn/AggKind types are wired into Plan::Aggregate, but the build path from the GROUP BY AST to the planner is partial. Closing #060 will document the final shape.

  • **Window functions, set operations, prepared statements, HTTP/JSON gateway.** Tickets #061, #062, #093, #091. Each gets its own RFC if it grows beyond a single-ticket implementation.

  • **Cost-based optimizer.** Out of v1 entirely.

9. Open questions

  1. **Strict vs. lenient arithmetic.** Should 1/0 raise an error or return NULL? Current behavior is NULL. PostgreSQL raises. Decide when wiring kdb-planner into the HTTP gateway, where client error reporting matters.

  2. **Adapter ownership.** Does RowSource for Record live inside kdb-planner (gated behind a feature flag) or in a third crate (kdb-sql-runtime) that depends on both? Leaning toward a third crate to keep kdb-planner substrate-free.

  3. **Schema translation.** Today the planner has its own Column/TableSchema. The adapter will need a one-way conversion from kdb_record::Schema. Worth a TryFrom impl in a shared module.
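For the schema-translation question, a TryFrom impl might take roughly the following shape. The layout of kdb_record::Schema is not described in this RFC, so `RecordSchema`, `RecordField`, and the string type names below are purely hypothetical stand-ins; only the eight DataType variants and the Column/TableSchema names come from the text above.

```rust
use std::convert::TryFrom;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DataType { Text, Int64, Uint64, Bool, Bytes, Timestamp, Float64, Date }

#[derive(Debug, PartialEq)]
pub struct Column {
    pub name: String,
    pub data_type: DataType,
}

#[derive(Debug, PartialEq)]
pub struct TableSchema {
    pub name: String,
    pub columns: Vec<Column>,
}

// Hypothetical stand-ins for kdb_record::Schema; the real type may look quite different.
pub struct RecordField {
    pub name: String,
    pub type_name: String,
}

pub struct RecordSchema {
    pub name: String,
    pub fields: Vec<RecordField>,
}

// Map an assumed substrate type name onto the planner's DataType.
fn parse_type(name: &str) -> Option<DataType> {
    match name {
        "text" => Some(DataType::Text),
        "int64" => Some(DataType::Int64),
        "uint64" => Some(DataType::Uint64),
        "bool" => Some(DataType::Bool),
        "bytes" => Some(DataType::Bytes),
        "timestamp" => Some(DataType::Timestamp),
        "float64" => Some(DataType::Float64),
        "date" => Some(DataType::Date),
        _ => None,
    }
}

// One-way conversion: substrate schema in, planner schema out.
// Fails on the first field whose type the planner does not support.
impl TryFrom<&RecordSchema> for TableSchema {
    type Error = String;

    fn try_from(schema: &RecordSchema) -> Result<Self, Self::Error> {
        let columns = schema
            .fields
            .iter()
            .map(|f| {
                parse_type(&f.type_name)
                    .map(|data_type| Column { name: f.name.clone(), data_type })
                    .ok_or_else(|| {
                        format!("unsupported type `{}` on column `{}`", f.type_name, f.name)
                    })
            })
            .collect::<Result<Vec<_>, String>>()?;
        Ok(TableSchema { name: schema.name.clone(), columns })
    }
}

fn main() {
    let rec = RecordSchema {
        name: "users".to_string(),
        fields: vec![
            RecordField { name: "id".to_string(), type_name: "int64".to_string() },
            RecordField { name: "email".to_string(), type_name: "text".to_string() },
        ],
    };
    println!("{:?}", TableSchema::try_from(&rec));
}
```

Making the conversion fallible keeps unsupported substrate types visible at the adapter boundary instead of surfacing later as evaluation errors.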

10. References

  • RFC-001 §9: kdb-next phase plan
  • kdb-sql crate (#055): sqlparser-rs wrapper, Statement, analyze, TenantRewriter
  • Tickets #056 / #057 / #058 / #059: done
  • Tickets #060 / #061 / #062: pending
