Kdb RFC 006 explicit er modeling
RFC006 — Explicit EntityRelationship Modeling in kdb-next
| Field | Value |
|---|---|
| Status | *ccepted + Implemented* |
| Author(s) | Rodrigo (with Claude as scribe) |
| Date | 2026 |
| Completed | 2026 |
| Target module | platform/kdb/next/crates/kdb-er (new) |
| Related | RFC |
1. Summary
kdbnext is designed as a multimodel database (RFC-001 §3): relational, objectoriented, document, keyvalue, and entity-relationship. The first four are either implemented or have clear roadmaps. This RFC defines the *ntityrelationship (ER) model*as a firstclass construct in kdb-next.
An ER model consists of *ntities*(named objects with typed attributes), *elationships*(cardinality-annotated links between entities), and optionally *nheritance*(ISA hierarchies). Today these concepts are expressible in SQL but the mapping is implicit. This RFC introduces:
- An ER definition language (
*.er.kmd) that can be stored alongsideapplication code and tracked in version control.
- A
kdb-ercrate that parses.er.kmdfiles into anErModelIR andgenerates the equivalent
CREATE TABLE/ALTER TABLE … ADD CONSTRAINTSQL for execution by the existingkdb-planner. - Integration with the migration system (future RFC) so that ER model
diffs produce migration scripts automatically.
2. Motivation
2.1 The current situation
To model a "User has many Posts" relationship today a developer writes:
CREATE TABLE users (id UUID PRIMARY KEY, email TEXT NOT NULL UNIQUE);
CREATE TABLE posts (id UUID PRIMARY KEY, title TEXT NOT NULL,
author_id UUID NOT NULL REFERENCES users(id));This is correct but loses semantic information: there is no way to know from the schema alone whether author_id is a 1:N ownership relationship, an optional attribution, or part of a composite M:N link without reading additional documentation.
2.2 What ER modeling adds
- *xplicit cardinality*—
1..1,0..1,1..*,0..*captured in themodel, not just enforced by a NOT NULL constraint.
- *utomatic junction tables*— M:N relationships generate the
junction table and its FK columns deterministically.
- *emantic validation*— the ER layer can reject circular mandatory
dependencies and other modeling errors before generating any SQL.
- *ound
trip fidelity*— an existing SQL schema can be reverseengineeredinto an ER model (future:
kdb-er introspect).
3. ER Definition Language
Files use the extension .er.kmd and contain one or more entity, relationship, and enum blocks.
3.1 Entity
entity <Name> {
<attr>: <type> [constraints...]
...
}*ype aliases*(maps to SQL types):
| ER type | SQL type |
|---|---|
id |
UUID DEFAULT gen_random_uuid() |
text |
TEXT |
text(n) |
VARCHAR(n) |
int |
BIGINT |
float |
DOUBLE PRECISION |
bool |
BOOLEAN |
date |
DATE |
timestamp |
TIMESTAMPTZ |
json |
JSONB |
bytes |
BYTEA |
<EnumName> |
<enum_name> (user-defined enum) |
*ttribute constraints*(combinable):
| Keyword | Meaning |
|---|---|
pk |
PRIMARY KEY |
unique |
UNIQUE |
not null |
NOT NULL |
null |
nullable (default) |
default <expr> |
DEFAULT <expr> |
check (<expr>) |
CHECK (<expr>) |
index |
Creates an index on this column |
*xample:*
entity User {
id: id pk
email: text unique not null
name: text not null
role: UserRole not null default 'member'
created_at: timestamp not null default now()
deleted_at: timestamp null
}Generated SQL:
CREATE TABLE users (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
email TEXT NOT NULL UNIQUE,
name TEXT NOT NULL,
role user_role NOT NULL DEFAULT 'member',
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
deleted_at TIMESTAMPTZ
);*aming convention:*entity names are PascalCase; generated table names are their snake_case plural equivalent (User → users, BlogPost → blog_posts). The plural can be overridden with @table "posts" at the top of the entity block.
3.2 Relationship
relationship <Name> {
<Entity1> <card1>..<card2> <Entity2>
[through: <JunctionTable>(<col1>, <col2>)]
[fk: <owning_entity>.<attr> -> <referenced_entity>.<attr>]
[on_delete: cascade | restrict | set_null | set_default | no_action]
[on_update: cascade | restrict | set_null | set_default | no_action]
}*ardinality notation:*
| Notation | Meaning |
|---|---|
1..1 |
exactly one |
0..1 |
zero or one (optional) |
1..* |
one or more |
0..* |
zero or more |
*ode-generation rules:*
| Pattern | Generated structure |
|---|---|
A 1..* -- B |
FK column on b → a; NOT NULL |
A 0..* -- B |
FK column on b → a; nullable |
A 0..1 -- B |
FK on b with UNIQUE constraint |
A 1..1 -- B |
FK on b with NOT NULL + UNIQUE |
A *..* B |
Junction table (required through: clause) |
*xample — 1:N:*
relationship AuthorsPost {
User 1..* -- Post
on_delete: cascade
}Generated: adds author_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE to the posts table (FK column name: <snake_case entity name>_id).
*xample — M:N with explicit junction:*
relationship PostTags {
Post *..* Tag
through: post_tags(post_id, tag_id)
}Generated:
CREATE TABLE post_tags (
post_id UUID NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
tag_id UUID NOT NULL REFERENCES tags(id) ON DELETE CASCADE,
PRIMARY KEY (post_id, tag_id)
);3.3 Enum
enum <Name> {
<Variant1>
<Variant2>
...
}Generated SQL: CREATE TYPE <name> AS ENUM ('<v1>', '<v2>', ...).
3.4 ISA / Inheritance
entity Admin extends User {
department: text not null
}Generation strategy (configurable per entity):
- *able
pertype*(default):adminstable with FK tousers(id)+ extra columns. - *able
perhierarchy* all columns inusers+typediscriminator column. - *nline* no separate table; extra columns merged into parent table.
3.5 Annotations
Annotations beginning with @ provide hints to the code generator:
@table "custom_table_name" -- override generated table name
@schema "app" -- emit inside a specific schema
@immutable -- no UPDATE allowed (enforced by trigger)
@audit -- add created_at / updated_at / created_by automatically4. Architecture
4.1 kdb-er crate
kdb-er/
src/
lib.rs -- public API: parse(), generate_sql(), diff()
parser.rs -- hand-written recursive-descent parser for .er.kmd
ir.rs -- ErModel, Entity, Attribute, Relationship, ErEnum
gen.rs -- ErModel → Vec<String> (CREATE TABLE SQL statements)
diff.rs -- (ErModel, ErModel) → Vec<AlterStatement>
validate.rs -- semantic checks (duplicate names, circular deps, etc.)
naming.rs -- PascalCase → snake_case, pluralisation tablekdb-er has no async, no runtime, no TiKV dependency — it is a pure transformation library: ER source text → SQL text. This keeps it fast and trivially testable.
4.2 Integration with kdb-planner
Generated SQL strings are passed through kdb-sql::parse() then kdb-planner::build_dml() and executed against a MutableRowSource, exactly as if a user had typed the DDL. No new execution paths needed.
4.3 Integration with the catalog
Each entity is registered in the catalog as a table through the normal Catalog::ensure_table path. The ER model is *ot*stored in the catalog in v0.1; roundtrip introspection (`kdber introspect`) is deferred to a future RFC.
4.4 CLI
# Validate an ER file
kdb-er validate schema.er.kmd
# Print generated SQL
kdb-er generate schema.er.kmd
# Apply to a running kdb instance
kdb-er apply --addr kdb-next-dev:18090 --tenant koder schema.er.kmd
# Diff two ER models (migration preview)
kdb-er diff old.er.kmd new.er.kmdThe kdb-er CLI binary lives in kdb-er/src/bin/kdb-er.rs.
5. Validation rules
The validator (validate.rs) enforces:
- Entity names are unique within a file.
- Relationship endpoints reference declared entities.
- M:N relationships must have an explicit
through:clause. - No entity may have two FK columns with the same name (ambiguous
reverse relationships require explicit
fk:overrides). - Circular mandatory dependencies (A 1..1 → B, B 1..1 → A) are rejected
with an error pointing to both entities.
- Attribute names are unique within an entity.
extendshierarchies are acyclic.
6. Scope and phasing
Phase 1 — Core (this RFC, immediate)
kdb-ercrate: parser, IR, SQL generator for entities + 1:N / M:N relationships- Entity →
CREATE TABLE - Relationship → FK column or junction table
- Enum →
CREATE TYPE … AS ENUM - Validation rules 1–6
- CLI:
validate+generate - 30+ unit tests (parser round-trips, SQL generation, validation errors)
Phase 2 — Integration
kdb-er apply— live DDL execution against kdb gateway@auditannotation (auto-adds timestamps)- CLI:
apply+diff
Phase 3 — Advanced (future RFCs)
- ISA / inheritance (
extends) kdb-er introspect— reverse-engineer SQL schema → ER model- Migration integration (ER diff → versioned migration scripts)
- Visual editor (separate tool)
7. Alternatives considered
| Option | Verdict |
|---|---|
| PlantUML ER syntax | Well |
| DBML | Good tooling but external ecosystem; adds a dependency |
| mermaid ER | Markdown-friendly but limited cardinality semantics |
| Prisma schema | Strong prior art; too opinionated about ORM layer |
.er.kmd (this RFC) |
Custom but minimal, fits KMD ecosystem, fully controlled |
8. Open questions
- *luralisation* English pluralisation (
User→users) isstraightforward; irregular forms (
Person→people) need a lookup table. Proposed: ship a small built-in table; override with@table.
- *K column naming* Default
<ref_entity_snake>_id; override withexplicit
fk:clause. Should the default be configurable per-file?
- *omposite PKs* Not covered in this RFC. Entity PKs are assumed
single
column UUID. Multicolumn PKs to be addressed in Phase 2.
- *rigger-based immutability*for
@immutable: requires thekdb-plannertrigger support (#115) to be implemented first.
9. Acceptance criteria
kdb-er validaterejects all 7 validation errors with clear messageskdb-er generateproduces valid PostgreSQL DDL for a schema with5+ entities, 2+ 1:N relationships, and 1 M:N relationship
- Generated DDL round-trips:
parse → generate → parse → generateisidempotent
cargo test -p kdb-erpasses with ≥ 30 tests, zero warnings- Ticket #049 closed
10. Implementation record (20260414)
All acceptance criteria met. Phase 1 and Phase 2 are complete.
Deliverables
| Crate | File | Description |
|---|---|---|
kdb-er |
src/parser.rs |
Hand.er.kmd |
kdb-er |
src/ir.rs |
ErModel, Entity, Attribute, Relationship, ErEnum IR types |
kdb-er |
src/gen.rs |
ErModel → Vec<String> (CREATE TABLE / FK SQL generation) |
kdb-er |
src/validate.rs |
Semantic validation — rules 1–7 |
kdb-er |
src/naming.rs |
PascalCase → snake_case, pluralisation with built-in lookup table |
kdb-er |
src/diff.rs |
(ErModel, ErModel) → ModelDiff; additive migration_sql() + summary() |
kdb-er |
src/lib.rs |
Public API: parse(), validate(), generate_sql(), diff() + re-exports |
kdb-er |
src/bin/kdb-er.rs |
CLI: validate, generate, diff (with --sql / --semicolon), apply (HTTP POST to /v1/sql) |
kdb-gateway |
src/sql_service.rs |
dev_schemas in SqlState — CREATE TABLE without real adapter registers schema; subsequent queries resolve column types from dev_schemas |
kdb-cli |
src/lib.rs |
CatalogClientHandle::ensure_table() + er_entity_to_proto_schema() conversion |
kdb-cli |
src/main.rs |
kdbctl er apply subcommand → gRPC Catalog.EnsureTable per entity |
kdb-cli |
tests/er_apply.rs |
4 integration tests: column mapping, register all entities, idempotency, tenant isolation |
Test counts (final)
| Crate | Tests | Notes |
|---|---|---|
kdb-er |
65+ | parser, IR, SQL gen, validation, naming, diff (18 diff tests) |
kdb-gateway |
50 | includes 2 new CREATE TABLE dev-mode tests |
kdb-cli |
8 | 4 catalogfingerprint + 4 erapply integration tests |
Deferred to Phase 3
- ISA / inheritance (
extends) — ticket #115 (trigger support) is a prerequisite kdb-er introspect— reverse-engineer SQL schema → ER model- Migration integration — ER diff → versioned migration scripts
- Visual editor