RFC-003 — Schema fingerprint history
| Field | Value |
|---|---|
| Status | **Accepted** |
| Author(s) | Rodrigo (with Claude as scribe) |
| Date | 2026 |
| Target module | platform/kdb/next/crates/kdb-record + kdb-gateway + kdb-cli |
| Related | RFC |
1. Summary
With backlog #045 done, the current schema fingerprint of any table is readable through `kdbctl catalog fingerprint` and the `LookupTableResponse.fingerprint` gRPC field. What the catalog does **not** record is **history**: the sequence of fingerprints a table has had across successive migrations. This RFC proposes an append-only per-table fingerprint log, a new streaming RPC `Catalog.GetFingerprintHistory`, and a new CLI subcommand `kdbctl catalog history` that together let an operator answer:
"When did the `acme.users` schema last change, and what fingerprint did it have before?"
without dropping to `git blame` on schema source files or scanning external migration logs.
2. Motivation
Three use cases, called out but deferred in #045:
- **Forensics.** When a production query starts failing with a fingerprint mismatch, operators want to know when the schema changed and whether the change was intended. A timestamp + actor in the catalog answers that directly.
- **Version pinning for pre-compiled clients.** A client binary that pins a fingerprint at build time needs to know whether the current fingerprint is compatible with the pinned one. Without history, it can only check equality. With history, the client can ask "has my pinned fingerprint ever been stamped on this table?" and refuse to talk to a table whose current fingerprint is unknown to it.
- **Rollback decisions.** If a bad migration ships, operators can look at the previous entry in the history log, read the previous schema shape, and decide whether rolling back is safe.
These are not hypothetical: #045 §"Use cases" lists each of them and explicitly defers them to this RFC.
3. Non-goals
- **Point-in-time restore.** The fingerprint log is metadata; it does not store row data. Rolling back data to a past fingerprint is a separate problem (PITR) and is not in scope.
- **Automatic migration of stamped rows.** If a schema changes from fingerprint `A` to `B`, the rows stamped with `A` remain stamped with `A`; this RFC does not propose rewriting them. Compatibility is the record layer's job, not the log's.
- **Content diffing.** The log stores schema snapshots, not pretty-printed diffs. A future follow-up may layer a diff visualizer on top, but the log itself is raw.
- **Multi-region consistency of the log.** Single region today, per the RFC-001 §3 non-goal list. When RFC-001 Phase 6 (multi-region) lands, the log inherits whatever model the catalog chooses for multi-region.
4. Data model
4.1 Log shape
One append-only log per `(tenant_id, table_id)`. Each entry:

    #[derive(Clone, Debug, PartialEq, Eq)]
    pub struct FingerprintHistoryEntry {
        /// Monotonically increasing generation number within this log.
        /// Starts at 1 on the first ensure_table; bumps on every
        /// schema change that produces a different fingerprint.
        pub generation: u64,
        /// The u64 fingerprint produced by Schema::fingerprint at the
        /// time this entry was written.
        pub fingerprint: u64,
        /// Unix epoch microseconds when this entry was committed.
        pub migrated_at: u64,
        /// Principal that committed the change. Pulled from the
        /// AuthContext scoped to the EnsureTable call. "<system>" if
        /// the change was made by an internal process with no token.
        pub migrated_by: String,
        /// Canonical bytes of the schema at this generation.
        /// Replaying `Schema::decode(&snapshot)` reproduces exactly
        /// the schema whose fingerprint is `self.fingerprint`.
        pub snapshot: Vec<u8>,
    }

4.2 Keyspace layout
Using the existing kdb-record keyspace convention:

    catalog:fingerprint_history:<tenant_id>:<table_id>:<generation>

`tenant_id` is a fixed-width big-endian u64; `table_id` is a fixed-width big-endian u32 (matching the existing proto types); `generation` is a fixed-width big-endian u64. The layout preserves per-log ascending order under the natural lexicographic order of KvCluster, so a forward scan over the prefix `catalog:fingerprint_history:<tenant>:<table>:` is already a chronological iteration. No secondary index is needed.
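The ordering claim can be checked with a minimal sketch. The helper names `log_prefix` and `entry_key`, and the choice of raw `:` bytes as separators between binary fields, are assumptions made for illustration; the RFC fixes only the logical layout, not the exact byte encoding kdb-record uses.

```rust
/// Textual prefix from the layout above.
const PREFIX: &[u8] = b"catalog:fingerprint_history:";

/// Scan prefix covering exactly one (tenant, table) log.
/// Hypothetical helper; not part of the kdb-record API.
pub fn log_prefix(tenant_id: u64, table_id: u32) -> Vec<u8> {
    let mut key = Vec::with_capacity(PREFIX.len() + 8 + 1 + 4 + 1 + 8);
    key.extend_from_slice(PREFIX);
    // Fixed-width big-endian fields keep byte order equal to numeric order.
    key.extend_from_slice(&tenant_id.to_be_bytes());
    key.push(b':');
    key.extend_from_slice(&table_id.to_be_bytes());
    key.push(b':');
    key
}

/// Full key for one history entry. Hypothetical helper.
pub fn entry_key(tenant_id: u64, table_id: u32, generation: u64) -> Vec<u8> {
    let mut key = log_prefix(tenant_id, table_id);
    key.extend_from_slice(&generation.to_be_bytes());
    key
}
```

Because every field is fixed-width, generation 255 sorts before generation 256 under plain byte comparison. A little-endian or variable-width decimal encoding would not have that property.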
4.3 Retention
Default retention: **unlimited.** The expected volume is low (schema migrations are rare, tens per year per table in practice) and each entry is small (~1 KB including the snapshot). At 100M tenants × 10 tables × 10 entries per year, that is 10^10 entries per year, so the worst-case annual footprint is on the order of 10 TB cluster-wide, which is acceptable.
A per-tenant retention knob (`max_history_entries`) is a configurable follow-up but is **not shipped in v1.** If an operator hits a volume concern before the knob lands, manual pruning (`kdbctl catalog history --prune-older-than <timestamp>`) is the escape hatch; see §8.
5. Write path
The append happens **inside the same transaction** as the schema change in `Catalog::ensure_table`. Pseudocode:

    pub async fn ensure_table(&self, tenant: Tenant, schema: Schema)
        -> RecordResult<(TableId, Schema, u64)>
    {
        let mut tx = self.kv.begin_tx().await?;
        let existing = self.lookup_table_tx(&mut tx, tenant, &schema.name).await?;
        let next_fp = schema.fingerprint();
        match existing {
            Some((tid, current, cur_fp)) if cur_fp == next_fp => {
                // Idempotent re-ensure, no history entry.
                tx.rollback().await?;
                Ok((tid, current, cur_fp))
            }
            Some((tid, _, _)) => {
                // Schema changed. Append a new history entry.
                let generation = self.next_generation(&mut tx, tenant, tid).await?;
                // `..` stands for the remaining fields, elided in this pseudocode.
                let entry = FingerprintHistoryEntry { generation, fingerprint: next_fp, .. };
                self.write_history_entry(&mut tx, tenant, tid, &entry).await?;
                self.update_table_schema(&mut tx, tenant, tid, &schema).await?;
                tx.commit().await?;
                Ok((tid, schema, next_fp))
            }
            None => {
                // First-time creation. Allocate table_id and write
                // generation=1 entry.
                let tid = self.allocate_table_id(&mut tx).await?;
                let entry = FingerprintHistoryEntry { generation: 1, fingerprint: next_fp, .. };
                self.write_history_entry(&mut tx, tenant, tid, &entry).await?;
                self.insert_table(&mut tx, tenant, tid, &schema).await?;
                tx.commit().await?;
                Ok((tid, schema, next_fp))
            }
        }
    }

Invariants (enforced by the transaction's serializable isolation):
- **Atomicity.** The history entry and the catalog row move together. There is no window where the catalog thinks the table is at fingerprint `B` but the history still ends at `A`.
- **Monotonicity.** `generation` is strictly increasing per log. Two concurrent `ensure_table` calls with different schemas against the same table serialize on the catalog row, so exactly one wins and the other re-enters the match arm above.
- **Idempotence.** A re-`ensure_table` with an unchanged schema rolls back; no history entry is written. Without this, a busy client redoing `ensure_table` on every boot would flood the log.
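The three match arms and the monotonicity and idempotence invariants can be exercised with a small in-memory model. This is only a sketch: `ToyCatalog` and its reduced `Entry` type are hypothetical stand-ins, there is no transaction or KV store, and the snapshot, timestamp, and principal fields are left out.

```rust
use std::collections::BTreeMap;

/// Reduced stand-in for FingerprintHistoryEntry (illustration only).
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Entry {
    pub generation: u64,
    pub fingerprint: u64,
}

/// Hypothetical in-memory catalog: one append-only log per table name.
#[derive(Default)]
pub struct ToyCatalog {
    logs: BTreeMap<String, Vec<Entry>>,
}

impl ToyCatalog {
    /// Mirrors the three arms of ensure_table and returns the generation
    /// that is current after the call.
    pub fn ensure_table(&mut self, table: &str, fingerprint: u64) -> u64 {
        let log = self.logs.entry(table.to_string()).or_default();
        let last = log.last().map(|e| (e.generation, e.fingerprint));
        match last {
            // Idempotent re-ensure: same fingerprint, nothing appended.
            Some((generation, fp)) if fp == fingerprint => generation,
            // Schema changed: append with the next generation.
            Some((generation, _)) => {
                let next = generation + 1;
                log.push(Entry { generation: next, fingerprint });
                next
            }
            // First-time creation: generation 1.
            None => {
                log.push(Entry { generation: 1, fingerprint });
                1
            }
        }
    }

    /// Entries for one table in ascending generation order.
    pub fn history(&self, table: &str) -> &[Entry] {
        self.logs.get(table).map(|v| v.as_slice()).unwrap_or(&[])
    }
}
```

Calling `ensure_table` with fingerprints A, A, B, A in turn yields generations 1, 1, 2, 3 in this model, and the log ends up with three entries whose fingerprints are A, B, A.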
6. Read path — gRPC
New RPC on `kdb.v1.catalog.Catalog`:

    rpc GetFingerprintHistory(GetFingerprintHistoryRequest)
        returns (stream FingerprintHistoryEntryProto);

    message GetFingerprintHistoryRequest {
      uint64 tenant_id = 1;
      uint32 table_id = 2;
      // Optional: minimum generation to return. 0 = from the start.
      uint64 since_generation = 3;
      // Optional: page size hint. 0 = server default (64).
      uint32 limit = 4;
      // Optional: return in descending order (most recent first).
      // Default is ascending.
      bool descending = 5;
    }

    message FingerprintHistoryEntryProto {
      uint64 generation = 1;
      uint64 fingerprint = 2;
      uint64 migrated_at = 3;  // unix microseconds
      string migrated_by = 4;
      kdb.v1.schema.Schema snapshot = 5;
    }

Streaming rather than unary because the log is unbounded in the long run. The server enforces a hard cap of 1024 entries per request regardless of `limit`; the client pages via `since_generation`.
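Client-side paging over `since_generation` might look like the sketch below. It assumes ascending order, that `since_generation` is an inclusive lower bound, and that a page shorter than the requested size marks the end of the log; none of these is spelled out by the RFC. `fetch_all` and the `fetch_page` callback (standing in for one `GetFingerprintHistory` call) are hypothetical names.

```rust
/// Reduced entry type for illustration.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Entry {
    pub generation: u64,
    pub fingerprint: u64,
}

/// Values stated in this section: default page size and server hard cap.
pub const SERVER_DEFAULT_LIMIT: u32 = 64;
pub const SERVER_HARD_CAP: u32 = 1024;

/// Drains a whole log by repeated calls to `fetch_page(since, limit)`.
pub fn fetch_all<F>(mut fetch_page: F, limit: u32) -> Vec<Entry>
where
    F: FnMut(u64, u32) -> Vec<Entry>,
{
    // 0 means "server default"; anything above the cap is clamped.
    let page_size = if limit == 0 {
        SERVER_DEFAULT_LIMIT
    } else {
        limit.min(SERVER_HARD_CAP)
    };
    let mut all = Vec::new();
    let mut since = 0u64;
    loop {
        let page = fetch_page(since, page_size);
        // An empty page means there is nothing left to read.
        let Some(last) = page.last() else { break };
        // Resume just past the last generation already seen.
        since = last.generation + 1;
        let short = (page.len() as u32) < page_size;
        all.extend(page);
        if short {
            break;
        }
    }
    all
}
```

With a 150-entry log and a page size of 64, this loop makes three calls (64, 64, and 22 entries) and stops on the short page.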
Error mapping:
- `NOT_FOUND`: tenant or table unknown.
- `PERMISSION_DENIED`: the `AuthContext` does not carry a catalog scope for this tenant.
- `INVALID_ARGUMENT`: `since_generation` is larger than the current max generation by more than a sanity threshold (likely a bug).
7. Read path — CLI
New subcommand under `kdbctl catalog`:

    kdbctl catalog history \
      --tenant acme \
      --table users \
      [--since <generation|duration>] \
      [--limit 64] \
      [--desc] \
      [--json]

Default output (human):

    gen  fingerprint         migrated_at          migrated_by
    ---  ------------------  -------------------  -----------------
    1    0x3c7d91b6f2a8e0d4  2026-03-01 09:14:02  rodrigo@koder.dev
    2    0x8a04f91e2b5c0d17  2026-03-18 15:42:11  cicd-bot
    3    0xde1a44f0e77bb902  2026-04-09 11:07:55  rodrigo@koder.dev

With `--json`, emit one entry per line (NDJSON) so the output can be piped into `jq` or an alerting pipeline. The `snapshot` field is base64-encoded in the JSON form.
Resolving `--since` as a duration (`--since 7d`) is a local convenience: the CLI translates it to a generation by first calling `GetFingerprintHistory` with `since_generation = 0`, `limit = 1024`, `descending = true` and finding the first entry older than now minus the duration. This keeps the server API agnostic to wall-clock time interpretation.
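The duration-to-generation translation can be sketched as a pure function over the descending page. The name `resolve_since` is hypothetical, and two behaviours are assumptions rather than anything the RFC states: the returned value is one past the first entry older than the cutoff, and 0 ("from the start") is returned when no fetched entry is older than the cutoff.

```rust
/// Reduced entry type: only the fields the resolution needs.
#[derive(Clone, Debug)]
pub struct Entry {
    pub generation: u64,
    /// Unix epoch microseconds, as in the data model.
    pub migrated_at: u64,
}

/// `entries_desc` must be ordered most recent first.
/// Returns the `since_generation` to use for the follow-up query.
pub fn resolve_since(entries_desc: &[Entry], now_micros: u64, window_micros: u64) -> u64 {
    // Saturate so a window longer than the epoch offset cannot underflow.
    let cutoff = now_micros.saturating_sub(window_micros);
    match entries_desc.iter().find(|e| e.migrated_at < cutoff) {
        // Everything after the first too-old entry falls inside the window.
        Some(older) => older.generation + 1,
        // Nothing in the page predates the cutoff (assumption: start at 0).
        None => 0,
    }
}
```

If a log holds more than 1024 entries that all fall inside the window, a single descending page is not enough and the CLI would need to keep paging; this sketch does not handle that case.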
8. Pruning
Not v1. When the need lands, expose `Catalog.PruneFingerprintHistory(tenant_id, table_id, keep_last N | before_generation G)` and a mirror `kdbctl catalog history --prune`. The log stays append-only from the caller's perspective; pruning is an explicit administrative operation, never automatic.
Pruning never touches generation numbers: if you prune entries 1..=5, the next `ensure_table` still writes generation N+1, where N is the highest generation ever written to the log, even if that entry has itself been pruned. This preserves the monotonicity invariant from §5.
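One way to keep generations monotonic across pruning is to store a high-water mark separately from the entries, so the next generation never depends on what is still in the log. The sketch below is an assumption about how that could be done, not the RFC's design; `PrunableLog` is a hypothetical type, and `before_generation` is treated as an exclusive upper bound on what is removed.

```rust
/// Hypothetical log that remembers the highest generation ever assigned.
#[derive(Default)]
pub struct PrunableLog {
    /// Highest generation ever written; unaffected by pruning.
    high_water: u64,
    /// Remaining (generation, fingerprint) pairs in ascending order.
    entries: Vec<(u64, u64)>,
}

impl PrunableLog {
    /// Appends a new entry and returns its generation.
    pub fn append(&mut self, fingerprint: u64) -> u64 {
        self.high_water += 1;
        self.entries.push((self.high_water, fingerprint));
        self.high_water
    }

    /// Drops every entry whose generation is below `before_generation`.
    pub fn prune_before(&mut self, before_generation: u64) {
        self.entries
            .retain(|&(generation, _)| generation >= before_generation);
    }

    /// Generations still present in the log.
    pub fn generations(&self) -> Vec<u64> {
        self.entries.iter().map(|&(generation, _)| generation).collect()
    }
}
```

Even after every entry is pruned, the next append continues from the high-water mark instead of restarting at 1.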
9. Pre-existing tables
Tables that exist before this feature ships have **no history entries.** Two options for how to handle them on first read:
**Option A (lazy backfill).** On the first `ensure_table`, `LookupTable`, or `GetFingerprintHistory` that touches a legacy table, synthesize a generation-1 entry with `migrated_at = 0`, `migrated_by = "<pre-history>"` and write it in a background task. The entry reflects the current schema, not the schema at whatever time the table was actually created, which we do not know.
**Option B (explicit migration).** Ship a `kdbctl catalog migrate-legacy-history` admin command. Idempotent; does nothing if history entries already exist.
**Decision: Option A.** Lazy backfill minimizes operator work and the semantic loss ("pre-history" entry) is explicit. The background task is cheap (one KV write per legacy table) and runs with the `<system>` principal.
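The synthesized entry for Option A can be sketched as follows. The helper `synthesize_legacy_entry` and its signature are hypothetical; only the field values (`generation = 1`, `migrated_at = 0`, `migrated_by = "<pre-history>"`) come from the text above.

```rust
/// Same shape as the entry type in §4.1.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct FingerprintHistoryEntry {
    pub generation: u64,
    pub fingerprint: u64,
    pub migrated_at: u64,
    pub migrated_by: String,
    pub snapshot: Vec<u8>,
}

/// Marker principal for synthesized entries, as named in Option A.
pub const PRE_HISTORY: &str = "<pre-history>";

/// Hypothetical helper: returns the generation-1 entry to write for a
/// legacy table, or None when the log already has entries, so that a
/// repeated backfill is a no-op.
pub fn synthesize_legacy_entry(
    existing_entries: usize,
    current_fingerprint: u64,
    current_snapshot: &[u8],
) -> Option<FingerprintHistoryEntry> {
    if existing_entries > 0 {
        return None;
    }
    Some(FingerprintHistoryEntry {
        generation: 1,
        fingerprint: current_fingerprint,
        // The real creation time is unknown, so the timestamp is zero.
        migrated_at: 0,
        migrated_by: PRE_HISTORY.to_string(),
        snapshot: current_snapshot.to_vec(),
    })
}
```

In a real implementation the emptiness check and the write would have to happen in the same transaction, otherwise two concurrent first reads could both try to write generation 1.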
10. Rollback
If a migration is reverted (schema goes from B back to A), the history log does **not** rewrite. A new entry is appended with the same fingerprint as an earlier entry, meaning the log may contain `[1:fp=A, 2:fp=B, 3:fp=A]`. This is intentional:
- The log is append-only; overwriting entries would violate §5.
- Rollbacks are operationally meaningful and deserve to be visible in history.
- Clients that care about "has this fingerprint ever been stamped?" can scan for equality against any historical entry, not just the latest.
A future `kdbctl catalog history` may mark entries whose fingerprint duplicates an earlier one with a `(rollback from gen N)` annotation; that is cosmetic and can ship after v1.
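Both the "ever stamped" check and duplicate detection reduce to simple scans over `(generation, fingerprint)` pairs. The function names below are hypothetical, and reporting the most recent earlier generation with the same fingerprint is only one possible reading of the annotation; the RFC does not define what N refers to.

```rust
use std::collections::HashMap;

/// True if `pinned` appears anywhere in the log, not only as the latest entry.
/// `log` holds (generation, fingerprint) pairs in ascending generation order.
pub fn ever_stamped(log: &[(u64, u64)], pinned: u64) -> bool {
    log.iter().any(|&(_, fp)| fp == pinned)
}

/// For each entry, the generation of the most recent earlier entry that
/// carried the same fingerprint, or None if the fingerprint is new.
pub fn duplicate_of(log: &[(u64, u64)]) -> Vec<Option<u64>> {
    let mut last_seen: HashMap<u64, u64> = HashMap::new();
    log.iter()
        // HashMap::insert returns the previous value for the key, which is
        // exactly the earlier generation with this fingerprint.
        .map(|&(generation, fp)| last_seen.insert(fp, generation))
        .collect()
}
```

For the log `[1:fp=A, 2:fp=B, 3:fp=A]`, only generation 3 is flagged, and it points back at generation 1.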
11. Storage impact
Per-entry size estimate:
- 8 bytes generation + 8 bytes fingerprint + 8 bytes timestamp
- ~64 bytes principal string (typical)
- ~500 bytes canonical schema snapshot (typical; varies with column count)
- Keyspace overhead (~40 bytes per KV pair in TiKV)
Total: **~650 bytes per entry** in the typical case.
At the RFC-001 §5 target scale of **100M tenants × 10 tables × 10 migrations/year**, that is 10^10 entries per year, so the steady-state footprint is **~6.5 TB/year cluster-wide.** Negligible vs. the row data footprint it shares a substrate with.
This sits comfortably inside the RFC-001 §6.8 cardinality budget because the log is not a metric — it consumes KV storage, not Prometheus cardinality.
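Multiplying the estimate out makes the scale explicit. The constants below are taken from the per-entry list and the RFC-001 target scale quoted in this section; the function and constant names are made up for the example.

```rust
/// Per-entry components from the estimate, in bytes.
pub const FIXED_FIELDS: u64 = 8 + 8 + 8; // generation + fingerprint + timestamp
pub const PRINCIPAL: u64 = 64;
pub const SNAPSHOT: u64 = 500;
pub const KV_OVERHEAD: u64 = 40;

/// Rounded per-entry figure used for the headline number.
pub const ENTRY_BYTES_ROUNDED: u64 = 650;

/// Exact sum of the listed components.
pub fn component_sum() -> u64 {
    FIXED_FIELDS + PRINCIPAL + SNAPSHOT + KV_OVERHEAD
}

/// Bytes written per year at a given scale.
pub fn annual_bytes(
    tenants: u64,
    tables_per_tenant: u64,
    migrations_per_year: u64,
    bytes_per_entry: u64,
) -> u64 {
    tenants * tables_per_tenant * migrations_per_year * bytes_per_entry
}
```

The components sum to 628 bytes, which rounds up to the ~650 figure. At 100M tenants × 10 tables × 10 migrations per year, 650 bytes per entry gives 6.5 × 10^12 bytes per year.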
12. Implementation plan
Phase 1 — record layer (1 week)
- Add `FingerprintHistoryEntry` to `kdb-record/src/schema.rs` alongside `Schema::fingerprint`.
- Extend `Catalog::ensure_table` per §5 to write a history entry inside the transaction.
- Add `Catalog::get_fingerprint_history(tenant, table, since, limit, desc)` returning a `Vec<FingerprintHistoryEntry>`.
- Unit tests in `kdb-record` covering: first-time creation writes gen 1, schema change writes gen 2, idempotent re-ensure writes nothing, rollback writes a duplicate-fingerprint entry.
Phase 2 — wire + gateway (3 days)
- Add `GetFingerprintHistory` to `kdb.v1.catalog.proto`.
- Implement the streaming handler in `kdb-gateway/src/catalog_service.rs`, wiring the 1024-entry hard cap, scope-based auth, and `since_generation` paging.
- Integration test in `kdb-gateway/tests/` using the existing `with_in_process_catalog` helper pattern.
Phase 3 — kdb-cli (3 days)
- Add `CatalogClientHandle::get_fingerprint_history` in `kdb-cli/src/lib.rs` returning `Vec<FingerprintHistoryEntry>`.
- Add `kdbctl catalog history` subcommand with `--since`, `--limit`, `--desc`, `--json`, `--tenant`, `--table`.
- Integration test in `kdb-cli/tests/` that seeds multiple schema versions and asserts the history comes back in order.
Phase 4 — legacy backfill + docs (2 days)
- Background task in the gateway that walks the tenant/table catalog and synthesizes `<pre-history>` entries for any table without history.
- Extend `docs/technical/schema-fingerprints.md` with a "History" section cross-linking this RFC.
- Update the `FingerprintLookup` struct in `kdb-cli` to carry an optional `generation` field so `kdbctl catalog fingerprint --verbose` can show the current generation too.
**Total:** ~2–3 weeks of focused work. Splittable across phases.
13. Open questions
- **Should `--since` accept a fingerprint value** as an alternative to a generation or duration? It is convenient for "show me everything after the fingerprint my client had pinned". Proposed: yes, add as a third shape in the CLI only (server API stays numeric).
- **Do we expose the snapshot bytes in the CLI by default?** They are large and hex-dumping them is useless. Proposed: suppress by default, expose behind `--with-snapshot`.
- **Is the `migrated_by` field always derivable?** Calls that arrive via the legacy HMAC gateway auth (`kdb-gateway/src/auth.rs`) only carry `tenant_id` and `role`, not a principal identity. Proposed: record `hmac:tenant=<id>` as the principal for those calls and wait for Phase 2.2.d (see RFC-002 §5) to provide real identities via Koder ID JWT claims.
14. References
- `docs/rfcs/RFC-001-kdb-next-hyperscale-architecture.md`: §6.6 auth + rate limiting; §6.8 cardinality budget.
- `docs/technical/schema-fingerprints.md`: public surface of the current-fingerprint API (backlog #045).
- `docs/technical/auth.md`: authentication layers (backlog #048).
- `backlog/pending/051-rfc-schema-fingerprint-history.kmd`: the parent ticket for this RFC.
- `crates/kdb-record/src/schema.rs`: where the `FingerprintHistoryEntry` type will live.
- `crates/kdb-cli/src/lib.rs`: where the CLI-side client will live.
15. Decision log
| Date | Decision | Notes |
|---|---|---|
| 2026 | Drafted | Claude scribe, awaiting Rodrigo review |
| 2026 | Accepted | No structural changes from Draft. Option A (lazy backfill) confirmed. Implementation sub-tickets opened: #124–#127. |