Ai modelreg

AI Modelreg — Model Registry

*rea:*Intelligence
*ath:*services/ai/modelreg
*ind:*Curated catalog of AI models (metadata, versions, benchmarks, curation tags)
*tatus:*v0.0.1 — sector bootstrapping (20260509)

Role in the stack

modelreg is the Koder-side answer to "what models can we choose from, and what do we know about each?". It owns model *etadata* identity, version, license, weights URL, benchmark scores, capability tags. It is the missing trust layer that lets runtime/, gateway/, and product UIs pick models with full information instead of falling back blindly on whatever Ollama happens to have cached.

It is the Koder analog of HuggingFace Hub and MLflow Model Registry — scoped to Koder needs (curation + benchmarks + product~~facing tags) and explicitly *eparated from runtime*(executor) and `kdb~~blob` (weight storage).

Boundary vs neighbors

services/ai/runtime consumes slug → weights URL + min-VRAM; modelreg never serves weights.
services/ai/eval produces benchmark rows; modelreg surfaces them per (model, version).
services/ai/training registers fine-tunes as new versions but does not store weights here.
infra/data/kdb-blob is the weight storage layer modelreg refers to.

Features (v1 target)

CRUD API with multi-axis search (tag, license, family, modality, maxvram, mincontext)
Model + version + tag entities backed by kdb-doc
Curation flow: 2~~curator approval + benchmark gate for `koder~~curated`
Eval integration: ingest benchmark rows, query cross-model
SDK helper engines/sdk/go/modelreg for runtime + gateway
Cache invalidation via SSE event stream
Web catalog UI at modelreg.koder.dev (deferred to v0.2)

Primary couplings

Producer	Relationship
`services/ai/eval`	Pushes benchmark results per (model, version)
Operators	Apply curation tags via Koder ID curator role

Consumer	Relationship
`services/ai/runtime`	Slug → weights URL resolution
`services/ai/gateway`	Per-tenant default routing from curated set
`services/ai/training`	Registers new versions (fine-tunes)
Product UIs (kode, agents, kortex)	Picker dropdowns scoped to curated tags

RFC and bootstrap

RFC: modelreg-RFC-001-foundations.kmd — *ccepted*20260509
Bootstrap ticket: services/ai/backlog/done/128-modelreg-bootstrap.md
Implementation tickets: services/ai/modelreg/backlog/pending/{001..005}

Selfhostedfirst analysis (5 gates)

Gate	Status	Notes
G1 Feature parity	pending	Skeleton phase; HF Hub schema subset chosen pragmatically
G2 Performance	pending	Read-heavy; query latency target p95 < 50ms
G3 Stability	pending	Pre-MVP
G4 Capability	pending	Hosting weights explicitly outofscope (kdb-blob)
G5 Critical-path readiness	pending	Unblocks informed model choice in runtimegatewaypicker UIs