AI Memory — Long-term Agent Memory
- *Area:* Intelligence
- *Path:* `services/ai/memory`
- *Kind:* Self-hosted episodic memory + semantic recall (per user/tenant)
- *Status:* v0.0.2. `#008` rescoped + shipped 2026-05-11. `ServiceEmbedder` now speaks the real `services/ai/embed` `/v1/embed/text` HTTP API (was a stub returning `ErrEmbedServiceUnavailable`). `[recall.backend] = "service"` enables semantic recall over the embed sector; `InlineEmbedder` remains as the deterministic test fallback. Original BGE local-ONNX path moved to `#012` (gated by embed#009).
Role in the stack
`memory` is the foundation for persistent agents. Without a memory service, every product reinvents continuity (Kode `CURRENT.md`, Kortex indexing, ad-hoc agent state) and loses context across sessions. This sector fills that gap.
It is the Koder analog of Mem0 / MemGPT / Letta and Anthropic's Memory Tool: self-hosted, tenant-isolated, and integrated with `services/ai/embed` for semantic recall and `infra/data/kdb-{doc,vector}` for storage.
Features (v1 target)
- Episodic write/read API
- Semantic recall (top-K + score)
- Tenant isolation (handler-enforced)
- Forgetting policies: TTL, decay, explicit erase, GDPR sweep
- Pluggable embedding backend (inline ONNX or `services/ai/embed`)
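The semantic recall and tenant isolation features can be illustrated with a minimal sketch. It assumes a brute-force cosine scan over an in-memory slice, which is not how the service is described as storing vectors (that role belongs to `infra/data/kdb-vector`). The `Episode` and `Hit` types, their field names, and the `Recall` signature are hypothetical and chosen only for illustration.

```go
package main

import (
	"math"
	"sort"
)

// Episode is a hypothetical stored memory record; field names are assumptions.
type Episode struct {
	ID       string
	TenantID string
	Text     string
	Vector   []float32
}

// Hit pairs an episode with its similarity score against the query.
type Hit struct {
	Episode Episode
	Score   float64
}

// cosine returns the cosine similarity of a and b. It returns 0 when the
// lengths differ, the vectors are empty, or either vector has zero norm.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		x, y := float64(a[i]), float64(b[i])
		dot += x * y
		na += x * x
		nb += y * y
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// Recall returns up to k episodes belonging to tenantID, best score first.
// Episodes from other tenants are skipped before any scoring happens.
func Recall(store []Episode, tenantID string, query []float32, k int) []Hit {
	hits := make([]Hit, 0, len(store))
	for _, ep := range store {
		if ep.TenantID != tenantID {
			continue // tenant isolation: never score foreign episodes
		}
		hits = append(hits, Hit{Episode: ep, Score: cosine(query, ep.Vector)})
	}
	sort.SliceStable(hits, func(i, j int) bool { return hits[i].Score > hits[j].Score })
	if k < 0 {
		k = 0
	}
	if len(hits) > k {
		hits = hits[:k]
	}
	return hits
}
```

Filtering by tenant before scoring, rather than after ranking, means a foreign episode can never occupy one of the top-K slots.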
Primary couplings
| Consumer | Relationship |
|---|---|
| `services/ai/agents` | Reads pre |
| `services/ai/kode` | Replaces ad-hoc session continuity files |
| `services/ai/runtime` | Optional middleware read-through |
| `services/ai/embed` | Vector backbone |
| `infra/data/kdb-doc` | Episode metadata |
| `infra/data/kdb-vector` | Vector index |
RFC and bootstrap
- RFC: `memory-RFC-001-foundations.kmd`, *Accepted* 2026-05-09
- Bootstrap ticket: `services/ai/backlog/done/129-memory-bootstrap.md`
- Implementation tickets:
  - `done/`: `001` (OpenAPI), `002` (skeleton), `003` (tenant scope + audit), `004` (embed integration), `005` (forgetting: TTL/decay/erase/GDPR), `006` (real JWT via `engines/sdk/go/auth.JWKSValidator`), `008` (rescoped: `ServiceEmbedder` HTTP client to `services/ai/embed` shipped 2026-05-11; BGE local ONNX moved to `#012`), `009` (decay auto-prune + quarantine), `010` (bench harness inline+memory)
  - `pending/`: `007` (kdb-doc + kdb-vector swap, blocked on kdb#710-#713), `011` (bench at 100K against kdb+BGE, blocked on `007`+`012`), `012` (BGE local ONNX embedder, split from `#008`, blocked on `services/ai/embed#009` ONNX runtime)
Recent changes
- *2026-05-11 (#008 rescope + ship)*: `ServiceEmbedder` rewritten from a 3-line stub returning `ErrEmbedServiceUnavailable` to a full HTTP client of `services/ai/embed` `/v1/embed/text` (~165 lines). Options pattern: `WithEmbedderBearer`, `WithEmbedderIntent`, `WithEmbedderModel`, `WithEmbedderTimeout`. Dim verification on every response prevents silent contract drift. `[recall.backend] = "service"` now consumes the new config: `embed_url`, `embed_token`, `embed_model`, `embed_dim`, `embed_timeout_ms`. 10 tests SE1-SE10 (happy, 503, 401, bearer, dim mismatch, count mismatch, empty endpoint, empty inputs, model override, timeout). `InlineEmbedder` retained as test/bench fallback. Closes the AI cross-sector loop: memory now joins `cache#003` semantic + `embed#007` HTTP cache as a consumer of an AI sibling sector. Original BGE local-ONNX path split off as `#012` (blocked-by embed#009).
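The options pattern and per-response dimension check described above might look roughly like the following sketch. Only the option names `WithEmbedderBearer`, `WithEmbedderModel`, and `WithEmbedderTimeout` come from this page; their signatures, the struct fields, the `checkVectors` helper, and the JSON field names (`inputs`, `model`, `vectors`) are assumptions and may not match the real `services/ai/embed` contract. `WithEmbedderIntent` is left out because its semantics are not described here.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"time"
)

// ServiceEmbedder is a sketch of an HTTP embedding client; fields are assumed.
type ServiceEmbedder struct {
	endpoint string
	dim      int
	bearer   string
	model    string
	timeout  time.Duration
}

// EmbedderOption mutates a ServiceEmbedder during construction.
type EmbedderOption func(*ServiceEmbedder)

func WithEmbedderBearer(token string) EmbedderOption {
	return func(e *ServiceEmbedder) { e.bearer = token }
}

func WithEmbedderModel(model string) EmbedderOption {
	return func(e *ServiceEmbedder) { e.model = model }
}

func WithEmbedderTimeout(d time.Duration) EmbedderOption {
	return func(e *ServiceEmbedder) { e.timeout = d }
}

// NewServiceEmbedder applies options over defaults and rejects an empty endpoint.
func NewServiceEmbedder(endpoint string, dim int, opts ...EmbedderOption) (*ServiceEmbedder, error) {
	if endpoint == "" {
		return nil, errors.New("embedder: empty endpoint")
	}
	e := &ServiceEmbedder{endpoint: endpoint, dim: dim, timeout: 5 * time.Second}
	for _, opt := range opts {
		opt(e)
	}
	return e, nil
}

// checkVectors enforces the count and dimension contract on a response.
func checkVectors(vecs [][]float32, wantCount, wantDim int) error {
	if len(vecs) != wantCount {
		return fmt.Errorf("embedder: got %d vectors, want %d", len(vecs), wantCount)
	}
	for i, v := range vecs {
		if len(v) != wantDim {
			return fmt.Errorf("embedder: vector %d has dim %d, want %d", i, len(v), wantDim)
		}
	}
	return nil
}

// Hypothetical wire types; the real request and response bodies may differ.
type embedRequest struct {
	Inputs []string `json:"inputs"`
	Model  string   `json:"model,omitempty"`
}

type embedResponse struct {
	Vectors [][]float32 `json:"vectors"`
}

// Embed posts texts to the endpoint and verifies the shape of every response.
func (e *ServiceEmbedder) Embed(ctx context.Context, texts []string) ([][]float32, error) {
	if len(texts) == 0 {
		return nil, errors.New("embedder: empty inputs")
	}
	body, err := json.Marshal(embedRequest{Inputs: texts, Model: e.model})
	if err != nil {
		return nil, err
	}
	ctx, cancel := context.WithTimeout(ctx, e.timeout)
	defer cancel()
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, e.endpoint, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	if e.bearer != "" {
		req.Header.Set("Authorization", "Bearer "+e.bearer)
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("embedder: unexpected status %d", resp.StatusCode)
	}
	var out embedResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return nil, err
	}
	if err := checkVectors(out.Vectors, len(texts), e.dim); err != nil {
		return nil, err
	}
	return out.Vectors, nil
}
```

Checking count and dimension on every response, instead of once at startup, is what surfaces a model swap on the embed side as an explicit error rather than as silently degraded recall.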
Self-hosted-first analysis (5 gates)
| Gate | Status | Notes |
|---|---|---|
| G1 Feature parity | partial | Episodic + semantic + 4 forgetting policies (TTL/decay/erase/GDPR) + decay auto-prune with quarantine; reflection not yet |
| G2 Performance | partial | Inline+memory @ 10K: p50 5.3 ms, p99 10.5 ms (≈10× under target); 100K + kdb-vector + BGE pending `007`+`012` |
| G3 Stability | pending | Pre-MVP |
| G4 Capability | partial | Episodic + semantic + forgetting (incl. auto-prune); no reflection yet |
| G5 Critical-path readiness | pending | Pre-MVP; agents/kode can adopt once v1 ships |
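The TTL and decay policies with quarantine mentioned in G1 could be modeled as in the sketch below. It is illustrative only: it assumes exponential decay with a half-life, and it treats "quarantine" as setting an episode aside once its strength falls below a threshold. The `Policy`, `Verdict`, and `Strength` names and all parameters are hypothetical, and the service's actual decay curve and thresholds are not specified on this page.

```go
package main

import (
	"math"
	"time"
)

// Verdict is the outcome of evaluating one episode against a policy.
type Verdict int

const (
	Keep       Verdict = iota // episode stays in the recall set
	Quarantine                // episode is set aside pending pruning
	Expire                    // episode has outlived its TTL
)

// Policy holds hypothetical forgetting parameters.
type Policy struct {
	TTL             time.Duration // hard lifetime; 0 disables TTL
	HalfLife        time.Duration // age at which strength halves; 0 disables decay
	QuarantineBelow float64       // strength threshold for quarantine
}

// Strength returns 2^(-age/halfLife), a value in (0, 1].
// A non-positive half-life means no decay, so the strength stays at 1.
func Strength(age, halfLife time.Duration) float64 {
	if halfLife <= 0 {
		return 1
	}
	if age < 0 {
		age = 0
	}
	return math.Exp2(-float64(age) / float64(halfLife))
}

// Evaluate applies TTL first, then the decay threshold.
func (p Policy) Evaluate(age time.Duration) Verdict {
	if p.TTL > 0 && age >= p.TTL {
		return Expire
	}
	if Strength(age, p.HalfLife) < p.QuarantineBelow {
		return Quarantine
	}
	return Keep
}
```

With a 24-hour half-life and a threshold of 0.25, for example, an episode would be quarantined once it is more than 48 hours old, since its strength halves twice over that span.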
Performance baseline
Latest bench numbers (and per-metric trend) live in `registries/perf-baseline.md`. Re-run via `make bench-full` from `services/ai/memory/backend/`.