AI Trace — Runtime Tracing
- **Area:** Intelligence
- **Path:** services/ai/trace
- **Kind:** OpenTelemetry-compatible runtime tracing for AI agents and chains
- **Status:** v0.0.1, sector bootstrapping (2026-05-09)
Role in the stack
trace is the runtime half of AI observability; services/ai/eval is the offline half. Modern AI agents make many calls per response (LLM → tool → LLM → tool → LLM). Without tracing, debugging a wrong answer in production is guesswork, the root cause of a latency spike is invisible, and prompt regressions can go unnoticed for weeks.
It is the Koder analog of LangSmith, Langfuse, Helicone, Arize Phoenix, and Honeycomb. It is built on the OpenTelemetry standard, so consumers use the stock OTel SDKs instead of a custom one, and it wraps Tempo: we own the API contract while Tempo handles storage scale.
Boundary vs neighbors
services/ai/eval runs offline scoring; this sector handles runtime spans.
services/ai/prompt is the source of truth for prompt versions; trace correlates spans to them.
services/ai/billing shares request_id correlation for a full end-to-end view.
infra/observe provides the Tempo backend and Grafana dashboards.
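
The correlations above depend on spans carrying shared attributes. As a minimal sketch, the attribute set for one LLM call span might look like the following. The `gen_ai.*` keys follow the OpenTelemetry GenAI semantic conventions; the `koder.*` key names, the function names, and the sample values are hypothetical, chosen for illustration and not taken from the RFC.

```python
from typing import Any


def llm_span_attributes(
    model: str,
    input_tokens: int,
    output_tokens: int,
    request_id: str,
    prompt_id: str,
    prompt_version: str,
) -> dict[str, Any]:
    """Build the attribute dict for one LLM call span."""
    return {
        # OpenTelemetry GenAI semantic-convention keys.
        "gen_ai.operation.name": "chat",
        "gen_ai.request.model": model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
        # Hypothetical koder.* extension keys (assumed names).
        "koder.request_id": request_id,
        "koder.prompt.id": prompt_id,
        "koder.prompt.version": prompt_version,
    }


def spans_for_request(
    spans: list[dict[str, Any]], request_id: str
) -> list[dict[str, Any]]:
    """Select the spans that share one request_id, as a billing join would."""
    return [s for s in spans if s.get("koder.request_id") == request_id]


attrs = llm_span_attributes("example-model", 120, 48, "req-123", "support-triage", "v7")
other = llm_span_attributes("example-model", 10, 5, "req-999", "support-triage", "v6")
matched = spans_for_request([attrs, other], "req-123")
```

Keeping the prompt version and request_id on the span itself means services/ai/prompt and services/ai/billing can join on plain attribute equality without a separate lookup table.
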
Features (v1 target)
- OTLP/gRPC + OTLP/HTTP ingest (standard ports 4317/4318)
- Tempo wrapper (we own API contract; Tempo handles storage)
- AI semantic conventions (gen_ai.* + koder.* extensions)
- Stratified head-based sampling (always keep errors)
- 100% sample default; per-tenant tunable
- Prompt+rating correlation overlays
- Minimal trace UI at trace.koder.dev (table + span tree + prompt panel)
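
The sampling bullets above can be sketched as a deterministic keep/drop decision. This is an illustration under stated assumptions, not the sector's implementation: the names `SamplerConfig`, `trace_bucket`, and `should_keep` are hypothetical, and the sketch assumes the error flag is already known at the point where the decision is made.

```python
import hashlib
from dataclasses import dataclass, field

DEFAULT_RATE = 1.0  # 100% sample default, per the feature list


@dataclass
class SamplerConfig:
    """Per-tenant sampling rates; tenants not listed use DEFAULT_RATE."""

    tenant_rates: dict[str, float] = field(default_factory=dict)

    def rate_for(self, tenant: str) -> float:
        return self.tenant_rates.get(tenant, DEFAULT_RATE)


def trace_bucket(trace_id: str) -> float:
    """Map a trace id to a stable value in [0, 1).

    Hashing the trace id makes every span of one trace get the same
    decision, so traces are kept or dropped as a whole.
    """
    digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
    # Keep 53 bits so the quotient is exactly representable and below 1.0.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def should_keep(
    config: SamplerConfig, tenant: str, trace_id: str, is_error: bool
) -> bool:
    """Always keep errors; otherwise sample at the tenant's rate."""
    if is_error:
        return True
    return trace_bucket(trace_id) < config.rate_for(tenant)


config = SamplerConfig(tenant_rates={"noisy-tenant": 0.0})
kept_default = should_keep(config, "any-tenant", "trace-a", is_error=False)
kept_noisy = should_keep(config, "noisy-tenant", "trace-a", is_error=False)
kept_error = should_keep(config, "noisy-tenant", "trace-a", is_error=True)
```

With the default rate of 1.0 every trace is kept, a tenant tuned down to 0.0 drops all non-error traces, and error traces survive regardless of the rate.
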
Primary couplings
| Producer | Relationship |
|---|---|
| services/ai/agents | Agent loop spans |
| services/ai/kode | Conversation + per-message spans |
| services/ai/gateway | LLM call spans |
| services/ai/imaging, video, synth, scene3d, embed | Job spans |
| services/ai/extract, classify, translate | Per-call spans |

| Consumer | Relationship |
|---|---|
| services/ai/eval | Reads traces for offline scoring |
| services/ai/prompt | Receives ratings linked to spans |
| services/ai/billing | request_id correlation |
| infra/observe | Tempo + Grafana backend |
| infra/data/kdb-time | Storage layer |
RFC and bootstrap
- RFC: trace-RFC-001-foundations.kmd (**Accepted** 2026-05-09)
- Bootstrap ticket: services/ai/backlog/done/127-trace-bootstrap.md
- Implementation tickets: services/ai/trace/backlog/pending/{001..005}
Self-hosted-first analysis (5 gates)
| Gate | Status | Notes |
|---|---|---|
| G1 Feature parity | pending | Skeleton phase; OTel + Tempo cover ingest + storage; UI matches LangSmith essentials |
| G2 Performance | pending | Target 5k spans/sec sustained per instance; query enrichment p95 < 50 ms |
| G3 Stability | pending | Pre-MVP |
| G4 Capability | pending | Generic distributed tracing deferred to infra/observe; AI scope covered |
| G5 Critical-path readiness | pending | Pre-MVP; debugging agents/kode in prd is the first concrete unblock |
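
The G2 latency target can be checked mechanically once enrichment timings are collected. A minimal sketch, assuming latencies are gathered as a list of milliseconds and using the nearest-rank percentile definition; the function names and sample data are hypothetical, and only the 50 ms budget comes from the table above.

```python
P95_BUDGET_MS = 50.0  # G2 target: query enrichment p95 < 50 ms


def percentile_nearest_rank(samples_ms: list[float], pct: int) -> float:
    """Return the pct-th percentile using the nearest-rank method."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples_ms)
    # Integer ceil(pct * n / 100), avoiding float rounding at the boundary.
    rank = -(-pct * len(ordered) // 100)
    return ordered[rank - 1]


def meets_g2_latency(samples_ms: list[float]) -> bool:
    """True when the observed p95 is strictly under the G2 budget."""
    return percentile_nearest_rank(samples_ms, 95) < P95_BUDGET_MS


uniform = [float(ms) for ms in range(1, 101)]  # 1 ms .. 100 ms
p95_uniform = percentile_nearest_rank(uniform, 95)
one_slow = meets_g2_latency([10.0] * 19 + [80.0])
two_slow = meets_g2_latency([10.0] * 18 + [80.0, 80.0])
```

In this sketch a single slow query out of twenty stays inside the budget, while two out of twenty push the p95 to 80 ms and fail the gate.
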