AI Trace — Runtime Tracing

*Area:* Intelligence
*Path:* services/ai/trace
*Kind:* OpenTelemetry-compatible runtime tracing for AI agents and chains
*Status:* v0.0.1, sector bootstrapping (2026-05-09)

Role in the stack

trace is the runtime half of AI observability; services/ai/eval is the offline half. Modern AI agents make many calls per response (LLM → tool → LLM → tool → LLM). Without tracing, debugging a wrong answer in production is guesswork, the root cause of latency stays invisible, and prompt regressions can go unnoticed for weeks.
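To make the latency point concrete, the sketch below models one agent response as a flat list of spans linked by parent IDs and picks out the slowest leaf call. The `Span` class, the span names, and the durations are hypothetical and chosen for illustration only; they are not the trace data model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    """Hypothetical minimal span: one timed step in an agent response."""
    span_id: str
    parent_id: Optional[str]
    name: str
    duration_ms: float


def slowest_leaf(spans: List[Span]) -> Span:
    """Return the leaf span (a span with no children) that took longest."""
    parent_ids = {s.parent_id for s in spans if s.parent_id is not None}
    leaves = [s for s in spans if s.span_id not in parent_ids]
    return max(leaves, key=lambda s: s.duration_ms)


# Made-up example: LLM -> tool -> LLM inside one agent turn.
spans = [
    Span("a", None, "agent.turn", 2350.0),
    Span("b", "a", "llm.call", 400.0),
    Span("c", "a", "tool.search", 1700.0),
    Span("d", "a", "llm.call", 250.0),
]
culprit = slowest_leaf(spans)
print(culprit.name, culprit.duration_ms)
```

Without the per-call spans, only the 2350 ms total for the turn would be visible, not which step accounts for most of it.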

It is the Koder analog of LangSmith, Langfuse, Helicone, Arize Phoenix and Honeycomb. It is built on the OpenTelemetry standard, so consumers use the existing OTel SDKs instead of a custom one, and it wraps Tempo: we own the API contract while Tempo handles storage scale.
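Producers would normally emit spans through an OpenTelemetry SDK. To show roughly what that standard looks like on the wire, the sketch below builds a single-span OTLP/HTTP JSON body with the standard library only. The service name, span name, scope name, and attribute values are assumptions made for the example. Sending the body (an HTTP POST to the `/v1/traces` path of an OTLP/HTTP receiver) is left out.

```python
import json
import secrets
import time


def otlp_attr(key, value):
    """Encode one attribute in OTLP JSON form (string and int values only)."""
    if isinstance(value, int):
        # 64-bit integers are carried as decimal strings in OTLP JSON.
        return {"key": key, "value": {"intValue": str(value)}}
    return {"key": key, "value": {"stringValue": str(value)}}


def build_payload(service, span_name, attrs, start_ns, end_ns):
    """Build an OTLP/HTTP JSON trace export body containing one span."""
    span = {
        "traceId": secrets.token_hex(16),  # 16 random bytes -> 32 hex chars
        "spanId": secrets.token_hex(8),    # 8 random bytes -> 16 hex chars
        "name": span_name,
        "kind": 3,  # SPAN_KIND_CLIENT
        "startTimeUnixNano": str(start_ns),
        "endTimeUnixNano": str(end_ns),
        "attributes": [otlp_attr(k, v) for k, v in attrs.items()],
    }
    return {
        "resourceSpans": [{
            "resource": {"attributes": [otlp_attr("service.name", service)]},
            "scopeSpans": [{"scope": {"name": "example"}, "spans": [span]}],
        }]
    }


now = time.time_ns()
payload = build_payload(
    "services/ai/gateway",  # assumed service.name value
    "llm.call",             # assumed span name
    {"gen_ai.request.model": "example-model", "gen_ai.usage.input_tokens": 128},
    now - 400_000_000,      # pretend the call started 400 ms ago
    now,
)
body = json.dumps(payload)
print(body[:80])
```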

Boundary vs neighbors

services/ai/eval runs offline scoring; this sector handles runtime spans.
services/ai/prompt is the source of truth for prompt versions; trace correlates spans to them.
services/ai/billing shares request_id correlation for a full end-to-end view.
infra/observe provides the Tempo backend and Grafana dashboards.
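The correlations listed above could look something like the following sketch, which groups spans by request_id, collects the prompt versions they reference, and attaches the matching billing record. Every field name here (`request_id`, `prompt_version`, `cost_usd`, and so on) is a hypothetical stand-in; the actual schemas of trace, prompt, and billing are not defined in this document.

```python
from collections import defaultdict

# Made-up sample records; field names are assumptions for illustration.
spans = [
    {"span_id": "s1", "request_id": "req-1", "prompt_version": "greeting@v3", "duration_ms": 420},
    {"span_id": "s2", "request_id": "req-1", "prompt_version": "greeting@v3", "duration_ms": 180},
    {"span_id": "s3", "request_id": "req-2", "prompt_version": "greeting@v4", "duration_ms": 950},
]
billing = [
    {"request_id": "req-1", "cost_usd": 0.004},
    {"request_id": "req-2", "cost_usd": 0.011},
]


def end_to_end_view(spans, billing):
    """Group spans by request_id and attach the matching billing cost."""
    cost = {row["request_id"]: row["cost_usd"] for row in billing}
    view = defaultdict(lambda: {"span_ids": [], "prompt_versions": set(),
                                "total_ms": 0, "cost_usd": None})
    for span in spans:
        entry = view[span["request_id"]]
        entry["span_ids"].append(span["span_id"])
        entry["prompt_versions"].add(span["prompt_version"])
        entry["total_ms"] += span["duration_ms"]  # summed span time, not wall clock
        entry["cost_usd"] = cost.get(span["request_id"])
    return dict(view)


view = end_to_end_view(spans, billing)
print(view["req-1"])
```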

Features (v1 target)

OTLP/gRPC + OTLP/HTTP ingest (standard ports 4317/4318)
Tempo wrapper (we own API contract; Tempo handles storage)
AI semantic conventions (gen_ai.* + koder.* extensions)
Stratified head-based sampling (always keep errors)
100% sampling by default; tunable per tenant
Prompt+rating correlation overlays
Minimal trace UI at trace.koder.dev (table + span tree + prompt panel)
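The sampling items in the list above can be sketched as follows. This is an illustration under stated assumptions, not the planned implementation: it treats the tenant as the stratum, stores per-tenant rates in a plain dict (`TENANT_RATES`, hypothetical), and takes the error status as a flag, since this document does not specify how the error override combines with a head-based decision.

```python
import hashlib

DEFAULT_RATE = 1.0                  # 100% sampling unless a tenant overrides it
TENANT_RATES = {"tenant-b": 0.25}   # hypothetical per-tenant overrides


def keep_trace(trace_id: str, tenant: str, is_error: bool = False) -> bool:
    """Decide whether to keep a trace; errors are always kept."""
    if is_error:
        return True
    rate = TENANT_RATES.get(tenant, DEFAULT_RATE)
    if rate >= 1.0:
        return True
    # Hash the trace ID to a stable number in [0, 1) so every service
    # that sees the same trace ID reaches the same decision.
    digest = hashlib.sha256(trace_id.encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return bucket < rate


print(keep_trace("trace-123", "tenant-a"))                 # default rate: kept
print(keep_trace("trace-123", "tenant-b", is_error=True))  # error: kept
```

Deriving the decision from the trace ID, instead of a random draw per span, keeps a trace either whole or absent, which matters when the span tree is the main debugging view.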

Primary couplings

Producer	Relationship
`services/ai/agents`	Agent loop spans
`services/ai/kode`	Conversation + per-message spans
`services/ai/gateway`	LLM call spans
`services/ai/imaging`, `video`, `synth`, `scene3d`, `embed`	Job spans
`services/ai/extract`, `classify`, `translate`	Per-call spans

Consumer	Relationship
`services/ai/eval`	Reads traces for offline scoring
`services/ai/prompt`	Receives ratings linked to spans
`services/ai/billing`	request_id correlation
`infra/observe`	Tempo + Grafana backend
`infra/data/kdb-time`	Storage layer

RFC and bootstrap

RFC: trace-RFC-001-foundations.kmd, *Accepted* 2026-05-09
Bootstrap ticket: services/ai/backlog/done/127-trace-bootstrap.md
Implementation tickets: services/ai/trace/backlog/pending/{001..005}

Self-hosted-first analysis (5 gates)

Gate	Status	Notes
G1 Feature parity	pending	Skeleton phase; OTel + Tempo cover ingest+storage; UI matches LangSmith essentials
G2 Performance	pending	Target 5k spans/sec sustained per instance; query enrichment p95 < 50ms
G3 Stability	pending	Pre-MVP
G4 Capability	pending	Generic distributed tracing deferred to infra/observe; AI scope covered
G5 Critical-path readiness	pending	Pre-MVP; debugging agents/kode in prd is the first concrete unblock
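As a rough sense of scale for the G2 targets in the table, the sketch below converts the sustained ingest rate into spans per day and shows one way a p95 latency check could be computed (nearest-rank method). The latency samples are made up, and the choice of percentile method is an assumption; the gate does not specify one.

```python
import math

TARGET_SPANS_PER_SEC = 5_000   # G2 ingest target per instance
TARGET_P95_MS = 50.0           # G2 query enrichment p95 target


def spans_per_day(rate_per_sec: int) -> int:
    """Spans ingested in 24 hours at a sustained per-second rate."""
    return rate_per_sec * 86_400


def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = math.ceil(0.95 * len(ordered))
    return ordered[rank - 1]


daily = spans_per_day(TARGET_SPANS_PER_SEC)
print(daily)  # 432000000 spans per day per instance

# Made-up enrichment latencies in milliseconds.
latencies_ms = [12.0, 18.5, 22.0, 25.0, 31.0, 33.5, 38.0, 41.0, 44.0, 47.5]
meets_target = p95(latencies_ms) < TARGET_P95_MS
print(meets_target)
```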