AI Trace — Runtime Tracing

  • Area: Intelligence
  • Path: services/ai/trace
  • Kind: OpenTelemetry-compatible runtime tracing for AI agents and chains
  • Status: v0.0.1, sector bootstrapping (2026-05-09)

Role in the stack

trace is the runtime half of AI observability; services/ai/eval is the offline half. Modern AI agents make many calls per response (LLM → tool → LLM → tool → LLM). Without trace, debugging a wrong answer in production is effectively impossible, the root cause of latency stays invisible, and prompt regressions can go unnoticed for weeks.
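
To make the call chain above concrete, here is a minimal sketch of a span tree for one agent response, using only the Python standard library. The `Span` class, the span names, and the durations are hypothetical illustrations, not the trace data model; a real producer would emit spans through an OTel SDK.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """Hypothetical in-memory span; not the real trace schema."""
    name: str
    kind: str            # "agent", "llm", or "tool" (illustrative labels)
    duration_ms: float
    children: list["Span"] = field(default_factory=list)


def walk(span, depth=0):
    """Yield (depth, span) pairs in depth-first order."""
    yield depth, span
    for child in span.children:
        yield from walk(child, depth + 1)


def slowest_leaf(root):
    """Return the leaf span with the largest duration."""
    leaves = [s for _, s in walk(root) if not s.children]
    return max(leaves, key=lambda s: s.duration_ms)


def render(root):
    """Render the tree as indented text, one span per line."""
    return "\n".join(
        f"{'  ' * depth}{s.name} [{s.kind}] {s.duration_ms:.0f} ms"
        for depth, s in walk(root)
    )


# One response following the LLM -> tool -> LLM -> tool -> LLM pattern.
response = Span("agent.respond", "agent", 2350.0, [
    Span("llm.plan", "llm", 400.0),
    Span("tool.search", "tool", 1500.0),
    Span("llm.refine", "llm", 200.0),
    Span("tool.fetch", "tool", 100.0),
    Span("llm.answer", "llm", 150.0),
])
```

Calling `slowest_leaf(response)` on this made-up data points at the `tool.search` span, which is the kind of latency question a span tree answers and a flat request log does not.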

It is the Koder analog of LangSmith, Langfuse, Helicone, Arize Phoenix and Honeycomb — built on the OpenTelemetry standard (consumers don't reinvent the SDK; they use OTel) and wrapped around Tempo (we own the API contract while Tempo handles storage scale).

Boundary vs neighbors

  • services/ai/eval runs offline scoring; this sector handles runtime spans.
  • services/ai/prompt is the source of truth for prompt versions; trace correlates spans to them.
  • services/ai/billing shares request_id correlation for a full end-to-end view.
  • infra/observe provides the Tempo backend and Grafana dashboards.
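
The correlation described in this list can be sketched as a lookup over span attributes. This is an illustration under assumptions: the attribute keys `koder.request_id` and `koder.prompt.version`, the dictionary span shape, and the sample values are hypothetical, since the document only states that koder.* extensions exist. `gen_ai.request.model` is taken from the OpenTelemetry GenAI semantic conventions.

```python
from collections import defaultdict

# Hypothetical spans as plain dicts; attribute keys under koder.* are assumed.
spans = [
    {"span_id": "s1", "attributes": {
        "koder.request_id": "req-1",
        "koder.prompt.version": "summarize@v3",
        "gen_ai.request.model": "example-model",
    }},
    {"span_id": "s2", "attributes": {
        "koder.request_id": "req-1",
    }},
    {"span_id": "s3", "attributes": {
        "koder.request_id": "req-2",
        "koder.prompt.version": "summarize@v2",
        "gen_ai.request.model": "example-model",
    }},
]


def group_by_request(spans):
    """Group spans by request id so billing and trace can share one key."""
    grouped = defaultdict(list)
    for span in spans:
        request_id = span["attributes"].get("koder.request_id")
        if request_id is not None:
            grouped[request_id].append(span)
    return dict(grouped)


def prompt_versions(spans):
    """Collect the distinct prompt versions referenced by a set of spans."""
    return sorted({
        s["attributes"]["koder.prompt.version"]
        for s in spans
        if "koder.prompt.version" in s["attributes"]
    })
```

With this shape, `group_by_request(spans)["req-1"]` holds both spans of the first request, and `prompt_versions` lists which prompt versions a request touched.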

Features (v1 target)

  • OTLP/gRPC + OTLP/HTTP ingest (standard ports 4317/4318)
  • Tempo wrapper (we own API contract; Tempo handles storage)
  • AI semantic conventions (gen_ai.* + koder.* extensions)
  • Stratified head-based sampling (always keep errors)
  • 100% sample default; per-tenant tunable
  • Prompt+rating correlation overlays
  • Minimal trace UI at trace.koder.dev (table + span tree + prompt panel)
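
The sampling features above (100% default, per-tenant tuning, always keep errors) could be sketched roughly as follows. This is not the trace implementation: the `SamplingPolicy` class, its field names, and the hash-based bucketing are assumptions, and the sketch also assumes the error flag is already known when the decision is made.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SamplingPolicy:
    """Hypothetical policy: one rate per tenant, with a global default."""
    default_rate: float = 1.0                      # 100% sample default
    tenant_rates: dict[str, float] = field(default_factory=dict)

    def rate_for(self, tenant):
        """Return the tenant's rate, falling back to the default."""
        return self.tenant_rates.get(tenant, self.default_rate)

    def keep(self, trace_id, tenant, has_error=False):
        """Decide once per trace; the same trace id always gets the same answer."""
        if has_error:
            return True                            # errors are always kept
        rate = self.rate_for(tenant)
        if rate >= 1.0:
            return True
        if rate <= 0.0:
            return False
        # Map the trace id to a stable bucket in [0, 1) and compare to the rate.
        digest = hashlib.sha256(trace_id.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:8], "big") / 2**64
        return bucket < rate


policy = SamplingPolicy(tenant_rates={"noisy-tenant": 0.0, "half-tenant": 0.5})
```

Hashing the trace id rather than drawing a random number keeps the decision consistent for every span of the same trace, which is the usual reason to decide at the head of the trace.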

Primary couplings

Producer | Relationship
services/ai/agents | Agent loop spans
services/ai/kode | Conversation + per-message spans
services/ai/gateway | LLM call spans
services/ai/imaging, video, synth, scene3d, embed | Job spans
services/ai/extract, classify, translate | Per-call spans

Consumer | Relationship
services/ai/eval | Reads traces for offline scoring
services/ai/prompt | Receives ratings linked to spans
services/ai/billing | request_id correlation
infra/observe | Tempo + Grafana backend
infra/data/kdb-time | Storage layer

RFC and bootstrap

  • RFC: trace-RFC-001-foundations.kmd (Accepted, 2026-05-09)
  • Bootstrap ticket: services/ai/backlog/done/127-trace-bootstrap.md
  • Implementation tickets: services/ai/trace/backlog/pending/{001..005}

Self-hosted-first analysis (5 gates)

Gate | Status | Notes
G1 Feature parity | pending | Skeleton phase; OTel + Tempo cover ingest+storage; UI matches LangSmith essentials
G2 Performance | pending | Target 5k spans/sec sustained per instance; query enrichment p95 < 50ms
G3 Stability | pending | Pre-MVP
G4 Capability | pending | Distributed tracing generic deferred to infra/observe; AI scope covered
G5 Critical-path readiness | pending | Pre-MVP; debugging agents/kode in prd is the first concrete unblock
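
As a worked illustration of the G2 targets, the check below computes a nearest-rank p95 over latency samples and compares it, together with an ingest rate, against the stated thresholds. The function names and the choice of the nearest-rank method are assumptions; the document does not say how the gate will be measured.

```python
G2_MIN_SPANS_PER_SEC = 5000      # "5k spans/sec sustained per instance"
G2_MAX_ENRICHMENT_P95_MS = 50.0  # "query enrichment p95 < 50ms"


def p95(samples):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("p95 needs at least one sample")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100   # ceil(0.95 * n) in integer math
    return ordered[rank - 1]


def meets_g2(enrichment_ms, spans_ingested, window_seconds):
    """True when both G2 targets hold for the measured window."""
    rate = spans_ingested / window_seconds
    return (
        p95(enrichment_ms) < G2_MAX_ENRICHMENT_P95_MS
        and rate >= G2_MIN_SPANS_PER_SEC
    )
```

For example, 300,000 spans ingested over a 60 second window is exactly 5,000 spans/sec, so that window passes the throughput half of the gate only if the enrichment p95 is also under 50 ms.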

Source: ../home/koder/dev/koder/meta/docs/stack/modules/ai-trace.md