AI Sandbox — Code Execution

  • Area: Intelligence
  • Path: services/ai/sandbox
  • Kind: Isolated execution of AI-generated code (Firecracker microVMs + language packs)
  • Status: v0.1.0, foundation landed 2026-05-24. The HTTP daemon (koder-sandbox), operator CLI (ksandbox), and Go SDK (engines/sdk/go/sandbox) ship with the subprocess runtime (real, testable). The Firecracker driver is a typed stub that returns SANDBOX-RUNTIME-UNAVAILABLE-001; the rootfs catalog, boot pipeline, and vsock agent are tracked in sandbox#014. Pre-flip status: the subprocess driver is suitable for dev / CI / single-tenant trusted-workload deployments, but not yet for hostile multi-tenant traffic (gates G3 + G4 still open).
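
The "typed stub" behaviour in the status note can be pictured with a minimal Go sketch. Only the error code SANDBOX-RUNTIME-UNAVAILABLE-001 comes from the text above; the `Driver` interface, `CodedError`, `FirecrackerStub`, `Probe`, and `IsRuntimeUnavailable` are hypothetical names chosen for illustration and are not the actual API of engines/sdk/go/sandbox.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// CodeRuntimeUnavailable is the code the status note attributes to the Firecracker stub.
const CodeRuntimeUnavailable = "SANDBOX-RUNTIME-UNAVAILABLE-001"

// CodedError is a hypothetical typed error that carries a stable, matchable code.
type CodedError struct {
	Code string
	Msg  string
}

func (e *CodedError) Error() string { return e.Code + ": " + e.Msg }

// Driver is an assumed shape for a runtime driver; the real interface may differ.
type Driver interface {
	Exec(ctx context.Context, argv []string) (stdout string, err error)
}

// FirecrackerStub stands in for a driver that is declared but not yet implemented.
type FirecrackerStub struct{}

func (FirecrackerStub) Exec(context.Context, []string) (string, error) {
	return "", &CodedError{Code: CodeRuntimeUnavailable, Msg: "firecracker driver is a stub"}
}

// IsRuntimeUnavailable reports whether err, or any error it wraps, carries the stub's code.
func IsRuntimeUnavailable(err error) bool {
	var ce *CodedError
	return errors.As(err, &ce) && ce.Code == CodeRuntimeUnavailable
}

// Probe runs a no-op command and wraps any failure with context.
func Probe(d Driver) error {
	if _, err := d.Exec(context.Background(), []string{"true"}); err != nil {
		return fmt.Errorf("probe: %w", err)
	}
	return nil
}

func main() {
	err := Probe(FirecrackerStub{})
	fmt.Println(IsRuntimeUnavailable(err), err)
}
```

A caller written this way can tell "runtime not available yet" apart from an ordinary execution failure. It can then surface that explicitly instead of silently dropping to a weaker isolation level.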

Role in the stack

sandbox is the missing primitive that unblocks the entire "agent that writes code" surface of the stack. Agents generate code; without a sandbox they can't run it (or worse, they run it on the host). Today every product-side prototype that wants "run this snippet" reinvents some half-baked Docker shell-out: slow, leaky, with no quotas and no audit. This sector consolidates the capability into one secure isolation primitive that all AI-side consumers (kode, agents, workflow, kortex code-tools) plug into.

It is the Koder analog of E2B, Daytona, and Modal sandboxes, built on Firecracker (sub-200ms boot, ~5MB overhead per microVM) with a gVisor adapter for the compatibility tail.

Boundary vs neighbors

  • services/ai/runtime is disjoint — runtime serves models; sandbox runs user code.
  • services/ai/kode, services/ai/agents, services/ai/workflow, services/ai/playground are the primary consumers.
  • products/dev/kortex consumes via "Run this" panels in the IDE.
  • services/ai/trace receives spans for every session + exec.
  • infra/data/kdb-blob holds artifact storage (file IO + snapshots).
  • The longstanding tracking ticket `servicesaiaibacklogpending/017pluggableexecutionsandbox-backends.md` is realized by this sector.

Features (v1 target)

  • Firecracker microVM runtime, warm pool per language
  • Sub-200ms session create from warm pool; < 1s cold boot
  • Language packs: Python 3.11, Node 20, Bash + GNU coreutils
  • Sync + async exec, SSE stdout/stderr streaming
  • Default-deny network with allowlist (DNS proxy + iptables)
  • FS quota (default 1GB), CPU quota (cgroups v2), memory cap with OOM kill
  • Hard-kill on quota breach with partial-output preservation
  • File IO API (upload, download, list, delete) with quota awareness
  • Snapshot/restore for replay + persistent agent loops
  • Per-tenant concurrent-session + daily-minute quotas
  • Audit log of network connection attempts
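
As a rough illustration of two of the features above (default-deny network with an allowlist, and an audit log of connection attempts), here is a small Go sketch of the decision logic only. It does not model the DNS proxy or iptables layer, and `NetPolicy`, `Attempt`, `NewNetPolicy`, and `Check` are hypothetical names, not part of the sandbox API.

```go
package main

import (
	"fmt"
	"strings"
)

// Attempt is one audit record for an outbound connection attempt.
type Attempt struct {
	Host    string
	Allowed bool
}

// NetPolicy is a hypothetical default-deny egress policy.
// Entries are exact hosts ("pypi.org") or subdomain wildcards ("*.example.com").
type NetPolicy struct {
	allow []string
	Audit []Attempt
}

// NewNetPolicy normalizes patterns to lower case without a trailing dot.
func NewNetPolicy(patterns ...string) *NetPolicy {
	p := &NetPolicy{}
	for _, pat := range patterns {
		p.allow = append(p.allow, strings.ToLower(strings.TrimSuffix(pat, ".")))
	}
	return p
}

// Check reports whether host is allowlisted and records the attempt either way.
func (p *NetPolicy) Check(host string) bool {
	h := strings.ToLower(strings.TrimSuffix(host, "."))
	allowed := false
	for _, pat := range p.allow {
		if strings.HasPrefix(pat, "*.") {
			// "*.example.com" matches subdomains only: the suffix includes the
			// leading dot, so neither "example.com" nor "evilexample.com" match.
			suffix := pat[1:]
			if strings.HasSuffix(h, suffix) && len(h) > len(suffix) {
				allowed = true
				break
			}
		} else if h == pat {
			allowed = true
			break
		}
	}
	p.Audit = append(p.Audit, Attempt{Host: h, Allowed: allowed})
	return allowed
}

func main() {
	p := NewNetPolicy("pypi.org", "*.example.com")
	for _, h := range []string{"pypi.org", "api.example.com", "example.com", "evil.test"} {
		fmt.Println(h, p.Check(h))
	}
	fmt.Println("audited attempts:", len(p.Audit))
}
```

Recording denied attempts alongside allowed ones is the design point here: the audit trail stays useful even when the policy blocks everything.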

Primary couplings

| Producer | Relationship |
| --- | --- |
| infra/data/kdb-blob | Artifact + snapshot storage |
| infra/observe | Pool metrics, OOM kills, session duration |
| services/ai/trace | Session + exec spans |

| Consumer | Relationship |
| --- | --- |
| services/ai/kode | Code-execution tool calls |
| services/ai/agents | Sandboxed step execution |
| services/ai/workflow | Code-step nodes in DAGs |
| services/ai/playground | Interactive code cells |
| products/dev/kortex | "Run this" panels in IDE |
| services/ai/training | Generated-code eval rigs |

RFC and bootstrap

  • RFC: sandbox-RFC-001-foundations.kmd (Accepted 2026-05-09)
  • Bootstrap ticket: services/ai/backlog/done/134-sandbox-bootstrap.md
  • Implementation tickets: services/ai/sandbox/backlog/pending/{001..005}
  • Tracking ticket (longstanding): `servicesaiaibacklogpending/017pluggableexecutionsandbox-backends.md`

Self-hosted-first analysis (5 gates)

| Gate | Status | Notes |
| --- | --- | --- |
| G1 Feature parity | pending | Firecracker + agent covers E2B's exec/file/snapshot surface |
| G2 Performance | pending | Targets: < 200ms warm, < 1s cold, < 50ms exec dispatch |
| G3 Stability | pending | Pre-MVP; needs container-escape security review |
| G4 Capability | pending | Long-running compute deferred to runtime; distributed jobs out of v1 |
| G5 Critical-path readiness | pending | Unblocks every code-running agent surface |
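
The G2 targets lend themselves to a mechanical check. The sketch below encodes the three budgets from the table and compares a measurement against them. The `Latencies` type, `G2Targets`, and `Violations` are hypothetical names, and the sample figures in `main` are made up for illustration, not measured results.

```go
package main

import (
	"fmt"
	"time"
)

// Latencies holds either one measurement or one set of targets for the G2 numbers.
type Latencies struct {
	WarmCreate   time.Duration
	ColdBoot     time.Duration
	ExecDispatch time.Duration
}

// G2Targets mirrors the gate table: < 200ms warm, < 1s cold, < 50ms exec dispatch.
var G2Targets = Latencies{
	WarmCreate:   200 * time.Millisecond,
	ColdBoot:     time.Second,
	ExecDispatch: 50 * time.Millisecond,
}

// Violations lists every measured value that is not strictly below its target.
func Violations(measured, target Latencies) []string {
	var out []string
	check := func(name string, got, limit time.Duration) {
		if got >= limit {
			out = append(out, fmt.Sprintf("%s: %v is not below %v", name, got, limit))
		}
	}
	check("warm create", measured.WarmCreate, target.WarmCreate)
	check("cold boot", measured.ColdBoot, target.ColdBoot)
	check("exec dispatch", measured.ExecDispatch, target.ExecDispatch)
	return out
}

func main() {
	// Illustrative numbers only.
	sample := Latencies{
		WarmCreate:   150 * time.Millisecond,
		ColdBoot:     1200 * time.Millisecond,
		ExecDispatch: 20 * time.Millisecond,
	}
	for _, v := range Violations(sample, G2Targets) {
		fmt.Println(v)
	}
}
```

Because the targets are stated as strict upper bounds, a measurement exactly equal to a target counts as a violation in this sketch.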

Source: ../home/koder/dev/koder/meta/docs/stack/modules/ai-sandbox.md