AI Sandbox — Code Execution

*Area:* Intelligence
*Path:* services/ai/sandbox
*Kind:* Isolated execution of AI-generated code (Firecracker microVMs + language packs)
*Status:* v0.1.0; foundation landed 20260524. The HTTP daemon (koder-sandbox), operator CLI (ksandbox), and Go SDK (engines/sdk/go/sandbox) ship with the subprocess runtime (real, testable). The Firecracker driver is a typed stub that returns SANDBOX-RUNTIME-UNAVAILABLE-001; the rootfs catalog, boot pipeline, and vsock agent are tracked in *sandbox#014*. Pre-flip status: the subprocess driver is suitable for dev/CI/single-tenant trusted-workload deployments, but not yet for hostile multi-tenant traffic (gates G3 + G4 still open).

Role in the stack

sandbox is the missing primitive that unblocks the entire "agent that writes code" surface of the stack. Agents generate code; without a sandbox they can't run it (or worse, they run it on the host). Today every product-side prototype that wants "run this snippet" reinvents some half-baked Docker shell-out: slow, leaky, no quotas, no audit. This sector consolidates the capability into one secure isolation primitive that all AI-side consumers (kode, agents, workflow, kortex code-tools) plug into.

It is the Koder analog of E2B, Daytona, and Modal sandboxes, built on Firecracker (sub-200ms boot, ~5MB overhead per microVM) with a gVisor adapter for the compatibility tail.

Boundary vs neighbors

services/ai/runtime is disjoint — runtime serves models; sandbox runs user code.
services/ai/kode, services/ai/agents, services/ai/workflow, services/ai/playground are the primary consumers.
products/dev/kortex consumes via "Run this" panels in the IDE.
services/ai/trace receives spans for every session + exec.
infra/data/kdb-blob holds artifact storage (file IO + snapshots).
The long-standing tracking ticket `servicesaiaibacklogpending/017-pluggable-execution-sandbox-backends.md` is realized by this sector.

Features (v1 target)

Firecracker microVM runtime, warm pool per language
Sub-200ms session create from warm pool; < 1s cold boot
Language packs: Python 3.11, Node 20, Bash + GNU coreutils
Sync + async exec, SSE stdout/stderr streaming
Default-deny network with allowlist (DNS proxy + iptables)
FS quota (default 1GB), CPU quota (cgroups v2), memory cap with OOM kill
Hard-kill on quota breach with partial-output preservation
File IO API (upload, download, list, delete) with quota awareness
Snapshot/restore for replay + persistent agent loops
Per-tenant concurrent-session + daily-minute quotas
Audit log of network connection attempts
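The default-deny network policy and the audit log of connection attempts listed above can be illustrated with a small host matcher. This is a minimal sketch under stated assumptions, not the sandbox's actual policy engine: the `Allowlist` and `Policy` types, the `*.` wildcard syntax, and the example hostnames are all hypothetical, and the real feature is described as a DNS proxy plus iptables rather than in-process matching.

```go
package main

import (
	"fmt"
	"strings"
)

// Allowlist is a hypothetical egress policy: any host not listed is denied.
type Allowlist struct {
	exact    map[string]bool
	suffixes []string // stored with a leading dot, e.g. ".example.com"
}

// normalize lowercases a hostname and drops surrounding space and a trailing dot.
func normalize(host string) string {
	return strings.TrimSuffix(strings.ToLower(strings.TrimSpace(host)), ".")
}

// NewAllowlist accepts exact hostnames and "*.domain" wildcard entries.
func NewAllowlist(entries ...string) *Allowlist {
	a := &Allowlist{exact: make(map[string]bool)}
	for _, e := range entries {
		e = normalize(e)
		switch {
		case strings.HasPrefix(e, "*."):
			a.suffixes = append(a.suffixes, e[1:]) // "*.x.org" becomes ".x.org"
		case e != "":
			a.exact[e] = true
		}
	}
	return a
}

// Allowed reports whether host matches an exact entry or a wildcard suffix.
// A wildcard matches subdomains only, never the bare domain itself.
func (a *Allowlist) Allowed(host string) bool {
	h := normalize(host)
	if a.exact[h] {
		return true
	}
	for _, s := range a.suffixes {
		if len(h) > len(s) && strings.HasSuffix(h, s) {
			return true
		}
	}
	return false
}

// Attempt is one audit record of a connection attempt.
type Attempt struct {
	Host    string
	Allowed bool
}

// Policy pairs an allowlist with an in-memory audit trail.
type Policy struct {
	allow *Allowlist
	Audit []Attempt
}

// Check records the attempt, allowed or not, and returns the decision.
func (p *Policy) Check(host string) bool {
	ok := p.allow.Allowed(host)
	p.Audit = append(p.Audit, Attempt{Host: normalize(host), Allowed: ok})
	return ok
}

func main() {
	p := &Policy{allow: NewAllowlist("pypi.org", "*.npmjs.org")}
	for _, h := range []string{"pypi.org", "registry.npmjs.org", "example.com"} {
		fmt.Println(h, p.Check(h))
	}
	fmt.Println("audited attempts:", len(p.Audit))
}
```

Recording denied attempts alongside allowed ones keeps the audit trail useful for spotting code that probes for egress.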

Primary couplings

Producer	Relationship
`infra/data/kdb-blob`	Artifact + snapshot storage
`infra/observe`	Pool metrics, OOM kills, session duration
`services/ai/trace`	Session + exec spans

Consumer	Relationship
`services/ai/kode`	Code-execution tool calls
`services/ai/agents`	Sandboxed step execution
`services/ai/workflow`	Code-step nodes in DAGs
`services/ai/playground`	Interactive code cells
`products/dev/kortex`	"Run this" panels in IDE
`services/ai/training`	Generated-code eval rigs

RFC and bootstrap

RFC: sandbox-RFC-001-foundations.kmd (*Accepted* 20260509)
Bootstrap ticket: services/ai/backlog/done/134-sandbox-bootstrap.md
Implementation tickets: services/ai/sandbox/backlog/pending/{001..005}
Tracking ticket (long-standing): `servicesaiaibacklogpending/017-pluggable-execution-sandbox-backends.md`

Self-hosted-first analysis (5 gates)

Gate	Status	Notes
G1 Feature parity	pending	Firecracker + agent covers E2B's exec/file/snapshot surface
G2 Performance	pending	Targets: < 200ms warm, < 1s cold, < 50ms exec dispatch
G3 Stability	pending	Pre-MVP; needs container-escape security review
G4 Capability	pending	Long-running compute deferred to runtime; distributed jobs out of v1
G5 Critical-path readiness	pending	Unblocks every code-running agent surface