AI Sandbox — Code Execution
- *Area:* Intelligence
- *Path:* services/ai/sandbox
- *Kind:* Isolated execution of AI-generated code (Firecracker microVMs + language packs)
- *Status:* v0.1.0, foundation landed 2026-05-24. HTTP daemon (koder-sandbox) + operator CLI (ksandbox) + Go SDK (engines/sdk/go/sandbox) ship with the subprocess runtime (real, testable). Firecracker driver is the typed stub returning SANDBOX-RUNTIME-UNAVAILABLE-001; rootfs catalog + boot pipeline + vsock agent track in *sandbox#014*. Pre-flip status: subprocess driver suitable for dev/CI/single-tenant trusted-workload deployments; not yet for hostile multi-tenant traffic (gates G3 + G4 still open).
Role in the stack
sandbox is the missing primitive that unblocks the entire "agent that writes code" surface of the stack. Agents generate code; without a sandbox they can't run it (or worse, they run it on the host). Today every product-side prototype that wants "run this snippet" reinvents some half-baked Docker shell-out: slow, leaky, no quotas, no audit. This sector consolidates the capability into one secure isolation primitive that all AI-side consumers (kode, agents, workflow, kortex code-tools) plug into.
It is the Koder analog of E2B, Daytona, and Modal sandboxes, built on Firecracker (sub-200ms boot, ~5MB overhead per microVM) with a gVisor adapter for the compatibility tail.
Boundary vs neighbors
- services/ai/runtime is disjoint: runtime serves models; sandbox runs user code.
- services/ai/kode, services/ai/agents, services/ai/workflow, services/ai/playground are the primary consumers.
- products/dev/kortex consumes via "Run this" panels in the IDE.
- services/ai/trace receives spans for every session + exec.
- infra/data/kdb-blob holds artifact storage (file I/O + snapshots).
- The long-standing tracking ticket `servicesaiaibacklogpending/017pluggableexecutionsandbox-backends.md` is realized by this sector.
Features (v1 target)
- Firecracker microVM runtime, warm pool per language
- Sub-200ms session create from warm pool; < 1s cold boot
- Language packs: Python 3.11, Node 20, Bash + GNU coreutils
- Sync + async exec, SSE stdout/stderr streaming
- Default-deny network with allowlist (DNS proxy + iptables)
- FS quota (default 1GB), CPU quota (cgroups v2), memory cap with OOM kill
- Hard-kill on quota breach with partial-output preservation
- File I/O API (upload, download, list, delete) with quota awareness
- Snapshot/restore for replay + persistent agent loops
- Per-tenant concurrent-session + daily-minute quotas
- Audit log of network connection attempts
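The default-deny network rule and the connection-attempt audit log from the list above can be illustrated with a small in-memory sketch. It models only the decision logic; the DNS proxy and iptables enforcement are not shown. The `Policy` and `ConnAttempt` types and the example host names are assumptions made for illustration and are not part of the sandbox API.

```go
package main

import (
	"fmt"
	"strings"
)

// ConnAttempt is one audit record for an outbound connection attempt.
type ConnAttempt struct {
	Host    string
	Port    int
	Allowed bool
}

// Policy is a hypothetical default-deny egress policy: a host is permitted
// only if it was explicitly allowlisted. Every check is recorded in Audit.
type Policy struct {
	allowed map[string]bool
	Audit   []ConnAttempt
}

// NewPolicy builds a policy from an explicit allowlist of host names.
func NewPolicy(hosts ...string) *Policy {
	p := &Policy{allowed: make(map[string]bool)}
	for _, h := range hosts {
		p.allowed[normalize(h)] = true
	}
	return p
}

// normalize lower-cases the host and drops surrounding space and a trailing dot.
func normalize(host string) string {
	return strings.ToLower(strings.TrimSuffix(strings.TrimSpace(host), "."))
}

// Check records the attempt and reports whether it is permitted.
func (p *Policy) Check(host string, port int) bool {
	h := normalize(host)
	ok := p.allowed[h] // missing key yields false: deny by default
	p.Audit = append(p.Audit, ConnAttempt{Host: h, Port: port, Allowed: ok})
	return ok
}

func main() {
	// Example hosts are illustrative only.
	p := NewPolicy("pypi.org", "registry.npmjs.org")
	fmt.Println(p.Check("PyPI.org.", 443))   // true
	fmt.Println(p.Check("example.com", 443)) // false
	fmt.Println(len(p.Audit))                // 2
}
```

Denied attempts are logged as well as allowed ones, so the audit trail shows what generated code tried to reach, not just what it reached.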
Primary couplings
| Producer | Relationship |
|---|---|
| infra/data/kdb-blob | Artifact + snapshot storage |
| infra/observe | Pool metrics, OOM kills, session duration |
| services/ai/trace | Session + exec spans |

| Consumer | Relationship |
|---|---|
| services/ai/kode | Code-execution tool calls |
| services/ai/agents | Sandboxed step execution |
| services/ai/workflow | Code-step nodes in DAGs |
| services/ai/playground | Interactive code cells |
| products/dev/kortex | "Run this" panels in IDE |
| services/ai/training | Generated-code eval rigs |
RFC and bootstrap
- RFC: sandbox-RFC-001-foundations.kmd (*Accepted* 2026-05-09)
- Bootstrap ticket: services/ai/backlog/done/134-sandbox-bootstrap.md
- Implementation tickets: services/ai/sandbox/backlog/pending/{001..005}
- Tracking ticket (long-standing): `servicesaiaibacklogpending/017pluggableexecutionsandbox-backends.md`
Self-hosted-first analysis (5 gates)
| Gate | Status | Notes |
|---|---|---|
| G1 Feature parity | pending | Firecracker + agent covers E2B's exec/file/snapshot surface |
| G2 Performance | pending | Targets: < 200ms warm, < 1s cold, < 50ms exec dispatch |
| G3 Stability | pending | Pre-MVP; needs container-escape security review |
| G4 Capability | pending | Long-running compute deferred to runtime; distributed jobs out of v1 |
| G5 Critical-path readiness | pending | Unblocks every code-running agent surface |