Sandbox RFC 001 architecture

RFC-001 — Sandbox: Unified execution sandbox for agents and untrusted code

*Author:* Koder Engineering
*Date:* 20260429
*Status:* Accepted
*Module:* engines/sandbox
*Origin:* koder-stack ticket #032 (RFC-018), derived from OpenClaw analysis 20260429
*Renamed from:* RFC-018 (the koder-stack ticket numbering; this file follows the namespaced <topic>-RFC-NNN convention for meta/docs/stack/rfcs/)

1. Summary

A single engines/sandbox engine that every Koder component uses when it needs to execute untrusted or agent-generated code. Pluggable backends (Docker, Firecracker, gVisor, native namespaces, SSH) hide behind one small Go interface so consumers (services/ai/mcp-registry, services/foundation/scheduler, services/ai/kode, future agentic tools) never re-invent isolation.

2. Problem

Every component that runs untrusted code today rolls its own escape-hatch: ad-hoc os/exec calls, copy-pasted Docker invocations, unstructured SSH wrappers. Result: inconsistent isolation guarantees, duplicated supervision logic, no central audit trail.

3. Goals

- One Go interface (Backend) with Run, Stream, Stop covering every realistic execution surface.
- Backend selection automatic based on host capability detection (Docker present → docker; Linux ≥4.18 with userns → namespaces; etc.), with explicit override.
- Audit log: every sandbox invocation produces a `Job{ID, command, backend, started, stopped, exitcode, logref}` record consumable by `infra/observe/log`.
- Resource caps (CPU, memory, wall time, disk, network egress) enforced per-backend, with sensible defaults defined here.
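The selection rule in the second goal could be sketched roughly as follows. This is a minimal illustration, not the engine's implementation: `hostProbe` and `pickBackend` are hypothetical names, the kernel facts are passed in rather than detected, and mapping the "namespaces" case to the `native` backend name is an assumption.

```go
package main

import (
	"fmt"
	"os/exec"
)

// hostProbe holds the host facts that selection depends on.
// Hypothetical type for this sketch; the RFC does not define it.
type hostProbe struct {
	DockerOnPath bool // a docker binary was found on PATH
	KernelMajor  int  // Linux kernel major version
	KernelMinor  int  // Linux kernel minor version
	UserNS       bool // user namespaces are usable on this host
}

// pickBackend returns a backend name for the host.
// A non-empty override always wins over detection.
func pickBackend(override string, h hostProbe) (string, error) {
	if override != "" {
		return override, nil
	}
	if h.DockerOnPath {
		return "docker", nil
	}
	kernelOK := h.KernelMajor > 4 || (h.KernelMajor == 4 && h.KernelMinor >= 18)
	if kernelOK && h.UserNS {
		return "native", nil
	}
	return "", fmt.Errorf("no sandbox backend available on this host")
}

func main() {
	// Only the docker check is real here; kernel facts are hard-coded for the sketch.
	_, err := exec.LookPath("docker")
	h := hostProbe{DockerOnPath: err == nil, KernelMajor: 5, KernelMinor: 15, UserNS: true}
	name, err := pickBackend("", h)
	fmt.Println(name, err)
}
```

Keeping detection (filling `hostProbe`) separate from the decision (`pickBackend`) makes the decision testable without a particular host setup.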

4. Non-Goals

- Replacing CI runners (Gitea Actions, Drone). Sandbox is for runtime agentic execution, not build pipelines.
- Cluster orchestration (Kubernetes, Nomad). Sandbox supervises one process; cluster-scale orchestration is out of scope.

5. Backends

| Backend | Use case | Default? |
|---|---|---|
| `native` | Linux user namespaces + seccomp + cgroups; zero deps. | Default fallback. |
| `docker` | Host has Docker; richer isolation, image-based. | Default when `docker` on PATH. |
| `firecracker` | High-isolation needs (multi-tenant compute, payment processing). | Opt-in. |
| `gvisor` | When kernel-level isolation needed without VM overhead. | Opt-in. |
| `ssh` | Execute on a remote host (e.g. `s.forge`). | Opt-in via `--backend ssh --host …`. |

Each backend is a Go package under engines/sandbox/internal/backend_<name>/ that registers itself in a process-wide registry.
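One common way to get self-registration in Go is an `init()` function that adds the backend to a mutex-guarded map. The sketch below assumes that approach; `Register`, `Lookup`, `Names`, and `nativeBackend` are hypothetical names, and the `Backend` interface is trimmed to `Name()` to keep the example short.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Backend is reduced to Name() for this sketch.
type Backend interface {
	Name() string
}

var (
	mu       sync.RWMutex
	backends = map[string]Backend{}
)

// Register adds a backend to the process-wide registry.
// It is meant to be called from a backend package's init().
func Register(b Backend) {
	mu.Lock()
	defer mu.Unlock()
	if _, dup := backends[b.Name()]; dup {
		panic("sandbox: backend registered twice: " + b.Name())
	}
	backends[b.Name()] = b
}

// Lookup returns the backend registered under name, if any.
func Lookup(name string) (Backend, bool) {
	mu.RLock()
	defer mu.RUnlock()
	b, ok := backends[name]
	return b, ok
}

// Names lists the registered backend names in sorted order.
func Names() []string {
	mu.RLock()
	defer mu.RUnlock()
	names := make([]string, 0, len(backends))
	for n := range backends {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// nativeBackend is a stand-in for a real backend package.
type nativeBackend struct{}

func (nativeBackend) Name() string { return "native" }

// In the real layout this init() would live in the backend's own package.
func init() { Register(nativeBackend{}) }

func main() {
	fmt.Println(Names()) // prints "[native]"
}
```

With this pattern a consumer enables a backend by importing its package for side effects, and a duplicate name fails loudly at startup rather than silently replacing an earlier registration.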

6. Interface

```go
package sandbox

import (
    "context"
    "time"
)

// Backend is the contract every sandbox backend implements.
type Backend interface {
    Name() string
    Capabilities() Capabilities  // RootlessOK, NetworkControl, ImageBased, …
    Run(ctx context.Context, spec Spec) (*Result, error)
    Stream(ctx context.Context, spec Spec) (<-chan Event, error)
}

// Spec describes one execution request.
type Spec struct {
    Image       string            // backend-specific (docker tag, ssh host, …)
    Cmd         []string
    Env         map[string]string
    Workdir     string
    Limits      Limits
    Network     NetworkPolicy     // None | LoopbackOnly | Egress(allowlist)
    Mounts      []Mount
    User        string
    Timeout     time.Duration
}

// Limits holds the resource caps applied to a sandboxed process.
type Limits struct {
    CPUMillis     int           // 1000 = 1 core
    MemoryBytes   int64
    PidsMax       int
    DiskBytes     int64
    WallClock     time.Duration
}
```

Run is one-shot (collect-result). Stream emits stdout/stderr/exit events as they happen; it is used by interactive consumers (Kode chat tool calls, Canvas widget actions).

7. CLI

ksbx — for local probing and CI smoke tests:

```
ksbx run [--backend N] [--image I] [--cpu 500] [--mem 512m] -- <cmd> <args>
ksbx detect       # prints which backends are available on this host
ksbx ps           # list active sandboxes (backend-dependent)
ksbx kill <id>
```
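A value such as `--mem 512m` has to become a byte count before it can populate a memory limit. The sketch below shows one possible conversion; `parseMem` is a hypothetical helper, and treating `k`, `m`, and `g` as 1024-based multipliers is an assumption, since the RFC does not define the suffix semantics.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseMem converts a size such as "512m" into bytes.
// Assumption: k/m/g are binary multipliers; no suffix means plain bytes.
// Overflow of very large values is not checked in this sketch.
func parseMem(s string) (int64, error) {
	s = strings.ToLower(strings.TrimSpace(s))
	if s == "" {
		return 0, fmt.Errorf("empty size")
	}
	mult := int64(1)
	switch s[len(s)-1] {
	case 'k':
		mult = 1 << 10
	case 'm':
		mult = 1 << 20
	case 'g':
		mult = 1 << 30
	}
	if mult != 1 {
		s = s[:len(s)-1] // drop the suffix before parsing the number
	}
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid size %q: %w", s, err)
	}
	if n < 0 {
		return 0, fmt.Errorf("size must not be negative: %d", n)
	}
	return n * mult, nil
}

func main() {
	n, err := parseMem("512m")
	fmt.Println(n, err) // prints "536870912 <nil>"
}
```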

8. Scope of Change

- engines/sandbox/: new module (Go), structure per RFC-006.
- engines/sandbox/internal/{backend_native,backend_docker,backend_ssh}/: three backends in v0.1; firecracker + gvisor as v0.2 follow-up tickets.
- engines/sandbox/cmd/ksbx/: CLI.
- Audit integration: Job records pushed to infra/observe/log.
- Consumers documented (initial: mcp-registry, scheduler, kode); each consumer ticket is opened in the consumer's own backlog.

9. Execution Plan

| Phase | Deliverable |
|---|---|
| F1 | This RFC + module scaffold (interface, registry, native backend stub) |
| F2 | `native` backend: namespaces + seccomp + cgroups |
| F3 | `docker` backend |
| F4 | `ssh` backend |
| F5 | `ksbx` CLI |
| F6 | First consumer wires (mcp-registry, RFC-017) |
| F7 | `firecracker` + `gvisor` backends (separate tickets) |

10. Alternatives Considered

- *Use Docker only.* Rejected: requires Docker on every host, which defeats the zero-deps story; rootless mode still has limitations on older kernels.
- *Wrap bubblewrap.* Considered. The native backend may use bwrap internally on hosts where it's installed; otherwise it falls back to raw namespaces.
- *Build on existing Go libraries (gVisor's runsc, Firecracker SDK).* Adopted as backend implementations, not as the user-facing surface.

11. References

- meta/docs/stack/specs/binaries-and-cli/naming.kmd: /opt/koder/sandbox/, dev.koder.sandbox, ksbx binary.
- policies/sdk-first.kmd: sandbox is the canonical sandboxing pattern; no consumer should roll its own.
- OpenClaw analysis (20260429): origin of the gap.