Sandbox RFC 001 architecture

RFC-001 — Sandbox: Unified execution sandbox for agents and untrusted code

**Author:** Koder Engineering
**Date:** 2026-04-29
**Status:** Accepted
**Module:** engines/sandbox
**Origin:** koderstack ticket #032 (RFC-018), derived from OpenClaw analysis 2026-04-29
**Renamed from:** RFC-018 (the koderstack ticket numbering; this file follows the namespaced <topic>-RFC-NNN convention for meta/docs/stack/rfcs/)


1. Summary

A single engines/sandbox engine that every Koder component uses when it needs to execute untrusted or agent-generated code. Pluggable backends (Docker, Firecracker, gVisor, native namespaces, SSH) hide behind one small Go interface so consumers (services/ai/mcp-registry, services/foundation/scheduler, services/ai/kode, future agentic tools) never re-invent isolation.

2. Problem

Every component that runs untrusted code today rolls its own escape hatch: ad-hoc os/exec calls, copy-pasted Docker invocations, unstructured SSH wrappers. The result is inconsistent isolation guarantees, duplicated supervision logic, and no central audit trail.

3. Goals

  • One Go interface (Backend) with Run, Stream, Stop covering every realistic execution surface.
  • Backend selection is automatic, based on host capability detection (Docker present → docker; Linux ≥4.18 with userns → namespaces; etc.), with explicit override.
  • Audit log: every sandbox invocation produces a `Job{ID, command, backend, started, stopped, exitcode, logref}` record consumable by `infra/observe/log`.
  • Resource caps (CPU, memory, wall time, disk, network egress) enforced per-backend, with sensible defaults defined here.
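The automatic selection goal could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the engine's implementation: `hostInfo`, `probeDocker` and `selectBackend` are hypothetical names, the kernel and user-namespace facts are passed in rather than probed, and mapping the "namespaces" case to a backend called `native` is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// hostInfo is a hypothetical snapshot of what capability detection found.
type hostInfo struct {
	HasDocker   bool // a docker binary was found on PATH
	KernelMajor int
	KernelMinor int
	UserNS      bool // unprivileged user namespaces are usable
}

// probeDocker reports whether a docker binary is on PATH.
func probeDocker() bool {
	_, err := exec.LookPath("docker")
	return err == nil
}

// kernelAtLeast reports whether the host kernel is >= major.minor.
func (h hostInfo) kernelAtLeast(major, minor int) bool {
	return h.KernelMajor > major ||
		(h.KernelMajor == major && h.KernelMinor >= minor)
}

// selectBackend honours an explicit override first, then falls back to
// capability-based selection in the order given in the goals.
func selectBackend(override string, h hostInfo) (string, error) {
	if override != "" {
		return override, nil
	}
	if h.HasDocker {
		return "docker", nil
	}
	if h.kernelAtLeast(4, 18) && h.UserNS {
		return "native", nil
	}
	return "", errors.New("no usable sandbox backend detected")
}

func main() {
	// Kernel version and userns support are hard-coded here for illustration.
	h := hostInfo{HasDocker: probeDocker(), KernelMajor: 5, KernelMinor: 15, UserNS: true}
	name, err := selectBackend("", h)
	fmt.Println(name, err)
}
```

Keeping detection (`hostInfo`) separate from the decision (`selectBackend`) makes the selection rule testable without touching the host.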

4. Non-Goals

  • Replacing CI runners (Gitea Actions, Drone). Sandbox is for runtime agentic execution, not build pipelines.
  • Cluster orchestration (Kubernetes, Nomad). Sandbox supervises one process; cluster-scale orchestration is out of scope.

5. Backends

| Backend | Use case | Default? |
|---------|----------|----------|
| native | Linux user namespaces + seccomp + cgroups; zero deps. | Default fallback. |
| docker | Host has Docker; richer isolation, image-based. | Default when docker on PATH. |
| firecracker | High-isolation needs (multitenant compute, payment processing). | Opt-in. |
| gvisor | When kernel-level isolation needed without VM overhead. | Opt-in. |
| ssh | Execute on a remote host (e.g. s.forge). | Opt-in via `--backend ssh --host …`. |

Each backend is a Go package under engines/sandbox/internal/backend_<name>/ that registers itself in a process-wide registry.
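A self-registering, process-wide registry might look like the sketch below. The names `Register`, `Lookup` and `Names` are assumptions made for illustration (the RFC does not name the registry API), and `Backend` is trimmed to the single method the registry needs.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Backend is reduced to the one method the registry needs in this sketch.
type Backend interface {
	Name() string
}

var (
	mu       sync.RWMutex
	backends = map[string]Backend{}
)

// Register adds b to the process-wide registry; duplicate names are rejected.
func Register(b Backend) error {
	mu.Lock()
	defer mu.Unlock()
	if _, dup := backends[b.Name()]; dup {
		return fmt.Errorf("sandbox: backend %q already registered", b.Name())
	}
	backends[b.Name()] = b
	return nil
}

// Lookup returns the backend registered under name, if any.
func Lookup(name string) (Backend, bool) {
	mu.RLock()
	defer mu.RUnlock()
	b, ok := backends[name]
	return b, ok
}

// Names lists the registered backends in sorted order.
func Names() []string {
	mu.RLock()
	defer mu.RUnlock()
	names := make([]string, 0, len(backends))
	for n := range backends {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// nativeBackend stands in for a real backend package.
type nativeBackend struct{}

func (nativeBackend) Name() string { return "native" }

// In a real backend package this call would live in that package's init().
func init() {
	if err := Register(nativeBackend{}); err != nil {
		panic(err)
	}
}

func main() {
	fmt.Println(Names()) // prints [native]
}
```

Registering from `init()` means a consumer enables a backend simply by importing its package, which is the same pattern `database/sql` drivers use.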

6. Interface

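package sandbox

import (
    "context"
    "time"
)

// Backend is implemented by every isolation backend (native, docker, ssh, …).
type Backend interface {
    Name() string
    Capabilities() Capabilities // RootlessOK, NetworkControl, ImageBased, …
    // Run executes spec to completion and returns the collected result.
    Run(ctx context.Context, spec Spec) (*Result, error)
    // Stream executes spec and delivers events on the returned channel.
    Stream(ctx context.Context, spec Spec) (<-chan Event, error)
}

// Spec describes a single sandboxed execution.
type Spec struct {
    Image   string            // backend-specific (docker tag, ssh host, …)
    Cmd     []string          // command and arguments
    Env     map[string]string // environment variables
    Workdir string
    Limits  Limits
    Network NetworkPolicy // None | LoopbackOnly | Egress(allowlist)
    Mounts  []Mount
    User    string
    Timeout time.Duration
}

// Limits holds the resource caps each backend enforces.
type Limits struct {
    CPUMillis   int // 1000 = 1 core
    MemoryBytes int64
    PidsMax     int
    DiskBytes   int64
    WallClock   time.Duration
}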

Run is one-shot (collect the result). Stream emits stdout/stderr/exit events as they happen; it is used by interactive consumers (Kode chat tool calls, Canvas widget actions).

7. CLI

ksbx — for local probing and CI smoke tests:

ksbx run [--backend N] [--image I] [--cpu 500] [--mem 512m] -- <cmd> <args>
ksbx detect       # prints which backends are available on this host
ksbx ps           # list active sandboxes (backend-dependent)
ksbx kill <id>
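The `--mem 512m` flag implies some conversion from a suffixed size to the byte count held in Limits.MemoryBytes. A minimal sketch of such a parser follows; `parseMem` is a hypothetical helper, and treating the suffixes as binary multiples (k = 1024, m = 1024², g = 1024³) is an assumption, since the RFC does not specify the unit convention.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// parseMem converts strings such as "512m" or "2g" to a byte count.
// Assumption: suffixes are binary multiples; a bare number means bytes.
func parseMem(s string) (int64, error) {
	s = strings.ToLower(strings.TrimSpace(s))
	if s == "" {
		return 0, errors.New("empty memory size")
	}
	mult := int64(1)
	switch s[len(s)-1] {
	case 'k':
		mult = 1 << 10
	case 'm':
		mult = 1 << 20
	case 'g':
		mult = 1 << 30
	}
	if mult != 1 {
		s = s[:len(s)-1] // drop the unit suffix before parsing the number
	}
	n, err := strconv.ParseInt(s, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid memory size: %w", err)
	}
	if n <= 0 {
		return 0, errors.New("memory size must be positive")
	}
	return n * mult, nil
}

func main() {
	b, err := parseMem("512m")
	fmt.Println(b, err) // prints 536870912 <nil>
}
```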

8. Scope of Change

  • engines/sandbox/: new module (Go), structure per RFC-006.
  • engines/sandbox/internal/{backend_native,backend_docker,backend_ssh}/: three backends in v0.1; firecracker + gvisor as v0.2 follow-up tickets.
  • engines/sandbox/cmd/ksbx/: CLI.
  • Audit integration: Job records pushed to infra/observe/log.
  • Consumers documented (initial: mcp-registry, scheduler, kode); each consumer ticket is opened in that consumer's own backlog.

9. Execution Plan

| Phase | Deliverable |
|-------|-------------|
| F1 | This RFC + module scaffold (interface, registry, native backend stub) |
| F2 | native backend: namespaces + seccomp + cgroups |
| F3 | docker backend |
| F4 | ssh backend |
| F5 | ksbx CLI |
| F6 | First consumer wires (mcp-registry, RFC-017) |
| F7 | firecracker + gvisor backends (separate tickets) |

10. Alternatives Considered

  • **Use Docker only.** Rejected: requires Docker on every host, defeats the zero-deps story; rootless mode still has limitations on older kernels.
  • **Wrap bubblewrap.** Considered. The native backend may use bwrap internally on hosts where it's installed; otherwise it falls back to raw namespaces.
  • **Build on existing Go libraries (gVisor's runsc, Firecracker SDK).** Adopted as backend implementations, not as the user-facing surface.

11. References

  • meta/docs/stack/specs/binaries-and-cli/naming.kmd: /opt/koder/sandbox/, dev.koder.sandbox, ksbx binary.
  • policies/sdk-first.kmd: sandbox is the canonical sandboxing pattern; no consumer should roll its own.
  • OpenClaw analysis (2026-04-29): origin of the gap.

Source: ../home/koder/dev/koder/meta/docs/stack/rfcs/sandbox-RFC-001-architecture.md