Agent stream

draft

Koder Agent Stream Protocol

Abstract

This specification defines the wire format for streaming an agent run from the gateway (services/ai/ai/gateway) to clients (CLI, web, mobile, desktop, third-party API consumers). It standardises *even event types*that together describe reasoning, tool use, step boundaries, and run lifecycle — enough for UI surfaces to render a faithful timeline equivalent to Manus 1.6 Max, Devin 2.0, and Claude Code's transcript views.

The protocol is transport-agnostic: same event payload over ServerSent Events (SSE), WebSocket, gRPC serverstream, or stored JSONL for replay (AICORE-117).

1. Design philosophy

  • *ingle envelope, typed payloads.*Every event shares a small

    envelope (id, ts, type, runid, childid?) and discriminates by type. Consumers parse once.

  • *easoning is a first-class segment, separate from text.*

    Anthropic-style thinking blocks need their own renderer (collapsed by default, smaller font, italic). Text and reasoning never intermix in the same event.

  • *tep boundaries are observable.*AICORE-117 (step replay) and

    AICORE-121 (checkpoints) both need to know where steps end.

  • *ool calls have request + result events.*Same call_id

    correlates them, even when the result arrives N events later.

  • *ultichild support from day one.*AIGW051 parallel dispatch

    embeds child_id so parent streams can intercalate sub-runs.

  • *ounded throughput.*Server agreggates to ≤ 10 events/sec per

    run by default; clients can request raw via ?detail=full.

2. Envelope

Every event JSON is a single object:

{
  "id":       "evt_01HZ7K4P...",
  "ts":       "2026-05-14T19:42:13.124Z",
  "type":     "reasoning.delta",
  "run_id":   "run_01HZ7K3M...",
  "child_id": null,
  "seq":      127,
  "payload":  {  }
}

Envelope fields

Field Type Required Description
id string yes ULID, monotonic per stream
ts RFC3339 string yes server-side timestamp
type enum yes one of the 7 types in §3
run_id string yes the agent run this event belongs to
child_id string|null no nonnull when this event comes from a subrun (AIGW-051)
seq int yes per-run monotonic sequence; gap = dropped event
payload object yes type-specific shape per §3

3. The 8 event types

3.1 reasoning.delta

Incremental "thinking" output — what model produces before/between visible text. Per Anthropic extended-thinking semantics.

{ "type": "reasoning.delta", "payload": { "text": "Let me check..." } }

Multiple deltas concatenate to form the full reasoning block of the current step. Clients render in a collapsed block (default closed), smaller font, italic. Reasoning is *ot*assistant-visible output — treat as scratch.

3.2 text.delta

Incremental assistant-visible text. Concatenates to the message body.

{ "type": "text.delta", "payload": { "text": "I'll compare the three..." } }

*ooloutput streaming (AICORE137c).*Tools that implement StreamingTool (Go interface { Tool; CallStream(ctx, input, chunks chan<- string) (output, error) }) cause the runtime to emit one text.delta per chunk between tool.start and tool.end. The final tool.end.output still carries the canonical JSON. Clients SHOULD render the deltas inline (just like model-produced text) without distinguishing toolstreamed text from modelstreamed text — from the consumer's perspective, both are assistant-visible content arriving incrementally.

3.3 tool.start

A tool invocation begins.

{
  "type": "tool.start",
  "payload": {
    "call_id":  "call_01HZ7K5R...",
    "tool":     "browser.navigate",
    "input":    { "url": "https://example.com" },
    "skill_id": "compare-products@0.2.0"
  }
}

skill_id (optional) marks the call as originating from a TOOLS-013 skill (versioned).

3.4 tool.end

The corresponding tool invocation completed.

{
  "type": "tool.end",
  "payload": {
    "call_id":  "call_01HZ7K5R...",
    "ok":       true,
    "output":   { "url": "https://example.com", "title": "Example" },
    "error":    null,
    "duration_ms": 1842
  }
}

When ok=false, error carries a KAI-* code per specs/errors/user-facing-messages.kmd. The output field MAY be omitted in error cases or when output is large (then a blob_ref points to kdrive).

3.5 step.boundary

The agent finished a step (a coherent unit of reasoning + tool calls + text). Triggers checkpoint capture (AICORE121) and stepreplay indexing (AICORE-117).

{
  "type": "step.boundary",
  "payload": {
    "step_index":  4,
    "step_kind":   "tool-roundtrip",
    "checkpoint_id": "ckpt_01HZ7K6T..."
  }
}

step_kind is one of: plan, tool-roundtrip, text-only, fan-out (parent emitting subruns), `fanin (parent merging results), done`.

3.6 child.spawn

Parent run is dispatching a subrun (AIGW051).

{
  "type": "child.spawn",
  "payload": {
    "child_id": "run_01HZ7K7U...",
    "prompt": "...",
    "tools_allowed": ["browser", "web.extract"]
  }
}

After spawn, events from the child stream appear in the parent stream with child_id set. Consumers can group by child_id to render parallel tabs (Cursor 3 / Replit Agent 4 style).

3.8 plan.proposal (added v0.2.0, AICORE-126)

The agent generated a plan for human review. Emitted when the run was started with planning_gate=required or planning_gate=preview. For required, the run pauses in awaiting_approval until the user calls POST /v1/agent/runs/:id/approve_plan.

{
  "type": "plan.proposal",
  "payload": {
    "plan": {
      "id": "plan_01HZ...",
      "run_id": "run_01HZ7K3M...",
      "steps": [
        { "id": "s1", "title": "Research candidate libraries",
          "intent": "research", "est_tools": ["web.search"],
          "est_cost_usd": 0.02 },
        { "id": "s2", "title": "Write benchmark harness",
          "intent": "write", "est_tools": ["file.write", "shell"],
          "est_cost_usd": 0.05 }
      ],
      "est_total_cost_usd": 0.07
    }
  }
}

step.intent is one of: research, write, edit, execute, verify, report, other. UIs render an icon per intent. Cost estimation per step comes from services/foundation/billing token history.

3.7 run.lifecycle

Lifecycle state transition. One event per state change.

{
  "type": "run.lifecycle",
  "payload": {
    "state": "running",
    "reason": null
  }
}

State values:

State When
planning run accepted, plan being generated (AICORE-126)
awaiting_approval plan proposed, waiting for user (AICORE-126)
running actively executing
paused user paused (AICORE-122)
redirecting user issued redirect (AICORE-122)
done completed successfully; final state
aborted user cancelled; final state
error unrecoverable failure; final state

reason is a human-readable string explaining the transition (e.g. "user clicked pause", "tool browser failed 3 times").

4. Throughput shaping (R1–R4)

R1 — Default aggregation

Server batches reasoning.delta and text.delta events to at most *0 events/sec*per run by default. Aggregation window: 100ms. Multiple deltas within the window concatenate text.

R2 — Raw mode opt-in

Client may request ?detail=full (SSE) or send a Detail: full header (WebSocket handshake) to disable aggregation. Returns every underlying token boundary. Use for debugging, replay validation, or animations that need character-level granularity.

R3 — tool.start / tool.end / step.boundary are never batched

These are control-plane signals; deliver immediately even under aggregation pressure.

R4 — Backpressure

If client's read buffer fills, server drops reasoning.delta first, then text.delta, never toolsteplifecycle. Dropped events are counted in a final run.lifecycle with state=done payload extending with dropped_count.

5. Transport profiles

5.1 SSE (HTTP/2 server-sent events) — primary

  • Endpoint: GET /v1/agent/runs/:id/stream
  • ContentType: `text/eventstream`
  • One event per data: line, JSON-encoded
  • Reconnect with Last-Event-ID: <seq> header; server replays from seq+1
  • Heartbeat: empty : keepalive\n\n every 15s

5.2 WebSocket — for browser when needed

  • Endpoint: GET /v1/agent/runs/:id/stream with Upgrade: websocket
  • Binary frames forbidden; only text JSON
  • Same envelope; client opts in via subprotocol koder-agent-stream-v1

5.3 gRPC server-stream — internal

  • Service agent.v1.AgentRun method `Stream(StreamRequest)

    returns (stream Event)`

  • Same proto3 shape as JSON

5.4 JSONL replay — offline

  • One JSON event per line, id ordered ascending
  • Stored in kdrive blob per agent_runs.replay_uri
  • Backs AICORE117 stepreplay UI

6. Persistence

The full event stream is persisted in kdb (agent_run_events table) multitenant per `policies/multitenantbydefault.kmd`:

Column Type Notes
id ULID PK
run_id ULID FK
koder_user_id string tenant
child_id ULID? when from sub-run
seq int64 per-run monotonic
type enum one of §3
ts timestamp server-side
payload jsonb type-specific

Retention: 90 days for completed runs (overrideable per workspace); 30 days for abortederror. Per `policiesidentitydataretention.kmd` spirit — but this is operational data, not auth.

7. Test contract (T1–T5)

  • *1*— Round-trip: marshal → unmarshal → equal (Go + JS clients)
  • *2*— Ordering: client receiving events outofseq detects gap
  • *3*— Reconnect: send LastEventID, get events ≥ seq+1
  • *4*— Throughput: 1000 deltasrun aggregated to ≤10s by default
  • *5*— Multi-child: parent stream intercalates child events

    without corrupting ordering per child_id

Templates live in specs/ai/agent-stream-test-template.kmd (to be authored as subticket of AICORE120).

8. Error codes

Per specs/errors/user-facing-messages.kmd, namespace KAI-STREAM-:

Code Meaning
KAI-STREAM-CONN-0001 client disconnected mid-stream
KAI-STREAM-CONN-0002 reconnect LastEventID too old (replay window expired)
KAI-STREAM-PROTO-0001 malformed event in raw mode
KAI-STREAM-PROTO-0002 unknown event type
KAI-STREAM-BUF-0001 backpressure drop occurred

9. Compatibility

  • *penAI streaming API*— superset; text.delta maps to OpenAI's

    content delta; we add reasoning + tool/step events

  • *nthropic streaming API*— direct mapping for reasoning.delta

    (thinking blocks) and text.delta; tool events different

  • *SE general*— strict subset; consumable by EventSource in

    any browser

10. References

  • AICORE-117 (step replay — consumes step.boundary + JSONL persistence)
  • AICORE-121 (checkpoints — consumes step.boundary)
  • AICORE-122 (pause/redirect — emits run.lifecycle)
  • AICORE123 (followups — runs after run.lifecycle state=done)
  • AICORE-126 (planning gate — emits planningawaiting_approvalrunning)
  • AIGW-051 (parallel — emits child.spawn + child_id)
  • KIT-030 (KAiThinking, KAiToolCall, KAiStepBoundary widgets)
  • WEBKIT-010 (web custom elements)
  • specs/errors/user-facing-messages.kmd
  • policies/multi-tenant-by-default.kmd

Source: ../home/koder/dev/koder/meta/docs/stack/specs/ai/agent-stream.kmd