MCP sampling approval

mandatory

UI for MCP server-initiated LLM completion requests delegated to the client (which has model + credentials + cost tracking). Two-step approval: pre-LLM (prompt) + post-LLM (response). Audit-logged. Closes MCP Section A of umbrella #099.

Spec — MCP sampling approval

MCP normative source: https://modelcontextprotocol.io/specification/2025-06-18/client/sampling. Advanced pattern: activated once any Koder tool server uses sampling (services/ai/agents or services/ai/kode in the future).

Principles

  1. *Two-step approval*: pre-LLM (prompt visible + editable) + post-LLM (response visible + approve/regenerate/reject).
  2. *Client chooses model*: the server does NOT specify the model; the client decides via model-selector.kmd (#113).
  3. *Audit-logged*: prompt + response + decision persisted.
  4. *Cost-aware*: the pre-step shows the estimated cost (cross-link #112).
  5. *Auto-approve opt-in*: the user can enable auto-approve per server.

R1 — Pre-LLM step

Server requests sampling:

{
  "type": "sampling/createMessage",
  "messages": [...],
  "modelPreferences": {
    "intelligencePriority": 0.8,
    "speedPriority": 0.3
  },
  "systemPrompt": "...",
  "maxTokens": 1000
}
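
A minimal sketch of how a client might validate this request before rendering the dialog. It follows the simplified shape shown above; the `SamplingRequest` class and `parse_sampling_request` function are hypothetical names, not part of any Koder or MCP library.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class SamplingRequest:
    messages: list[dict[str, Any]]
    max_tokens: int
    system_prompt: str = ""
    # Server hints only; the client stays free to ignore them (see R5).
    model_preferences: dict[str, float] = field(default_factory=dict)


def parse_sampling_request(payload: dict[str, Any]) -> SamplingRequest:
    """Validate the request shape shown above and return a typed view of it."""
    if payload.get("type") != "sampling/createMessage":
        raise ValueError("not a sampling/createMessage request")
    messages = payload.get("messages")
    if not isinstance(messages, list) or not messages:
        raise ValueError("messages must be a non-empty list")
    max_tokens = payload.get("maxTokens")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens <= 0:
        raise ValueError("maxTokens must be a positive integer")
    raw_prefs = payload.get("modelPreferences")
    if not isinstance(raw_prefs, dict):
        raw_prefs = {}
    # Keep only numeric priorities inside [0, 1]; anything else is dropped.
    prefs = {
        name: float(value)
        for name, value in raw_prefs.items()
        if isinstance(value, (int, float))
        and not isinstance(value, bool)
        and 0.0 <= value <= 1.0
    }
    return SamplingRequest(
        messages=messages,
        max_tokens=max_tokens,
        system_prompt=str(payload.get("systemPrompt", "")),
        model_preferences=prefs,
    )
```

Rejecting malformed requests before any UI is shown keeps the approval dialog from ever displaying a prompt the client cannot run.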

Client renders dialog:

┌───────────────────────────────────────────────┐
│ Server requests LLM completion                │
│ From: <server_origin_chip>                    │
├───────────────────────────────────────────────┤
│ Prompt:                                       │
│ [editable text area with messages preview]   │
├───────────────────────────────────────────────┤
│ Model: [Auto (Kode Relay)] ▾                  │
│ Estimated cost: ~250 tokens, $0.005           │
├───────────────────────────────────────────────┤
│ [Reject] [Approve & run]                      │
└───────────────────────────────────────────────┘

The user can edit the prompt before approving; any edit is audit-logged.
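
The pre-step outcome could be modeled roughly as below. This is a sketch under assumptions: `PreStepResult` and `resolve_pre_step` are hypothetical names, and the prompt is treated as a single string for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PreStepResult:
    approved: bool
    prompt: str           # text that would be sent to the LLM if approved
    prompt_edited: bool   # True when the user changed the server's prompt
    original_prompt: str  # kept so the audit log can record the edit


def resolve_pre_step(
    original_prompt: str,
    user_prompt: Optional[str],
    approved: bool,
) -> PreStepResult:
    """Combine the server's prompt, the user's optional edit, and the decision."""
    final_prompt = original_prompt if user_prompt is None else user_prompt
    # An edit only matters for the audit trail when the request is approved.
    edited = approved and final_prompt != original_prompt
    return PreStepResult(
        approved=approved,
        prompt=final_prompt,
        prompt_edited=edited,
        original_prompt=original_prompt,
    )
```

Keeping the original prompt next to the edited one lets the audit step record exactly what the user changed.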

R2 — Post-LLM step

After LLM responds, client shows preview before returning to server:

┌───────────────────────────────────────────────┐
│ LLM response ready for server                 │
├───────────────────────────────────────────────┤
│ [response text]                               │
├───────────────────────────────────────────────┤
│ Tokens: 250 in / 180 out · Cost: $0.005      │
├───────────────────────────────────────────────┤
│ [Reject] [Regenerate] [Send to server]        │
└───────────────────────────────────────────────┘

Regenerate: re-invokes the LLM with the same prompt (the user may switch models first).
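
The post-step can be read as a small loop over user actions. The sketch below assumes a hypothetical `invoke_llm(model, prompt)` callable and a pre-recorded list of actions in place of real UI events; treating a dialog closed without a decision as a rejection is also an assumption.

```python
from typing import Callable, Iterable, Optional

Action = tuple[str, Optional[str]]  # (action name, optional new model)


def run_post_step(
    prompt: str,
    model: str,
    invoke_llm: Callable[[str, str], str],
    user_actions: Iterable[Action],
) -> tuple[Optional[str], list[str]]:
    """Return (response sent to the server or None, list of decisions taken)."""
    response = invoke_llm(model, prompt)
    decisions: list[str] = []
    for action, new_model in user_actions:
        if action == "regenerate":
            # Same prompt, optionally a different model.
            if new_model is not None:
                model = new_model
            response = invoke_llm(model, prompt)
            decisions.append("REGENERATED")
        elif action == "send":
            decisions.append("APPROVED")
            return response, decisions
        elif action == "reject":
            decisions.append("REJECTED")
            return None, decisions
        else:
            raise ValueError(f"unknown action: {action!r}")
    # Assumption: no terminal action means the dialog was dismissed.
    decisions.append("REJECTED")
    return None, decisions
```

Each entry in the returned decision list corresponds to one audit event, so a regenerate followed by a send produces two entries.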

R3 — Auto-approve mode (opt-in)

User can enable auto-approve per (server_id, koder_user_id, workspace_id). When enabled:

  • Pre-step: skipped (prompt auto-approved).
  • Post-step: optionally skipped (depending on additional setting).

Auto-approve is disabled by default (security). UI surface in `mcpserver-state.kmd` (#104) drawer settings.

Auto-revoke: after 7 days by default (similar to `mcppermission-prompt.kmd` R5, High risk).
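
A sketch of how the per-scope grant and its expiry might be tracked. The `AutoApproveStore` class is hypothetical and keeps grants in memory; expiring a grant at exactly 7 days is an assumption, since the spec only gives the default duration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

AUTO_REVOKE_AFTER = timedelta(days=7)  # default from R3

# (server_id, koder_user_id, workspace_id)
GrantKey = tuple[str, str, str]


@dataclass(frozen=True)
class AutoApproveGrant:
    granted_at: datetime
    skip_post_step: bool = False  # the "additional setting" for the post-step


class AutoApproveStore:
    def __init__(self) -> None:
        self._grants: dict[GrantKey, AutoApproveGrant] = {}

    def enable(self, key: GrantKey, now: datetime, skip_post_step: bool = False) -> None:
        self._grants[key] = AutoApproveGrant(now, skip_post_step)

    def active_grant(self, key: GrantKey, now: datetime) -> Optional[AutoApproveGrant]:
        """Return the grant if still valid; drop it and return None once expired."""
        grant = self._grants.get(key)
        if grant is None:
            return None  # disabled by default
        if now - grant.granted_at >= AUTO_REVOKE_AFTER:
            del self._grants[key]
            return None
        return grant


store = AutoApproveStore()
t0 = datetime(2025, 1, 1, tzinfo=timezone.utc)
scope = ("server-1", "user-1", "ws-1")
store.enable(scope, now=t0)
still_active = store.active_grant(scope, now=t0 + timedelta(days=6)) is not None
expired = store.active_grant(scope, now=t0 + timedelta(days=8)) is None
```

With these values, the grant is still active on day 6 and gone on day 8, at which point the user is prompted again.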

R4 — Audit log

Each decision emits event to services/foundation/audit/:

event_type: "mcp.sampling.decision"
decision: "APPROVED" | "REJECTED" | "REGENERATED" | "AUTO_APPROVED"
koder_user_id, workspace_id, server_id, model, tokens_in, tokens_out, cost, timestamp

The event also records the prompt edit when the user modified the prompt before approval.
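
The event payload could be assembled as in the following sketch. The field names come from the list above, except `prompt_edit`, which is a hypothetical name for the optional edit record; the ISO 8601 timestamp format is also an assumption, and how the event reaches services/foundation/audit/ is left out.

```python
from datetime import datetime, timezone
from typing import Any, Optional

DECISIONS = {"APPROVED", "REJECTED", "REGENERATED", "AUTO_APPROVED"}


def build_audit_event(
    decision: str,
    koder_user_id: str,
    workspace_id: str,
    server_id: str,
    model: str,
    tokens_in: int,
    tokens_out: int,
    cost: float,
    timestamp: datetime,
    prompt_edit: Optional[dict[str, str]] = None,
) -> dict[str, Any]:
    """Build the mcp.sampling.decision payload; rejects unknown decisions."""
    if decision not in DECISIONS:
        raise ValueError(f"unknown decision: {decision!r}")
    event: dict[str, Any] = {
        "event_type": "mcp.sampling.decision",
        "decision": decision,
        "koder_user_id": koder_user_id,
        "workspace_id": workspace_id,
        "server_id": server_id,
        "model": model,
        "tokens_in": tokens_in,
        "tokens_out": tokens_out,
        "cost": cost,
        "timestamp": timestamp.isoformat(),
    }
    # Only present when the user changed the prompt in the pre-step.
    if prompt_edit is not None:
        event["prompt_edit"] = prompt_edit
    return event
```

Validating the decision value at construction time keeps malformed events out of the log rather than relying on downstream consumers to filter them.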

R5 — Model selection

model-selector.kmd (#113) integrates: user picks model in pre-step. Server modelPreferences are HINTS, not commands — client decides.

Hint resolution:

Server hint                  Client behavior
intelligencePriority: 0.8+   Prefer high-tier model (Opus, GPT-5)
speedPriority: 0.8+          Prefer fast model (Haiku, Mini, Flash)
costPriority: 0.8+           Prefer cheap or self-hosted Koder LLM
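
The hint table can be sketched as a small resolver. The tier labels, the `"auto"` fallback, and the tie-breaking rule (highest priority wins, table order on ties) are assumptions; the spec only fixes the 0.8 threshold and that an explicit user choice overrides server hints.

```python
from typing import Optional

HINT_THRESHOLD = 0.8

# Order mirrors the hint table; used to break ties between equal priorities.
HINT_TO_TIER = [
    ("intelligencePriority", "high-tier"),
    ("speedPriority", "fast"),
    ("costPriority", "cheap"),
]


def resolve_model_tier(
    model_preferences: dict[str, float],
    user_choice: Optional[str] = None,
) -> str:
    """Map server hints to a model tier; an explicit user choice always wins."""
    if user_choice is not None:
        return user_choice
    matching = [
        (model_preferences.get(hint, 0.0), tier)
        for hint, tier in HINT_TO_TIER
        if model_preferences.get(hint, 0.0) >= HINT_THRESHOLD
    ]
    if not matching:
        return "auto"  # no strong hint: leave it to the default selector
    # max() returns the first maximal item, so table order breaks ties.
    return max(matching, key=lambda item: item[0])[1]
```

For the request shown in R1 (`intelligencePriority` 0.8, `speedPriority` 0.3), this resolver would pick the high-tier option unless the user selects a model in the pre-step.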

R6 — Surface bindings

Surface           API
Flutter           KoderMCPSamplingDialog in koder_kit/lib/src/ai/mcp_sampling_dialog.dart
Web               <koder-mcp-sampling-dialog>
Compose/SwiftUI   future
CLI / TUI         Inline prompts with Y/N/r (r = regenerate)
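
For the CLI / TUI surface, the inline answer might be parsed along these lines. The function name and the returned action strings are hypothetical; accepting both upper and lower case is an assumption.

```python
from typing import Optional


def parse_cli_answer(raw: str, allow_regenerate: bool) -> Optional[str]:
    """Map a typed Y/N/r answer to an action, or None if it is not recognized.

    allow_regenerate is False in the pre-step, where only Y/N make sense.
    """
    answer = raw.strip().lower()
    if answer == "y":
        return "approve"
    if answer == "n":
        return "reject"
    if allow_regenerate and answer == "r":
        return "regenerate"
    return None  # caller should re-prompt
```

Returning `None` for unrecognized input, instead of defaulting to approve or reject, forces the caller to ask again rather than guess the user's intent.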

R7 — Accessibility

  • Dialog: role="dialog" aria-modal="true".
  • Prompt edit area: <textarea> with aria-label.
  • Cost estimate: referenced from the dialog via aria-describedby.
  • Focus management; ESC cancels (= reject).

R8 — i18n

Key                               en-US                                        pt-BR
mcp.sampling.title                "Server requests LLM completion"             "Servidor solicita resposta de IA"
mcp.sampling.prompt_label         "Prompt (editable)"                          "Prompt (editável)"
mcp.sampling.action.approve_run   "Approve & run"                              "Aprovar e executar"
mcp.sampling.action.regenerate    "Regenerate"                                 "Gerar de novo"
mcp.sampling.action.send          "Send to server"                             "Enviar ao servidor"
mcp.sampling.action.reject        "Reject"                                     "Recusar"
mcp.sampling.cost.estimate        "Estimated cost: ~{tokens} tokens, {cost}"   "Custo estimado: ~{tokens} tokens, {cost}"
mcp.sampling.auto.enabled         "Auto-approve enabled for this server"       "Aprovação automática ativa pra este servidor"
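
A minimal sketch of looking up these keys and filling the `{tokens}` and `{cost}` placeholders. Only two keys are included for brevity; the `translate` helper and the fallback to en-US for unknown locales are assumptions, not a description of Koder's actual i18n layer.

```python
MESSAGES = {
    "en-US": {
        "mcp.sampling.title": "Server requests LLM completion",
        "mcp.sampling.cost.estimate": "Estimated cost: ~{tokens} tokens, {cost}",
    },
    "pt-BR": {
        "mcp.sampling.title": "Servidor solicita resposta de IA",
        "mcp.sampling.cost.estimate": "Custo estimado: ~{tokens} tokens, {cost}",
    },
}

FALLBACK_LOCALE = "en-US"


def translate(key: str, locale: str, **params: object) -> str:
    """Return the localized string for key with placeholders substituted."""
    catalog = MESSAGES.get(locale, MESSAGES[FALLBACK_LOCALE])
    template = catalog.get(key)
    if template is None:
        # Missing translation: fall back to en-US; KeyError if the key is unknown.
        template = MESSAGES[FALLBACK_LOCALE][key]
    return template.format(**params)


label = translate("mcp.sampling.cost.estimate", "pt-BR", tokens=250, cost="$0.005")
```

Passing the cost as an already formatted string keeps currency formatting out of the message catalog.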

T-suite

  • *T1* Pre-step render: receive sampling request → dialog shows prompt + model picker + cost.
  • *T2* Prompt edit: modify prompt → submit includes edited text + audit log entry.
  • *T3* Approve & run → LLM invoked; post-step appears with response.
  • *T4* Post-step approve: send to server → sampling/createMessage response.
  • *T5* Regenerate: re-invoke LLM with same prompt → new response.
  • *T6* Reject: emits rejection; server receives error.
  • *T7* Model switch in pre-step: change from Auto to Claude Opus → LLM called with selected model.
  • *T8* Auto-approve enabled: server requests → pre-step skipped; audit event marks "AUTO_APPROVED".
  • *T9* Auto-revoke: 8 days later → auto-approve expired; user re-prompted.
  • *T10* Cost transparency: pre-step shows an accurate estimate; post-step shows the actual cost.

Source: ../home/koder/dev/koder/meta/docs/stack/specs/ai-ui/mcp-sampling-approval.kmd