AI cost display

mandatory

Per-message badge + session header total + breakdown drawer for AI cost tracking. Token counts (in/out), monetary cost (configurable currency or Koder credits), model attribution, threshold alerts. Backend: services/ai/billing/.

Spec — AI cost display

Backend: services/ai/billing/, services/ai/gateway/. Pricing source: registries/ai-model-recommendations.md OR gateway response metadata.

Principles

  1. **Per-message + session aggregate**: both visible.
  2. **Configurable currency**: BRL / USD / Koder credits.
  3. **Discrete by default**: small chip; expand for breakdown.
  4. **Threshold alerts**: configurable soft + hard limits.

R1 — Per-message badge

Compact chip on assistant bubble footer:

🪙 250 in · 180 out · $0.005

Slots: in tokens, out tokens, monetary cost (or credits).

Configurable display: hide cost (only show tokens) or hide tokens (only show cost).

Position: bubble footer, before the AI disclaimer chip (crosslink `chat-message-bubble.kmd` R2).
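
The chip layout and the two display toggles above can be sketched as a small formatter. This is a minimal illustration only: `format_badge` and its parameters are hypothetical names, and the three-decimal cost format is inferred from the `$0.005` example rather than stated by the spec.

```python
def format_badge(tokens_in, tokens_out, cost, *, symbol="$",
                 show_tokens=True, show_cost=True):
    """Build the per-message chip text (hypothetical helper, not a Koder API)."""
    parts = []
    if show_tokens:
        parts.append(f"{tokens_in} in")
        parts.append(f"{tokens_out} out")
    if show_cost:
        # Three decimals assumed from the "$0.005" example in R1.
        parts.append(f"{symbol}{cost:.3f}")
    if not parts:
        return ""  # both slots hidden: render no chip at all
    return "🪙 " + " · ".join(parts)


print(format_badge(250, 180, 0.005))                     # 🪙 250 in · 180 out · $0.005
print(format_badge(250, 180, 0.005, show_cost=False))    # 🪙 250 in · 180 out
print(format_badge(250, 180, 0.005, show_tokens=False))  # 🪙 $0.005
```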

R2 — Session header total

Chat header strip:

Session: 12 messages · 4,250 tokens · $0.18

Updates live as messages complete. Click → opens breakdown drawer (R3).
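
As a rough sketch of how the header string could be derived, the snippet below sums per-message usage into the session line. `MessageUsage` and `session_header` are illustrative names, and the token total is assumed to be input plus output tokens; the two-decimal cost format follows the `$0.18` example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MessageUsage:
    """Usage for one completed message (hypothetical shape)."""
    tokens_in: int
    tokens_out: int
    cost_usd: float


def session_header(messages):
    """Aggregate completed messages into the header strip text."""
    tokens = sum(m.tokens_in + m.tokens_out for m in messages)
    cost = sum(m.cost_usd for m in messages)
    return f"Session: {len(messages)} messages · {tokens:,} tokens · ${cost:.2f}"


usage = [
    MessageUsage(1000, 500, 0.06),
    MessageUsage(1200, 550, 0.07),
    MessageUsage(600, 400, 0.05),
]
print(session_header(usage))  # Session: 3 messages · 4,250 tokens · $0.18
```

Recomputing the string each time a message completes is enough to satisfy the "updates live" requirement in a sketch like this; a real client would likely keep running totals instead.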

R3 — Breakdown drawer

┌────────────────────────────────────────────┐
│ Session cost                               │
├────────────────────────────────────────────┤
│ Total: $0.18 (4,250 tokens)               │
│                                            │
│ By message:                                │
│ #1  Claude Opus 4.7  250t  $0.005          │
│ #2  Claude Opus 4.7  430t  $0.009          │
│ ...                                        │
│                                            │
│ By model:                                  │
│ Claude Opus 4.7: 3,200t · $0.130          │
│ Gemini Pro: 1,050t · $0.050               │
│                                            │
│ By tool call:                              │
│ search:    200t · $0.004                   │
│ fetch_url: 380t · $0.008                   │
└────────────────────────────────────────────┘
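
The "By model" and "By tool call" sections are group-by aggregations over the same usage records. A minimal sketch, assuming each record is a dict with `tokens`, `cost_usd`, and an optional grouping field (the record shape and `breakdown_by` are hypothetical):

```python
from collections import defaultdict


def breakdown_by(records, key):
    """Sum tokens and cost per distinct value of `key`.

    Records lacking `key` are skipped, e.g. messages with no tool call.
    """
    totals = defaultdict(lambda: {"tokens": 0, "cost_usd": 0.0})
    for record in records:
        if key not in record:
            continue
        bucket = totals[record[key]]
        bucket["tokens"] += record["tokens"]
        bucket["cost_usd"] += record["cost_usd"]
    return dict(totals)


records = [
    {"model": "Claude Opus 4.7", "tokens": 250, "cost_usd": 0.005},
    {"model": "Claude Opus 4.7", "tokens": 430, "cost_usd": 0.009, "tool": "search"},
    {"model": "Gemini Pro", "tokens": 1050, "cost_usd": 0.050},
]
by_model = breakdown_by(records, "model")
by_tool = breakdown_by(records, "tool")
```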

R4 — Threshold alerts

User-configurable per workspace:

| Limit | Behavior |
|---|---|
| **Soft alert** (e.g., $1) | Toast warning; session continues |
| **Hard limit** (e.g., $5) | Block new messages; require explicit "Continue anyway" |

Both limits are disabled by default (opt-in via Settings).
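
One way to read the soft/hard behavior is as a small state function evaluated after each message. The sketch below is illustrative: `AlertState` and `evaluate_thresholds` are hypothetical names, `None` stands for a disabled limit, and treating "crossing" as greater-than-or-equal is an assumption the spec does not settle.

```python
from enum import Enum


class AlertState(Enum):
    OK = "ok"      # nothing to show
    SOFT = "soft"  # toast warning; session continues
    HARD = "hard"  # block new messages until "Continue anyway"


def evaluate_thresholds(session_cost, soft_alert=None, hard_limit=None,
                        continue_anyway=False):
    """Map the running session cost to an alert state.

    Limits default to None (disabled), matching the opt-in default.
    """
    if hard_limit is not None and session_cost >= hard_limit and not continue_anyway:
        return AlertState.HARD
    if soft_alert is not None and session_cost >= soft_alert:
        return AlertState.SOFT
    return AlertState.OK


state = evaluate_thresholds(1.20, soft_alert=1.00, hard_limit=5.00)
print(state)  # AlertState.SOFT
```

After the user confirms "Continue anyway", this sketch falls back to the soft state rather than clearing the warning entirely; that is a design choice, not something the spec mandates.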

R5 — Currency / unit configuration

Per (koder_user_id, workspace_id):

[cost_display]
currency = "BRL"          # BRL | USD | EUR | credits
show_tokens = true
show_cost = true
soft_alert_currency = 5.00
hard_limit_currency = 25.00

Conversion rate: gateway maintains exchange rates (cached daily); credits = Koder internal unit (1 credit = $0.001 baseline, configurable).
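
The conversion rule can be illustrated with a short helper. This is a sketch under stated assumptions: `convert_cost` and the `rates` mapping (currency code to units per 1 USD) are hypothetical, the BRL rate of 5.00 is a made-up placeholder, and `Decimal` is used only to avoid float rounding in money arithmetic.

```python
from decimal import Decimal

# Baseline from R5: 1 credit = $0.001 (configurable).
USD_PER_CREDIT = Decimal("0.001")


def convert_cost(cost_usd, currency, rates, usd_per_credit=USD_PER_CREDIT):
    """Convert a USD cost into the configured display unit.

    `rates` maps a currency code to units per 1 USD (assumed shape).
    An unknown currency raises KeyError.
    """
    cost = Decimal(str(cost_usd))
    if currency == "credits":
        return cost / usd_per_credit
    if currency == "USD":
        return cost
    return cost * rates[currency]


rates = {"BRL": Decimal("5.00")}  # placeholder rate, not a real quote
print(convert_cost(0.005, "credits", rates))  # 5
print(convert_cost(0.18, "BRL", rates))       # 0.9000
```

With the baseline rate, a $0.005 message maps to 5 credits, which agrees with the `cost_usd` / `cost_credits` pair in the R6 example payload.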

R6 — Pricing source

Per-request response from gateway includes:

{
  "usage": {
    "tokens_in": 250,
    "tokens_out": 180,
    "cost_usd": 0.005,
    "cost_credits": 5
  },
  "model": "claude-opus-4-7"
}

Fallback: if the gateway omits the cost metadata, the client looks up pricing in the registries/ai-model-recommendations.md registry.
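
A minimal sketch of the gateway-first, registry-fallback resolution follows. The registry is modeled here as an already-parsed dict of per-million-token prices; that shape, the field names `in_per_mtok` / `out_per_mtok`, and the prices themselves are placeholders for illustration, since the spec does not define the registry schema.

```python
def resolve_cost_usd(response, registry_prices):
    """Return the USD cost for one gateway response, or None if unknown.

    Prefers `usage.cost_usd` from the gateway; otherwise estimates from
    token counts using the (assumed) registry price table.
    """
    usage = response.get("usage") or {}
    if usage.get("cost_usd") is not None:
        return usage["cost_usd"]
    prices = registry_prices.get(response.get("model"))
    if prices is None:
        return None  # no gateway cost and no registry entry
    return (
        usage.get("tokens_in", 0) * prices["in_per_mtok"]
        + usage.get("tokens_out", 0) * prices["out_per_mtok"]
    ) / 1_000_000


# Placeholder prices in USD per million tokens, not real pricing.
registry = {"claude-opus-4-7": {"in_per_mtok": 10.0, "out_per_mtok": 20.0}}
response = {"usage": {"tokens_in": 250, "tokens_out": 180}, "model": "claude-opus-4-7"}
estimated = resolve_cost_usd(response, registry)
```

Returning `None` for an unpriced model lets the badge fall back to a tokens-only display instead of showing a misleading zero cost.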

R7 — Multi-tenant

Cost is attributed per (koder_user_id, workspace_id). Workspace admins can view the aggregate; per-user privacy is preserved in shared workspaces (an admin sees totals, not the per-user breakdown, unless granted permission).
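
The attribution and privacy rule can be sketched as an in-memory ledger keyed by the (user, workspace) pair. `CostLedger`, its methods, and the `can_view_per_user` flag are hypothetical names; a real implementation would live in the billing backend and enforce the permission server-side.

```python
from collections import defaultdict


class CostLedger:
    """Toy ledger: costs keyed by (koder_user_id, workspace_id)."""

    def __init__(self):
        self._totals = defaultdict(float)

    def record(self, user_id, workspace_id, cost_usd):
        self._totals[(user_id, workspace_id)] += cost_usd

    def user_total(self, user_id, workspace_id):
        return self._totals.get((user_id, workspace_id), 0.0)

    def workspace_view(self, workspace_id, can_view_per_user=False):
        """Admin view: total always, per-user rows only with permission."""
        per_user = {
            user: cost
            for (user, ws), cost in self._totals.items()
            if ws == workspace_id
        }
        view = {"total": sum(per_user.values())}
        if can_view_per_user:
            view["per_user"] = per_user
        return view


ledger = CostLedger()
ledger.record("u1", "w1", 0.10)
ledger.record("u2", "w1", 0.20)
ledger.record("u1", "w2", 0.50)
admin_view = ledger.workspace_view("w1")  # contains only the "total" key
```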

R8 — Surface bindings

| Surface | API |
|---|---|
| Flutter | `KoderCostBadge` + `KoderCostBreakdownDrawer` in `koder_kit/lib/src/ai/cost_badge.dart` |
| Web | `<koder-cost-badge>` + `<koder-cost-breakdown>` |
| Compose/SwiftUI | Future |
| CLI / TUI | Inline `[$0.005]` per message; `koder cost session` shows breakdown |

R9 — Accessibility

  • Badge: role="status" aria-label="Cost: 250 tokens, $0.005".
  • Drawer: role="dialog".
  • Threshold alert: role="alert" (announces on trigger).

R10 — i18n

| Key | en-US | pt-BR |
|---|---|---|
| `ai.cost.tokens_in` | "{n} in" | "{n} entrada" |
| `ai.cost.tokens_out` | "{n} out" | "{n} saída" |
| `ai.cost.session_total` | "Session: {messages} messages · {tokens} tokens · {cost}" | "Sessão: {messages} mensagens · {tokens} tokens · {cost}" |
| `ai.cost.alert.soft` | "Approaching cost limit ({current} of {limit})" | "Aproximando do limite ({current} de {limit})" |
| `ai.cost.alert.hard` | "Cost limit reached" | "Limite de custo atingido" |
| `ai.cost.alert.continue_anyway` | "Continue anyway" | "Continuar mesmo assim" |
| `ai.cost.unit.credits` | "credits" | "créditos" |
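
To show how the placeholders in these keys are meant to be filled, here is a minimal lookup sketch. Only a subset of the catalog is reproduced; the function `t` and the fallback to en-US for unknown locales are assumptions, not part of the spec.

```python
# Subset of the R10 catalog, copied from the table above.
MESSAGES = {
    "en-US": {
        "ai.cost.tokens_in": "{n} in",
        "ai.cost.session_total": "Session: {messages} messages · {tokens} tokens · {cost}",
        "ai.cost.alert.soft": "Approaching cost limit ({current} of {limit})",
        "ai.cost.unit.credits": "credits",
    },
    "pt-BR": {
        "ai.cost.tokens_in": "{n} entrada",
        "ai.cost.session_total": "Sessão: {messages} mensagens · {tokens} tokens · {cost}",
        "ai.cost.alert.soft": "Aproximando do limite ({current} de {limit})",
        "ai.cost.unit.credits": "créditos",
    },
}


def t(key, locale, **params):
    """Resolve a message key and fill its {placeholders}.

    Falls back to en-US when the locale or key is missing (assumed policy).
    """
    catalog = MESSAGES.get(locale, MESSAGES["en-US"])
    template = catalog.get(key) or MESSAGES["en-US"][key]
    return template.format(**params)


print(t("ai.cost.tokens_in", "pt-BR", n=250))  # 250 entrada
print(t("ai.cost.alert.soft", "en-US", current="$4.00", limit="$5.00"))
```

Passing pre-formatted strings for `{tokens}` and `{cost}` keeps number and currency formatting out of the message templates.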

T-suite

  • **T1** Per-message badge: bubble has usage data → badge renders correctly.
  • **T2** Session total: 3 messages → header shows sum.
  • **T3** Breakdown drawer: by message + by model + by tool call sections populated.
  • **T4** Soft alert: cross threshold → toast appears; session continues.
  • **T5** Hard limit: cross limit → new messages blocked; "Continue anyway" re-enables sending.
  • **T6** Currency switch: change BRL → USD → all displays update.
  • **T7** Credits mode: switch to credits → no currency symbol; "credits" suffix.
  • **T8** Multi-tenant: workspace switch → cost data for new workspace only.
  • **T9** A11y: alerts announced via aria-live.
  • Companion: chat-message-bubble.kmd, model-selector.kmd, mcp-tool-invocation.kmd (tool call cost attribution)
  • Backend: services/ai/billing/, services/ai/gateway/
  • Registry: registries/ai-model-recommendations.md
  • Policies: multi-tenant-by-default.kmd
  • Refs: Langfuse cost tracking, MLflow token usage, Foundry granular metrics

Source: ../home/koder/dev/koder/meta/docs/stack/specs/ai-ui/cost-display.kmd