AI streaming text

mandatory

Token-by-token reveal of AI-generated text with cursor, Stop button, Retry, autoscroll, and incremental markdown rendering. Hosted by chat-message-bubble (#105) and any surface that streams gateway responses (inline-suggest, agent step trace, etc.).

Spec — AI streaming text

Companion: chat-message-bubble.kmd is the primary consumer. Code blocks are deferred per R4 (crosslink [`code-block.kmd`](code-block.kmd) R7). Cursor animation respects [`motion.kmd`](../themes/motion.kmd) R6 reduced-motion.

Principles

  1. **Append-only render**: tokens arrive, are appended to the buffer, and render incrementally. No full reflow.
  2. **Cursor presence signals state**: it blinks during the stream and disappears on done.
  3. **Stop is prominent**: never in a menu, always reachable in one tap.
  4. **Defer code blocks**: do NOT highlight while a fence is open (avoids re-highlighting per token).
  5. **Autoscroll with escape**: follow the tail by default; the user scrolling up cancels it; re-engage on return-to-bottom.

R1 — Token buffer

Input: stream of events from the gateway (SSE or WebSocket via chat-adapter).

event: token       data: "Hello"
event: token       data: " world"
event: token       data: "!"
event: done        data: {}

Buffer: append-only StringBuffer. Each token → append + trigger an incremental render.

Performance gate: render coalescing at ≥16ms intervals (60fps); multiple tokens arriving within the same 16ms window are grouped into a single render frame.
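The coalescing rule can be sketched as below. This is a minimal illustration, not the kit's implementation: `CoalescingBuffer`, its method names, and the explicit `now_ms` argument are hypothetical, and a real surface would schedule the pending flush on its frame callback rather than wait for the next token or the done event.

```python
from dataclasses import dataclass, field

FRAME_MS = 16  # coalescing window from R1 (about one frame at 60fps)


@dataclass
class CoalescingBuffer:
    """Append-only token buffer that paints at most once per FRAME_MS window."""
    parts: list = field(default_factory=list)    # append-only token storage
    renders: list = field(default_factory=list)  # painted snapshots (stand-in for a real render call)
    last_render_ms: float = float("-inf")
    dirty: bool = False

    def on_token(self, token: str, now_ms: float) -> None:
        self.parts.append(token)
        self.dirty = True
        # Paint only if a full frame has elapsed since the last paint.
        if now_ms - self.last_render_ms >= FRAME_MS:
            self._render(now_ms)

    def on_done(self, now_ms: float) -> None:
        # Flush pending tokens so the final text is always painted.
        if self.dirty:
            self._render(now_ms)

    def _render(self, now_ms: float) -> None:
        self.renders.append("".join(self.parts))
        self.last_render_ms = now_ms
        self.dirty = False
```

Feeding the three example tokens at 0ms, 5ms, and 10ms, followed by done at 20ms, produces two paints instead of three: one for the first token and one for the coalesced remainder.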

R2 — Cursor

The cursor is visible while the stream is active:

  • Glyph: a default glyph or a block glyph, configurable per preset.
  • Animation: opacity 1.0 → 0.3 → 1.0 with a 1000ms period.
  • Position: immediately after the last rendered character.
  • Anchor: inline span; respects line wrap.

On done event: the cursor disappears (100ms fade-out via motion.kmd R9 `motion-effect-fast`).

On error event: the cursor disappears and an error icon appears.

Reduced-motion: the cursor is static (does not blink) but stays visible to signal "streaming".
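As an illustration of the blink cycle, the sketch below maps elapsed time to cursor opacity. The spec fixes only the endpoints (1.0, 0.3) and the 1000ms period; the linear (triangle-wave) interpolation and the `cursor_opacity` name are assumptions made for this example.

```python
PERIOD_MS = 1000   # full blink cycle from R2
MAX_OPACITY = 1.0
MIN_OPACITY = 0.3


def cursor_opacity(elapsed_ms: float, reduced_motion: bool = False) -> float:
    """Opacity of the streaming cursor at a given time since the stream started."""
    if reduced_motion:
        # Static cursor: still visible, never blinks.
        return MAX_OPACITY
    phase = (elapsed_ms % PERIOD_MS) / PERIOD_MS  # 0.0 up to (not including) 1.0
    # Triangle wave: 0 at the start of the cycle, 1 at the midpoint, back to 0.
    dip = 1.0 - abs(2.0 * phase - 1.0)
    return MAX_OPACITY - (MAX_OPACITY - MIN_OPACITY) * dip
```

The cursor is fully opaque at the start of each cycle and dimmest at the 500ms midpoint.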

R3 — Stop button

Stop button rendered PROMINENTLY:

  • Position: floating action button, right-aligned in the composer area (mobile) OR inline, trailing the streaming bubble (desktop).
  • Visible: ALWAYS during the stream (never hidden in a menu).
  • Action: emit SIGINT to the gateway via a WebSocket cancel message; wait 200ms; force-close if no ack arrives.
  • After stop: the cursor is gone; a "Retry" button replaces Stop (crosslink [`chat-message-bubble.kmd`](chat-message-bubble.kmd) R4 error/stopped state).

Antipattern (forbidden): Stop inside an overflow menu. Long streams are frustrating to sit through; Stop MUST be 1-tap.
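The cancel-then-force-close sequence of the Stop action might look like the following sketch. `FakeTransport`, its methods, and the `{"type": "cancel"}` message shape are hypothetical stand-ins; the real chat-adapter wire format is not defined in this spec.

```python
import asyncio

ACK_TIMEOUT_S = 0.2  # 200ms grace period from R3


class FakeTransport:
    """Hypothetical stand-in for the gateway connection."""

    def __init__(self, ack_delay_s):
        self.ack_delay_s = ack_delay_s  # None simulates a gateway that never acks
        self.sent = []
        self.force_closed = False

    async def send(self, message):
        self.sent.append(message)

    async def wait_for_ack(self):
        # Sleep stands in for waiting on the gateway's acknowledgement.
        await asyncio.sleep(3600 if self.ack_delay_s is None else self.ack_delay_s)

    def force_close(self):
        self.force_closed = True


async def stop_stream(transport) -> str:
    """Send a cancel, wait up to ACK_TIMEOUT_S for the ack, then force-close."""
    await transport.send({"type": "cancel"})  # assumed message shape
    try:
        await asyncio.wait_for(transport.wait_for_ack(), timeout=ACK_TIMEOUT_S)
        return "acked"
    except asyncio.TimeoutError:
        transport.force_close()
        return "forced"
```

Either outcome leaves the stream halted, so the surface can swap Stop for Retry without knowing which path was taken.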

R4 — Defer code block rendering

The markdown source may include fenced code:

Here is the code:

def hello(): pass

Without deferral: every token re-renders the markdown → code block highlighting re-runs per token → CPU/memory drain + flicker.

**Contract:** code blocks (opening fence detected) MUST stay as dimmed placeholder text until the closing fence is detected. ONLY after the closing fence: trigger syntax highlighting and render via [`code-block.kmd`](code-block.kmd).

Placeholder style:

┌────────────────────────────────────┐
│ ```python                          │
│ ▋  (writing code...)               │
│                                    │
└────────────────────────────────────┘

When closed: snap-render to the final code-block widget.
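One way to implement the deferral contract is to segment the buffer on each render pass and emit a placeholder for any fence that is still open. The sketch below is a simplified, line-based illustration: the `segment` function and its tuple shapes are hypothetical, and it ignores indented or tilde fences that a full markdown parser would handle.

```python
FENCE = "`" * 3  # triple backtick, built at runtime to keep this example fence-safe


def segment(buffer: str):
    """Split streamed markdown into text, finished code, and open-fence placeholders."""
    segments = []
    text_lines = []
    code_lines = None  # None while outside a fence
    lang = ""

    def flush_text():
        # Emit accumulated prose, skipping runs that contain only blank lines.
        if any(text_lines):
            segments.append(("text", "\n".join(text_lines)))
        text_lines.clear()

    for line in buffer.split("\n"):
        if code_lines is None:
            if line.startswith(FENCE):
                flush_text()
                lang = line[len(FENCE):].strip()
                code_lines = []
            else:
                text_lines.append(line)
        elif line.strip() == FENCE:
            # Closing fence: the block is complete and may be highlighted once.
            segments.append(("code", lang, "\n".join(code_lines)))
            code_lines = None
        else:
            code_lines.append(line)

    if code_lines is not None:
        # Fence still open: show the dimmed placeholder instead of highlighting.
        segments.append(("placeholder", lang))
    else:
        flush_text()
    return segments
```

A renderer built on this would run the syntax highlighter only for `("code", ...)` segments, so each block is highlighted once no matter how many tokens it took to arrive.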

R5 — Autoscroll with escape

Scroll behavior:

  • Default: follow tail (autoscroll on append).
  • User scroll up (>50px from bottom): cancel autoscroll; show "Jump to bottom" floating chip.
  • User scrolls back to bottom (<10px): re-engage autoscroll; chip hides.

Scroll velocity: smooth, not a jump; via the motion.kmd motion-spatial-fast spring.

Multi-bubble streaming (rare; cascade responses): autoscroll follows the LATEST bubble; older bubbles do not change their absolute position.
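The two scroll thresholds in this section form a small hysteresis state machine, sketched below. The class and method names are hypothetical; only the 50px and 10px values come from the spec.

```python
ESCAPE_PX = 50    # scrolling further than this from the bottom cancels follow
REENGAGE_PX = 10  # coming back within this distance re-engages follow


class AutoscrollState:
    """Tracks whether the view follows the tail of the stream."""

    def __init__(self):
        self.following = True

    @property
    def chip_visible(self) -> bool:
        # The "Jump to bottom" chip shows exactly when follow is cancelled.
        return not self.following

    def on_user_scroll(self, distance_from_bottom_px: float) -> None:
        if self.following and distance_from_bottom_px > ESCAPE_PX:
            self.following = False
        elif not self.following and distance_from_bottom_px < REENGAGE_PX:
            self.following = True

    def on_append(self, scroll_top: float, content_height: float,
                  viewport_height: float) -> float:
        """Return the scroll offset to use after new content was appended."""
        if self.following:
            return max(0.0, content_height - viewport_height)
        return scroll_top  # user escaped: leave their position untouched
```

The gap between the two thresholds is deliberate: a position between 10px and 50px from the bottom keeps whatever mode was already active, which avoids toggling the chip on small scroll jitter.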

R6 — Retry after stop

After user clicked Stop OR gateway emitted error:

  • "Retry" button replaces Stop position.
  • Click → re-invoke the gateway with the same context (last user message + history); a new stream starts.
  • The buffer is cleared before the restart (never append to a partial buffer).
  • Max retries client-side: 3 within 30s (avoids a runaway loop); past the threshold, show "Try again later".
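The retry cap can be sketched as a sliding-window limiter. Treating "3 within 30s" as a sliding window (rather than a fixed window that resets) is an assumption of this example, as are the `RetryLimiter` and `try_retry` names.

```python
from collections import deque

MAX_RETRIES = 3   # from R6
WINDOW_S = 30.0   # from R6


class RetryLimiter:
    """Allows at most MAX_RETRIES retries within any WINDOW_S-second span."""

    def __init__(self):
        self._stamps = deque()  # timestamps of retries still inside the window

    def try_retry(self, now_s: float) -> bool:
        # Drop retries that have aged out of the window.
        while self._stamps and now_s - self._stamps[0] >= WINDOW_S:
            self._stamps.popleft()
        if len(self._stamps) >= MAX_RETRIES:
            return False  # caller shows the "Try again later" message
        self._stamps.append(now_s)
        return True
```

When `try_retry` returns `False`, the surface would display the `ai.streaming.error_max_retries` string instead of starting a new stream.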

R7 — Markdown incremental rendering

Tokens can split markdown constructs in the middle:

"In **markdown** *italic*"

Tokens received:

  1. "In "
  2. "**mark" ← still incomplete
  3. "down**" ← now "**markdown**" can render
  4. " *italic*"

Strategy:

  • Render markdown UP TO the last complete construct. Partial constructs render as literal text (degrade gracefully).
  • Re-evaluate AT NEWLINE OR at every paragraph boundary (whichever is more lenient for performance).
  • Specific constructs deferred per R4: fenced code, large tables, deeply nested blockquotes.

Library: per-surface (flutter_markdown incremental fork; marked.js-compatible on Web; etc.). Surfaces MUST pass the T-suite even when they use different libraries.
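The "render up to the last complete construct" strategy can be illustrated for a single inline marker. This sketch handles only one marker type at a time (bold by default) and the `split_renderable` name is hypothetical; a real surface would rely on its markdown library's parser rather than marker counting.

```python
def split_renderable(buffer: str, marker: str = "**"):
    """Return (complete_part, literal_tail) for one inline marker type.

    complete_part can be passed to the markdown renderer; literal_tail holds
    an unclosed construct and should be shown as plain text for now.
    """
    if buffer.count(marker) % 2 == 0:
        # Every opening marker has a matching close: render everything.
        return buffer, ""
    # Odd count: the last marker opened a construct that has not closed yet.
    cut = buffer.rfind(marker)
    return buffer[:cut], buffer[cut:]
```

Replaying the example stream, the buffer `In **mark` splits into a renderable `In ` and a literal `**mark`; once `down**` arrives the whole buffer becomes renderable and the bold span appears in one step.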

R8 — Surface bindings

| Surface | API |
|---|---|
| Flutter | `KoderStreamingText({required stream, onStop, onRetry})` in `koder_kit/lib/src/ai/streaming_text.dart` |
| Web | `<koder-streaming-text source="event-source://...">` in `koder_web_kit` |
| Compose Android | `KoderStreamingText` in `koder-design-compose` (future) |
| SwiftUI iOS | Same, in `koder-design-swift` (future) |
| CLI / TUI | Incremental print (`io.Writer.Write`); Stop via Ctrl+C; no cursor (the terminal cursor already exists) |

API: Stream<Token> input → Widget/Element output + callbacks onStop() + onRetry() + onDone(fullText).
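To make the callback contract concrete, here is a language-neutral sketch of a consumer loop. It is not the Flutter or Web binding: the `consume` function, the `(name, data)` event tuples, and the `on_render` and `on_error` callbacks are assumptions for this example, and `on_stop` / `on_retry` are omitted because they are driven by user input rather than by the stream.

```python
from typing import Callable, Iterable, Tuple

Event = Tuple[str, str]  # (event name, data), mirroring the R1 event listing


def consume(events: Iterable[Event],
            on_render: Callable[[str], None],
            on_done: Callable[[str], None],
            on_error: Callable[[str], None]) -> None:
    """Drive a surface from a token stream until done or error."""
    parts = []
    for name, data in events:
        if name == "token":
            parts.append(data)
            on_render("".join(parts))   # incremental output
        elif name == "done":
            on_done("".join(parts))     # full text handed to onDone(fullText)
            return
        elif name == "error":
            on_error(data)              # error payload shape is assumed
            return
```

With the R1 example events, `on_render` sees "Hello", "Hello world", and "Hello world!" in turn, and `on_done` receives the full text once.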

R9 — Acessibilidade

  • Container: aria-live="polite" during the stream (announces increments without flooding).
  • Cursor: aria-hidden="true".
  • Stop button: aria-label="Stop generating" (i18n).
  • After done: announce "Done" once.
  • After error: announce error description.
  • Reduced-motion: static cursor; instant autoscroll (not smooth).
  • Touch: Stop target ≥48dp.

R10 — i18n

| Key | en-US | pt-BR |
|---|---|---|
| ai.streaming.stop | "Stop generating" | "Parar geração" |
| ai.streaming.retry | "Retry" | "Tentar novamente" |
| ai.streaming.jump_to_bottom | "Jump to bottom" | "Ir pro fim" |
| ai.streaming.code_placeholder | "(writing code…)" | "(escrevendo código…)" |
| ai.streaming.error_max_retries | "Try again later" | "Tente novamente mais tarde" |

T-suite

  • **T1** Render tokens: stream 100 tokens → all visible incrementally; final text correct.
  • **T2** Cursor visible: assert the cursor element is present during the stream.
  • **T3** Cursor removed on done: emit done event → cursor disappears within 100ms.
  • **T4** Stop button: tap Stop → emit cancel; assert stream halted; Retry button visible.
  • **T5** Retry: tap Retry → new stream starts; buffer cleared.
  • **T6** Defer code block: emit fence-open + 50 tokens + fence-close → code block rendered ONCE (not per token); assert no flicker via animation frame audit.
  • **T7** Autoscroll engage: scroll to bottom → tokens append → scroll stays at bottom.
  • **T8** Autoscroll escape: scroll up 100px → tokens append → scroll position unchanged; "Jump to bottom" chip visible.
  • **T9** Reduced-motion: enable prefers-reduced-motion → cursor static; autoscroll instant.
  • **T10** A11y: screen reader announces partial text periodically (not per token); announces "Done" at end.
  • **T11** Max retries: 3 retries within 30s → the 4th retry shows the max-retries message.
  • **T12** Markdown partial: tokens split **bold** in the middle → no flicker; final render correct.

Source: ../home/koder/dev/koder/meta/docs/stack/specs/ai-ui/streaming-text.kmd