AI streaming text

mandatory

Token-by-token reveal of AI-generated text with cursor, Stop button, Retry, autoscroll, and incremental markdown rendering. Hosted by chat-message-bubble (#105) and any surface that streams gateway responses (inline-suggest, agent step trace, etc.).

Spec — AI streaming text

Companion: chat-message-bubble.kmd is the main consumer. Code blocks are deferred per R4 (cross-link [`code-block.kmd`](code-block.kmd) R7). Cursor animation respects [`motion.kmd`](../themes/motion.kmd) R6 reduced-motion.

Principles

**Append-only render**: tokens arrive, are appended to the buffer, and render incrementally. No full reflow.
**Cursor presence signals state**: blinks during the stream; disappears on done.
**Stop is prominent**: never in a menu, always reachable in 1 tap.
**Defer code blocks**: do NOT highlight while a fence is open (avoids re-highlighting per token).
**Autoscroll with escape**: follows the tail by default; a user scroll up cancels it; it re-engages on return to bottom.

R1 — Token buffer

Input: a stream of events from the gateway (SSE or WebSocket via chat-adapter).

event: token       data: "Hello"
event: token       data: " world"
event: token       data: "!"
event: done        data: {}

Buffer: append-only StringBuffer. Each token → append + trigger an incremental render.

Performance gate: render coalescing over a window of at least 16ms (one frame at 60fps); multiple tokens arriving within 16ms are grouped into a single render frame.
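The coalescing rule can be illustrated with a minimal Python sketch. It assumes the caller supplies timestamps and a `render` callback; the `CoalescingBuffer` name and its methods are hypothetical and not part of any surface API.

```python
from dataclasses import dataclass, field
from typing import Callable, List

FRAME_MS = 16  # coalescing window from R1 (one frame at 60fps)


@dataclass
class CoalescingBuffer:
    """Append-only token buffer that renders at most once per frame window."""

    render: Callable[[str], None]  # hypothetical render callback
    _parts: List[str] = field(default_factory=list)
    _last_render_ms: float = float("-inf")
    _dirty: bool = False

    def append(self, token: str, now_ms: float) -> None:
        """Append a token; render only if the frame window has elapsed."""
        self._parts.append(token)
        self._dirty = True
        if now_ms - self._last_render_ms >= FRAME_MS:
            self.flush(now_ms)

    def flush(self, now_ms: float) -> None:
        """Render pending text, e.g. from a frame callback or on `done`."""
        if self._dirty:
            self.render("".join(self._parts))
            self._last_render_ms = now_ms
            self._dirty = False

    @property
    def text(self) -> str:
        return "".join(self._parts)
```

In this sketch the first token renders immediately, tokens arriving inside the window only mark the buffer dirty, and a later `flush` (from the next frame tick or the `done` event) paints them together.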

R2 — Cursor

Cursor is visible while the stream is active:

Glyph: ▋ (default) or ▮ (block), configurable per preset.
Animation: opacity 1.0 → 0.3 → 1.0 with a period of 1000ms.
Position: immediately after the last rendered character.
Anchor: inline span; respects line wrap.

On done event: cursor disappears (100ms fade-out via motion.kmd R9 `motion-effect-fast`).

On error event: cursor disappears and an error icon appears.

Reduced-motion: static cursor (no blinking); still visible to signal "streaming".
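The blink cycle and the reduced-motion fallback could be expressed as a pure function of elapsed time. This sketch assumes linear interpolation between the two opacities, which the spec does not state, and does not model the 100ms fade-out; `cursor_opacity` is a hypothetical helper.

```python
PERIOD_MS = 1000   # blink period from R2
MIN_OPACITY = 0.3
MAX_OPACITY = 1.0


def cursor_opacity(elapsed_ms: float, streaming: bool, reduced_motion: bool = False) -> float:
    """Return the cursor opacity for a given time since the stream started."""
    if not streaming:
        return 0.0            # hidden after done/error (fade-out not modeled)
    if reduced_motion:
        return MAX_OPACITY    # static cursor, still signals "streaming"
    phase = (elapsed_ms % PERIOD_MS) / PERIOD_MS        # position in cycle, 0.0 to 1.0
    distance_from_peak = 1.0 - abs(2.0 * phase - 1.0)   # 0 at cycle ends, 1 at midpoint
    # Assumption: linear ramp 1.0 -> 0.3 -> 1.0 across one period.
    return MAX_OPACITY - (MAX_OPACITY - MIN_OPACITY) * distance_from_peak
```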

R3 — Stop button

Stop button rendered PROMINENTLY:

Position: floating action button, right-aligned in the composer area (mobile) OR inline, trailing the streaming bubble (desktop).
Visible: ALWAYS during the stream (never hidden in a menu).
Action: emit SIGINT to the gateway via a WebSocket cancel message; wait 200ms; force-close if no ack arrives.
After stop: the cursor is gone; a "Retry" button replaces Stop (cross-link [`chat-message-bubble.kmd`](chat-message-bubble.kmd) R4 error/stopped state).

Anti-pattern (forbidden): Stop in an overflow menu (⋯ button). Long streams are frustrating to sit through; Stop MUST be 1-tap.
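The Stop action's cancel, wait, then force-close sequence might look like the following asyncio sketch. The three callbacks (`send_cancel`, `wait_for_ack`, `force_close`) are hypothetical stand-ins for whatever the transport layer provides; the real cancel message format is not specified here.

```python
import asyncio
from typing import Awaitable, Callable

ACK_TIMEOUT_S = 0.2  # 200ms grace period from R3


async def stop_stream(
    send_cancel: Callable[[], Awaitable[None]],   # hypothetical: sends the cancel message
    wait_for_ack: Callable[[], Awaitable[None]],  # hypothetical: resolves when the gateway acks
    force_close: Callable[[], Awaitable[None]],   # hypothetical: closes the connection
) -> str:
    """Request cancellation; force-close if no ack arrives in time."""
    await send_cancel()
    try:
        await asyncio.wait_for(wait_for_ack(), timeout=ACK_TIMEOUT_S)
        return "acked"
    except asyncio.TimeoutError:
        await force_close()
        return "forced"
```

Either outcome leaves the stream halted, so the UI can swap Stop for Retry without waiting on which path was taken.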

R4 — Defer code block rendering

Markdown source may include fenced code:

Here is the code:

def hello(): pass

Without defer: every token re-renders the markdown → code block highlight re-runs per token → CPU/memory drain + flicker.

**Contract**: code blocks (opening fence detected) MUST stay as dimmed placeholder text until the closing fence is detected. ONLY after the closing fence: trigger syntax highlighting and render via [`code-block.kmd`](code-block.kmd).

Placeholder style:

┌────────────────────────────────────┐
│ ```python                          │
│ ▋  (writing code...)               │
│                                    │
└────────────────────────────────────┘

When closed: snap-render to the final code-block widget.
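One way to enforce the contract is to classify the buffer into segments on each render pass, so that only closed fences ever reach the highlighter. The sketch below is line-based and simplified (no nested or indented fences, no language tag handling); the segment kind names are hypothetical.

```python
from typing import List, Tuple

FENCE = "`" * 3  # three backticks

Segment = Tuple[str, str]  # (kind, content); kind names are hypothetical


def segment_stream(buffer: str) -> List[Segment]:
    """Split the buffer into prose, finished code, and an open-fence placeholder."""
    segments: List[Segment] = []
    prose: List[str] = []
    code: List[str] = []
    in_fence = False
    for line in buffer.split("\n"):
        if line.strip().startswith(FENCE):
            if in_fence:
                # Closing fence: the block is complete, safe to highlight once.
                segments.append(("code", "\n".join(code)))
                code = []
            elif prose:
                segments.append(("prose", "\n".join(prose)))
                prose = []
            in_fence = not in_fence
        elif in_fence:
            code.append(line)
        else:
            prose.append(line)
    if in_fence:
        # Fence still open: show the dimmed placeholder, never highlight yet.
        segments.append(("placeholder", "\n".join(code)))
    elif prose:
        segments.append(("prose", "\n".join(prose)))
    return segments
```

A renderer built on this would map `placeholder` to the dimmed box and `code` to the final widget, which keeps highlighting to one pass per block regardless of token count.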

R5 — Autoscroll com escape

Scroll behavior:

Default: follow tail (autoscroll on append).
User scroll up (>50px from bottom): cancel autoscroll; show "Jump to bottom" floating chip.
User scrolls back to bottom (<10px): re-engage autoscroll; chip hides.

Scroll velocity: smooth, no jumps; via the motion.kmd motion-spatial-fast spring.

Multi-bubble streaming (rare; cascade responses): autoscroll follows the LATEST bubble; older bubbles keep their absolute position.
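The two thresholds in the scroll behavior list form a hysteresis band: autoscroll is cancelled beyond 50px and re-engaged only under 10px, so positions in between keep the current mode. A minimal sketch, with `AutoscrollState` and its method names as hypothetical illustrations:

```python
from dataclasses import dataclass
from typing import Callable

ESCAPE_PX = 50     # scrolling further than this from the bottom cancels autoscroll
REENGAGE_PX = 10   # returning within this distance re-engages it


@dataclass
class AutoscrollState:
    following: bool = True  # default: follow tail

    @property
    def show_jump_chip(self) -> bool:
        """The "Jump to bottom" chip is visible exactly while not following."""
        return not self.following

    def on_user_scroll(self, distance_from_bottom_px: float) -> None:
        """Update follow state from a user-initiated scroll event."""
        if self.following and distance_from_bottom_px > ESCAPE_PX:
            self.following = False
        elif not self.following and distance_from_bottom_px < REENGAGE_PX:
            self.following = True

    def on_append(self, scroll_to_bottom: Callable[[], None]) -> None:
        """Called after new tokens render; only scrolls while following."""
        if self.following:
            scroll_to_bottom()
```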

R6 — Retry after stop

After user clicked Stop OR gateway emitted error:

"Retry" button replaces Stop position.
Click → re-invoke the gateway with the same context (last user message + history); a new stream starts.
Buffer is cleared before the restart (no appending to a partial buffer).
Max client-side retries: 3 within 30s (avoids a runaway loop); past the threshold, show "Try again later".
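The retry cap can be sketched as a small limiter. This version assumes a sliding 30s window (the spec says only "3 within 30s", so a fixed window would also satisfy it); `RetryLimiter` is a hypothetical name.

```python
from collections import deque
from typing import Deque

MAX_RETRIES = 3
WINDOW_S = 30.0


class RetryLimiter:
    """Sliding-window limiter: at most MAX_RETRIES retries per WINDOW_S seconds."""

    def __init__(self) -> None:
        self._stamps: Deque[float] = deque()

    def try_retry(self, now_s: float) -> bool:
        """Return True if a retry may start now; False means show "Try again later"."""
        # Drop retries that have aged out of the window.
        while self._stamps and now_s - self._stamps[0] >= WINDOW_S:
            self._stamps.popleft()
        if len(self._stamps) >= MAX_RETRIES:
            return False
        self._stamps.append(now_s)
        return True
```

Rejected attempts are not recorded, so a user who keeps tapping after the cap does not extend their own lockout.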

R7 — Markdown incremental rendering

Tokens can split markdown constructs in the middle:

"In **markdown** *italic*"

Tokens recebidos:

"In "
"**mark" ← still incomplete
"down**" ← now "**markdown**" can render
" *italic*"

Strategy:

Render markdown UP TO the last complete construct. Partial constructs: render as literal text (degrade gracefully).
Re-evaluate AT NEWLINE OR at every paragraph boundary (whichever is more lenient for performance).
Constructs deferred per R4: fenced code, large tables, deep blockquotes.

Library: per-surface (flutter_markdown incremental fork; marked.js-compatible on Web; etc.). Surfaces MUST pass the T-suite even when using different libraries.
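The "render up to the last complete construct" strategy can be shown for a single construct. The sketch below handles only `**bold**` markers and is not how any of the listed libraries work internally; `split_renderable` is a hypothetical helper.

```python
from typing import Tuple

BOLD = "**"


def split_renderable(buffer: str) -> Tuple[str, str]:
    """Return (complete, partial): a markdown-safe prefix and a literal tail."""
    if buffer.count(BOLD) % 2 == 0:
        return buffer, ""        # every ** opener has a matching closer
    cut = buffer.rfind(BOLD)     # the last ** is an unclosed opener
    return buffer[:cut], buffer[cut:]
```

The first part goes through the markdown renderer and the second is shown as literal text, so `"In **mark"` displays the raw asterisks until `"down**"` arrives and the whole span becomes renderable.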

R8 — Surface bindings

Surface	API
Flutter	`KoderStreamingText({required stream, onStop, onRetry})` in `koder_kit/lib/src/ai/streaming_text.dart`
Web	`<koder-streaming-text source="event-source://...">` in `koder_web_kit`
Compose Android	`KoderStreamingText` in `koder-design-compose` (future)
SwiftUI iOS	same, in `koder-design-swift` (future)
CLI / TUI	Incremental print (`io.Writer.Write`); Stop via Ctrl+C; no cursor (the terminal cursor already exists)

API: Stream<Token> input → Widget/Element output + callbacks onStop() + onRetry() + onDone(fullText).
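As a rough illustration of the callback contract on the simplest surface (CLI / TUI), the sketch below writes tokens as they arrive and treats Ctrl+C as Stop. It is written in Python for brevity rather than against `io.Writer`; `stream_to_writer` and its parameters are hypothetical, and `onRetry` is left out.

```python
from typing import Callable, Iterable, List, Optional, TextIO


def stream_to_writer(
    tokens: Iterable[str],
    out: TextIO,
    on_done: Optional[Callable[[str], None]] = None,
    on_stop: Optional[Callable[[], None]] = None,
) -> str:
    """Write tokens as they arrive; Ctrl+C (KeyboardInterrupt) acts as Stop."""
    parts: List[str] = []
    try:
        for token in tokens:
            parts.append(token)
            out.write(token)
            out.flush()  # make each token visible immediately
    except KeyboardInterrupt:
        if on_stop:
            on_stop()
        return "".join(parts)  # partial text kept by the caller, on_done not fired
    if on_done:
        on_done("".join(parts))
    return "".join(parts)
```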

R9 — Acessibilidade

Container: aria-live="polite" during the stream (announces increments without flooding).
Cursor: aria-hidden="true".
Stop button: aria-label="Stop generating" (i18n).
After done: announce "Done" once.
After error: announce error description.
Reduced-motion: static cursor; autoscroll is instant (not smooth).
Touch: Stop target ≥48dp.

R10 — i18n

Key	en-US	pt-BR
`ai.streaming.stop`	"Stop generating"	"Parar geração"
`ai.streaming.retry`	"Retry"	"Tentar novamente"
`ai.streaming.jump_to_bottom`	"Jump to bottom"	"Ir pro fim"
`ai.streaming.code_placeholder`	"(writing code…)"	"(escrevendo código…)"
`ai.streaming.error_max_retries`	"Try again later"	"Tente novamente mais tarde"

T-suite

**T1** Render tokens: stream 100 tokens → all visible incrementally; final text correct.
**T2** Cursor visible: assert cursor element is present during the stream.
**T3** Cursor removed on done: emit done event → cursor disappears within 100ms.
**T4** Stop button: tap stop → emit cancel; assert stream halted; Retry button visible.
**T5** Retry: tap retry → new stream starts; buffer cleared.
**T6** Defer code block: emit fence-open + 50 tokens + fence-close → code-block rendered ONCE (not per-token); assert no flicker via animation frame audit.
**T7** Autoscroll engage: scroll bottom → tokens append → scroll stays at bottom.
**T8** Autoscroll escape: scroll up 100px → tokens append → scroll position unchanged; "Jump to bottom" chip visible.
**T9** Reduced-motion: enable prefers-reduced-motion → cursor static; autoscroll instant.
**T10** A11y: screen reader announces partial text periodically (not per token); announces "Done" at end.
**T11** Max retries: 3 retries within 30s → the 4th retry shows the max-retries message.
**T12** Markdown partial: tokens split **bold** in the middle → no flicker; final render correct.

Cross-link

Companion: chat-message-bubble.kmd (host), code-block.kmd (defer R4)
Motion: themes/motion.kmd R6 reduced-motion + R9 springs
Backend: services/ai/chat-adapter/ (SSE wire), services/ai/gateway/ (provider)