AI multimodal input composer

mandatory

Unified composer for text + image + file + voice input. Drag-and-drop + paste hook + attach button + mic button. Upload preview chips with MIME validation gated by model capabilities. Voice mode transition to voice-mode.kmd (#121). Required for any AI chat surface.

Spec — AI multimodal input composer

Voice integration via voice-mode.kmd (#121) and voice/wake-word.kmd. Capability gating via model-selector.kmd (#113). Voice-side implementation ticket: services/ai/ai#115 (already open).

Principles

  1. *One composer, four modes*: text, image, file, voice. No separate UI per mode.
  2. *Drag-and-drop + paste + attach*: the entire composer accepts drops; pasting from the clipboard auto-attaches.
  3. *MIME validation gated by model*: accept only what the active model can consume.
  4. *Voice as transition*: the mic button enters a dedicated voice mode (#121), not inline transcription.
  5. *Storage scoped*: uploads are stored under a per-workspace path prefix.

R1 — Anatomia

┌─────────────────────────────────────────────────┐
│ [📷 photo.jpg · 240KB · ✗]                      │  ← upload preview chips
│ [📄 spec.pdf · 1.2MB · ✗]                       │
├─────────────────────────────────────────────────┤
│ [🖼] [📎] [text input area...........] [🎙] [➤]│  ← composer row
└─────────────────────────────────────────────────┘

Slots:

| Slot | Function |
|---|---|
| Preview chips | One per attached file; remove (✗) per chip |
| Image button 🖼 | Open image picker (camera + gallery) |
| Attach button 📎 | Open file picker (per-model MIME whitelist) |
| Text input | Multiline; autogrows up to N lines; max chars per model |
| Mic button 🎙 | Enter voice mode (cross-link #121) |
| Send ➤ | Submit; disabled when the text is empty and there are no attachments |

Drop zone: ENTIRE composer area accepts drops (visual highlight on dragover).
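
The Send rule from the slot table (disabled only when there is neither text nor an attachment) is easy to get wrong as an OR instead of an AND. A minimal sketch follows; the `ComposerState` shape and the `canSend` name are hypothetical, and treating whitespace-only text as empty is an assumption the spec does not state.

```typescript
// Hypothetical composer state; field names are illustrative, not from the spec.
interface ComposerState {
  text: string;
  attachments: readonly { name: string }[];
}

// Send is disabled only when there is no text AND no attachments.
// Assumption: whitespace-only text counts as empty.
function canSend(state: ComposerState): boolean {
  const hasText = state.text.trim().length > 0;
  const hasAttachments = state.attachments.length > 0;
  return hasText || hasAttachments;
}
```

With this rule, an attachment-only message (for example a single image with no caption) can still be submitted.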

R2 — Upload preview chips

Per file attached:

┌────────────────────────────────────┐
│ [thumb] filename · MIME · size  ✗ │
└────────────────────────────────────┘

Thumb:

  • Image: actual thumbnail (96×96 max).
  • PDF: first-page render (future; v1 uses a generic icon).
  • Audio: mini waveform.
  • Other: MIME icon.
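
The thumbnail rules above could be sketched as a MIME-to-thumbnail mapping plus a helper that fits an image inside the 96×96 box. The names `thumbKindFor` and `fitThumb` are hypothetical, and preserving aspect ratio without upscaling is an assumption; the spec only gives the 96×96 maximum.

```typescript
type ThumbKind = "image-thumbnail" | "generic-icon" | "waveform" | "mime-icon";

// Pick the chip thumbnail type from the file's MIME type.
function thumbKindFor(mime: string): ThumbKind {
  const type = mime.trim().toLowerCase();
  if (type.startsWith("image/")) return "image-thumbnail";
  if (type === "application/pdf") return "generic-icon"; // v1; first-page render is future work
  if (type.startsWith("audio/")) return "waveform";
  return "mime-icon";
}

const THUMB_MAX = 96; // from the spec: 96×96 max

// Scale (width, height) down to fit inside max×max.
// Assumptions: aspect ratio is preserved and small images are never upscaled.
function fitThumb(
  width: number,
  height: number,
  max: number = THUMB_MAX,
): { width: number; height: number } {
  const scale = Math.min(1, max / Math.max(width, height));
  return {
    width: Math.max(1, Math.round(width * scale)),
    height: Math.max(1, Math.round(height * scale)),
  };
}
```

Under these assumptions a 1920×1080 image yields a 96×54 thumbnail, and a 40×30 image is left at its original size.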

Validation states:

| State | Visual |
|---|---|
| Valid (model accepts) | Default chip |
| Invalid (model rejects MIME) | Red border + tooltip "Model X doesn't support {MIME}" |
| Uploading | Progress bar overlay |
| Failed | Error icon + Retry button |
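
One way to derive a chip's validation state is sketched below. The table does not say which state wins when several apply, so the precedence used here (invalid, then failed, then uploading) is an assumption, and `ChipInput`, `chipState`, and `invalidTooltip` are hypothetical names.

```typescript
type UploadStatus = "uploading" | "failed" | "done";
type ChipState = "valid" | "invalid" | "uploading" | "failed";

// Hypothetical input: the MIME check result is computed elsewhere and passed in.
interface ChipInput {
  mime: string;
  acceptedByModel: boolean;
  upload: UploadStatus;
}

// Assumed precedence: a MIME the model rejects is shown as invalid
// regardless of upload progress; otherwise the upload status decides.
function chipState(chip: ChipInput): ChipState {
  if (!chip.acceptedByModel) return "invalid";
  if (chip.upload === "failed") return "failed";
  if (chip.upload === "uploading") return "uploading";
  return "valid";
}

// Tooltip text for the invalid state, following the template in the table.
function invalidTooltip(modelName: string, mime: string): string {
  return `Model ${modelName} doesn't support ${mime}`;
}
```

Checking the MIME type first avoids showing a progress bar for a file the active model could never consume.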

R3 — MIME validation per model

Current model capability (from model-selector.kmd #113):

| Model capability | Accepted MIME types |
|---|---|
| Vision | image/png, image/jpeg, image/webp, image/gif |
| Documents | application/pdf, text/markdown, text/plain |
| Audio | audio/mp3, audio/wav, audio/ogg, audio/m4a |
| Video | video/mp4, video/webm |
| Files (generic text) | any text |
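
The capability table can be sketched as a lookup plus two helpers: one that validates an attached file, and one that builds the file picker whitelist as an HTML `accept` string. The capability keys and the function names are hypothetical (the real capability shape comes from model-selector.kmd #113), and reading "any text" as `text/*` is an assumption.

```typescript
type Capability = "vision" | "documents" | "audio" | "video" | "files";
type ListedCapability = Exclude<Capability, "files">;

// MIME types as listed in the table above.
const MIME_BY_CAPABILITY: Record<ListedCapability, readonly string[]> = {
  vision: ["image/png", "image/jpeg", "image/webp", "image/gif"],
  documents: ["application/pdf", "text/markdown", "text/plain"],
  // Listed as in the spec; browsers commonly report .mp3 files as audio/mpeg.
  audio: ["audio/mp3", "audio/wav", "audio/ogg", "audio/m4a"],
  video: ["video/mp4", "video/webm"],
};

// Drop parameters such as "; charset=utf-8" and normalize case.
function normalizeMime(mime: string): string {
  const [base = ""] = mime.split(";");
  return base.trim().toLowerCase();
}

// True when at least one capability of the active model accepts the MIME type.
// Assumption: "files" (generic text) means any text/* type.
function isAccepted(mime: string, capabilities: readonly Capability[]): boolean {
  const type = normalizeMime(mime);
  return capabilities.some((cap) =>
    cap === "files"
      ? type.startsWith("text/")
      : MIME_BY_CAPABILITY[cap].some((allowed) => allowed === type),
  );
}

// Build the value for <input type="file" accept="..."> from the capabilities.
function acceptAttribute(capabilities: readonly Capability[]): string {
  return capabilities
    .map((cap) => (cap === "files" ? "text/*" : MIME_BY_CAPABILITY[cap].join(",")))
    .join(",");
}
```

The same `isAccepted` check can run for drops and pastes, which bypass the file picker and therefore are not filtered by the `accept` attribute.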

Source: ../home/koder/dev/koder/meta/docs/stack/specs/ai-ui/multimodal-input.kmd