AI Video — Video Understanding + Generation Foundation

*Area:* Intelligence
*Path:* services/ai/video
*Kind:* Video understanding (transcript + scenes + keyframes + summary) and generation (proxy)
*Status:* v0.0.1, sector bootstrapping (20260509)

Role in the stack

video is the single integration point for both video understanding and generation. Understanding orchestrates voice (audio track), imaging (frame analysis), and scene detection: pure composition, no new model code. Generation is proxy-only in v1 (Sora, Veo, Runway, Pika via services/ai/gateway); self-hosted text-to-video stays out of scope until open models reach G4 capability.
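The "pure composition" idea for understanding can be sketched as follows. This is a minimal illustration, not the service's actual code: `analyze`, `Scene`, `VideoAnalysis`, and the four injected callables are hypothetical names standing in for calls to voice, imaging, and a scene detector, and the stub implementations exist only to make the example runnable.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Scene:
    start_s: float
    end_s: float
    caption: str


@dataclass
class VideoAnalysis:
    transcript: str
    scenes: List[Scene]
    summary: str


def analyze(
    video_path: str,
    transcribe: Callable[[str], str],                         # stands in for voice
    detect_scenes: Callable[[str], List[Tuple[float, float]]],
    caption_frame: Callable[[str, float], str],               # stands in for imaging
    summarize: Callable[[str, List[str]], str],
) -> VideoAnalysis:
    """Compose existing capabilities; no model code lives here (hypothetical API)."""
    transcript = transcribe(video_path)
    scenes = []
    for start, end in detect_scenes(video_path):
        # Caption one keyframe per scene, taken at the scene midpoint (an assumption).
        keyframe_t = (start + end) / 2
        scenes.append(Scene(start, end, caption_frame(video_path, keyframe_t)))
    summary = summarize(transcript, [s.caption for s in scenes])
    return VideoAnalysis(transcript, scenes, summary)


# Stub dependencies, purely illustrative.
result = analyze(
    "demo.mp4",
    transcribe=lambda path: "hello world",
    detect_scenes=lambda path: [(0.0, 4.0), (4.0, 10.0)],
    caption_frame=lambda path, t: f"frame at {t:.1f}s",
    summarize=lambda text, captions: f"{len(captions)} scenes; transcript: {text}",
)
print(result.summary)
```

Passing the dependencies in as callables keeps the orchestration testable without any sibling service running, which fits a layer that only composes.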

It is the Koder analog of the video stack that teams assemble today by pasting together OpenAI Sora, Whisper, GPT-4V, and custom scene detection. The foundation collapses that into a single normalized API with one auth boundary, one quota system, and one billing event stream.

Boundary vs neighbors

services/ai/voice is the audio sibling and a hard dependency for analyze.
services/ai/imaging is the frame sibling for caption/keyframe.
services/ai/vision is the single-image counterpart.
`products/media