Kodec RFC 001 cine workspace architecture

RFC-001: koder-cine workspace architecture

  • *Status:* Draft (2026-04-11)
  • *Author:* Claude Opus 4.6 + Rodrigo
  • *Target release:* koder-cine v0.1.0 → v1.0.0 (Phase 1 through Phase 5)
  • *Supersedes:* nothing; this is the founding doc of the koder-cine module

1. Summary

This RFC defines the architecture, crate map, error model, feature flag policy, testing strategy, and multi-year roadmap for *koder-cine*, Koder's native Rust replacement for the subset of ffmpeg that Koder products actually consume.

The goal of Phase 1 is concrete and small: *stop shelling out to ffprobe* in platform/talk/internal/bot/video.go::probe() by delivering a pure-Rust MP4 container parser and a `koder-cine probe` CLI that emits the same `{duration, width, height, has_audio}` shape the Go code already reads.

Phases 2 through 5 are roadmap. Phase 1 is the commitment.
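As a rough illustration of the probe shape named above, the sketch below models it as a plain Rust struct and prints it as JSON using only the standard library. The name ProbeResult, the field types, and the units (seconds, pixels) are assumptions made for illustration; the real contract is whatever video.go already parses.

```rust
/// Hypothetical probe result. The field names follow the shape named in the
/// summary; the types and units are assumptions.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ProbeResult {
    pub duration: f64, // assumed: seconds
    pub width: u32,    // assumed: pixels
    pub height: u32,   // assumed: pixels
    pub has_audio: bool,
}

impl ProbeResult {
    /// Serializes to a single-line JSON object without external crates.
    /// A non-finite duration would not be valid JSON; a real implementation
    /// would need to reject it before reaching this point.
    pub fn to_json(&self) -> String {
        format!(
            "{{\"duration\":{},\"width\":{},\"height\":{},\"has_audio\":{}}}",
            self.duration, self.width, self.height, self.has_audio
        )
    }
}

fn main() {
    let probe = ProbeResult {
        duration: 12.5,
        width: 1920,
        height: 1080,
        has_audio: true,
    };
    println!("{}", probe.to_json());
}
```

Keeping the struct Copy and free of heap data means the CLI layer is the only place that allocates, when it formats the output line.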

2. Why Rust

The full "why Rust vs alternatives" conversation is in the session notes that produced this module (search for "ffmpeg clone language choice"). The short version:

  • *SIMD intrinsics* are non-negotiable for codec inner loops. Rust exposes std::arch::{x86_64,aarch64} on stable today.
  • *Memory safety* is non-negotiable for a parser of hostile input. ffmpeg's CVE history is dominated by C memory bugs in decoders. Rust's ownership model shuts that class of bug out of safe code by construction.
  • *Cross-compilation* to every Koder target (x86_64 Linux/macOS/Windows, aarch64 Linux/macOS/iOS, eventually riscv64 Linux) is a first-class Cargo feature. No other candidate matches this.
  • *Ecosystem maturity:* rav1e, dav1d, symphonia, mp4-rust, bytes, thiserror, and libfuzzer-sys are all in production use. We stand on a decade of media-in-Rust experience instead of pioneering.
  • *FFI to C* is cheap when we need to wrap libva/libnvenc for hardware accel, and equally cheap in the other direction: Go services can cgo into libkoder_cine.so without heroics.

Koder Koda is *not* yet viable for this workload. The lang/520-548 backlog tracks the 28 capabilities the language needs before a Koder Koda native implementation is viable. When that day comes, migration is incremental, crate by crate; see §9 below.

3. Why not just use ffmpeg through a cleaner wrapper

We considered three "reuse ffmpeg" strategies and rejected all of them:

  1. *Link libav* (libavformat, libavcodec) into Go via cgo. Rejected because it keeps the C memory-safety surface area intact, and Go's cgo boundary is a known source of crashes when the C code misbehaves. The CVE problem is transferred, not solved.
  2. *Run ffmpeg as a sidecar daemon* exposing a gRPC/HTTP API. Rejected because it trades process fork overhead for network overhead and adds a new service to operate. It would reduce fork cost but not eliminate it, and it would not address the CVE surface.
  3. *Maintain a hardened ffmpeg fork.* Rejected because the ffmpeg codebase is ~1.2 MLOC of C and our team does not have the bandwidth to audit it, much less harden it. Other organizations have tried (Chrome's media fork) and it is a multi-year, full-time effort.

The remaining option is: *write the subset we actually use in Rust*, matching ffmpeg's behavior only where Koder products depend on it, and leaving the rest uncovered. The subset is small: Phase 1 is just MP4 container probing. Phases 2-5 grow the subset under real demand.

4. Crate map

platform/cine/
├── Cargo.toml                    # workspace root
├── README.md
├── koder.toml                    # Koder product metadata
├── docs/rfcs/                    # design docs (this file lives here)
├── backlog/                      # pending/in-progress/done tickets
├── crates/
│   ├── media-core/               # Phase 1 — shared types
│   ├── media-container-mp4/      # Phase 1 — MP4/MOV/M4A/M4V atoms
│   ├── media-cli/                # Phase 1 — `koder-cine` CLI
│   │
│   ├── media-container-webm/     # Phase 2 — Matroska/WebM
│   ├── media-container-ogg/      # Phase 2 — Ogg (Opus, Vorbis)
│   ├── media-bitstream/          # Phase 2 — bit reader primitives
│   │                             #           shared by decoders
│   │
│   ├── media-codec-h264/         # Phase 3 — H.264 decoder (from scratch)
│   ├── media-codec-vp9/          # Phase 3 — VP9 decoder (wrap libvpx or port)
│   ├── media-codec-opus/         # Phase 3 — Opus decoder (port audiopus)
│   ├── media-codec-aac/          # Phase 3 — AAC-LC decoder
│   │
│   ├── media-filter-scale/       # Phase 4 — scaling / colorspace
│   ├── media-filter-select/      # Phase 4 — keyframe selection
│   │                             #           (replaces ffmpeg -vf select)
│   │
│   ├── media-codec-h265/         # Phase 5 — H.265 decoder
│   ├── media-codec-av1/          # Phase 5 — AV1 (wrap dav1d for now)
│   ├── media-hwaccel-vaapi/      # Phase 5 — VA-API hardware accel
│   ├── media-hwaccel-nvenc/      # Phase 5 — NVIDIA hardware encoder
│   └── media-sdk-go/             # Phase 5 — cdylib + Go bindings
└── tests/
    └── fixtures/                 # sample MP4/WebM files for tests

Crate count at the end of Phase 5: ~17. Comparable in organization to symphonia (which has ~12 codec crates) but with container parsers as separate first-class crates (not buried inside a demuxer).

5. Error model

Each crate defines its own Error enum using thiserror::Error, scoped to the operations of that crate. Top-level binaries and consumers convert via anyhow::Error at API boundaries.

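// media-core/src/error.rs
use thiserror::Error;

// Assumption: CodecId is defined at the crate root and is Copy + Debug.
use crate::CodecId;

/// Errors shared by every koder-cine crate.
#[derive(Error, Debug)]
pub enum Error {
    #[error("unsupported codec: {0:?}")]
    UnsupportedCodec(CodecId),
    #[error("invalid timestamp: {0}")]
    InvalidTimestamp(i64),
    // ... shared variants only
}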

// media-container-mp4/src/error.rs
use thiserror::Error;

#[derive(Error, Debug)]
pub enum Mp4Error {
    #[error("not an MP4 file: missing ftyp atom")]
    NotAnMp4,
    #[error("truncated atom at offset {offset}: expected {expected} bytes, got {actual}")]
    TruncatedAtom { offset: u64, expected: u64, actual: u64 },
    #[error("core: {0}")]
    Core(#[from] media_core::Error),
}

Hot paths never allocate to build an error: every variant carries only Copy data or static strings. String formatting happens only when the error is displayed, via the Display impl.
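To make the layering concrete without pulling in thiserror, here is a std-only sketch of roughly what the two enums behave like: Copy payloads, formatting deferred to Display, and a From impl so that `?` lifts a core error into the MP4 error. CoreError, check_timestamp and read_sample_time are hypothetical stand-ins, not part of the planned API.

```rust
use std::fmt;

/// Stand-in for media_core::Error, reduced to one variant.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CoreError {
    InvalidTimestamp(i64),
}

/// Stand-in for Mp4Error. Every payload is Copy, so building one never allocates.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Mp4Error {
    NotAnMp4,
    TruncatedAtom { offset: u64, expected: u64, actual: u64 },
    Core(CoreError),
}

impl fmt::Display for CoreError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CoreError::InvalidTimestamp(ts) => write!(f, "invalid timestamp: {}", ts),
        }
    }
}

impl fmt::Display for Mp4Error {
    // Formatting work happens here, only when someone prints the error.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Mp4Error::NotAnMp4 => write!(f, "not an MP4 file: missing ftyp atom"),
            Mp4Error::TruncatedAtom { offset, expected, actual } => write!(
                f,
                "truncated atom at offset {}: expected {} bytes, got {}",
                offset, expected, actual
            ),
            Mp4Error::Core(inner) => write!(f, "core: {}", inner),
        }
    }
}

impl std::error::Error for CoreError {}
impl std::error::Error for Mp4Error {}

// Hand-written equivalent of thiserror's #[from] attribute.
impl From<CoreError> for Mp4Error {
    fn from(inner: CoreError) -> Self {
        Mp4Error::Core(inner)
    }
}

/// Hypothetical core-level check.
fn check_timestamp(ts: i64) -> Result<i64, CoreError> {
    if ts < 0 { Err(CoreError::InvalidTimestamp(ts)) } else { Ok(ts) }
}

/// Hypothetical MP4-level caller; `?` converts CoreError into Mp4Error.
fn read_sample_time(ts: i64) -> Result<i64, Mp4Error> {
    let ts = check_timestamp(ts)?;
    Ok(ts)
}

fn main() {
    match read_sample_time(-1) {
        Ok(ts) => println!("timestamp {}", ts),
        Err(err) => println!("{}", err),
    }
}
```

Deriving Copy on the error enums is one way to have the compiler enforce the "no allocation" rule, since a String or Vec payload would then fail to compile.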

6. Feature flags

Cargo features gate codec support so consumers only pay for what they use:

# consumer's Cargo.toml
[dependencies]
koder-cine = { version = "0.1", features = ["mp4", "h264", "aac"] }

Phase 1 features: mp4 (always on in Phase 1 since it's the only container).

Phase 3+ features: h264, h265, av1, vp9, aac, opus, hwaccel-vaapi, hwaccel-nvenc, hwaccel-videotoolbox. Each codec feature flips a cfg gate in the media-codec-* crates and pulls in the matching crate as a dep. No codec is in the default feature set except the ones needed by the probe use case (currently zero — probe is container-only).
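A minimal sketch of what such a cfg gate could look like inside the facade crate follows. The feature names match the list above; the module layout and the compiled_codecs helper are assumptions for illustration. Built with no features enabled, which is the probe-only case, the codec modules are not compiled at all.

```rust
// Each codec module exists only when its Cargo feature is enabled.
#[cfg(feature = "h264")]
pub mod h264 {
    pub const NAME: &str = "h264";
}

#[cfg(feature = "aac")]
pub mod aac {
    pub const NAME: &str = "aac";
}

/// Lists the codecs compiled into this build (hypothetical helper).
pub fn compiled_codecs() -> Vec<&'static str> {
    #[allow(unused_mut)] // never mutated in a probe-only build
    let mut codecs: Vec<&'static str> = Vec::new();
    #[cfg(feature = "h264")]
    codecs.push(h264::NAME);
    #[cfg(feature = "aac")]
    codecs.push(aac::NAME);
    codecs
}

fn main() {
    println!("compiled codecs: {:?}", compiled_codecs());
}
```

Using the `#[cfg]` attribute rather than the `cfg!` macro matters here: the attribute removes the gated code before type checking, so a disabled codec costs no compile time and adds nothing to the binary.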

7. Sync vs async

*koder-cine is sync by default.* Video decoding is CPU-bound; async hides nothing and adds complexity. The one exception is the future media-container-http-stream crate (Phase 4+) that would read from a network stream. That crate can be async, but the parsers and decoders it feeds still receive bytes that are already buffered.

This decision matches symphonia, rav1e, dav1d-rs, and every other production Rust media library I looked at.
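In practice a sync API can simply be generic over std::io::Read, so the same parser works on a file, an in-memory buffer, or bytes that an async layer has already collected. The sketch below reads one MP4 box header that way. The 32-bit big-endian size, four-character type, and 64-bit size escape (size field equal to 1) come from the ISO base media file format; BoxHeader and read_box_header are hypothetical names.

```rust
use std::io::{self, Cursor, Read};

/// Hypothetical header of one MP4 box (atom).
#[derive(Debug, PartialEq)]
pub struct BoxHeader {
    /// Total box size in bytes, header included. 0 means "extends to end of file".
    pub size: u64,
    /// Four-character code such as b"ftyp" or b"moov".
    pub kind: [u8; 4],
    /// 8 for the compact form, 16 when a 64-bit size follows the type.
    pub header_len: u8,
}

/// Blocking read of one box header from any byte source.
pub fn read_box_header<R: Read>(reader: &mut R) -> io::Result<BoxHeader> {
    let mut head = [0u8; 8];
    reader.read_exact(&mut head)?; // fails with UnexpectedEof on truncated input
    let size32 = u32::from_be_bytes([head[0], head[1], head[2], head[3]]);
    let kind = [head[4], head[5], head[6], head[7]];

    if size32 == 1 {
        // A size field of 1 signals that the real size is the next 8 bytes.
        let mut large = [0u8; 8];
        reader.read_exact(&mut large)?;
        Ok(BoxHeader { size: u64::from_be_bytes(large), kind, header_len: 16 })
    } else {
        Ok(BoxHeader { size: u64::from(size32), kind, header_len: 8 })
    }
}

fn main() -> io::Result<()> {
    // An in-memory buffer stands in for a file or a pre-buffered network read.
    let bytes = [0u8, 0, 0, 24, b'f', b't', b'y', b'p'];
    let mut cursor = Cursor::new(&bytes[..]);
    let header = read_box_header(&mut cursor)?;
    println!("{:?}", header);
    Ok(())
}
```

Nothing in the function knows where the bytes came from, which is what lets an eventual async reader sit in front of it without changing the parser.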

8. Testing strategy

*Three layers:*

  1. *Unit tests* per crate in crates/<crate>/src/*.rs, in #[cfg(test)] modules. Small, focused, no external fixtures.
  2. *Integration tests* in crates/<crate>/tests/ using real fixture files from tests/fixtures/. Each fixture is a small (< 1 MB) MP4/WebM generated via ffmpeg with a deterministic command line documented in tests/fixtures/README.md so anyone can regenerate them.
  3. *Fuzz tests* via cargo fuzz in crates/<crate>/fuzz/ that feed random bytes into parsers. Every parser crate has at least one fuzz target from day one. This catches the panic / overflow / OOM classes that plague C parsers.
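The property a fuzz target checks is simple: for any byte string, the parser returns Ok or Err and never panics. The sketch below shows that property with a std-only loop and a toy box walker; it is not a substitute for cargo fuzz, which adds coverage guidance and corpus management. count_boxes, smoke_fuzz and the xorshift generator are illustrative assumptions.

```rust
#[derive(Debug, PartialEq)]
pub enum ParseError {
    Truncated,
    BadSize,
}

/// Toy parser: walks top-level boxes and returns how many it saw.
/// Every offset computation is checked, so hostile sizes yield Err, not a panic.
pub fn count_boxes(data: &[u8]) -> Result<usize, ParseError> {
    let mut offset = 0usize;
    let mut count = 0usize;
    while offset < data.len() {
        let end = offset.checked_add(8).ok_or(ParseError::BadSize)?;
        let head = data.get(offset..end).ok_or(ParseError::Truncated)?;
        let size = u32::from_be_bytes([head[0], head[1], head[2], head[3]]) as usize;
        if size < 8 {
            // Simplification: the special sizes 0 and 1 are rejected here.
            return Err(ParseError::BadSize);
        }
        let next = offset.checked_add(size).ok_or(ParseError::BadSize)?;
        if next > data.len() {
            return Err(ParseError::Truncated);
        }
        offset = next;
        count += 1;
    }
    Ok(count)
}

/// xorshift64: a tiny deterministic generator, good enough for a smoke test.
fn next_random(state: &mut u64) -> u64 {
    *state ^= *state << 13;
    *state ^= *state >> 7;
    *state ^= *state << 17;
    *state
}

/// Feeds pseudo-random inputs to the parser and returns how many were accepted.
/// Reaching the end without a panic is the actual assertion.
pub fn smoke_fuzz(iterations: usize, seed: u64) -> usize {
    let mut state = seed.max(1); // xorshift must not start at zero
    let mut accepted = 0;
    for _ in 0..iterations {
        let len = (next_random(&mut state) % 64) as usize;
        let input: Vec<u8> = (0..len).map(|_| next_random(&mut state) as u8).collect();
        if count_boxes(&input).is_ok() {
            accepted += 1;
        }
    }
    accepted
}

fn main() {
    let accepted = smoke_fuzz(10_000, 42);
    println!("no panics; {} of 10000 inputs parsed cleanly", accepted);
}
```

A real cargo fuzz target would wrap the same call in libfuzzer-sys's `fuzz_target!(|data: &[u8]| { ... })` macro and let libFuzzer generate the inputs.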

*Conformance:* when Phase 3 (codec decoders) lands, a shared tests/conformance/ directory gets checksummed reference outputs: a decoded frame must byte-match a reference frame decoded by ffmpeg from the same input. That is non-negotiable for a codec.
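One possible shape for that check, sketched with std only: compare the decoded bytes against the reference and report the first differing offset, plus a compact checksum that could be checked in when storing whole frames is too heavy. FNV-1a is used here purely to keep the example dependency-free; the actual checksum choice is open, and first_mismatch and fnv1a64 are hypothetical helper names.

```rust
/// Returns the offset of the first byte that differs, or None on an exact match.
/// A length difference counts as a mismatch at the end of the shorter buffer.
pub fn first_mismatch(decoded: &[u8], reference: &[u8]) -> Option<usize> {
    if let Some(pos) = decoded.iter().zip(reference).position(|(a, b)| a != b) {
        return Some(pos);
    }
    if decoded.len() != reference.len() {
        return Some(decoded.len().min(reference.len()));
    }
    None
}

/// 64-bit FNV-1a checksum (non-cryptographic; a stand-in for whatever the
/// conformance suite ends up using).
pub fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

fn main() {
    let reference = [16u8, 128, 128, 235];
    let decoded = [16u8, 128, 129, 235];
    match first_mismatch(&decoded, &reference) {
        None => println!("frame matches reference"),
        Some(offset) => println!("mismatch at byte {}", offset),
    }
    println!("reference checksum: {:016x}", fnv1a64(&reference));
}
```

Reporting the offset rather than a bare pass/fail makes a failing conformance test much easier to trace back to a plane, row, or macroblock.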

*Code coverage:* target ≥ 80% line coverage on parser crates. Tracked via cargo llvm-cov in CI.

9. Migration path to Koder Koda

When the lang/520-548 epic has advanced enough (specifically: 521 SIMD + 522 LLVM backend + 523 generics + 526 alignment + 527 LTO + 528 cross-compile + 539 FFI + 547 C ABI export, the P0 subset, roughly years 2-3 of that roadmap), a *crate-by-crate port* becomes viable.

The ordering I recommend:

  1. media-container-mp4: first, because it's pure bit-twiddling with zero codec math and zero external deps. Calibrates the Rust→Koder Koda porting cost for all future crates.
  2. media-container-webm, media-container-ogg: same reasoning, different containers.
  3. media-bitstream: shared bit reader primitives.
  4. media-codec-opus: smallest codec, reference port for codec crates.
  5. media-codec-aac, media-codec-h264, others: in increasing complexity.
  6. media-cli: last, because it depends on everything else.

Crates that wrap C libs (media-codec-av1 wrapping dav1d, media-hwaccel-* wrapping libva/libnvenc) are ported last, or may never be ported — C deps stay C. Koder Koda calls them via its own FFI (ticket lang/539).

10. Roadmap

Phase 1 (now) — Container probe

  • Scaffold workspace ✅
  • media-core with Error, Duration, basic types
  • media-container-mp4 MP4 atom parser, Mp4File::probe() API
  • media-cli with koder-cine probe subcommand
  • Wire into platform/talk/internal/bot/video.go::probe() behind a feature flag (USE_KODER_CINE_PROBE=1 env var or similar)
  • 3 fixture MP4 files + unit tests + integration tests
  • 1 fuzz target for the MP4 parser

*Exit criteria:* koder-cine probe foo.mp4 prints the same JSON shape ffprobe produces, for at least 3 real MP4 files.

*Time:* 1-2 sessions of focused work.

Phase 2 — More containers + bitstream primitives

  • media-container-webm, media-container-ogg
  • media-bitstream (Golomb codes, exp-golomb, CABAC stub)
  • Keyframe extraction API (Mp4File::keyframes()) so koder-cine keyframes video.mp4 --max 5 works; this replaces the second call in video.go
  • Audio demux API (extract Opus/AAC bitstream without decode): the Whisper pipeline just needs raw audio, not decoded samples
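For a sense of scale of the media-bitstream primitives listed above, here is a small sketch of an MSB-first bit reader with unsigned Exp-Golomb decoding, the ue(v) coding used throughout H.264 headers: count n leading zero bits, skip the terminating 1, read n more bits as a suffix, and return 2^n - 1 + suffix. The BitReader type and its method names are assumptions, not the planned API.

```rust
/// Hypothetical MSB-first bit reader over a byte slice.
pub struct BitReader<'a> {
    data: &'a [u8],
    pos: usize, // position in bits from the start of `data`
}

impl<'a> BitReader<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        BitReader { data, pos: 0 }
    }

    /// Reads one bit, or None when the input is exhausted.
    pub fn read_bit(&mut self) -> Option<u8> {
        let byte = *self.data.get(self.pos / 8)?;
        let bit = (byte >> (7 - (self.pos % 8))) & 1;
        self.pos += 1;
        Some(bit)
    }

    /// Reads `count` bits (at most 31 here) as an unsigned integer.
    pub fn read_bits(&mut self, count: u32) -> Option<u32> {
        let mut value = 0u32;
        for _ in 0..count {
            value = (value << 1) | u32::from(self.read_bit()?);
        }
        Some(value)
    }

    /// Decodes one unsigned Exp-Golomb value.
    /// Returns None on truncated input or a prefix longer than 31 zeros.
    pub fn read_ue(&mut self) -> Option<u32> {
        let mut zeros = 0u32;
        while self.read_bit()? == 0 {
            zeros += 1;
            if zeros > 31 {
                return None; // would not fit in u32
            }
        }
        let suffix = self.read_bits(zeros)?;
        Some((1u32 << zeros) - 1 + suffix)
    }
}

fn main() {
    // Bits 1 | 010 | 011 | 0 hold the codes for 0, 1 and 2, then run out.
    let data = [0b1010_0110u8];
    let mut reader = BitReader::new(&data);
    while let Some(value) = reader.read_ue() {
        println!("ue(v) = {}", value);
    }
}
```

Returning Option instead of panicking on truncated input keeps the reader usable directly under a fuzz target.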

*Exit criteria:* koder-talk VideoAnalyzer no longer shells out to ffmpeg for probe + keyframe + audio demux. ffmpeg still used for: scale, JPEG encode of keyframes, audio transcode to Opus if needed.

*Status (2026-04-12):* probe ✅ (ticket talk/002 via TALKD_VIDEO_PROBE_BACKEND=cine). Audio demux partially done: the Opus fast-path is wired in ticket talk/009 (TALKD_AUDIO_BACKEND=cine), which remuxes Opus tracks straight to Ogg without an ffmpeg fork. Non-Opus audio (AAC etc.) still routes through ffmpeg because native decode/encode land in Phase 3/5. Keyframe extraction still uses ffmpeg end-to-end: koder-cine identifies keyframes but has no scaler or JPEG encoder yet (Phase 4).

*Time:* ~1-2 months.

Phase 3 — First real decoders

  • media-codec-h264 (the big one — 6-12 months standalone)
  • media-codec-opus (port audiopus, ~1 month)
  • media-codec-aac (LC only; ~2 months)
  • Decoder API in media-core: Decoder::decode_frame(packet) → Frame
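The Decoder API in the last bullet could take roughly the following form. This is a sketch under assumptions: the Packet and Frame fields, the DecodeError type, and the toy RawGreyDecoder (which just copies raw 8-bit grey pixels) are all invented for illustration; only the decode_frame(packet) → Frame signature comes from the list above.

```rust
/// Hypothetical compressed input unit.
#[derive(Debug)]
pub struct Packet<'a> {
    pub data: &'a [u8],
    pub pts: i64, // presentation timestamp, unit left to the container
}

/// Hypothetical decoded output unit.
#[derive(Debug, PartialEq)]
pub struct Frame {
    pub width: u32,
    pub height: u32,
    pub pts: i64,
    pub pixels: Vec<u8>,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DecodeError {
    SizeMismatch { expected: usize, actual: usize },
}

/// Sync decoder interface: one packet in, one frame out.
pub trait Decoder {
    fn decode_frame(&mut self, packet: &Packet<'_>) -> Result<Frame, DecodeError>;
}

/// Toy "codec" used only to exercise the trait: the payload already is the frame.
pub struct RawGreyDecoder {
    pub width: u32,
    pub height: u32,
}

impl Decoder for RawGreyDecoder {
    fn decode_frame(&mut self, packet: &Packet<'_>) -> Result<Frame, DecodeError> {
        let expected = self.width as usize * self.height as usize;
        if packet.data.len() != expected {
            return Err(DecodeError::SizeMismatch { expected, actual: packet.data.len() });
        }
        Ok(Frame {
            width: self.width,
            height: self.height,
            pts: packet.pts,
            pixels: packet.data.to_vec(),
        })
    }
}

fn main() {
    let mut decoder = RawGreyDecoder { width: 2, height: 2 };
    let packet = Packet { data: &[0, 64, 128, 255], pts: 0 };
    match decoder.decode_frame(&packet) {
        Ok(frame) => println!("decoded {}x{} frame at pts {}", frame.width, frame.height, frame.pts),
        Err(err) => println!("decode failed: {:?}", err),
    }
}
```

Taking `&mut self` leaves room for decoders that keep reference frames between calls, which H.264 requires. A real API would also need a way to signal "packet consumed, no frame yet" for codecs with reordering delay.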

*Exit criteria:* a 10-second H.264 MP4 decodes through koder-cine end-to-end, frame-exact with ffmpeg output.

*Time:* ~12-18 months.

Phase 4 — Filters + JPEG encode

  • media-filter-scale (bilinear + bicubic)
  • media-filter-colorspace (yuv420→rgb etc)
  • JPEG encoder (use jpeg-encoder or mozjpeg via ffi)
  • koder-cine keyframes --output frame-%03d.jpg produces the same JPEGs ffmpeg would

*Exit criteria:* koder-talk VideoAnalyzer.Analyze() works end-to-end without ffmpeg installed on the host.

*Time:* ~6 months on top of Phase 3.

Phase 5 — Encoder, HW accel, Go SDK

  • media-codec-h264-encoder (wrap x264 for now, native later)
  • media-hwaccel-vaapi (wrap libva)
  • media-hwaccel-nvenc, media-hwaccel-videotoolbox
  • media-sdk-go as cdylib + Go bindings, proper library use instead of CLI exec

*Exit criteria:* Koder products can transcode + encode without any ffmpeg binary anywhere in the stack.

*Time:* ~18-24 months on top of Phase 4.

*Total honest estimate:* ~3-5 years solo, ~2-3 years with a team of 2-3.

11. Licensing

*MIT* for everything we write. This is the permissive default that matches the rest of Koder.

Dependencies must be *MIT, Apache-2.0, BSD, ISC, MPL-2.0, or dual-licensed including one of the above*. *GPL and LGPL are forbidden* because they would force koder-cine (and transitively every Koder product that links it) to GPL. This excludes x264 (GPL) as a direct dep: we'll either wrap it via subprocess (GPL contamination avoided) or write our own encoder. dav1d (BSD) is fine. libopus (BSD) is fine. libx265 (GPL) is forbidden: we'll write an H.265 encoder ourselves or wrap via subprocess.

This constraint shapes Phase 5 scope significantly and must be revisited when encoder work starts.

11.1 MPL-2.0 carve-out (amended 2026-04-12)

MPL-2.0 was added to the allow list above to unblock adoption of the Symphonia audio decoder family (symphonia-codec-aac and friends). The amendment is narrow and has two parts:

  1. *MPL-2.0 is allowed for dependencies only.* Code Koder writes inside platform/cine/ stays MIT. We do not author new MPL-2.0 files in this workspace.
  2. *We do not modify MPL-2.0 dependency files in-tree.* If a symphonia bug needs fixing, we upstream the fix (the normal MPL-2.0 share-back obligation) rather than carrying a patched fork in platform/cine/. If upstream is unresponsive and we genuinely need a local patch, it goes in a fork repo with its patched files kept under MPL-2.0.

*Why this is safe:* MPL-2.0 is file-level copyleft. The license text (§3.2, §3.3) explicitly permits static linking of MPL files into a larger work under a different license. Mozilla designed MPL-2.0 around exactly this use case: Firefox extensions and embedders linking Mozilla libraries without contaminating the embedding code. Rust's cargo ecosystem has extensive precedent (rustls, webrender, parts of wasmtime all pull MPL-2.0 crates). Koder products linking libkoder_cine.so via cgo (Phase 5) are unaffected: the MPL boundary stops at the symphonia source files, not at the Rust crate boundary, and certainly not at the Go cgo boundary.

*What this does NOT permit:*

  • Copying symphonia source into our tree and re-licensing it MIT
  • Authoring new MPL-2.0 files as part of koder-cine
  • Accepting LGPL / GPL deps; those stay forbidden regardless
  • Using MPL-2.0 as cover for modifying dependency code in place

The licenses/ audit file (if we ever add one) should explicitly call out every MPL-2.0 direct dep and confirm it is pulled as a dependency only, with no local modifications.

12. Rejected alternatives

Monolith instead of workspace

Considered a single koder-cine crate with modules per codec. Rejected: Cargo features don't compose cleanly across modules, and the migration path to Koder Koda wants crate-sized units anyway. Workspace wins on both ergonomics and future-proofing.

async from day one

Considered making the API async. Rejected for §7 reasons — CPU-bound work doesn't benefit from async, it just adds color.

no_std support

Considered. Rejected for Phase 1 — std gives us thiserror, anyhow, Vec, file I/O. Embedded use (IoT cameras etc) is a Phase 5+ concern and can be handled via feature gate then.

Use ffmpeg-next crate (Rust bindings to libav)

Considered. Rejected because it's just rebranded C, doesn't solve the CVE surface problem, adds a binding-maintenance burden, and makes the Koder Koda migration strictly harder (we'd be porting C, not Rust).

13. Open questions

  1. How do we test decoder output against ffmpeg reference without taking a dependency on ffmpeg being installed? Probably: ship the reference outputs as checked-in fixtures (regenerated rarely by a maintainer with ffmpeg locally). Decide in Phase 3.
  2. Should media-core re-export the error type of each crate or only the domain types? Lean: domain types only. Each crate's error is its own concern.
  3. At what size does tests/fixtures/ outgrow plain git and need git LFS? Likely: Phase 2 when WebM fixtures with multiple tracks land. Revisit then.
  4. Does koder-cine eventually ship its own fuzzer corpus on the Koder Flow registry for the community? Stretch goal.

Source: ../home/koder/dev/koder/meta/docs/stack/rfcs/kodec-RFC-001-cine-workspace-architecture.md