RFC-001: koder-cine workspace architecture
- **Status:** Draft (2026-04-11)
- **Author:** Claude Opus 4.6 + Rodrigo
- **Target release:** koder-cine v0.1.0 → v1.0.0 (Phase 1 through Phase 5)
- **Supersedes:** nothing; this is the founding doc of the koder-cine module
1. Summary
This RFC defines the architecture, crate map, error model, feature flag policy, testing strategy, and multi-year roadmap for **koder-cine**, Koder's native Rust replacement for the subset of ffmpeg that Koder products actually consume.
The goal of Phase 1 is concrete and small: **stop shelling out to ffprobe** in `platform/talk/internal/bot/video.go::probe()` by delivering a pure-Rust MP4 container parser and a `koder-cine probe` CLI that emits the same `{duration, width, height, has_audio}` shape the Go code already reads.
Phases 2 through 5 are roadmap. Phase 1 is the commitment.
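As a minimal sketch of the probe output shape named above, the struct below carries the four fields and serializes them by hand. The type name `ProbeResult`, the `to_json` method, the choice of seconds for `duration`, and the three-decimal formatting are all assumptions made for illustration; the real CLI may use a JSON crate and a different layout.

```rust
/// Hypothetical probe result. Field names follow the
/// `{duration, width, height, has_audio}` shape the Go code reads.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ProbeResult {
    /// Duration in seconds (the unit is an assumption of this sketch).
    pub duration: f64,
    pub width: u32,
    pub height: u32,
    pub has_audio: bool,
}

impl ProbeResult {
    /// Serialize by hand so the sketch needs no JSON crate.
    /// Every field is a number or a boolean, so no string escaping is needed.
    /// A non-finite duration would not be valid JSON; callers are assumed
    /// to reject such values before reaching this point.
    pub fn to_json(&self) -> String {
        format!(
            "{{\"duration\":{:.3},\"width\":{},\"height\":{},\"has_audio\":{}}}",
            self.duration, self.width, self.height, self.has_audio
        )
    }
}
```

Keeping the result `Copy` and free of heap data matches the spirit of the error model in §5: the parser fills plain fields and formatting happens only at the CLI boundary.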
2. Why Rust
The full "why Rust vs alternatives" conversation is in the session notes that produced this module (search for "ffmpeg clone language choice"). The short version:
- **SIMD intrinsics** are non-negotiable for codec inner loops. Rust exposes `std::arch::{x86_64,aarch64}` stable today.
- **Memory safety** is non-negotiable for a parser of hostile input. The bulk of ffmpeg's CVE history is C memory bugs in decoders. Rust's ownership model shuts that class of bug out of safe code by construction.
- **Cross-compile** to every Koder target (x86_64 Linux/macOS/Windows, aarch64 Linux/macOS/iOS, eventually riscv64 Linux) is a first-class Cargo feature. No other candidate matches this.
- **Ecosystem maturity:** rav1e, dav1d, symphonia, mp4-rust, bytes, thiserror, libfuzzer-sys, all in production use. We stand on a decade of media-in-Rust experience instead of pioneering.
- **FFI to C** is cheap when we need to wrap libva/libnvenc for hardware accel, and equally cheap in the other direction: Go services can cgo into `libkoder_cine.so` without heroics.
Koder Koda is **not** yet viable for this workload. The lang/520-548 backlog tracks the 28 capabilities the language needs before a Koder Koda native implementation is viable. When that day comes, migration is incremental, crate by crate; see §9 below.
3. Why not just use ffmpeg through a cleaner wrapper
We considered three "reuse ffmpeg" strategies and rejected all of them:
- **Link libav** (libavformat, libavcodec) into Go via cgo. Rejected because it keeps the C memory-safety surface area intact, and Go's cgo boundary is a known source of crashes when the C code misbehaves. The CVE problem is transferred, not solved.
- **Run ffmpeg as a sidecar daemon** exposing a gRPC/HTTP API. Rejected because it trades process fork overhead for network overhead and adds a new service to operate. It would reduce fork cost but not eliminate it, and it would not address the CVE surface.
- **Maintain a hardened ffmpeg fork.** Rejected because the ffmpeg codebase is ~1.2 MLOC of C and our team does not have the bandwidth to audit it, much less harden it. Other organizations have tried (Chrome's media fork) and it is a multi-year, full-time effort.
The remaining option is: **write the subset we actually use in Rust**, matching ffmpeg's behavior only where Koder products depend on it, and leaving the rest uncovered. The subset is small: Phase 1 is just MP4 container probing. Phases 2-5 grow the subset under real demand.
4. Crate map
platform/cine/
├── Cargo.toml # workspace root
├── README.md
├── koder.toml # Koder product metadata
├── docs/rfcs/ # design docs (this file lives here)
├── backlog/ # pending/in-progress/done tickets
├── crates/
│ ├── media-core/ # Phase 1 — shared types
│ ├── media-container-mp4/ # Phase 1 — MP4/MOV/M4A/M4V atoms
│ ├── media-cli/ # Phase 1 — `koder-cine` CLI
│ │
│ ├── media-container-webm/ # Phase 2 — Matroska/WebM
│ ├── media-container-ogg/ # Phase 2 — Ogg (Opus, Vorbis)
│ ├── media-bitstream/ # Phase 2 — bit reader primitives
│ │ # shared by decoders
│ │
│ ├── media-codec-h264/ # Phase 3 — H.264 decoder (from scratch)
│ ├── media-codec-vp9/ # Phase 3 — VP9 decoder (wrap an existing decoder or port)
│ ├── media-codec-opus/ # Phase 3 — Opus decoder (port audiopus)
│ ├── media-codec-aac/ # Phase 3 — AAC-LC decoder
│ │
│ ├── media-filter-scale/ # Phase 4 — scaling / colorspace
│ ├── media-filter-select/ # Phase 4 — keyframe selection
│ │ # (replaces ffmpeg -vf select)
│ │
│ ├── media-codec-h265/ # Phase 5 — H.265 decoder
│ ├── media-codec-av1/ # Phase 5 — AV1 (wrap dav1d for now)
│ ├── media-hwaccel-vaapi/ # Phase 5 — VA-API hardware accel
│ ├── media-hwaccel-nvenc/ # Phase 5 — NVIDIA hardware encoder
│ └── media-sdk-go/ # Phase 5 — cdylib + Go bindings
└── tests/
    └── fixtures/ # sample MP4/WebM files for tests

Crate count at the end of Phase 5: ~17. Comparable in organization to symphonia (which has ~12 codec crates) but with container parsers as separate first-class crates (not buried inside a demuxer).
5. Error model
Each crate defines its own Error enum using thiserror::Error, scoped to the operations of that crate. Top-level binaries and consumers convert via anyhow::Error at API boundaries.
    // media-core/src/error.rs
    use thiserror::Error;

    // CodecId is assumed to be defined elsewhere in media-core
    // and to implement Debug (needed by the {0:?} placeholder).
    #[derive(Error, Debug)]
    pub enum Error {
        #[error("unsupported codec: {0:?}")]
        UnsupportedCodec(CodecId),
        #[error("invalid timestamp: {0}")]
        InvalidTimestamp(i64),
        // ... shared variants only
    }

    // media-container-mp4/src/error.rs
    use thiserror::Error;

    #[derive(Error, Debug)]
    pub enum Mp4Error {
        #[error("not an MP4 file: missing ftyp atom")]
        NotAnMp4,
        #[error("truncated atom at offset {offset}: expected {expected} bytes, got {actual}")]
        TruncatedAtom { offset: u64, expected: u64, actual: u64 },
        // #[from] generates From<media_core::Error>, so `?` converts automatically.
        #[error("core: {0}")]
        Core(#[from] media_core::Error),
    }

Hot paths never allocate an error. All error variants carry `Copy` data or static strings. String formatting happens only when the error is displayed, via the `Display` impl.
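To make the conversion and the "format only on display" rule concrete, here is a std-only sketch that writes out by hand roughly what the `thiserror` derive provides. It is an illustration, not the crate's code: `CoreError` stands in for `media_core::Error` so both enums fit in one file, and `CodecId`'s variants, `check_timestamp`, and `read_sample_time` are hypothetical.

```rust
use std::fmt;

/// Hypothetical stand-in for the codec identifier in media-core.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CodecId { H264, Aac, Unknown(u32) }

/// Stand-in for `media_core::Error`; every variant is `Copy`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CoreError {
    UnsupportedCodec(CodecId),
    InvalidTimestamp(i64),
}

impl fmt::Display for CoreError {
    // Formatting happens here, only when someone prints the error.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CoreError::UnsupportedCodec(c) => write!(f, "unsupported codec: {:?}", c),
            CoreError::InvalidTimestamp(t) => write!(f, "invalid timestamp: {}", t),
        }
    }
}
impl std::error::Error for CoreError {}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mp4Error {
    NotAnMp4,
    TruncatedAtom { offset: u64, expected: u64, actual: u64 },
    Core(CoreError),
}

impl fmt::Display for Mp4Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Mp4Error::NotAnMp4 => write!(f, "not an MP4 file: missing ftyp atom"),
            Mp4Error::TruncatedAtom { offset, expected, actual } => write!(
                f,
                "truncated atom at offset {}: expected {} bytes, got {}",
                offset, expected, actual
            ),
            Mp4Error::Core(e) => write!(f, "core: {}", e),
        }
    }
}
impl std::error::Error for Mp4Error {}

/// Hand-written equivalent of the `#[from]` attribute.
impl From<CoreError> for Mp4Error {
    fn from(e: CoreError) -> Self { Mp4Error::Core(e) }
}

/// Hypothetical shared helper living in the core crate.
pub fn check_timestamp(ts: i64) -> Result<i64, CoreError> {
    if ts < 0 { Err(CoreError::InvalidTimestamp(ts)) } else { Ok(ts) }
}

/// Hypothetical MP4-side caller: `?` lifts `CoreError` into `Mp4Error`.
pub fn read_sample_time(ts: i64) -> Result<i64, Mp4Error> {
    Ok(check_timestamp(ts)?)
}
```

Constructing either error is a plain enum write with no heap allocation; a `String` is produced only if a caller invokes `to_string()` or prints the value.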
6. Feature flags
Cargo features gate codec support so consumers only pay for what they use:
    # consumer's Cargo.toml
    [dependencies]
    koder-cine = { version = "0.1", features = ["mp4", "h264", "aac"] }

Phase 1 features: `mp4` (always on in Phase 1 since it's the only container).
Phase 3+ features: h264, h265, av1, vp9, aac, opus, hwaccel-vaapi, hwaccel-nvenc, hwaccel-videotoolbox. Each codec feature flips a cfg gate in the media-codec-* crates and pulls in the matching crate as a dep. No codec is in the default feature set except the ones needed by the probe use case (currently zero — probe is container-only).
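On the Rust side, a codec feature typically shows up as a `cfg` gate. The sketch below assumes the feature names listed above; `enabled_codecs` and `h264_support` are hypothetical helpers that exist only to show the two common forms, the `cfg!` macro and the `#[cfg]` attribute.

```rust
/// Names of the codecs compiled into this build.
/// `cfg!` evaluates to a compile-time `true` or `false`, so disabled
/// branches are trivially removed by the optimizer.
pub fn enabled_codecs() -> Vec<&'static str> {
    let mut codecs = Vec::new();
    if cfg!(feature = "h264") {
        codecs.push("h264");
    }
    if cfg!(feature = "aac") {
        codecs.push("aac");
    }
    if cfg!(feature = "opus") {
        codecs.push("opus");
    }
    codecs
}

/// With `#[cfg]`, the gated item does not exist at all when the feature
/// is off, so the codec crate behind it is never pulled in or linked.
#[cfg(feature = "h264")]
pub fn h264_support() -> &'static str {
    "compiled in"
}

/// Fallback used when the consumer did not ask for `h264`.
#[cfg(not(feature = "h264"))]
pub fn h264_support() -> &'static str {
    "not compiled in"
}
```

Built with no features enabled, `enabled_codecs()` returns an empty list, which is the situation the probe use case is in today since it is container-only.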
7. Sync vs async
**koder-cine is sync by default.** Video decoding is CPU-bound; async hides nothing and adds complexity. The one exception is the future `media-container-http-stream` crate (Phase 4+) that would read from a network stream. That one can be async, but everything that consumes its output gets bytes already buffered.
This decision matches symphonia, rav1e, dav1d-rs, and every other production Rust media library I looked at.
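One consequence of the sync decision is that a container parser can be written against plain `std::io::Read + Seek`. The sketch below walks the top-level atoms of an MP4 stream that way. The box header layout (32-bit big-endian size, four-character type, size `1` meaning a 64-bit size follows, size `0` meaning "to end of file") is the ISO base media file format's; the names `AtomHeader` and `top_level_atoms` are hypothetical and not the crate's API.

```rust
use std::io::{self, Read, Seek, SeekFrom};

/// One top-level atom: type, absolute offset, total size (header included).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AtomHeader {
    pub kind: [u8; 4],
    pub offset: u64,
    pub size: u64,
}

/// Walk the top-level atoms using blocking I/O only.
/// Fewer than 8 trailing bytes are ignored rather than reported.
pub fn top_level_atoms<R: Read + Seek>(r: &mut R) -> io::Result<Vec<AtomHeader>> {
    let end = r.seek(SeekFrom::End(0))?;
    let mut offset = 0u64;
    let mut atoms = Vec::new();
    // Invariant: offset <= end, so the subtraction cannot underflow.
    while end - offset >= 8 {
        r.seek(SeekFrom::Start(offset))?;
        let mut head = [0u8; 8];
        r.read_exact(&mut head)?;
        let size32 = u32::from_be_bytes([head[0], head[1], head[2], head[3]]);
        let kind = [head[4], head[5], head[6], head[7]];
        let (size, header_len) = match size32 {
            // Size 0: the atom extends to the end of the stream.
            0 => (end - offset, 8),
            // Size 1: a 64-bit size follows the type field.
            1 => {
                let mut large = [0u8; 8];
                r.read_exact(&mut large)?;
                (u64::from_be_bytes(large), 16)
            }
            n => (u64::from(n), 8),
        };
        // Reject atoms smaller than their own header or running past the end.
        if size < header_len || size > end - offset {
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "truncated or malformed atom",
            ));
        }
        atoms.push(AtomHeader { kind, offset, size });
        offset += size;
    }
    Ok(atoms)
}
```

The same function works on a `std::fs::File` or on an in-memory `std::io::Cursor`, which is what keeps unit tests free of fixtures and of any async runtime.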
8. Testing strategy
**Three layers:**
- **Unit tests** per crate in `crates/*/src/*.rs` with `#[test]` modules. Small, focused, no external fixtures.
- **Integration tests** in `crates/*/tests/` using real fixture files from `tests/fixtures/`. Each fixture is a small (< 1 MB) MP4/WebM generated via ffmpeg with a deterministic command line documented in `tests/fixtures/README.md` so anyone can regenerate them.
- **Fuzz tests** via `cargo fuzz` in `crates/*/fuzz/` that feed random bytes into parsers. Every parser crate has at least one fuzz target from day one. This catches the panic / overflow / OOM classes that plague C parsers.
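The property the fuzz layer checks is simply "no input makes the parser panic". The std-only sketch below shows that property with a tiny deterministic generator instead of libFuzzer; it is a stand-in for illustration, not a replacement for a real `cargo fuzz` target, which is coverage-guided and far more effective. `parse_box_header`, `XorShift64`, and `smoke_fuzz` are hypothetical names.

```rust
/// Parse one box header from a slice: returns (total size, type).
/// Returns `None` on short or inconsistent input instead of panicking.
pub fn parse_box_header(data: &[u8]) -> Option<(u64, [u8; 4])> {
    let head = data.get(..8)?;
    let size32 = u32::from_be_bytes([head[0], head[1], head[2], head[3]]);
    let kind = [head[4], head[5], head[6], head[7]];
    let size = match size32 {
        0 => data.len() as u64, // atom runs to the end of the buffer
        1 => {
            // 64-bit size stored after the type field
            let mut buf = [0u8; 8];
            buf.copy_from_slice(data.get(8..16)?);
            u64::from_be_bytes(buf)
        }
        n => u64::from(n),
    };
    let min = if size32 == 1 { 16 } else { 8 };
    if size < min || size > data.len() as u64 {
        return None;
    }
    Some((size, kind))
}

/// Minimal xorshift64 generator so runs are reproducible without crates.
pub struct XorShift64(pub u64);

impl XorShift64 {
    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Feed pseudo-random buffers to the parser and count accepted inputs.
/// A panic inside the parser aborts the run, which is the failure signal.
pub fn smoke_fuzz(seed: u64, iterations: usize) -> usize {
    let mut rng = XorShift64(seed.max(1)); // xorshift state must be non-zero
    let mut accepted = 0;
    for _ in 0..iterations {
        let len = (rng.next_u64() % 32) as usize;
        let buf: Vec<u8> = (0..len).map(|_| rng.next_u64() as u8).collect();
        if let Some((size, _)) = parse_box_header(&buf) {
            // Invariant the parser promises: never claim more than it saw.
            assert!(size <= buf.len() as u64);
            accepted += 1;
        }
    }
    accepted
}
```

Uniformly random bytes rarely form a valid header, so nearly every iteration exercises a rejection path; that limitation is the reason to rely on a coverage-guided fuzzer for real targets.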
**Conformance:** when Phase 3 (codec decoders) lands, a shared `tests/conformance/` directory gets checksummed reference outputs. A decoded frame must byte-match a reference frame decoded by ffmpeg from the same input. That is non-negotiable for a codec.
**Code coverage:** target ≥ 80% line coverage on parser crates. Tracked via `cargo llvm-cov` in CI.
9. Migration path to Koder Koda
When the lang/520-548 epic has advanced enough (specifically: 521 SIMD + 522 LLVM backend + 523 generics + 526 alignment + 527 LTO + 528 cross-compile + 539 FFI + 547 C ABI export, the P0 subset, roughly years 2-3 of that roadmap), a **crate-by-crate port** becomes viable.
The ordering I recommend:
1. `media-container-mp4`: first, because it's pure bit-twiddling with zero codec math and zero external deps. Calibrates the Rust→Koder Koda porting cost for all future crates.
2. `media-container-webm`, `media-container-ogg`: same reasoning, different containers.
3. `media-bitstream`: shared bit reader primitives.
4. `media-codec-opus`: smallest codec, reference port for codec crates.
5. `media-codec-aac`, `media-codec-h264`, others: in increasing complexity.
6. `media-cli`: last, because it depends on everything else.
Crates that wrap C libs (media-codec-av1 wrapping dav1d, media-hwaccel-* wrapping libva/libnvenc) are ported last, or may never be ported — C deps stay C. Koder Koda calls them via its own FFI (ticket lang/539).
10. Roadmap
Phase 1 (now) — Container probe
- Scaffold workspace ✅
- `media-core` with `Error`, `Duration`, basic types
- `media-container-mp4` MP4 atom parser, `Mp4File::probe()` API
- `media-cli` with `koder-cine probe` subcommand
- Wire into `platform/talk/internal/bot/video.go::probe()` with a feature flag (`USE_KODER_CINE_PROBE=1` env var or similar)
- 3 fixture MP4 files + unit tests + integration tests
- 1 fuzz target for the MP4 parser
**Exit criteria:** `koder-cine probe foo.mp4` prints the same JSON shape ffprobe produces, for at least 3 real MP4 files.
**Time:** 1-2 sessions of focused work.
Phase 2 — More containers + bitstream primitives
- `media-container-webm`, `media-container-ogg`
- `media-bitstream` (Golomb codes, exp-golomb, CABAC stub)
- Keyframe extraction API (`Mp4File::keyframes()`) so `koder-cine keyframes video.mp4 --max 5` works; second call in `video.go` replaced
- Audio demux API (extract Opus/AAC bitstream without decode): the Whisper pipeline just needs raw audio, not decoded samples
**Exit criteria:** koder-talk `VideoAnalyzer` no longer shells out to ffmpeg for probe + keyframe + audio demux. ffmpeg still used for: scale, JPEG encode of keyframes, audio transcode to Opus if needed.
**Status (2026-04-12):** probe ✅ (ticket talk/002 via `TALKD_VIDEO_PROBE_BACKEND=cine`). Audio demux partially done: the Opus fast-path is wired in ticket talk/009 (`TALKD_AUDIO_BACKEND=cine`), which remuxes Opus tracks straight to Ogg without an ffmpeg fork. Non-Opus audio (AAC etc.) still routes through ffmpeg because native decode/encode land in Phase 3/5. Keyframe extraction still uses ffmpeg end-to-end: koder-cine identifies keyframes but has no scaler or JPEG encoder yet (Phase 4).
**Time:** ~1-2 months.
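For the `media-bitstream` item, the core primitive is an MSB-first bit reader with Exp-Golomb decoding, the variable-length code H.264 uses for most header fields. The sketch below is one minimal way to write it; the type and method names are hypothetical, and a production reader would read whole words rather than one bit at a time.

```rust
/// MSB-first bit reader over a byte slice.
pub struct BitReader<'a> {
    data: &'a [u8],
    pos: usize, // position in bits from the start of `data`
}

impl<'a> BitReader<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        BitReader { data, pos: 0 }
    }

    /// Read one bit, or `None` at end of input.
    pub fn read_bit(&mut self) -> Option<u32> {
        let byte = *self.data.get(self.pos / 8)?;
        let bit = (byte >> (7 - (self.pos % 8))) & 1;
        self.pos += 1;
        Some(u32::from(bit))
    }

    /// Read `n` bits (n <= 32) as an unsigned integer.
    pub fn read_bits(&mut self, n: u32) -> Option<u32> {
        debug_assert!(n <= 32);
        let mut v = 0u32;
        for _ in 0..n {
            v = (v << 1) | self.read_bit()?;
        }
        Some(v)
    }

    /// Unsigned Exp-Golomb: count k leading zeros, skip the 1 bit,
    /// read k suffix bits; value = 2^k - 1 + suffix.
    pub fn read_ue(&mut self) -> Option<u32> {
        let mut zeros = 0u32;
        while self.read_bit()? == 0 {
            zeros += 1;
            if zeros > 31 {
                return None; // would not fit in u32; treat as malformed
            }
        }
        let suffix = self.read_bits(zeros)?;
        Some((1u32 << zeros) - 1 + suffix)
    }

    /// Signed Exp-Golomb: code numbers 0, 1, 2, 3, 4 map to 0, 1, -1, 2, -2.
    pub fn read_se(&mut self) -> Option<i32> {
        let k = self.read_ue()?;
        let magnitude = ((k + 1) / 2) as i32;
        Some(if k % 2 == 1 { magnitude } else { -magnitude })
    }
}
```

As a worked example, the bit string `1 010 011 00100 00101` decodes to the unsigned values 0, 1, 2, 3, 4. Returning `Option` instead of panicking on exhausted input keeps the reader compatible with the fuzzing requirement in §8.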
Phase 3 — First real decoders
- `media-codec-h264` (the big one; 6-12 months standalone)
- `media-codec-opus` (port audiopus, ~1 month)
- `media-codec-aac` (LC only; ~2 months)
- Decoder API in `media-core`: `Decoder::decode_frame(packet) → Frame`
**Exit criteria:** a 10-second H.264 MP4 decodes through koder-cine end-to-end, frame-exact with ffmpeg output.
**Time:** ~12-18 months.
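The `Decoder::decode_frame(packet) → Frame` line could translate into a trait along the following lines. Everything except the trait and method names is an assumption of this sketch: the `Packet` and `Frame` fields, the `DecodeError` type, and the `RawGreyDecoder` stand-in, which exists only to exercise the trait and performs no real decoding.

```rust
/// Hypothetical compressed packet handed over by a container parser.
pub struct Packet<'a> {
    pub data: &'a [u8],
    pub pts: i64,
}

/// Hypothetical decoded frame: one byte per pixel, row-major.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Frame {
    pub width: u32,
    pub height: u32,
    pub pts: i64,
    pub pixels: Vec<u8>,
}

/// Error type kept `Copy`, in line with the error model in §5.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DecodeError {
    EmptyPacket,
}

/// Synchronous decoder interface: one packet in, one frame out.
pub trait Decoder {
    fn decode_frame(&mut self, packet: &Packet<'_>) -> Result<Frame, DecodeError>;
}

/// Stand-in "codec" that treats each packet byte as one grey pixel
/// in a single-row frame.
pub struct RawGreyDecoder;

impl Decoder for RawGreyDecoder {
    fn decode_frame(&mut self, packet: &Packet<'_>) -> Result<Frame, DecodeError> {
        if packet.data.is_empty() {
            return Err(DecodeError::EmptyPacket);
        }
        Ok(Frame {
            width: packet.data.len() as u32,
            height: 1,
            pts: packet.pts,
            pixels: packet.data.to_vec(),
        })
    }
}
```

A real H.264 decoder may need several packets before it can emit a frame because of frame reordering, so the final signature might have to return an optional frame or expose a separate flush call. That choice is left open here.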
Phase 4 — Filters + JPEG encode
- `media-filter-scale` (bilinear + bicubic)
- `media-filter-colorspace` (yuv420→rgb etc)
- JPEG encoder (use `jpeg-encoder` or `mozjpeg` via ffi)
- `koder-cine keyframes --output frame-%03d.jpg` produces the same JPEGs ffmpeg would
**Exit criteria:** koder-talk `VideoAnalyzer.Analyze()` works end-to-end without ffmpeg installed on the host.
**Time:** ~6 months on top of Phase 3.
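As an illustration of the yuv420→rgb step, the sketch below converts a planar 4:2:0 image to packed RGB. It assumes full-range BT.601 coefficients (the JFIF convention used by JPEG) and nearest-neighbour chroma upsampling. Real video is often limited-range and may be BT.709, which needs different constants, so treat both choices as assumptions. Function names are hypothetical.

```rust
/// Convert one full-range BT.601 (JFIF) YCbCr sample to RGB.
pub fn ycbcr_to_rgb(y: u8, cb: u8, cr: u8) -> [u8; 3] {
    let y = f32::from(y);
    let cb = f32::from(cb) - 128.0;
    let cr = f32::from(cr) - 128.0;
    // Round to nearest and clamp into the 8-bit range.
    let clamp = |v: f32| v.round().max(0.0).min(255.0) as u8;
    [
        clamp(y + 1.402 * cr),
        clamp(y - 0.344136 * cb - 0.714136 * cr),
        clamp(y + 1.772 * cb),
    ]
}

/// Convert planar 4:2:0 (full-size Y, half-size Cb and Cr) to packed RGB.
/// Returns `None` if the plane sizes do not match the dimensions.
pub fn yuv420_to_rgb(
    y: &[u8],
    cb: &[u8],
    cr: &[u8],
    width: usize,
    height: usize,
) -> Option<Vec<u8>> {
    // Chroma planes are subsampled by two in both directions, rounded up.
    let cw = (width + 1) / 2;
    let ch = (height + 1) / 2;
    if y.len() != width * height || cb.len() != cw * ch || cr.len() != cw * ch {
        return None;
    }
    let mut rgb = Vec::with_capacity(width * height * 3);
    for row in 0..height {
        for col in 0..width {
            // Nearest-neighbour upsampling: each 2x2 luma block shares
            // one chroma sample.
            let ci = (row / 2) * cw + col / 2;
            let px = ycbcr_to_rgb(y[row * width + col], cb[ci], cr[ci]);
            rgb.extend_from_slice(&px);
        }
    }
    Some(rgb)
}
```

Matching ffmpeg's JPEG output byte for byte would additionally require the same chroma siting, rounding, and upsampling filter as ffmpeg's scaler, which this sketch does not attempt.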
Phase 5 — Encoder, HW accel, Go SDK
- `media-codec-h264-encoder` (wrap x264 for now, native later)
- `media-hwaccel-vaapi` (wrap libva)
- `media-hwaccel-nvenc`, `media-hwaccel-videotoolbox`
- `media-sdk-go` as cdylib + Go bindings, proper library use instead of CLI exec
**Exit criteria:** Koder products can transcode + encode without any ffmpeg binary anywhere in the stack.
**Time:** ~18-24 months on top of Phase 4.
**Total honest estimate:** ~3-5 years solo, ~2-3 years with a team of 2-3.
11. Licensing
**MIT** for everything we write. This is the permissive default that matches the rest of Koder.
Dependencies must be **MIT, Apache-2.0, BSD, ISC, MPL-2.0, or dual-licensed including one of the above**. **GPL and LGPL are forbidden** because they would force koder-cine (and transitively every Koder product that links it) to GPL. This excludes x264 (GPL) as a direct dep: we'll either wrap it via subprocess (GPL contamination avoided) or write our own encoder. dav1d (BSD) is fine. libopus (BSD) is fine. libx265 (GPL) is forbidden, so we'll write an H.265 encoder ourselves or wrap via subprocess.
This constraint shapes Phase 5 scope significantly and must be revisited when encoder work starts.
11.1 MPL-2.0 carve-out (amended 2026-04-12)
MPL-2.0 was added to the allow list above to unblock adoption of the Symphonia audio decoder family (symphonia-codec-aac and friends). The amendment is narrow and has two parts:
- **MPL-2.0 is allowed for dependencies only.** Code Koder writes inside `platform/cine/` stays MIT. We do not author new MPL-2.0 files in this workspace.
- **We do not modify MPL-2.0 dependency files in-tree.** If a symphonia bug needs fixing, we upstream the fix (the normal MPL-2.0 share-back obligation) rather than carrying a patched fork in `platform/cine/`. If upstream is unresponsive and we genuinely need a local patch, it goes in a fork repo with its patched files kept under MPL-2.0.
**Why this is safe:** MPL-2.0 is file-level copyleft. The license text (§3.2, §3.3) explicitly permits static linking of MPL files into a larger work under a different license. Mozilla designed MPL-2.0 around exactly this use case: Firefox extensions and embedders linking Mozilla libraries without contaminating the embedding code. Rust's cargo ecosystem has extensive precedent (rustls, webrender, parts of wasmtime all pull MPL-2.0 crates). Koder products linking `libkoder_cine.so` via cgo (Phase 5) are unaffected: the MPL boundary stops at the symphonia source files, not at the Rust crate boundary, and certainly not at the Go cgo boundary.
**What this does NOT permit:**
- Copying symphonia source into our tree and re-licensing it MIT
- Authoring new MPL-2.0 files as part of koder-cine
- Accepting LGPL / GPL deps; those stay forbidden regardless
- Using MPL-2.0 as cover for modifying dependency code in place
The licenses/ audit file (if we ever add one) should explicitly call out every MPL-2.0 direct dep and confirm it is pulled as a dependency only, with no local modifications.
12. Rejected alternatives
Monolith instead of workspace
Considered a single koder-cine crate with modules per codec. Rejected: Cargo features don't compose cleanly across modules, and the migration path to Koder Koda wants crate-sized units anyway. Workspace wins on both ergonomics and future-proofing.
async from day one
Considered making the API async. Rejected for §7 reasons — CPU-bound work doesn't benefit from async, it just adds color.
no_std support
Considered. Rejected for Phase 1 — std gives us thiserror, anyhow, Vec, file I/O. Embedded use (IoT cameras etc) is a Phase 5+ concern and can be handled via feature gate then.
Use ffmpeg-next crate (Rust bindings to libav)
Considered. Rejected because it's just rebranded C, doesn't solve the CVE surface problem, adds a binding-maintenance burden, and makes the Koder Koda migration strictly harder (we'd be porting C, not Rust).
13. Open questions
- How do we test decoder output against ffmpeg reference without taking a dependency on ffmpeg installed? Probably: ship the reference outputs as checked-in fixtures (regenerated rarely by a maintainer with ffmpeg locally). Decide in Phase 3.
- Should `media-core` re-export the error type of each crate or only the domain types? Lean: domain types only. Each crate's error is its own concern.
- At what size does `tests/fixtures/` outgrow git LFS threshold? Likely: Phase 2 when WebM fixtures with multiple tracks land. Revisit then.
- Does koder-cine eventually ship its own fuzzer corpus on the Koder Flow registry for the community? Stretch goal.