RFC-002: Phase 3, first real decoders (Opus, AAC-LC, H.264)
- **Status:** Draft (2026-04-12)
- **Author:** Claude Opus 4.6 + Rodrigo
- **Target release:** koder-cine v0.2.0 → v0.3.0
- **Supersedes:** RFC-001 §10 Phase 3 (expands it with concrete decisions)
- **Depends on:** RFC-001 (workspace architecture, error model, testing strategy)
1. Summary
Phase 1 and Phase 2 delivered **container parsing**: koder-cine can open an MP4 or WebM, walk its atoms, enumerate tracks, emit keyframe and audio-frame byte ranges, and remux Opus tracks into Ogg. That was enough to land two real wins in koder-talk (ticket 002 probe backend, ticket 009 Opus audio fast-path).
**Phase 3 crosses the decoder threshold.** This RFC scopes what that means concretely: which codecs, in which order, written from scratch vs wrapped from existing Rust crates, the API they share, the conformance bar, and what the end of Phase 3 does not deliver (spoiler: still no pixels reach a JPEG; that is Phase 4).
Phase 3 is the single largest commitment in the koder-cine roadmap. The RFC-001 estimate was "~12–18 months". This RFC reaffirms that and spends a section defending it.
2. Why decoders, why now
Two forces pull decoders onto the critical path:
- **Audio demux is half-done.** Ticket 009 covered Opus-in, but every MP4 from iOS and every legacy recording is AAC-LC, which still shells out to ffmpeg for decode + re-encode to Opus. A native AAC decoder lets koder-cine produce Opus output for any input via an internal AAC→PCM→Opus-encode path, closing the audio half of the Phase 2 exit criterion.
- **Keyframes are byte ranges, not images.** koder-cine already knows where the I-frames live; Phase 4 turns them into JPEGs. That requires a video decoder (H.264/H.265) to produce YUV frames, a scaler (Phase 4), and a JPEG encoder (Phase 4). The video decoder is the Phase 3 prerequisite.
Both gate the same RFC-001 Phase 2 exit criterion we only partially met. Phase 3 is the only way to raise that flag fully.
3. Scope
In scope
- **`media-codec-opus`**: Opus audio decoder
- **`media-codec-aac`**: AAC-LC decoder (LC profile only; HE-AAC v1/v2 out of scope)
- **`media-codec-h264`**: H.264 decoder (Baseline + Main + High profiles, 8-bit 4:2:0 only)
- **Decoder API in `media-core`**: shared trait covering all three, packet-in / frame-out
- **Conformance harness**: checksum-based frame-exact comparison against ffmpeg reference outputs
- **Fuzz targets** for each decoder crate
- **CLI additions**: `koder-cine decode <file>` subcommand emitting raw YUV or PCM to stdout, for debugging and conformance
Explicitly out of scope (moves to later phases)
- H.265/HEVC decoder: **Phase 5** (higher complexity, less traffic)
- VP9 / AV1: **Phase 5** (wrap dav1d-rs initially; write from scratch never)
- Any encoder: **Phase 5** (Opus encoder included; we keep using `opus-extract` for remux and ffmpeg for re-encode until then)
- Hardware acceleration (VA-API / NVENC / VideoToolbox): **Phase 5**
- JPEG encoding of decoded frames: **Phase 4**
- Scaling / colorspace conversion: **Phase 4**
- 10-bit or 4:2:2/4:4:4 H.264: out of scope permanently unless Koder products demand it
- AAC-HE (SBR/PS): out of scope permanently unless a real consumer needs it
4. Ordering and why
**Updated 2026-04-12 after symphonia audit (§14).** The audit flipped the original plan. The new ordering is **AAC → H.264 → Opus**, for these reasons:
- **AAC-LC (≈1–2 weeks, pending MPL-2.0 license approval).** Was expected to take 2 months. `symphonia-codec-aac` is a production crate (v0.5.5, 3.3M downloads, pure safe Rust), and the wiring to our Decoder trait is days, not months. **Gated on §13 license approval for MPL-2.0 dependencies** (see §9, §14). If approved, AAC becomes the fastest real win of Phase 3 and the first crate to land, which makes it the natural calibration point for the Decoder trait, conformance harness and fuzz infra.
- **H.264 (≈9–12 months).** Unchanged. Still the big one, still from scratch, still the gate for Phase 4. See §5.2. Milestones in §7.
- **Opus (≈2–4 months, reassessed upward).** Was expected to take 1 month via `symphonia-codec-opus` re-export. **The audit (§14) revealed that crate does not exist**: it is a planned entry, never shipped. The realistic pure-Rust Opus paths are: port libopus (BSD, ~30k LOC of C), or adopt a hobby crate (`opus-decoder`, `mousiki`) after serious audit. Neither is a 1-month job. Opus also delivers **zero new talk value** on its own because ticket 009 already serves Opus via remux. Moving it last: it can ship after H.264 with reduced urgency, and if Phase 5 priorities shift (hardware accel, encoders) it can be deferred without blocking anything.
Total: ~11–14 months if work runs back-to-back, single-threaded. The headline change is that the **first real integration win arrives in weeks, not a year**, as long as MPL-2.0 gets approved.
5. Per-crate decisions
5.1 media-codec-aac (FIRST to land — was 5.2)
**Approach:** re-export `symphonia-codec-aac` behind our Decoder trait.
The audit in §14 confirmed `symphonia-codec-aac` is production-grade: v0.5.5, 3.3M downloads, actively maintained, pure safe Rust, no C deps, AAC-LC complete. HE-AAC is unimplemented, which matches our scope exactly.
**Estimated effort:** 1–2 weeks. The work is:
- Add `symphonia-codec-aac` as a dep in `crates/media-codec-aac/`
- Thin adapter from symphonia's `Decoder` trait to our `media_core::Decoder`
- Wire into `koder-cine decode --audio` path
- Conformance harness: 3–5 MP4 fixtures with AAC-LC tracks, sha256 vectors committed
- Fuzz target wrapping the adapter
**Blocker:** symphonia is licensed **MPL-2.0**, which is not on the RFC-001 §11 allow list. This requires either:
- (a) An explicit amendment to RFC-001 §11 adding MPL-2.0 for dependencies only (not for code Koder writes), with the reasoning that MPL-2.0 is file-level copyleft and does not contaminate consumers, or
- (b) Rejection of MPL-2.0, in which case we fall back to "write AAC-LC from scratch" and the estimate goes back up to ~2 months.
See §13 and §14 for the open decision.
5.2 media-codec-h264 (SECOND to land — unchanged from 5.3)
**Approach:** this is the hard one and symphonia does **not** cover video codecs. The candidates in pure Rust:
- **`openh264-sys2`**: bindings to Cisco's openh264 (BSD), which is C. Rejected for the same FFI reason as the Opus wrapper.
- **`h264-reader`**: pure Rust, but it is a parser, not a decoder. Useful for NAL / SPS / PPS parsing (we already have NAL parsers from tickets 021/024), not for pixel output.
- **`rust-av/h264` (avcodec)**: abandoned since 2019, incomplete.
- **Write from scratch.** Last resort; 9–12 months.
**Decision:** write from scratch, targeting the Baseline + Main + High profile 8-bit 4:2:0 subset. Reuse `media-bitstream` for exp-Golomb + CABAC bit I/O. Ship incremental milestones (§7) so review happens every few weeks, not at the end. Leverage our own NAL parsers from `media-container-mp4` and `media-bitstream` rather than duplicating.
**Escape hatch:** if by month 3 of H.264 work the conformance harness shows we are less than 40% through the feature matrix, we pause H.264 and wrap openh264 via subprocess (not via FFI) so the Rust core stays clean. The subprocess is explicit debt, not a long-term home. This escape hatch exists so Phase 4 and Phase 5 are not held hostage to H.264 slip.
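A minimal sketch of the escape hatch, assuming a standalone openh264-based decoder binary that takes an input bitstream path and an output YUV path. The binary, its argument order, and the names `should_fall_back` and `build_decode_command` are illustrative assumptions, not existing koder-cine code. It shows the 40% trigger as an integer comparison and the subprocess boundary built with `std::process::Command`.

```rust
use std::path::Path;
use std::process::Command;

/// Escape-hatch trigger: true once month 3 is reached and fewer than 40%
/// of the feature-matrix entries pass the conformance harness.
pub fn should_fall_back(month: u32, features_passing: u32, features_total: u32) -> bool {
    // Integer form of `passing / total < 0.40`, widened to avoid overflow.
    let below_threshold = (features_passing as u64) * 100 < (features_total as u64) * 40;
    month >= 3 && below_threshold
}

/// Build (but do not run) the decoder subprocess invocation.
/// `decoder_bin` is a hypothetical CLI; the `<input> <output>` argument
/// order is an assumption of this sketch.
pub fn build_decode_command(decoder_bin: &Path, input: &Path, output_yuv: &Path) -> Command {
    let mut cmd = Command::new(decoder_bin);
    cmd.arg(input).arg(output_yuv);
    cmd
}
```

Keeping the boundary at a `Command` means the Rust core never links C code, so removing the debt later is a matter of swapping one call site for the native decoder.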
5.3 media-codec-opus (THIRD to land — reassessed)
**Approach:** this was supposed to be a cheap symphonia re-export. The audit (§14) killed that path: `symphonia-codec-opus` **does not exist on crates.io**. It is a planned entry in the Symphonia README, nothing shipped. Real options:
- **`opus-decoder`** (769 downloads, v0.1.1, "RFC 8251 conformant, no unsafe, no FFI"): promising description, essentially no production adoption. Requires a conformance audit before trusting.
- **`mousiki`** (466 downloads, v0.2.1, "Pure Rust Opus codec"): similar situation. Hobbyist-grade.
- **Port libopus** (BSD, ~30k LOC of C): 2–3 months by a focused author. Canonical source of truth.
- **Wait for Symphonia's planned implementation**: no timeline, unwise to block on.
**Decision:** defer the decision to when Opus's turn actually comes. At that point, audit `opus-decoder` and `mousiki` against our conformance fixtures. If either passes, re-export with a thin adapter. If both fail, port libopus.
**Why third, not first:** ticket 009 already serves the Opus case without decoding (remux-only), so Opus delivers no new talk value. It can ship after H.264 without blocking anything downstream.
5.4 The meta-rule that changed
RFC-001 §2 said "write our own". This RFC's original §5 softened that to "symphonia first, fork if needed". The audit forces a further revision:
**Do not reinvent codecs that already exist in safe pure Rust with production adoption. Do not depend on codecs that exist only in hobby crates or planned-but-unshipped crates. When neither applies, port from the C reference; do not wrap it.**
The practical consequence is that Phase 3 fragmentation is higher than expected: AAC is a re-export, H.264 is write-from-scratch, Opus is TBD. That's fine: the shared Decoder trait (§6) keeps the consumer API uniform regardless of how each crate is implemented internally.
6. Decoder API (media-core)
All three decoders implement the same trait in media-core:
```rust
pub trait Decoder {
    type Frame;

    /// Configure from codec-specific metadata (e.g. avcC for H.264,
    /// AudioSpecificConfig for AAC, OpusHead for Opus). Returns an
    /// error if the configuration is unsupported (e.g. 10-bit H.264).
    fn configure(&mut self, config: &[u8]) -> Result<(), Error>;

    /// Feed one packet (one NAL access unit, one AAC raw block, one
    /// Opus packet). Returns zero or more decoded frames. Decoders
    /// may buffer: a packet in does not imply a frame out.
    fn decode(&mut self, packet: &[u8], pts: i64) -> Result<Vec<Self::Frame>, Error>;

    /// Flush any buffered frames. Called at end-of-stream.
    fn flush(&mut self) -> Result<Vec<Self::Frame>, Error>;
}

pub struct VideoFrame {
    pub pts: i64,
    pub width: u32,
    pub height: u32,
    pub format: PixelFormat, // I420 only in Phase 3
    pub planes: [Plane; 3],  // Y, U, V
}

pub struct AudioFrame {
    pub pts: i64,
    pub sample_rate: u32,
    pub channels: u8,
    pub samples: Vec<f32>, // interleaved
}
```
Key choices:
- **Zero async.** Consistent with RFC-001 §7.
- **Owned frame data, not borrowed.** Decoders own internal reference buffers; callers get owned copies. This sacrifices some memory performance for API simplicity; we can revisit with a `decode_into(&mut Frame)` variant if profiling demands it.
- **No trait object by default.** Each decoder crate exports a concrete type; `dyn Decoder` is available but not the primary path. This keeps inlining aggressive in hot loops.
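To make the packet-in / frame-out contract concrete, here is a self-contained sketch that repeats the trait and implements it for a toy "codec" whose packets are raw little-endian s16 PCM. The `Error` variants, the 5-byte config layout, and `PcmS16Decoder` are all made up for illustration; a real adapter would wrap an actual codec and use the `media-core` error type.

```rust
/// Hypothetical stand-in for the media-core error type.
#[derive(Debug, PartialEq)]
pub enum Error {
    NotConfigured,
    InvalidConfig,
    TruncatedPacket,
}

pub trait Decoder {
    type Frame;
    fn configure(&mut self, config: &[u8]) -> Result<(), Error>;
    fn decode(&mut self, packet: &[u8], pts: i64) -> Result<Vec<Self::Frame>, Error>;
    fn flush(&mut self) -> Result<Vec<Self::Frame>, Error>;
}

#[derive(Debug, PartialEq)]
pub struct AudioFrame {
    pub pts: i64,
    pub sample_rate: u32,
    pub channels: u8,
    pub samples: Vec<f32>, // interleaved
}

/// Toy decoder: config is [channels, sample_rate as u32 LE] (invented layout).
#[derive(Default)]
pub struct PcmS16Decoder {
    format: Option<(u8, u32)>, // (channels, sample_rate)
}

impl Decoder for PcmS16Decoder {
    type Frame = AudioFrame;

    fn configure(&mut self, config: &[u8]) -> Result<(), Error> {
        if config.len() != 5 || config[0] == 0 {
            return Err(Error::InvalidConfig);
        }
        let rate = u32::from_le_bytes([config[1], config[2], config[3], config[4]]);
        self.format = Some((config[0], rate));
        Ok(())
    }

    fn decode(&mut self, packet: &[u8], pts: i64) -> Result<Vec<AudioFrame>, Error> {
        let (channels, sample_rate) = self.format.ok_or(Error::NotConfigured)?;
        // 2 bytes per sample; a packet must hold whole interleaved sample groups.
        if packet.len() % (2 * channels as usize) != 0 {
            return Err(Error::TruncatedPacket);
        }
        let samples: Vec<f32> = packet
            .chunks_exact(2)
            .map(|b| i16::from_le_bytes([b[0], b[1]]) as f32 / 32768.0)
            .collect();
        Ok(vec![AudioFrame { pts, sample_rate, channels, samples }])
    }

    fn flush(&mut self) -> Result<Vec<AudioFrame>, Error> {
        Ok(Vec::new()) // this toy decoder never buffers
    }
}
```

A caller configures once, feeds packets in order, collects the returned frames, and calls `flush` at end-of-stream. A buffering decoder such as H.264 with B-frames would return an empty `Vec` from some `decode` calls and drain the remainder in `flush`.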
7. Milestones and review cadence
H.264 is the only crate large enough to warrant internal milestones. Proposed breakdown:
| M | Deliverable | Verification |
|---|---|---|
| M1 | SPS/PPS parse, NAL extraction wired to Decoder trait | Unit tests match h264-reader output |
| M2 | Intra | Decode a 1-frame MP4 to YUV, checksum vs ffmpeg |
| M3 | CAVLC inter | Decode a 30-frame MP4, checksum vs ffmpeg |
| M4 | CABAC | Same test with CABAC-encoded fixture |
| M5 | Deblocking filter | Same checksums now byte-exact with ffmpeg |
| M6 | B-slice + reference frame management | 10 |
| M7 | Conformance suite (JM reference vectors) passing | > 90% of applicable vectors green |
Each milestone has its own backlog ticket and its own review. We do not wait 12 months for a big-bang merge.
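M1 rests on Exp-Golomb parsing, since most SPS/PPS fields are coded as ue(v) or se(v). The sketch below shows the decoding rule on a plain MSB-first bit reader. `BitReader` here is a hypothetical stand-in, not the actual `media-bitstream` API, and it assumes emulation-prevention bytes have already been stripped from the NAL payload.

```rust
/// MSB-first bit reader over a byte slice (illustrative only).
pub struct BitReader<'a> {
    data: &'a [u8],
    bit_pos: usize,
}

impl<'a> BitReader<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Self { data, bit_pos: 0 }
    }

    /// Read one bit, or None at end of data.
    pub fn read_bit(&mut self) -> Option<u32> {
        let byte = *self.data.get(self.bit_pos / 8)?;
        let bit = (byte >> (7 - (self.bit_pos % 8))) & 1;
        self.bit_pos += 1;
        Some(bit as u32)
    }

    /// Unsigned Exp-Golomb, ue(v): count leading zero bits, consume the
    /// 1 marker, then read that many suffix bits.
    pub fn read_ue(&mut self) -> Option<u32> {
        let mut leading_zeros = 0u32;
        while self.read_bit()? == 0 {
            leading_zeros += 1;
            if leading_zeros > 31 {
                return None; // value would not fit in u32
            }
        }
        let mut suffix = 0u32;
        for _ in 0..leading_zeros {
            suffix = (suffix << 1) | self.read_bit()?;
        }
        Some((1u32 << leading_zeros) - 1 + suffix)
    }

    /// Signed Exp-Golomb, se(v): ue values 1, 2, 3, 4 map to +1, -1, +2, -2.
    pub fn read_se(&mut self) -> Option<i32> {
        let k = self.read_ue()?;
        let magnitude = ((k + 1) / 2) as i32;
        Some(if k % 2 == 1 { magnitude } else { -magnitude })
    }
}
```

Returning `Option` keeps truncated input a recoverable condition, which matters once fuzz targets start feeding the parser arbitrary bytes.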
8. Conformance strategy
The question left open by RFC-001 §13.1 is: how do we test decoder output against ffmpeg reference without taking a runtime dep on ffmpeg?
**Decision:** checked-in reference fixtures, regenerated rarely. For each input fixture in `tests/fixtures/`, a maintainer with ffmpeg installed runs a documented command like:
```sh
ffmpeg -i input.mp4 -f rawvideo -pix_fmt yuv420p reference.yuv
sha256sum reference.yuv > reference.yuv.sha256
```
The `.sha256` file is committed; the `.yuv` is not (too large, easy to regenerate). Tests decode via koder-cine, compute the sha256 of the output, and compare against the committed reference. When ffmpeg changes output (rare in mature decoders), a maintainer regenerates.
For **bit-exactness disputes**, use the JM reference decoder conformance vectors from the H.264 / H.265 JVT. Those are the ground truth for decoder correctness; ffmpeg is just convenient.
Fuzz targets (cargo-fuzz) run continuously in CI as they do today for the parsers (see tickets 004, 012, 022, 023). Every decoder crate ships with at least one fuzz target at its first milestone.
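The comparison side of the harness can be sketched as follows, assuming the committed reference is a line in `sha256sum` output format and that the digest of the decoder output is computed elsewhere (for example by a SHA-256 crate, not shown). The function names are hypothetical and do not describe the real harness API.

```rust
/// Serialize interleaved f32 PCM as little-endian bytes, the byte form
/// that would be fed to the hasher.
pub fn pcm_f32_to_le_bytes(samples: &[f32]) -> Vec<u8> {
    samples.iter().flat_map(|s| s.to_le_bytes()).collect()
}

/// Extract the digest from one line of `sha256sum` output
/// ("<64 hex chars>  <filename>"). Returns None if the line is malformed.
pub fn parse_sha256_line(line: &str) -> Option<String> {
    let digest = line.split_whitespace().next()?;
    let is_hex = digest.len() == 64 && digest.bytes().all(|b| b.is_ascii_hexdigit());
    is_hex.then(|| digest.to_ascii_lowercase())
}

/// Compare a freshly computed hex digest with the committed reference line,
/// ignoring hex case.
pub fn matches_reference(computed_hex: &str, reference_line: &str) -> bool {
    parse_sha256_line(reference_line)
        .map_or(false, |expected| expected == computed_hex.to_ascii_lowercase())
}
```

Fixing the sample serialization (f32, little-endian, interleaved) is what makes a digest comparable across hosts; hashing an in-memory `Vec<f32>` directly would depend on platform endianness.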
9. License constraints recap
From RFC-001 §11: MIT / Apache-2.0 / BSD / ISC only. The Phase 3 audit (§14) surfaced **one real license issue**: Symphonia is **MPL-2.0**, not MIT as I mistakenly assumed in the RFC's first draft.
9.1 MPL-2.0 analysis
- **MPL-2.0 is file-level copyleft.** Modifications to MPL-licensed source files must be released under MPL. Consumers of MPL code do not need to release their own code under MPL; the copyleft stops at the file boundary.
- **Practical consequence for koder-cine:** taking `symphonia-codec-aac` as a Cargo dependency is fine. koder-cine stays MIT. Koder products linking koder-cine stay unaffected. We cannot copy symphonia's source into our tree and re-license it, and we cannot modify it in-place without sharing back, but neither is needed for a dep.
- **Static linking (Rust/Cargo default) is explicitly allowed** by MPL-2.0 §3.2 and §3.3. This is precisely the use case MPL-2.0 was designed around (Mozilla's relationship with proprietary Firefox plugins).
**Recommendation:** amend RFC-001 §11 to allow MPL-2.0 **for dependencies only**, keeping the Koder-authored code under MIT. This is a safe, narrow amendment that unlocks symphonia without broader policy drift.
**Risk if rejected:** AAC-LC falls back to "write from scratch" (~2 months). Phase 3 timeline grows, but nothing else breaks.
9.2 Other Phase 3 license confirmations
- **libopus** (BSD): porting C→Rust is fine
- **opus-decoder, mousiki** (need audit; probably MIT/Apache): fine if they are, reject if they aren't
- **JM H.264 reference decoder**: BSD, spec freely published
- **Forbidden, unchanged**: x264 (GPL), x265 (GPL). Both are encoders, not in Phase 3 scope anyway
10. Risks and escape hatches
| Risk | Likelihood | Impact | Escape |
|---|---|---|---|
| H.264 decoder slips past 12 months | **High** | Blocks Phase 4 indefinitely | §5.2 subprocess escape: wrap openh264 via `std::process::Command`, document as debt |
| symphonia codecs fail conformance on real fixtures | Medium | +2 months to write custom decoders | §5.1 fallback to scratch |
| CABAC implementation is subtly wrong and only shows up on specific streams | Medium | Silent corruption in production | JM reference vectors catch most; fuzz + real-world corpus catches the rest |
| Phase 3 effort starves Phase 4 / other Koder work | Medium | 12+ months of cine-only focus | Only land Phase 3 if there is genuine capacity; otherwise keep ffmpeg |
| 10-bit H.264 content appears in real talk traffic | Low | Decoder rejects, falls back to ffmpeg | Acceptable — ffmpeg stays on PATH as fallback through Phase 5 |
The last row matters: **ffmpeg is not removed from the VideoAnalyzer config until Phase 5.** Phase 3 narrows its usage; it does not eliminate it.
11. Open questions
- **Single crate for all three codecs or one crate per codec?** RFC-001 §4 says one crate per codec. Confirming: yes, `media-codec-opus`, `media-codec-aac`, `media-codec-h264` as independent crates, each behind its own Cargo feature. A consumer that only needs Opus (like talk's audio path) pays only Opus weight.
- **PixelFormat variants for Phase 3:** I420 only (YUV 4:2:0, 8-bit). Anything else returns `UnsupportedPixelFormat`. Revisit in Phase 5 when HW-decoded frames bring NV12 into the picture.
- **Frame allocator:** should frames use a pool (`bumpalo`, `typed-arena`) to avoid malloc on every frame? Lean: no in Phase 3, measure first. Pool adds complexity that only pays off if profiling shows allocation as a bottleneck.
- **Conformance harness location:** new `crates/media-conformance` crate or per-codec `tests/conformance/` directories? Lean: per-codec, so the harness lives next to the code it validates. A top-level crate is overkill for three codecs.
- **Whether to adopt symphonia's `Decoder` trait directly** instead of writing our own in `media-core`. Lean: write our own. symphonia's trait has assumptions (e.g., around its packet type) that do not match our NAL-oriented container world cleanly. A thin adapter at the symphonia boundary is cheap.
- **Benchmark baseline:** when do we start tracking decode throughput (FPS / Mbps) vs ffmpeg? Proposal: after AAC-LC lands, before H.264 starts. Gives us a calibration point without blocking early work.
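For the PixelFormat question above, a small sketch of what "I420 only" implies for frame layout: the Y plane is full resolution and the U and V planes are halved in both directions, rounding up for odd sizes. The `Nv12` variant and the exact shape of the error type are assumptions added only to show the rejection path; tightly packed rows (no stride padding) are also assumed.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum PixelFormat {
    I420,
    Nv12, // present only to illustrate rejection; not supported in Phase 3
}

#[derive(Debug, PartialEq)]
pub enum Error {
    UnsupportedPixelFormat(PixelFormat),
}

/// (width, height) in samples of the Y, U and V planes of an 8-bit frame.
pub fn plane_dimensions(
    format: PixelFormat,
    width: u32,
    height: u32,
) -> Result<[(u32, u32); 3], Error> {
    match format {
        PixelFormat::I420 => {
            // 4:2:0 halves chroma horizontally and vertically, rounding up.
            let cw = width / 2 + width % 2;
            let ch = height / 2 + height % 2;
            Ok([(width, height), (cw, ch), (cw, ch)])
        }
        other => Err(Error::UnsupportedPixelFormat(other)),
    }
}

/// Total bytes of one tightly packed 8-bit frame.
pub fn frame_bytes(format: PixelFormat, width: u32, height: u32) -> Result<u64, Error> {
    let planes = plane_dimensions(format, width, height)?;
    Ok(planes.iter().map(|&(w, h)| w as u64 * h as u64).sum())
}
```

At 640x480 this comes to 460,800 bytes per frame, which gives a rough sense of the allocation cost the frame-allocator question is weighing.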
12. What this RFC does not decide
- **No cdylib surface.** The Go side of the integration (how `libkoder_cine.so` exposes decoders to talkd without exec) is Phase 5. Phase 3 shipping means Rust crates + CLI only; Go keeps calling `koder-cine` as a subprocess.
- **Encoder strategy.** Opus encoder, AAC encoder, H.264 encoder all deferred to Phase 5. Implications for talk ticket 009: AAC-in inputs still need an encoder to produce Opus output. Until Phase 5, that encode stays in ffmpeg, and koder-cine's win is only the decode side of the pipeline.
- **Koder Koda port timing.** RFC-001 §9 still governs. Phase 3 adds three large crates to the "port later" pile; nothing changes about the ordering or the gating capabilities in lang/520-548.
13. Approval checklist
Before this RFC moves from Draft → Accepted:
- [x] ~~Symphonia audit: read `symphonia-codec-opus` and `symphonia-codec-aac` source; confirm re-export path is viable~~ **Done 2026-04-12, results in §14. Flipped the ordering (AAC first, Opus last) and surfaced the MPL-2.0 license question.**
- [x] ~~**MPL-2.0 dependency policy decision.** Rodrigo to decide whether to amend RFC-001 §11 to allow MPL-2.0 for deps only (§9.1). Gating AAC re-export vs from-scratch implementation.~~ **Approved 2026-04-12. RFC-001 §11 amended with narrow MPL-2.0 deps-only carve-out (see RFC-001 §11.1). AAC re-export path unblocked.**
- [ ] Rodrigo signs off on the 11–14 month estimate and accepts the opportunity cost
- [ ] H.264 feasibility spike: 1 week of exploratory work on M1 (SPS/PPS parse + Decoder trait wiring) to calibrate the month-by-month estimate before committing
- [x] ~~Conformance harness POC: one fixture, one sha256, one test, wired in CI (proves the approach before building more decoders on it)~~ **Done 2026-04-12, upgraded from POC to full sha256 in ticket 103.** New crate `crates/media-conformance` provides `PcmHasher`, `sha256_pcm_f32_le`, and a `Reference` parser. Fixtures: `tests/fixtures/sample-aac-lc-3s.aac.sha256` and `sample-640x480-3s-avc-aac.mp4.sha256` (both `baa63d0bc0ee637fcebd1ff03ddc36bf3a252fbe50f97e8c2d3aa4ff47eb6c8f`). Tests: `crates/media-codec-aac/tests/conformance_sha256.rs` enforces MP4 ref match, ADTS ref match, and MP4↔ADTS bit-exact convergence in every `cargo test`. CLI anchor: `koder-cine decode-audio <file>` emits `sha256_pcm_f32_le`. Cross-host determinism validated between laptop `note` and `s.k.lin` in cine-v0.1.2.
- [ ] `opus-decoder` and `mousiki` audit: only needed when Opus's turn comes (after H.264). Not on the Phase 3 kickoff path.
- [x] ~~Ticket breakdown in `platform/cine/backlog/pending/` with a stable numbering scheme that keeps H.264 milestones distinct (e.g., 100-aac, 200-h264-m1 … 206-h264-m7, 300-opus)~~ **Done 2026-04-12.** Phase 3 AAC tickets allocated and shipped as 100, 101, 102, 103, 104. H.264 reserved as the 200-block (200-h264-m1 spike pending). Opus reserved as 300-block.
14. Audit results (2026-04-12)
Concrete findings from the symphonia audit that reshaped §4, §5, §9:
symphonia-codec-aac: production-ready
- **Exists on crates.io:** v0.5.5, last updated 2026-03-24
- **Downloads:** 3,355,159 (serious production adoption)
- **License:** MPL-2.0
- **Implementation:** pure safe Rust, no C/C++ deps
- **Coverage:** AAC-LC "Great" (per upstream README); HE-AAC v1/v2 not implemented
- **Verdict:** **viable as re-export**, gated on MPL-2.0 policy approval. Saves ~2 months of Phase 3.
symphonia-codec-opus: does not exist
- **NOT published on crates.io.** `curl https://crates.io/api/v1/crates/symphonia-codec-opus` returns "crate symphonia-codec-opus does not exist".
- It is a planned entry in the Symphonia README codec table,
marked "—" (not started). There is no shipped code.
- **Verdict:** dead path. The RFC's original plan to "re-export `symphonia-codec-opus` in week 1" is impossible.
Pure-Rust Opus alternatives on crates.io
Searched crates.io for "opus decoder", found:
| Crate | Version | Downloads | Verdict |
|---|---|---|---|
| `opus-decoder` | 0.1.1 | 769 | Hobbyist; needs serious audit before adoption |
| `mousiki` | 0.2.1 | 466 | Hobbyist; same caveat |
| `opus-oxide` | 0.0.1 | - | Pre-alpha; skip |
| `opus` / `audiopus` | various | - | FFI wrappers around libopus (C); rejected per RFC-001 §2 |
None has production adoption. When Opus's turn comes, the choice is between a deep audit of these (risky) or porting libopus (2–3 months). Deferred per §5.3.
License landscape
- **Symphonia is MPL-2.0**, not MIT as the RFC's first draft incorrectly assumed. This is the critical finding; see §9.1 for the impact analysis and recommended amendment to RFC-001 §11.
What the audit did NOT change
- **H.264 plan unchanged.** No usable pure-Rust H.264 decoder was found (§5.2). Write from scratch remains the plan, with the subprocess escape hatch.
- **Decoder trait API unchanged.** §6 design holds regardless of which crates implement it.
- **Conformance strategy unchanged.** §8 (checked-in sha256 fixtures + JM reference vectors) still applies.
Audit budget
Spent: ~30 minutes (crates.io API queries, Symphonia README reading, license text verification). **Proved the value of doing the audit before writing code**: the Opus-first plan would have cost a week of wiring against a non-existent crate.