RFC 017 — Koder Koda Compiler Decomposition
**Status:** DRAFT. Proposes a 7-phase divide-and-conquer refactor of the Koder Koda compiler from a near-monolithic single-pass design into a set of small, encapsulated, independently testable modules. Filed in response to growing coupling that blocks safe evolution of the language (new optimization passes, additional backends, structural search integration, etc.).
1. Summary
The Koder Koda compiler today is a near-monolithic single-pass design. The bulk of the implementation lives in two places:
- `self-hosted/lib/compiler_backend.kd`: **17,721 lines**. Contains AST→bytecode compilation (`compile_node` at line 872, ~2,500 lines), the compile-time constant evaluator (~500 lines), comptime metaprogramming, pattern-match codegen, the NASM x86_64 backend (`generate_asm` + `emit_runtime{,_part1b,_part2}` at ~7,800 lines of raw NASM strings), and the LLVM IR backend (`generate_llvm_ir`, ~600 lines).
- `self-hosted/lib/bootstrap_frontend{,_pure}.kd` and `self-hosted/lib/bootstrap_parser.kd`: **more than a thousand duplicated lines** across three bootstrap stages, all implementing the same lexer + parser + AST contract.
The compiler relies on **304 module-level `$global` variables** ($bytecode, $label_count, $current_class_name, $llir_out, $variables, $generic_pending, $ivar_index, …) shared across all phases of compilation. There is no explicit IR: "bytecode" is an untyped list mutated in place; backends read it and emit assembly inline within the same module.
This RFC proposes a **7-phase divide-and-conquer decomposition** in which each phase ships independently with green tests, gradually transforming the compiler into:
Source → Lexer → Parser → AST → Sema → HIR → MIR → LIR → Backend
↓ ↑ ↑ ↑ ↓
Tokens Comptime Optimizer Per-target
(typed) (eval) (passes) code

…with each phase a separate module, two explicit context objects (no globals), per-phase unit tests, and snapshot tests over IR boundaries.
2. Motivation
2.1 Symptoms
| Pain | Evidence |
|---|---|
| **God-object** | `compiler_backend.kd` = 17,721 lines mixing constant evaluator, comptime, ASM gen, runtime emission, LLVM IR. |
| **Sea of globals** | 304 `$global` variables. Cannot instantiate two compilations in parallel; cannot construct test fixtures. |
| **Single-pass entanglement** | `compile_node` (line 872) does semantic analysis + bytecode + ASM in one pass. Refactoring AST handling risks NASM regressions. |
| **No explicit IR** | "Bytecode" is a list of stringly-typed ops, untyped, mutated in place. Backends bind tightly to the producer. |
| **Frontend duplication** | `bootstrap_frontend.kd` (722) + `bootstrap_frontend_pure.kd` (2,382) + `bootstrap_parser.kd` (969) repeat 25+ parser functions. The bug fix in #308 has to land in N places. |
| **Runtime hardcoded as code** | `emit_runtime{,_part1b,_part2}` = ~7,800 lines of raw NASM literal strings inside compiler logic. Updating the runtime requires editing the compiler. |
| **Entangled backends** | NASM x86 + LLVM IR + ARM64 + ARM32 + WASM + C transpile + JS transpile coexist in one module sharing globals. Per-arch evolution propagates risk. |
| **Bootstrap fragility** | The recent #702 work (Gemini, 2026 |
2.2 Why now
Three convergent pressures:
- **Self-hosted-first promotion.** `policies/self-hosted-first.kmd` requires Koder Koda to close the G1 (feature parity), G2 (perf), and G4 (capability) gates per case-of-use. A decomposed compiler gives velocity to close gates; with a monolithic compiler, each new gate is a risk to the others.
- **Backlog tickets in flight.** #307 (LLVM stabilization), #701 (SIMD stdlib), #703 (PGO pipeline), #308 (parser robustness fix), and #261 (native UI cross-platform) are all easier with Phases 1–3 of this RFC done.
- **The Krep engine** (search-RFC-001) is the next self-hosted-first pair after kodec. Its F1 perf bar of geomean ≤ 1.10× vs `rg` (with a subsequent ≤ 1.05× for `official` status) demands SIMD codegen + LLVM optimization paths, both hampered by the current compiler shape.
3. Non-goals
- **Not a from-scratch rewrite.** Each phase is incremental, testable, revertable. The compiler stays continuously buildable.
- **Not a language change.** No new Koder Koda features are added by this RFC; existing semantics are preserved bit-for-bit.
- **Not removing single-pass for the bootstrap binary.** The bootstrap path (`bootstrap_compiler.kd`) stays minimal and monolithic; it only needs to bring up the full self-hosted compiler.
4. Decomposition — the seven phases
Each phase ends with a release tag and green test gate. Order matters: Phase 1 unblocks all subsequent phases.
Phase 1 — Context extraction (pure refactor, zero semantic change)
Convert the 304 `$global` variables into **two explicit context objects** threaded through every compilation function.
class CompilationContext # Per-source, per-compile-run
- const_table, const_fns, const_defined
- string_literals, cstring_literals (+ index counters)
- extern_fns, extern_fn_seen
- generic_fns, generic_instances, generic_pending
- user_class_defs, export_c_fns, export_c_map
- target_triple, source_path_for_debug
- debug_mode
- capability_flags { uses_libdl, uses_win32, uses_wndproc, uses_phase_a_wndproc }
- output_buffers { asm_output, llir_out, bytecode }
- source_metadata (filename, span map)
class CodegenContext # Per-function or per-block
- label_count, sc_label_id
- current_class_name, parent_class_name, current_method_name
- loop_start_label, loop_end_label
- ivar_index, ivar_names, attr_reader_names
- variables, variable_seen, var_alignments
- func_var_cache
- llir_state { cur_fn, cur_fn_dbg_loc_id, vc, in_func, fn_local_vars, fn_params_map }

Every function changes signature: f(...) → f(ctx, code_ctx, ...). The transformation is mechanical and can be done with sed plus manual review. Tests stay green throughout.
**Gain unlocked:** unit tests become possible. Today, every test is forced to be an integration test (compile-and-run) because state leaks across functions. After Phase 1, a test constructs a fresh CompilationContext, calls one function under test, and asserts on the context state.
**Estimated effort:** 1–2 weeks of focused work. Mechanical, low risk.
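The Phase 1 signature change and the unit-test shape it enables can be sketched as follows. This is an illustration in Python rather than Koder Koda. The two context classes carry only a small subset of the fields listed above, and the helpers `new_label`, `intern_string`, and `emit_print` are hypothetical names invented for this sketch, not functions from the compiler.

```python
from dataclasses import dataclass, field


@dataclass
class CompilationContext:
    """Per-source, per-compile-run state (small subset of the proposed fields)."""
    string_literals: list = field(default_factory=list)
    bytecode: list = field(default_factory=list)
    debug_mode: bool = False


@dataclass
class CodegenContext:
    """Per-function or per-block state (small subset of the proposed fields)."""
    label_count: int = 0
    current_class_name: str = ""


def new_label(code_ctx, prefix="L"):
    """Hypothetical helper: allocate a fresh label from the codegen context."""
    code_ctx.label_count += 1
    return f"{prefix}{code_ctx.label_count}"


def intern_string(ctx, text):
    """Hypothetical helper: return the index of `text` in the literal table."""
    if text not in ctx.string_literals:
        ctx.string_literals.append(text)
    return ctx.string_literals.index(text)


def emit_print(ctx, code_ctx, text):
    """Hypothetical f(ctx, code_ctx, ...) function: touches no module-level state."""
    idx = intern_string(ctx, text)
    done = new_label(code_ctx)
    ctx.bytecode.append(("PUSH_STR", idx))
    ctx.bytecode.append(("CALL", "print"))
    ctx.bytecode.append(("LABEL", done))


# Unit-test shape: fresh contexts, one call, assert on the context state.
ctx, code_ctx = CompilationContext(), CodegenContext()
emit_print(ctx, code_ctx, "hi")
assert ctx.bytecode == [("PUSH_STR", 0), ("CALL", "print"), ("LABEL", "L1")]
```

Because all state lives in the two objects, a second pair of contexts can be created alongside the first without either run observing the other's labels or literals.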
Phase 2 — Explicit KIR type (typed IR)
Today's "bytecode" is $bytecode = [] — a list of stringly-typed ops. Promote it to a typed IR with explicit node variants:
class KIR.Op
- kind (Push, Pop, Call, Jump, Phi, Class.Field, Pattern.Check, …)
- operands (typed per kind)
- source_span (file, line, col — for errors and debug info)
class KIR.Function
- name, params, locals
- blocks: KIR.Block[]
- exports: bool, generic: bool, …

Conversion: compile_node(n) returns KIR.Op[] instead of mutating $bytecode. Backends consume KIR; they do not call compile_node again.
**Gain unlocked:** snapshot tests over KIR. For a frozen Koder program, the KIR is deterministic; any regression in compilation shows up as a diff in a snapshot file. This is the first line of defense against regressions in any subsequent phase.
**Estimated effort:** 2–3 weeks. A backwards-compat shim (KIR-as-list) covers the transition.
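A minimal sketch of what a typed op and a snapshot rendering could look like, written in Python for illustration. The `Kind` members mirror a few of the kinds named above. `Span`, `compile_int_literal`, and `render_snapshot` are assumptions made for this sketch and do not describe the actual KIR design.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    PUSH = "push"
    POP = "pop"
    CALL = "call"
    JUMP = "jump"


@dataclass(frozen=True)
class Span:
    file: str
    line: int
    col: int


@dataclass(frozen=True)
class Op:
    kind: Kind
    operands: tuple
    span: Span


def compile_int_literal(value, span):
    """Hypothetical compile step: returns a list of ops, mutates nothing global."""
    return [Op(Kind.PUSH, (value,), span)]


def render_snapshot(ops):
    """Deterministic text form, suitable for diffing against a stored snapshot."""
    lines = []
    for op in ops:
        args = ", ".join(repr(a) for a in op.operands)
        where = f"{op.span.file}:{op.span.line}:{op.span.col}"
        lines.append(f"{op.kind.value} {args}  ; {where}")
    return "\n".join(lines)


ops = compile_int_literal(42, Span("demo.kd", 1, 5))
ops.append(Op(Kind.CALL, ("print", 1), Span("demo.kd", 1, 1)))
print(render_snapshot(ops))
```

A snapshot test would store the rendered text next to the source program and fail on any difference, so an unintended change in lowering surfaces as a readable diff.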
Phase 3 — Backend isolation
Today, compiler_backend.kd mixes NASM x86_64 + LLVM IR + ARM codegens + WASM + C/JS transpilers. Move each to a separate module:
self-hosted/lib/backends/
├── nasm_x86_64.kd # reads KIR, emits NASM (SysV ABI Linux)
├── nasm_win64.kd # reads KIR, emits NASM (MS x64 ABI Windows)
├── llvm_ir.kd # reads KIR, emits LLVM IR
├── arm64.kd # reads KIR, uses arm64_codegen + arm64_encoder
├── arm32.kd # ditto
├── wasm.kd # reads KIR, emits WASM via wasm_codegen + wasm_encoder
├── c_transpile.kd # reads KIR, emits C source
└── js_transpile.kd # reads KIR, emits JavaScript source

Each backend exposes a single function: Backend.emit(kir: KIR.Module, ctx: CompilationContext) → ArtifactBytes. Backends do not share globals; they do not call compile_node. The existing per-arch encoder modules (arm64_encoder.kd, x86_encoder.kd, …) become dependencies of their respective backends only.
**Gain unlocked:** adding a new architecture is a new file in backends/, not a sweeping change to compiler_backend.kd. Each backend is unit-testable: feed KIR, assert on output bytes.
**Estimated effort:** 4–6 weeks. Bulk move + per-backend test suite.
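The single-entry-point contract can be illustrated with a small registry, sketched here in Python. The `KirModule` stand-in, the `register_backend` decorator, and the two toy backends are hypothetical. They only show the shape "KIR in, bytes out, no shared state" and do not emit real NASM or C.

```python
from typing import Callable, Dict, List, Tuple

# Stand-in for KIR.Module: a flat list of (kind, operand) pairs.
KirModule = List[Tuple[str, int]]

# Hypothetical registry: backend name -> emit function.
BACKENDS: Dict[str, Callable[[KirModule, dict], bytes]] = {}


def register_backend(name):
    """Register an emit function under `name`."""
    def wrap(fn):
        BACKENDS[name] = fn
        return fn
    return wrap


@register_backend("toy_asm")
def emit_toy_asm(kir, ctx):
    # Reads only its arguments: the KIR and the compilation context.
    lines = [f"; target: {ctx['target_triple']}"]
    for kind, operand in kir:
        lines.append(f"    {kind} {operand}")
    return ("\n".join(lines) + "\n").encode("utf-8")


@register_backend("toy_c")
def emit_toy_c(kir, ctx):
    body = "".join(f"  op_{kind}({operand});\n" for kind, operand in kir)
    return f"void run(void) {{\n{body}}}\n".encode("utf-8")


def emit(backend, kir, ctx):
    """Single entry point: dispatch to the named backend."""
    return BACKENDS[backend](kir, ctx)


kir = [("push", 1), ("pop", 0)]
ctx = {"target_triple": "x86_64-linux"}
print(emit("toy_asm", kir, ctx).decode("utf-8"))
```

Under this shape a backend test needs no compiler front half: it builds a small KIR value by hand and compares the returned bytes.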
Phase 4 — Comptime evaluator extraction
The ~600 lines of eval_const_* and eval_comptime_* form a small compile-time interpreter for the language. Extract into self-hosted/lib/comptime_evaluator.kd with a minimal interface:
ComptimeEvaluator.eval_node(node, scope, ctx) → CompileTimeValue | Unevaluable
ComptimeEvaluator.eval_fn_call(fn_def, args, ctx) → CompileTimeValue | Unevaluable

The evaluator is independent of codegen and unit-testable. It is also pluggable: future advanced features (hygienic macros #668, partial evaluation, dependent types in compile-time context) attach here without touching the rest of the compiler.
**Estimated effort:** 1 week.
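The "value or Unevaluable" contract can be sketched as a small tree-walking evaluator in Python. The tuple-shaped AST, the supported operators, and the `Unevaluable` class layout are assumptions for illustration, and the `ctx` parameter from the interface above is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unevaluable:
    """Marker result: the expression cannot be reduced at compile time."""
    reason: str


def eval_node(node, scope):
    """Evaluate a toy AST node against a scope of known constants.

    Node shapes (hypothetical): ("int", n), ("var", name),
    ("binop", op, lhs, rhs).
    """
    tag = node[0]
    if tag == "int":
        return node[1]
    if tag == "var":
        if node[1] in scope:
            return scope[node[1]]
        return Unevaluable(f"unknown constant {node[1]}")
    if tag == "binop":
        _, op, lhs, rhs = node
        left = eval_node(lhs, scope)
        if isinstance(left, Unevaluable):
            return left  # propagate the first failure unchanged
        right = eval_node(rhs, scope)
        if isinstance(right, Unevaluable):
            return right
        if op == "+":
            return left + right
        if op == "*":
            return left * right
        return Unevaluable(f"unsupported operator {op}")
    return Unevaluable(f"unsupported node {tag}")


expr = ("binop", "+", ("int", 2), ("binop", "*", ("var", "N"), ("int", 3)))
print(eval_node(expr, {"N": 4}))   # 2 + 4 * 3
print(eval_node(expr, {}))         # N is not a known constant
```

Returning `Unevaluable` as a value, rather than raising, lets the caller fall back to runtime codegen for that expression without exception handling.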
Phase 5 — Runtime as templated data, not code
The ~7,800 lines of emit_runtime{,_part1b,_part2} are essentially NASM templates with conditional sections gated on capability flags (uses_libdl, uses_win32, uses_wndproc). They are data, not code.
Move runtime to versioned template files:
self-hosted/runtime/
├── x86_64/koder_runtime.asm.tpl # ELF Linux + Win64 (with conditionals)
├── aarch64/koder_runtime.asm.tpl
├── arm32/koder_runtime.asm.tpl
├── wasm/koder_runtime.wat.tpl
└── ...

with a small templating layer that recognizes:
${if uses_libdl}
extern dlopen
extern dlsym
...
${endif}

The compiler reads the template, substitutes flags from CompilationContext, and emits the final NASM. Runtime evolution becomes a diff on a .asm.tpl file: readable, reviewable, version-controllable.
**Gain unlocked:** runtime patches no longer require editing compiler logic. Bug fixes to _rt_alloc or _rt_str_slice are single-file changes with isolated test impact. New runtime helpers (like the _rt_bytes_* family added in #702) are a single edit.
**Estimated effort:** 2–3 weeks. The bulk is mechanical extraction; the templating layer is small.
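To show how small such a templating layer can be, here is a sketch in Python that handles only the `${if flag}` / `${endif}` directives shown above, each on its own line, with nesting allowed. The function name `render_template` and the error behaviour are assumptions. The real layer may need more, such as value substitution or an else branch.

```python
def render_template(template, flags):
    """Keep lines whose enclosing ${if ...} blocks are all enabled in `flags`."""
    out = []
    stack = []  # one bool per open ${if}: is that block's flag set?
    for line in template.splitlines():
        stripped = line.strip()
        if stripped.startswith("${if ") and stripped.endswith("}"):
            name = stripped[len("${if "):-1].strip()
            stack.append(bool(flags.get(name, False)))
        elif stripped == "${endif}":
            if not stack:
                raise ValueError("${endif} without matching ${if}")
            stack.pop()
        elif all(stack):  # all([]) is True, so top-level lines are always kept
            out.append(line)
    if stack:
        raise ValueError("unclosed ${if} block")
    return "\n".join(out) + "\n"


TEMPLATE = """\
global _start
${if uses_libdl}
extern dlopen
extern dlsym
${endif}
${if uses_win32}
extern ExitProcess
${endif}
section .text
"""

print(render_template(TEMPLATE, {"uses_libdl": True}))
```

Treating a missing flag as false keeps the output conservative: a section is emitted only when the compilation explicitly recorded the capability.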
Phase 6 — HIR / MIR / LIR three-tier IR
After Phase 2 introduces a single KIR, refine into three tiers:
| Tier | Shape | Used for |
|---|---|---|
| **HIR** | Tree, close to AST. Names not yet resolved. | Sema (name resolution, type checks), comptime eval, pattern desugaring, generic instantiation. |
| **MIR** | Basic blocks, SSA, target-independent. Names resolved to local slots. | Optimizer passes: DCE, constant folding, inlining, autovec (#701), PGO (#703). |
| **LIR** | Target-specific virtual instructions. Register hints. | Register allocation, calling convention, encoder dispatch. |
Conversions are typed passes: lower_hir_to_mir(HIR.Module) → MIR.Module, lower_mir_to_lir(MIR.Module, target) → LIR.Module. Each pass has input/output snapshot tests. New optimizations attach as new MIR passes without touching frontend or backends.
**Gain unlocked:** modular optimization. Adding LLVM-grade passes (loop unrolling, vectorization, alias analysis, escape analysis) is incremental.
**Estimated effort:** 6–8 weeks. Architectural; involves rewriting compile_node and friends to produce HIR rather than direct bytecode.
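The idea that optimizations attach as independent MIR passes can be sketched in Python with two of the passes named in the table, constant folding and DCE, composed by a tiny pass runner. The `Instr` layout is a hypothetical stand-in for MIR and covers only a single straight-line block in which each destination is assigned once. It says nothing about control flow or phi nodes.

```python
from dataclasses import dataclass
from typing import Tuple, Union

Operand = Union[int, str]  # int literal, or the name of a local slot


@dataclass(frozen=True)
class Instr:
    dest: str
    op: str                      # "const", "add", "mul", or "ret"
    args: Tuple[Operand, ...]


def fold_constants(block):
    """Replace add/mul on known constants with a single const instruction."""
    known = {}
    out = []
    for ins in block:
        # Substitute slots whose constant value is already known.
        args = tuple(known.get(a, a) if isinstance(a, str) else a for a in ins.args)
        if ins.op == "const":
            known[ins.dest] = args[0]
            out.append(Instr(ins.dest, "const", args))
        elif ins.op in ("add", "mul") and all(isinstance(a, int) for a in args):
            value = args[0] + args[1] if ins.op == "add" else args[0] * args[1]
            known[ins.dest] = value
            out.append(Instr(ins.dest, "const", (value,)))
        else:
            out.append(Instr(ins.dest, ins.op, args))
    return out


def eliminate_dead_code(block):
    """Drop instructions whose result is never used, walking backwards from ret."""
    live = set()
    kept = []
    for ins in reversed(block):
        if ins.op == "ret" or ins.dest in live:
            kept.append(ins)
            live.update(a for a in ins.args if isinstance(a, str))
    kept.reverse()
    return kept


def run_passes(block, passes):
    """Apply each pass in order; every pass maps a block to a new block."""
    for p in passes:
        block = p(block)
    return block


block = [
    Instr("a", "const", (2,)),
    Instr("b", "const", (3,)),
    Instr("c", "add", ("a", "b")),
    Instr("d", "mul", ("c", "x")),       # x is a parameter, unknown here
    Instr("unused", "add", ("a", "a")),
    Instr("_", "ret", ("d",)),
]
for ins in run_passes(block, [fold_constants, eliminate_dead_code]):
    print(ins)
```

Since every pass has the same block-in, block-out type, each one can be snapshot-tested alone, and the order of passes is just a list.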
Phase 7 — Frontend deduplication
Resolve bootstrap_frontend.kd + bootstrap_frontend_pure.kd + bootstrap_parser.kd into a single canonical:
self-hosted/lib/frontend/
├── token.kd # Token type
├── ast.kd # ASTNode type
├── lexer.kd # tokenize(source) → Token[]
└── parser.kd # parse(tokens) → ASTNode (single function entry)

Bootstrap stages keep only the trimmed subsets they actually need (no copy-paste of the full parser).
**Gain unlocked:** parser bug #308 has one place to fix, not three. Future parser improvements (better error recovery, location tracking, incremental reparse for LSP) ship once.
**Estimated effort:** 1–2 weeks. Surgical: line-by-line dedup plus verification.
5. Sequencing & dependencies
Phase 1 ─┬─ Phase 2 ─── Phase 3 ─┬─ Phase 6
         │                       │
         └─ Phase 4              └─ Phase 7
            Phase 5

- Phases 4 and 5 can run in parallel after Phase 2 (they touch different parts of the codebase).
- Phase 6 requires Phase 3 done (typed IR + isolated backends are prerequisites for the three-tier split).
- Phase 7 is independent of Phases 4, 5, and 6 but easier after Phase 1.
**Total estimated effort:** ~7–9 months of focused work for a single person; ~3–4 months for a two-person team.
6. Per-phase gates
Each phase ships only when:
- All existing tests in `self-hosted