Visual golden widget pattern (headless compositor + screencopy)

Canonical recipe for shipping byte-exact visual regression goldens for any single widget in any Koder UI surface (GTK4/Adwaita today; Flutter / web / Android extensions follow the same shape). Established 2026-05-24 in `engines/sdk/koder_kit_gtk` across three iterations (KKGTK-002 R1+R2+R3a, registries #647-#653). Companion to `specs/develop/visual-regression-tdds.kmd § R1 Category C` (the normative test category) — this pattern is the **how**.

Pattern — Visual golden widget

When to use

A new widget (or new variant of an existing one) ships in a Koder SDK or product, and the team wants byte-exact visual regression detection for it. Three things must already be in place:

  • **Compositor:** koder-x with WLR_BACKENDS=headless support (Pilot 1 R1 of RFC-005, shipped 2026-05-24).
  • **Capture client:** grim (wlr-screencopy-v1) reachable from the test host.
  • **Container chain:** for Adw widgets specifically, an AdwApplicationWindow → AdwPreferencesPage → AdwPreferencesGroup parent chain, discovered as load-bearing in KKGTK-002 R2 (registry #650).

If any of those is missing, the host is not yet wired for this pattern. See policies/test-host-isolation.kmd for the canonical test host (s.khost1.dev-linux-klinux LXC as of 2026-05).
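A quick preflight can catch a host that is missing the first two prerequisites before any capture is attempted. The sketch below is illustrative only: `missing_tools` and `preflight` are hypothetical helper names, and it assumes the compositor and capture client are on PATH as `koder-x` and `grim`. It checks only that the binaries exist; it does not verify headless support or the container chain.

```shell
#!/usr/bin/env bash
# Hypothetical preflight helper: prints every command from its argument
# list that is not found on PATH, one per line. Prints nothing if all exist.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Returns 0 when koder-x and grim are both on PATH, 1 otherwise.
# (Binary names are assumptions taken from the prerequisites list.)
preflight() {
    local missing
    missing=$(missing_tools koder-x grim)
    if [ -n "$missing" ]; then
        printf 'host not wired for visual goldens; missing: %s\n' \
            "$(printf '%s' "$missing" | tr '\n' ' ')" >&2
        return 1
    fi
}
```

A check script could call `preflight || exit 1` as its first step so that a mis-provisioned host fails fast instead of producing empty captures.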

The four artifacts

Per widget, ship four files. Names follow the slug convention <widget_kind> (e.g. adw_switch_row, adw_password_entry_row, adw_action_row):

1. The repro binary — tests/repro_<widget>.c

A tiny GTK application (~50 lines) that constructs the canonical container chain, parameterizes the widget under a single env var (KKGTK_<WIDGET>_STATE or similar), presents the window, and quits after 2 seconds via g_timeout_add_seconds.

Template:

#include <adwaita.h>
#include <gtk/gtk.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

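/* Timeout callback: quit the app once; G_SOURCE_REMOVE makes it one-shot. */
static gboolean quit_cb(gpointer app) {
    g_application_quit(G_APPLICATION(app));
    return G_SOURCE_REMOVE;
}

static void on_activate(GtkApplication *app, gpointer ud) {
    (void)ud;
    const char *state = getenv("KKGTK_<WIDGET>_STATE");
    /* TODO: map `state` (NULL when unset) onto the widget's properties. */
    (void)state;

    GtkWidget *win = adw_application_window_new(app);
    gtk_window_set_default_size(GTK_WINDOW(win), 600, 200);

    AdwPreferencesPage *page = ADW_PREFERENCES_PAGE(adw_preferences_page_new());
    AdwPreferencesGroup *group = ADW_PREFERENCES_GROUP(adw_preferences_group_new());

    /* TODO: construct the widget under test and add it to the group,
       e.g. adw_preferences_group_add(group, GTK_WIDGET(row)); */

    adw_preferences_page_add(page, group);
    adw_application_window_set_content(ADW_APPLICATION_WINDOW(win), GTK_WIDGET(page));
    gtk_window_present(GTK_WINDOW(win));

    /* Quit after 2 seconds so the capture script never has to kill us. */
    g_timeout_add_seconds(2, quit_cb, app);
}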

int main(int argc, char **argv) {
    AdwApplication *app = adw_application_new(
        "dev.koder.kkgtk.<widget>", G_APPLICATION_DEFAULT_FLAGS);
    g_signal_connect(app, "activate", G_CALLBACK(on_activate), NULL);
    int rc = g_application_run(G_APPLICATION(app), argc, argv);
    g_object_unref(app);
    return rc;
}

2. The goldens — tests/goldens/<widget>_<state>.png

One PNG per distinguishable state, captured **once** under the test host, then committed. Capture procedure:

  1. Spawn koder-x with WLR_BACKENDS=headless + KODER_X_HEADLESS_TEST_OUTPUTS=1 in a sandboxed XDG_RUNTIME_DIR.
  2. Wait for the wayland-N socket.
  3. Run the repro binary with the desired state env var, pointing WAYLAND_DISPLAY + GDK_BACKEND=wayland at the spawned compositor.
  4. After ~1 second (window stabilization), invoke grim -o HEADLESS-1 <out.png>.
  5. Repeat for each state.
  6. Verify md5s **differ across states** before committing. Identical md5s across distinct states mean the capture isn't reflecting the state (see KKGTK-002 R2: wrong container, wrong widget choice, etc.).
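Step 6 is easy to skip when done by eye, so it may be worth scripting. The following is a minimal sketch, not part of koder_kit_gtk: `assert_distinct_md5` is a hypothetical helper that fails when any two of the files passed to it share an md5.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if every file passed in has a unique md5.
# Returns 2 if a file is missing, 1 if any two files hash the same.
assert_distinct_md5() {
    local f dups
    for f in "$@"; do
        [ -f "$f" ] || { printf 'missing capture: %s\n' "$f" >&2; return 2; }
    done
    # uniq -d prints only digests that occur more than once.
    dups=$(md5sum -- "$@" | cut -d' ' -f1 | sort | uniq -d)
    if [ -n "$dups" ]; then
        printf 'identical captures across states: %s\n' "$dups" >&2
        return 1
    fi
}
```

Before committing, something like `assert_distinct_md5 tests/goldens/adw_switch_row_*.png` would then refuse a set of goldens in which two states captured identically.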

3. The check script — tests/headless/golden_check_<widget>.sh

Wraps the capture procedure into a re-runnable assertion. Inputs: no args → check all states; --update → accept current capture as the new golden. Failures save the diverging PNG to tests/goldens/_failures/<widget>_<state>_<ts>.png for investigation.

Template (per-state loop):

check_state() {
    local label="$1"
    local state_value="$2"
    local golden="$GOLDEN_DIR/<widget>_${label}.png"
    local g_md5 c_md5

    # Launch the repro against the sandboxed compositor, in the background.
    KKGTK_<WIDGET>_STATE="$state_value" \
    XDG_RUNTIME_DIR="$SANDBOX" \
    WAYLAND_DISPLAY="$SOCK" \
    GDK_BACKEND=wayland \
        "$BUILD/repro_<widget>" >/dev/null 2>&1 &
    APP_PID=$!
    sleep 1    # let the window stabilize before capturing
    grim -o HEADLESS-1 "$SANDBOX/current-${label}.png" 2>/dev/null
    wait "$APP_PID" 2>/dev/null || true    # the repro quits itself after 2 seconds

    # Byte-exact comparison: any md5 difference counts as a mismatch.
    g_md5=$(md5sum "$golden" | cut -d' ' -f1)
    c_md5=$(md5sum "$SANDBOX/current-${label}.png" | cut -d' ' -f1)
    [ "$g_md5" = "$c_md5" ] || handle_mismatch
}

check_state state_a "value_a"
check_state state_b "value_b"
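The template calls handle_mismatch without showing it. One possible shape, following the behavior described for the check script (save the diverging PNG on failure, overwrite the golden on --update), is sketched below. It is an assumption, not the shipped implementation: `WIDGET`, `UPDATE`, and `FAILED` are hypothetical variables (the widget slug, a flag set when --update was passed, and an accumulator for the exit status), and the function relies on shell dynamic scoping to read `label` and `golden` from the calling check_state.

```shell
#!/usr/bin/env bash
# Hypothetical handle_mismatch. Reads `label` and `golden` from the calling
# check_state (locals are visible to callees), plus the globals SANDBOX and
# GOLDEN_DIR. WIDGET, UPDATE and FAILED are assumed names, not from the source.
handle_mismatch() {
    local current="$SANDBOX/current-${label}.png"
    local ts
    if [ "${UPDATE:-0}" = "1" ]; then
        # --update: accept the current capture as the new golden.
        cp -- "$current" "$golden"
        echo "UPDATED ${WIDGET}_${label}"
        return 0
    fi
    # Otherwise keep the diverging PNG for investigation and record the failure.
    ts=$(date +%Y%m%d-%H%M%S)
    mkdir -p "$GOLDEN_DIR/_failures"
    cp -- "$current" "$GOLDEN_DIR/_failures/${WIDGET}_${label}_${ts}.png"
    echo "FAIL ${WIDGET}_${label}" >&2
    FAILED=1
    return 0    # keep going so the remaining states are still checked
}
```

Under this design the script would end with `exit "${FAILED:-0}"`, so one diverging state does not hide divergences in the others.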

4. Aggregate runner — auto-picked

tests/headless/run_all_goldens.sh, already present in koder_kit_gtk, globs every golden_check_*.sh in the same directory. No per-widget edit is needed; a new check is picked up automatically the next time the aggregate runs.
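For readers who have not opened the runner, the glob-and-run idea can be sketched as follows. This is an illustration under assumptions, not the contents of run_all_goldens.sh: the function name and the PASS/FAIL output format are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of an aggregate runner: executes every golden_check_*.sh found in
# the given directory and returns non-zero if any of them failed.
run_all_goldens() {
    local dir="$1" script failed=0 found=0
    for script in "$dir"/golden_check_*.sh; do
        [ -e "$script" ] || continue    # an unmatched glob stays literal; skip it
        found=1
        if bash "$script"; then
            echo "PASS $(basename "$script")"
        else
            echo "FAIL $(basename "$script")"
            failed=1
        fi
    done
    [ "$found" -eq 1 ] || echo "no golden_check_*.sh scripts in $dir" >&2
    return "$failed"
}
```

Because discovery is by filename, dropping a new golden_check_<widget>.sh into the directory is the whole registration step.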

Anti-patterns

These were discovered the hard way during the KKGTK-002 progression (registries #647 → #648 → #649 → #650). Avoid:

A1 — Single golden for state-changing widget

A widget with toggleable state (AdwSwitchRow on/off, password entry empty/filled) needs **at least two goldens** (one per state). Shipping just one and --update-ing it loses regression coverage for the un-captured state.

A2 — Identical md5 across "different" states

If two states produce the same md5, the capture is **not** catching what you think it is. Stop. Investigate before committing. Likely causes (in priority order):

  1. **Wrong parent container:** Adw widgets need the AdwPreferencesGroup chain; without it the paint vfunc bails (the gtk_list_box_row_grab_focus: assertion 'box != NULL' failed log line is the giveaway).
  2. **Wrong widget choice:** some widgets visually don't change for the state you're varying. Pick a state that actually paints differently.
  3. **Capture timing:** window not fully presented when grim ran. Bump the sleep or wait for a specific frame.
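For the capture-timing cause, one alternative to a longer fixed sleep is to poll until two consecutive captures are byte-identical. The sketch below is an assumption layered on the pattern, not something koder_kit_gtk ships: `capture_when_stable` is a hypothetical helper, and the 0.2-second poll interval is arbitrary.

```shell
#!/usr/bin/env bash
# Hypothetical helper: runs a capture command repeatedly until two
# consecutive captures hash the same, or gives up after max_tries.
# Usage: capture_when_stable <out.png> <max_tries> <capture command...>
# The output path is appended as the capture command's last argument.
capture_when_stable() {
    local out="$1" max_tries="$2"
    shift 2
    local prev="" cur i=0
    while [ "$i" -lt "$max_tries" ]; do
        "$@" "$out" || return 2             # capture command itself failed
        cur=$(md5sum "$out" | cut -d' ' -f1)
        [ "$cur" = "$prev" ] && return 0    # same bytes twice in a row: settled
        prev="$cur"
        i=$((i + 1))
        sleep 0.2
    done
    return 1                                # never settled within max_tries
}
```

In a check script this might be invoked as `capture_when_stable "$SANDBOX/current-${label}.png" 10 grim -o HEADLESS-1`. Note the limit of the technique: an output that is stably blank also counts as settled, so this complements the cross-state md5 comparison rather than replacing it.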

A3 — meson test integration

Don't wire golden_check into meson test. Meson wants binaries present at build time, but goldens need a **running compositor** + grim + per-host env that varies. Keep the checks as shell scripts invoked from CI / /k-housekeep / release gates.

A4 — paintable-based capture

GTK4's gtk_widget_paintable_new + gdk_paintable_snapshot returns NULL in headless Wayland (R3b finding, commit 8940450473). Use compositor screencopy via grim, not widget-side paintable.

Registry

Each shipped widget under this pattern adds a row to registries/visual-regression-coverage.md. Use existing columns: A (overflow) / B (chrome) / C (proportion — this pattern) / D (sibling collision). Most Category C ✅ slots will come from this pattern.

Future surface kinds

Today this pattern is GTK/Adw-specific because the canonical container chain is Adw. Equivalent patterns for other surfaces:

  • **Flutter:** MaterialApp → Scaffold → <widget>. Use golden_toolkit or its koder_test_screencap Dart equivalent. Captured pixels still go through wlroots screencopy if running headless via koder-x.
  • **Web (templ + HTMX, Flutter Web):** Playwright / Puppeteer screenshot against the same headless koder-x + a real browser instance. The compositor screencopy path and the browser screenshot API are equivalent at the byte level once the browser's render is committed.
  • **Android native (Compose):** createComposeRule() + captureToImage(). The container chain is less prescriptive than Adw's, but Compose's preview infrastructure mostly handles realization automatically.

Ratification

Pattern ratified by working implementation across three widgets in engines/sdk/koder_kit_gtk:

  • AdwSwitchRow (off/on) — registry #650
  • AdwPasswordEntryRow (empty/filled) — registry #651
  • AdwActionRow (title-only/with-subtitle) — registry #652
  • Aggregate runner — registry #653
  • k-housekeep Phase 2.6 wire — `commandsk-housekeep.md`

Recipe is reusable as-is for any new Adw widget in any Koder SDK; the four artifacts plus a registry row complete the contract.

Source: ../home/koder/dev/koder/meta/docs/stack/specs/patterns/visual-golden-widget.kmd