Regression Tests per Bug
Every bug fix requires a regression test (3 categories: behavioral, golden, structural) that fails without the fix and passes with it. Tests live in tests/regression/. Registry: registries/regression-test-cases.md.
Policy — Regression Tests per Bug
When fixing any bug in any module of the monorepo, **you must** create a regression test that:
- **Reproduces the bug**: the test fails if the fix is reverted.
- **Passes after the fix**.
- **Stays permanently in the repository** as regression protection.
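The first two requirements can be checked mechanically. The sketch below is only an illustration: clamp_buggy and clamp_fixed are hypothetical stand-ins for the code before and after a fix, and regression_check plays the role of the regression test's assertion.

```python
def clamp_buggy(value, low, high):
    # Hypothetical pre-fix behavior: the upper bound is never applied.
    return max(value, low)


def clamp_fixed(value, low, high):
    # Hypothetical post-fix behavior: both bounds are applied.
    return min(max(value, low), high)


def regression_check(clamp) -> bool:
    """The regression assertion, parameterized over the implementation."""
    return clamp(15, 0, 10) == 10


# Passes after the fix...
assert regression_check(clamp_fixed)
# ...and fails if the fix is reverted.
assert not regression_check(clamp_buggy)
```

In a real module there is only one implementation at a time, so the "fails if reverted" half is confirmed by temporarily undoing the fix and rerunning the test.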
Directory Structure (per module)
modulo/
  tests/
    regression/
      NNN-descricao-do-bug.test.{ext}

- NNN: sequential number per module (001, 002, ...).
- descricao-do-bug: short description of the bug (abbreviated to descricao in the patterns below).
- {ext}: language-specific extension of the module (.kd, .go, .ts, .dart, .py, .sh, etc.).
- **Language-specific conventions override** when the test runner requires a fixed filename format:
  - **Go** (go test discovers only *_test.go): use NNN_descricao_test.go (e.g. 001_sighup_reload_test.go).
  - **Rust**: tests live in tests/ with NNN_descricao.rs.
  - **Python (pytest)**: prefer test_NNN_descricao.py.
  - Other languages (.kd, .ts, .dart, .sh, etc.): follow NNN-descricao.test.{ext}.
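The naming conventions above can be captured in a small helper. This is a minimal sketch, not tooling that exists in the repository: regression_test_name, next_number, and RUNNER_PATTERNS are hypothetical names, and the slug rules (lowercase, alphanumeric words) are an assumption.

```python
import re
from pathlib import Path

# Hypothetical helper (not part of the policy): filename patterns keyed by
# extension, mirroring the runner-specific conventions listed above.
RUNNER_PATTERNS = {
    "go": "{nnn}_{slug}_test.go",
    "rs": "{nnn}_{slug}.rs",
    "py": "test_{nnn}_{slug}.py",
}


def regression_test_name(number: int, description: str, ext: str) -> str:
    """Build a regression test filename for a module with the given extension."""
    words = re.findall(r"[a-z0-9]+", description.lower())
    nnn = f"{number:03d}"
    if ext in RUNNER_PATTERNS:
        return RUNNER_PATTERNS[ext].format(nnn=nnn, slug="_".join(words))
    # Default convention: NNN-description.test.{ext}
    return f"{nnn}-{'-'.join(words)}.test.{ext}"


def next_number(regression_dir: Path) -> int:
    """Return the next sequential number based on the files already present."""
    numbers = [
        int(m.group(1))
        for p in regression_dir.glob("*")
        if (m := re.search(r"(?:^|_)(\d{3})[-_]", p.name))
    ]
    return max(numbers, default=0) + 1
```

For example, regression_test_name(1, "SIGHUP reload", "go") yields 001_sighup_reload_test.go, matching the Go example above.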
Test Header
Every regression test must start with a comment containing:
- Reference to the backlog ticket (if any).
- Description of the bug in one line.
- Expected behavior (correct) vs buggy behavior.
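A header following the three items above might look like the sketch below, written for a pytest module. The ticket ID, the test number, and split_csv are hypothetical; in a real test the function would be imported from the module under test rather than defined inline.

```python
# Regression test 004 (the file would be named test_004_trailing_comma.py).
# Ticket: BACKLOG-123 (hypothetical ID).
# Bug: split_csv("a,b,") returned a trailing empty field.
# Expected: ["a", "b"]. Buggy: ["a", "b", ""].


def split_csv(line: str) -> list[str]:
    """Stand-in for the fixed production code (hypothetical)."""
    return [field for field in line.split(",") if field]


def test_004_trailing_comma_yields_no_empty_field():
    assert split_csv("a,b,") == ["a", "b"]
```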
Rules
- **Simple and focused**: test only the specific bug, not the whole system.
- **No unstable external dependencies** (APIs, network, third-party services).
- **No UI tests coupled to layout**: test logic/behavior, not pixels.
- If the module has no tests/regression/ yet, create it on the first bug.
- Name the file descriptively (e.g. 003-argv-variable-mismatch.test.kd).
Test Categories (preference order)
6.a — Behavioral (default)
Executes the code and validates observable behavior (output, exit code, state, side effects). Default choice. Any bug fix should try this category first.
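As an illustration of a behavioral test, the sketch below runs a program and asserts on its exit code and output. FIXED_CLI is a hypothetical stand-in for the program under test: a tiny command-line script that must exit with status 2 when called without arguments.

```python
import subprocess
import sys

# Hypothetical program under test, inlined so the example is self-contained.
FIXED_CLI = (
    "import sys\n"
    "if len(sys.argv) < 2:\n"
    "    sys.exit(2)\n"
    "print(sys.argv[1])\n"
)


def run_cli(source: str, *args: str) -> subprocess.CompletedProcess:
    """Run the script in a fresh interpreter and capture its output."""
    return subprocess.run(
        [sys.executable, "-c", source, *args],
        capture_output=True,
        text=True,
    )


def test_missing_argument_exits_with_status_2():
    result = run_cli(FIXED_CLI)
    assert result.returncode == 2
    assert result.stdout == ""


def test_argument_is_echoed():
    result = run_cli(FIXED_CLI, "hello")
    assert result.returncode == 0
    assert result.stdout.strip() == "hello"
```

The assertions touch only what a user of the program could observe, which is what keeps the test valid when the implementation is later refactored.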
6.b — Golden
Validates an **invariance/contract between multiple representations of the same data** via roundtrip, diff, or direct comparison. First-class category, not a workaround. Use when the goal isn't "X does Y" but "A and B agree on X". Typical cases: DSL↔generated code, encoder↔decoder, single-source-of-truth with multiple consumers, binary↔text formats. The test still runs something and checks an exit code, but the "behavior" under test is consistency, not business logic.
**Validate by injecting drift** (e.g. intentionally alter one side and confirm the test fails) before committing. Mark as golden in the "Revisão?" column of the registry. Canonical examples: 023-opcode-dsl-consistency.test.sh, 024-kbcb-roundtrip.test.sh, 025-kode-disasm.test.sh, 026-bytekode-loc-sourcemap.test.sh in engines/lang/lang/tests/regression/.
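A golden test for an encoder↔decoder pair, including the drift check, could be sketched as follows. The encode/decode pair and the sample records are hypothetical and stand in for any two representations that must agree; the canonical examples listed above are shell scripts, so this only shows the shape of the idea.

```python
import json


# Hypothetical encoder/decoder pair standing in for the two representations.
def encode(record: dict) -> str:
    return json.dumps(record, sort_keys=True)


def decode(text: str) -> dict:
    return json.loads(text)


def roundtrip_agrees(record: dict, encoder=encode, decoder=decode) -> bool:
    """True when decoding the encoded record gives back the original."""
    return decoder(encoder(record)) == record


SAMPLES = [{"op": "push", "arg": 1}, {"op": "halt", "arg": None}]


def test_roundtrip_is_lossless():
    assert all(roundtrip_agrees(r) for r in SAMPLES)


def test_injected_drift_is_detected():
    # Drift injection: alter one side on purpose and confirm the check fails.
    def drifting_encoder(record: dict) -> str:
        return encode({**record, "op": record["op"].upper()})

    assert not any(roundtrip_agrees(r, encoder=drifting_encoder) for r in SAMPLES)
```

The policy asks for drift injection as a one-off validation before committing; keeping it as a second permanent test, as done here, is an optional design choice.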
6.c — Structural as stopgap (exception)
When the target module has no executable runtime harness and building one is out of scope for the fix, a structural test (grep/AST on source) is acceptable **only if all** of these hold:
- (a) the test **fails** if the fix is reverted (verify with git stash before committing);
- (b) an **open backlog ticket** tracks the construction of the harness, and is referenced in the test's comment header;
- (c) the test's top comment **explains why** a behavioral test isn't viable today;
- (d) the registry entry (meta/context/registries/regression-test-cases.md) is marked with estrutural in the "Revisão?" column to ease future auditing.
Structural stopgaps must be replaced with behavioral tests as soon as the harness exists.
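A structural stopgap that satisfies conditions (a) to (c) might look like the sketch below. Everything specific in it is hypothetical: the test number, the ticket ID, the bug, and the source snippets, which are inlined so the example runs on its own. A real test would read the module's source file instead.

```python
# Regression test 012: STRUCTURAL stopgap (hypothetical example).
# Ticket: BACKLOG-456 (hypothetical ID), which tracks building a runtime harness.
# Bug: reload() was registered for SIGTERM instead of SIGHUP.
# Expected: handler bound to SIGHUP. Buggy: handler bound to SIGTERM.
# Why structural: the module has no executable harness yet, so the
# behavior cannot be exercised directly.
import re

REGISTRATION = re.compile(r"signal\.signal\(\s*signal\.SIGHUP\s*,\s*reload\s*\)")


def registers_reload_on_sighup(source: str) -> bool:
    """Grep-style check on source text."""
    return REGISTRATION.search(source) is not None


# Inlined stand-ins for the module source with and without the fix.
FIXED_SOURCE = "signal.signal(signal.SIGHUP, reload)\n"
REVERTED_SOURCE = "signal.signal(signal.SIGTERM, reload)\n"

assert registers_reload_on_sighup(FIXED_SOURCE)
# Condition (a): the check fails once the fix is reverted.
assert not registers_reload_on_sighup(REVERTED_SOURCE)
```

Condition (d) lives outside the test file: the registry entry still has to be marked accordingly.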
Registry
- Keep the counter up-to-date in meta/context/registries/regression-test-cases.md.
- Each new regression test adds an entry to the registry.
- **While the counter ≤ 30 cases**: review this policy and the process at each new case to identify improvements (recurring patterns, bug categories, needed helpers, structural adjustments). Apply improvements to the policy and inform the user.
- **After 30 cases**: the policy is considered mature, so stop automatic reviews. The user can request a manual review at any time.