RFC 006 — Auto-Remediation Rules Engine

  • Tracking ticket: backlogdone006
  • Depends on:
    • RFC 001 — Ecosystem Map (Approved, 2026-04-07)
    • RFC 002 — Kortex Architecture (Approved, 2026-04-07) — especially §3.3 Reflexes
    • RFC 004 — Common Control Plane (Approved, 2026-04-08) — the only outbound surface available to the engine; capabilities deploys and firewall were added by amendment in this revision
    • RFC 005 — LLM Provider Abstraction (Approved, 2026-04-08) — defines propose_action, the bridge from advisory LLM suggestions into the deterministic engine
  • Status: Approved (2026-04-08)

1. Summary

This RFC defines the Auto-Remediation Rules Engine — the concrete shape of the Reflexes subsystem introduced in RFC 002 §3.3. It specifies the declarative DSL operators use to describe reflexes, the catalog of rule types and actions, the mandatory safety guards (dry-run, rate limit, blast radius, approval, circuit breaker), the storage and versioning model, the starter rule pack shipped with Kortex, the testing and replay framework, the metrics the engine emits, and the LLM-assisted learning loop that turns repeated manual interventions into proposed new rules.

The guiding principle is a hard separation between two complementary sources of remediation authority:

  • Rules engine (this RFC) — deterministic, testable, auditable, autonomous. Ships actions to upstream products without per-decision human approval, as long as guards are satisfied. This is the reflex arc of Kortex — fast, cheap, predictable.
  • LLM provider (RFC 005) — probabilistic, contextual, conversational, advisory-only. Never executes anything. Its single write-side tool is propose_action, which posts a Confirm/Reject card into chat. A confirmed proposal is funneled into the rules engine as a one-shot manual firing, but the engine still evaluates the same guards it would for any other rule.

The separation is not an implementation detail. It is the central safety property of the whole system. A subtle bug, a prompt injection, a hallucinated tool argument — none of these can bypass the rules engine's guards, because the LLM is physically incapable of calling the control plane directly. The engine is the only component with those credentials and the only component that can cross that line, and it cannot cross it without an explicit rule whose guards are satisfied.
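
As a rough illustration of this separation, the following Python sketch models the two sides. It is not the engine's actual API: every class, function, and field name here (Firing, ControlPlane, RulesEngine, LLMProvider, confirm, blast_radius_guard) is hypothetical, and only propose_action is a name taken from RFC 005. The point it tries to show is that the advisory side holds no reference to the control plane, and that a confirmed proposal passes through the same guards as any other firing.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class Firing:
    """One attempted action. All field names are assumptions for this sketch."""
    rule_id: str
    action: str
    target: str
    source: str  # "rule" or "confirmed_proposal"


Guard = Callable[[Firing], bool]


class ControlPlane:
    """Stand-in for the RFC 004 control plane; records calls instead of making them."""

    def __init__(self) -> None:
        self.calls: List[Firing] = []

    def execute(self, firing: Firing) -> None:
        self.calls.append(firing)


class RulesEngine:
    """The only object in this sketch that holds a ControlPlane reference."""

    def __init__(self, control_plane: ControlPlane, guards: Iterable[Guard]) -> None:
        self._control_plane = control_plane
        self._guards = list(guards)

    def fire(self, firing: Firing) -> bool:
        # Every firing, whatever its source, must satisfy every guard.
        if not all(guard(firing) for guard in self._guards):
            return False
        self._control_plane.execute(firing)
        return True


class LLMProvider:
    """Advisory side: it can only produce a proposal card."""

    def propose_action(self, action: str, target: str) -> Dict[str, str]:
        return {"action": action, "target": target, "status": "pending"}


def confirm(engine: RulesEngine, card: Dict[str, str]) -> bool:
    """A human confirmed the card: hand it to the engine as a one-shot manual firing."""
    firing = Firing(
        rule_id="manual",
        action=card["action"],
        target=card["target"],
        source="confirmed_proposal",
    )
    return engine.fire(firing)


def blast_radius_guard(firing: Firing) -> bool:
    """Illustrative guard: refuse any action aimed at every target at once."""
    return firing.target != "*"
```

In this sketch, a confirmed proposal targeting "*" is rejected by blast_radius_guard exactly as a rule-originated firing would be, and nothing reaches the control plane.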

This RFC unblocks the Reflexes implementation work (ticket 007+) and completes the architectural story opened by RFC 002. With RFCs 001–006 landed, every surface of Kortex — ingest, analysis, memory, coordination, LLM, control plane, reflexes — has a concrete specification.


2. Context and motivation

2.1 Why a deterministic engine at all

RFC 002 §3.3 described Reflexes as "the only outbound side of Kortex" and established the four-part rule shape (trigger, conditions, action, guards) and the four mandatory guards. It deferred to this RFC the question of how those rules are actually written and evaluated.
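
To make the four-part shape concrete, here is a minimal Python sketch of a rule and one possible evaluation order (trigger, then conditions, then guards, then action). It is an assumption-laden illustration, not the DSL this RFC defines: the Rule dataclass, the evaluate function, the MaxFirings guard, and the event fields type, service, and consecutive_failures are all invented for the example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

Event = Dict[str, Any]


@dataclass
class Rule:
    """The four-part shape from RFC 002 §3.3, modelled with plain callables."""
    name: str
    trigger: str                               # event type that wakes the rule
    conditions: List[Callable[[Event], bool]]  # all must hold for the event
    action: Callable[[Event], str]             # returns a description of the action
    guards: List[Callable[[Event], bool]]      # all must pass before acting


class MaxFirings:
    """Toy rate-limit guard: allows at most `limit` firings in its lifetime."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.count = 0

    def __call__(self, event: Event) -> bool:
        if self.count >= self.limit:
            return False
        self.count += 1
        return True


def evaluate(rule: Rule, event: Event) -> Optional[str]:
    """Return the action description if the rule fires, else None."""
    if event.get("type") != rule.trigger:
        return None
    if not all(condition(event) for condition in rule.conditions):
        return None
    if not all(guard(event) for guard in rule.guards):
        return None
    return rule.action(event)


# Hypothetical rule: restart a service after three consecutive failed health checks.
restart_rule = Rule(
    name="restart-on-health-failure",
    trigger="health_check_failed",
    conditions=[lambda e: e.get("consecutive_failures", 0) >= 3],
    action=lambda e: f"restart {e['service']}",
    guards=[MaxFirings(limit=1)],
)

event = {"type": "health_check_failed", "service": "platform/flow", "consecutive_failures": 3}
first = evaluate(restart_rule, event)   # fires: conditions and guard are satisfied
second = evaluate(restart_rule, event)  # blocked: the rate-limit guard is exhausted
```

Checking guards last means a guard such as the rate limit only consumes budget for events that would otherwise have fired.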

The naive alternative — "just let the LLM decide what to do and execute its tool calls" — is the wrong answer for three reasons:

  1. Cost and latency. A typical Koder deployment might trip a dozen health-check failures per hour across infra/jet, platform/flow, platform/id, and the `observe
