RFC-002 — Koder Herald Phase 1: First PoP Network
| Field | Value |
|---|---|
| Status | **Draft** (2026) |
| Author(s) | Rodrigo (with Claude as scribe) |
| Date | 2026 |
| Target module | platform/dns/ |
| Depends on | RFC-001 (accepted), Phase 0 complete |
1. Summary
Phase 1 moves Koder Herald from the Phase 0 software-only bridge (ClouDNS/Porkbun as execution layer) to operating its own authoritative DNS infrastructure at the first PoP. The PoP uses managed anycast provided by the hosting partner (no own BGP yet), backed by Knot DNS and `koder-dns-sync`. This phase proves the product on real infrastructure before investing in own BGP (Phase 2).
Phase 1 exit criteria:
- At least one PoP is live, reachable via anycast IP, and serving authoritative DNS
- `koder-herald`, `koder-dns-sync`, and Knot DNS are running in production
- Koder's own infrastructure (`koder.dev` and related domains) is migrated off ClouDNS to Koder Herald
- At least 3 paying external customers are using the platform
- p95 query latency < 10 ms from Brazil; < 30 ms from Miami and Amsterdam
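To make the latency criterion concrete, the following is a minimal sketch of how p95 could be computed from a batch of measured query times, using the nearest-rank method. The function names, the threshold table, and the choice of percentile method are illustrative assumptions; this RFC does not prescribe how the measurement tooling calculates p95.

```python
# Thresholds in milliseconds, copied from the exit criteria above.
# The region keys are hypothetical labels chosen for this sketch.
P95_TARGET_MS = {"brazil": 10.0, "miami": 30.0, "amsterdam": 30.0}


def p95(samples_ms):
    """Return the 95th percentile of the samples (nearest-rank method)."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # ceil(0.95 * n) computed with integers to avoid float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_target(region, samples_ms):
    """True when the region's p95 is strictly below its target."""
    return p95(samples_ms) < P95_TARGET_MS[region]
```

With 20 samples the nearest-rank p95 is the 19th smallest value, so a single slow outlier does not fail the criterion but two do.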
2. Architecture Recap (from RFC-001)

    ┌──────────────────────────────────────┐
    │  koder-herald (engine)               │
    │  - Zone/record store (kdb)           │
    │  - GeoDNS rules                      │
    │  - Failover monitors                 │
    │  - DDNS service                      │
    │  - Analytics collector               │
    └────────────────┬─────────────────────┘
                     │ HTTP API (zone serial polling)
         ┌───────────┴──────────┐
         │                      │
    ┌────▼──────────────────┐   │ (future PoPs)
    │ PoP: São Paulo        │   │
    │ ┌──────────────────┐  │   │
    │ │ Knot DNS         │  │   │
    │ │ (authoritative)  │  │   │
    │ └────────┬─────────┘  │   │
    │          │ knotc      │   │
    │ ┌────────▼─────────┐  │   │
    │ │ koder-dns-sync   │  │   │
    │ │ (zone agent)     │  │   │
    │ └──────────────────┘  │   │
    │ Anycast IP: Vultr     │   │
    └───────────────────────┘   │

3. Technical Decisions
Decision 1 — First PoP location: São Paulo, Brazil
**Chosen:** São Paulo (BR).
**Rationale:** Koder is a Brazilian company and all current customers are in Brazil. São Paulo has multiple quality data centres (Equinix SP, Ascenty, LGPD-compliant). PTT.br (São Paulo) is Brazil's primary IXP, which is useful for Phase 2 peering. Starting in BR minimises latency for the existing customer base before expanding globally.
Decision 2 — Managed anycast provider: Vultr BGP
**Chosen:** Vultr with BGP sessions (managed anycast).
**Rationale:** Vultr offers BGP on dedicated servers in São Paulo starting from ~$60/month. Vultr announces the IP block, so Koder does not need its own ASN or PI block at this stage. This removes the operational burden of running BIRD 2 and managing a BGP session before there is customer demand to justify it. Own BGP is Phase 2 (RFC-003).
Alternatives evaluated:
- **Hivelocity**: also viable, slightly more expensive in BR
- **Hetzner**: no São Paulo PoP yet (2026)
- **Hurricane Electric**: requires own ASN upfront
Decision 3 — DNS software at PoP: Knot DNS 3.x
**Chosen:** Knot DNS 3.x, installed from official packages (`knot` on Debian/Ubuntu).
**Rationale:** Already decided in RFC-001 §12 (decision #3). Knot DNS provides mod-geoip (Phase 1 ticket #015 and RFC-001 ticket #010), built-in DNSSEC with auto-signing, and rate limiting, all in a single package. The `koder-dns-sync` agent writes zone files and calls `knotc zone-reload`; no deeper integration is needed.
Decision 4 — DNSSEC: Knot DNS keymgr, auto-signing
**Chosen:** Knot DNS native DNSSEC auto-signing via keymgr. Herald does not sign zones; it delegates signing entirely to Knot DNS at the PoP.
**Rationale:** Knot DNS has a mature, production-grade DNSSEC implementation used by ccTLD operators. Implementing DNSSEC in Herald would duplicate this work and introduce security risk. The zone file exported by Herald is unsigned; Knot DNS signs it locally and serves the signed responses. Key material stays on the PoP and never leaves the server.
DNSSEC keys are backed up via keymgr backup to encrypted storage daily.
Decision 5 — Authentication: Koder ID JWT (replace X-Tenant-ID header)
**Chosen:** Bearer JWT issued by Koder ID. The `sub` claim is the tenant ID.
**Rationale:** Phase 0 uses a plain `X-Tenant-ID` header, which provides identification but no authentication. This is acceptable in a closed environment but not for a public API. Phase 1 wires Herald into Koder ID (already in production as of 2026-04-09) as an API client. Herald validates the JWT signature using Koder ID's public key (fetched once at startup, cached with a 24h TTL).
Migration: Phase 0 clients using `X-Tenant-ID` continue to work until Phase 2 (grace period). A feature flag `LEGACY_TENANT_HEADER=true` in Herald enables the fallback.
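The request-handling logic implied by this decision could look roughly like the sketch below: prefer a Bearer JWT, cache the verification key for 24 hours, and fall back to `X-Tenant-ID` only when the legacy flag is on. `CachedKey`, `resolve_tenant`, and the `fetch` and `verify` callables are hypothetical names introduced for illustration; actual signature verification would come from a JWT library and is deliberately left as an injected function. Rejecting a request that carries an invalid token even when the legacy flag is enabled is also an assumption, not something this RFC states.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Mapping, Optional

KEY_TTL_SECONDS = 24 * 60 * 60  # 24h TTL, from Decision 5


@dataclass
class CachedKey:
    """Caches Koder ID's public key and refetches it once the TTL elapses."""
    fetch: Callable[[], bytes]  # hypothetical: downloads the public key
    clock: Callable[[], float] = time.monotonic
    _key: Optional[bytes] = field(default=None, init=False, repr=False)
    _fetched_at: float = field(default=0.0, init=False, repr=False)

    def get(self) -> bytes:
        now = self.clock()
        if self._key is None or now - self._fetched_at >= KEY_TTL_SECONDS:
            self._key = self.fetch()
            self._fetched_at = now
        return self._key


def resolve_tenant(
    headers: Mapping[str, str],
    key_cache: CachedKey,
    verify: Callable[[str, bytes], Mapping[str, str]],
    legacy_enabled: bool = False,
) -> Optional[str]:
    """Return the tenant ID for a request, or None if it must be rejected.

    `verify(token, key)` is a placeholder for a real JWT library call: it
    returns the claims of a valid token and raises ValueError otherwise.
    Header names are assumed to be already normalised by the HTTP layer.
    """
    auth = headers.get("Authorization", "")
    if auth.startswith("Bearer "):
        try:
            claims = verify(auth[len("Bearer "):], key_cache.get())
        except ValueError:
            return None  # an invalid token never falls back to the legacy header
        return claims.get("sub")
    if legacy_enabled:  # mirrors LEGACY_TENANT_HEADER=true
        return headers.get("X-Tenant-ID")
    return None
```

Injecting the clock keeps the TTL behaviour testable without waiting a day.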
Decision 6 — Storage: keep kdb (SQLite-backed) for Phase 1
**Chosen:** kdb remains the storage layer for Phase 1.
**Rationale:** kdb-next (Rust/TiKV, RFC-001 in platform/kdb/next/) is still in design. The current kdb handles the Herald workload well for Phase 1 volumes (thousands of zones, millions of records). Migrating to kdb-next mid-phase introduces unnecessary risk. Herald will be designed to swap the storage layer via the existing repository interface, so migration to kdb-next in Phase 2 will be straightforward.
Decision 7 — Zone transfer: zone file export (no AXFR protocol)
**Chosen:** `koder-dns-sync` fetches zone files via `GET /api/v1/zones/{zone}/export` (HTTP, BIND format), not via the DNS AXFR protocol.
**Rationale:** Implementing AXFR (RFC 5936) in Herald is significant work for Phase 1. The HTTP-based export achieves the same result (full zone sync on serial change) and is simpler to implement, debug, and monitor. AXFR will be added in Phase 2 to support external secondaries (ticket #043).
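A single sync pass for one zone might be sketched as follows: compare serials, download the export only when the serial changed, write the file atomically, then ask Knot to reload. The `sync_zone` function, the injected `fetch_serial` and `fetch_export` callables, and the `<zone>.zone` file naming are assumptions made for illustration; only the export endpoint and the `knotc zone-reload` call come from this RFC. The sketch also treats any serial difference as a change rather than applying RFC 1982 serial arithmetic.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def knotc_reload(zone: str) -> None:
    """Ask Knot DNS to reload one zone from disk."""
    subprocess.run(["knotc", "zone-reload", zone], check=True)


def sync_zone(zone, last_serial, fetch_serial, fetch_export, zone_dir,
              reload=knotc_reload):
    """Sync one zone if Herald's serial differs from the last synced one.

    fetch_serial(zone) and fetch_export(zone) are hypothetical HTTP helpers;
    the latter would wrap GET /api/v1/zones/{zone}/export.
    Returns the serial that is now on disk.
    """
    remote_serial = fetch_serial(zone)
    if remote_serial == last_serial:
        return last_serial  # nothing changed, skip the download

    zone_text = fetch_export(zone)
    target = Path(zone_dir) / f"{zone}.zone"  # assumed naming scheme

    # Write to a temporary file in the same directory, then rename, so that
    # Knot never sees a half-written zone file.
    fd, tmp_name = tempfile.mkstemp(dir=zone_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        tmp.write(zone_text)
    os.replace(tmp_name, target)

    reload(zone)
    return remote_serial
```

Passing `reload` as a parameter lets the agent be exercised in tests without a running Knot instance.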
Decision 8 — PoP health monitoring: Herald polls koder-dns-sync /healthz
**Chosen:** Each `koder-dns-sync` agent exposes `GET /healthz` returning last-sync serial, uptime, and zone count. Herald polls this endpoint every 60 seconds. If a PoP misses 3 consecutive polls, Herald raises an alert.
**Rationale:** Simple and decoupled. The sync agent already has all the information needed (last synced serial, whether the `knotc` reload succeeded). Herald does not need to query the DNS port (UDP 53) for health; that is the job of external monitoring (e.g. DNScheck, uptime monitoring).
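The "3 consecutive missed polls" rule can be expressed as a small per-PoP state machine. The sketch below is one possible shape, not Herald's actual code: the `PopHealth` class and its fields are hypothetical, and raising the alert only once per outage (rather than on every subsequent miss) is an assumption.

```python
from dataclasses import dataclass

MISSED_POLLS_BEFORE_ALERT = 3  # from Decision 8


@dataclass
class PopHealth:
    """Tracks consecutive failed /healthz polls for one PoP."""
    name: str
    consecutive_misses: int = 0
    alerting: bool = False

    def record_poll(self, ok: bool) -> bool:
        """Record one poll result; return True when an alert should fire now."""
        if ok:
            # Any successful poll clears the outage state.
            self.consecutive_misses = 0
            self.alerting = False
            return False
        self.consecutive_misses += 1
        if (self.consecutive_misses >= MISSED_POLLS_BEFORE_ALERT
                and not self.alerting):
            self.alerting = True
            return True
        return False
```

With a 60-second poll interval, this raises an alert roughly three minutes after a PoP stops responding.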
4. Implementation Sequence
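Tickets are listed by number. Where a ticket depends on a later-numbered one (for example, #013 depends on #014), the dependency column determines the implementation order: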
| # | Ticket | Depends on | Phase 1 milestone |
|---|---|---|---|
| #011 | Production deployment of koder-herald | Phase 0 complete | Engine live |
| #012 | Koder ID JWT authentication | #011, Koder ID live | Secure API |
| #013 | DNSSEC key management (Knot DNS keymgr) | #014 | Signed zones |
| #014 | First PoP deployment — São Paulo (Vultr) | #011 | First PoP live |
| #015 | Knot DNS + mod-geoip at PoP | #014 | GeoDNS serving |
| #016 | Secondary DNS slave zone support | #014 | External secondaries |
| #017 | apps/domains integration | #012 | Customer onboarding |
| #018 | Customer billing model (zones + queries) | #017 | Revenue-ready |
| #019 | PoP heartbeat + multi-PoP failover | #014 | Operational monitoring |
| #020 | Terraform/OpenTofu provider | #012 | Infra |
| #021 | SOA and NS customisation | #014 | Custom nameservers |
| #022 | Rate limiting per tenant | #011 | Abuse protection |
| #023 | Zone locking and change review | #012 | Ops safety |
| #024 | DNSSEC key rollover automation | #013 | Key hygiene |
| #025 | Weighted round-robin routing | #015 | Traffic distribution |
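Because the table is ordered by ticket number rather than strictly by dependency, a valid implementation order can be derived mechanically from the "Depends on" column. The sketch below uses Python's standard `graphlib` module for this; the dictionary is transcribed from the table, with external prerequisites ("Phase 0 complete", "Koder ID live") left out, and `implementation_order` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# Ticket number -> ticket numbers it depends on, transcribed from the table.
DEPENDS_ON = {
    11: [], 12: [11], 13: [14], 14: [11], 15: [14],
    16: [14], 17: [12], 18: [17], 19: [14], 20: [12],
    21: [14], 22: [11], 23: [12], 24: [13], 25: [15],
}


def implementation_order(depends_on):
    """Return one order in which every ticket comes after its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    print(" -> ".join(f"#{ticket:03d}" for ticket in implementation_order(DEPENDS_ON)))
```

Several orders are valid; any of them starts with #011 and places #014 before #013.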
5. Open Questions
- **Vultr BGP or own ASN?** Confirmed: Vultr managed anycast for Phase 1. Own ASN/BGP is Phase 2 (RFC-003).
- **How many PoPs in Phase 1?** Start with 1 (São Paulo). Add a 2nd PoP (Miami or Amsterdam) once the first is stable and there is demand from outside Brazil. Target: 2 PoPs by end of Phase 1.
- **kdb-next timeline:** Phase 1 stays on current kdb. If kdb-next reaches production before Phase 1 ends, evaluate migration opportunistically.
6. Success Criteria
- Koder Herald engine running in production on s.k.lin (or dedicated VM)
- São Paulo PoP live with Knot DNS serving authoritative DNS via managed anycast IP
- `koder.dev` and all Koder-operated domains migrated off ClouDNS to Koder Herald
- JWT-authenticated public API with Swagger/OpenAPI spec published
- Minimum 3 paying external customers onboarded via apps/domains
- p95 query latency < 10 ms from São Paulo (measured via DNSperf or Catchpoint)
- Zero DNSSEC validation failures in production monitoring (90 days)
- Terraform provider published on Terraform Registry