RFC-001: Koder DNS Platform, authoritative DNS-as-a-service for the Koder Stack
| Field | Value |
|---|---|
| Status | **Accepted** (2026) |
| Author(s) | Rodrigo (with Claude as scribe) |
| Date | 2026 |
| Target module | platform/dns/ |
| Related | apps/domains/ tickets #007–#012 |
1. Summary
This RFC defines the architecture and phased roadmap for **Koder DNS**, an authoritative DNS-as-a-service platform built into the Koder Stack. The goal is to give Koder customers (and the Koder infrastructure itself) a fully managed DNS service with advanced features (GeoDNS, failover, DDNS, zone analytics, and secondary DNS), backed by a global anycast network operated by Koder.
The project is divided into three phases:
- **Phase 0** (software-only): advanced DNS logic (failover, GeoDNS, DDNS, statistics) delivered on top of existing DNS providers (ClouDNS, Porkbun). Zero infrastructure investment. Shipped via `apps/domains`.
- **Phase 1** (own anycast): Koder acquires an ASN and a PI address block, stands up 3–5 PoPs in strategic regions, and begins serving authoritative DNS with its own anycast network.
- **Phase 2** (global expansion): 15+ PoPs across all continents, full competitive parity with ClouDNS, Cloudflare DNS, and Route53.
2. Goals
- **DNS-as-a-service for Koder customers**: any Koder customer can delegate their domain's DNS to the Koder DNS platform and manage records via `apps/domains` or the API.
- **Global low-latency resolution**: via anycast, queries resolve at the nearest PoP. Target: p95 < 10 ms for South America, Europe, and North America by end of Phase 1.
- **Advanced routing**: GeoDNS (return different IPs per region/country), failover (auto-switch on server down), and weighted round-robin.
- **Operational self-sufficiency**: the Koder infrastructure itself migrates off ClouDNS to its own DNS platform, eliminating a third-party dependency for a core service.
- **Developer-grade API**: full CRUD for all record types, zone file import/export, DDNS endpoint, and Terraform/OpenTofu provider.
- **Observability**: per-zone query analytics, latency per PoP, anomaly detection.
3. Non-goals
- **Recursive DNS resolver**: Koder DNS is authoritative only. It answers for zones it is authoritative for; it does not resolve arbitrary queries on behalf of end users.
- **Registrar services**: domain registration stays in `apps/domains` via Porkbun/Dynadot integrations. Koder DNS is not a registrar.
- **DDoS scrubbing**: traffic scrubbing for volumetric attacks is out of scope for Phase 1 and 2. Anycast inherently distributes attack traffic across PoPs, which provides partial mitigation, but dedicated scrubbing is a separate workstream.
- **DNSSEC signing in Phase 0**: DNSSEC is planned for Phase 1 once we control the authoritative servers.
- **Full Cloudflare parity**: Cloudflare has 300+ PoPs and a WAF, CDN, and Workers platform built on their anycast network. That is a decade of infrastructure investment. This RFC scopes to DNS specifically.
4. Background
Today the Koder Stack uses **ClouDNS** as its DNS provider. ClouDNS hosts the koder.dev zone and provides a DNS management API that is used by apps/domains and by automation scripts (ACME DNS-01, deployment hooks).
ClouDNS is reliable and inexpensive (~$20/month), but it represents an external dependency for a service that is foundational to every Koder product. More importantly, as Koder grows into a platform company, DNS becomes a product in its own right — one that generates revenue, deepens the platform lock-in, and differentiates the Koder Stack from competitors.
The features that matter most for customers — GeoDNS, failover, DDNS — can be partially delivered in software today (Phase 0) and fully delivered with own infrastructure later (Phase 1+).
5. Architecture overview
    ┌─────────────────────────────────────────────────────────────┐
    │                     Koder DNS Platform                      │
    │                                                             │
    │  ┌─────────────────┐      ┌──────────────────────────────┐  │
    │  │  apps/domains   │      │  platform/dns (engine)       │  │
    │  │ (management UI) │─────▶│                              │  │
    │  └─────────────────┘      │  - Zone management API       │  │
    │                           │  - GeoDNS routing rules      │  │
    │  ┌─────────────────┐      │  - Failover monitor          │  │
    │  │  REST/gRPC API  │─────▶│  - DDNS endpoint             │  │
    │  │   (external)    │      │  - Analytics collector       │  │
    │  └─────────────────┘      │  - Zone file import/export   │  │
    │                           └──────────┬───────────────────┘  │
    └──────────────────────────────────────┼──────────────────────┘
                                           │
                  ┌────────────────────────┼──────────────────────┐
                  │ Phase 0                │ Phase 1+             │
                  │ (provider APIs)        │ (own anycast)        │
                  │                        │                      │
             ┌────▼────┐  ┌────────┐   ┌───▼──────────────────┐   │
             │ ClouDNS │  │Porkbun │   │ Koder Anycast Network│   │
             │   API   │  │  API   │   │ (PoPs: GRU, IAD, FRA)│   │
             └─────────┘  └────────┘   └──────────────────────┘   │

The engine (`platform/dns`) is the single source of truth for zone state. In Phase 0 it synchronizes that state to external providers via their APIs. In Phase 1+ it synchronizes to its own authoritative name servers at each PoP.
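One way to picture the synchronization step is as a diff between the records the engine wants and the records a provider currently holds. The following is a minimal sketch under that assumption; the `Record` type and the `diff` function are hypothetical names used for illustration and are not part of `platform/dns`.

```go
package main

import "fmt"

// Record is a hypothetical, simplified DNS record used only in this sketch.
type Record struct {
	Name  string
	Type  string
	Value string
	TTL   int
}

// diff returns the records to create on, and remove from, a provider so that
// its state matches the engine's desired state. A changed value or TTL shows
// up as one removal plus one creation.
func diff(desired, actual []Record) (create, remove []Record) {
	want := make(map[Record]bool, len(desired))
	for _, r := range desired {
		want[r] = true
	}
	have := make(map[Record]bool, len(actual))
	for _, r := range actual {
		have[r] = true
		if !want[r] {
			remove = append(remove, r)
		}
	}
	for _, r := range desired {
		if !have[r] {
			create = append(create, r)
		}
	}
	return create, remove
}

func main() {
	desired := []Record{
		{"www.example.com.", "A", "192.0.2.10", 300},
		{"api.example.com.", "A", "192.0.2.20", 300},
	}
	actual := []Record{
		{"www.example.com.", "A", "192.0.2.10", 300},
		{"old.example.com.", "A", "192.0.2.99", 300},
	}
	create, remove := diff(desired, actual)
	fmt.Println("create:", create)
	fmt.Println("remove:", remove)
}
```

Keeping the diff independent of any provider would let the same logic drive both the Phase 0 provider adapters and the Phase 1+ PoP agents, with only the apply step differing.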
6. Phase 0 — Software-only (no infrastructure investment)
**Objective:** deliver the highest-value DNS features immediately, on top of existing DNS providers, with zero infrastructure cost.
6.1 Components
All logic lives in platform/dns (engine) and is surfaced via apps/domains (UI) and a REST API.
| Feature | Implementation |
|---|---|
| Failover DNS | Health check poller (HTTP/TCP/ICMP) + provider API to swap records on failure |
| GeoDNS (approx.) | Push region-based records to ClouDNS GeoDNS API; UI to define rules |
| DDNS | Scoped token endpoint; updates A/AAAA via provider API |
| Zone file import | BIND parser → bulk create via provider API |
| Zone file export | Read records from provider API → generate BIND zone file |
| DNS statistics | Pull query stats from ClouDNS API; store in time-series DB |
| Secondary DNS | Configure slave zones via ClouDNS API |
6.2 Failover engine detail
    Health check loop (per monitored endpoint):
      every N seconds → probe primary IP (HTTP 200? TCP connect? ICMP alive?)
      on failure:
        wait M seconds (anti-flap)
        if still down: call provider API → update A record to fallback IP
                       emit notification (Koder Talk / email / webhook)
      on recovery:
        wait recovery_delay
        call provider API → restore primary IP
        emit notification

The poller runs as a goroutine pool inside the `platform/dns` server. State (current active IP per monitored record) is stored in kdb.
6.3 Acceptance criteria
- Failover detects a down server within 2× the configured check interval and
switches the record within 30 seconds of detection.
- GeoDNS rules defined in the UI are reflected in ClouDNS within 60 seconds.
- DDNS endpoint updates a record within 5 seconds of receiving a valid
request.
- Zone file round-trip (export → import on a fresh zone) produces identical
records.
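The round-trip criterion can be checked mechanically. The sketch below assumes a deliberately restricted zone format (one record per line, explicit TTL and class) rather than full BIND syntax; `RR`, `exportZone`, and `importZone` are hypothetical names, and a real parser would also need to handle `$ORIGIN`, `$TTL`, comments, and multi-line records.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// RR is a hypothetical, simplified resource record.
type RR struct {
	Name string
	TTL  int
	Type string
	Data string
}

// exportZone writes one record per line: "name ttl IN type rdata".
func exportZone(rrs []RR) string {
	var b strings.Builder
	for _, r := range rrs {
		fmt.Fprintf(&b, "%s %d IN %s %s\n", r.Name, r.TTL, r.Type, r.Data)
	}
	return b.String()
}

// importZone parses the restricted format produced by exportZone.
func importZone(zone string) ([]RR, error) {
	var rrs []RR
	for _, line := range strings.Split(zone, "\n") {
		f := strings.Fields(line)
		if len(f) == 0 {
			continue // skip blank lines
		}
		if len(f) < 5 || f[2] != "IN" {
			return nil, fmt.Errorf("unsupported line: %q", line)
		}
		ttl, err := strconv.Atoi(f[1])
		if err != nil {
			return nil, fmt.Errorf("bad TTL in %q: %w", line, err)
		}
		rrs = append(rrs, RR{f[0], ttl, f[3], strings.Join(f[4:], " ")})
	}
	return rrs, nil
}

func main() {
	orig := []RR{
		{"example.com.", 3600, "MX", "10 mail.example.com."},
		{"www.example.com.", 300, "A", "192.0.2.10"},
	}
	back, err := importZone(exportZone(orig))
	fmt.Println(err == nil && exportZone(back) == exportZone(orig))
}
```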
7. Phase 1 — Own anycast infrastructure (first PoPs)
**Objective:** operate Koder's own authoritative DNS servers, served via anycast, eliminating the dependency on ClouDNS for koder.dev and offering the platform to customers.
7.1 Infrastructure prerequisites
**Anycast strategy (Phase 1): managed anycast.** Use a provider that already has a global anycast network and leases it as a service (Vultr BGP, Hivelocity Anycast, or Cloudflare Spectrum). This avoids operating BGP sessions, IXP memberships, and a NOC before there is paying customer demand. Own BGP infrastructure is deferred to Phase 2 (§8), at which point Koder registers its own ASN and PI block at LACNIC.
| Item | Description | Procurement path |
|---|---|---|
| Managed anycast IPs | Lease anycast-capable IPs from provider | Vultr BGP / Hivelocity |
| PoP — GRU (São Paulo) | VPS or dedicated with anycast IP at São Paulo | Vultr São Paulo / Hivelocity |
| PoP — IAD (Ashburn) | VPS or dedicated with anycast IP at Ashburn | Vultr Ashburn / Hivelocity |
| PoP — FRA (Frankfurt) | VPS or dedicated with anycast IP at Frankfurt | Hetzner Frankfurt + BGP addon |
7.2 Software stack at each PoP
Phase 1 uses **managed anycast**: the provider (Vultr/Hivelocity/Hetzner) handles BGP announcement. Koder operates only the DNS software layer.
Each PoP runs:
    ┌──────────────────────────────────────┐
    │        Koder Herald PoP node         │
    │                                      │
    │  ┌─────────────────┐                 │
    │  │ Knot DNS        │ ← authoritative │
    │  │ (zone serving)  │   name server   │
    │  └────────┬────────┘                 │
    │           │ zone sync                │
    │  ┌────────▼────────┐                 │
    │  │ koder-dns-sync  │ ← pulls zones   │
    │  │ (agent)         │   from engine   │
    │  └─────────────────┘                 │
    │                                      │
    │  Anycast IP: leased from provider    │
    │  BGP announcement: provider-managed  │
    └──────────────────────────────────────┘

- **Knot DNS**: high-performance authoritative name server (C, handles millions of queries/second per core). Chosen for built-in `mod-geoip` and rate limiting modules (see §12, decision #3). Receives zone updates from `koder-dns-sync`.
- **koder-dns-sync**: a lightweight Go agent (part of `platform/dns`) that polls the DNS engine for zone changes and pushes them to the local Knot instance via AXFR or incremental API.

**Phase 2 note:** own BGP (BIRD 2, ASN, PI /24 block via LACNIC) replaces the managed anycast layer in Phase 2, adding full traffic engineering and IXP peering without changing the Knot DNS + koder-dns-sync software stack.
7.3 Zone distribution flow
    User edits record in apps/domains
      → platform/dns engine stores in kdb
      → increments zone serial
      → koder-dns-sync agents at each PoP poll for serial changes
      → pull updated zone → reload Knot
      → TTL governs how long resolvers cache the old answer

Target propagation: record change visible at all PoPs within 30 seconds.
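The "poll for serial changes" step depends on comparing zone serials correctly, including across 32-bit wrap-around. The sketch below uses the serial number arithmetic from RFC 1982; the function names and the map-based inputs are assumptions for illustration, not the agent's actual interface.

```go
package main

import (
	"fmt"
	"sort"
)

// serialNewer reports whether serial a is newer than serial b, using the
// wrap-around comparison defined in RFC 1982 (serial number arithmetic).
func serialNewer(a, b uint32) bool {
	return int32(a-b) > 0
}

// zonesToPull compares the serials a PoP agent has loaded with the serials
// reported by the engine and returns the zones that need a transfer.
func zonesToPull(local, engine map[string]uint32) []string {
	var out []string
	for zone, serial := range engine {
		if have, ok := local[zone]; !ok || serialNewer(serial, have) {
			out = append(out, zone)
		}
	}
	sort.Strings(out) // deterministic order for logging and tests
	return out
}

func main() {
	local := map[string]uint32{"example.com.": 2026010101, "example.org.": 7}
	engine := map[string]uint32{"example.com.": 2026010102, "example.org.": 7, "example.net.": 1}
	fmt.Println(zonesToPull(local, engine))
	fmt.Println(serialNewer(1, 4294967295)) // wrapped serial still counts as newer
}
```

A plain `a > b` comparison would treat a wrapped serial as older and leave the PoP serving stale data, which is why the signed-difference form is used here.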
7.4 GeoDNS with own infrastructure
With own anycast, true GeoDNS works as follows:
- Each PoP announces the same anycast IP.
- BGP routing ensures resolvers query the nearest PoP.
- The PoP serves the answer for the resolver's region.
- Knot supports GeoIP-based answers natively through its built-in `mod-geoip` module.
- GeoIP database: MaxMind GeoLite2 (free) or GeoIP2 City (paid, higher accuracy). Updated weekly via automated job.
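To make the rule precedence concrete, the sketch below evaluates geographic rules once a resolver's country and continent are known. It illustrates the kind of logic the GeoDNS rule engine might store and evaluate. It is not Knot configuration, and the `GeoRule` shape, the `resolve` function, and the most-specific-first ordering are assumptions. The GeoIP lookup itself is assumed to happen elsewhere.

```go
package main

import "fmt"

// GeoRule maps a location to the answer to return. Hypothetical shape.
type GeoRule struct {
	Country   string // ISO 3166-1 alpha-2 code, e.g. "BR"; empty if unset
	Continent string // two-letter continent code, e.g. "SA"; empty if unset
	Answer    string
}

// resolve picks the most specific matching rule: country first, then
// continent, then the default answer.
func resolve(rules []GeoRule, country, continent, def string) string {
	for _, r := range rules {
		if r.Country != "" && r.Country == country {
			return r.Answer
		}
	}
	for _, r := range rules {
		if r.Country == "" && r.Continent != "" && r.Continent == continent {
			return r.Answer
		}
	}
	return def
}

func main() {
	rules := []GeoRule{
		{Continent: "EU", Answer: "198.51.100.3"},
		{Country: "BR", Answer: "198.51.100.1"},
	}
	fmt.Println(resolve(rules, "BR", "SA", "198.51.100.9"))
	fmt.Println(resolve(rules, "DE", "EU", "198.51.100.9"))
	fmt.Println(resolve(rules, "JP", "AS", "198.51.100.9"))
}
```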
7.5 Acceptance criteria
- `koder.dev` resolves from GRU, IAD, and FRA PoPs with p95 latency < 5 ms from each respective region.
- A record change propagates to all PoPs within 30 seconds.
- Failure of a single PoP is transparent to end users (BGP withdraws the
announcement; traffic reroutes to nearest remaining PoP automatically).
- Koder DNS is serving at least 10 external customer zones.
8. Phase 2 — Global expansion
**Objective:** reach 15+ PoPs across all continents, achieving sub-10 ms resolution worldwide and full competitive parity with ClouDNS.
8.1 Target PoP map
| Region | City | Priority |
|---|---|---|
| South America | São Paulo (GRU) | Phase 1 |
| North America East | Ashburn (IAD) | Phase 1 |
| Europe West | Frankfurt (FRA) | Phase 1 |
| North America West | Los Angeles (LAX) | Phase 2 |
| Europe North | Amsterdam (AMS) | Phase 2 |
| Asia Pacific | Singapore (SIN) | Phase 2 |
| Asia Pacific | Tokyo (NRT) | Phase 2 |
| Australia | Sydney (SYD) | Phase 2 |
| South America | Buenos Aires (EZE) | Phase 2 |
| Africa | Johannesburg (JNB) | Phase 2 |
| Middle East | Dubai (DXB) | Phase 2 |
| Europe South | Madrid (MAD) | Phase 2 |
| North America Central | Chicago (ORD) | Phase 2 |
| Asia | Mumbai (BOM) | Phase 2 |
| Europe East | Warsaw (WAW) | Phase 2 |
8.2 Additional Phase 2 features
- **DNSSEC**: sign zones at the authoritative server; publish DS records at registrar.
- **DNS over HTTPS (DoH) and DNS over TLS (DoT)**: encrypted resolver endpoints for privacy-conscious clients.
- **Terraform / OpenTofu provider**: `koder/dns` provider for infrastructure-as-code.
- **Webhook on record change**: notify external systems when DNS state changes.
- **Rate limiting and abuse protection**: per-source IP query rate limiting at the PoP level to mitigate amplification attacks.
- **SLA dashboard**: public uptime page per PoP, per zone.
9. Engine module: platform/dns
9.1 Responsibilities
- **Zone store**: authoritative state of all zones and records (backed by kdb).
- **Provider sync**: for Phase 0, push zone changes to ClouDNS/Porkbun APIs. For Phase 1+, push to PoP agents via zone transfer.
- **Failover engine**: health check poller + automatic record switching.
- **GeoDNS rule engine**: store and evaluate geographic routing rules.
- **DDNS service**: token-gated endpoint to update A/AAAA records.
- **Analytics ingestion**: receive query stats from PoP agents; store in time-series DB.
- **Zone file I/O**: BIND zone file parser and serializer.
- **Management API**: REST + gRPC, consumed by `apps/domains` and external clients.
9.2 Tech stack
- **Language**: Go (consistent with `platform/kmail`, `platform/raven`, `platform/id`)
- **Storage**: kdb (zone records, failover state, GeoDNS rules)
- **Queue**: for async zone sync jobs
- **Observability**: OpenTelemetry traces + metrics, exported to `observe/` stack
9.3 API surface (draft)
    # Zones
    GET    /api/v1/zones
    POST   /api/v1/zones
    GET    /api/v1/zones/{zone}
    DELETE /api/v1/zones/{zone}

    # Records
    GET    /api/v1/zones/{zone}/records
    POST   /api/v1/zones/{zone}/records
    PUT    /api/v1/zones/{zone}/records/{id}
    DELETE /api/v1/zones/{zone}/records/{id}

    # Zone files
    GET    /api/v1/zones/{zone}/export   → BIND zone file
    POST   /api/v1/zones/{zone}/import   ← BIND zone file

    # Failover
    GET    /api/v1/zones/{zone}/monitors
    POST   /api/v1/zones/{zone}/monitors
    PUT    /api/v1/zones/{zone}/monitors/{id}
    DELETE /api/v1/zones/{zone}/monitors/{id}

    # GeoDNS
    GET    /api/v1/zones/{zone}/records/{id}/geo-rules
    POST   /api/v1/zones/{zone}/records/{id}/geo-rules
    DELETE /api/v1/zones/{zone}/records/{id}/geo-rules/{id}

    # DDNS
    POST   /api/v1/ddns/update?token=&hostname=&ip=

    # Analytics
    GET    /api/v1/zones/{zone}/stats?from=&to=&resolution=

10. Infrastructure costs (estimates)
Phase 0
| Item | Cost |
|---|---|
| ClouDNS (keep) | ~$20/month |
| Engineering | (internal) |
| **Total** | **~$20/month** |
Phase 1 (3 PoPs)
| Item | Cost/month (est.) |
|---|---|
| ASN + PI /24 (LACNIC) | R$70 ($13) |
| PoP GRU (1U colo or VPS BGP) | ~$80–150 |
| PoP IAD (Vultr BGP or colo) | ~$80–150 |
| PoP FRA (Hetzner dedicated) | ~$80–150 |
| IXP memberships (PTT.br etc) | ~$30–80 |
| **Total** | **~$300–550/month** |
Phase 2 (15 PoPs)
Estimated $1,500–3,000/month depending on colocation vs. cloud BGP mix. At that scale, DNS-as-a-service revenue from customers should cover costs.
11. Relationship to existing modules
| Module | Relationship |
|---|---|
| `apps/domains` | Primary UI for managing zones and records. Tickets #007–#012 are Phase 0 deliverables |
| `platform/raven` | Raven provisions MX + SPF + DKIM records for email tenants; will call `platform/dns` API |
| `platform/id` | ID service provisions DNS records for new tenant subdomains |
| `infra/` | PoP node configs, BIRD BGP templates, NSD/Knot configs |
| `observe/` | PoP health metrics, query rate dashboards, latency per region |
| `platform/kdb` | Primary storage for zone state, failover state, analytics |
12. Decisions
| # | Question | Decision | Rationale |
|---|---|---|---|
| 1 | Product name | **Herald** (`platform/dns` engine · `apps/herald` product), Brand 84 (Great) | Evaluated 10 candidates; atlas/nexus/signal eliminated by collision; herald leads (84), followed by relay (81) and resolve (75). "Herald" = the one who announces/delivers: a direct DNS metaphor with no major tech collision |
| 2 | Anycast strategy for Phase 1 | **Managed anycast first** (Vultr BGP or Hivelocity), own BGP deferred to Phase 2 | Faster time to market; avoids operating BGP sessions, IXP memberships, and a NOC before there is paying customer demand (see §7.1) |
| 3 | Authoritative name server | **Knot DNS** | Built-in `mod-geoip` and rate limiting modules (see §7.2) |
| 4 | GeoIP data provider | **MaxMind GeoLite2** to start; evaluate upgrade to GeoIP2 City when Phase 1 has >50 customer zones | GeoLite2 is accurate enough for continent-level routing |
| 5 | RIR for ASN and PI block | **LACNIC** | Koder is a Brazilian company; LACNIC is the RIR for Latin America and the Caribbean. ARIN is for North America, so there is no justification to register there first |
All questions resolved
All 5 open questions are now closed. See table above.
13. Backlog summary
platform/dns/backlog/ (new tickets — engine)
- #001 Zone store and management API (CRUD + kdb backend)
- #002 Provider sync adapter (ClouDNS + Porkbun) — Phase 0 bridge
- #003 Failover health check engine
- #004 GeoDNS rule engine
- #005 DDNS token service
- #006 Zone file import/export (BIND)
- #007 Analytics ingestion and time-series storage
- #008 koder-dns-sync agent (PoP agent for Phase 1)
- #009 BIRD BGP config templates and PoP provisioning runbook
- #010 Knot DNS integration and GeoIP module setup
apps/domains/backlog/ (already created — tickets #007–#012)
- #007 Failover DNS UI
- #008 GeoDNS UI
- #009 DDNS management
- #010 DNS statistics dashboard
- #011 Secondary DNS / zone transfer UI
- #012 Zone file import/export UI
infra/backlog/ (infrastructure)
- ASN registration (LACNIC)
- PI /24 IPv4 + /48 IPv6 block (LACNIC)
- PoP GRU colocation contract
- PoP IAD colocation/BGP contract
- PoP FRA colocation/BGP contract
- IXP memberships (PTT.br, Equinix IX, DE-CIX)