Id

Koder ID

  • Area: Foundation
  • Path: services/foundation/id/engine (canonical; legacy platform/ symlinked per RFC-003)
  • Kind: Identity provider (OIDC / OAuth2 / SAML), custom Go microservice
  • Version: 0.9.3
  • Status: In production at id.koder.dev since 2026-04-09

Role in the stack

Koder ID is the centralized identity layer for every Koder product. Every product that needs authentication redirects to https://id.koder.dev/oauth/v2/authorize, gets an OIDC token back, and uses it to authorize the user. Products never implement their own login flow — they consume tokens minted by ID.

The implementation is a from-scratch Go IAM (not Zitadel). It targets hyperscale tenant density (kdbnext substrate) and is the production identity provider. The codebase lives in services/foundation/id/platform/.

Primary couplings

| Consumer | Relationship |
|---|---|
| foundation/flow | Uses Koder ID as OIDC provider for Koder Flow git forge login |
| foundation/kompass | Organizations and tenants scoped via Koder ID |
| apps/pass / foundation/pass | Companion app for passkeys and identity management |
| products/horizontal/chat/engine | dev.koder.chat OIDC client; exchanges Koder ID tokens for Chat sessions |
| products/vertical/lex/app | dev.koder.lex OIDC client; PKCE flow for legal app auth |
| Every Koder SaaS product | OIDC client; redirects to ID for login |

Auth Flow — Handle Lookup

CreateFlow resolves the login identifier in two steps: first by email (GetUserByEmail), then by handle/username (GetUserByHandle) if the email lookup fails. Both strategies produce the same flow.UserID. If both fail, the flow is still created, with an empty UserID, and fails generically at the password step (timing-safe: the caller never learns whether the account exists).

The by_handle secondary index on identity_users (IndexID 2) was added alongside the existing by_email index (IndexID 1). The gRPC method GetUserByHandle was added to IdentityService.
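The two-step resolution can be sketched as below. The userLookup interface and the in-memory memLookup are hypothetical stand-ins for the IdentityService client; only the method names GetUserByEmail and GetUserByHandle come from the text, and their signatures here are assumed.

```go
package main

import (
	"errors"
	"fmt"
)

var errNotFound = errors.New("user not found")

// userLookup is a hypothetical stand-in for the IdentityService client.
type userLookup interface {
	GetUserByEmail(identifier string) (userID string, err error)
	GetUserByHandle(identifier string) (userID string, err error)
}

// resolveUserID tries email first, then handle. An empty result means the
// flow is created without a user and fails later at the password step.
func resolveUserID(l userLookup, identifier string) string {
	if id, err := l.GetUserByEmail(identifier); err == nil {
		return id
	}
	if id, err := l.GetUserByHandle(identifier); err == nil {
		return id
	}
	return ""
}

// memLookup is an in-memory implementation used only for this example.
type memLookup struct {
	byEmail  map[string]string
	byHandle map[string]string
}

func (m memLookup) GetUserByEmail(identifier string) (string, error) {
	if id, ok := m.byEmail[identifier]; ok {
		return id, nil
	}
	return "", errNotFound
}

func (m memLookup) GetUserByHandle(identifier string) (string, error) {
	if id, ok := m.byHandle[identifier]; ok {
		return id, nil
	}
	return "", errNotFound
}

func main() {
	l := memLookup{
		byEmail:  map[string]string{"ana@example.com": "u-1"},
		byHandle: map[string]string{"ana": "u-1"},
	}
	for _, id := range []string{"ana@example.com", "ana", "nobody"} {
		fmt.Printf("%q -> %q\n", id, resolveUserID(l, id))
	}
}
```

Returning an empty ID rather than an error keeps the caller-visible behavior identical for known and unknown identifiers.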

Interfaces

  • OIDC Discovery: https://id.koder.dev/.well-known/openid-configuration; claims_supported includes tenant (added 2026-05-18, engine#069) for RP-side tenant validation
  • Authorization: GET /oauth/v2/authorize
  • Token: POST /oauth/v2/token — id_token carries tenant claim matching AccessTokenClaims.TenantID
  • UserInfo: GET /api/v1/me — returns sub, email, full_name, preferred_username
  • Admin gRPC: RegisterClientRequest accepts optional desired_client_id (field 7) for first-party clients with stable reverse-domain IDs
  • Migration endpoint: POST /v1/admin/migrate/oauth-client (requires KODER_ID_MIGRATION_MODE=true) — registers clients with specific IDs
  • Admin REST + CLI surface fully documented in engine/docs/rfcs/RFC-007-admin-service-and-cli.md (revised 2026-05-18 to reflect shipped reality from engine#073)

Self-conformance to oauth-flow.kmd (R1.E1)

Koder ID's own administration UIs (engine/account-ui/, engine/admin-ui/) ride a cookie session served directly by the engine; they do NOT perform OAuth round-trips against the same issuer. This is the R1.E1 exception, ratified 2026-05-18 in specs/auth/oauth-flow.kmd and tracked in registries/koder-id-auth-coverage.md as SKIP (R1.E1). External RPs (Flow, Kall, Dek, …) remain subject to the full R1+R6 + T-suite.

GeoIP enrichment (tickets 075/083/084)

pkg/geoip ships a Resolver interface with two implementations: NoOpResolver (private/unknown short-circuit) and MaxMindResolver (reads GeoLite2-City.mmdb via github.com/oschwald/maxminddb-golang). Both the auth and sso services consume pkg/geoip, which was moved out of services/auth/internal/geoip into the shared package in #084 to anticipate the ASN follow-ups (#077/#085) and other future consumers.

Wiring: both services/auth/cmd/main.go and services/sso/cmd/main.go read env KODER_ID_GEOIP_DB (default empty → NoOpResolver) and pass the resolver to AuthService.SetGeoResolver / SSOService.SetGeoResolver. Schema columns shipped on auth_events (migrations 17-23) and sso_sessions (migrations 8-14): country code + name + city + lat + lon + source + lookup_at. The id_token, the discovery document, and the tenant claim are unaffected; these fields live on persisted events/sessions only.

ASN / network-type enrichment (tickets 077/085)

pkg/asn mirrors the pkg/geoip design: Resolver interface + NoOpResolver + MaxMindResolver (GeoLite2-ASN / GeoIP2-ISP). The resolver layers a curated datacenter ASN table (pkg/asn/datacenter.go: AWS/GCP/Azure/DO/Hetzner/OVH/Linode/Vultr/Oracle/Alibaba/IBM/Tencent/Cloudflare/Fastly/Scaleway/Contabo/Leaseweb) and a curated VPN provider ASN table (Nord/Mullvad/Express/Proton/Surfshark/IPXO) on top of the raw lookup. The TorExitTracker pulls https://check.torproject.org/torbulkexitlist once a day and pushes the set into the resolver via SetTorExits (override before lookup).

Wiring: services/auth/cmd/main.go reads env KODER_ID_ASN_DB to load the .mmdb. KODER_ID_TOR_LIST=1 is an opt-in that starts the refresh goroutine. Schema columns shipped on auth_events (migrations 24-27): asn, asn_org, network_type, asn_lookup_at. network_type is a closed taxonomy: unknown / private / residential / mobile / datacenter / vpn / tor.

Risk scoring (per policies/multi-tenant-by-default.kmd + future #086) will compose ParsedASN.RiskWeight() (residential=0, vpn=20, datacenter=50, tor=80) with geo-anomaly and device-mismatch signals.
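The taxonomy and weights above can be sketched as follows. The four weights come from the text; the weight of 0 for unknown, private, and mobile, the ParsedASN fields, the classify helper, and the order in which the VPN and datacenter tables are consulted are all assumptions. The ASNs in the example are private-use numbers chosen for illustration, not real provider ASNs.

```go
package main

import "fmt"

// NetworkType is the closed network_type taxonomy.
type NetworkType string

const (
	NetUnknown     NetworkType = "unknown"
	NetPrivate     NetworkType = "private"
	NetResidential NetworkType = "residential"
	NetMobile      NetworkType = "mobile"
	NetDatacenter  NetworkType = "datacenter"
	NetVPN         NetworkType = "vpn"
	NetTor         NetworkType = "tor"
)

// ParsedASN is an assumed shape for the resolver result.
type ParsedASN struct {
	ASN         uint32
	Org         string
	NetworkType NetworkType
}

// RiskWeight returns the documented weights; other types default to 0
// (an assumption, the text does not list them).
func (p ParsedASN) RiskWeight() int {
	switch p.NetworkType {
	case NetVPN:
		return 20
	case NetDatacenter:
		return 50
	case NetTor:
		return 80
	default:
		return 0
	}
}

// classify applies the Tor override first, then the curated tables,
// then falls back to the type derived from the raw lookup.
func classify(ip string, asn uint32, base NetworkType,
	torExits map[string]bool, vpnASNs, dcASNs map[uint32]bool) NetworkType {
	switch {
	case torExits[ip]:
		return NetTor
	case vpnASNs[asn]:
		return NetVPN
	case dcASNs[asn]:
		return NetDatacenter
	default:
		return base
	}
}

func main() {
	tor := map[string]bool{"203.0.113.7": true}
	vpn := map[uint32]bool{64512: true} // private-use ASN, illustrative
	dc := map[uint32]bool{64513: true}  // private-use ASN, illustrative

	nt := classify("198.51.100.4", 64513, NetResidential, tor, vpn, dc)
	p := ParsedASN{ASN: 64513, Org: "Example Hosting", NetworkType: nt}
	fmt.Println(p.NetworkType, p.RiskWeight())
}
```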

Personal Access Tokens (#104)

PAT issuance lands as pat_tokens (migrations 38-40) + PATRepository + PATService (Issue/List/Revoke/Validate) + /v1/me/pats HTTP endpoints + koderidcli pat {create,list,revoke}. Tokens are formatted as kpat_<43 base64 chars> (32 bytes entropy), stored as SHA-256 hex with a unique index. Scope validation at issuance gates against pkg/scopes.WellKnownPATScopes (read:usage in Wave 1, more as consumers land). Plaintext is returned exactly once (copy-now UX in CLI/UI). The Flutter Settings UI lives in follow-up #106.
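The token format can be illustrated with the sketch below. It assumes unpadded URL-safe base64 (which yields exactly 43 characters for 32 bytes) and assumes the SHA-256 digest is taken over the full token including the kpat_ prefix; the text specifies neither detail. newPAT and hashPAT are hypothetical names.

```go
package main

import (
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// hashPAT returns the hex SHA-256 digest that would be stored and indexed.
func hashPAT(plaintext string) string {
	sum := sha256.Sum256([]byte(plaintext))
	return hex.EncodeToString(sum[:])
}

// newPAT mints a token with 32 bytes of entropy and returns the plaintext
// (shown to the user once) together with the digest to persist.
func newPAT() (plaintext, digest string, err error) {
	buf := make([]byte, 32)
	if _, err = rand.Read(buf); err != nil {
		return "", "", err
	}
	plaintext = "kpat_" + base64.RawURLEncoding.EncodeToString(buf)
	return plaintext, hashPAT(plaintext), nil
}

func main() {
	tok, digest, err := newPAT()
	if err != nil {
		panic(err)
	}
	fmt.Println(len(tok), len(digest))
}
```

Validation then hashes the presented token the same way and looks the digest up through the unique index, so the plaintext never needs to be stored.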

Admin erasure-replay endpoint (#098)

POST /admin/v1/erasure-replay triggers ErasureService.ReplayErasures in a goroutine and returns 202 + job_id. Companion to the env-gated boot replay (KODER_ID_ERASURE_REPLAY_ON_BOOT=1 from #094); same loopback-only posture as the rest of admin REST (LXC firewall is the enforcement layer until admin auth ratifies). Optional body {"reason": "post-restore 2026-05-14"} for audit.

Per-service observability (#102 closed + RFC-015)

Every sub-service exposes /health + /metrics via the shared pkg/observability helper (Wave 1 ratified 2026-05-16, Wave 2 covered 15 sub-services). RFC-015-observability.md (this batch) codifies the wiring pattern, response shape, and rollout map; the helper gains CheckKDB/CheckRedis/CheckPinger/CheckFunc dependency-check constructors so each service's main.go can wire realistic health checks in one line. The auth service is wired as the exemplar (observability.CheckKDB("kdb", kdbClient)). RED-style histograms + gRPC interceptor are tracked in engine#107; the admin metrics aggregator update is tracked in engine#108.

Wave 3b (2026-05-20, #113 + #114 closed): the auth + identity cmd/main.go now wrap the parent HTTP mux with observability.WrapHandler("", handler) so every route emits koder_id_http_request_duration_seconds{handler=<URL.Path>,method,status,quantile} samples. Their grpc.NewServer is constructed with grpc.UnaryInterceptor(observability.UnaryServerInterceptor()), which records koder_id_grpc_server_handled_seconds{grpc_service,grpc_method,grpc_code,quantile} into a sibling registry exposed by the same /metrics endpoint. The reservoir is capped at 1024 samples per (handler,method,status) and per (grpc_service,grpc_method,grpc_code) key (bounded cardinality). The interceptor's self-conformance test (grpc_interceptor_test.go) drives an in-process gRPC server through the standard health probe to assert sample emission for both OK and PermissionDenied codes.

OAuth Device Authorization Grant (RFC 8628, #116)

Headless CLI / AI / CI authentication landed 2026-05-20 (#116 phases 1-2, 4-5, 7-partial; phases 3 + 6 open):

  • Start endpoint: POST /oauth/v2/device_authorization mints a 32-byte URL-safe device_code + 8-char XXXXXXXX user_code from the BCP-47-safe alphabet (ABCDEFGHJKLMNPQRSTUVWXYZ23456789, no 0/O/1/I). Default lifetime 600s; recommended poll interval 5s.
  • Storage lives in oauth_device_authorizations (kdb1 table) / DeviceAuthorizationRow proto (kdbnext). The kdbnext repo uses Scan + filter for user_code lookups (the table is bounded; an index will be added later if performance requires it).
  • Token endpoint: POST /oauth/v2/token accepts both grant_type=urn:ietf:params:oauth:grant-type:device_code and the short alias device_code. It implements the RFC 8628 §3.5 polling protocol in service.ExchangeDeviceCode: authorization_pending / slow_down (back-to-back polls under interval/2) / access_denied / expired_token / invalid_grant (client_id mismatch). On success it mints access + id_token (15 min, no refresh yet) and deletes the row for single-use enforcement.
  • OIDC discovery advertises the new grant in grant_types_supported and exposes device_authorization_endpoint per RFC 8628 §4.
  • Verification page: GET/POST /oauth/v2/device is a vanilla HTML form (no template-engine dependency) that pre-fills the user_code from ?user_code= (matches verification_uri_complete) and accepts email + password + decision in a single submit. Allow → service.AuthorizeDevice; Deny → service.DenyDevice; it surfaces friendly errors for the unknown / expired / MFA-locked branches. The multi-step "if already logged in" flow is deferred until the oauth service gains session-cookie detection.
  • Hub CLI (products/dev/hub/cli/main.go::cmdLogin) now talks to https://id.koder.dev/oauth/v2/device_authorization directly (form-encoded, no Authorization header), polls /oauth/v2/token with proper slow_down / expired_token / access_denied / invalid_grant handling, and parses the returned id_token JWT to populate cfg.UserID (sub) + cfg.Username (name → email fallback). Override via the KODER_ID_URL env var for self-hosted Koder ID instances.
  • Seed client khub-cli is registered as a public client with device_code + refresh_token grants via engine/tools/seed-clients/main.go (the catalog now supports per-entry grantTypes overrides).
  • T-suite + handler tests:
    • services/oauth/internal/service/device_grant_test.go: T1-T8 (start fields, alphabet, pending, slow_down, success+delete, denied, expired, single-use) + client_id mismatch + unauthorized_client. 10/10 PASS.
    • services/oauth/internal/handler/device_grant_verification_test.go: 7 tests on the verification page (GET render, GET pre-fill, POST missing/unknown/expired user_code, no-auth-client, method-not-allowed). 7/7 PASS.

OAuth client_credentials grant — service accounts (#115)

Machine-to-machine auth path for CI / /k-ship / AI agents that need to publish without a human in the loop. The client_credentials grant is defined in OAuth 2.0 (RFC 6749) §4.4; this section documents the Koder ID-specific contracts that #115 hardened.

  • Confidential clients only. tokenClientCredentials rejects public clients up front: a public client without a registered secret cannot prove possession of the credential, so granting it a JWT would defeat the grant's purpose.
  • Constant-time secret compare via subtle.ConstantTimeCompare.
  • Grant-list enforcement. The client must list client_credentials in client.GrantTypes; otherwise the endpoint returns unauthorized_client (400). This catches the case where a client was provisioned for one grant and is being abused with another.
  • Scope subset check. Requested scopes must all be present in client.Scopes; an unregistered scope yields invalid_scope. An omitted scope parameter grants the full registered set.
  • Subject convention. JWTs carry sub=service-account:<client_id> so downstream middleware (notably the Hub's requireDev) can recognize service-account tokens without a schema change. The user-table-row materialization (kind='service' on koder_users) remains an open slice; the prefix convention is enough for the immediate publishing path.
  • No refresh token. Clients re-mint when the access token expires (1h TTL).
  • Hub CLI (products/dev/hub/cli) checks its credential sources in precedence order: KHUB_CLIENT_CREDENTIALS (file with client_id:client_secret), then KHUB_CLIENT_ID + KHUB_CLIENT_SECRET (paired vars), and falls back to the cached human token from khub login when none are set. KODER_ID_URL overrides the issuer for self-hosted instances.
  • Test suite: services/oauth/internal/handler/client_credentials_test.go covers T1 success roundtrip + JWT shape, T2 wrong secret, public-client reject, unauthorized_client guard, invalid_scope, omitted-scope defaults, unknown client, form-body credentials. 8/8 PASS.
  • Operator runbook: meta/context/runbooks/hub-publishing-service-account.kmd.
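The checks in the list above can be sketched as one validation function. The Client struct, the function name, and the invalid_client code returned for public clients and wrong secrets are assumptions; unauthorized_client, invalid_scope, the omitted-scope default, and the service-account: subject prefix come from the text. A plaintext secret is compared here only to keep the example short.

```go
package main

import (
	"crypto/subtle"
	"fmt"
)

// Client is a hypothetical view of a registered OAuth client.
type Client struct {
	ID         string
	Secret     string
	Public     bool
	GrantTypes []string
	Scopes     []string
}

func contains(list []string, v string) bool {
	for _, s := range list {
		if s == v {
			return true
		}
	}
	return false
}

// authorizeClientCredentials returns the granted scopes and subject, or an
// OAuth error code when the request must be rejected.
func authorizeClientCredentials(c Client, secret string, requested []string) (scopes []string, sub, oauthErr string) {
	if c.Public || c.Secret == "" {
		return nil, "", "invalid_client" // assumed code for public clients
	}
	if subtle.ConstantTimeCompare([]byte(c.Secret), []byte(secret)) != 1 {
		return nil, "", "invalid_client" // assumed code for a wrong secret
	}
	if !contains(c.GrantTypes, "client_credentials") {
		return nil, "", "unauthorized_client"
	}
	if len(requested) == 0 {
		requested = c.Scopes // omitted scope grants the full registered set
	}
	for _, s := range requested {
		if !contains(c.Scopes, s) {
			return nil, "", "invalid_scope"
		}
	}
	return requested, "service-account:" + c.ID, ""
}

func main() {
	ci := Client{
		ID:         "ci-publisher", // hypothetical client
		Secret:     "s3cret",
		GrantTypes: []string{"client_credentials"},
		Scopes:     []string{"hub:publish", "read:usage"},
	}
	scopes, sub, oauthErr := authorizeClientCredentials(ci, "s3cret", nil)
	fmt.Println(scopes, sub, oauthErr)
}
```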

Open follow-ups: service-account materialization on koder_users (kind='service' column + auto-Apply to developer table) and automated 90-day rotation cron.

Multi-tenant isolation (RLS, tickets 100 + 105)

Phase 1 (#100, 2026-05-18) shipped Postgres-only RLS DDL helpers in pkg/kdb (AddPostgresOnly, BackendKind, WithTenantTx) and CREATE POLICY migrations for the admin service's tenant-scoped tables (api_keys, audit_log). All of it sits behind KODER_ID_RLS_ENABLE=true.

Phase 2 slice 1 (#105, this batch) wires the admin service to actually serve requests under RLS:

  • pkg/kdb.ContextWithStore + StoreFromContext propagate the request-scoped tx-store via context. WithTenantTx stashes the tx-store automatically so unmodified callsites pick it up.
  • apikey_repo + audit_repo read every DB op via kdb.StoreFromContext(ctx, r.client.Raw()): tx-aware by default, with a fallback to the long-lived store when no ctx-store is set.
  • services/admin/internal/handler/rls_middleware.go wraps every admin HTTP request in WithTenantTx when the env gate is on, using the request's X-Tenant-ID. Requests without a tenant header (system-admin ops like list-all-tenants) skip the wrap.
  • Wired in services/admin/cmd/main.go between mux and ListenAndServe.

Phase 2 slice 2 (#105, this batch) extends the same pattern to the auth service:

  • services/auth/internal/repository/migrations_rls.go registers RLS for auth_events + auth_flows + auth_lockouts (the 3 hot-path tables) at versions 1200/1201, 1210/1211, 1220/1221.
  • event_repo, flow_repo, lockout_repo route every DB op via kdb.StoreFromContext(ctx, r.client.Raw()).
  • services/auth/internal/handler/rls_middleware.go mirrors the admin helper; services/auth/cmd/main.go wires both the RLS migrations (behind the env gate) and the middleware.

Phase 2 slice 3 (#105, this batch) covers oauth + session:

  • oauth: client_repo, authcode_repo, consent_repo, key_repo routed via kdb.StoreFromContext; RLS for clients, authorization_codes, consents, signing_keys at versions 1300/1301, 1310/1311, 1320/1321, 1330/1331.
  • session: session_repo routed; RLS for sessions, access_tokens at 1400/1401, 1410/1411.
  • Both services ship a mirror rls_middleware.go and wire it in main.go behind KODER_ID_RLS_ENABLE.

Phase 2 slice 4 (#105, this batch) is narrower than originally scoped. Auditing the remaining sub-services revealed that audit, behavior, device, access, qr, sync, telephony, webhooks, and sso use org_id / user_id / no isolation column, NOT tenant_id. The RLS template from #100 / #105 slices 1-3 doesn't fit them as-is. Captured in engine#109: each needs categorization (retroactive tenant-scope, org-scope with a new WithOrgTx helper, user-scope with WHERE-clause-only, or genuinely system-wide) before RLS adoption proceeds.

The slice 4 work that did ship: saml (legitimately tenant-scoped, 2 tables), with sprepo + idpkeyrepo routed via kdb.StoreFromContext; RLS migrations 1500-1511 for saml_sps + saml_idp_keys; rls_middleware.go mirror; main.go wire.

Phase 2 slice 5 (#105, this batch) finishes the truly tenant-scoped sub-services:

  • auth completion: remaining 6 repos (magiclink, socialprovider, socialstate, workspacesso, ldap, identity) routed via StoreFromContext; migrations_rls.go extended to versions 1230-1289 covering magic_links / social_providers / social_login_states / workspace_sso_configs / ldap_servers / user_identities.
  • identity (biggest single sub-service, 15 tenant-scoped tables out of 32 total): 17 repos refactored, migrations_rls.go (new) registers RLS at versions 1600-1696 for users, credentials, mfa_devices, verification_tokens, groups, group_members, scim_tokens, subtenantrole_bindings, invites, roles, role_assignments, webhooks, webhook_deliveries, pat_tokens, erasure_requests. rls_middleware.go mirror + main.go wire.

RLS adoption count after slice 5: admin + auth + oauth + session + saml + identity = 6/6 of the confirmed tenant-scoped sub-services (100%). Every sub-service that genuinely uses tenant_id is now ctx-aware and gated behind KODER_ID_RLS_ENABLE.

Phase 2 infra extension (batch 11, 2026-05-18): pkg/kdb now ships WithOrgTx + WithUserTx helpers (mirrors of WithTenantTx, setting the app.current_org_id / app.current_user_id GUCs instead). #109 carries the schema audit + classification of the remaining 8 sub-services: audit + access = org-scoped; behavior + qr + sync + telephony + webhooks + sso = user-scoped (with a handful of system-wide tables, namely rate_limits, sso_clients, and sso_revocations, that don't need RLS at all). Mixed-scope sub-services (device) split per table or use composite predicates.

Phase 2 slice 6 (#105, batch 12) is the first adoption of the new scopes:

  • audit (org-scoped exemplar): 9 repository callsites routed via kdb.StoreFromContext(ctx, r.db); migrations_rls.go registers RLS for audit_events, mfa_policies, org_settings with the org_id = current_setting('app.current_org_id') predicate (versions 1700-1721); rls_middleware.go wraps in WithOrgTx using the X-Org-ID header; main.go wires both, gated by env.
  • qr (user-scoped exemplar): 4 callsites routed; RLS for qr_sessions with a user_id predicate (1800-1801); the WithUserTx middleware extracts X-User-ID (already injected by the gateway for authenticated requests).

Remaining work: behavior, sync, telephony, webhooks (user-scoped, mechanical now), access (org-scoped), device (mixed, split per table). Cross-tenant, cross-org, and cross-user T1 integration tests against a real Postgres, plus the default-on cutover after staging burn-in, remain gating items before flipping the env-gate default.

Phase 2 slice 7 (#105, batch 13 / 2026-05-20) wires the webhooks sub-service (user-scoped, mirrors the qr exemplar):

  • 18 repository callsites routed via kdb.StoreFromContext(ctx, r.db).
  • migrations_rls.go registers RLS for webhooks, webhook_deliveries, and api_keys with the user_id = current_setting('app.current_user_id') predicate (versions 1900-1921).
  • rls_middleware.go wraps in WithUserTx using X-User-ID.
  • main.go gates both behind KODER_ID_RLS_ENABLE.

RFC-016 was ratified in the same commit (docs/rfcs/RFC-016-row-level-security.md). It codifies the two-layer model (Postgres RLS + middleware), the three-scope taxonomy (tenant / org / user), the three-file per-sub-service layout, the migration-version namespace table (100-block per sub-service), the koderspecaudit multi-tenancy --strict audit-hook contract, and the four-criterion default-on cutover plan.

RLS adoption count after slice 7: admin + auth + oauth + session + saml + identity + sso = tenant-scoped (7); audit = org-scoped (1); qr + webhooks = user-scoped (2). 10 sub-services wired. Remaining mechanical work: behavior, sync, telephony (user-scoped), access (org-scoped), device (mixed); each follows an existing exemplar.

Phase 2 slice 8 (#105, batch 14 / 2026-05-20) wires the access sub-service (org-scoped, mirrors the audit exemplar):

  • 8 repository callsites routed via kdb.StoreFromContext(ctx, r.db).
  • migrations_rls.go registers RLS for access_policies and access_evaluations (versions 2300-2311 per RFC-016 §5's per-sub-service 100-block). The policy predicate admits both org_id = current_setting('app.current_org_id', true) rows AND org_id = '' global rows so system-wide access policies remain visible to every org.

  • rls_middleware.go wraps in WithOrgTx using X-Org-ID.
  • main.go gates both behind KODER_ID_RLS_ENABLE.

RLS adoption count after slice 8: 7 tenant + 2 org + 2 user = 11 sub-services wired. Remaining mechanical work: behavior, sync, telephony (user-scoped), device (mixed, split per table).

Phase 2 slices 9-11 (#105, batch 15 / 2026-05-20) bundle three user-scoped sub-services in one batch (mechanical, all mirror the qr/webhooks exemplar):

  • behavior (slice 9, versions 2000-2099): RLS on behavior_events, user_risk_profiles, anomaly_records. 9 repository callsites routed.
  • sync (slice 10, versions 2100-2199): RLS on sync_versions, sync_entries, sync_devices, sync_snapshots. 16 callsites routed.
  • telephony (slice 11, versions 2200-2299): RLS on otp_requests, verified_phones. rate_limits is intentionally excluded: it is a composite key-only table without user_id, global by design (wrapping it in WithUserTx would silently drop all rows). 13 callsites routed.

RLS adoption count after slice 11: 7 tenant + 2 org + 5 user = 14 sub-services wired. Only device (mixed, split per table) remains in the mechanical-wiring queue.

Federated Identity (Social Login)

Social login is implemented inside services/auth/ (shares the auth service, reuses IdentityClient/SessionClient).

  • Providers built-in: Google, GitHub, Microsoft, GitLab; any generic OIDC provider can be configured per tenant.
  • Flow: GET /v1/auth/social/login/{provider}/start → redirect to provider → GET /v1/auth/social/login/{provider}/callback → session tokens.
  • Account linking: on first social login, links by existing email or creates a new account. The provider profile picture is imported as the user avatar.
  • Unlink: DELETE /v1/auth/identities/{user_id}/{provider}, denied if it's the last auth method and no password credential exists.
  • UI: Login and signup pages show social buttons when providers are configured. The account.html template shows linked identities with link/unlink controls.
  • Storage: user_identities table (provider, provider_user_id, user_id snapshot). social_providers per tenant. social_login_states for CSRF (TTL 10 min).
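The unlink guard described in the list can be sketched as a small predicate. canUnlink and its parameters are hypothetical; the rule itself (refuse when the identity is the last auth method and no password credential exists) comes from the text, while refusing a provider that is not linked at all is an added assumption.

```go
package main

import "fmt"

// canUnlink reports whether a provider identity may be removed without
// locking the user out of their account.
func canUnlink(linkedProviders []string, provider string, hasPassword bool) bool {
	linked := false
	others := 0
	for _, p := range linkedProviders {
		if p == provider {
			linked = true
		} else {
			others++
		}
	}
	if !linked {
		return false // nothing to unlink (assumed behavior)
	}
	// Allowed when another login method survives the removal.
	return hasPassword || others > 0
}

func main() {
	fmt.Println(canUnlink([]string{"github"}, "github", false))
	fmt.Println(canUnlink([]string{"github", "google"}, "github", false))
	fmt.Println(canUnlink([]string{"github"}, "github", true))
}
```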

Proto

  • services/foundation/id/platform/proto/koder/id/admin/v1/admin.proto
  • RegisterClientRequest.desired_client_id (string, field 7) — optional stable ID for first-party clients

Registered OIDC Clients

| Client ID | Product | Notes |
|---|---|---|
| dev.koder.chat | Koder Chat | Registered 2026-04-22 via MigrateClient |
| dev.koder.lex | Koder Lex | Registered 2026-04-22 via MigrateClient |
  • services/foundation/id/platform/ — production Go source
  • apps/pass / foundation/pass — identity companion app

Source: ../home/koder/dev/koder/meta/docs/stack/modules/id.md