Take RFC 001 architecture overview

TAKE-RFC-001: Koder Take Architecture Overview

  • *Status:* Draft
  • *Date:* 2026-04-27
  • *Scope:* Architecture for Koder Take, the Koder Stack's flagship cinephile social network (films, series, documentaries, anime, stand-up).
  • *Related RFCs:* id-RFC-006-identity-service.md, recsys-RFC-001 (in services/ai/recsys/docs/rfcs/), kdb-RFC-014-graph-and-timeseries.md

Summary

Koder Take is the consumer-facing product slot for the audiovisual social network vertical of the Suite. It is the Letterboxd / Trakt / TVTime equivalent inside the Koder Stack — a diary, rating system, list builder, and recommendation engine for everything narrative on a screen. This RFC defines the architecture and the seams against existing Koder modules (Play, Tune, Cine, recsys, kdb, id, billing) so implementation tickets can be opened with confidence.

Two assertions shape the design:

  1. *Take is a separate product, not a feature of Cine.* Cine is a B2B engine for cinema operations (tickets, programming, concession). Take is a B2C social product. They share identity (Koder ID) and catalog (kdb-backed) and integrate via well-defined seams (Cine ticket purchase → Take diary entry), but they ship independently, brand independently, and reach independently.
  2. *Take consumes ai/recsys; it does not own a recommender.* Recommendations are computed by `ai/recsys` (the same engine that powers Play and Tune). Take's job is the interface to the recommender (rating capture, multi-aspect "similar to" anchoring, filter UX, score breakdown display) and the social graph that feeds collaborative-filtering signals.

Motivation

The Koder Stack ships two consumption surfaces for video content, Koder Play (web/mobile streaming) and Koder Tune (Smart TV player), plus a B2B operations engine for theatrical cinema (Koder Cine). What is missing is the *personal layer* where the user records what they watched, rates it, builds lists, follows critics, and asks "what should I watch tonight?". Without that layer:

  • Play and Tune watch sessions produce engagement signal but no expressed preference (thumbs/stars), which weakens the recommender.
  • Users have nowhere to log films watched outside the Koder Stack (cinema, Netflix, DVD), making any Koder recommender blind to ~80% of their actual viewing history.
  • Cine has no way to enrich a ticket purchase with social context (reviews, "people who saw this also saw…", critic recommendations).
  • The Koder ID social graph has no audiovisual edge type, so the network effect of "follow what your friends watch" can never form.

Letterboxd has demonstrated that this product, done with respect for cinephiles, retains users at engagement rates rivaling general-purpose social networks. There is no reason the Koder Stack should ship without it.

Product surface (v1)

Core flows

| # | Flow | Trigger | Outcome |
|---|------|---------|---------|
| 1 | *End-of-video rating prompt* | Play/Tune signals video reached ≥90% runtime | Thumb up/down captured; optional 5-star refinement; thumb-down asks reason via chip |
| 2 | *Manual rating + diary entry* | User opens title page in Take app and rates | Entry persisted with date, optional location, mood, tags, written take |
| 3 | *"Already watched" marking* | User taps "watched" on title or episode | Status set; recommender suppresses; granularity = film / episode / season / series |
| 4 | *Watchlist* | User taps "want to see" | Status separate from "watched"; recommender prioritizes |
| 5 | *Recommendation feed* | User opens Discover tab | Mixed feed: blended recsys + cold-start + curated lists; each item shows score breakdown + trailer link |
| 6 | *"Similar to" search* | User picks 1-N anchor titles, optionally aspects, optionally negative anchors | Ranked result list using hybrid embedding + collaborative filtering |
| 7 | *Filtered search* | User opens Filters and combines year / country / genre / director / actor / rating | DSL-level filter applied to the catalog |
| 8 | *Critic / friend follow* | User follows another account | Their ratings appear in feed; their lists become discoverable |
| 9 | *Lists (private + public + collaborative)* | User creates a list | Items added/reordered; visibility controlled |
| 10 | *Cinema check-in via Cine* | User buys ticket via Cine | Diary entry pre-populated with venue, showtime, party size; user confirms after the session |

Differentiation

The features the team committed to in the design conversation that don't exist in Letterboxd / Trakt today:

  1. *Multi-aspect "similar to" anchoring*: the user not only picks anchor titles but specifies what kind of similarity they want (action, tone, pacing, theme, protagonist archetype) and can supply *negative anchors* ("similar to Bourne but no torture"). Implemented as weighted slots over the recsys embedding space.
  2. *Score breakdown transparency*: every title page and recommendation card shows RT / IMDb / Letterboxd / Take separately. No collapsed single number, ever. Cinephiles want to see divergence; this is a brand-defining choice.
  3. *Trailer in recommendations*: every suggested title surfaces its official trailer with both audio tracks (original/subtitled and dubbed), side by side. Audio language is a first-class field in the catalog.
  4. *Native cross-product wiring*: Play/Tune auto-mark watched at ≥90% / abandoned at <15%; Cine ticket purchase pre-populates a diary entry; the same Koder ID is the social account.

Architecture

Block diagram

graph TD
  subgraph Apps[Apps layer]
    Mobile[suite/take/app/mobile<br/>Flutter — iOS + Android]
    Desktop[suite/take/app/desktop<br/>Flutter — Linux + Windows + macOS]
    Site[suite/take/site<br/>Landing page]
  end

  subgraph Platform[Take backend]
    Take[suite/take/platform<br/>Go SaaS<br/>diary · ratings · lists · social graph]
  end

  subgraph Sources[Content sources]
    Play[suite/play<br/>Streaming]
    Tune[suite/tune<br/>Smart TV]
    Cine[suite/cine<br/>Cinema tickets]
  end

  subgraph AI[AI layer]
    Recsys[ai/recsys<br/>Recommender]
    Runtime[ai/runtime<br/>Embeddings + LLM]
  end

  subgraph Foundation
    ID[foundation/id<br/>Koder ID]
    Billing[foundation/billing<br/>Premium tier]
    Reporter[foundation/reporter<br/>Error reporting]
  end

  subgraph Data
    KDB[data/kdb<br/>Catalog · ratings · diary · social graph]
    Search[data/search<br/>Title / person / list search]
    Blob[data/blob<br/>User-uploaded posters / list covers]
  end

  subgraph External[External catalogs]
    TMDb[TMDb API<br/>metadata · cast · crew · trailers]
    Trakt[Trakt API<br/>aggregate watch data]
    JustWatch[JustWatch API<br/>where-to-watch]
  end

  Mobile --> Take
  Desktop --> Take
  Site --> Take

  Play -.end-of-video signal.-> Take
  Tune -.end-of-video signal.-> Take
  Cine -.ticket purchase.-> Take

  Take --> Recsys
  Recsys --> Runtime
  Take --> ID
  Take --> Billing
  Take --> Reporter
  Take --> KDB
  Take --> Search
  Take --> Blob

  Take -.metadata sync.-> TMDb
  Take -.aggregate ratings.-> Trakt
  Take -.where-to-watch.-> JustWatch

Data model (kdb collections)

| Collection | Purpose | Key shape |
|------------|---------|-----------|
| take.titles | Catalog row, sourced from TMDb + manual editor | tmdb_id PK; kind (film / series / episode / doc / special); original_title, release_year, country[], genre[], director[], cast[], runtime_min, score_external{tmdb,imdb,rt,lbox}, trailer_url{original,dubbed} |
| take.ratings | One per (user, title) | user_id × title_id PK; thumb (up / down / null); stars (0.5–5.0 in 0.5 steps, nullable); reason_chip[] (only for thumb-down); created_at, updated_at |
| take.diary | Time-stamped log of viewings | id PK; user_id, title_id, watched_at (date, possibly fuzzy), runtime_pct, source (play / tune / cine / elsewhere / manual), venue, companions[], mood, note (free text) |
| take.watch_status | "Already watched" / "want to see" | user_id × title_id PK; state (watched / watchlist / none); watched_here bool; last_episode (for series); updated_at |
| take.lists | User-created lists | id PK; user_id, title, description, visibility (private / followers / public), collaborators[], cover_blob_id |
| take.list_items | Items in a list | list_id × position PK; title_id, note |
| take.follows | Social graph edge | follower_id × followee_id PK; created_at |
| take.reviews | Long-form written takes | id PK; user_id, title_id, body_md, spoiler, published_at, visibility |

All collections are tenant-scoped to the take tenant in Koder ID. Cross-tenant queries (e.g., showing aggregate "Take rating") are computed by take/platform and cached, never queried from the app directly.
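To make the take.ratings constraints concrete, the following Go sketch validates one rating row against the rules in the table above (thumb values, 0.5-step stars, reason chips only on thumb-down). The `Rating` struct, its field names, and `Validate` are hypothetical illustrations, not the platform's actual types.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// Thumb mirrors the thumb column: up, down, or unset (null in the table).
type Thumb string

const (
	ThumbNone Thumb = ""
	ThumbUp   Thumb = "up"
	ThumbDown Thumb = "down"
)

// Rating is a hypothetical Go view of one take.ratings row.
type Rating struct {
	UserID      string
	TitleID     string
	Thumb       Thumb
	Stars       *float64 // nil when the user gave no star rating
	ReasonChips []string // only meaningful for thumb-down
}

// Validate checks the constraints stated in the data model table.
func (r Rating) Validate() error {
	if r.UserID == "" || r.TitleID == "" {
		return errors.New("user_id and title_id are required")
	}
	switch r.Thumb {
	case ThumbNone, ThumbUp, ThumbDown:
	default:
		return fmt.Errorf("invalid thumb %q", r.Thumb)
	}
	if r.Stars != nil {
		s := *r.Stars
		// Stars run from 0.5 to 5.0 in 0.5 steps, so 2*s must be a whole number.
		if s < 0.5 || s > 5.0 || math.Trunc(s*2) != s*2 {
			return fmt.Errorf("invalid stars %v", s)
		}
	}
	if len(r.ReasonChips) > 0 && r.Thumb != ThumbDown {
		return errors.New("reason_chip is only allowed with thumb-down")
	}
	return nil
}

func main() {
	stars := 3.5
	ok := Rating{UserID: "u1", TitleID: "t1", Thumb: ThumbUp, Stars: &stars}
	fmt.Println("valid rating:", ok.Validate())

	bad := Rating{UserID: "u1", TitleID: "t1", Thumb: ThumbUp, ReasonChips: []string{"too slow"}}
	fmt.Println("invalid rating:", bad.Validate())
}
```

Whether these checks live in take/platform or in kdb schema constraints is left open here; the sketch only shows that the table's rules are cheap to enforce at write time.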

Service boundaries

| Service | Owns | Endpoints (sketch) |
|---------|------|--------------------|
| take/platform/api/catalog | Title lookup, metadata refresh, search proxy | GET /titles/{id}, GET /search?q=, POST /admin/refresh/{id} |
| take/platform/api/ratings | Capture and edit ratings, end-of-video webhook | POST /ratings, PATCH /ratings/{id}, POST /webhooks/end-of-video |
| take/platform/api/diary | Diary entries | POST /diary, GET /users/{id}/diary |
| take/platform/api/discover | Recommendation feed, "similar to" search, filtered search | POST /discover/feed, POST /discover/similar, POST /discover/filter |
| take/platform/api/lists | Lists CRUD, list discovery | POST /lists, GET /lists/{id}, POST /lists/{id}/items |
| take/platform/api/social | Follow graph, feed, profile | POST /follow, GET /users/{id}/feed, GET /users/{id}/profile |

api/discover is the most interesting: it is a thin wrapper over ai/recsys. The ranking, embedding lookup, and collaborative filtering all happen in recsys; Take only assembles the request (user context + filters + anchors + aspects) and renders the response (with score breakdown + trailer URLs).
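As an illustration of how thin that wrapper can be, here is a Go sketch of the request Take might assemble for POST /discover/similar before handing it to ai/recsys. The payload shape, JSON field names, and the default limit of 20 are all assumptions made for this example; the real contract belongs to the recsys tickets.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// SimilarRequest is a hypothetical payload Take assembles for ai/recsys:
// user context + filters + anchors + aspects.
type SimilarRequest struct {
	UserID          string             `json:"user_id"`
	Anchors         []string           `json:"anchors"`
	NegativeAnchors []string           `json:"negative_anchors,omitempty"`
	AspectWeights   map[string]float64 `json:"aspect_weights,omitempty"`
	Filters         map[string]string  `json:"filters,omitempty"`
	Limit           int                `json:"limit"`
}

// Normalize rejects requests recsys could not answer and fills defaults.
func (r *SimilarRequest) Normalize() error {
	if r.UserID == "" {
		return errors.New("user id is required")
	}
	if len(r.Anchors) == 0 {
		return errors.New("at least one anchor title is required")
	}
	if r.Limit <= 0 {
		r.Limit = 20 // assumed default page size
	}
	return nil
}

func main() {
	req := SimilarRequest{
		UserID:          "user-1",
		Anchors:         []string{"title-a", "title-b"},
		NegativeAnchors: []string{"title-c"},
		AspectWeights:   map[string]float64{"tone": 0.7, "pacing": 0.3},
		Filters:         map[string]string{"country": "BR"},
	}
	if err := req.Normalize(); err != nil {
		fmt.Println("error:", err)
		return
	}
	body, err := json.Marshal(req)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(body))
}
```

Take does no ranking in this sketch; it validates, forwards, and later decorates the response with score breakdown and trailer URLs.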

"Similar to" — implementation sketch

This is the flagship differentiator. It is implemented across two recsys primitives:

  1. *Embedding centroid + nearest-neighbor*: for each anchor title, fetch its embedding from recsys (computed via ai/runtime). Apply the user-supplied weights per aspect (action / tone / pacing / theme / protagonist) to project the embedding into a sub-space. Compute the centroid of all anchor projections. Negative anchors subtract their projected vector from the centroid. ANN over the catalog returns N candidates.
  2. *Collaborative filtering rerank*: recsys reranks the N candidates using the "users who rated all anchors highly also rated X highly" signal. Final rank is a weighted blend (default 60% content embedding, 40% collaborative; user-tunable in advanced mode).
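The two steps above can be sketched in Go as follows. This is a toy illustration under stated assumptions, not the recsys implementation: it assumes each embedding dimension is owned by exactly one named aspect (the `dimAspect` slice), that projection is a per-dimension scaling by that aspect's weight, and that the ANN search is handled elsewhere. `Project`, `Query`, and `Blend` are hypothetical names.

```go
package main

import "fmt"

// Vec is a dense embedding.
type Vec []float64

// Project scales every dimension by the weight of the aspect that owns it.
// dimAspect[i] names the aspect of dimension i (an assumption of this sketch);
// v must have the same length as dimAspect.
func Project(v Vec, dimAspect []string, weights map[string]float64) Vec {
	out := make(Vec, len(v))
	for i, x := range v {
		out[i] = x * weights[dimAspect[i]]
	}
	return out
}

// Query builds the search vector: the centroid of the projected positive
// anchors, minus the projected vector of each negative anchor.
func Query(pos, neg []Vec, dimAspect []string, weights map[string]float64) Vec {
	q := make(Vec, len(dimAspect))
	for _, a := range pos {
		for i, x := range Project(a, dimAspect, weights) {
			q[i] += x / float64(len(pos))
		}
	}
	for _, n := range neg {
		for i, x := range Project(n, dimAspect, weights) {
			q[i] -= x
		}
	}
	return q
}

// Blend mixes the content-embedding score with the collaborative score.
// contentWeight is 0.6 by default per the RFC and user-tunable in advanced mode.
func Blend(content, collab, contentWeight float64) float64 {
	return contentWeight*content + (1-contentWeight)*collab
}

func main() {
	dimAspect := []string{"action", "tone"}
	weights := map[string]float64{"action": 1.0, "tone": 0.5}

	pos := []Vec{{2, 4}, {4, 0}}
	neg := []Vec{{1, 2}}

	fmt.Println("query vector:", Query(pos, neg, dimAspect, weights))
	fmt.Println("blended score:", Blend(1.0, 0.5, 0.6))
}
```

The resulting query vector would be handed to the ANN index; the blend is applied per candidate after the collaborative rerank produces its score.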

Tickets ai/recsys#similar-anchored and ai/recsys#similar-aspect-projection will be opened against lock-koder-play once it archives.

External catalog ingestion

| Source | Method | Refresh cadence | Fields used |
|--------|--------|-----------------|-------------|
| *TMDb* | Official API (free, key required) | Nightly delta + on-demand on first user view | All metadata: title, year, country, genre, director, cast, runtime, posters, trailers |
| *Trakt* | Official API (free tier) | Weekly aggregate | Aggregate watched/rating counts (anonymized) for cold-start signal |
| *JustWatch* | Partial API (free tier limits) | Daily for top 10k titles, on-demand otherwise | "Where to watch" links per region |
| *IMDb* | *Not used* in v1: licensed dataset is expensive, scraping violates ToS | | We display the IMDb score as text only, sourced indirectly via TMDb's external_ids field where available |
| *Rotten Tomatoes* | *Not used*: scraping violates ToS | | Same: surfaced via TMDb's vote_average proxy when available |
| *Letterboxd* | *No public API.* Users can import their viewing history via Letterboxd's CSV export (manual upload). | One-shot per user | Imports populate take.ratings and take.diary for that user |

The strict ToS posture is deliberate. The product cannot ship if it violates RT or IMDb terms; instead, the score breakdown panel labels each source clearly and falls back to "—" when data is unavailable.
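A minimal Go sketch of that fallback rule: each source is rendered on its own labeled line, and a missing score renders as "—" rather than being hidden or averaged away. The `ScoreBreakdown` type, its field names, and the one-decimal formatting are assumptions for illustration; the real panel would also need per-source scales.

```go
package main

import "fmt"

// ScoreBreakdown holds one optional score per source; nil means unavailable.
// Hypothetical type, not the platform's actual model.
type ScoreBreakdown struct {
	RT, IMDb, Letterboxd, Take *float64
}

// unavailable is the placeholder the RFC specifies for missing data.
const unavailable = "\u2014"

// label renders one source, falling back to the placeholder when score is nil.
func label(name string, score *float64) string {
	if score == nil {
		return name + ": " + unavailable
	}
	return fmt.Sprintf("%s: %.1f", name, *score)
}

// Lines renders every source separately; there is never a collapsed number.
func (b ScoreBreakdown) Lines() []string {
	return []string{
		label("RT", b.RT),
		label("IMDb", b.IMDb),
		label("Letterboxd", b.Letterboxd),
		label("Take", b.Take),
	}
}

func main() {
	take := 4.2
	b := ScoreBreakdown{Take: &take} // external scores unavailable for this title
	for _, line := range b.Lines() {
		fmt.Println(line)
	}
}
```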

Cross-product seams

| Seam | Trigger | Action |
|------|---------|--------|
| *Play end-of-video* | suite/play emits event when video hits ≥90% runtime | Take captures auto-watched event; shows in-app prompt for thumb up/down on next session |
| *Play abandonment* | suite/play emits event when video stops at <15% runtime | Take records abandoned signal; recsys treats as soft negative |
| *Tune end-of-video* | Same as Play but from Smart TV | Same |
| *Cine ticket purchase* | suite/cine emits ticket-purchased event | Take pre-creates a diary entry with venue + showtime; user confirms after |
| *Cine showing watched* | suite/cine emits session-attended event (NFC entry scan) | Diary entry is auto-confirmed |
| *Recsys feedback loop* | Take rating events sent to recsys nightly batch | Recsys retrains user-level embedding |

All seams are *event-driven* via data/mq, with no synchronous coupling. If a downstream module is offline, the event queues; nothing on the user-facing path of the source product breaks.
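The Play/Tune thresholds from the seam table can be expressed as a small classifier on the consumer side. The Go sketch below assumes a hypothetical `PlaybackEvent` payload carrying the runtime percentage at the moment playback stopped; the actual event schema on data/mq is not defined in this RFC.

```go
package main

import "fmt"

// Signal is what Take records for a playback event.
type Signal string

const (
	SignalWatched   Signal = "watched"   // ≥90% runtime: auto-mark watched
	SignalAbandoned Signal = "abandoned" // <15% runtime: soft negative for recsys
	SignalNone      Signal = "none"      // anything in between: no inference
)

// PlaybackEvent is a hypothetical payload emitted by suite/play or suite/tune.
type PlaybackEvent struct {
	UserID     string
	TitleID    string
	Source     string  // "play" or "tune"
	RuntimePct float64 // 0..100 at the time playback stopped
}

// Classify applies the thresholds from the cross-product seam table.
func Classify(e PlaybackEvent) Signal {
	switch {
	case e.RuntimePct >= 90:
		return SignalWatched
	case e.RuntimePct < 15:
		return SignalAbandoned
	default:
		return SignalNone
	}
}

func main() {
	events := []PlaybackEvent{
		{UserID: "u1", TitleID: "t1", Source: "play", RuntimePct: 97},
		{UserID: "u1", TitleID: "t2", Source: "tune", RuntimePct: 8},
		{UserID: "u1", TitleID: "t3", Source: "play", RuntimePct: 55},
	}
	for _, e := range events {
		fmt.Printf("%s via %s at %.0f%% -> %s\n", e.TitleID, e.Source, e.RuntimePct, Classify(e))
	}
}
```

Because the classifier is a pure function of the event, a consumer can safely re-run it on redelivered events, which fits a queue that buffers while a module is offline.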

Identity, billing, privacy

Identity

Take is a tenant in Koder ID (tenant: take). The Koder Person becomes a Take account on first login; profile data (display name, avatar) syncs from foundation/id. Username allocation follows policies/username-allocation.kmd.

Billing

Take v1 is *free with optional premium* (take.koder.dev/premium):

| Tier | Price target | Features |
|------|--------------|----------|
| Free | R$0 | All core features (rate, diary, lists, follow, discover, similar-to). Score breakdown shown. Ads in feed (light, contextual). |
| Premium | R$19,90/mo | Ad-free. Advanced filters (DSL). Multi-aspect "similar to" with all 5 aspects (free tier limited to 2). Letterboxd CSV import. Public list collaborators. |

Subscription is managed by foundation/billing. No ads are served to users in the EU/UK pending GDPR review (see `policies/security.kmd`).

Privacy

  • Ratings are *public by default*, but each one can be marked private at capture time.
  • Diary entries are *followers-only by default*.
  • "Already watched" status is *never public*: it leaks too much about taste and habits.
  • Following is symmetric-visible (you can see who follows you, and vice versa); there are no shadow follows.
  • Letterboxd CSV imports must be opt-in per import (no auto-publishing).
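The defaults above can be captured in a small visibility check. The Go sketch below is one possible reading, with loud assumptions: the `Kind` and `Visibility` names are hypothetical, and "never public" for watch status is interpreted here as owner-only, which the RFC does not state explicitly.

```go
package main

import "fmt"

// Visibility levels, matching the values used by take.lists.
type Visibility string

const (
	Private   Visibility = "private"
	Followers Visibility = "followers"
	Public    Visibility = "public"
)

// Kind names the record types covered by the privacy rules (hypothetical).
type Kind string

const (
	KindRating      Kind = "rating"
	KindDiary       Kind = "diary"
	KindWatchStatus Kind = "watch_status"
)

// DefaultVisibility returns the capture-time default for each record kind.
func DefaultVisibility(k Kind) Visibility {
	switch k {
	case KindRating:
		return Public
	case KindDiary:
		return Followers
	default:
		return Private
	}
}

// CanView decides whether a viewer may see a record.
// Assumption: watch status is visible to its owner only, whatever v says.
func CanView(k Kind, v Visibility, isOwner, isFollower bool) bool {
	if isOwner {
		return true
	}
	if k == KindWatchStatus {
		return false
	}
	switch v {
	case Public:
		return true
	case Followers:
		return isFollower
	default:
		return false
	}
}

func main() {
	fmt.Println("rating default:", DefaultVisibility(KindRating))
	fmt.Println("diary default:", DefaultVisibility(KindDiary))
	fmt.Println("stranger sees diary:", CanView(KindDiary, Followers, false, false))
	fmt.Println("follower sees diary:", CanView(KindDiary, Followers, false, true))
	fmt.Println("follower sees watch status:", CanView(KindWatchStatus, Public, false, true))
}
```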

Phase plan

| Phase | Scope | Module impact | Lock |
|-------|-------|---------------|------|
| *0 (this RFC)* | Architecture ratified, scaffold + initial backlog | suite/take directory, RFC, ~15 tickets | lock-koder-take |
| *1: Catalog* | TMDb sync, title page (read-only), search, score breakdown UI | products/horizontal/take/platform, data/kdb schemas | TBD |
| *2: Capture* | Manual rating, diary, "watched"/"watchlist", end-of-video webhook (Play + Tune) | + suite/play, suite/tune thin clients | After lock-koder-play archives |
| *3: Discover* | Recommendation feed, "similar to" anchoring, filtered search | + ai/recsys extensions | After Phase 2 |
| *4: Social* | Follow graph, feed, lists, reviews, cinema check-in via Cine | + suite/cine ticket-purchase event | After Phase 3 |
| *5: Premium* | Billing wiring, ads, advanced filters, Letterboxd CSV import | + foundation/billing, ad provider | After Phase 4 |

Phases 0 and 1 can ship under lock-koder-take. Phases 2+ are coordinated with the relevant cross-module locks.

Open questions (resolved later, not blocking RFC ratification)

  1. ~*Episode-level "watched" granularity*~: *RESOLVED 2026-05-14 (take#019)*: *bitmap wins*. Benchmark on internal/storage/watch_bench_test.go shows bitmap is 102× faster on bulk-mark (200-ep series), 136× faster on SeriesRollup, and 144–185× smaller in bytes at the typical (200-ep / 30 series) and heavy (2000-ep / 80 series) RFC profiles. Sparse-list shipped first as #006's MVP (working code now); migration to bitmap is gated on a separate impl ticket (rolling backfill + read-side fallback during the window). A bitmap entry loses per-episode source (it records source at the series level); this is accepted as the worst-case loss because the same series tends to be consumed via the same source. Bench artifact: tests/bench/episode_watch_bench_test.go (in internal/storage/).
  2. *Aspect projection vector basis*: the multi-aspect "similar to" feature requires a fixed basis of aspects (action, tone, pacing, theme, archetype). Whether to use a hand-curated set of 5 vs 10 vs 20 is a recsys decision; ticket recsys#similar-aspect-basis.
  3. *Ad provider*: foundation/ads (built for Play) or third-party? Defer to Phase 5.
  4. *Letterboxd CSV format support*: do we ingest only the official export, or also third-party Letterboxd archive tools? Defer to Phase 5.
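For readers unfamiliar with the bitmap representation chosen in question 1, the following Go sketch shows the general idea: one bit per episode, bulk-mark as a word fill, and a series rollup as a popcount. It is an independent illustration, not the code in internal/storage; `WatchBitmap` and its methods are hypothetical names, and "rollup" is assumed here to mean the watched count plus a completeness check.

```go
package main

import (
	"fmt"
	"math/bits"
)

// WatchBitmap marks watched episodes of one series; bit i is episode index i
// (0-based). Source is not stored per bit, matching the accepted trade-off.
type WatchBitmap struct {
	words []uint64
	total int // number of episodes in the series
}

// NewWatchBitmap allocates one bit per episode, rounded up to whole words.
func NewWatchBitmap(total int) *WatchBitmap {
	return &WatchBitmap{words: make([]uint64, (total+63)/64), total: total}
}

// Mark sets the bit for one episode; out-of-range indexes are ignored.
func (b *WatchBitmap) Mark(ep int) {
	if ep < 0 || ep >= b.total {
		return
	}
	b.words[ep/64] |= 1 << (uint(ep) % 64)
}

// MarkAll is the bulk-mark path: fill whole words, then mask the tail so
// bits beyond the last episode stay clear.
func (b *WatchBitmap) MarkAll() {
	for i := range b.words {
		b.words[i] = ^uint64(0)
	}
	if r := b.total % 64; r != 0 {
		b.words[len(b.words)-1] = (1 << uint(r)) - 1
	}
}

// Watched reports whether one episode is marked.
func (b *WatchBitmap) Watched(ep int) bool {
	if ep < 0 || ep >= b.total {
		return false
	}
	return b.words[ep/64]&(1<<(uint(ep)%64)) != 0
}

// Count is the rollup primitive: a popcount over all words.
func (b *WatchBitmap) Count() int {
	n := 0
	for _, w := range b.words {
		n += bits.OnesCount64(w)
	}
	return n
}

// Complete reports whether every episode of the series is watched.
func (b *WatchBitmap) Complete() bool { return b.Count() == b.total }

func main() {
	b := NewWatchBitmap(200) // the RFC's typical 200-episode profile
	b.Mark(0)
	b.Mark(199)
	fmt.Println("watched so far:", b.Count(), "complete:", b.Complete())

	b.MarkAll()
	fmt.Println("after bulk-mark:", b.Count(), "complete:", b.Complete())
}
```

A 200-episode series needs four 64-bit words in this layout, which is consistent in spirit with the size advantage the benchmark reports, though the measured ratios depend on the real storage encoding.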

Refs

Source: ../home/koder/dev/koder/meta/docs/stack/rfcs/take-RFC-001-architecture-overview.md