Text Search

mandatory

Substring and full-text search inside the Koder Stack uses the kdb-native substrate; external search engines (Elasticsearch, Meilisearch, Algolia, Typesense, Solr, Sphinx) are not adopted as dependencies of any Koder component. The substrate ships in two shapes — trigram inverted index for `LIKE` / `ILIKE` / similarity (PG `pg_trgm` equivalent) and GIN over tokenized columns for full-text (PG `tsvector` equivalent). Both live in `kdb-record` and are queried through `kdb-pgwire`'s standard SQL surface. Applied case of `self-hosted-first.kmd` for the search domain.

Policy — Text Search

Substring search and full-text search inside the Koder Stack use the *kdb-native substrate*. External search engines are not adopted as dependencies of any Koder component.

This is the *applied case* of self-hosted-first.kmd for the search domain. The 5 gates of self-hosted-first apply here the same way they apply to web servers (Koder Jet vs nginx) and media engines (kodec vs FFmpeg).

Position

Shape	Substrate	PG analogue	kdb ticket
Substring / `LIKE '%…%'` / `ILIKE` / similarity	Trigram inverted index in `kdb-record`	`pg_trgm`	`#679` (this drop scaffolds; future slices wire planner + storage)
Full-text (tokenize → stem → rank)	GIN over tokenized JSON path in `kdb-record`	`tsvector` + GIN	`#537` (RFC-012 §7 document substrate + GIN)
Vector similarity (already shipped)	HNSW index in `kdb-record::vector`	`pgvector`	`#535` (v1 linear) → `#540` (HNSW)
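
As a rough illustration of the substring row above, the sketch below shows how a trigram inverted index can answer an `ILIKE '%…%'` query: intersect the posting lists of the needle's trigrams, then recheck the surviving rows. The class and method names (`TrigramIndex`, `ilike_contains`) are hypothetical and are not the `kdb-record` API. Unlike `pg_trgm`, this toy version does not pad word boundaries.

```python
from collections import defaultdict


def trigrams(text: str) -> set[str]:
    """Return the set of lowercase 3-character windows in text."""
    lowered = text.lower()
    return {lowered[i:i + 3] for i in range(len(lowered) - 2)}


class TrigramIndex:
    """Toy trigram inverted index: trigram -> set of row ids (hypothetical)."""

    def __init__(self) -> None:
        self.rows: dict[int, str] = {}
        self.postings: dict[str, set[int]] = defaultdict(set)

    def add(self, row_id: int, text: str) -> None:
        self.rows[row_id] = text
        for gram in trigrams(text):
            self.postings[gram].add(row_id)

    def ilike_contains(self, needle: str) -> list[int]:
        """Answer ILIKE '%needle%' via the index, then recheck candidates."""
        grams = trigrams(needle)
        if not grams:
            # Needle shorter than 3 characters: the index cannot narrow
            # anything down, so every row is a candidate.
            candidates = set(self.rows)
        else:
            # A matching row must contain every trigram of the needle.
            candidates = set.intersection(
                *(self.postings.get(g, set()) for g in grams)
            )
        lowered = needle.lower()
        # Recheck: sharing all trigrams is necessary but not sufficient.
        return sorted(r for r in candidates if lowered in self.rows[r].lower())


index = TrigramIndex()
index.add(1, "Trigram inverted index")
index.add(2, "GIN over tokenized columns")
index.add(3, "grab a ram")  # shares "gra" and "ram" with "gram", but no match

print(index.ilike_contains("GRAM"))  # → [1]
```

Row 3 survives the posting-list intersection and is only dropped by the recheck, which is why a trigram index narrows the candidate set but does not replace the final string comparison.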

All three shapes:

live inside infra/data/kdb (kdb-record storage + kdb-planner use)
multi-tenant by tenant_id path-prefix per policies/multi-tenant-by-default.kmd
queried via standard SQL, with no separate API or client library
multi-tenant isolated at the substrate level (cross-tenant reads return [], not 403)
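
The tenant-isolation property in the list above can be sketched with a toy GIN-style full-text index whose posting-list keys carry the tenant id as a prefix. This is an illustration under stated assumptions, not the `kdb-record` implementation: `TenantGin`, the `(tenant_id, token)` key layout, the suffix-stripping stemmer, and term-frequency ranking are all hypothetical stand-ins for tokenize → stem → rank.

```python
import re
from collections import Counter, defaultdict


def stem(token: str) -> str:
    # Deliberately naive: strip one common English suffix. Not a real stemmer.
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token


def tokenize(text: str) -> list[str]:
    """Lowercase, split on non-alphanumerics, then stem each token."""
    return [stem(t) for t in re.findall(r"[a-z0-9]+", text.lower())]


class TenantGin:
    """Toy inverted index keyed by (tenant_id, token) (hypothetical layout)."""

    def __init__(self) -> None:
        # (tenant_id, token) -> {doc_id: term frequency}
        self.postings: dict[tuple[str, str], Counter] = defaultdict(Counter)

    def add(self, tenant_id: str, doc_id: int, text: str) -> None:
        for token in tokenize(text):
            self.postings[(tenant_id, token)][doc_id] += 1

    def search(self, tenant_id: str, query: str) -> list[int]:
        """Return doc ids containing all query tokens, best score first."""
        tokens = tokenize(query)
        if not tokens:
            return []
        # Lookups only ever touch keys under the caller's tenant prefix.
        lists = [self.postings.get((tenant_id, t), Counter()) for t in tokens]
        matching = set(lists[0]).intersection(*lists[1:])
        # Rank by summed term frequency; ties broken by doc id.
        return sorted(matching, key=lambda d: (-sum(p[d] for p in lists), d))


gin = TenantGin()
gin.add("acme", 1, "Indexing messages and indexed docs")
gin.add("acme", 2, "Messages only")
gin.add("globex", 3, "Indexing messages")

print(gin.search("acme", "index message"))     # → [1]
print(gin.search("globex", "index message"))   # → [3]
print(gin.search("initech", "index message"))  # → []
```

Because the tenant id is part of every key, a query from another tenant has no posting lists to read, so the empty result needs no separate authorization check.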

Why not external

The 5 gates of self-hosted-first.kmd evaluated against the search domain:

Gate	Score
1 — Capability	Trigram + GIN cover the use cases the Koder Stack actually has. The Stack does not index web-scale public corpora; it indexes user-facing data (messages in `koder-talk`, docs in `koder-kortex`, repo content in `flow.koder.dev`). Capability ceiling matches need; no infinite scaling required.
2 — Performance	Trigram index for `LIKE` (the closing criterion of `#679`) lands kdb within 1.2× PG; PG itself is competitive with Elasticsearch for sub-100M-doc corpora. Specialized engines win at >100M docs; the Stack doesn't operate there.
3 — Ergonomics	Standard SQL — zero new client lib, zero new dialect, zero new operational surface. Specialized engines require a separate cluster, separate auth, separate indexing pipeline, separate query language (DSL or domain-specific JSON), and separate monitoring.
4 — Self-coverage	kdb is already the substrate for the Stack; adding trigram + GIN to the substrate is incremental. Adding Elasticsearch would create a second source of truth requiring shadow-write + reconciliation forever.
5 — Migration cost	Zero — the Stack has not adopted an external search engine, so this policy doesn't gate migration, it gates introduction.

All five gates pass for the Stack's use cases. External search engines remain off the dependency list.

What this policy forbids

Adding a dependency on Elasticsearch / OpenSearch / Meilisearch / Algolia / Typesense / Solr / Sphinx / Lucene-derived embedded search to any Koder Stack component.
Hand-rolling a search service in any Koder product (Flow, Talk, Kortex, Hub, etc.) instead of using the kdb substrate.
Shipping a "search adapter" that bypasses SQL and talks to a parallel store.
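
The first prohibition lends itself to an automated check. The sketch below is one possible shape for it, assuming a component's dependency names are already available as a list of strings; the function name and the sample list are hypothetical, and nothing here describes an existing Koder tool.

```python
# Engine names taken from the forbidden list above.
FORBIDDEN = (
    "elasticsearch", "opensearch", "meilisearch", "algolia",
    "typesense", "solr", "sphinx", "lucene",
)


def forbidden_dependencies(names: list[str]) -> list[str]:
    """Return the dependency names that mention a forbidden search engine."""
    # Substring matching is crude: it would also flag, for example, the
    # Sphinx documentation generator, so a real check needs an allowlist.
    return [n for n in names if any(f in n.lower() for f in FORBIDDEN)]


# Hypothetical dependency list for one component.
deps = ["kdb-pgwire", "elasticsearch-rs", "serde", "meilisearch-sdk"]
print(forbidden_dependencies(deps))  # → ['elasticsearch-rs', 'meilisearch-sdk']
```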

What this policy requires

New search-shaped features inside the Stack go through kdb. If the kdb substrate doesn't yet have the index type needed (e.g. trigram before #679 lands), open a kdb sub-ticket against #679 / #537 / #540 rather than reaching outside.
LIKE '%…%' queries must be benchmarked against the trigram index when one exists. Linear scans on a column that should be indexed are a perf regression, not the default.
Existing trigram / GIN / vector code paths reuse the existing kdb-record::trigram / kdb-record::gin / kdb-record::vector modules per policies/reuse-first.kmd, with no parallel implementations.
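
The benchmarking requirement above can be pictured as a small harness that first checks the two query paths agree and then times each one. This is a minimal sketch over an in-memory toy fixture, with hypothetical helper names; its timings say nothing about kdb itself, where the comparison would run the same SQL with and without the trigram index.

```python
import time
from collections import defaultdict
from typing import Callable


def median_seconds(fn: Callable[[], object], repeats: int = 5) -> float:
    """Median wall-clock time of fn over `repeats` runs."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return sorted(samples)[len(samples) // 2]


def compare(scan: Callable[[], list], indexed: Callable[[], list]) -> dict:
    """Time both query paths after checking that they return the same rows."""
    if scan() != indexed():
        raise ValueError("query paths disagree; fix correctness before timing")
    return {"scan_s": median_seconds(scan), "indexed_s": median_seconds(indexed)}


# Hypothetical fixture: 5,000 filler rows plus one row containing the needle.
rows = [f"row {i} payload" for i in range(5000)] + ["needle in a haystack"]
postings = defaultdict(set)
for row_id, text in enumerate(rows):
    for i in range(len(text) - 2):
        postings[text[i:i + 3]].add(row_id)


def linear_scan() -> list:
    # Equivalent of LIKE '%needle%' with no index: touch every row.
    return [row_id for row_id, text in enumerate(rows) if "needle" in text]


def trigram_lookup() -> list:
    # Intersect the posting lists of the needle's trigrams, then recheck.
    grams = ("nee", "eed", "edl", "dle")
    candidates = set.intersection(*(postings.get(g, set()) for g in grams))
    return sorted(r for r in candidates if "needle" in rows[r])


report = compare(linear_scan, trigram_lookup)
print(report)
```

Checking result equality before timing keeps a fast but wrong index path from passing as an improvement.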

Strictness

*Strict.* No advisory bypass. If a Koder product genuinely needs external search capability the Stack doesn't have, the path is to open a kdb ticket to add it, not to reach outside. Exceptions require RFC modification of this policy or self-hosted-first.kmd.

Sibling policies

self-hosted-first.kmd: the algorithm; this doc is the data for the search domain.
reuse-first.kmd: within-Stack reuse; forbids parallel kdb-record::trigram clones.
multi-tenant-by-default.kmd: trigram + GIN indexes carry tenant_id path prefix from commit 1.
hyperscale-first.kmd: substrate shape scales without external dependency.