Search (Web Search/Scrape): foundations RFC
Status
*Accepted*: ratified 2026-05-09 (the same day it was opened) as part of the services/ai bootstrap pilot wave. Implementation has started in `services/ai/search/`; tickets are in `services/ai/search/backlog/pending/{001..005}`.
Summary
Web search and scraping for agents, analogous to Perplexity API, Tavily, Firecrawl, and Exa.
Motivation
Referenced in a recent commit (/k-evolve koder-ai vs firecrawl+wispr+testsprite, 2026-05-09). Agents without web search would need a complete dump of the world loaded into the session. This is a critical foundation for useful agents.
Scope
In
- Web search (proxy + fallback engine)
- Scrape (Firecrawl-like)
- Citations
- Dedup
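To make the dedup and citation items concrete, here is a minimal sketch. It assumes that dedup means collapsing hits whose URLs differ only in case, fragment, trailing slash, or `utm_*` tracking parameters, and that citations are 1-based indexes into the surviving results. The RFC specifies none of this; the `Result` and `Citation` types and both function names are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Result is a hypothetical search hit; the field names are assumptions.
type Result struct {
	Title string
	URL   string
}

// Citation pairs a 1-based index with the normalized source URL.
type Citation struct {
	Index int
	URL   string
}

// normalizeURL builds a dedup key: lowercase host, no fragment,
// no utm_* tracking parameters, no trailing slash on the path.
func normalizeURL(raw string) (string, error) {
	u, err := url.Parse(strings.TrimSpace(raw))
	if err != nil {
		return "", err
	}
	u.Host = strings.ToLower(u.Host)
	u.Fragment = ""
	u.RawFragment = ""
	q := u.Query()
	for key := range q {
		if strings.HasPrefix(strings.ToLower(key), "utm_") {
			q.Del(key)
		}
	}
	u.RawQuery = q.Encode() // Encode sorts keys, so parameter order does not matter
	u.ForceQuery = false
	u.Path = strings.TrimSuffix(u.Path, "/")
	u.RawPath = ""
	return u.String(), nil
}

// dedupWithCitations keeps the first hit per normalized URL and
// numbers the survivors so a caller can cite them as [1], [2], ...
func dedupWithCitations(results []Result) ([]Result, []Citation) {
	seen := make(map[string]bool)
	var kept []Result
	var cites []Citation
	for _, r := range results {
		key, err := normalizeURL(r.URL)
		if err != nil {
			continue // skip unparseable URLs rather than guessing
		}
		if seen[key] {
			continue
		}
		seen[key] = true
		kept = append(kept, r)
		cites = append(cites, Citation{Index: len(kept), URL: key})
	}
	return kept, cites
}

func main() {
	kept, cites := dedupWithCitations([]Result{
		{Title: "A", URL: "https://example.com/a"},
		{Title: "A again", URL: "https://EXAMPLE.com/a/#top"},
		{Title: "B", URL: "https://example.org/b"},
	})
	fmt.Println(len(kept), cites)
}
```

URL-based keys are the cheapest option; content-hash or near-duplicate detection would be a separate design decision.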
Out (for now)
- Multi-step deep research (agents scope)
Initial design
Surfaces
- `backend/`: Go API + scrape worker
- `app/`: not applicable in v1
Key APIs
- `POST /v1/search/web`: search
- `POST /v1/search/scrape`: fetch + clean
- `POST /v1/search/news`: news vertical
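As an illustration of what the `POST /v1/search/web` endpoint could look like in the Go backend, here is a minimal sketch using only the standard library. The request and response shapes (`query`, `max_results`, `results`), the default of 10 results, and the cap of 50 are assumptions made for this example; the RFC does not define a schema. The engine call is a placeholder.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// WebSearchRequest is a hypothetical body for POST /v1/search/web.
type WebSearchRequest struct {
	Query      string `json:"query"`
	MaxResults int    `json:"max_results"`
}

// WebSearchResult and WebSearchResponse are likewise assumed shapes.
type WebSearchResult struct {
	Title   string `json:"title"`
	URL     string `json:"url"`
	Snippet string `json:"snippet"`
}

type WebSearchResponse struct {
	Query   string            `json:"query"`
	Results []WebSearchResult `json:"results"`
}

// parseWebSearchRequest validates the body and applies defaults.
func parseWebSearchRequest(raw []byte) (WebSearchRequest, error) {
	var req WebSearchRequest
	if err := json.Unmarshal(raw, &req); err != nil {
		return req, err
	}
	req.Query = strings.TrimSpace(req.Query)
	if req.Query == "" {
		return req, errors.New("query is required")
	}
	if req.MaxResults <= 0 || req.MaxResults > 50 {
		req.MaxResults = 10 // assumed default and cap
	}
	return req, nil
}

func handleWebSearch(w http.ResponseWriter, r *http.Request) {
	if r.Method != http.MethodPost {
		http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
		return
	}
	raw, err := io.ReadAll(http.MaxBytesReader(w, r.Body, 1<<20))
	if err != nil {
		http.Error(w, "cannot read body", http.StatusBadRequest)
		return
	}
	req, err := parseWebSearchRequest(raw)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadRequest)
		return
	}
	// Placeholder: a real handler would call the search engine here.
	resp := WebSearchResponse{Query: req.Query, Results: []WebSearchResult{}}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(resp)
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/search/web", handleWebSearch)

	// Exercise the handler in-process instead of opening a port.
	rec := httptest.NewRecorder()
	body := strings.NewReader(`{"query":"searxng"}`)
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, "/v1/search/web", body))
	fmt.Println(rec.Code, strings.TrimSpace(rec.Body.String()))
}
```

The scrape and news endpoints could follow the same parse-then-handle split, which keeps validation testable without an HTTP server.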
Dependencies
- `services/ai/gateway`: LLM rerank
- `services/ai/cache`: search results
- `infra/data/kdb-blob`: scrape cache
Relation to existing sectors
- Prerequisite for useful agents in production
- Consumes cache to reduce cost
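One way caching can reduce cost is by making equivalent queries share an entry. The sketch below uses an in-memory map as a stand-in for `services/ai/cache`, whose actual interface is not described in this RFC. The key scheme (SHA-256 over the vertical plus a lowercased, whitespace-collapsed query) and the names `cacheKey` and `cachedSearcher` are assumptions for illustration only.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
	"sync"
)

// cacheKey derives a stable key so that "Go Modules" and "  go   modules "
// hit the same entry, while different verticals stay separate.
func cacheKey(vertical, query string) string {
	normalized := strings.Join(strings.Fields(strings.ToLower(query)), " ")
	sum := sha256.Sum256([]byte(vertical + "\x00" + normalized))
	return hex.EncodeToString(sum[:])
}

// cachedSearcher memoizes an upstream fetch function (hypothetical type).
// A real service would add TTLs and eviction.
type cachedSearcher struct {
	mu    sync.Mutex
	store map[string][]string
	fetch func(vertical, query string) []string
}

func newCachedSearcher(fetch func(vertical, query string) []string) *cachedSearcher {
	return &cachedSearcher{store: make(map[string][]string), fetch: fetch}
}

// Search returns cached URLs when present and calls upstream otherwise.
func (c *cachedSearcher) Search(vertical, query string) []string {
	key := cacheKey(vertical, query)
	c.mu.Lock()
	defer c.mu.Unlock()
	if urls, ok := c.store[key]; ok {
		return urls
	}
	urls := c.fetch(vertical, query)
	c.store[key] = urls
	return urls
}

func main() {
	calls := 0
	c := newCachedSearcher(func(vertical, query string) []string {
		calls++
		return []string{"https://example.com/" + vertical}
	})
	c.Search("web", "Go Modules")
	c.Search("web", "  go   modules ")
	fmt.Println("upstream calls:", calls)
}
```

Holding the lock across the upstream call keeps the sketch short but serializes all searches; a production version would likely deduplicate in-flight requests per key instead.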
Self-hosted-first analysis (5 gates)
- *1 Feature parity*: zero
- *2 Performance*: N/A
- *3 Stability*: N/A
- *4 Capability*: self-hosted SearxNG + Firecrawl FOSS are viable
- *5 Critical-path readiness*: unblocks production agents
Open questions
- Q1: Default engine: self-hosted SearxNG or a commercial proxy?
- Q2: Should robots.txt compliance be enforced in the service?
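Whichever way Q1 is decided, the "proxy + fallback engine" item in scope suggests an ordered chain of engines. The sketch below shows one possible shape, in which the default engine is simply the first entry in the list. The `Engine` interface, the stub engines, the 5-second timeout, and the rule that an empty result set also triggers fallback are all assumptions, not decisions recorded in this RFC.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Engine is a hypothetical abstraction over SearxNG, a commercial proxy, etc.
type Engine interface {
	Name() string
	Search(ctx context.Context, query string) ([]string, error)
}

// stubEngine stands in for a real backend in this sketch.
type stubEngine struct {
	name string
	urls []string
	fail bool
}

func (s stubEngine) Name() string { return s.name }

func (s stubEngine) Search(ctx context.Context, query string) ([]string, error) {
	if s.fail {
		return nil, errors.New("stub engine unavailable")
	}
	return s.urls, nil
}

// searchWithFallback tries engines in order and returns the first
// non-empty result along with the name of the engine that produced it.
func searchWithFallback(ctx context.Context, query string, engines ...Engine) (string, []string, error) {
	if len(engines) == 0 {
		return "", nil, errors.New("no engines configured")
	}
	var errs []error
	for _, e := range engines {
		if err := ctx.Err(); err != nil {
			return "", nil, err // deadline hit or caller cancelled
		}
		urls, err := e.Search(ctx, query)
		if err == nil && len(urls) > 0 {
			return e.Name(), urls, nil
		}
		if err == nil {
			err = errors.New("no results")
		}
		errs = append(errs, fmt.Errorf("%s: %w", e.Name(), err))
	}
	return "", nil, errors.Join(errs...)
}

// searchDefault wraps searchWithFallback with an assumed 5s overall budget.
func searchDefault(query string, engines ...Engine) (string, []string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	return searchWithFallback(ctx, query, engines...)
}

func main() {
	name, urls, err := searchDefault("koder",
		stubEngine{name: "searxng", fail: true},
		stubEngine{name: "proxy", urls: []string{"https://example.com"}},
	)
	fmt.Println(name, urls, err)
}
```

With this shape, answering Q1 becomes a configuration choice (the order of the list) rather than a code change.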
Next steps
- Ratify this RFC (1 round of comments).
- Create the sector dir `services/ai/search/` with `koder.toml`, `README.md`, and a skeleton.
- Open implementation tickets in `services/ai/search/backlog/pending/`.
- Register in `meta/docs/stack/registries/self-hosted-pairs.md` if it replaces an external service.