Classify (Classification/NLU): foundations
Classify (Classification/NLU) — foundations RFC
Status
*Accepted*: 2026-05-09. Sector bootstrap (skeleton + 5 impl tickets) landed as part of the /k-go services/ai audit wave (Modo C). Q1 resolved: fastText for language, BGE-class for sentiment + multi-label, LLM via gateway for the zero-shot tail. Q2 resolved: per-tenant taxonomies stored in this sector under kdb-doc, versioned; few-shot routing when ≥5 examples per intent, zero-shot otherwise.
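The Q2 routing rule can be sketched as a small Go function. This is a minimal illustration, not the landed implementation: the `Intent` type, its fields, and the `RouteIntent` name are hypothetical; only the threshold of 5 examples per intent comes from the resolution above.

```go
package main

import "fmt"

// Strategy names the classification path chosen for an intent.
type Strategy string

const (
	FewShot  Strategy = "few-shot"
	ZeroShot Strategy = "zero-shot"
)

// minFewShotExamples mirrors the Q2 resolution: few-shot routing
// requires at least 5 examples per intent.
const minFewShotExamples = 5

// Intent is a hypothetical taxonomy entry; the field names are illustrative.
type Intent struct {
	Name     string
	Examples []string
}

// RouteIntent picks few-shot when the intent has enough examples,
// and zero-shot otherwise.
func RouteIntent(in Intent) Strategy {
	if len(in.Examples) >= minFewShotExamples {
		return FewShot
	}
	return ZeroShot
}

func main() {
	refund := Intent{Name: "refund", Examples: []string{"a", "b", "c", "d", "e"}}
	cancel := Intent{Name: "cancel", Examples: []string{"a", "b"}}
	fmt.Println(refund.Name, RouteIntent(refund))
	fmt.Println(cancel.Name, RouteIntent(cancel))
}
```

Under these assumptions, an intent with exactly 5 examples already routes to few-shot, since the threshold is inclusive.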
Summary
Classification: intent, sentiment, topic, and language detection. guard/ handles safety only; this sector is general-purpose.
Motivation
Classification is a classic ML primitive that LLMs handle poorly and at high cost. Small models solve it 100x cheaper. Without a dedicated service, every product calls the gateway with a large LLM.
Scope
In
- Zero-shot classify (LLM fallback)
- Few-shot fine-tuned
- Language detect
- Sentiment
- Intent
Out (yet)
- Custom training (in scope for training/)
Initial design
Surfaces
- backend/: Go API + worker (small local models)
- app/: not applicable in v1
Key APIs
- POST /v1/classify/labels: multi-label
- POST /v1/classify/sentiment: pos/neg/neutral
- POST /v1/classify/language: detect
- POST /v1/classify/intent: intent recognition
Dependencies
- services/ai/runtime: small local models
- services/ai/gateway: fallback LLM
- services/ai/embed: semantic similarity
Relation to existing sectors
- Distinct from guard/ (safety classifier)
- Consumed by chat-adapter (intent) and ingest pipelines
Self-hosted-first analysis (5 gates)
- *1 Feature parity* zero
- *2 Performance* N/A
- *3 Stability* N/A
- *4 Capability* BERT-tiny and fastText are viable
- *5 Critical-path readiness* unblocks ingest pipelines + chat routing
Open questions
- Q1: Default models: fastText (light) or BGE-class (heavier)?
- Q2: Custom labels: where are they stored?
Next steps
- Ratify this RFC (1 round of comments).
- Create the sector dir services/ai/classify/ with koder.toml, README.md, and a skeleton.
- Open implementation tickets in services/ai/classify/backlog/pending/.
- Register in meta/docs/stack/registries/self-hosted-pairs.md if it replaces an external service.