Classify (Classification/NLU): foundations

accepted

Classify (Classification/NLU) — foundations RFC

Status

*Accepted*, 2026-05-09. Sector bootstrap (skeleton + 5 implementation tickets) landed as part of the /k-go services/ai audit wave (Modo C). Q1 resolved: fastText for language, BGE-class for sentiment + multi-label, LLM via gateway for the zero-shot tail. Q2 resolved: per-tenant taxonomies stored in this sector under kdb-doc, versioned; few-shot routing when ≥5 examples per intent, zero-shot otherwise.
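The Q2 routing rule can be sketched in Go as below. This is a minimal illustration, not the sector's implementation: the `Taxonomy` and `Intent` types and their field names are hypothetical, and the sketch assumes "≥5 examples per intent" means every intent in the tenant taxonomy must meet the threshold before the few-shot path is used.

```go
package main

import "fmt"

// Intent is one entry in a tenant taxonomy. Field names are illustrative;
// the RFC does not define a schema.
type Intent struct {
	Name     string
	Examples []string // labeled utterances supplied by the tenant
}

// Taxonomy is a versioned, per-tenant label set (hypothetical shape).
type Taxonomy struct {
	TenantID string
	Version  int
	Intents  []Intent
}

// Strategy names the classification path chosen for a request.
type Strategy string

const (
	FewShot  Strategy = "few-shot"
	ZeroShot Strategy = "zero-shot"
)

// minExamplesPerIntent is the threshold stated in the Status section.
const minExamplesPerIntent = 5

// Route picks few-shot only when every intent has enough examples
// (an assumed reading of the rule); anything else goes zero-shot.
func Route(t Taxonomy) Strategy {
	if len(t.Intents) == 0 {
		return ZeroShot
	}
	for _, in := range t.Intents {
		if len(in.Examples) < minExamplesPerIntent {
			return ZeroShot
		}
	}
	return FewShot
}

func main() {
	t := Taxonomy{TenantID: "acme", Version: 3, Intents: []Intent{
		{Name: "refund", Examples: []string{"a", "b", "c", "d", "e"}},
		{Name: "cancel", Examples: []string{"a", "b"}},
	}}
	fmt.Println(Route(t)) // prints: zero-shot ("cancel" has only 2 examples)
}
```

A per-intent variant (routing each intent independently) would be an equally plausible reading; the RFC text does not say which is intended.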

Summary

Classification: intent, sentiment, topic, and language detection. guard/ handles safety only; this sector is general-purpose.

Motivation

Classification is a classic ML primitive that LLMs handle poorly and at high cost. Small models solve it 100x cheaper. Without a dedicated service, every product calls the gateway with a large LLM.

Scope

In

  • Zero-shot classify (LLM fallback)
  • Few-shot fine-tuned
  • Language detect
  • Sentiment
  • Intent

Out (yet)

  • Custom training (belongs to the training/ scope)

Initial design

Surfaces

  • backend/: Go API + worker (small models running locally)
  • app/: not applicable in v1

Key APIs

  • POST /v1/classify/labels: multi-label
  • POST /v1/classify/sentiment: pos/neg/neutral
  • POST /v1/classify/language: detect
  • POST /v1/classify/intent: intent recognition

Dependencies

  • services/ai/runtime: small models running locally
  • services/ai/gateway: LLM fallback
  • services/ai/embed: semantic similarity

Relation to existing sectors

  • Distinct from guard/ (safety classifier)
  • Consumed by chat-adapter (intent) and ingest pipelines

Self-hosted-first analysis (5 gates)

  • *1 Feature parity* zero
  • *2 Performance* N/A
  • *3 Stability* N/A
  • *4 Capability* BERT-tiny and fastText are viable
  • *5 Critical-path readiness* unblocks ingest pipelines + chat routing

Open questions

  • Q1: Default models: fastText (light) or BGE-class (heavier)?
  • Q2: Custom labels: where are they stored?

Next steps

  1. Ratify this RFC (1 round of comments).
  2. Create the sector dir services/ai/classify/ with koder.toml, README.md, and a skeleton.
  3. Open implementation tickets in services/ai/classify/backlog/pending/.
  4. Register in meta/docs/stack/registries/self-hosted-pairs.md if it replaces an external service.

Source: ../home/koder/dev/koder/meta/docs/stack/rfcs/classify-RFC-001-foundations.kmd