Data Blob
- *Area:* Data Platform
- *Path:* `data/blob`
- *Kind:* Distributed blob storage substrate (RFC-001)
Role in the stack
kdb-blob is the object storage substrate for everything in the Stack that produces large byte payloads: uploads, transcoded media segments, backups, snapshots, ML models. It splits content into 4 MB chunks addressed by BLAKE3-256 and supports range reads, multipart uploads, replication, erasure coding, lifecycle tiers, and an S3-compatible gateway.
It is deployed as `kdb-blob-server` (axum, default `0.0.0.0:7400`); `RUST_LOG` controls tracing, and Prometheus metrics are exposed at `/metrics`.
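A range read over fixed-size chunks reduces to simple index arithmetic. The sketch below is illustrative only: it assumes "4 MB" means 4 × 1024 × 1024 bytes, and `ChunkSlice` and `chunks_for_range` are hypothetical names, not part of the kdb-blob API.

```rust
/// Chunk size from the text above, taken here as 4 MiB (assumption).
const CHUNK_SIZE: u64 = 4 * 1024 * 1024;

/// One slice of a chunk that a range read has to fetch (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
struct ChunkSlice {
    index: u64,  // position of the chunk within the blob
    offset: u64, // first byte to read inside that chunk
    len: u64,    // number of bytes to read from that chunk
}

/// Map the half-open byte range [start, end) of a blob onto chunk slices.
/// Returns an empty vector when the range is empty or starts past the blob.
fn chunks_for_range(blob_len: u64, start: u64, end: u64) -> Vec<ChunkSlice> {
    let end = end.min(blob_len);
    if start >= end {
        return Vec::new();
    }
    let first = start / CHUNK_SIZE;
    let last = (end - 1) / CHUNK_SIZE;
    (first..=last)
        .map(|index| {
            let chunk_start = index * CHUNK_SIZE;
            let lo = start.max(chunk_start);
            let hi = end.min(chunk_start + CHUNK_SIZE);
            ChunkSlice { index, offset: lo - chunk_start, len: hi - lo }
        })
        .collect()
}

fn main() {
    // A 10 MiB blob, reading 2 MiB that straddle the first chunk boundary.
    let mib: u64 = 1024 * 1024;
    for slice in chunks_for_range(10 * mib, 3 * mib, 5 * mib) {
        println!("{:?}", slice);
    }
}
```

Only the chunks that overlap the requested range are touched, so a small range read never needs the whole blob.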
Substrate
- *Backend* `FsBlobStore` (filesystem-rooted, content-addressed). v0.2 swaps to networked backends via the same trait surface.
- *Replication (BLOB)* `pick_placement` chooses N nodes per chunk; production worker (BLOB) drives the actual fan-out.
- *Erasure coding (BLOB)* Reed-Solomon 6+3 for the cold tier: 1.5× storage overhead versus 3× for triple replication (2× better space efficiency) while tolerating 3 node losses.
- *Lifecycle (BLOB)* hot → warm → cold → archive transitions per access pattern. Daemon scheduler queued in BLOB.
- *GC (BLOB)* refcount-based, with a grace period before reclaim.
- *Scrub (BLOB-)* daemon walks chunks, verifies BLAKE3, and repairs from peers when corruption is detected.
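Refcount-based GC with a grace period can be sketched as below. This is a minimal illustration under stated assumptions, not the crate's `GcLedger`: the `Ledger` type, its methods, and the use of plain seconds for time are all hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical ledger: chunk hash -> (refcount, time the count last hit zero).
/// Times are plain seconds so the sketch stays deterministic.
#[derive(Default)]
struct Ledger {
    entries: HashMap<String, (u64, Option<u64>)>,
}

impl Ledger {
    /// A blob manifest started referencing this chunk.
    fn incref(&mut self, chunk: &str) {
        let entry = self.entries.entry(chunk.to_string()).or_insert((0, None));
        entry.0 += 1;
        entry.1 = None; // referenced again, so cancel any pending reclaim
    }

    /// A blob manifest stopped referencing this chunk at time `now`.
    fn decref(&mut self, chunk: &str, now: u64) {
        if let Some(entry) = self.entries.get_mut(chunk) {
            entry.0 = entry.0.saturating_sub(1);
            if entry.0 == 0 && entry.1.is_none() {
                entry.1 = Some(now);
            }
        }
    }

    /// Remove and return chunks unreferenced for at least `grace` seconds.
    fn sweep(&mut self, now: u64, grace: u64) -> Vec<String> {
        let mut dead = Vec::new();
        for (chunk, (refcount, zero_at)) in &self.entries {
            if let (0, Some(t)) = (*refcount, *zero_at) {
                if now.saturating_sub(t) >= grace {
                    dead.push(chunk.clone());
                }
            }
        }
        dead.sort();
        for chunk in &dead {
            self.entries.remove(chunk);
        }
        dead
    }
}

fn main() {
    let mut ledger = Ledger::default();
    ledger.incref("chunk-a");
    ledger.decref("chunk-a", 10);
    println!("{:?}", ledger.sweep(15, 60)); // still inside the grace period
    println!("{:?}", ledger.sweep(70, 60)); // grace period elapsed
}
```

The grace period protects chunks that are briefly unreferenced, for example while a concurrent upload of identical content is still committing its manifest.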
Key features
- [x] Single-shot PUT / GET / HEAD / DELETE (BLOB#002#003).
- [x] Multipart resumable upload (BLOB-).
- [x] Replication topology (BLOB) + erasure coding (BLOB).
- [x] Lifecycle tiers (BLOB) + GC (BLOB) + scrub/repair (BLOB-).
- [x] S3-compatible gateway: `ListObjectsV2`, SigV4 parsing (BLOB).
- [x] FS backend (BLOB), HTTP server (BLOB), Prometheus + tracing (BLOB-/#014).
- [x] GC + scrub daemons (BLOB-).
- [x] Bench harness (BLOB-).
- [x] `koder_foundation_util::RateLimiter` integration (BLOB-).
- [x] `BlobEventSink`: `PutAccepted` / `GetServed` / `ChunkScrubbed` / `GcSwept` / `ReplicationLag` (BLOB-).
- [x] `#[instrument]` parity on PUT/GET/HEAD/DELETE (BLOB-).
- [x] Tenant isolation auth: Bearer token → `allowed_tenants` (BLOB-).
- [x] Per-tenant rate limit on PUT (BLOB).
- [x] Blob size cap (100 GB) + Content-Length validation (BLOB).
- [x] `kdb_blob_put_seconds` / `get_seconds` per-size-bucket histograms (BLOB-).
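The admission checks for a PUT (Bearer token → allowed tenants, Content-Length validation, size cap) could be ordered roughly as follows. This is a sketch under stated assumptions: `TokenTable`, `Reject`, and `check_put` are hypothetical names rather than the crate's `TenantAuth`, and the 100 GB cap is read as decimal gigabytes.

```rust
use std::collections::{HashMap, HashSet};

/// Size cap from the feature list, taken here as decimal gigabytes (assumption).
const MAX_BLOB_BYTES: u64 = 100 * 1_000_000_000;

/// Reasons a PUT is refused before any bytes are read (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum Reject {
    MissingToken,
    UnknownToken,
    TenantNotAllowed,
    MissingContentLength,
    BadContentLength,
    TooLarge,
}

/// Hypothetical token table: bearer token -> tenants it may touch.
struct TokenTable {
    tokens: HashMap<String, HashSet<String>>,
}

impl TokenTable {
    /// Validate the raw header values of a PUT and return the declared length.
    fn check_put(
        &self,
        authorization: Option<&str>,
        tenant: &str,
        content_length: Option<&str>,
    ) -> Result<u64, Reject> {
        let token = authorization
            .and_then(|header| header.strip_prefix("Bearer "))
            .ok_or(Reject::MissingToken)?;
        let allowed = self.tokens.get(token).ok_or(Reject::UnknownToken)?;
        if !allowed.contains(tenant) {
            return Err(Reject::TenantNotAllowed);
        }
        let raw = content_length.ok_or(Reject::MissingContentLength)?;
        let len: u64 = raw.trim().parse().map_err(|_| Reject::BadContentLength)?;
        if len > MAX_BLOB_BYTES {
            return Err(Reject::TooLarge);
        }
        Ok(len)
    }
}

fn main() {
    let mut tokens = HashMap::new();
    tokens.insert("secret".to_string(), HashSet::from(["acme".to_string()]));
    let table = TokenTable { tokens };
    println!("{:?}", table.check_put(Some("Bearer secret"), "acme", Some("1024")));
    println!("{:?}", table.check_put(Some("Bearer secret"), "other", Some("1024")));
}
```

Checking authorization before the length keeps unauthenticated callers from learning anything about size limits.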
SLO targets (RFC §SLOs)
- p99 PUT (small blob ≤ 1 MB) ≤ *0 ms*
- p99 GET (cached) ≤ *0 ms*
- Durability: *1 9s* (replication + erasure).
- Availability: *9.99%* (multi-region replication).
Interfaces
- HTTP (Koder native): `PUT /v1/{tenant}/blobs/{id}`, `GET`, `HEAD`, `DELETE`, `/v1/{tenant}/blobs/{id}/head`, `/healthz`, `/metrics`.
- HTTP (S3-compatible): `s3compat::list_objects_v2`, SigV4 parsing scaffolding (verification queued BLOB).
- Library: `kdb_blob::{FsBlobStore, MultipartManager, lifecycle_evaluate, GcLedger, ScrubStats, HttpState, TenantAuth, BlobEventSink}`.
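To make the shape of the Koder-native paths concrete, here is a small standalone parser for them. It is only an illustration: the real server presumably routes through axum, and `Route` and `parse_route` are hypothetical names introduced for this sketch.

```rust
/// The native paths listed above, as a hypothetical enum.
#[derive(Debug, PartialEq, Eq)]
enum Route<'a> {
    Blob { tenant: &'a str, id: &'a str },
    BlobHead { tenant: &'a str, id: &'a str },
    Healthz,
    Metrics,
}

/// Parse a request path into a route, or `None` when it matches nothing.
fn parse_route(path: &str) -> Option<Route<'_>> {
    let parts: Vec<&str> = path.trim_start_matches('/').split('/').collect();
    match parts.as_slice() {
        ["healthz"] => Some(Route::Healthz),
        ["metrics"] => Some(Route::Metrics),
        // /v1/{tenant}/blobs/{id}
        ["v1", tenant, "blobs", id] if !tenant.is_empty() && !id.is_empty() => {
            Some(Route::Blob { tenant: *tenant, id: *id })
        }
        // /v1/{tenant}/blobs/{id}/head
        ["v1", tenant, "blobs", id, "head"] if !tenant.is_empty() && !id.is_empty() => {
            Some(Route::BlobHead { tenant: *tenant, id: *id })
        }
        _ => None,
    }
}

fn main() {
    println!("{:?}", parse_route("/v1/acme/blobs/report-2024"));
    println!("{:?}", parse_route("/v1/acme/blobs/report-2024/head"));
    println!("{:?}", parse_route("/healthz"));
}
```

The tenant appears in the path itself, which is what lets a tenant-isolation check compare it against the tenants a token is allowed to access.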
Open backlog
4 follow-ups in `infra/data/blob/backlog/pending/`: per-tenant storage quota (BLOB), SigV4 signature verification (BLOB), lifecycle daemon scheduler (BLOB), replication worker fan-out (BLOB-).