RFC-001 — Delta Delivery
**Author:** Koder Engineering **Date:** 2026-04-15 **Status:** Draft **Module:** productsdevhub
Table of Contents
- Summary
- Problem Statement
- Goals
- Non-Goals
- Background
- Design
- 6.1 Algorithm — bsdiff
- 6.2 Delta Artifact Naming
- 6.3 Publish Flow
- 6.4 Download API
- 6.5 Client Apply Flow
- 6.6 Integrity Verification
- 6.7 Backwards Compatibility
- 6.8 Storage Layout
- Bandwidth Savings Estimate
- Security Considerations
- Failure Modes and Mitigations
- Observability
- Roadmap
- Open Questions
- Alternatives Considered
- References
1. Summary
Koder Hub currently redistributes full .kpkg packages on every version bump, even when the diff between two consecutive versions is minimal. For a typical desktop application of 30 MB, a one-line bug fix produces a 30 MB download. This RFC proposes a binary delta delivery mechanism based on **bsdiff**, reducing incremental update bandwidth by an estimated 80–95%.
2. Problem Statement
2.1 Current Behavior
When a publisher pushes myapp@1.2.1 after myapp@1.2.0:
- The full .kpkg (20–50 MB) is uploaded to the object store.
- Every client polling for updates downloads the full package regardless of how small the actual change is.
- At 10,000 active installations updating once per week, a 30 MB package consumes **300 GB/week** of egress per app.
2.2 Root Cause
The Store treats packages as opaque blobs. There is no mechanism to compute, store, or serve the difference between two versions of the same package.
2.3 Impact
- **Publisher cost:** egress and storage grow linearly with each release regardless of diff size.
- **End-user experience:** slow updates on metered or low-bandwidth connections.
- **Federation cost (RFC-002):** mirrors pulling full packages amplify the bandwidth problem across the network.
2.4 Data Point
Empirical analysis of five internal Koder applications shows that consecutive minor/patch releases share 90–97% of bytes at the binary level, primarily due to statically linked runtimes (Flutter, Go) and large static assets (icons, fonts, locale files) that do not change between releases.
3. Goals
- G1: Reduce download size for minor/patch updates by 80–95%.
- G2: Guarantee byte-perfect integrity — a corrupt or partial delta must never result in a silently broken installation.
- G3: Zero regression for clients that do not support delta (opt-in via HTTP header).
- G4: Publisher workflow unchanged — publishers push full packages; delta computation is automatic and server-side.
- G5: Compatible with the federation protocol (RFC-002) — federated mirrors can serve deltas they have cached.
4. Non-Goals
- Delta computation on the client side (Phase 2 prerequisite only).
- Streaming / chunked patch application (out of scope for Phase 1).
- Delta between non-consecutive versions beyond N=3 depth (diminishing returns).
- Source-level diffing (this is binary delta only; source diffs are handled by the version control layer).
5. Background
5.1 bsdiff
bsdiff (Colin Percival, 2003) is a binary diff algorithm optimized for executable code. It exploits the structure of compiled binaries by using suffix arrays to find matching byte sequences across versions and encoding the difference as a sequence of control tuples (x, y, z) where:
- x bytes are read from the old file and added bytewise to x bytes from the "diff" bytestream.
- y bytes are copied verbatim from the "extra" bytestream.
- The read position in the old file then seeks forward by z bytes.
The resulting patch is highly compressible and typically 5–20× smaller than the raw binary delta between two versions of the same application.
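The way a patcher consumes control tuples can be sketched as follows. This is a minimal illustration that assumes the tuples and the "diff" and "extra" streams have already been decoded into memory; the `Control` type and `ApplyControls` function are hypothetical names, and the on-disk bsdiff header and stream compression are not modeled.

```go
package main

import (
	"errors"
	"fmt"
)

// Control is one control tuple (x, y, z); the type name is hypothetical.
type Control struct {
	Add  int // x: bytes taken from old and added bytewise to the diff stream
	Copy int // y: bytes copied verbatim from the extra stream
	Seek int // z: adjustment of the old-file cursor after the tuple
}

// ApplyControls rebuilds the new file from the old file plus decoded streams.
func ApplyControls(old, diff, extra []byte, ctrl []Control) ([]byte, error) {
	var out []byte
	oldPos, diffPos, extraPos := 0, 0, 0
	for _, c := range ctrl {
		if c.Add < 0 || c.Copy < 0 ||
			diffPos+c.Add > len(diff) || extraPos+c.Copy > len(extra) {
			return nil, errors.New("corrupt patch: stream overrun")
		}
		for i := 0; i < c.Add; i++ {
			b := diff[diffPos+i]
			// Positions outside the old file contribute zero.
			if p := oldPos + i; p >= 0 && p < len(old) {
				b += old[p] // byte addition wraps modulo 256
			}
			out = append(out, b)
		}
		diffPos += c.Add
		oldPos += c.Add
		out = append(out, extra[extraPos:extraPos+c.Copy]...)
		extraPos += c.Copy
		oldPos += c.Seek
	}
	return out, nil
}

func main() {
	old := []byte("hello world")
	diff := make([]byte, 11)
	diff[0] = 0xE0 // 'h' + 0xE0 wraps to 'H'; all other diff bytes are zero
	extra := []byte(", big")
	ctrl := []Control{{Add: 5, Copy: 5, Seek: 0}, {Add: 6, Copy: 0, Seek: 0}}

	out, err := ApplyControls(old, diff, extra, ctrl)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s\n", out)
}
```

Because unchanged regions yield long runs of zero bytes in the diff stream, the patch compresses well, which is the property the size estimate above relies on.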
5.2 .kpkg Format
A .kpkg file (see specs/kpkg/format.kmd) is a ZIP-based container. bsdiff operates on the raw bytes of the full .kpkg file rather than individual ZIP members, which:
- Avoids needing a ZIP-aware differ.
- Works correctly even when the ZIP central directory moves due to file additions/removals.
- Is simpler to implement and audit.
6. Design
6.1 Algorithm — bsdiff
Delta generation uses the reference bsdiff implementation compiled as a Go CGO wrapper (or a pure-Go reimplementation such as gabstep/binarydist).
patch_bytes = bsdiff(old_kpkg_bytes, new_kpkg_bytes)
patch_compressed = gzip(patch_bytes, level=9)
artifact_name = "<sha256_new>.<sha256_old>.bsdiff.gz"
The patch is stored base64url-encoded only if transmitted in a JSON envelope; when stored as a binary object in the object store, raw bytes are used.
**Notation:**
- SHA256_NEW: lowercase hex SHA-256 of the new (target) .kpkg.
- SHA256_OLD: lowercase hex SHA-256 of the base (source) .kpkg.
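The compression step can be sketched with Go's standard library. This is an illustrative sketch only: `CompressPatch` and `DecompressPatch` are hypothetical helper names, and the patch bytes are a placeholder for real bsdiff output.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// CompressPatch gzips raw patch bytes at the highest compression level (9).
func CompressPatch(patch []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw, err := gzip.NewWriterLevel(&buf, gzip.BestCompression)
	if err != nil {
		return nil, err
	}
	if _, err := zw.Write(patch); err != nil {
		return nil, err
	}
	// Close flushes pending data and writes the gzip trailer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// DecompressPatch reverses CompressPatch.
func DecompressPatch(compressed []byte) ([]byte, error) {
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

func main() {
	patch := bytes.Repeat([]byte{0}, 4096) // placeholder for a zero-heavy patch
	gz, err := CompressPatch(patch)
	if err != nil {
		fmt.Println("compress error:", err)
		return
	}
	back, err := DecompressPatch(gz)
	if err != nil {
		fmt.Println("decompress error:", err)
		return
	}
	fmt.Printf("raw=%d compressed=%d round-trip ok=%v\n", len(patch), len(gz), bytes.Equal(patch, back))
}
```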
6.2 Delta Artifact Naming
<SHA256_NEW>.<SHA256_OLD>.bsdiff.gz
Example:
a3f9e1...c2b4.8d01fa...7e93.bsdiff.gz
This content-addressed naming scheme:
- Allows caching at any layer (CDN, mirror, local disk) by identity.
- Avoids namespace collisions between apps or versions.
- Enables mirrors (RFC-002) to cache and serve deltas without knowing app metadata.
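A minimal sketch of how an artifact name could be derived from two package byte slices. `HexSHA256` and `DeltaArtifactName` are hypothetical helper names; the validation rule (64 lowercase hex characters) follows from the notation in 6.1.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"regexp"
)

// sha256Hex matches a lowercase hex SHA-256 digest.
var sha256Hex = regexp.MustCompile(`^[0-9a-f]{64}$`)

// HexSHA256 returns the lowercase hex SHA-256 of data.
func HexSHA256(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// DeltaArtifactName builds "<sha256_new>.<sha256_old>.bsdiff.gz".
func DeltaArtifactName(sha256New, sha256Old string) (string, error) {
	if !sha256Hex.MatchString(sha256New) || !sha256Hex.MatchString(sha256Old) {
		return "", errors.New("digests must be 64 lowercase hex characters")
	}
	return sha256New + "." + sha256Old + ".bsdiff.gz", nil
}

func main() {
	// Placeholder package contents; real inputs are full .kpkg files.
	oldPkg, newPkg := []byte("myapp 1.2.0"), []byte("myapp 1.2.1")
	name, err := DeltaArtifactName(HexSHA256(newPkg), HexSHA256(oldPkg))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name)
}
```

Rejecting anything that is not a well-formed digest also keeps request-supplied values (such as the base parameter in 6.4) from being interpolated into object-store keys unchecked.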
6.3 Publish Flow
When a publisher pushes a new version V_new of an application:
Publisher
│
▼
POST /api/v1/apps/{slug}/publish
│ body: multipart/form-data { kpkg: <binary>, manifest: <json> }
│
▼
Store Backend
├── Validate manifest signature
├── Compute SHA256(V_new) → sha256_new
├── Store full package: packages/<sha256_new>.kpkg
├── Retrieve version history: [V_n-1, V_n-2, V_n-3] (N=3)
│
└── For each base_version in history:
sha256_base = base_version.sha256
if delta_exists(sha256_new, sha256_base): skip
patch = bsdiff(read(sha256_base + ".kpkg"), read(sha256_new + ".kpkg"))
compressed = gzip(patch)
store("deltas/<sha256_new>.<sha256_base>.bsdiff.gz", compressed)
record DeltaEntry { sha256_new, sha256_base, size_bytes, created_at }
**N=3 rationale:** Clients update within 3 release cycles in 99%+ of cases (internal telemetry). Storing deltas beyond 3 versions provides diminishing returns while increasing storage cost linearly.
**Async computation:** Delta generation is performed in a background worker after the publish API returns 202 Accepted. The package is immediately available for full download; delta availability follows within seconds to minutes depending on package size.
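The worker loop can be sketched as below. Everything here is illustrative: `Differ` stands in for whichever bsdiff binding is chosen, `MemStore` is a toy in-memory object store, the short version labels replace real SHA-256 digests, and the `packages/` and `deltas/` key prefixes are assumed from the storage layout in 6.8. Gzip compression and the DeltaEntry record are omitted for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// Differ stands in for a bsdiff binding (hypothetical signature).
type Differ func(oldPkg, newPkg []byte) ([]byte, error)

// MemStore is a toy in-memory object store used only for this sketch.
type MemStore map[string][]byte

// Get returns the object stored under key.
func (m MemStore) Get(key string) ([]byte, error) {
	data, ok := m[key]
	if !ok {
		return nil, errors.New("not found: " + key)
	}
	return data, nil
}

const maxDepth = 3 // N=3

// GenerateDeltas writes missing deltas from up to maxDepth previous versions
// (history is newest first) to sha256New and returns the keys it created.
func GenerateDeltas(store MemStore, diff Differ, sha256New string, history []string) ([]string, error) {
	newPkg, err := store.Get("packages/" + sha256New + ".kpkg")
	if err != nil {
		return nil, err
	}
	if len(history) > maxDepth {
		history = history[:maxDepth]
	}
	var written []string
	for _, base := range history {
		key := "deltas/" + sha256New + "." + base + ".bsdiff.gz"
		if _, exists := store[key]; exists {
			continue // a retried job skips work that already finished
		}
		oldPkg, err := store.Get("packages/" + base + ".kpkg")
		if err != nil {
			return written, err
		}
		patch, err := diff(oldPkg, newPkg)
		if err != nil {
			return written, err
		}
		store[key] = patch
		written = append(written, key)
	}
	return written, nil
}

func main() {
	store := MemStore{}
	for _, v := range []string{"v1", "v2", "v3", "v4", "v5"} {
		store["packages/"+v+".kpkg"] = []byte("contents of " + v)
	}
	// Fake differ for the demo; a real one would call bsdiff.
	fake := func(oldPkg, newPkg []byte) ([]byte, error) {
		return []byte(fmt.Sprintf("%d->%d", len(oldPkg), len(newPkg))), nil
	}
	keys, err := GenerateDeltas(store, fake, "v5", []string{"v4", "v3", "v2", "v1"})
	fmt.Println(keys, err)
}
```

Skipping existing keys makes the job idempotent, so a worker that crashes mid-run can simply be restarted.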
6.4 Download API
Existing endpoint (unchanged for full download):
GET /api/v1/apps/{slug}/download?version=X
New: delta-aware endpoint
GET /api/v1/apps/{slug}/download?version=X&base=SHA256_BASE
**Headers sent by client:**
Accept: application/x-bsdiff, application/zip;q=0.9
X-Koder-Store-Version: 1.2.0
**Server decision logic:**
if request has base= param AND Accept includes application/x-bsdiff:
sha256_new = manifest[version=X].sha256
sha256_base = base (from query param)
delta_path = "deltas/<sha256_new>.<sha256_base>.bsdiff.gz"
if delta_path exists in object store:
return 200 {
Content-Type: application/x-bsdiff
Content-Encoding: gzip
X-Koder-Delta-Base: <sha256_base>
X-Koder-Delta-Target: <sha256_new>
X-Koder-Delta-Target-Size: <uncompressed_kpkg_size>
body: <delta bytes>
}
else:
return full package (delta not yet computed or base too old)
else:
return full package
**Response codes:**
| Code | Meaning |
|---|---|
| 200 | Delta or full package returned |
| 404 | App or version not found |
| 409 | Delta computation in progress; retry after X seconds (Retry-After header) |
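The decision logic can be sketched as a pure function, which keeps it easy to unit-test apart from HTTP handling. `Decide`, `Decision`, and `acceptsBsdiff` are hypothetical names, and the Accept parsing is deliberately simplified: it ignores q-values, so a client sending `application/x-bsdiff;q=0` would still be treated as delta-capable.

```go
package main

import (
	"fmt"
	"strings"
)

// Decision describes what the download endpoint should serve.
type Decision struct {
	Delta bool   // true when a delta artifact should be returned
	Key   string // object-store key of the delta, empty for a full package
}

// acceptsBsdiff reports whether the Accept header lists application/x-bsdiff.
func acceptsBsdiff(accept string) bool {
	for _, part := range strings.Split(accept, ",") {
		mediaType := strings.TrimSpace(strings.SplitN(part, ";", 2)[0])
		if strings.EqualFold(mediaType, "application/x-bsdiff") {
			return true
		}
	}
	return false
}

// Decide mirrors the decision logic: serve a delta only when the client sent
// a base digest, accepts bsdiff, and the artifact already exists.
func Decide(accept, base, sha256New string, exists func(key string) bool) Decision {
	if base == "" || !acceptsBsdiff(accept) {
		return Decision{}
	}
	key := "deltas/" + sha256New + "." + base + ".bsdiff.gz"
	if !exists(key) {
		return Decision{} // not yet computed, or base too old
	}
	return Decision{Delta: true, Key: key}
}

func main() {
	have := map[string]bool{"deltas/new.old.bsdiff.gz": true}
	exists := func(key string) bool { return have[key] }
	accept := "application/x-bsdiff, application/zip;q=0.9"

	fmt.Println(Decide(accept, "old", "new", exists))            // delta available
	fmt.Println(Decide(accept, "ancient", "new", exists))        // no artifact for this base
	fmt.Println(Decide("application/zip", "old", "new", exists)) // client did not opt in
}
```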
6.5 Client Apply Flow
Client (Flutter / CLI)
│
├── 1. Read installed package SHA256 → sha256_base
├── 2. Check for update: GET /apps/{slug}/manifest?version=latest
│ → new version available, sha256_new, size_full
│
├── 3. Request delta:
│ GET /apps/{slug}/download?version=X&base=<sha256_base>
│ Accept: application/x-bsdiff, ...
│
├── 4a. If response Content-Type == application/x-bsdiff:
│ decompress gzip → raw_patch_bytes
│ target_bytes = bspatch(current_kpkg_bytes, raw_patch_bytes)
│ actual_sha256 = sha256(target_bytes)
│ if actual_sha256 != sha256_new: goto 4b (fallback)
│ write target_bytes → disk
│ install new package
│
└── 4b. Fallback (no delta, mismatch, or any error):
GET /apps/{slug}/download?version=X (full package)
verify sha256
install
**Atomic write:** The patched bytes are written to a temporary file in the same directory, then renamed atomically (os.Rename) to avoid partial writes.
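Step 4a's verification and the atomic write can be sketched in Go as follows. `VerifyAndWrite` and `ErrHashMismatch` are hypothetical names, and the example assumes the patched bytes fit in memory. Creating the temporary file in the destination directory keeps the rename on a single filesystem.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// ErrHashMismatch signals that the caller should fall back to a full download.
var ErrHashMismatch = errors.New("sha256 mismatch")

// VerifyAndWrite checks target against the manifest digest and, only if it
// matches, writes it to dest via a temporary file plus rename.
func VerifyAndWrite(target []byte, wantSHA256, dest string) error {
	sum := sha256.Sum256(target)
	if hex.EncodeToString(sum[:]) != wantSHA256 {
		return ErrHashMismatch
	}
	tmp, err := os.CreateTemp(filepath.Dir(dest), ".kpkg-*.tmp")
	if err != nil {
		return err
	}
	// Best-effort cleanup; fails harmlessly once the rename has succeeded.
	defer os.Remove(tmp.Name())

	if _, err := tmp.Write(target); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil { // flush to disk before renaming
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), dest)
}

func main() {
	dir, err := os.MkdirTemp("", "kpkg-demo")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)

	target := []byte("patched package bytes")
	sum := sha256.Sum256(target)
	dest := filepath.Join(dir, "myapp.kpkg")

	fmt.Println("matching digest:", VerifyAndWrite(target, hex.EncodeToString(sum[:]), dest))
	fmt.Println("wrong digest:", VerifyAndWrite(target, "00", dest))
}
```

Verifying before any byte reaches the destination path means a failed patch never replaces the installed package, so the fallback in step 4b always starts from a known-good state.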
6.6 Integrity Verification
Integrity is enforced at three levels:
- **Manifest SHA256:** The app manifest (signed by the publisher) contains the sha256 of the expected final .kpkg. The client always verifies the applied result matches this value.
- **Delta artifact hash:** The server records the SHA256 of each .bsdiff.gz file at generation time. The download endpoint includes an X-Koder-Delta-Artifact-SHA256 header for in-transit verification.
- **Full-package fallback:** Any verification failure (corrupt download, bspatch error, SHA256 mismatch) triggers an automatic fallback to full package download. The fallback is logged and reported in telemetry.
Verification failure → log event {
app: slug,
version_new: X,
base_sha256: ...,
failure_reason: "sha256_mismatch" | "bspatch_error" | "download_error",
fallback: true
}
6.7 Backwards Compatibility
Delta delivery is **strictly opt-in** by the client:
- Clients that do not send Accept: application/x-bsdiff always receive full packages.
- Clients that do not send ?base=SHA256 always receive full packages.
- Existing CLI (koder-hub install) and legacy Flutter clients are unaffected.
The server never sends a delta unless the client explicitly requests one. This ensures zero behavior change for existing clients during rollout.
6.8 Storage Layout
object-store/
├── packages/
│ └── <sha256>.kpkg # full packages (existing)
└── deltas/
    └── <sha256_new>.<sha256_old>.bsdiff.gz
**Retention policy:**
- Full packages: retain indefinitely (all versions must remain downloadable).
- Delta artifacts: retain for as long as both the base and target versions are published. When a version is yanked, all delta artifacts referencing it as base or target are garbage-collected.
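Because both digests are embedded in the artifact name, garbage collection on a yank can select candidates from object keys alone. The sketch below assumes the retention rule that a delta is kept only while both its base and target are published; `DeltasToCollect` is a hypothetical helper, and single letters stand in for real digests.

```go
package main

import (
	"fmt"
	"strings"
)

// DeltasToCollect returns the delta keys that reference the yanked digest
// as either target (first component) or base (second component).
func DeltasToCollect(keys []string, yanked string) []string {
	var out []string
	for _, k := range keys {
		name := strings.TrimSuffix(strings.TrimPrefix(k, "deltas/"), ".bsdiff.gz")
		// Hex digests contain no dots, so the first dot separates the two.
		target, base, ok := strings.Cut(name, ".")
		if !ok {
			continue // not a delta artifact name; leave it alone
		}
		if target == yanked || base == yanked {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	keys := []string{
		"deltas/c.b.bsdiff.gz", // c built from b
		"deltas/c.a.bsdiff.gz", // c built from a
		"deltas/b.a.bsdiff.gz", // b built from a
	}
	for _, k := range DeltasToCollect(keys, "b") {
		fmt.Println("collect:", k)
	}
}
```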
7. Bandwidth Savings Estimate
Based on internal analysis of 5 Koder applications:
| Scenario | Full Package | Delta Size | Saving |
|---|---|---|---|
| Patch release (bug fix) | 35 MB | 0.5–2 MB | 94–99% |
| Minor release (new feature) | 35 MB | 2–7 MB | 80–94% |
| Major release (new runtime) | 35 MB | 15–25 MB | 29–57% |
**Fleet projection** (10,000 active installs, one update per week, avg 35 MB package):
| Update type | Without delta | With delta (est.) |
|---|---|---|
| Patch | 350 GB/week | 5–20 GB/week |
| Minor | 350 GB/week | 20–70 GB/week |
| Major | 350 GB/week | 150–250 GB/week |
Delta delivery yields only modest savings for major releases where large binary sections are replaced (e.g., a runtime version bump), but these events are infrequent.
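The projection is simple arithmetic: per-update download size multiplied by the number of installs. The sketch below reproduces it from the delta size ranges in the savings table, assuming decimal units (1 GB = 1000 MB) and that every install takes the delta path with no fallbacks; the function names are hypothetical.

```go
package main

import "fmt"

// WeeklyEgressGB returns fleet egress in GB/week for one update per install
// per week, given a per-update download size in MB.
func WeeklyEgressGB(installs int, sizeMB float64) float64 {
	return float64(installs) * sizeMB / 1000
}

// SavingPercent returns the bandwidth saved by a delta relative to the
// full package, as a percentage.
func SavingPercent(fullMB, deltaMB float64) float64 {
	return (1 - deltaMB/fullMB) * 100
}

func main() {
	const installs = 10000
	const fullMB = 35.0
	scenarios := []struct {
		name     string
		min, max float64 // delta size range in MB
	}{
		{"Patch", 0.5, 2},
		{"Minor", 2, 7},
		{"Major", 15, 25},
	}
	fmt.Printf("Full package: %.0f GB/week\n", WeeklyEgressGB(installs, fullMB))
	for _, s := range scenarios {
		fmt.Printf("%s: %.0f to %.0f GB/week, saving %.0f%% to %.0f%%\n",
			s.name,
			WeeklyEgressGB(installs, s.min), WeeklyEgressGB(installs, s.max),
			SavingPercent(fullMB, s.max), SavingPercent(fullMB, s.min))
	}
}
```

Real egress will sit somewhat above these figures, since every fallback to a full download adds 35 MB for that client.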
8. Security Considerations
8.1 Malicious Delta
A delta artifact produced from a tampered base can produce an arbitrary target file. Mitigations:
- Delta generation runs only on the Store backend, never on user-supplied input.
- The client always verifies the final SHA256 against the publisher-signed manifest. A valid delta applied to the wrong base will almost certainly produce a SHA256 mismatch and trigger fallback.
8.2 Delta Amplification
A malicious publisher could upload packages designed to produce pathologically large delta artifacts (patch bombs). Mitigation:
- Maximum delta artifact size cap: min(full_package_size * 1.1, 100 MB). If bsdiff output exceeds this cap, the delta is discarded and only the full package is served.
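A minimal sketch of the cap check. It assumes "100 MB" means decimal megabytes and uses integer arithmetic for the 1.1 factor; `DeltaSizeCap` and `DeltaAllowed` are hypothetical names.

```go
package main

import "fmt"

// hardCapBytes is the absolute ceiling; decimal megabytes are an assumption.
const hardCapBytes int64 = 100 * 1000 * 1000

// DeltaSizeCap returns min(fullPackageSize * 1.1, 100 MB) in bytes.
func DeltaSizeCap(fullPackageSize int64) int64 {
	scaled := fullPackageSize + fullPackageSize/10 // 1.1x without floats
	if scaled < hardCapBytes {
		return scaled
	}
	return hardCapBytes
}

// DeltaAllowed reports whether a generated delta may be stored and served.
func DeltaAllowed(deltaSize, fullPackageSize int64) bool {
	return deltaSize <= DeltaSizeCap(fullPackageSize)
}

func main() {
	full := int64(35_000_000)
	fmt.Println("cap for a 35 MB package:", DeltaSizeCap(full))
	fmt.Println("2 MB delta allowed:", DeltaAllowed(2_000_000, full))
	fmt.Println("60 MB delta allowed:", DeltaAllowed(60_000_000, full))
}
```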
8.3 Storage Exhaustion
With N=3 deltas per publish and a busy publisher, delta storage can grow. Mitigation:
- Delta storage is soft-limited per app (configurable, default 500 MB).
- Delta GC runs on version yanks and retention policy expiry.
9. Failure Modes and Mitigations
| Failure | Client behavior | Server behavior |
|---|---|---|
| Delta not yet computed | Fallback to full download | Return 200 full package |
| Delta generation fails | N/A | Log error; mark delta as unavailable; serve full package |
| Client bspatch error | Fallback to full download | N/A |
| SHA256 mismatch after patch | Fallback to full download | Log telemetry event |
| Object store unavailable | Retry with exponential backoff | 503 |
| Base version yanked mid-download | Download will fail; client retries without base param | Delta GC removes artifact |
10. Observability
**Metrics (Prometheus):**
store_delta_generation_duration_seconds{app, status}
store_delta_size_bytes{app, version_new, version_base}
store_delta_download_total{app, result} # result: delta_hit | delta_miss | fallback
store_delta_bandwidth_saved_bytes_total{app}
**Log events:**
- delta.generated: artifact created, includes sizes and generation time.
- delta.generation_failed: error detail.
- delta.served: client received delta.
- delta.fallback: client requested delta but received full package (with reason).
- delta.apply_failed: client-side apply failure (reported via telemetry endpoint).
11. Roadmap
Phase 1 — Server-Side Delta Generation + API (Weeks 1–4)
- bsdiff Go wrapper integrated into Store backend.
- Background worker for delta generation on publish.
- Object store layout for delta artifacts.
- Download API delta-aware response (Content-Type, headers).
- Unit + integration tests for delta generation and serving.
- Metrics and logging.
Phase 2 — Flutter Client Apply (Weeks 5–8)
- bspatch integration in Flutter app (via FFI or Dart port).
- Delta-aware download flow with SHA256 verification.
- Fallback to full package on any failure.
- UI: show "Updating (small patch)" vs "Updating (full download)".
Phase 3 — Edge Node Delta Serving (Weeks 9–16)
- Federated mirrors (RFC-002) cache delta artifacts by content hash.
- Edge nodes serve deltas without contacting Origin for known hashes.
- Benchmark: time-to-update at high concurrency (10k simultaneous updates).
12. Open Questions
- **Pure Go vs CGO bsdiff:** Using CGO links the reference C implementation but complicates cross-compilation. A pure-Go reimplementation (e.g., gabstep/binarydist) avoids CGO but may be slower for large packages. Decision needed before Phase 1 implementation.
- **Delta depth N:** N=3 is based on internal telemetry. Should N be configurable per app? Large enterprise deployments may have slower update cycles requiring N=5 or N=7.
- **Parallel delta generation:** Should the worker generate all N deltas in parallel? Parallelism speeds up availability but increases CPU and memory peaks. Start sequential, profile, then parallelize if needed.
- **Delta for pre-release channels:** Should alpha and beta channels participate in delta delivery? Pre-release packages change more frequently and are installed by fewer users; the ROI may not justify the storage cost.
13. Alternatives Considered
13.1 xdelta3
xdelta3 is a widely used binary delta algorithm. Compared to bsdiff, it is faster at generation time but produces larger patches for compiled executables. Since our bottleneck is download bandwidth (not generation time), bsdiff is preferred.
13.2 rsync/librsync
rsync operates on rolling checksums and requires the client to have the old file and a rolling checksum manifest. This would require a round-trip to compute the checksum set before the server can construct the delta. bsdiff generates the full delta server-side from both versions, avoiding the round-trip.
13.3 ZIP member-level delta
Computing bsdiff on individual ZIP members (resources, assets, native libraries) could improve patch quality when only a few members change. The complexity of a ZIP-aware differ and the additional infrastructure required (member-level object store) do not justify the marginal improvement over whole-archive bsdiff for our package sizes.
13.4 Separate "update package" published by developers
Some stores require developers to manually publish a delta package. This shifts computation burden to the publisher and requires tooling on their side. Automatic server-side delta generation (this RFC) provides better developer ergonomics with no workflow change required.
14. References
- C. Percival, "Naive differences of executable code," 2003. http://www.daemonology.net/bsdiff/
- gabstep/binarydist: pure-Go bsdiff/bspatch implementation
- specs/kpkg/format.kmd: Koder Package format specification
- RFC-002: Federation Protocol (productsdevhubdocsrfcs/RFC002federationprotocol.md)
- Ticket #013: Delta Delivery Server (productsdevhubbacklogpending/013-deltadelivery-server.md)
- Ticket #014: Delta Delivery Client (productsdevhubbacklogpending/014-deltadelivery-client.md)