Pg dump v17
pg-dump v17 — Wire Compatibility Spec
Normative spec for what pg_dump --version 17 expects from a Postgres-wire server. Used as the contract that infra/data/kdb (kdbpgwire / kdbgateway) must satisfy to support pg_dump roundtrips, and as the sourceof-truth for the per-table coverage work tracked in infra/data/kdb#341.
1. Provenance
The catalogue of queries below was captured live from PostgreSQL 17.9 (Debian 17.9-0+deb13u1) using infra/data/kdb/scripts/pgdump-v17-capture.sh.
The script spins a throw-away PG 17 cluster, applies a varied schema (5 tables, 4 indexes including partial, 2 explicit sequences, FK + CHECK constraints, view, materialized view, enum type), and runs pg_dump under log_statement = all so the server records the exact stream of statements pg_dump issued.
Captures are checked in under meta/docs/stack/specs/postgres-compat/captures/pg-dump-v17-queries-<flag>.sql for the following flag combinations:
| File suffix | Flags |
|---|---|
default |
(no flags) |
schema-only |
--schema-only |
data-only |
--data-only |
no-owner-no-acl |
--no-owner --no-acl |
table-users |
--table=users |
Recapture by running the script — output is bytestable across runs of the same PG 17 minor (pg_dump does not emit non-deterministic queries).
2. Session prelude (every invocation)
The first ~13 statements in every pg_dump run are session-shape GUCs. Every server claiming pg_dump-compat MUST accept these without error (returning SET / SELECT is sufficient; the values can be ignored if the server has no equivalent knob):
SELECT pg_catalog.set_config('search_path', '', false);
SELECT pg_catalog.pg_is_in_recovery();
SELECT pg_catalog.set_config('search_path', '', false);
SET DATESTYLE = ISO;
SET INTERVALSTYLE = POSTGRES;
SET extra_float_digits TO 3;
SET synchronize_seqscans TO off;
SET statement_timeout = 0;
SET lock_timeout = 0;
SET idle_in_transaction_session_timeout = 0;
SET transaction_timeout = 0;
SET row_security = off;
SELECT set_config(name, 'view, foreign-table', false)
FROM pg_settings
WHERE name = 'restrict_nonsystem_relation_kind';
BEGIN;
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ, READ ONLY;The final two open a readonly repeatableread transaction that wraps the entire dump; servers without MVCC snapshots can accept these as no-ops provided they expose a stable read for the duration of the dump.
3. Catalogue queries — required surface
pg_dump v17 issues queries against the following pg_catalog relations. Servers MUST respond with the row shape PostgreSQL 17 produces; missing columns cause pg_dump to bail with a parser error in its PQgetvalue(... , <col>) calls. Functions in *old*are outofband helpers (not relations) — they MUST exist and return the documented type.
3.1 Relations (must be queryable, even if empty)
| Relation | Min. columns pg_dump v17 reads | Notes |
|---|---|---|
pg_roles |
oid, rolname |
Empty result OK if --no-owner |
pg_extension |
tableoid, oid, extname, extrelocatable, extversion, extconfig, extcondition |
Empty result OK; pg_dump iterates the list |
pg_depend |
classid, objid, refobjid (+ refclassid filter, deptype filter) |
Drives extension-membership detection |
pg_namespace |
tableoid, oid, nspname, nspowner, nspacl (+ acldefault('n', nspowner)) |
At minimum public and pg_catalog |
pg_class |
tableoid, oid, relname, relnamespace, relkind, reltype, relowner, relchecks, relhasindex, relhasrules, relpages, relhastriggers, relpersistence, reloftype, relacl, relfrozenxid, relreplident, relrowsecurity, relforcerowsecurity, relminmxid, reloptions, relispopulated, relispartition |
Drives the entire object stream — see §4 below |
pg_proc |
tableoid, oid, proname, prolang, pronargs, proargtypes, prorettype, proacl, pronamespace, proowner, prokind |
Filter prokind <> 'a' (non-aggregates) |
pg_type |
tableoid, oid, typname, typnamespace, typacl, typowner, typelem, typrelid, typtype, typisdefined |
Used to resolve column types |
pg_language |
tableoid, oid, lanname, lanpltrusted, lanplcallfoid, laninline, lanvalidator, lanacl, lanowner |
Filter WHERE lanispl |
pg_attribute |
attname, atttypmod, attstorage, attcompression, attmissingval, attstattarget, attndims, attlen, attbyval, attalign, attnotnull, atthasdef, attidentity, attgenerated, attisdropped, attislocal, attinhcount |
*-CRITICAL*— column shape drives CREATE TABLE reconstruction |
pg_constraint |
oid, conname, contype, conrelid, conindid, connamespace, confupdtype, confdeltype, confmatchtype, condeferrable, condeferred, convalidated, conkey, confkey, confrelid |
CHECK / FK / UNIQUE reconstruction |
pg_index |
indrelid, indkey, indcollation, indclass, indoption, indpred, indexprs, indisclustered, indisreplident, indisvalid, indisready, indislive |
Drives CREATE INDEX reconstruction |
pg_inherits |
inhrelid, inhparent, inhseqno |
Partition / inheritance tree |
pg_description |
objoid, classoid, objsubid, description |
COMMENT ON … reconstruction |
pg_publication |
oid, pubname, pubowner, puballtables, pubinsert, pubupdate, pubdelete, pubtruncate, pubviaroot |
Empty result OK |
pg_settings |
name, setting |
set_config(...) FROM pg_settings WHERE name = '…' |
pg_foreign_table |
ftrelid, ftserver |
Empty result OK if no FDWs |
3.2 Relations newly required by PG 17 (gaps vs PG 15/16)
These either did not exist or had different shape in earlier PG and are likely to be missing in a server claiming PG 13/14 compatibility:
| Relation | Why pg_dump v17 needs it |
|---|---|
pg_cast |
Resolves implicit-cast paths during type reconstruction |
pg_collation |
Per |
pg_conversion |
Server-encoding conversions |
pg_default_acl |
Default privileges from ALTER DEFAULT PRIVILEGES … |
pg_event_trigger |
Event triggers (DDL hooks) — empty result usually OK |
pg_init_privs |
Captures the "original ACL" assigned by an extension |
pg_opclass |
Index operator class lookup |
pg_operator |
Operator definitions (mostly transitive via opfamily) |
pg_opfamily |
Index operator family |
pg_publication_namespace |
Publication |
pg_range |
Range type bounds |
pg_rewrite |
View / materialized-view definitions (joined with pg_get_viewdef) |
pg_seclabels |
Security labels (SECURITY LABEL …) |
pg_sequence |
Sequence parameters (PG 10+ split-out of pg_class.relkind = 'S') |
pg_statistic_ext |
Extended statistics objects |
pg_tablespace |
Non-default tablespaces — usually empty in cloud setups |
pg_transform |
Transform mappings (e.g. plpython hstore) |
pg_ts_config / pg_ts_dict / pg_ts_parser / pg_ts_template |
Full |
3.3 Functions
| Function | Return | When pg_dump v17 calls it |
|---|---|---|
pg_is_in_recovery() |
bool |
Session prelude — almost certainly returns false |
pg_get_viewdef(oid) |
text |
View / materialized-view body |
pg_get_indexdef(oid) |
text |
CREATE INDEX … reconstruction |
pg_get_constraintdef(oid) |
text |
CHECK / FK / UNIQUE body |
pg_get_triggerdef(oid) |
text |
Trigger reconstruction |
pg_get_expr(node, relid) |
text |
Default |
pg_get_serial_sequence(table, col) |
text |
Reidrata SERIAL / IDENTITY columns |
pg_options_to_table(text[]) |
setof | FDW server OPTIONS decoding |
acldefault(char, oid) |
aclitem[] |
Default-ACL synthesis (called inline in §3.1 queries) |
array_remove(any[], any) |
any[] |
Used inline to drop check_option=… from reloptions |
3.4 PG 17 row-shape additions
The big v15→v17 movement in pg_attribute matters:
attstorageis"char"in v17 (wasoidhistorically; some legacyshims still return oid — pg_dump v17 will reject those rows).
attcompressionexists from v14+ ("char", default'p'for pglz).attmissingvalisanyarray(often emitted as text).attstattargetisint2(became nullable in v17 —NULLallowed).attgeneratedis"char"('s'for STORED,''for none).attidentityis"char"('a'for ALWAYS,'d'for BY DEFAULT,''for none).
pg_class.relrowsecurity and relforcerowsecurity are bool (RLS-related). pg_class.relispartition is bool (PG 10+); MUST exist even on non-partitioned tables (default false).
4. Object-fetch loop (the heart of pg_dump)
After the session prelude, pgdump runs *ne large `pgclass` query*to materialize the entire tableofobjects:
SELECT c.tableoid, c.oid, c.relname, c.relnamespace, c.relkind, c.reltype,
c.relowner, c.relchecks, c.relhasindex, c.relhasrules, c.relpages,
c.relhastriggers, c.relpersistence, c.reloftype, c.relacl,
acldefault(CASE WHEN c.relkind = 'S' THEN 's'::"char" ELSE 'r'::"char" END,
c.relowner) AS acldefault,
CASE WHEN c.relkind = 'f'
THEN (SELECT ftserver FROM pg_catalog.pg_foreign_table WHERE ftrelid = c.oid)
ELSE 0 END AS foreignserver,
c.relfrozenxid, tc.relfrozenxid AS tfrozenxid, tc.oid AS toid,
tc.relpages AS toastpages, tc.reloptions AS toast_reloptions,
d.refobjid AS owning_tab, d.refobjsubid AS owning_col,
tsp.spcname AS reltablespace, false AS relhasoids,
c.relispopulated, c.relreplident, c.relrowsecurity,
c.relforcerowsecurity, c.relminmxid, tc.relminmxid AS tminmxid,
array_remove(array_remove(c.reloptions, 'check_option=local'),
'check_option=cascaded') AS reloptions,
CASE WHEN 'check_option=local' = ANY (c.reloptions) THEN 'LOCAL'::text
WHEN 'check_option=cascaded' = ANY (c.reloptions) THEN 'CASCADED'::text
ELSE NULL END AS checkoption,
am.amname, (d.deptype = 'i') IS TRUE AS is_identity_sequence,
c.relispartition AS ispartition
FROM pg_class c …(Truncated — see captures/pg-dump-v17-queries-schema-only.sql for the full text.) The relkind values returned drive the rest of the dump:
relkind |
Followed by |
|---|---|
'r' |
pg_attribute → pg_constraint → pg_index → LOCK TABLE … ACCESS SHARE → COPY … TO STDOUT (data-only) |
'v' |
pg_get_viewdef(oid) |
'm' |
pg_get_viewdef(oid) (materialized) + REFRESH MATERIALIZED VIEW directive |
'S' |
pg_sequence row + setval('…', last_value, is_called) |
'i' |
Index — usually emitted by following pg_index |
'p' |
Partitioned table — followed by pg_inherits to enumerate children |
'c' |
Composite type — pg_dump may skip if the type is system-managed |
'f' |
Foreign table — pulls the server name from the foreignserver correlated subquery |
The LOCK TABLE … IN ACCESS SHARE MODE is non-negotiable: servers MUST either implement table locks or accept the syntax as a no-op (returning LOCK TABLE). Without it, pg_dump v17 aborts the dump.
5. Data path (--data-only)
For each non-system table whose relkind = 'r', pg_dump issues:
COPY public.users (id, handle, email, score, created_at,
tags, metadata, avatar, status) TO STDOUT;The column list is materialized from the pg_attribute query above. The server MUST emit the response on the wire as CopyOutResponse → CopyData* → CopyDone → CommandComplete("COPY n"). Text format is the default; binary is not used by pg_dump v17.
Implementation in kdbpgwire: ticket [infradatakdb#340](......infradatakdbbacklogdone340pgwirecopyto-stdout.kmd).
6. Outofscope for v17 compat
- *oreign
key reconstruction across schemas with `-schema=…`*—pg_dump v17 chases FKs to outside-schema parents; the audit script pins to
publicso the captures do not exercise this corner. - *erver
sideobject dump uses aCOMMENT ON LARGE OBJECT*— largeseparate code path (
--blobs/--no-blobs) not exercised by the audit. Tracked separately if/when lobs become a goal. - *arallel pg_dump (
-j N)*— parallel restore uses extra COPYsessions but the per-session query stream is the same.
- *pg_dump -nserts
** — usesINSERT INTO … VALUES (…)` instead ofCOPY. Audit captures focus on the default COPY path.
7. Versioning
Captures are pinned to PG 17.9. Re-run the capture script when:
- PostgreSQL 18 ships and
pg_dumpv18 diverges (verify with a freshcapture against PG 18 in
/usr/lib/postgresql/18/bin/). - A user reports a pg_dump failure on kdb-gateway that the diff report
does not cover.
See kdb-gateway-v0.0.13-vs-pg17-pgdump.md for the perquery coverage delta against the current kdbpgwire surface.
8. Conformance status (20260511)
The 8 originallyidentified RCRITICAL gaps have been closed across a multiday series of fixes culminating in the FK roundtrip milestone. Latest harness state (scripts/pg-dump-roundtrip.sh):
| Check | State |
|---|---|
pg_dump --schema-only exit code |
**✓ |
pg_dump --data-only exit code |
**✓ |
psql -f kdb-schema.sql (PG 17) |
*c=0*✓ |
psql -f kdb-data.sql (PG 17) |
*c=0*✓ |
| Row |
*00% match*✓ |
| FK guard (orphan INSERT rejected) | *locked*(enforced) ✓ |
| Schema diff lines | *5*(was 77 originally) |
Closed work-items
#341umbrella — pg_catalog coverage for the 8 R-CRITICAL gaps#342— round-trip test harness#343— extended-queryset_configemission#344(16 waves) — universaltableoid+ per-table column gaps#345— plannerIS TRUE / IS FALSE#346— subquery catalog inheritance#347—IS DISTINCT FROM#348— text[N]subscript#349—ARRAY (SELECT …)constructor#350—Text → Array<T>cast#351—pg_catalog.prefix-strip for aggregates#353— FK round-trip umbrella (PK ALTER, FK ALTER, conindid linkage)#357— regclass cast → oid#360— FK capture under tenant config#361— synthetic pgindex + pgclass for PK/UNIQUE
Open polish items (post-milestone)
#352— schema-diff cosmetic cleanup (35 lines remaining;\restrictrandom tokens unavoidable; public SCHEMA placeholder block deferred)
#359— public SCHEMA block suppression (requires pg_dump v17 sourcevendoring to map the suppression algorithm)
Regression coverage
infra/data/kdb/crates/kdb-pgwire/tests/pgwire_fk_roundtrip.rs guards the milestone's three load-bearing wirings (see registry entries 518-520).