fix(providers): harden postgres major upgrades #151
Thanks for this @bherila. The volume-handoff hardening is the right direction, and these are genuinely dangerous paths (a bad handoff = pg18 opening stale pg17 PGDATA), so the "fail early instead of silently reusing incompatible data" framing is exactly right. The
A few items to address before merge (CHANGELOG conflict aside):
1. Route the new exec through
The code being replaced was already a bespoke inline loop, so this isn't introducing the first duplication, but consolidating onto
2. Copy-sidecar leak on the
3. Real pg17→pg18 validation before merge (the important one).
Test changes themselves look good: swapping the
|
If you have time to run this on a site that actually uses pg, that would be super helpful. Personally I only use MySQL for my sites.
|
I'm going to rebase this and bring back my other PR either tonight or tomorrow |
|
Perfect, I'm with #158 currently. When I merge it I will be able to test this.
7a5652a to 94be534
Summary
Extracts the Postgres major-upgrade hardening that was originally discovered while validating MariaDB PITR E2E coverage in #138.
This PR keeps the scope to Postgres upgrade correctness only:
- `s3_sources` and `backups` rows in the Postgres upgrade test fixture so `postgres_major_upgrades.pre_upgrade_backup_id` satisfies the real FK constraint;
- `psql SELECT 1`, not just Docker health / `pg_isready`, and create the configured DB if entrypoint initialization has not finished that race yet;
- `PGDATA` after a supposedly clean copy;
Why this is required
The pg17 to pg18 upgrade path depends on a clean data-volume handoff. During CI validation, the old behavior could leave the live volume or copy sidecar around and then start the target pg18 container against stale pg17 data. That produces misleading readiness states and can make an upgrade appear to advance while the target database is not actually usable.
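To illustrate the difference between a liveness-style check and a query-level readiness check, here is a minimal sketch of a retry loop around `psql -tAc "SELECT 1"` run through `docker exec`. The helper names (`psql_probe_args`, `wait_until_ready`, `psql_select_one`) and the container, user, and database parameters are hypothetical and are not taken from this PR's code. It only assumes the `docker` CLI is on `PATH` and that `psql` exists inside the container.

```rust
use std::process::Command;
use std::thread::sleep;
use std::time::Duration;

/// Build the argv for `docker exec <container> psql ...` (hypothetical helper).
/// `-t` and `-A` make psql print only the bare value, `-c` runs one command.
fn psql_probe_args(container: &str, user: &str, db: &str) -> Vec<String> {
    ["exec", container, "psql", "-U", user, "-d", db, "-tAc", "SELECT 1"]
        .iter()
        .map(|s| s.to_string())
        .collect()
}

/// Call `probe` up to `attempts` times, sleeping `delay` between failures.
/// Returns true as soon as one probe succeeds, false if all attempts fail.
fn wait_until_ready<F: FnMut() -> bool>(mut probe: F, attempts: u32, delay: Duration) -> bool {
    for attempt in 0..attempts {
        if probe() {
            return true;
        }
        // Do not sleep after the final failed attempt.
        if attempt + 1 < attempts {
            sleep(delay);
        }
    }
    false
}

/// True only if psql connected to `db` and the query actually returned 1.
/// A failure to spawn `docker` at all is treated as "not ready".
fn psql_select_one(container: &str, user: &str, db: &str) -> bool {
    Command::new("docker")
        .args(psql_probe_args(container, user, db))
        .output()
        .map(|out| out.status.success() && String::from_utf8_lossy(&out.stdout).trim() == "1")
        .unwrap_or(false)
}
```

A caller might use it as `wait_until_ready(|| psql_select_one("pg18", "postgres", "app"), 30, Duration::from_secs(1))`, where all three names are placeholders. Keeping the probe as a closure lets the retry logic be tested without Docker.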
The fixture update is also required because the tests now run against the real schema: the fake backup id used by the upgrade tests violated the `backups` FK once the orchestrator persisted it to `postgres_major_upgrades.pre_upgrade_backup_id`.
Together these changes make the Postgres upgrade tests exercise the real control-plane constraints and make the runtime upgrade path fail early instead of silently reusing incompatible data.
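One way to picture "fail early instead of silently reusing incompatible data" is a guard that reads the `PG_VERSION` file that PostgreSQL writes at the top of an initialized data directory and compares it with the target major version. The sketch below is not this PR's implementation. The function names are hypothetical, and it assumes the data directory is readable as a local path and that the cluster is PostgreSQL 10 or newer, where `PG_VERSION` holds a single integer.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Read the major version recorded in `<pgdata>/PG_VERSION`.
/// Returns Ok(None) when the file is absent (directory not initialized).
fn pgdata_major(pgdata: &Path) -> io::Result<Option<u32>> {
    match fs::read_to_string(pgdata.join("PG_VERSION")) {
        Ok(s) => s
            .trim()
            .parse::<u32>()
            .map(Some)
            .map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e)),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}

/// Hypothetical guard: refuse to proceed when the directory already holds
/// data written by a different major version than `target_major`.
fn ensure_pgdata_matches(pgdata: &Path, target_major: u32) -> Result<(), String> {
    match pgdata_major(pgdata) {
        // No PG_VERSION: treated here as an uninitialized directory.
        Ok(None) => Ok(()),
        Ok(Some(found)) if found == target_major => Ok(()),
        Ok(Some(found)) => Err(format!(
            "stale PGDATA: found major {found}, expected {target_major}"
        )),
        Err(e) => Err(format!("could not read PG_VERSION: {e}")),
    }
}
```

Treating a missing `PG_VERSION` as acceptable is a design choice made for this sketch. A stricter guard could also require the directory to be empty in that case.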
Test Plan
This branch was rebased onto the latest `upstream/main` (clean rebase, no conflicts). Only `crates/temps-providers` is touched.
Local:
- `cargo fmt --all -- --check`
- `cargo check --lib -p temps-providers`: passes, no warnings
- `cargo test --lib -p temps-providers postgres`: `87 passed; 0 failed` (Docker-dependent bodies skip gracefully at runtime when Docker is unavailable, per repo policy)
- `cargo test -p temps-providers --features docker-tests --lib orchestrator_phase_new_container_completes_under_timeout -- --nocapture --test-threads=1` (passes locally; Docker-dependent body skipped because local Docker is unavailable)
CI (full suite, including Docker/E2E, runs against this branch's head `94be5343`):