K8SPXC-1800 | run prepare job for PITR #2449
Signed-off-by: Mayank Shah <mayank.shah@percona.com>
	echo ${gtid}
}
[shfmt] reported by reviewdog 🐶
Pull request overview
Updates the PXC restore controller so PITR restores run the “prepare cluster” job (needed for cross-cluster replication and password-changed backups), and extends PITR E2E coverage to exercise restore-from-source onto a freshly recreated cluster with changed system user passwords.
Changes:
- Move PITR execution to occur after the “Preparing Cluster” stage by running it from the RestorePrepareCluster state handler.
- Extend PITR E2E scripts to support dynamic root passwords and add a new case restoring onto a recreated cluster.
- Update PITR test cluster manifest finalizers to ensure PVC cleanup during cluster deletion/recreate flows.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| pkg/controller/pxcrestore/controller.go | Adjust restore state machine so PITR is triggered after the prepare job completes. |
| e2e-tests/pitr/run | Add password randomization helper and a new PITR restore-on-new-cluster scenario. |
| e2e-tests/pitr/conf/pitr.yml | Add PVC-deletion finalizers to support delete/recreate in PITR E2E. |
| e2e-tests/functions | Pass root password into PITR recovery checks and wait for new intermediate restore states. |
Comments suppressed due to low confidence (1)
pkg/controller/pxcrestore/controller.go:355
- PITR restore for clusters with version < 1.18.0 appears to be broken: PITR job creation/state transition was moved into reconcileStatePrepareCluster, but reconcileStateRestore routes <1.18.0 restores directly to RestoreStartCluster (skipping RestorePrepareCluster), so RestorePITR is never reached and the PITR job won't run. Consider keeping the PITR branching in reconcileStateRestore for <1.18.0 (or otherwise ensuring PITR flows still create/run the PITR job when PrepareCluster is skipped).
if cluster.CompareVersionWith("1.18.0") >= 0 {
	log.Info("preparing cluster", "cluster", cr.Spec.PXCCluster)
	cr.Status.State = api.RestorePrepareCluster
} else {
	log.Info("starting cluster", "cluster", cr.Spec.PXCCluster)
	cr.Status.State = api.RestoreStartCluster
}
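The concern in this comment can be illustrated with a small, self-contained model of the restore state machine. This is a sketch under stated assumptions, not the operator's code: the `State` type, the `next`, `walk` and `visits` helpers, and the `keepLegacyPITR` switch are all hypothetical, the "Starting Cluster" and "Succeeded" strings are assumed, and `prepareSupported` merely stands in for `cluster.CompareVersionWith("1.18.0") >= 0`.

```go
package main

import "fmt"

// State is a simplified stand-in for the restore status states. The values are
// redeclared here for illustration only; "Starting Cluster" and "Succeeded"
// are assumed strings.
type State string

const (
	Restore        State = "Restoring"
	PrepareCluster State = "Preparing Cluster"
	PITR           State = "Point-in-time recovering"
	StartCluster   State = "Starting Cluster"
	Succeeded      State = "Succeeded"
)

// next models a single transition of the (hypothetical) state machine.
//   - prepareSupported stands in for cluster.CompareVersionWith("1.18.0") >= 0.
//   - pitr reports whether the restore requests point-in-time recovery.
//   - keepLegacyPITR enables the fallback suggested in the review comment:
//     branch to PITR directly from Restore when PrepareCluster is skipped.
func next(s State, prepareSupported, pitr, keepLegacyPITR bool) State {
	switch s {
	case Restore:
		if prepareSupported {
			return PrepareCluster
		}
		if pitr && keepLegacyPITR {
			return PITR
		}
		return StartCluster
	case PrepareCluster:
		// After this PR, PITR is entered only from the prepare stage.
		if pitr {
			return PITR
		}
		return StartCluster
	case PITR:
		return StartCluster
	default:
		return Succeeded
	}
}

// walk returns every state visited from Restore up to and including Succeeded.
func walk(prepareSupported, pitr, keepLegacyPITR bool) []State {
	states := []State{Restore}
	for s := Restore; s != Succeeded; {
		s = next(s, prepareSupported, pitr, keepLegacyPITR)
		states = append(states, s)
	}
	return states
}

// visits reports whether target appears in states.
func visits(states []State, target State) bool {
	for _, s := range states {
		if s == target {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(walk(true, true, false))  // version >= 1.18.0 with PITR
	fmt.Println(walk(false, true, false)) // version < 1.18.0 with PITR, no fallback
	fmt.Println(walk(false, true, true))  // version < 1.18.0 with PITR, fallback kept
}
```

In this model, the second walk never visits the PITR state, which is the behavior the comment warns about. The third walk shows one possible way to keep PITR reachable when the prepare stage is skipped.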
wait_backup_restore ${restore} "Restoring"
wait_backup_restore ${restore} "Preparing Cluster"
wait_backup_restore ${restore} "Point-in-time recovering"
desc "[CASE 4] delete PXC cluster ${cluster} and wait for PVCs"
kubectl_bin delete pxc ${cluster}
desc "[CASE 4] wait for PVCs (app.kubernetes.io/instance=${cluster}) to be deleted"
kubectl delete blocks until the object is gone
commit: b47912f
CHANGE DESCRIPTION
Problem:
The prepare job does not run when PITR is specified. As a result, PXC pods do not come up for cross-cluster replication, or when passwords have changed since the backup was taken.
Cause:
The prepare job is skipped in the PITR step.
Solution:
Run the prepare job for PITR restores.