Add HCP full backup/restore test suite for clusters with data plane #1921
base: oadp-dev
[APPROVALNOTIFIER] This PR is NOT APPROVED.
This pull-request has been approved by: mgencur.
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
This commit introduces a complete HCP (Hosted Control Plane) backup and restore testing framework with support for both newly created and existing HostedCluster environments.

- Add `hcp_full_backup_restore_suite_test.go`: Complete test suite for full HCP backup/restore scenarios
- Support for two operational modes:
  - `create`: Creates new HostedCluster for testing (existing behavior)
  - `existing`: Uses pre-existing HostedCluster with data plane
- Add Makefile variables for HCP test configuration:
  - `HC_BACKUP_RESTORE_MODE`: Controls test execution mode (create/existing)
  - `HC_NAME`: Specifies HostedCluster name for existing mode
  - `HC_KUBECONFIG`: Path to guest cluster kubeconfig for existing mode
- Pass HCP configuration parameters to e2e test execution
- Refactor `runHCPBackupAndRestore()` function for unified handling of both modes
- Add guest cluster verification functions (`PreBackupVerifyGuest`, `PostRestoreVerifyGuest`)
- Separate log gathering and DPA resource cleanup into reusable functions
- Enhanced error handling and validation for both control plane and guest cluster
- Add support for kubeconfig-based guest cluster operations
- Implement pre/post backup verification for guest cluster resources
- Add namespace creation/validation tests for guest cluster functionality
- Add `GetHostedCluster()` method to retrieve existing HostedCluster objects
- Add `ClientGuest` field to `HCHandler` for guest cluster operations
- Improve error message formatting in DPA helpers
- Add comprehensive testing documentation for HCP scenarios
- Include examples for running tests against existing HostedControlPlane
- Document environment variable configuration options
- Add conditional must-gather build based on `SKIP_MUST_GATHER` flag
- Enhanced e2e test parameter passing for HCP configurations

The implementation supports testing both scenarios where OADP needs to:

1. Create a new HostedCluster and test backup/restore (existing functionality)
2. Work with an existing HostedCluster that already has workloads and data plane

This enables comprehensive testing of HCP backup/restore functionality in realistic production-like environments where clusters already exist and contain user workloads.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
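For readers following the commit message, the two modes and the three flags can be pictured with the sketch below. It is only an illustration, not the code from this PR: `hcpConfig` and `parseHCPFlags` are hypothetical names, the default of `create` is an assumption, and so is the rule that `-hc_name` is required in `existing` mode. The flag names and the mode constants are the ones that appear in the diff.

```go
package main

import (
	"flag"
	"fmt"
)

// HCBackupRestoreMode mirrors the mode type shown in the diff.
type HCBackupRestoreMode string

const (
	HCModeCreate   HCBackupRestoreMode = "create"   // create a new HostedCluster for the test
	HCModeExisting HCBackupRestoreMode = "existing" // reuse an existing HostedCluster
)

// hcpConfig is a hypothetical container for the three flag values.
type hcpConfig struct {
	Mode       HCBackupRestoreMode
	Name       string
	Kubeconfig string
}

// parseHCPFlags is a hypothetical helper: it parses the flags and rejects
// combinations that cannot work (validation rules are assumptions).
func parseHCPFlags(args []string) (hcpConfig, error) {
	fs := flag.NewFlagSet("hcp", flag.ContinueOnError)
	mode := fs.String("hc_backup_restore_mode", string(HCModeCreate), "create or existing")
	name := fs.String("hc_name", "", "HostedCluster name (existing mode)")
	kubeconfig := fs.String("hc_kubeconfig", "", "guest cluster kubeconfig path (existing mode)")
	if err := fs.Parse(args); err != nil {
		return hcpConfig{}, err
	}

	cfg := hcpConfig{Mode: HCBackupRestoreMode(*mode), Name: *name, Kubeconfig: *kubeconfig}
	switch cfg.Mode {
	case HCModeCreate:
		return cfg, nil
	case HCModeExisting:
		if cfg.Name == "" {
			return hcpConfig{}, fmt.Errorf("-hc_name is required when mode is %q", cfg.Mode)
		}
		return cfg, nil
	default:
		return hcpConfig{}, fmt.Errorf("unknown -hc_backup_restore_mode %q", cfg.Mode)
	}
}

func main() {
	cfg, err := parseHCPFlags([]string{"-hc_backup_restore_mode=existing", "-hc_name=my-hc"})
	fmt.Println(cfg.Mode, cfg.Name, err)
}
```

Failing early on an unknown mode or a missing name keeps a misconfigured run from reaching the backup step.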
@@ -0,0 +1,94 @@
package e2e_test
Can you suggest a better name for this file? Maybe hcp_external_cluster_backup_restore_suite_test.go?
Or hcp_existing_cluster_backup_restore_suite_test.go
func runHCPBackupAndRestore(brCase HCPBackupRestoreCase, updateLastBRcase func(brCase HCPBackupRestoreCase), h *libhcp.HCHandler) {
const (
	HCModeCreate   HCBackupRestoreMode = "create"   // Create new HostedCluster for test
	HCModeExisting HCBackupRestoreMode = "existing" // Get existing HostedCluster
Maybe this could be external instead of existing
...
The work can be further extended for testing with ROSA, here's the initial draft: https://github.com/mgencur/oadp-operator/pull/1/files (just showing the idea, I sent this PR against my own branch).
Is the PR still ongoing? I'm missing some verifications at multiple levels. In general it looks good apart from the comments mentioned in the review.
That said, I would put these new tests under a different flag, something like TEST_HCP_NEW, to avoid impacting the current tests until we have them properly set up.
Makefile
Outdated
-hc_backup_restore_mode=$(HC_BACKUP_RESTORE_MODE) \
-hc_name=$(HC_NAME) \
-hc_kubeconfig=$(HC_KUBECONFIG) \
I would isolate those 3 new flags and pass them only when TEST_HCP == true. This way we would not affect the requirements of the other tests.
}
}

func postBackupVerifyGuest() VerificationFunctionGuest {
Will those be reused, with more checks added later? Or is that the whole verification?
Yes. This is just the beginning. It's basically a smoke test for the guest cluster. We can add checks over time, in separate PRs.
Is the PR still ongoing? I'm missing some verifications at multiple levels.
What other verifications are you missing? Just curious if it's specific to the guest cluster or something more general.
@kaovilai FYI
- Replace HC_BACKUP_RESTORE_MODE with TEST_HCP_EXTERNAL flag
- Rename "existing" mode to "external" for clarity
- Move HCP external test args to separate HCP_EXTERNAL_ARGS variable
- Rename hcp_full_backup_restore_suite_test.go to hcp_external_cluster_backup_restore_suite_test.go
- Update test labels from "hcp" to "hcp_external" for external cluster tests
- Simplify Makefile by removing unused HC mode variables from main test-e2e target
- Update documentation to reflect new external cluster test configuration
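The gating described in this commit (extra arguments passed only when the external flag is on) can be expressed as the Go sketch below. This is only an illustration of the logic; the real change lives in the Makefile, and the exact contents of HCP_EXTERNAL_ARGS are not shown on this page. `hcpExternalArgs` is a hypothetical name, and treating the literal value `true` as "enabled" is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// hcpExternalArgs returns the extra test arguments only when
// TEST_HCP_EXTERNAL is "true"; otherwise it contributes nothing, so the
// other test runs see no new requirements. The argument list is an assumption.
func hcpExternalArgs(getenv func(string) string) []string {
	if getenv("TEST_HCP_EXTERNAL") != "true" {
		return nil
	}
	return []string{
		"-hc_backup_restore_mode=external",
		"-hc_name=" + getenv("HC_NAME"),
	}
}

func main() {
	fmt.Println(strings.Join(hcpExternalArgs(os.Getenv), " "))
}
```

Passing the environment lookup as a function keeps the helper testable without touching the process environment.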
@jparrill, I tried to address your review comments.
/hold
…rieval

- Remove HC_KUBECONFIG flag and related global variables from test suite
- Remove hardcoded crClientForHC global client initialization
- Add GetHostedClusterKubeconfig() method to dynamically retrieve kubeconfig from HostedCluster status
- Update pre/post backup verification to create client on-demand using retrieved kubeconfig
- Clean up Makefile to remove HC_KUBECONFIG parameter handling
- Simplify HCHandler by removing ClientGuest field

This change improves test reliability by ensuring the guest cluster client is always created with the current kubeconfig rather than relying on potentially stale configuration passed via flags.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
/unhold
@mgencur: The following test failed, say
This commit introduces a complete HCP (Hosted Control Plane) backup and restore testing framework with support for both newly created and existing HostedCluster environments.
Key Features Added:

New Test Infrastructure
- `hcp_full_backup_restore_suite_test.go`: Test suite for full HCP backup/restore scenarios with data plane
- `create`: Creates new HostedCluster for testing (existing behavior)
- `external`: Uses pre-existing HostedCluster with data plane

Enhanced Configuration Options
- `HC_NAME`: Specifies HostedCluster name for `external` mode

Improved Test Architecture
- Refactor `runHCPBackupAndRestore()` function for unified handling of both modes
- Add guest cluster verification functions (`PreBackupVerifyGuest`, `PostRestoreVerifyGuest`)

Documentation Updates

Build System Improvements
- Add conditional must-gather build based on `SKIP_MUST_GATHER` flag

Technical Implementation:
The implementation supports testing both scenarios where OADP needs to:
This enables comprehensive testing of HCP backup/restore functionality in realistic production-like environments where clusters already exist and contain user workloads.
🤖 PR description Generated with Claude Code and modified.
Why the changes were made
How to test the changes made
-hc_backup_restore_mode=external.