Add OADP toolset for managing Velero backups and restores#122
Add OADP toolset for managing Velero backups and restores#122shubham-pampattiwar wants to merge 6 commits intoopenshift:mainfrom
Conversation
|
/assign matzew |
|
woot! thanks @shubham-pampattiwar |
|
/assign @Cali0707 |
|
@shubham-pampattiwar thanks for contributing a new toolset! A few thoughts:
Our experience with the core toolset and others so far is that in general we want to try to minimize the number of tools in the server, so that we don't overwhelm the model's context. E.g. cursor will return a warning to users when somewhere in the realm of 20-60 tools are loaded (and this is providing 90!) IMO we should be working to drastically reduce the number of tools provided by this toolset. A few things that can likely help:
For example, in the kubevirt toolset they made the action to be done to a VM a parameter of the tool, rather than having one tool for each action: openshift-mcp-server/pkg/toolsets/kubevirt/vm/lifecycle/tool.go Lines 40 to 44 in f13d475 Happy to discuss further on how to reduce the count of the tools while still ensuring agents can take all the actions they need to! |
Hey @Cali0707, thanks for the feedback! Makes sense - 90 tools is definitely too many. Here's my plan to reduce to ~9 tools using the action parameter pattern like kubevirt: Less common CRDs (DeleteBackupRequest, DownloadRequest, ServerStatusRequest, PodVolumeBackup/Restore) can use generic resources_* tools. Let me know if this approach works! |
|
Reducing the number of tools sounds like a good idea, as well leveraging core tools (e.g. A good idea is to also look at providing evals, like other toolsets (kubevirt for instance) We use mcpchecker, which now has also quickstarts for get going! |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: shubham-pampattiwar The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Addressed review feedback - refactored from 90 individual tools to 8 consolidated tools using the action parameter pattern (like kubevirt). Changes:
Also added 8 eval tasks - all passing (8/8 tasks, 24/24 assertions). |
pkg/oadp/nonadmin.go
Outdated
| @@ -0,0 +1,134 @@ | |||
| package oadp | |||
There was a problem hiding this comment.
As far as I can tell, nothing in this file is referenced in the toolset - maybe we can remove?
pkg/oadp/vmrestore.go
Outdated
| @@ -0,0 +1,55 @@ | |||
| package oadp | |||
There was a problem hiding this comment.
As far as I can tell, nothing in this file is referenced in the toolset - maybe we can remove?
pkg/oadp/cloudstorage.go
Outdated
| @@ -0,0 +1,33 @@ | |||
| package oadp | |||
There was a problem hiding this comment.
As far as I can tell, nothing in this file is referenced in the toolset - maybe we can remove?
pkg/oadp/downloadrequest.go
Outdated
| @@ -0,0 +1,33 @@ | |||
| package oadp | |||
There was a problem hiding this comment.
As far as I can tell, nothing in this file is referenced in the toolset - maybe we can remove?
pkg/oadp/podvolume.go
Outdated
| @@ -0,0 +1,29 @@ | |||
| package oadp | |||
There was a problem hiding this comment.
As far as I can tell, nothing in this file is referenced in the toolset - maybe we can remove?
pkg/toolsets/oadp/toolset.go
Outdated
| } | ||
|
|
||
| // GetTools returns all tools provided by this toolset. | ||
| // The toolset provides 10 consolidated tools covering all OADP CRDs: |
There was a problem hiding this comment.
| // The toolset provides 10 consolidated tools covering all OADP CRDs: | |
| // The toolset provides 8 consolidated tools covering all OADP CRDs: |
pkg/oadp/backup.go
Outdated
| // GetBackupLogs retrieves backup logs | ||
| // Note: In a real implementation, this would create a DownloadRequest and fetch logs from object storage | ||
| // For now, we return the backup's status information as a simplified version |
There was a problem hiding this comment.
As far as I can tell (and from this comment) - this does not actually retrieve logs. IMO this would confuse the agent, as it is executing a "logs" action. So, we should do one of:
- remove the logs action
- rename the logs action to something more intuitive
- Actually retrieve the logs here
pkg/oadp/restore.go
Outdated
|
|
||
| // GetRestoreLogs retrieves restore logs | ||
| // Note: Similar to backup logs, this returns status information | ||
| func GetRestoreLogs(ctx context.Context, client dynamic.Interface, namespace, name string) (string, error) { |
There was a problem hiding this comment.
As far as I can tell (and from this comment) - this does not actually retrieve logs. IMO this would confuse the agent, as it is executing a "logs" action. So, we should do one of:
- remove the logs action
- rename the logs action to something more intuitive
- Actually retrieve the logs here
pkg/toolsets/oadp/backup.go
Outdated
| namespace := oadp.DefaultOADPNamespace | ||
| if v, ok := params.GetArguments()["namespace"].(string); ok && v != "" { | ||
| namespace = v | ||
| } |
There was a problem hiding this comment.
There are some helpers for retrieving string/bool values from the params: api.OptionalString(params, "namespace", oadp.DefaultOADPNamespace) - maybe use those here (and on other tools) ?
| // parseLabelSelector parses a label selector string like "app=myapp,env=prod" into a map | ||
| func parseLabelSelector(selector string) map[string]string { | ||
| result := make(map[string]string) | ||
| if selector == "" { | ||
| return result | ||
| } | ||
|
|
||
| pairs := splitIgnoreEmpty(selector, ',') | ||
| for _, pair := range pairs { | ||
| kv := splitIgnoreEmpty(pair, '=') | ||
| if len(kv) == 2 { | ||
| result[kv[0]] = kv[1] | ||
| } | ||
| } | ||
| return result | ||
| } | ||
|
|
There was a problem hiding this comment.
this seems to be dropping any non equality selector expressions - maybe we should do either:
- return errors when we see a non equality selector (so that agent is informed that some of the selector is not being considered)
- use full selector syntax support here
|
@shubham-pampattiwar the toolset looks much better now! I left a few comments around the code, but overall this is looking great |
e852e86 to
9be67bd
Compare
|
Addressed review comments: Removed unused files:
Code changes:
Rebased on main to resolve merge conflict with netedge toolset. All tests passing. |
This adds an OADP (OpenShift API for Data Protection) toolset that enables AI agents to manage backup and restore operations on OpenShift clusters through the MCP server. New tools (21 total): - Backup: list, get, create, delete, logs - Restore: list, get, create, delete, logs - Schedule: list, get, create, delete, pause - Storage Locations: BSL and VSL list/get - DataProtectionApplication: list, get Fixes: OADP-7194
Add documentation for the OADP (OpenShift API for Data Protection) toolset including all 21 tools for managing Velero backups, restores, schedules, storage locations, and DataProtectionApplications.
Complete OADP toolset implementation covering all CRDs shipped by OADP: Velero v1 CRDs: - BackupRepository: list, get, delete - DeleteBackupRequest: list, get - DownloadRequest: list, get, create, delete - PodVolumeBackup: list, get - PodVolumeRestore: list, get - ServerStatusRequest: list, get, create, delete Velero v2alpha1 CRDs: - DataUpload: list, get, cancel - DataDownload: list, get, cancel OADP CRDs: - CloudStorage: list, get, create, delete - DataProtectionTest: list, get, create, delete - DPA: create, update, delete (added to existing list, get) NAC CRDs: - NonAdminBackup: list, get, create, delete - NonAdminRestore: list, get, create, delete - NonAdminBackupStorageLocation: list, get, create, update, delete - NonAdminBackupStorageLocationRequest: list, get, approve - NonAdminDownloadRequest: list, get, create, delete VM CRDs: - VirtualMachineBackupsDiscovery: list, get, create, delete - VirtualMachineFileRestore: list, get, create, delete Existing CRD updates: - BSL: create, update, delete (added to existing list, get) - VSL: create, update, delete (added to existing list, get) - Schedule: update (added to existing list, get, create, delete, pause) Includes comprehensive unit tests for all CRUD operations.
- Update README.md OADP section with all 90 tools covering 23 CRDs - Add docs/OADP.md with detailed toolset documentation - Update docs/README.md to include OADP and Observability links
Per PR review feedback, refactor OADP tools to use action parameter pattern (like kubevirt) instead of individual tools per operation. Changes: - Consolidate 90 individual tools into 8 action-based tools: - oadp_backup (list, get, create, delete, logs) - oadp_restore (list, get, create, delete, logs) - oadp_schedule (list, get, create, delete, pause, unpause) - oadp_storage_location (list, get for BSL and VSL) - oadp_dpa (list, get) - oadp_repository (list, get) - oadp_data_mover (list, get for uploads/downloads) - oadp_data_protection_test (list, get, create, delete) - Add 8 eval tasks for OADP toolset testing - Update documentation to reflect new tool structure - Fix eval scripts to use explicit Velero API groups (backup.velero.io, restore.velero.io, schedule.velero.io) to avoid conflicts with OpenShift built-in resources All 8/8 evals pass with 24/24 assertions.
- Remove unused pkg/oadp/ files (nonadmin, vmrestore, cloudstorage, downloadrequest, podvolume, serverstatus, deletebackuprequest) - Rename 'logs' action to 'status' for backup/restore tools (returns status info, not actual logs) - Use api.OptionalString helper instead of manual param parsing - Fix parseLabelSelector to return error on non-equality selectors - Update comment: 8 tools instead of 10 - Update tests to expect 'status' action
9be67bd to
abb9103
Compare
|
@shubham-pampattiwar: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
@shubham-pampattiwar I'm curious - did you run the evals with just the core toolset enabled vs. this new toolset? Most of the tools seem to be CRUD tools, and on frontier models they can often succeed with the generic resource tools for many domains (but not all, which is why evals are so useful) |
| Annotations: api.ToolAnnotations{ | ||
| Title: "OADP: Backup", | ||
| ReadOnlyHint: ptr.To(false), | ||
| DestructiveHint: ptr.To(false), |
There was a problem hiding this comment.
Wouldn't this be true since there could be a delete action?
| Annotations: api.ToolAnnotations{ | ||
| Title: "OADP: Backup Repository", | ||
| ReadOnlyHint: ptr.To(false), | ||
| DestructiveHint: ptr.To(false), |
There was a problem hiding this comment.
Wouldn't this be true since there could be a delete action?
| Annotations: api.ToolAnnotations{ | ||
| Title: "OADP: Data Mover", | ||
| ReadOnlyHint: ptr.To(false), | ||
| DestructiveHint: ptr.To(false), |
There was a problem hiding this comment.
Wouldn't this be true since there could be a cancel action?
|
|
||
| | Tool | CRDs Covered | Actions | | ||
| |------|--------------|---------| | ||
| | `oadp_backup` | Backup | list, get, create, delete, logs | |
There was a problem hiding this comment.
I think this got switched to status (probably relevant in other places in the docs too)
| | `oadp_backup` | Backup | list, get, create, delete, logs | | |
| | `oadp_backup` | Backup | list, get, create, delete, status | |
| s.Run("returns empty list when no backup repositories exist", func() { | ||
| list, err := ListBackupRepositories(s.ctx, s.client, DefaultOADPNamespace, metav1.ListOptions{}) | ||
| s.NoError(err) | ||
| s.Empty(list.Items) | ||
| }) |
There was a problem hiding this comment.
I think this may break in the future, since the objects created within one s.Run are persisted within the test - so this would only work when run first
There was a problem hiding this comment.
Probably worth verifying the rest of the unit tests that are doing similar as well
| DefaultOADPNamespace = "openshift-adp" | ||
| ) | ||
|
|
||
| var ( |
There was a problem hiding this comment.
I don't think all of the GVRs defined here are currently being used, maybe we can clean up the unused ones?
| }, | ||
| "labelSelector": { | ||
| Type: "string", | ||
| Description: "Label selector to filter backups (for list action)", |
There was a problem hiding this comment.
Since we seem to only support equality based labels (given the parse function included here), should we add something to this description?
Summary
Adds OADP (OpenShift API for Data Protection) toolset to the kubernetes-mcp-server, enabling AI agents to create, monitor, and manage data protection workflows.
Tools (8 consolidated action-based tools)
Following the kubevirt pattern, each tool uses an
actionparameter to select the operation:oadp_backupoadp_restoreoadp_scheduleoadp_storage_locationoadp_dpaoadp_repositoryoadp_data_moveroadp_data_protection_testExample Usage
Test plan
go test -v ./pkg/toolsets/oadp/...)make build && make lint)Changes in this PR