MGMT-23234: [BE] Improve the UX around the amd/nvidia operators selection - add profile to selection #9993
Conversation
@shay23bra: This pull request references MGMT-23234 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.22.0" version, but no target version was set.
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in your CodeRabbit settings.
Walkthrough: Adds AMD/NVIDIA GPU vendor toggles across API, client, server, manager, mocks, URL builders, and OpenAPI: new AmdEnabled/NvidiaEnabled parameters (default true) flow from request binding through handler branching to new Manager GPU-filtering methods that filter bundles and their operator dependencies.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client
    participant Handler
    participant Manager
    participant BundleStorage
    Client->>Handler: GET /v2/operators/bundles?amd_enabled=false&nvidia_enabled=true
    Handler->>Manager: ListBundlesWithGPUFilter(filters, featureIDs, GPUFilter{AmdEnabled:false, NvidiaEnabled:true})
    Manager->>BundleStorage: Fetch bundles
    loop per bundle
        Manager->>Manager: Evaluate bundle operators
        Manager->>Manager: shouldSkipOperatorForGPUFilter(operator, GPUFilter)
        alt skip operator
            Manager->>Manager: Remove operator and dependents
        else
            Manager->>Manager: Keep operator
        end
        Manager->>Manager: Assemble filtered bundle
    end
    Manager-->>Handler: Return filtered bundles
    Handler-->>Client: 200 OK with filtered bundles
```
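Read together with the walkthrough, the core of the new filtering can be sketched as a standalone Go program. The type and function names mirror identifiers mentioned in this PR (`GPUFilter`, `shouldSkipOperatorForGPUFilter`), but the surrounding scaffolding is illustrative, not the actual `Manager` code:

```go
package main

import "fmt"

// GPUFilter mirrors the query flags; a nil *GPUFilter means "no filtering"
// (both vendors kept), matching the omitted-parameter case.
type GPUFilter struct {
	AmdEnabled    bool
	NvidiaEnabled bool
}

// Raw names as they appear in the review; the real code should prefer the
// nvidiagpu/amdgpu operator constants over string literals.
const (
	nvidiaOperator = "nvidia-gpu"
	amdOperator    = "amd-gpu"
)

// shouldSkipOperatorForGPUFilter reports whether a GPU operator must be
// dropped from a bundle for the given filter.
func shouldSkipOperatorForGPUFilter(operatorName string, filter *GPUFilter) bool {
	if filter == nil {
		return false // no filter supplied: keep everything
	}
	switch operatorName {
	case nvidiaOperator:
		return !filter.NvidiaEnabled
	case amdOperator:
		return !filter.AmdEnabled
	default:
		return false // non-GPU operators are never filtered
	}
}

// filterOperators applies the skip rule to a bundle's operator list.
func filterOperators(operators []string, filter *GPUFilter) []string {
	var kept []string
	for _, op := range operators {
		if shouldSkipOperatorForGPUFilter(op, filter) {
			continue
		}
		kept = append(kept, op)
	}
	return kept
}

func main() {
	bundle := []string{"openshift-ai", "nvidia-gpu", "amd-gpu", "serverless-operator"}
	fmt.Println(filterOperators(bundle, &GPUFilter{AmdEnabled: false, NvidiaEnabled: true}))
	// [openshift-ai nvidia-gpu serverless-operator]
}
```

A nil filter reproduces the legacy behavior, which is why the review suggests making the legacy `ListBundles`/`GetBundle` paths delegate to the filter-aware methods with nil.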
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed | ❌ 2 failed (2 warnings)
Actionable comments posted: 4
🧹 Nitpick comments (2)
swagger.yaml (1)
2785-2794: Consider extracting the GPU filters into shared Swagger parameters. These two blocks are now identical and need to stay in sync across both endpoints. Moving them to top-level reusable parameter definitions would reduce drift between the spec and the generated client/server code.
Also applies to: 2833-2842
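A rough sketch of what the extraction could look like. Note that this spec is Swagger 2.0, so shared definitions would live under top-level `parameters` with `$ref: '#/parameters/...'` rather than the OpenAPI 3 `components.parameters` path the generated prompt mentions; the definition keys below are illustrative:

```yaml
parameters:
  nvidia-enabled:
    name: nvidia_enabled
    in: query
    type: boolean
    default: true
    description: Enable NVIDIA GPU support for the OpenShift AI bundle.
  amd-enabled:
    name: amd_enabled
    in: query
    type: boolean
    default: true
    description: Enable AMD GPU support for the OpenShift AI bundle.

# Then, in each of the two endpoints:
#   parameters:
#     - $ref: '#/parameters/nvidia-enabled'
#     - $ref: '#/parameters/amd-enabled'
```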
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@swagger.yaml` around lines 2785 - 2794, Extract the duplicated query parameter blocks for nvidia_enabled and amd_enabled into reusable top-level Swagger parameter definitions (e.g., components.parameters.NvidiaEnabled and components.parameters.AmdEnabled) and replace each inline occurrence (the two identical blocks at/around the current query params) with $ref references to those component parameters; ensure the parameter names, descriptions, types (boolean) and defaults are preserved in the shared definitions and update both endpoints (the blocks at lines around 2785–2794 and 2833–2842) to reference the new shared parameters so they stay in sync.
internal/operators/manager.go (1)
743-790: Unify bundle construction behind the filter-aware path. These methods now duplicate the existing `ListBundles`/`GetBundle` loops. Please make the legacy methods delegate here with a `nil` filter so omitted params and explicit `amd_enabled=true&nvidia_enabled=true` cannot drift over time.
♻️ Proposed refactor
```diff
 func (mgr *Manager) ListBundles(filters *featuresupport.SupportLevelFilters, featureIDs []models.FeatureSupportLevelID) []*models.Bundle {
-	var ret []*models.Bundle
-
-	for _, basicBundleDetails := range operatorscommon.Bundles {
-		completeBundleDetails, err := mgr.GetBundle(basicBundleDetails.ID, featureIDs)
-		if err != nil {
-			mgr.log.Error(err)
-			continue
-		}
-
-		if mgr.isBundleSupported(completeBundleDetails, filters, featureIDs) {
-			ret = append(ret, completeBundleDetails)
-		}
-	}
-
-	return ret
+	return mgr.ListBundlesWithGPUFilter(filters, featureIDs, nil)
 }

 func (mgr *Manager) GetBundle(bundleID string, featureIDs []models.FeatureSupportLevelID) (*models.Bundle, error) {
-	bundle, ok := mgr.lookupBundle(bundleID)
-	if !ok {
-		return nil, fmt.Errorf("bundle '%s' is not supported", bundleID)
-	}
-
-	for _, operator := range mgr.olmOperators {
-		operatorBundles := operator.GetBundleLabels(featureIDs)
-		for _, operatorBundle := range operatorBundles {
-			if operatorBundle == bundleID {
-				bundle.Operators = append(bundle.Operators, operator.GetName())
-				break
-			}
-		}
-	}
-
-	return bundle, nil
+	return mgr.GetBundleWithGPUFilter(bundleID, featureIDs, nil)
 }
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/operators/manager.go` around lines 743 - 790, Refactor the legacy ListBundles and GetBundle implementations to delegate to the GPU-aware paths by calling ListBundlesWithGPUFilter(..., nil) and GetBundleWithGPUFilter(..., nil) respectively so that bundle construction is unified and GPU filtering defaults (nil -> no filter / both vendors) cannot drift; update references to operatorscommon.Bundles usage only in ListBundlesWithGPUFilter/GetBundleWithGPUFilter, remove duplicated operator iteration logic from the legacy methods, and ensure function signatures remain compatible (use the existing Manager.ListBundles, Manager.GetBundle names to call into Manager.ListBundlesWithGPUFilter and Manager.GetBundleWithGPUFilter with a nil GPUFilter).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: f7d19782-3e24-4fec-9192-4f63267b8190
📒 Files selected for processing (11)
- client/operators/v2_get_bundle_parameters.go
- client/operators/v2_list_bundles_parameters.go
- internal/operators/handler/handler_v2.go
- internal/operators/manager.go
- internal/operators/mock_operators_api.go
- restapi/embedded_spec.go
- restapi/operations/operators/v2_get_bundle_parameters.go
- restapi/operations/operators/v2_get_bundle_urlbuilder.go
- restapi/operations/operators/v2_list_bundles_parameters.go
- restapi/operations/operators/v2_list_bundles_urlbuilder.go
- swagger.yaml
```go
func (mgr *Manager) GetBundleWithGPUFilter(bundleID string, featureIDs []models.FeatureSupportLevelID, gpuFilter *GPUFilter) (*models.Bundle, error) {
	bundle, ok := mgr.lookupBundle(bundleID)
	if !ok {
		return nil, fmt.Errorf("bundle '%s' is not supported", bundleID)
	}

	// Get all operators for the bundle based on feature IDs
	for _, operator := range mgr.olmOperators {
		operatorBundles := operator.GetBundleLabels(featureIDs)
		for _, operatorBundle := range operatorBundles {
			if operatorBundle == bundleID {
				operatorName := operator.GetName()

				// Apply GPU filtering
				if mgr.shouldSkipOperatorForGPUFilter(operatorName, gpuFilter) {
					continue
				}

				bundle.Operators = append(bundle.Operators, operatorName)
				break
			}
		}
	}

	return bundle, nil
}
```
🧩 Analysis chain
🏁 Script executed:
```shell
#!/bin/bash
set -euo pipefail
echo "== ResolveDependencies special-case =="
rg -n -C4 'operatorscommon\.HasOperator\(ret, openshiftai\.Operator\.Name\)|nvidiagpu\.Operator\.Name|amdgpu\.Operator\.Name' internal/operators/manager.go
echo
echo "== OpenShift AI dependency implementation =="
fd . internal/operators/openshiftai -e go -t f | while read -r f; do
  echo "--- $f ---"
  rg -n -C4 '^\s*func .*GetDependencies\s*\(|^\s*func .*GetBundleLabels\s*\(|nvidia-gpu|amd-gpu|nvidiagpu\.Operator\.Name|amdgpu\.Operator\.Name' "$f"
done
echo
echo "== GPU filter usage and dependency expansion call sites =="
rg -n -C3 --type=go '\bGPUFilter\b|\bNvidiaEnabled\b|\bAmdEnabled\b|\bResolveDependencies\s*\(' internal
```
Length of output: 630

🏁 Script executed:
```shell
# Find ResolveDependencies signature and call sites
rg -n "func.*ResolveDependencies\|ResolveDependencies\(" internal/operators/manager.go -A 2 -B 1 | head -50
```
Length of output: 52

🏁 Script executed:
```shell
# Check openshiftai GetDependencies implementation
cat internal/operators/openshiftai/operator.go | grep -n "GetDependencies\|GetBundleLabels\|func" | head -30
```
Length of output: 144

🏁 Script executed:
```shell
# Look for GPU filter usage in the codebase
rg -n "GetBundleWithGPUFilter" internal --type=go -B 2 -A 2
```
Length of output: 3737

🏁 Script executed:
```shell
# List openshiftai directory
ls -la internal/operators/openshiftai/
```
Length of output: 926

🏁 Script executed:
```shell
# Find ResolveDependencies in manager.go
grep -n "ResolveDependencies" internal/operators/manager.go
```
Length of output: 448

🏁 Script executed:
```shell
# Check handler_v2.go to see the full context of GetBundleWithGPUFilter usage
cat -n internal/operators/handler/handler_v2.go | sed -n '160,180p'
```
Length of output: 801

🏁 Script executed:
```shell
# Search for where GetBundleWithGPUFilter is actually used and what happens with the result
rg -n "GetBundleWithGPUFilter" . --type=go -A 5
```
Length of output: 4092

🏁 Script executed:
```shell
# Get ResolveDependencies implementation
cat -n internal/operators/manager.go | sed -n '493,570p'
```
Length of output: 3016

🏁 Script executed:
```shell
# Check openshift_ai_operator.go for GetDependencies implementation
cat internal/operators/openshiftai/openshift_ai_operator.go | grep -n "GetDependencies\|GetBundleLabels\|NvidiaEnabled\|AmdEnabled" -A 10 -B 2
```
Length of output: 1733

🏁 Script executed:
```shell
# Check if GPU filter is stored/threaded in cluster or elsewhere
rg -n "NvidiaEnabled\|AmdEnabled\|GPUFilter" internal/operators --type=go | grep -E "Cluster|cluster" | head -20
```
Length of output: 52

🏁 Script executed:
```shell
# Find where ResolveDependencies is called
rg -n "\.ResolveDependencies\(" . --type=go -B 3 -A 3
```
Length of output: 15112

🏁 Script executed:
```shell
# Check if GPU filter is stored in Cluster model
rg -n "NvidiaEnabled\|AmdEnabled" internal/common --type=go -B 2 -A 2
```
Length of output: 52

🏁 Script executed:
```shell
# Verify openshift_ai GetDependencies doesn't check any GPU-related fields on cluster
cat internal/operators/openshiftai/openshift_ai_operator.go | sed -n '78,111p'
```
Length of output: 821
GPU filter selection is not persisted and will be overridden by ResolveDependencies.
GetBundleWithGPUFilter correctly filters the bundle response, but the GPU filter is never passed to ResolveDependencies, which is called later in the provisioning flow (refresh_status_preprocessor.go:155, inventory.go:3396).
When ResolveDependencies calls openshift_ai.GetDependencies(), the OpenShift AI operator has no knowledge of the user's GPU filter selection. Instead, it returns GPU operators based solely on whether the cluster physically has those GPUs (ClusterHasGPU()). If a cluster has both NVIDIA and AMD GPUs, both will be added as dependencies regardless of which vendor the user selected via the GPU filter.
The workaround at lines 533–539 (marking nvidia-gpu/amd-gpu as DependencyOnly) does not prevent them from being returned. The GPU filter selection must be persisted in the Cluster model and threaded through to ResolveDependencies and openshift_ai.GetDependencies for the filtering to survive dependency expansion.
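The threading the reviewer asks for could be sketched as follows. This is a simplified, hypothetical model: the real `GetDependencies` receives a cluster object and uses `ClusterHasGPU`, and the step of persisting the filter on the Cluster model is not shown here:

```go
package main

import "fmt"

// GPUFilter mirrors the API flags; nil means no user selection was recorded.
type GPUFilter struct {
	AmdEnabled    bool
	NvidiaEnabled bool
}

// clusterHasGPU stands in for the operator's hardware check (hypothetical
// inventory map keyed by vendor instead of a real cluster object).
func clusterHasGPU(inventory map[string]bool, vendor string) bool {
	return inventory[vendor]
}

// gpuDependencies shows how openshift_ai.GetDependencies could combine the
// physical-GPU check with the persisted filter, instead of hardware alone.
func gpuDependencies(inventory map[string]bool, filter *GPUFilter) []string {
	var deps []string
	nvidiaAllowed := filter == nil || filter.NvidiaEnabled
	amdAllowed := filter == nil || filter.AmdEnabled
	if nvidiaAllowed && clusterHasGPU(inventory, "nvidia") {
		deps = append(deps, "nvidia-gpu")
	}
	if amdAllowed && clusterHasGPU(inventory, "amd") {
		deps = append(deps, "amd-gpu")
	}
	return deps
}

func main() {
	// Cluster physically has both vendors, but the user disabled AMD:
	// dependency expansion must not reintroduce amd-gpu.
	inventory := map[string]bool{"nvidia": true, "amd": true}
	fmt.Println(gpuDependencies(inventory, &GPUFilter{NvidiaEnabled: true, AmdEnabled: false}))
	// [nvidia-gpu]
}
```

With this shape, a later refresh that re-runs dependency resolution keeps honoring the selection, provided the filter itself survives on the cluster record.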
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/operators/manager.go` around lines 765 - 789, GetBundleWithGPUFilter
currently filters operators transiently but never persists the GPUFilter, so
ResolveDependencies and openshift_ai.GetDependencies lose the user's selection;
persist the GPU filter on the Cluster model (e.g., add a
Cluster.GPUFilter/SelectedGPUVendor field) when the user selects it, then update
Manager.ResolveDependencies (and any callers, e.g., refresh_status_preprocessor
and inventory flows) to accept and forward the GPUFilter; modify
openshift_ai.GetDependencies signature to accept the GPUFilter and have its
logic (ClusterHasGPU checks) respect the passed filter rather than only physical
GPU presence; locate helper shouldSkipOperatorForGPUFilter and lookupBundle to
ensure consistent operator filtering semantics across GetBundleWithGPUFilter and
dependency expansion.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: shay23bra. The full list of commands accepted by this bot can be found here. Needs approval from an approver in each of these files. Approvers can indicate their approval by writing `/approve` in a comment.
Force-pushed from d46660d to 55abe77 (Compare)
♻️ Duplicate comments (4)
restapi/operations/operators/v2_get_bundle_parameters.go (1)
106-128: ⚠️ Potential issue | 🟡 Minor — Reject explicit empty GPU flags here too.
Line 116 and Line 180 still return early on `raw == ""`, so `?amd_enabled=`/`?nvidia_enabled=` behave like omitted params and keep the default `true`. That hides client bugs and does not match `AllowEmptyValue: false`. Because this file is generated, the durable fix belongs in the swagger source/template too.
Suggested fix
```diff
 func (o *V2GetBundleParams) bindAmdEnabled(rawData []string, hasKey bool, formats strfmt.Registry) error {
 	var raw string
 	if len(rawData) > 0 {
 		raw = rawData[len(rawData)-1]
 	}

 	// Required: false
 	// AllowEmptyValue: false
-	if raw == "" { // empty values pass all other validations
-		// Default values have been previously initialized by NewV2GetBundleParams()
+	if !hasKey {
+		// Default values have been previously initialized by NewV2GetBundleParams()
 		return nil
 	}
+	if raw == "" {
+		return errors.InvalidType("amd_enabled", "query", "bool", raw)
+	}

 	value, err := swag.ConvertBool(raw)
 	if err != nil {
 		return errors.InvalidType("amd_enabled", "query", "bool", raw)
 	}
 	o.AmdEnabled = &value

 	return nil
 }

 func (o *V2GetBundleParams) bindNvidiaEnabled(rawData []string, hasKey bool, formats strfmt.Registry) error {
 	var raw string
 	if len(rawData) > 0 {
 		raw = rawData[len(rawData)-1]
 	}

 	// Required: false
 	// AllowEmptyValue: false
-	if raw == "" { // empty values pass all other validations
-		// Default values have been previously initialized by NewV2GetBundleParams()
+	if !hasKey {
+		// Default values have been previously initialized by NewV2GetBundleParams()
 		return nil
 	}
+	if raw == "" {
+		return errors.InvalidType("nvidia_enabled", "query", "bool", raw)
+	}

 	value, err := swag.ConvertBool(raw)
 	if err != nil {
 		return errors.InvalidType("nvidia_enabled", "query", "bool", raw)
 	}
 	o.NvidiaEnabled = &value

 	return nil
 }
```
Also applies to: 170-192
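The behavioral distinction behind this fix, reduced to a standalone sketch (the generated binders use go-openapi's `errors.InvalidType` and `swag.ConvertBool`; stdlib stand-ins are used here):

```go
package main

import (
	"fmt"
	"strconv"
)

// bindBoolFlag models the generated binder: hasKey says whether the query key
// was present at all, raw is its (last) value. An omitted key keeps the
// default; an explicitly empty value is rejected, matching AllowEmptyValue: false.
func bindBoolFlag(name, raw string, hasKey bool, defaultValue bool) (bool, error) {
	if !hasKey {
		return defaultValue, nil // parameter omitted: default applies
	}
	if raw == "" {
		return false, fmt.Errorf("%s in query must be of type bool: %q", name, raw)
	}
	value, err := strconv.ParseBool(raw)
	if err != nil {
		return false, fmt.Errorf("%s in query must be of type bool: %q", name, raw)
	}
	return value, nil
}

func main() {
	v, _ := bindBoolFlag("amd_enabled", "", false, true)
	fmt.Println(v) // true: omitted key keeps the default
	_, err := bindBoolFlag("amd_enabled", "", true, true)
	fmt.Println(err != nil) // true: explicit ?amd_enabled= is rejected
}
```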
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@restapi/operations/operators/v2_get_bundle_parameters.go` around lines 106 - 128, The bind functions currently treat an explicit empty query value (raw == "") as if the parameter was omitted, preserving defaults; update bindAmdEnabled and the corresponding bindNvidiaEnabled to reject explicit empty values because AllowEmptyValue is false: instead of returning nil when raw == "", return a validation error (e.g., errors.InvalidType("amd_enabled", "query", "bool", raw) and errors.InvalidType("nvidia_enabled", "query", "bool", raw) or an appropriate errors.Required/InvalidType) so that requests like ?amd_enabled= or ?nvidia_enabled= fail validation rather than silently keeping defaults.
restapi/operations/operators/v2_list_bundles_parameters.go (1)
136-158: ⚠️ Potential issue | 🟡 Minor — Reject explicit empty GPU flags instead of falling back to the default.
Line 146 and Line 247 still treat `?amd_enabled=`/`?nvidia_enabled=` as omission, so malformed requests silently keep the default `true`. That contradicts `AllowEmptyValue: false` and can return a broader result set than the caller asked for. Since this file is generated, please fix the swagger source/template as well.
Suggested fix
```diff
 func (o *V2ListBundlesParams) bindAmdEnabled(rawData []string, hasKey bool, formats strfmt.Registry) error {
 	var raw string
 	if len(rawData) > 0 {
 		raw = rawData[len(rawData)-1]
 	}
-	if raw == "" { // empty values pass all other validations
-		// Default values have been previously initialized by NewV2ListBundlesParams()
+	if !hasKey {
+		// Default values have been previously initialized by NewV2ListBundlesParams()
 		return nil
 	}
+	if raw == "" {
+		return errors.InvalidType("amd_enabled", "query", "bool", raw)
+	}

 	value, err := swag.ConvertBool(raw)
 	if err != nil {
 		return errors.InvalidType("amd_enabled", "query", "bool", raw)
 	}
 	o.AmdEnabled = &value

 	return nil
 }

 func (o *V2ListBundlesParams) bindNvidiaEnabled(rawData []string, hasKey bool, formats strfmt.Registry) error {
 	var raw string
 	if len(rawData) > 0 {
 		raw = rawData[len(rawData)-1]
 	}
-	if raw == "" { // empty values pass all other validations
-		// Default values have been previously initialized by NewV2ListBundlesParams()
+	if !hasKey {
+		// Default values have been previously initialized by NewV2ListBundlesParams()
 		return nil
 	}
+	if raw == "" {
+		return errors.InvalidType("nvidia_enabled", "query", "bool", raw)
+	}

 	value, err := swag.ConvertBool(raw)
 	if err != nil {
 		return errors.InvalidType("nvidia_enabled", "query", "bool", raw)
 	}
 	o.NvidiaEnabled = &value

 	return nil
 }
```
Also applies to: 237-259
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@restapi/operations/operators/v2_list_bundles_parameters.go` around lines 136 - 158, The bindAmdEnabled (and similarly bindNvidiaEnabled) currently treats an explicit empty query value (e.g. ?amd_enabled=) as omission and returns nil; instead, when hasKey is true and raw == "" you must reject the request because AllowEmptyValue is false. Modify bindAmdEnabled to check if hasKey && raw == "" and return an appropriate validation error (use the same error pattern as other validation failures, e.g. errors.InvalidType/InvalidFormat for "amd_enabled" in "query") so the empty value is not silently treated as the default; apply the same change to bindNvidiaEnabled.
restapi/embedded_spec.go (1)
5844-5855: ⚠️ Potential issue | 🟡 Minor — Document the combined behavior of `nvidia_enabled` and `amd_enabled`.
These descriptions still define each flag separately, but the API contract is still ambiguous for `nvidia_enabled=false` + `amd_enabled=false` and for requests that target non-`openshift-ai` bundles. Please clarify those cases in all four blocks so clients do not have to infer the behavior.
✍️ Suggested wording
```diff
-      "description": "Enable NVIDIA GPU support for OpenShift AI bundle. Only applies to openshift-ai bundle.",
+      "description": "Whether to include NVIDIA-specific content in the OpenShift AI bundle. Applies only to the openshift-ai bundle; for non-openshift-ai bundles this flag is ignored. Can be combined independently with amd_enabled. When both nvidia_enabled and amd_enabled are false, the bundle is returned without GPU-specific content.",
 ...
-      "description": "Enable AMD GPU support for OpenShift AI bundle. Only applies to openshift-ai bundle.",
+      "description": "Whether to include AMD-specific content in the OpenShift AI bundle. Applies only to the openshift-ai bundle; for non-openshift-ai bundles this flag is ignored. Can be combined independently with nvidia_enabled. When both nvidia_enabled and amd_enabled are false, the bundle is returned without GPU-specific content.",
```
Also applies to: 5913-5924, 17340-17351, 17409-17420
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@restapi/embedded_spec.go` around lines 5844 - 5855, The parameter docs for nvidia_enabled and amd_enabled are ambiguous about combined and non-openshift-ai behavior; update the descriptions for the nvidia_enabled and amd_enabled query parameters (the parameter entries named "nvidia_enabled" and "amd_enabled") in all four blocks so they explicitly state: (1) what happens when both flags are false (no GPU support will be enabled/requested), (2) that these flags only apply to requests targeting the openshift-ai bundle and will be ignored for other bundles (and should note whether ignored means treated as false or no-op), and (3) the default behavior when omitted; make the same exact clarification in each occurrence referenced (the blocks around the existing "nvidia_enabled" and "amd_enabled" entries at the four locations) so clients can rely on a consistent API contract.
internal/operators/manager.go (1)
765-789: ⚠️ Potential issue | 🟠 Major — GPU filtering stops at bundle discovery.
This only removes GPU operators from the returned bundle. `ResolveDependencies()`/`getDependencies()` still expand OpenShift AI dependencies from `GetDependencies(cluster)` with no GPU-filter input, so a later save/refresh can reintroduce the vendor the user explicitly disabled. The selection needs to be persisted or threaded into dependency resolution too.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/operators/manager.go` around lines 765 - 789, GetBundleWithGPUFilter only filters operators in the returned Bundle but doesn't prevent dependency resolution from reintroducing GPU operators; to fix, thread the GPUFilter into the dependency-resolution path: add a gpuFilter parameter to ResolveDependencies and its internal getDependencies/GetDependencies(cluster) call chain (and update callers), then apply mgr.shouldSkipOperatorForGPUFilter(operatorName, gpuFilter) when expanding/adding dependencies so GPU-disabled operators are never added during dependency resolution; update function signatures (ResolveDependencies, getDependencies, GetDependencies) and any call sites accordingly to preserve the user's GPU selection.
🧹 Nitpick comments (2)
internal/operators/manager.go (1)
792-815: Use the operator constants instead of raw names.
Lines 801-808 hard-code `nvidia-gpu` and `amd-gpu` even though this file already imports `nvidiagpu` and `amdgpu`. Reusing `nvidiagpu.Operator.Name`/`amdgpu.Operator.Name` keeps this filter aligned with the rest of the operator plumbing if those identifiers ever change.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/operators/manager.go` around lines 792 - 815, The function shouldSkipOperatorForGPUFilter currently compares operatorName to hard-coded strings "nvidia-gpu" and "amd-gpu"; replace those literals with the operator name constants by using nvidiagpu.Operator.Name and amdgpu.Operator.Name respectively in the switch/case comparisons inside shouldSkipOperatorForGPUFilter so the filter uses the canonical operator identifiers and stays in sync with the imported operator definitions.
internal/operators/handler/handler_v2.go (1)
127-142: Extract the GPU-filter builder/dispatch branch.
Both handlers duplicate the same nil-check and `operators.GPUFilter` construction. A shared helper would keep the default-handling semantics in one place and reduce drift between list/get behavior.
Also applies to: 157-173
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/operators/handler/handler_v2.go` around lines 127 - 142, Duplicate GPU filter construction and nil-check should be extracted into a single helper to avoid drift: create a function (e.g., buildGPUFilter or getGPUFilter) that accepts the params (accessing params.NvidiaEnabled and params.AmdEnabled) and returns *operators.GPUFilter or nil; replace the inline construction and nil-check in both locations that call h.operatorsAPI.ListBundlesWithGPUFilter and h.operatorsAPI.ListBundles so callers simply call the helper and then choose between ListBundlesWithGPUFilter(filter) and ListBundles(filter) based on whether the helper returned nil.
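The helper this nitpick suggests could look roughly like the following, with simplified stand-ins for the generated params structs (the real handlers would read `params.NvidiaEnabled`/`params.AmdEnabled` from `V2ListBundlesParams`/`V2GetBundleParams`):

```go
package main

import "fmt"

// GPUFilter mirrors operators.GPUFilter from the PR.
type GPUFilter struct {
	AmdEnabled    bool
	NvidiaEnabled bool
}

// buildGPUFilter centralizes the nil-check both handlers currently duplicate:
// when neither flag was sent, it returns nil so callers take the legacy
// (unfiltered) path; otherwise unset flags fall back to the default of true.
func buildGPUFilter(nvidiaEnabled, amdEnabled *bool) *GPUFilter {
	if nvidiaEnabled == nil && amdEnabled == nil {
		return nil
	}
	filter := &GPUFilter{NvidiaEnabled: true, AmdEnabled: true} // defaults
	if nvidiaEnabled != nil {
		filter.NvidiaEnabled = *nvidiaEnabled
	}
	if amdEnabled != nil {
		filter.AmdEnabled = *amdEnabled
	}
	return filter
}

func main() {
	off := false
	fmt.Println(buildGPUFilter(nil, nil) == nil) // true: no flags, legacy path
	fmt.Println(*buildGPUFilter(nil, &off))      // {false true}: AMD off, NVIDIA default
}
```

Each handler then reduces to one line: call the helper, and dispatch to `ListBundlesWithGPUFilter`/`GetBundleWithGPUFilter` when the result is non-nil, or the legacy method otherwise.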
identifiers and stays in sync with the imported operator definitions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 3cf7327c-fab8-4ad0-9715-1654ad98986d
⛔ Files ignored due to path filters (2)
- vendor/github.com/openshift/assisted-service/client/operators/v2_get_bundle_parameters.go is excluded by !**/vendor/**, !vendor/**
- vendor/github.com/openshift/assisted-service/client/operators/v2_list_bundles_parameters.go is excluded by !**/vendor/**, !vendor/**
📒 Files selected for processing (11)
- client/operators/v2_get_bundle_parameters.go
- client/operators/v2_list_bundles_parameters.go
- internal/operators/handler/handler_v2.go
- internal/operators/manager.go
- internal/operators/mock_operators_api.go
- restapi/embedded_spec.go
- restapi/operations/operators/v2_get_bundle_parameters.go
- restapi/operations/operators/v2_get_bundle_urlbuilder.go
- restapi/operations/operators/v2_list_bundles_parameters.go
- restapi/operations/operators/v2_list_bundles_urlbuilder.go
- swagger.yaml
🚧 Files skipped from review as they are similar to previous changes (2)
- restapi/operations/operators/v2_list_bundles_urlbuilder.go
- swagger.yaml
Codecov Report

❌ Patch coverage is

Additional details and impacted files

@@ Coverage Diff @@
## master #9993 +/- ##
==========================================
- Coverage 44.08% 43.98% -0.11%
==========================================
Files 414 415 +1
Lines 72248 72438 +190
==========================================
+ Hits 31852 31862 +10
- Misses 37517 37689 +172
- Partials 2879 2887 +8
Actionable comments posted: 1
🧹 Nitpick comments (2)
internal/operators/manager.go (2)
834-844: Use constants for GPU operator names.

Same as above: replace hardcoded strings with constants.
♻️ Proposed fix
  switch operatorName {
- case "nvidia-gpu":
+ case nvidiagpu.Operator.Name:
  	// Skip nvidia-gpu if nvidia is explicitly disabled
  	if gpuFilter.NvidiaEnabled != nil && !*gpuFilter.NvidiaEnabled {
  		return true
  	}
- case "amd-gpu":
+ case amdgpu.Operator.Name:
  	// Skip amd-gpu if amd is explicitly disabled
  	if gpuFilter.AmdEnabled != nil && !*gpuFilter.AmdEnabled {
  		return true
  	}
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/operators/manager.go` around lines 834 - 844, Replace the hardcoded operator name strings in the switch (the cases matching "nvidia-gpu" and "amd-gpu") with package-level string constants (e.g., OperatorNvidiaGPU, OperatorAmdGPU) and use those constants in the switch on operatorName; update any other references to these literal names (such as the gpuFilter checks referencing gpuFilter.NvidiaEnabled and gpuFilter.AmdEnabled in the same file) to use the constants so the operator names are centralized and avoid duplication.
784-785: Use constants for GPU operator names.

Replace hardcoded strings with the existing constants for consistency and maintainability.
♻️ Proposed fix
- 	if operatorName == "nvidia-gpu" || operatorName == "amd-gpu" {
+ 	if operatorName == nvidiagpu.Operator.Name || operatorName == amdgpu.Operator.Name {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/operators/manager.go` around lines 784 - 785, Replace the hardcoded GPU operator name strings in the comparison with the project constants: instead of comparing operatorName to "nvidia-gpu" and "amd-gpu", use the existing GPU name constants (e.g., NvidiaGPUOperator and AMDGPUOperator) so filteredGPUOperators is appended using the constants; update the comparisons around operatorName and the append to use those constant symbols to ensure consistency with other code.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/operators/manager.go`:
- Around line 796-820: The current removal collects deps from
filteredGPUOperators via mgr.olmOperators[...].GetDependencies and
unconditionally deletes any matching name from includedOperators; instead, for
each candidate dependency (from dependenciesToRemove) check all remaining
included operators (iterate includedOperators) and call GetDependencies on their
olmOperators entries to see if any still depend on that candidate; only remove
the dependency if no remaining included operator lists it. Update the logic
around filteredGPUOperators/gpuFilter and the dependenciesToRemove pass to
perform this per-dependency existence check before filtering includedOperators.
---
Nitpick comments:
In `@internal/operators/manager.go`:
- Around line 834-844: Replace the hardcoded operator name strings in the switch
(the cases matching "nvidia-gpu" and "amd-gpu") with package-level string
constants (e.g., OperatorNvidiaGPU, OperatorAmdGPU) and use those constants in
the switch on operatorName; update any other references to these literal names
(such as the gpuFilter checks referencing gpuFilter.NvidiaEnabled and
gpuFilter.AmdEnabled in the same file) to use the constants so the operator
names are centralized and avoid duplication.
- Around line 784-785: Replace the hardcoded GPU operator name strings in the
comparison with the project constants: instead of comparing operatorName to
"nvidia-gpu" and "amd-gpu", use the existing GPU name constants (e.g.,
NvidiaGPUOperator and AMDGPUOperator) so filteredGPUOperators is appended using
the constants; update the comparisons around operatorName and the append to use
those constant symbols to ensure consistency with other code.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: cacee46a-fd8c-4725-b8e5-ec9eda4782ef
📒 Files selected for processing (1)
internal/operators/manager.go
Actionable comments posted: 1
♻️ Duplicate comments (1)
internal/operators/manager.go (1)
595-617: ⚠️ Potential issue | 🟠 Major

Keep shared dependencies if a retained operator still requires them.

This pruning step removes every dependency of the filtered GPU operator, even when another operator left in filtered still depends on it. That can drop shared deps like nodefeaturediscovery/kmm from the resolved install set.

Suggested fix
  if len(filteredGPUOperators) > 0 {
  	dependenciesToRemove := make(map[string]bool)
+ 	dependenciesToKeep := make(map[string]bool)
  	for _, filteredGPU := range filteredGPUOperators {
  		if gpuOperator, exists := mgr.olmOperators[filteredGPU]; exists {
  			deps, err := gpuOperator.GetDependencies(&common.Cluster{})
  			if err == nil {
  				for _, dep := range deps {
  					dependenciesToRemove[dep] = true
  					mgr.log.Infof("Filtering out dependency %s of GPU operator %s", dep, filteredGPU)
  				}
  			}
  		}
  	}
+ 	for _, operator := range filtered {
+ 		if olmOperator, exists := mgr.olmOperators[operator.Name]; exists {
+ 			deps, err := olmOperator.GetDependencies(&common.Cluster{})
+ 			if err == nil {
+ 				for _, dep := range deps {
+ 					dependenciesToKeep[dep] = true
+ 				}
+ 			}
+ 		}
+ 	}
+ 	// Apply dependency filtering
  	finalFiltered := make([]*models.MonitoredOperator, 0, len(filtered))
  	for _, operator := range filtered {
- 		if !dependenciesToRemove[operator.Name] {
+ 		if !dependenciesToRemove[operator.Name] || dependenciesToKeep[operator.Name] {
  			finalFiltered = append(finalFiltered, operator)
  		}
  	}
  	filtered = finalFiltered
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/operators/manager.go` around lines 595 - 617, The current logic builds dependenciesToRemove from filteredGPUOperators and removes them unconditionally, which drops shared deps still needed by retained operators; change it to only remove a dependency when no retained operator in filtered depends on it by: after computing dependenciesToRemove from filteredGPUOperators, compute a requiredDeps map by iterating the remaining operators in filtered (models.MonitoredOperator) and, for each, look up mgr.olmOperators[operator.Name] and call GetDependencies to mark deps that must be kept; then filter out only those dependencies that are not present in requiredDeps before rebuilding filtered (finalFiltered). Ensure you handle GetDependencies errors the same way as existing code and reference filteredGPUOperators, mgr.olmOperators, GetDependencies, dependenciesToRemove, filtered, and models.MonitoredOperator when locating places to change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/operators/manager.go`:
- Around line 542-578: The parsed GPU filter currently uses plain bools so any
missing gpu_filter keys default to false; change
GPUFilterProperties.GPUFilter.NvidiaEnabled and AmdEnabled to *bool so omissions
are distinguishable, then in applyGPUFilterFromProperties only set the resulting
GPUFilter's NvidiaEnabled/AmdEnabled pointers when the parsed *bool is non-nil
(i.e., assign &props.GPUFilter.NvidiaEnabled value only if
props.GPUFilter.NvidiaEnabled != nil), leaving gpuFilter nil or its fields nil
when omitted so downstream ResolveDependencies/ensureAllGPUOperators preserve
default-enabled behavior; update types and the conversion logic in
applyGPUFilterFromProperties accordingly (refer to GPUFilterProperties and
applyGPUFilterFromProperties).
---
Duplicate comments:
In `@internal/operators/manager.go`:
- Around line 595-617: The current logic builds dependenciesToRemove from
filteredGPUOperators and removes them unconditionally, which drops shared deps
still needed by retained operators; change it to only remove a dependency when
no retained operator in filtered depends on it by: after computing
dependenciesToRemove from filteredGPUOperators, compute a requiredDeps map by
iterating the remaining operators in filtered (models.MonitoredOperator) and,
for each, look up mgr.olmOperators[operator.Name] and call GetDependencies to
mark deps that must be kept; then filter out only those dependencies that are
not present in requiredDeps before rebuilding filtered (finalFiltered). Ensure
you handle GetDependencies errors the same way as existing code and reference
filteredGPUOperators, mgr.olmOperators, GetDependencies, dependenciesToRemove,
filtered, and models.MonitoredOperator when locating places to change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: db36f77d-82f2-489f-b877-d6db294f3ce6
📒 Files selected for processing (2)
docs/user-guide/additional-operator-notes.mdinternal/operators/manager.go
|
Actionable comments posted: 1
♻️ Duplicate comments (2)
internal/operators/manager.go (2)
914-938: ⚠️ Potential issue | 🔴 Critical

Same shared dependency removal bug exists here.

This dependency removal logic has the same issue flagged above in applyGPUFilterFromProperties (lines 598-621). Dependencies are removed without checking if other included operators still need them. Apply the same fix pattern: collect dependencies of remaining included operators and only remove a dependency if no remaining operator depends on it.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/operators/manager.go` around lines 914 - 938, The code collects dependencies of filtered GPU operators into dependenciesToRemove and unconditionally removes them from includedOperators, which breaks shared dependencies; update the logic in the block that references filteredGPUOperators, gpuFilter, mgr.olmOperators and GetDependencies so you first compute the set of dependencies still required by the remaining includedOperators (iterate over includedOperators after removing filteredGPUOperators, call GetDependencies on each remaining operator and aggregate neededDeps), then only remove a dependency from includedOperators if it is not present in neededDeps; replace the current dependenciesToRemove-only approach with this two-step: build neededDeps from remaining operators, then filter includedOperators by excluding ops that are both in the filtered set and not required by neededDeps.
598-621: ⚠️ Potential issue | 🔴 Critical

Dependency removal logic incorrectly removes shared dependencies between operators.

The code removes an operator from the result if its name appears in dependenciesToRemove, without checking if other included operators still depend on it.

Example scenario:
- OpenStack Controller (OSC) and Node Feature Discovery (NFD) are enabled
- User disables NVIDIA GPU operator (which depends on NFD)
- Code marks NFD as a dependency to remove (from NVIDIA)
- NFD gets filtered out, breaking OSC which still needs it

The fix should only remove a dependency if no included operator depends on it.
🐛 Proposed fix to preserve shared dependencies
  // Remove dependencies of filtered GPU operators
  if len(filteredGPUOperators) > 0 {
  	dependenciesToRemove := make(map[string]bool)
+ 	dependenciesToKeep := make(map[string]bool)
+
  	for _, filteredGPU := range filteredGPUOperators {
  		if gpuOperator, exists := mgr.olmOperators[filteredGPU]; exists {
  			deps, err := gpuOperator.GetDependencies(&common.Cluster{})
  			if err == nil {
  				for _, dep := range deps {
  					dependenciesToRemove[dep] = true
- 					mgr.log.Infof("Filtering out dependency %s of GPU operator %s", dep, filteredGPU)
  				}
  			}
  		}
  	}
+ 	// Collect dependencies of remaining included operators
+ 	for _, op := range filtered {
+ 		if operator, exists := mgr.olmOperators[op.Name]; exists {
+ 			deps, err := operator.GetDependencies(&common.Cluster{})
+ 			if err == nil {
+ 				for _, dep := range deps {
+ 					dependenciesToKeep[dep] = true
+ 				}
+ 			}
+ 		}
+ 	}
+ 	// Apply dependency filtering
  	finalFiltered := make([]*models.MonitoredOperator, 0, len(filtered))
  	for _, operator := range filtered {
- 		if !dependenciesToRemove[operator.Name] {
+ 		if !dependenciesToRemove[operator.Name] || dependenciesToKeep[operator.Name] {
  			finalFiltered = append(finalFiltered, operator)
+ 		} else {
+ 			mgr.log.Infof("Filtering out dependency %s of removed GPU operator", operator.Name)
  		}
  	}
  	filtered = finalFiltered
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/operators/manager.go` around lines 598 - 621, The dependency-removal collects dependenciesToRemove from filteredGPUOperators but then unconditionally filters out any operator whose name is in dependenciesToRemove, which incorrectly removes shared dependencies still required by other included operators; change the logic to compute dependency reference counts among the operators that will remain (use the same filtered slice of *models.MonitoredOperator), by calling GetDependencies on mgr.olmOperators for each operator in filtered to build a map[string]int of how many included operators depend on each dependency, then remove an operator only if dependenciesToRemove[name] is true AND the dependency count for that name is zero; update the filtering step that builds finalFiltered to consult that reference-count map instead of the boolean dependenciesToRemove map so shared dependencies are preserved.
🧹 Nitpick comments (2)
internal/operators/manager.go (2)
952-962: Use constants instead of string literals for operator names.

Same issue as above: use nvidiagpu.Operator.Name and amdgpu.Operator.Name constants instead of string literals for consistency and maintainability.

♻️ Proposed fix
  switch operatorName {
- case "nvidia-gpu":
+ case nvidiagpu.Operator.Name:
  	// Skip nvidia-gpu if nvidia is explicitly disabled
  	if gpuFilter.NvidiaEnabled != nil && !*gpuFilter.NvidiaEnabled {
  		return true
  	}
- case "amd-gpu":
+ case amdgpu.Operator.Name:
  	// Skip amd-gpu if amd is explicitly disabled
  	if gpuFilter.AmdEnabled != nil && !*gpuFilter.AmdEnabled {
  		return true
  	}
  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/operators/manager.go` around lines 952 - 962, Replace the string literals "nvidia-gpu" and "amd-gpu" in the operatorName switch with the canonical constants exported by the GPU operator packages (use nvidiagpu.Operator.Name and amdgpu.Operator.Name) so the switch matches the single source-of-truth names; update the two case labels in the switch inside manager.go (the block that checks gpuFilter.NvidiaEnabled and gpuFilter.AmdEnabled) to use those constants instead of raw strings.
902-903: Use constants instead of string literals for operator names.

Using string literals "nvidia-gpu" and "amd-gpu" is inconsistent with other parts of the code that use nvidiagpu.Operator.Name and amdgpu.Operator.Name. If these constants change, this code would silently break.

♻️ Proposed fix
- 	if operatorName == "nvidia-gpu" || operatorName == "amd-gpu" {
+ 	if operatorName == nvidiagpu.Operator.Name || operatorName == amdgpu.Operator.Name {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/operators/manager.go` around lines 902 - 903, Replace the string literals "nvidia-gpu" and "amd-gpu" with the corresponding operator name constants to avoid fragile comparisons: update the conditional that checks operatorName (where filteredGPUOperators is appended) to compare against nvidiagpu.Operator.Name and amdgpu.Operator.Name instead of the hard-coded strings so future renames stay consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/operators/manager.go`:
- Around line 626-643: The function ensureAllGPUOperators currently
shallow-copies the operators slice into result, so setting
operator.DependencyOnly = true mutates the original MonitoredOperator pointers;
fix by deep-copying each element into new MonitoredOperator instances before
modifying them (e.g., iterate the input operators and append copies to result),
then perform the DependencyOnly updates and the checks against
nvidiagpu.Operator.Name and amdgpu.Operator.Name on these copies to avoid
mutating the caller's slice.
---
Duplicate comments:
In `@internal/operators/manager.go`:
- Around line 914-938: The code collects dependencies of filtered GPU operators
into dependenciesToRemove and unconditionally removes them from
includedOperators, which breaks shared dependencies; update the logic in the
block that references filteredGPUOperators, gpuFilter, mgr.olmOperators and
GetDependencies so you first compute the set of dependencies still required by
the remaining includedOperators (iterate over includedOperators after removing
filteredGPUOperators, call GetDependencies on each remaining operator and
aggregate neededDeps), then only remove a dependency from includedOperators if
it is not present in neededDeps; replace the current dependenciesToRemove-only
approach with this two-step: build neededDeps from remaining operators, then
filter includedOperators by excluding ops that are both in the filtered set and
not required by neededDeps.
- Around line 598-621: The dependency-removal collects dependenciesToRemove from
filteredGPUOperators but then unconditionally filters out any operator whose
name is in dependenciesToRemove, which incorrectly removes shared dependencies
still required by other included operators; change the logic to compute
dependency reference counts among the operators that will remain (use the same
filtered slice of *models.MonitoredOperator), by calling GetDependencies on
mgr.olmOperators for each operator in filtered to build a map[string]int of how
many included operators depend on each dependency, then remove an operator only
if dependenciesToRemove[name] is true AND the dependency count for that name is
zero; update the filtering step that builds finalFiltered to consult that
reference-count map instead of the boolean dependenciesToRemove map so shared
dependencies are preserved.
---
Nitpick comments:
In `@internal/operators/manager.go`:
- Around line 952-962: Replace the string literals "nvidia-gpu" and "amd-gpu" in
the operatorName switch with the canonical constants exported by the GPU
operator packages (use nvidiagpu.Operator.Name and amdgpu.Operator.Name) so the
switch matches the single source-of-truth names; update the two case labels in
the switch inside manager.go (the block that checks gpuFilter.NvidiaEnabled and
gpuFilter.AmdEnabled) to use those constants instead of raw strings.
- Around line 902-903: Replace the string literals "nvidia-gpu" and "amd-gpu"
with the corresponding operator name constants to avoid fragile comparisons:
update the conditional that checks operatorName (where filteredGPUOperators is
appended) to compare against nvidiagpu.Operator.Name and amdgpu.Operator.Name
instead of the hard-coded strings so future renames stay consistent.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d4630a39-fc8c-4d65-9504-95b9bef3e5fe
📒 Files selected for processing (1)
internal/operators/manager.go
// ensureAllGPUOperators ensures both nvidia-gpu and amd-gpu operators are included as dependencies (backward compatibility)
func (mgr *Manager) ensureAllGPUOperators(operators []*models.MonitoredOperator) []*models.MonitoredOperator {
	result := make([]*models.MonitoredOperator, len(operators))
	copy(result, operators)

	// Track which GPU operators are already present
	hasNvidia := false
	hasAmd := false

	for _, operator := range result {
		if operator.Name == nvidiagpu.Operator.Name {
			operator.DependencyOnly = true
			hasNvidia = true
		} else if operator.Name == amdgpu.Operator.Name {
			operator.DependencyOnly = true
			hasAmd = true
		}
	}
Shallow copy causes unintended mutation of input operators.
The copy() function copies pointers, so modifying operator.DependencyOnly = true in the loop affects the original operators slice passed by the caller. This could cause unexpected side effects.
🛡️ Proposed fix to deep copy operators before modification
func (mgr *Manager) ensureAllGPUOperators(operators []*models.MonitoredOperator) []*models.MonitoredOperator {
- result := make([]*models.MonitoredOperator, len(operators))
- copy(result, operators)
+ result := make([]*models.MonitoredOperator, 0, len(operators))
+ for _, op := range operators {
+ // Create a copy to avoid modifying the original
+ opCopy := *op
+ result = append(result, &opCopy)
+ }
// Track which GPU operators are already present
hasNvidia := false
hasAmd := false
- for _, operator := range result {
+ for i, operator := range result {
if operator.Name == nvidiagpu.Operator.Name {
- operator.DependencyOnly = true
+ result[i].DependencyOnly = true
hasNvidia = true
} else if operator.Name == amdgpu.Operator.Name {
- operator.DependencyOnly = true
+ result[i].DependencyOnly = true
hasAmd = true
}
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@internal/operators/manager.go` around lines 626 - 643, The function
ensureAllGPUOperators currently shallow-copies the operators slice into result,
so setting operator.DependencyOnly = true mutates the original MonitoredOperator
pointers; fix by deep-copying each element into new MonitoredOperator instances
before modifying them (e.g., iterate the input operators and append copies to
result), then perform the DependencyOnly updates and the checks against
nvidiagpu.Operator.Name and amdgpu.Operator.Name on these copies to avoid
mutating the caller's slice.
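The aliasing bug flagged above is easy to reproduce in miniature: `copy` on a slice of pointers duplicates the pointers, not the structs, so a mutation through the "copy" is visible to the caller. A local `MonitoredOperator` stand-in is used here for self-containment.

```go
package main

import "fmt"

type MonitoredOperator struct {
	Name           string
	DependencyOnly bool
}

func main() {
	// Shallow copy: both slices point at the same struct.
	orig := []*MonitoredOperator{{Name: "nvidia-gpu"}}
	shallow := make([]*MonitoredOperator, len(orig))
	copy(shallow, orig)
	shallow[0].DependencyOnly = true
	fmt.Println(orig[0].DependencyOnly) // true: the caller's operator was mutated

	// Deep copy: each element is cloned before modification.
	orig2 := []*MonitoredOperator{{Name: "amd-gpu"}}
	deep := make([]*MonitoredOperator, 0, len(orig2))
	for _, op := range orig2 {
		c := *op
		deep = append(deep, &c)
	}
	deep[0].DependencyOnly = true
	fmt.Println(orig2[0].DependencyOnly) // false: the caller's slice is untouched
}
```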
@shay23bra: The following tests failed, say
Full PR test history. Your PR dashboard. Details: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Add independent GPU vendor selection for OpenShift AI bundles - Users can now select NVIDIA only, AMD only, both, or neither through nvidia_enabled and amd_enabled boolean parameters.

Maintain backward compatibility - Both parameters default to true, so existing API calls continue to work unchanged and include both GPU operators as before.

Support both endpoints - Works for bundle listing (/v2/operators/bundles) and single bundle retrieval (/v2/operators/bundles/{id}) with identical parameter behavior.

Installation-time GPU filtering - Enhanced ResolveDependencies to honor GPU vendor selection from operator properties during cluster installation, ensuring users' GPU filter choices ({"gpu_filter":{"nvidia_enabled":true,"amd_enabled":false}}) are respected in both bundle discovery and actual operator deployment.

Summary by CodeRabbit
{"gpu_filter":{"nvidia_enabled":true,"amd_enabled":false}}) are respected in both bundle discovery and actual operator deploymentSummary by CodeRabbit