Skip to content

chore(notebook-controller): fix of some of the linter errors#780

Open
jstourac wants to merge 1 commit intoopendatahub-io:mainfrom
jstourac:fixGolangciLint
Open

chore(notebook-controller): fix of some of the linter errors#780
jstourac wants to merge 1 commit intoopendatahub-io:mainfrom
jstourac:fixGolangciLint

Conversation

@jstourac
Copy link
Member

@jstourac jstourac commented Mar 18, 2026

https://redhat.atlassian.net/browse/RHOAIENG-12107

Description

How Has This Been Tested?

Merge criteria:

  • The commits are squashed in a cohesive manner and have meaningful messages.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work

Summary by CodeRabbit

  • Bug Fixes

    • Improved HTTP response cleanup to reduce resource leaks.
    • Stronger test setup validation to fail fast on bad environment configuration.
    • Standardized and simplified log messages for clearer diagnostics.
    • More consistent propagation of labels/annotations to workload and networking resources.
  • Chores

    • Internal logic and control-flow adjustments to increase stability and maintainability.

@jstourac jstourac self-assigned this Mar 18, 2026
@openshift-ci openshift-ci bot requested review from caponetto and paulovmr March 18, 2026 18:44
@openshift-ci
Copy link

openshift-ci bot commented Mar 18, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from jstourac. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Copy link

coderabbitai bot commented Mar 18, 2026

📝 Walkthrough

Walkthrough

This diff modifies HTTP response-body handling to log Close() errors via deferred closures; removes explicit nil checks for kernel/terminal slices, treating only empty length as "no data"; changes SetStopAnnotation to unconditionally delete the notebooks.kubeflow.org/last_activity annotation; moves logger initialization in UpdateNotebookLastActivityAnnotation after a nil-meta early return; updates StatefulSet/VirtualService label and annotation sourcing to use instance.Labels/instance.Annotations and to set pod-template Spec.Template.{Labels,Annotations}; and tightens test setup to set env vars before calling initGlobalVars() with error checks.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Security & Logic Findings

  • culling_controller.go / pkg/culler/culler.go — Response body Close logging: deferring a closure that logs Close() errors is fine, but ensure Close() is always reached even on earlier returns and that large numbers of open responses cannot exhaust connections (CWE-772). Action: confirm client reuse semantics and add tests for connection reuse under failure paths.

  • culler.go — Removal of nil slice checks: changing from kernels == nil || len(...) == 0 to len(...) == 0 assumes callers never pass nil slices. If a nil slice can occur, this risks a nil dereference or logic gap (CWE-476). Action: enforce non-nil slice creation at call sites or restore explicit nil guard.

  • culler.go — Unconditional delete of last_activity annotation: calling delete(meta.GetAnnotations(), ...) without verifying annotations map may be a no-op if map is nil; however meta.GetAnnotations() may return nil, causing a panic when deleting from nil map is allowed in Go but using nil map is fine for delete — still confirm intention. Action: document behavioral change and add unit tests covering absent annotation and nil annotations map.

  • notebook_controller.go — Label/annotation source shift: propagating Notebook labels/annotations into Pod template alters pod identity and may affect label-based selectors, NetworkPolicy, or Service behavior. Action: review all consumers of Notebook/ObjectMeta labels and update tests to ensure no regression.

  • tests — TestNotebookIsIdle: test now fatals on environment setup errors and calls initGlobalVars() after env mutation. Action: ensure test isolation and restore original global state after tests to avoid flakiness.

No CVE identifiers identified in this diff.

🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix of some of the linter errors' directly reflects the PR's stated objective to fix golangci-lint errors in the notebook-controller, matching the actual changeset.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot added size/m and removed size/m labels Mar 18, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@components/notebook-controller/controllers/culling_controller_test.go`:
- Around line 254-257: The test mutates global env with os.Setenv inside the
t.Run subtests (loop setting envVar/val) causing cross-test leakage; replace
os.Setenv calls with t.Setenv(envVar, val) inside the subtest so the testing
framework automatically restores the environment after each subtest (update the
loop that currently calls os.Setenv(err := os.Setenv(envVar, val)) to use
t.Setenv(envVar, val) and remove manual cleanup).

In `@components/notebook-controller/controllers/culling_controller.go`:
- Around line 285-291: The code currently decodes resp.Body directly via
json.NewDecoder(resp.Body).Decode(&kernels) which allows an attacker to stream
arbitrarily large JSON and exhaust memory; fix by wrapping the response body
with a size limit (e.g. io.LimitReader(resp.Body, maxBodySize)) or read into an
io.LimitedReader before creating the decoder, define a reasonable constant like
maxNotebookResponseBytes, then replace json.NewDecoder(resp.Body) with
json.NewDecoder(io.LimitReader(resp.Body, maxNotebookResponseBytes)) and handle
the case where the limit is hit (return an explicit error/log) before closing
resp.Body; ensure the same pattern is applied anywhere resp.Body is decoded
(references: resp.Body, json.NewDecoder, kernels).

In `@components/notebook-controller/controllers/notebook_controller.go`:
- Around line 190-192: The template labels on the StatefulSet are only being
updated inside the replica-change conditional (the block comparing replicas), so
label-only updates leave foundStateful.Spec.Template.Labels stale; move the
block that compares and assigns Template.Labels (the check using
reflect.DeepEqual on foundStateful.Spec.Template.Labels vs
ss.Spec.Template.Labels and the assignment foundStateful.Spec.Template.Labels =
ss.Spec.Template.Labels) out of the replica-reconciliation branch so it runs
unconditionally whenever CopyStatefulSetFields has synced ObjectMeta labels;
ensure this runs every reconciliation pass (e.g., after CopyStatefulSetFields)
so ss/ foundStateful template labels are always kept in sync.

In `@components/notebook-controller/pkg/culler/culler.go`:
- Line 136: The code is deleting the wrong annotation key (uses
"notebooks.kubeflow.org/last_activity" with an underscore), leaving the actual
LAST_ACTIVITY_ANNOTATION (".../last-activity") intact; update the deletion to
remove the correct annotation by calling delete(meta.GetAnnotations(),
LAST_ACTIVITY_ANNOTATION) or using the hyphenated string constant, ensuring the
old underscore key is no longer relied upon.
- Around line 189-195: The code decodes HTTP response bodies with
json.NewDecoder(resp.Body) into kernels without bounds; replace direct use of
resp.Body with an io.LimitReader to prevent memory exhaustion (e.g., lr :=
io.LimitReader(resp.Body, maxJSONSize)) and pass lr to json.NewDecoder instead;
add a package-level constant like maxJSONSize (e.g., 10<<20 for 10MB) and ensure
io is imported; apply the same change for the other json.Decode usage around the
later decode (the second json.NewDecoder call) and keep the existing
resp.Body.Close() defer.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: a1cbf2b7-4f79-4854-b001-1fab0cda4209

📥 Commits

Reviewing files that changed from the base of the PR and between cbb479d and 72ab393.

📒 Files selected for processing (4)
  • components/notebook-controller/controllers/culling_controller.go
  • components/notebook-controller/controllers/culling_controller_test.go
  • components/notebook-controller/controllers/notebook_controller.go
  • components/notebook-controller/pkg/culler/culler.go

Comment on lines +254 to +257
if err := os.Setenv(envVar, val); err != nil {
t.Fatalf("Failed to set env var %s: %v", envVar, err)
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# Check the file and get context around lines 254-257
wc -l components/notebook-controller/controllers/culling_controller_test.go

Repository: opendatahub-io/kubeflow

Length of output: 140


🏁 Script executed:

# Read the file around the specified lines with context
sed -n '240,270p' components/notebook-controller/controllers/culling_controller_test.go

Repository: opendatahub-io/kubeflow

Length of output: 792


🏁 Script executed:

# Get a broader view of the test structure to understand if it's using subtests
sed -n '200,280p' components/notebook-controller/controllers/culling_controller_test.go

Repository: opendatahub-io/kubeflow

Length of output: 1819


🏁 Script executed:

# Check Go version requirement
fd -e "go.mod" | head -1 | xargs cat | grep "^go "

Repository: opendatahub-io/kubeflow

Length of output: 49


🏁 Script executed:

# Verify the actual current code is as expected
sed -n '254,257p' components/notebook-controller/controllers/culling_controller_test.go

Repository: opendatahub-io/kubeflow

Length of output: 187


🏁 Script executed:

# Check the test function signature to understand the testing context better
sed -n '200,260p' components/notebook-controller/controllers/culling_controller_test.go | head -70

Repository: opendatahub-io/kubeflow

Length of output: 1651


🏁 Script executed:

# Look for the test function name and start
rg "func Test.*notebookIsIdle" -A 5 components/notebook-controller/controllers/culling_controller_test.go

Repository: opendatahub-io/kubeflow

Length of output: 49


🏁 Script executed:

# Search for the test function that contains the code around line 254
rg "func Test" components/notebook-controller/controllers/culling_controller_test.go

Repository: opendatahub-io/kubeflow

Length of output: 237


🏁 Script executed:

# Get the full context of the issue - from the test function declaration onwards
sed -n '170,280p' components/notebook-controller/controllers/culling_controller_test.go

Repository: opendatahub-io/kubeflow

Length of output: 2498


Use t.Setenv to avoid cross-test environment leakage.

The os.Setenv calls inside t.Run subtests mutate global process state without cleanup. Subtests with empty env maps will inherit environment variables from prior test cases, creating order-dependent test behavior. Replace with t.Setenv, which automatically restores the environment when each subtest completes.

Proposed fix
-			for envVar, val := range c.env {
-				if err := os.Setenv(envVar, val); err != nil {
-					t.Fatalf("Failed to set env var %s: %v", envVar, err)
-				}
-			}
+			for envVar, val := range c.env {
+				t.Setenv(envVar, val)
+			}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if err := os.Setenv(envVar, val); err != nil {
t.Fatalf("Failed to set env var %s: %v", envVar, err)
}
}
for envVar, val := range c.env {
t.Setenv(envVar, val)
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/notebook-controller/controllers/culling_controller_test.go` around
lines 254 - 257, The test mutates global env with os.Setenv inside the t.Run
subtests (loop setting envVar/val) causing cross-test leakage; replace os.Setenv
calls with t.Setenv(envVar, val) inside the subtest so the testing framework
automatically restores the environment after each subtest (update the loop that
currently calls os.Setenv(err := os.Setenv(envVar, val)) to use t.Setenv(envVar,
val) and remove manual cleanup).

Comment on lines +285 to 291
defer func() {
if err := resp.Body.Close(); err != nil {
log.Error(err, "Error closing response body.")
}
}()
err := json.NewDecoder(resp.Body).Decode(&kernels)
if err != nil {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

fd -t f "culling_controller.go" --exec wc -l {} \;

Repository: opendatahub-io/kubeflow

Length of output: 137


🏁 Script executed:

fd -t f "culling_controller.go" --exec cat -n {} \; | sed -n '280,320p'

Repository: opendatahub-io/kubeflow

Length of output: 1310


🏁 Script executed:

fd -t f "culling_controller.go" --exec grep -n "json.NewDecoder" {} \;

Repository: opendatahub-io/kubeflow

Length of output: 180


🏁 Script executed:

fd -t f "culling_controller.go" --exec head -n 30 {} \; | grep -E "^import|^import \(|\"io\"|\"encoding/json"|\"net/http"

Repository: opendatahub-io/kubeflow

Length of output: 135


🏁 Script executed:

fd -t f "culling_controller.go" --exec cat -n {} \; | sed -n '1,50p'

Repository: opendatahub-io/kubeflow

Length of output: 2079


Critical: bound Notebook API body decoding to prevent memory-exhaustion DoS (CWE-400).

Lines 290 and 314 decode HTTP responses without size limits. A compromised or malicious notebook endpoint can exhaust controller memory by streaming oversized JSON, destabilizing reconciliation.

Fix
 import (
 	"context"
 	"encoding/json"
 	"fmt"
+	"io"
 	"net/http"
 	"os"
 	"strconv"
 	"time"
@@
 const DEFAULT_DEV = "false"
+const MAX_NOTEBOOK_API_BODY_BYTES int64 = 1 << 20 // 1 MiB
@@
-	err := json.NewDecoder(resp.Body).Decode(&kernels)
+	err := json.NewDecoder(io.LimitReader(resp.Body, MAX_NOTEBOOK_API_BODY_BYTES)).Decode(&kernels)
@@
-	err := json.NewDecoder(resp.Body).Decode(&terminals)
+	err := json.NewDecoder(io.LimitReader(resp.Body, MAX_NOTEBOOK_API_BODY_BYTES)).Decode(&terminals)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/notebook-controller/controllers/culling_controller.go` around
lines 285 - 291, The code currently decodes resp.Body directly via
json.NewDecoder(resp.Body).Decode(&kernels) which allows an attacker to stream
arbitrarily large JSON and exhaust memory; fix by wrapping the response body
with a size limit (e.g. io.LimitReader(resp.Body, maxBodySize)) or read into an
io.LimitedReader before creating the decoder, define a reasonable constant like
maxNotebookResponseBytes, then replace json.NewDecoder(resp.Body) with
json.NewDecoder(io.LimitReader(resp.Body, maxNotebookResponseBytes)) and handle
the case where the limit is hit (return an explicit error/log) before closing
resp.Body; ensure the same pattern is applied anywhere resp.Body is decoded
(references: resp.Body, json.NewDecoder, kernels).

Comment on lines +190 to 192
if !reflect.DeepEqual(foundStateful.Spec.Template.Labels, ss.Spec.Template.Labels) {
foundStateful.Spec.Template.Labels = ss.Spec.Template.Labels
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

# 1) Inspect CopyStatefulSetFields behavior for ObjectMeta/Template label reconciliation
rg -nP --type=go 'func\s+CopyStatefulSetFields\s*\(' -C20

# 2) Show all StatefulSet label assignments in notebook controller
rg -nP --type=go 'foundStateful\.(Labels|Spec\.Template\.Labels)|ss\.(Labels|Spec\.Template\.Labels)' components/notebook-controller/controllers/notebook_controller.go -C3

# 3) Re-check test expectations around StatefulSet object labels
rg -nP 'Expect\(sts\.GetLabels\(\)\)\.To\(Equal\(notebook\.GetLabels\(\)\)\)' components/notebook-controller/controllers/notebook_controller_bdd_test.go -C3

Repository: opendatahub-io/kubeflow

Length of output: 3997


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Get the full reconciliation logic around lines 189-192
sed -n '170,210p' components/notebook-controller/controllers/notebook_controller.go

# Check if CopyStatefulSetFields is called before or after the replica check
rg -nP 'CopyStatefulSetFields' components/notebook-controller/controllers/notebook_controller.go -B5 -A5

# Verify what triggers the entire reconciliation block
sed -n '180,200p' components/notebook-controller/controllers/notebook_controller.go

Repository: opendatahub-io/kubeflow

Length of output: 2991


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Search for any other template label assignments/reconciliation in notebook controller
rg -nP 'Spec\.Template\.Labels\s*=' components/notebook-controller/controllers/notebook_controller.go

# Verify CopyStatefulSetFields doesn't handle template labels
sed -n '107,140p' components/common/reconcilehelper/util.go

Repository: opendatahub-io/kubeflow

Length of output: 933


Template labels lag behind ObjectMeta labels on label-only updates.

Template label reconciliation is incorrectly gated by the replica check (lines 189–192). When a Notebook's labels change without replica scaling, CopyStatefulSetFields syncs ObjectMeta labels unconditionally, but Template.Labels remain stale. This causes pod template label drift, breaking pod scheduling selectors, network policies, and label-based identity.

Move the template label sync outside the replica condition:

Fix
-	if ss.Spec.Replicas != nil && foundStateful.Spec.Replicas != nil && *ss.Spec.Replicas != *foundStateful.Spec.Replicas {
-		if !reflect.DeepEqual(foundStateful.Spec.Template.Labels, ss.Spec.Template.Labels) {
-			foundStateful.Spec.Template.Labels = ss.Spec.Template.Labels
-		}
-	}
+	if !reflect.DeepEqual(foundStateful.Spec.Template.Labels, ss.Spec.Template.Labels) {
+		foundStateful.Spec.Template.Labels = ss.Spec.Template.Labels
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/notebook-controller/controllers/notebook_controller.go` around
lines 190 - 192, The template labels on the StatefulSet are only being updated
inside the replica-change conditional (the block comparing replicas), so
label-only updates leave foundStateful.Spec.Template.Labels stale; move the
block that compares and assigns Template.Labels (the check using
reflect.DeepEqual on foundStateful.Spec.Template.Labels vs
ss.Spec.Template.Labels and the assignment foundStateful.Spec.Template.Labels =
ss.Spec.Template.Labels) out of the replica-reconciliation branch so it runs
unconditionally whenever CopyStatefulSetFields has synced ObjectMeta labels;
ensure this runs every reconciliation pass (e.g., after CopyStatefulSetFields)
so ss/ foundStateful template labels are always kept in sync.

delete(meta.GetAnnotations(), "notebooks.kubeflow.org/last_activity")
}
}
delete(meta.GetAnnotations(), "notebooks.kubeflow.org/last_activity")
Copy link

@coderabbitai coderabbitai bot Mar 18, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Wrong annotation key is deleted; stale last-activity remains.

The code deletes "notebooks.kubeflow.org/last_activity" (underscore), but the active key is LAST_ACTIVITY_ANNOTATION (.../last-activity, hyphen). This leaves the intended annotation untouched.

Proposed fix
-	delete(meta.GetAnnotations(), "notebooks.kubeflow.org/last_activity")
+	delete(meta.GetAnnotations(), LAST_ACTIVITY_ANNOTATION)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
delete(meta.GetAnnotations(), "notebooks.kubeflow.org/last_activity")
delete(meta.GetAnnotations(), LAST_ACTIVITY_ANNOTATION)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/notebook-controller/pkg/culler/culler.go` at line 136, The code is
deleting the wrong annotation key (uses "notebooks.kubeflow.org/last_activity"
with an underscore), leaving the actual LAST_ACTIVITY_ANNOTATION
(".../last-activity") intact; update the deletion to remove the correct
annotation by calling delete(meta.GetAnnotations(), LAST_ACTIVITY_ANNOTATION) or
using the hyphenated string constant, ensuring the old underscore key is no
longer relied upon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jstourac lets check this , please is the annotation really been wrong all this while ?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it does seem like that - as I mentioned below, I didn't want to do any other changes here except the linter fixes. I have a followup branch (wip PR) with changes that should address all the comments from the bot here including this one https://github.com/opendatahub-io/kubeflow/pull/781/changes#diff-bfc792d035f63eea8527be116419aca884baf8d8bc2c3e758408e4e7e621af7eR137 . Will open a separate smaller PRs later on so that it's easier to review and faster to merge 🙂

Comment on lines +189 to 195
defer func() {
if err := resp.Body.Close(); err != nil {
log.Error(err, "Error closing response body.")
}
}()
err := json.NewDecoder(resp.Body).Decode(&kernels)
if err != nil {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cat -n components/notebook-controller/pkg/culler/culler.go | sed -n '180,230p'

Repository: opendatahub-io/kubeflow

Length of output: 1748


🏁 Script executed:

cat -n components/notebook-controller/pkg/culler/culler.go | sed -n '1,180p' | tail -100

Repository: opendatahub-io/kubeflow

Length of output: 3611


🏁 Script executed:

head -50 components/notebook-controller/pkg/culler/culler.go

Repository: opendatahub-io/kubeflow

Length of output: 1728


🏁 Script executed:

rg -n "json\.(NewDecoder|Unmarshal)" components/notebook-controller/pkg/culler/culler.go

Repository: opendatahub-io/kubeflow

Length of output: 180


CWE-400: Unbounded JSON decode on HTTP response bodies allows memory exhaustion (lines 194, 218).

Severity: Critical. Notebook API response body is decoded without size limits; malicious or oversized response exhausts controller memory and disrupts reconciliation.

Proposed fix
 import (
 	"encoding/json"
 	"fmt"
+	"io"
 	"net/http"
 	"os"
 	"strconv"
 	"time"
@@
 const DEFAULT_DEV = "false"
+const MAX_NOTEBOOK_API_BODY_BYTES int64 = 1 << 20 // 1 MiB
@@
-	err := json.NewDecoder(resp.Body).Decode(&kernels)
+	err := json.NewDecoder(io.LimitReader(resp.Body, MAX_NOTEBOOK_API_BODY_BYTES)).Decode(&kernels)
@@
-	err := json.NewDecoder(resp.Body).Decode(&terminals)
+	err := json.NewDecoder(io.LimitReader(resp.Body, MAX_NOTEBOOK_API_BODY_BYTES)).Decode(&terminals)

Per Go security guidelines: Use io.LimitReader for HTTP response bodies to prevent memory exhaustion.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
defer func() {
if err := resp.Body.Close(); err != nil {
log.Error(err, "Error closing response body.")
}
}()
err := json.NewDecoder(resp.Body).Decode(&kernels)
if err != nil {
defer func() {
if err := resp.Body.Close(); err != nil {
log.Error(err, "Error closing response body.")
}
}()
err := json.NewDecoder(io.LimitReader(resp.Body, MAX_NOTEBOOK_API_BODY_BYTES)).Decode(&kernels)
if err != nil {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/notebook-controller/pkg/culler/culler.go` around lines 189 - 195,
The code decodes HTTP response bodies with json.NewDecoder(resp.Body) into
kernels without bounds; replace direct use of resp.Body with an io.LimitReader
to prevent memory exhaustion (e.g., lr := io.LimitReader(resp.Body,
maxJSONSize)) and pass lr to json.NewDecoder instead; add a package-level
constant like maxJSONSize (e.g., 10<<20 for 10MB) and ensure io is imported;
apply the same change for the other json.Decode usage around the later decode
(the second json.NewDecoder call) and keep the existing resp.Body.Close() defer.

@jstourac jstourac mentioned this pull request Mar 18, 2026
3 tasks
@jstourac
Copy link
Member Author

@coderabbitai - all your suggestions seem to be truly reasonable - I want to keep this PR clean just for the golang linter error fixes, but I have opened the #781 PR now which will address all your suggestions eventually.

@coderabbitai
Copy link

coderabbitai bot commented Mar 18, 2026

Tip

For best results, initiate chat on the files or code changes.

@jstourac: Understood — keeping this PR scoped to linter fixes is a clean approach. I'll track the remaining suggestions in #781. Thanks for the heads-up!

@jstourac jstourac requested review from andyatmiami and harshad16 and removed request for caponetto and paulovmr March 20, 2026 20:50
@openshift-ci openshift-ci bot added size/m and removed size/m labels Mar 25, 2026
@rhods-ci-bot
Copy link

/group-test

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (2)
components/notebook-controller/controllers/culling_controller_test.go (1)

253-260: ⚠️ Potential issue | 🟡 Minor

Use t.Setenv to prevent subtest environment leakage.

Line 254 mutates process-global env with os.Setenv; this makes subtests order-dependent.

Remediation diff
 		t.Run(c.testName, func(t *testing.T) {
+			for _, envVar := range []string{
+				"DEV", "CULL_IDLE_TIME", "ENABLE_CULLING", "CLUSTER_DOMAIN", "IDLENESS_CHECK_PERIOD",
+			} {
+				t.Setenv(envVar, "")
+			}
 			for envVar, val := range c.env {
-				if err := os.Setenv(envVar, val); err != nil {
-					t.Fatalf("Failed to set env var %s: %v", envVar, err)
-				}
+				t.Setenv(envVar, val)
 			}
 			if err := initGlobalVars(); err != nil {
 				t.Fatalf("Failed to initialize global vars: %v", err)
 			}
#!/bin/bash
# Verify direct os.Setenv usage in this test file.
rg -n -C2 '\bos\.Setenv\(' components/notebook-controller/controllers/culling_controller_test.go
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/notebook-controller/controllers/culling_controller_test.go` around
lines 253 - 260, Replace the process-global os.Setenv calls in the test setup
loop with t.Setenv to avoid leaking environment changes between subtests: in the
test that currently iterates "for envVar, val := range c.env { if err :=
os.Setenv(... ) ... }" use t.Setenv(envVar, val) for each entry, then call
initGlobalVars() as before; this ensures the test framework will restore envs
automatically and prevents order-dependent failures.
components/notebook-controller/controllers/culling_controller.go (1)

285-291: ⚠️ Potential issue | 🔴 Critical

Critical: unbounded JSON decode allows memory-exhaustion DoS (CWE-400).

Line 290 and Line 314 decode resp.Body without a byte limit. A compromised notebook endpoint can stream oversized JSON and exhaust controller memory.

Remediation diff
 import (
 	"context"
 	"encoding/json"
 	"fmt"
+	"io"
 	"net/http"
 	"os"
 	"strconv"
 	"time"
@@
 const DEFAULT_DEV = "false"
+const MAX_NOTEBOOK_API_BODY_BYTES int64 = 1 << 20 // 1 MiB
@@
-	err := json.NewDecoder(resp.Body).Decode(&kernels)
+	err := json.NewDecoder(io.LimitReader(resp.Body, MAX_NOTEBOOK_API_BODY_BYTES)).Decode(&kernels)
@@
-	err := json.NewDecoder(resp.Body).Decode(&terminals)
+	err := json.NewDecoder(io.LimitReader(resp.Body, MAX_NOTEBOOK_API_BODY_BYTES)).Decode(&terminals)
#!/bin/bash
# Verify unbounded decoder usage still exists in the reviewed file.
rg -n -C2 'json\.NewDecoder\(resp\.Body\)|io\.LimitReader' components/notebook-controller/controllers/culling_controller.go

As per coding guidelines, "**/*.go: Use io.LimitReader for HTTP response bodies (prevent memory exhaustion)".

Also applies to: 309-314

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@components/notebook-controller/controllers/culling_controller.go` around
lines 285 - 291, The JSON decoding of resp.Body
(json.NewDecoder(resp.Body).Decode(&kernels) and the similar decode at lines
~309-314) is unbounded and can be abused to exhaust memory; wrap resp.Body with
io.LimitReader using a reasonable max (e.g. const maxJSONBytes =
<appropriate_bytes>) and pass that LimitedReader to json.NewDecoder (e.g.
json.NewDecoder(io.LimitReader(resp.Body, maxJSONBytes))). After decoding,
handle errors that indicate truncation/limits appropriately and ensure resp.Body
is still closed in the existing defer. Replace direct uses of
json.NewDecoder(resp.Body) with the limited reader variant wherever kernels or
other response JSON are decoded.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@components/notebook-controller/controllers/culling_controller_test.go`:
- Around line 253-260: Replace the process-global os.Setenv calls in the test
setup loop with t.Setenv to avoid leaking environment changes between subtests:
in the test that currently iterates "for envVar, val := range c.env { if err :=
os.Setenv(... ) ... }" use t.Setenv(envVar, val) for each entry, then call
initGlobalVars() as before; this ensures the test framework will restore envs
automatically and prevents order-dependent failures.

In `@components/notebook-controller/controllers/culling_controller.go`:
- Around line 285-291: The JSON decoding of resp.Body
(json.NewDecoder(resp.Body).Decode(&kernels) and the similar decode at lines
~309-314) is unbounded and can be abused to exhaust memory; wrap resp.Body with
io.LimitReader using a reasonable max (e.g. const maxJSONBytes =
<appropriate_bytes>) and pass that LimitedReader to json.NewDecoder (e.g.
json.NewDecoder(io.LimitReader(resp.Body, maxJSONBytes))). After decoding,
handle errors that indicate truncation/limits appropriately and ensure resp.Body
is still closed in the existing defer. Replace direct uses of
json.NewDecoder(resp.Body) with the limited reader variant wherever kernels or
other response JSON are decoded.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Central YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 3c95dd57-2c90-4fab-bcf2-e43383a27c04

📥 Commits

Reviewing files that changed from the base of the PR and between 72ab393 and e246714.

📒 Files selected for processing (4)
  • components/notebook-controller/controllers/culling_controller.go
  • components/notebook-controller/controllers/culling_controller_test.go
  • components/notebook-controller/controllers/notebook_controller.go
  • components/notebook-controller/pkg/culler/culler.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • components/notebook-controller/pkg/culler/culler.go
  • components/notebook-controller/controllers/notebook_controller.go

@openshift-ci
Copy link

openshift-ci bot commented Mar 25, 2026

@jstourac: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/odh-notebook-controller-e2e e246714 link true /test odh-notebook-controller-e2e

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants