Skip to content

Conversation

@NERLOE
Copy link
Contributor

@NERLOE NERLOE commented Oct 24, 2025

Add KUBERNETES_WORKER_PRIORITY_CLASS_NAME environment variable to allow configuring a priority class for task run pods in Kubernetes environments.

This addresses pod preemption issues in resource-constrained clusters, particularly in GKE Autopilot where low-priority pods can be preempted by system pods during scale-up operations.

Closes #2630

✅ Checklist

  • I have followed every step in the contributing guide
  • The PR title follows the convention.
  • I ran and tested the code works

Testing

Added example configuration and tested assignment of priority class on our worker pods in our GKE Autopilot production cluster.

supervisor:
  extraEnvVars:
    - name: KUBERNETES_WORKER_PRIORITY_CLASS_NAME
      value: "trigger-task-runs"

extraManifests:
  - apiVersion: scheduling.k8s.io/v1
    kind: PriorityClass
    metadata:
      name: trigger-task-runs
    value: 100000
    globalDefault: false

Changelog

  • Added: KUBERNETES_WORKER_PRIORITY_CLASS_NAME environment variable to apps/supervisor/src/env.ts
  • Added: Support for priorityClassName in Kubernetes pod spec (apps/supervisor/src/workloadManager/kubernetes.ts)
  • Enhancement: Worker pods can now be configured with priority classes to prevent preemption in resource-constrained environments
  • Compatibility: Fully backward compatible - optional configuration with no breaking changes

Screenshots

N/A - Infrastructure/backend change. Configuration can be verified via:

# Verify priority class is applied to new worker pods
kubectl get pods -n trigger -l app=task-run \
  -o custom-columns=NAME:.metadata.name,PRIORITY:.spec.priority,PRIORITY_CLASS:.spec.priorityClassName

Expected output after configuration:

NAME                               PRIORITY   PRIORITY_CLASS
runner-cmh4xxxxx                   100000     trigger-task-runs

Add KUBERNETES_POD_PRIORITY_CLASS_NAME environment variable to allow
configuring a priority class for task run pods in Kubernetes environments.

This addresses pod preemption issues in resource-constrained clusters,
particularly in GKE Autopilot where low-priority pods can be preempted
by system pods during scale-up operations.

Fixes: Pod preemption causing SIGTERM failures during cluster scale-up
Related: GKE Autopilot resource contention
@changeset-bot
Copy link

changeset-bot bot commented Oct 24, 2025

⚠️ No Changeset found

Latest commit: 7a77c9f

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 24, 2025

Walkthrough

This pull request extends the Kubernetes configuration schema by adding a new optional environment variable KUBERNETES_WORKER_PRIORITY_CLASS_NAME. The environment variable definition is added to the Env schema, and corresponding logic is implemented in the workload manager to conditionally include the priorityClassName field in pod specifications when this variable is defined. These changes introduce new configuration capability without modifying existing fields or control flow.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

The review scope is limited to two tightly-related files with homogeneous changes focused on a single feature. The modifications are straightforward: adding an optional schema field and implementing conditional logic to use it. No complex logic density, no structural refactoring, and minimal risk of unintended side effects on existing functionality.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (4 passed)
Check name Status Explanation
Title Check ✅ Passed The PR title "feat(supervisor): add k8 worker pod priority class support" clearly and accurately describes the main change in the pull request. It follows the conventional commit format with a proper scope (supervisor) and concisely identifies the primary feature being added: Kubernetes worker pod priority class support. The title directly corresponds to the implementation of the new KUBERNETES_WORKER_PRIORITY_CLASS_NAME environment variable and its integration into the pod specification.
Linked Issues Check ✅ Passed The implementation directly addresses all primary coding objectives from linked issue #2630. The changes add the KUBERNETES_WORKER_PRIORITY_CLASS_NAME environment variable configuration capability and modify the Kubernetes pod spec to conditionally include the priorityClassName field. This implementation allows operators to configure priority classes for task run pods to prevent preemption during cluster scale-up in resource-constrained environments like GKE Autopilot. The solution is backward compatible with no breaking changes, and the PR description demonstrates testing in a GKE Autopilot production cluster.
Out of Scope Changes Check ✅ Passed All changes in this pull request are directly scoped to the objectives defined in linked issue #2630. The modifications are limited to two files: apps/supervisor/src/env.ts (adding the environment variable) and apps/supervisor/src/workloadManager/kubernetes.ts (using the variable in pod spec). No extraneous files were modified, no unrelated functionality was changed, and all changes are focused exclusively on implementing the priority class support feature for Kubernetes worker pods.
Description Check ✅ Passed The PR description is comprehensive and follows the required template structure. It includes all essential sections: the issue reference (Closes #2630), a completed checklist confirming the contributing guide was followed and the code was tested, a detailed testing section with example configuration and kubectl verification commands, a thorough changelog section with specific file modifications, and an appropriate "N/A" note for screenshots with reasoning. The description provides clear context about the problem (pod preemption in GKE Autopilot) and demonstrates that the solution was tested in production.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 255ea0a and 7a77c9f.

📒 Files selected for processing (2)
  • apps/supervisor/src/env.ts (1 hunks)
  • apps/supervisor/src/workloadManager/kubernetes.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

**/*.{ts,tsx}: Always prefer using isomorphic code like fetch, ReadableStream, etc. instead of Node.js specific code
For TypeScript, we usually use types over interfaces
Avoid enums
No default exports, use function declarations

Files:

  • apps/supervisor/src/env.ts
  • apps/supervisor/src/workloadManager/kubernetes.ts
🧬 Code graph analysis (1)
apps/supervisor/src/workloadManager/kubernetes.ts (2)
apps/supervisor/src/env.ts (1)
  • env (120-120)
apps/webapp/app/env.server.ts (1)
  • env (1208-1208)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (22)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
  • GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
  • GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
  • GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
  • GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
  • GitHub Check: typecheck / typecheck
🔇 Additional comments (2)
apps/supervisor/src/env.ts (1)

84-84: LGTM! Clean addition of optional priority class configuration.

The environment variable is correctly defined as optional, properly placed within the Kubernetes settings section, and follows the existing naming convention.

apps/supervisor/src/workloadManager/kubernetes.ts (1)

289-293: LGTM! Consistent implementation of optional priorityClassName.

The conditional spread pattern matches the existing approach for schedulerName and nodeSelector, ensuring the field is only included when configured. The Kubernetes API will handle validation of the priority class name at pod creation time, with errors properly caught and handled by the existing error handling logic.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: add support for configuring worker pod priority class

1 participant