feat: add capability to stop allocations to job-runner policy #7

Closed
cdunster wants to merge 1 commit into main from update-runner-policy

@cdunster
Contributor

@cdunster cdunster commented Mar 26, 2026

This is required so that GitHub CI can stop/kill running Nomad allocations.

Summary by CodeRabbit

  • New Features
    • Job runner can now cancel allocations in CI workflows, extending its permissions for improved workflow control.

@cdunster cdunster self-assigned this Mar 26, 2026
@cocogitto-bot

cocogitto-bot bot commented Mar 26, 2026

✔️ e2e986b - Conventional commits check succeeded.

@coderabbitai

coderabbitai bot commented Mar 26, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: abe1b797-6db7-4ded-8e75-e595b97fcecd

📥 Commits

Reviewing files that changed from the base of the PR and between 2b68cb9 and e2e986b.

📒 Files selected for processing (2)
  • job-runner.policy.hcl
  • main.go

Walkthrough

The job-runner ACL policy is expanded to grant the alloc-lifecycle capability, enabling cancellation of allocations. The policy description is updated to reflect this new permission scope in CI workflows.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Nomad ACL Policy Configuration (job-runner.policy.hcl) | Added alloc-lifecycle capability to the default namespace capabilities list, expanding permissions beyond list-jobs, read-job, and submit-job. |
| Application Entry Point (main.go) | Updated the policy description string to include "cancelling allocations in CI workflows" alongside existing workflow capabilities. |
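
As a rough illustration of the change summarized above, the `default` namespace rule in `job-runner.policy.hcl` might look something like the sketch below. The capability names are taken from the walkthrough; the actual layout of the file, and any other rules it contains (for example, for reading node status), are not shown in this conversation and are assumptions here.

```hcl
# Sketch only: the real job-runner.policy.hcl is not reproduced on this page.
namespace "default" {
  # list-jobs, read-job, and submit-job were already granted before this PR;
  # alloc-lifecycle is the capability added so that CI can stop allocations.
  capabilities = ["list-jobs", "read-job", "submit-job", "alloc-lifecycle"]
}
```

With a token bound to a policy like this, a CI step should be able to stop an allocation, for instance via `nomad alloc stop <alloc-id>`, which would otherwise be denied without `alloc-lifecycle`.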

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding the 'alloc-lifecycle' capability to the job-runner policy to stop/cancel allocations. |

@cdunster cdunster requested a review from a team March 26, 2026 18:33
@holochain-release-automation2
Collaborator

🍹 preview on nomad-server/holochain/nomad-server

Pulumi report

View in Pulumi Cloud

  Previewing update (holochain/nomad-server)

View Live: https://app.pulumi.com/holochain/nomad-server/nomad-server/previews/b3c7250a-4a80-40be-bb97-52d2fc94e337

pulumi:pulumi:Stack: (same)
  [urn=urn:pulumi:nomad-server::nomad-server::pulumi:pulumi:Stack::nomad-server-nomad-server]
  +-command:remote:CopyToRemote: (replace)
      [id=e29f2fc1]
      [urn=urn:pulumi:nomad-server::nomad-server::command:remote:CopyToRemote::copy-job-runner-policy]
      [provider=urn:pulumi:nomad-server::nomad-server::pulumi:providers:command::default_1_0_2::3d5f4907-5ccd-4d31-8098-9549b40692ec]
    - source  : asset(file:1289f09) { ./job-runner.policy.hcl }
    + source  : asset(file:8187cd7) { ./job-runner.policy.hcl }
    ~ triggers: [
        - [0]: asset(file:1289f09) { ./job-runner.policy.hcl }
        + [0]: asset(file:8187cd7) { ./job-runner.policy.hcl }
      ]
  +-command:remote:Command: (replace)
      [id=chown-etc-nomad-dir8917c53d]
      [urn=urn:pulumi:nomad-server::nomad-server::command:remote:Command::chown-etc-nomad-dir]
      [provider=urn:pulumi:nomad-server::nomad-server::pulumi:providers:command::default_1_0_2::3d5f4907-5ccd-4d31-8098-9549b40692ec]
    ~ triggers: [
        ~ [2]: "e29f2fc1" => [unknown]
      ]
  +-command:remote:Command: (replace)
      [id=start-nomad-service22cf16b5]
      [urn=urn:pulumi:nomad-server::nomad-server::command:remote:Command::start-nomad-service]
      [provider=urn:pulumi:nomad-server::nomad-server::pulumi:providers:command::default_1_0_2::3d5f4907-5ccd-4d31-8098-9549b40692ec]
    ~ triggers: [
        ~ [2]: "e29f2fc1" => [unknown]
      ]
  +-command:remote:Command: (replace)
      [id=apply-job-runner-policyff4dcb70]
      [urn=urn:pulumi:nomad-server::nomad-server::command:remote:Command::apply-job-runner-policy]
      [provider=urn:pulumi:nomad-server::nomad-server::pulumi:providers:command::default_1_0_2::3d5f4907-5ccd-4d31-8098-9549b40692ec]
    ~ create  : "nomad acl policy apply -address=https://localhost:4646 -ca-cert=/etc/nomad.d/nomad-agent-ca.pem -token=\"$LC_ACL_TOKEN\" -description=\"For running jobs and reading Node status in CI workflows\" job-runner /etc/nomad.d/job-runner.policy.hcl" => "nomad acl policy apply -address=https://localhost:4646 -ca-cert=/etc/nomad.d/nomad-agent-ca.pem -token=\"$LC_ACL_TOKEN\" -description=\"For running jobs, reading Node status, and cancelling allocations in CI workflows\" job-runner /etc/nomad.d/job-runner.policy.hcl"
    ~ triggers: [
        ~ [0]: "e29f2fc1" => [unknown]
      ]
Resources:
  +-4 to replace
  18 unchanged
  

@pulumi

pulumi bot commented Mar 26, 2026

🍹 The Update (preview) for holochain/nomad-server/nomad-server (at e2e986b) was successful.

✨ Neo Explanation

The Nomad job runner policy has been updated, triggering a re-upload of the policy file and a re-run of the Nomad service restart sequence to apply the new policy. Expect a brief Nomad service restart during deployment.

Root Cause Analysis

The job runner policy file content has changed (indicated by ~source on copy-job-runner-policy), which has triggered a cascading replacement of all dependent remote commands. The ~triggers diff on the other resources confirms they are chained to fire whenever the policy file or upstream steps change.

Dependency Chain

A change to the job runner policy source file causes copy-job-runner-policy to be replaced (re-uploaded to the remote server). This replacement invalidates the trigger chain downstream:

  • chown-etc-nomad-dir re-runs to ensure correct file ownership after the new policy is copied
  • apply-job-runner-policy re-runs to apply the updated policy to Nomad
  • start-nomad-service re-runs, likely to restart/reload Nomad so it picks up the new policy

Risk analysis

The Nomad service will be restarted as part of this change (start-nomad-service replacement). This may cause a brief interruption to job scheduling or running workloads depending on how the restart is handled. No stateful resources (databases, storage) are being modified or deleted.

Resource Changes

    Name                     Type                         Operation
+-  copy-job-runner-policy   command:remote:CopyToRemote  replaced
+-  chown-etc-nomad-dir      command:remote:Command       replaced
+-  start-nomad-service      command:remote:Command       replaced
+-  apply-job-runner-policy  command:remote:Command       replaced

@cdunster
Contributor Author

Merging this will restart the Nomad service on the server droplet, so let's do it when it's not in use.

@cdunster
Contributor Author

Closing in favour of #8 so that we are not re-applying multiple times.

@cdunster cdunster closed this Mar 30, 2026
@cdunster cdunster deleted the update-runner-policy branch March 30, 2026 09:57