Skip to content

Improvements and missing policy#8

Merged
cdunster merged 3 commits intomainfrom
improvements-and-missing-policy
Mar 30, 2026
Merged

Improvements and missing policy#8
cdunster merged 3 commits intomainfrom
improvements-and-missing-policy

Conversation

@cdunster
Copy link
Copy Markdown
Contributor

@cdunster cdunster commented Mar 30, 2026

Takes over from #7, so that we are not re-applying multiple times.

This PR:

  • Adds the alloc-lifecycle capability to the GitHub runner token policy which is required to allow the GitHub CI to stop/kill running Nomad allocations.
  • Sets the server's advertised address to the URL instead of it's public IP, this means that if the IP changes then the clients should still be able to find the new server.
  • Fixed a potential issue where updating the new Nomad Variables for the Unyt scenario would not trigger the command to run.

Summary by CodeRabbit

  • Chores
    • Expanded job management permissions to include allocation lifecycle control capabilities.
    • Updated Nomad server endpoint configuration to use a fixed hostname instead of dynamic IP discovery, improving service discoverability.
    • Adjusted infrastructure configuration for service integration.

This means that if the server's IP address changes then the clients
should still be able to find the new IP address as long as the DNS
record is up to date.
Now updating any of the Nomad variable resources correctly triggers the
command to rerun.
@cdunster cdunster self-assigned this Mar 30, 2026
@cocogitto-bot
Copy link
Copy Markdown

cocogitto-bot bot commented Mar 30, 2026

✔️ e2e986b...4e0b717 - Conventional commits check succeeded.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 30, 2026

Walkthrough

Updates Nomad ACL policy to add allocation lifecycle management capability, modifies policy application command description in infrastructure-as-code, and replaces dynamic IP templates with a fixed server hostname in Nomad configuration.

Changes

Cohort / File(s) Summary
Nomad ACL Policy Updates
job-runner.policy.hcl, main.go
Added alloc-lifecycle capability to namespace policy; updated policy description to reference "cancelling allocations"; expanded Pulumi command triggers to include Durable Objects configuration.
Nomad Server Configuration
nomad.hcl
Replaced templated public IP placeholders with fixed hostname nomad-server-01.holochain.org in advertise configuration for both HTTP and RPC endpoints.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Suggested reviewers

  • jost-s
  • ThetaSinner
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title is vague and overly generic, using non-descriptive terms ('Improvements', 'missing policy') that don't convey specific meaningful information about the actual changes. Revise the title to be more specific and descriptive, such as 'Add alloc-lifecycle capability to job-runner policy' or 'Update Nomad configuration for DNS and policy improvements' to clearly reflect the main changes.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch improvements-and-missing-policy

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@holochain-release-automation2
Copy link
Copy Markdown
Collaborator

🍹 preview on nomad-server/holochain/nomad-server

Pulumi report

View in Pulumi Cloud

  Previewing update (holochain/nomad-server)

View Live: https://app.pulumi.com/holochain/nomad-server/nomad-server/previews/f0609553-6546-4816-bbdc-2b8dace3d58c

pulumi:pulumi:Stack: (same)
  [urn=urn:pulumi:nomad-server::nomad-server::pulumi:pulumi:Stack::nomad-server-nomad-server]
  +-command:remote:CopyToRemote: (replace)
      [id=09ccb1a7]
      [urn=urn:pulumi:nomad-server::nomad-server::command:remote:CopyToRemote::copy-nomad-config]
      [provider=urn:pulumi:nomad-server::nomad-server::pulumi:providers:command::default_1_0_2::3d5f4907-5ccd-4d31-8098-9549b40692ec]
    - source  : asset(file:b2202f8) { ./nomad.hcl }
    + source  : asset(file:e78658d) { ./nomad.hcl }
    ~ triggers: [
        - [0]: asset(file:b2202f8) { ./nomad.hcl }
        + [0]: asset(file:e78658d) { ./nomad.hcl }
      ]
  +-command:remote:CopyToRemote: (replace)
      [id=e29f2fc1]
      [urn=urn:pulumi:nomad-server::nomad-server::command:remote:CopyToRemote::copy-job-runner-policy]
      [provider=urn:pulumi:nomad-server::nomad-server::pulumi:providers:command::default_1_0_2::3d5f4907-5ccd-4d31-8098-9549b40692ec]
    - source  : asset(file:1289f09) { ./job-runner.policy.hcl }
    + source  : asset(file:8187cd7) { ./job-runner.policy.hcl }
    ~ triggers: [
        - [0]: asset(file:1289f09) { ./job-runner.policy.hcl }
        + [0]: asset(file:8187cd7) { ./job-runner.policy.hcl }
      ]
  +-command:remote:Command: (replace)
      [id=chown-etc-nomad-dir-before-server-certbea9d1c1]
      [urn=urn:pulumi:nomad-server::nomad-server::command:remote:Command::chown-etc-nomad-dir-before-server-cert]
      [provider=urn:pulumi:nomad-server::nomad-server::pulumi:providers:command::default_1_0_2::3d5f4907-5ccd-4d31-8098-9549b40692ec]
    ~ triggers: [
        ~ [2]: "09ccb1a7" => [unknown]
      ]
  +-command:remote:Command: (replace)
      [id=chown-etc-nomad-dir8917c53d]
      [urn=urn:pulumi:nomad-server::nomad-server::command:remote:Command::chown-etc-nomad-dir]
      [provider=urn:pulumi:nomad-server::nomad-server::pulumi:providers:command::default_1_0_2::3d5f4907-5ccd-4d31-8098-9549b40692ec]
    ~ triggers: [
        ~ [1]: "09ccb1a7" => [unknown]
        ~ [2]: "e29f2fc1" => [unknown]
      ]
  +-command:remote:Command: (replace)
      [id=start-nomad-service22cf16b5]
      [urn=urn:pulumi:nomad-server::nomad-server::command:remote:Command::start-nomad-service]
      [provider=urn:pulumi:nomad-server::nomad-server::pulumi:providers:command::default_1_0_2::3d5f4907-5ccd-4d31-8098-9549b40692ec]
    ~ triggers: [
        ~ [0]: "09ccb1a7" => [unknown]
        ~ [2]: "e29f2fc1" => [unknown]
      ]
  +-command:remote:Command: (replace)
      [id=apply-job-runner-policyff4dcb70]
      [urn=urn:pulumi:nomad-server::nomad-server::command:remote:Command::apply-job-runner-policy]
      [provider=urn:pulumi:nomad-server::nomad-server::pulumi:providers:command::default_1_0_2::3d5f4907-5ccd-4d31-8098-9549b40692ec]
    ~ create  : "nomad acl policy apply -address=https://localhost:4646 -ca-cert=/etc/nomad.d/nomad-agent-ca.pem -token=\"$LC_ACL_TOKEN\" -description=\"For running jobs and reading Node status in CI workflows\" job-runner /etc/nomad.d/job-runner.policy.hcl" => "nomad acl policy apply -address=https://localhost:4646 -ca-cert=/etc/nomad.d/nomad-agent-ca.pem -token=\"$LC_ACL_TOKEN\" -description=\"For running jobs, reading Node status, and cancelling allocations in CI workflows\" job-runner /etc/nomad.d/job-runner.policy.hcl"
    ~ triggers: [
        ~ [0]: "e29f2fc1" => [unknown]
      ]
  +-command:remote:Command: (replace)
      [id=add-nomad-jobs-vars11f78a9d]
      [urn=urn:pulumi:nomad-server::nomad-server::command:remote:Command::add-nomad-jobs-vars]
      [provider=urn:pulumi:nomad-server::nomad-server::pulumi:providers:command::default_1_0_2::3d5f4907-5ccd-4d31-8098-9549b40692ec]
    ~ triggers: [
        ~ [1]: "560913518" => "https://wind-tunnel-durable-objects.holochain.org/"
        + [2]: [secret]
        + [3]: "560913518"
      ]
Resources:
  +-7 to replace
  15 unchanged
  

@pulumi
Copy link
Copy Markdown

pulumi bot commented Mar 30, 2026

🍹 The Update (preview) for holochain/nomad-server/nomad-server (at 4e0b717) was successful.

✨ Neo Explanation

Updated Nomad configuration and/or job runner policy files are being pushed to the server, triggering a full re-run of the setup commands including a Nomad service restart — expect a brief Nomad downtime during the apply.

Root Cause Analysis

The Nomad configuration files and/or job runner policy files have been modified locally. Both copy-nomad-config and copy-job-runner-policy show a ~source diff, meaning the file contents being pushed to the remote server have changed. Additionally, all remote commands have updated triggers, which are tied to these file contents — so when the files change, all dependent commands are re-triggered.

Dependency Chain

  1. copy-nomad-config — Updated Nomad config file is copied to the server
  2. copy-job-runner-policy — Updated policy file is copied to the server
  3. All downstream commands re-run due to trigger changes:
    • chown-etc-nomad-dir and chown-etc-nomad-dir-before-server-cert — Fix ownership of the Nomad config directory
    • add-nomad-jobs-vars — Applies job variable configuration
    • apply-job-runner-policy — Applies the updated ACL/policy for the job runner
    • start-nomad-service — Restarts the Nomad service to pick up the new configuration

Each resource is doing a delete-and-replace (not an in-place update), meaning Pulumi will tear down the old remote command records and re-execute them fresh.

Risk Analysis

The Nomad service will be restarted as part of this update (start-nomad-service replace), which means a brief disruption to Nomad is expected. No stateful resources (databases, storage, VMs) are being replaced or deleted. The server itself is unchanged (15 resources untouched). The risk is limited to a short Nomad service restart window.

Resource Changes

    Name                                    Type                         Operation
+-  copy-job-runner-policy                  command:remote:CopyToRemote  replaced
+-  start-nomad-service                     command:remote:Command       replaced
+-  add-nomad-jobs-vars                     command:remote:Command       replaced
+-  chown-etc-nomad-dir-before-server-cert  command:remote:Command       replaced
+-  apply-job-runner-policy                 command:remote:Command       replaced
+-  copy-nomad-config                       command:remote:CopyToRemote  replaced
+-  chown-etc-nomad-dir                     command:remote:Command       replaced

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
nomad.hcl (1)

8-9: Consider centralizing the advertised hostname to avoid drift.

This FQDN is now repeated in multiple places (nomad.hcl Line 8-9, nomad-client.hcl Line 7-10, and certificate generation in main.go Line 170). Consider sourcing it from one config value to prevent future mismatch.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nomad.hcl` around lines 8 - 9, Centralize the advertised hostname by
introducing a single configuration value (e.g., advertised_host) and replace the
duplicated literals used for the http and rpc entries in nomad.hcl and the
client entries in nomad-client.hcl with references to that config value; update
main.go’s certificate generation site (the block that builds the certificate
SANs / common name around Line ~170) to read the same advertised_host instead of
hardcoding "nomad-server-01.holochain.org", and ensure the new config is loaded
(env var or shared config loader) so all three places (nomad.hcl http/rpc,
nomad-client.hcl entries, and main.go cert generation) derive the hostname from
the single source.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@nomad.hcl`:
- Around line 8-9: Centralize the advertised hostname by introducing a single
configuration value (e.g., advertised_host) and replace the duplicated literals
used for the http and rpc entries in nomad.hcl and the client entries in
nomad-client.hcl with references to that config value; update main.go’s
certificate generation site (the block that builds the certificate SANs / common
name around Line ~170) to read the same advertised_host instead of hardcoding
"nomad-server-01.holochain.org", and ensure the new config is loaded (env var or
shared config loader) so all three places (nomad.hcl http/rpc, nomad-client.hcl
entries, and main.go cert generation) derive the hostname from the single
source.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: df6513f7-9f7f-4a7c-9f52-d5b2c56508f0

📥 Commits

Reviewing files that changed from the base of the PR and between 2b68cb9 and 4e0b717.

📒 Files selected for processing (3)
  • job-runner.policy.hcl
  • main.go
  • nomad.hcl

@cdunster cdunster requested a review from a team March 30, 2026 10:01
@cdunster cdunster merged commit 2b54d0f into main Mar 30, 2026
6 checks passed
@cdunster cdunster deleted the improvements-and-missing-policy branch March 30, 2026 16:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants