feat: add dev-test-iroh deployment #16

Merged
ThetaSinner merged 3 commits into main from feat/dev-test-iroh on Mar 26, 2026

Conversation

ThetaSinner (Member) commented Mar 25, 2026

Summary

  • Add standalone Iroh relay bootstrap server deployment (dev-test-bootstrap2-iroh)
  • Use Podman Quadlet for native systemd container management (replacing podman-compose), so the service starts automatically instead of requiring manual setup steps.
  • Provision DNS (A + AAAA records) via Cloudflare Pulumi provider
  • Certbot TLS with retry loop for DNS propagation
  • CI lint step for the new cloud-init file
  • README documentation for Cloudflare config and service management

Summary by CodeRabbit

  • New Features

    • Added a second Iroh bootstrap server deployment with automatic Cloudflare DNS records (A/AAAA) and automated TLS provisioning; service runs in a container managed by systemd.
  • Documentation

    • Added docs for Cloudflare-managed DNS, required Pulumi config keys, permissions, and operational/update procedures.
  • Chores

    • CI lint expanded to validate cloud-init schema and surface errors in the workflow.

Add a standalone Iroh relay bootstrap server for dev testing. Uses Podman
Quadlet for native systemd container management, Cloudflare for DNS
provisioning via Pulumi, and certbot with a retry loop for TLS.

holochain-release-automation2 (Collaborator) commented Mar 25, 2026

🍹 preview on network-services/holochain/network-services

Pulumi report

View in Pulumi Cloud

  Previewing update (holochain/network-services)

View Live: https://app.pulumi.com/holochain/network-services/network-services/previews/301a97f8-4da3-4df6-8fb4-64de8f0b1fae

pulumi:pulumi:Stack: (same)
  [urn=urn:pulumi:network-services::network-services::pulumi:pulumi:Stack::network-services-network-services]
  +-digitalocean:index/droplet:Droplet: (replace)
      [id=560807127]
      [urn=urn:pulumi:network-services::network-services::digitalocean:index/droplet:Droplet::dev-test-bootstrap2-iroh]
      [provider=urn:pulumi:network-services::network-services::pulumi:providers:digitalocean::default_4_40_1::c42f8f30-35f6-405f-ae12-fc5b7ae94c76]
    ~ userData: 
        - 1ff5aee1134cf465b3b3f890941aa1252c6803b2
        + #cloud-config
        + snap:
        +   commands:
        +     0: [install, core]
        +     1: [refresh, core]
        +     2: [install, --classic, certbot]
        + write_files:
        +   - content: |
        +       [Container]
        +       Image=ghcr.io/holochain/kitsune2_bootstrap_srv_iroh_relay:v0.4.0-dev.6
        +       Exec=kitsune2-bootstrap-srv --production --listen [::]:443 --tls-cert /etc/letsencrypt/live/dev-test-bootstrap2-iroh.holochain.org/fullchain.pem...
        +       Environment=RUST_LOG=info
        +       Network=host
        +       Volume=/etc/letsencrypt:/etc/letsencrypt:ro
        +       [Service]
        +       Restart=always
        +       [Install]
        +       WantedBy=multi-user.target default.target
        +     path: /etc/containers/systemd/bootstrap.container
        +     permissions: "0644"
        +   - content: |
        +       #!/bin/bash
        +       set -euo pipefail
        +       # Restart journald so its storage directory matches the machine ID
        +       # that cloud-init may have reinitialized.
        +       systemctl restart systemd-journald
        +       apt-get update -y
        +       apt-get install -y podman
        +       max_attempts=30
        +       delay=20
        +       for attempt in $(seq 1 $max_attempts); do
        +         echo "Certbot attempt $attempt of $max_attempts..."
        +         if certbot certonly --standalone -d dev-test-bootstrap2-iroh.holochain.org --non-interactive --agree-tos -m contact@holochain.org; then
        +           echo "Certificate obtained successfully."
        +           systemctl daemon-reload
        +           systemctl enable --now bootstrap
        +           exit 0
        +         fi
        +         sleep_for=$((delay * attempt))
        +         echo "Certbot failed. Retrying in ${sleep_for}s..."
        +         sleep "$sleep_for"
        +       done
        +       echo "Failed to obtain certificate after $max_attempts attempts."
        +       exit 1
        +     path: /opt/bootstrap_srv/provision-cert.sh
        +     permissions: "0755"
        + runcmd:
        +   - /opt/bootstrap_srv/provision-cert.sh
  ~ cloudflare:index/record:Record: (update)
      [id=d156fd19fac60609cc7c6d13854478a5]
      [urn=urn:pulumi:network-services::network-services::cloudflare:index/record:Record::dev-test-bootstrap2-iroh-A]
      [provider=urn:pulumi:network-services::network-services::pulumi:providers:cloudflare::default_5_49_1::fb94b5ba-bc69-42d8-8f1d-9abd2b71d912]
    ~ content: "134.122.68.98" => [unknown]
  ~ cloudflare:index/record:Record: (update)
      [id=930cdb85cacec7c574215c41f7bddcd3]
      [urn=urn:pulumi:network-services::network-services::cloudflare:index/record:Record::dev-test-bootstrap2-iroh-AAAA]
      [provider=urn:pulumi:network-services::network-services::pulumi:providers:cloudflare::default_5_49_1::fb94b5ba-bc69-42d8-8f1d-9abd2b71d912]
    ~ content: "2a03:b0c0:3:f0:0:2:32f9:4000" => [unknown]
Resources:
  ~ 2 to update
  +-1 to replace
  3 changes. 3 unchanged
  

coderabbitai bot commented Mar 25, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c58765cd-333e-45cc-bf9f-c90ff3039577

📥 Commits

Reviewing files that changed from the base of the PR and between 79ab470 and 6965f17.

📒 Files selected for processing (2)
  • .github/workflows/preview.yaml
  • dev-test-bootstrap2-iroh/cloud-init.yaml
✅ Files skipped from review due to trivial changes (1)
  • .github/workflows/preview.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • dev-test-bootstrap2-iroh/cloud-init.yaml

Walkthrough

Adds Pulumi and cloud-init artifacts to provision a DigitalOcean Droplet running an Iroh bootstrap relay, create Cloudflare DNS A/AAAA records, introduce Cloudflare Pulumi config keys, add a cloud-init schema lint step in CI, and document the new deployment and operations.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Iroh bootstrap config & provisioning: dev-test-bootstrap2-iroh/cloud-init.yaml, main.go | Adds cloud-init that installs Podman/certbot, writes a systemd container unit and a provisioning script that obtains Let's Encrypt certs; refactors configureDevTestBootstrapSrv to read its own YAML and adds configureDevTestBootstrap2Iroh to create a Droplet and Cloudflare A/AAAA records. |
| Pulumi config & deps: Pulumi.network-services.yaml, go.mod | Adds Pulumi config keys dns:cloudflare-zone-id and cloudflare:apiToken, and an indirect go.mod dependency github.com/pulumi/pulumi-cloudflare/sdk/v5@v5.49.1. |
| Documentation: README.md | Adds Cloudflare-managed DNS instructions, required Pulumi config and API permission, and a new "Iroh bootstrap server (dev-test)" section with operational commands and update workflow. |
| CI workflow: .github/workflows/preview.yaml | Adds a lint job step that runs cloud-init schema against dev-test-bootstrap2-iroh/cloud-init.yaml, appends output to the step summary, annotates errors, and fails the job on schema validation errors. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: add dev-test-iroh deployment' is clear and directly relates to the main changeset: adding a new Iroh relay bootstrap server deployment infrastructure. |


ThetaSinner marked this pull request as ready for review March 25, 2026 17:14
ThetaSinner requested review from a team and jost-s March 25, 2026 17:14
mattyg previously approved these changes Mar 25, 2026
jost-s previously approved these changes Mar 25, 2026
Align naming with existing convention where "bootstrap2" refers to the
Kitsune2 bootstrap protocol version.
ThetaSinner dismissed stale reviews from jost-s and mattyg via 79ab470 March 25, 2026 18:11

coderabbitai bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (2)
main.go (1)

84-92: Extract duplicated SSH-key lookup into a helper.

The same GetSshKeys + fingerprint mapping appears in multiple deployment functions; pulling it into one helper will reduce drift and simplify future changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@main.go` around lines 84 - 92, Extract the duplicated DigitalOcean SSH-key
lookup into a helper: create a function (e.g., fetchSshFingerprints or
GetSshFingerprints) that calls digitalocean.GetSshKeys(ctx,
&digitalocean.GetSshKeysArgs{}, nil), handles the error, and returns the
[]string of key.Fingerprint; then replace the repeated blocks that declare
getSshKeysResult, err and build sshFingerprints with calls to this helper,
preserving the context variable name (ctx) and error handling pattern used by
callers.
.github/workflows/preview.yaml (1)

24-31: Consider de-duplicating cloud-init lint steps.

The new block duplicates existing parsing/annotation logic; this is likely to drift again. A small reusable script/composite action with a file argument would reduce maintenance risk.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/preview.yaml around lines 24 - 31, Duplicate cloud-init
lint/parsing logic in the "Lint dev-test-bootstrap2-iroh cloud-init file" step
should be consolidated into a reusable script or composite action that accepts
the cloud-init file path; extract the pipeline of "cloud-init schema -c <file>
2> >(tee stderr.log) >> $GITHUB_STEP_SUMMARY" plus the subsequent grep/sed
annotation (the stderr.log -> grep 'in.*line.*column' ... sed 's/...::error
.../') into a single script or action, commit that script (e.g.,
scripts/lint-cloud-init.sh) and replace the duplicated step body with a call
that passes the file argument (dev-test-bootstrap2-iroh/cloud-init.yaml) so all
uses of cloud-init schema, stderr.log, the grep filter, and the ::error sed
formatter are centralized and reused.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/preview.yaml:
- Line 29: The CI annotation is hardcoding file=cloud-init.yaml in the sed
replacement, preventing clickable diagnostics for other files; update the sed
replacement in the grep/sed pipeline (the command that pipes stderr.log through
grep and sed) to use the captured filename group (\2) for the file= value and in
the title if desired so the annotation becomes file=\2,line=\3,col=\4 with the
message/title using \1 and \2 as needed.

In `@dev-test-bootstrap2-iroh/cloud-init.yaml`:
- Around line 44-46: The bootstrap service is started but not enabled, so it
won't persist across reboots; update the cloud-init commands around the
"systemctl start bootstrap" invocation to enable the service as well (e.g., use
"systemctl enable --now bootstrap" or run "systemctl enable bootstrap" before
starting) so the "bootstrap" service is enabled on boot and will survive host
reboots while keeping the final "exit 0".
- Around line 37-50: The retry window for obtaining the certificate is too short
(max_attempts and delay) causing premature failure; update the certbot retry
loop (the for loop that iterates using max_attempts and delay around the certbot
certonly command) to increase max_attempts (e.g., to allow 30+ minutes) and
implement exponential backoff (multiply delay each attempt and add optional
jitter) so DNS propagation has more time before giving up, and only call
systemctl start bootstrap/exit 0 after a successful certbot run.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d8ba7aff-5b7f-4411-9c83-31fe495a03d5

📥 Commits

Reviewing files that changed from the base of the PR and between 5758074 and 79ab470.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (6)
  • .github/workflows/preview.yaml
  • Pulumi.network-services.yaml
  • README.md
  • dev-test-bootstrap2-iroh/cloud-init.yaml
  • go.mod
  • main.go

if ! cloud-init schema -c dev-test-bootstrap2-iroh/cloud-init.yaml 2> >(tee stderr.log) >> $GITHUB_STEP_SUMMARY
then
# Print errors as such in GitHub logs.
cat stderr.log | grep "in.*line.*column" -B1 -A1 --no-group-separator | sed -z 's/\([^\n]*\)\n\s*in "\([^"]*\)", line \([[:digit:]]\+\), column \([[:digit:]]\+\):\s*/::error file=cloud-init.yaml,title=\1 in \2,line=\3,col=\4::/g'

⚠️ Potential issue | 🟡 Minor

Use the actual file path in CI error annotations.

On Line 29, the annotation still uses file=cloud-init.yaml; this breaks clickable diagnostics for this new target file.

🔧 Proposed fix
-            cat stderr.log | grep "in.*line.*column" -B1 -A1 --no-group-separator | sed -z 's/\([^\n]*\)\n\s*in "\([^"]*\)", line \([[:digit:]]\+\), column \([[:digit:]]\+\):\s*/::error file=cloud-init.yaml,title=\1 in \2,line=\3,col=\4::/g'
+            cat stderr.log | grep "in.*line.*column" -B1 -A1 --no-group-separator | sed -z 's/\([^\n]*\)\n\s*in "\([^"]*\)", line \([[:digit:]]\+\), column \([[:digit:]]\+\):\s*/::error file=dev-test-bootstrap2-iroh\/cloud-init.yaml,title=\1 in \2,line=\3,col=\4::/g'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/preview.yaml at line 29, The CI annotation is hardcoding
file=cloud-init.yaml in the sed replacement, preventing clickable diagnostics
for other files; update the sed replacement in the grep/sed pipeline (the
command that pipes stderr.log through grep and sed) to use the captured filename
group (\2) for the file= value and in the title if desired so the annotation
becomes file=\2,line=\3,col=\4 with the message/title using \1 and \2 as needed.

- Fix CI annotation file path for clickable diagnostics
- Use systemctl enable --now to persist service across reboots
- Increase certbot retry window with linear backoff (30 attempts)
- Restart journald early to fix machine ID mismatch
- Install podman in runcmd to ensure correct ordering
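
For reference, a minimal sketch of the linear backoff described in this commit, assuming the values shown in the Pulumi preview (30 attempts, 20-second base delay). The names `retry_linear`, `total_backoff_seconds`, and the `SLEEP_CMD` hook are hypothetical and exist only so the loop can be dry-run without waiting; the provisioning script in the PR inlines the loop around certbot instead.

```shell
#!/bin/bash
# Hypothetical helper: run "$@" up to max_attempts times, sleeping
# delay * attempt seconds after each failure (linear backoff).
# SLEEP_CMD is an illustrative override for the sleep command.
retry_linear() {
  local max_attempts=$1 delay=$2 attempt
  shift 2
  for attempt in $(seq 1 "$max_attempts"); do
    if "$@"; then
      return 0   # success: the caller can now start the service
    fi
    "${SLEEP_CMD:-sleep}" "$((delay * attempt))"
  done
  return 1       # every attempt failed
}

# Total seconds slept if every attempt fails. Like the script in the PR,
# this counts a sleep after the final failed attempt as well.
total_backoff_seconds() {
  local max_attempts=$1 delay=$2 total=0 attempt
  for attempt in $(seq 1 "$max_attempts"); do
    total=$((total + delay * attempt))
  done
  echo "$total"
}

total_backoff_seconds 30 20   # prints 9300
```

With these values the worst-case wait is 9300 seconds (155 minutes) of sleeping, plus the time certbot itself takes on each attempt.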

cocogitto-bot bot commented Mar 26, 2026

✔️ adf300b...6965f17 - Conventional commits check succeeded.

ThetaSinner merged commit 20a2cd4 into main Mar 26, 2026
5 checks passed
ThetaSinner deleted the feat/dev-test-iroh branch March 26, 2026 12:34

4 participants