Set up automated deployments for TecHub using Kamal and GitHub Actions (staging + production) #182

@loftwah

Description

Implement a complete, observable deployment workflow for TecHub using Kamal and GitHub Actions.
Goals:

  • Pushes to staging deploy automatically to staging.techub.life.
  • Pushes or merges to main deploy automatically to techub.life.
  • Deploys are locked, traceable, reversible, and observable.
  • Staging is toggleable and isolated (own DB schema and Spaces bucket).
  • PR previews are ephemeral, auto-cleaned, and auditable.
  • Notifications and telemetry are structured via Axiom and Resend.

Required Secrets (GitHub → Settings → Secrets and variables → Actions → Secrets)

| Name | Type | Used By | Notes |
| --- | --- | --- | --- |
| KAMAL_REGISTRY_PASSWORD | string (PAT) | CI + Server | GHCR auth (packages:write,read) |
| RAILS_MASTER_KEY | string | App runtime | Rails credentials decrypt |
| SSH_KEY | multi-line PEM | CI | SSH to droplet |
| AXIOM_TOKEN | string | CI + Local | Axiom ingestion |
| AXIOM_DATASET | string | CI + Local | e.g. techub-deployments |
| AXIOM_BASE_URL | string (optional) | CI + Local | defaults to https://api.axiom.co |
| RESEND_API_KEY | string | CI | Email notifications |
| TO_EMAILS | JSON array | CI | ["dean@techub.life","jarad@techub.life"] |
| HOST_IP | string | Prod only | Droplet IP or hostname |

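All secrets except AXIOM_BASE_URL are mandatory. Workflows must abort if any are missing (the production workflow must also check HOST_IP).
Validation example (bash):

# Use if/fi rather than `[ ... ] && ...`: with a trailing && chain the loop, and
# therefore the step, ends with exit status 1 whenever the last key is set.
for key in KAMAL_REGISTRY_PASSWORD RAILS_MASTER_KEY SSH_KEY AXIOM_TOKEN AXIOM_DATASET RESEND_API_KEY TO_EMAILS; do
  if [ -z "${!key}" ]; then
    echo "::error::Missing $key"
    exit 1
  fi
done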
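
A variant of the check above that reports every missing secret in one pass, instead of stopping at the first, might look like the following sketch. The helper names `missing_secrets` and `check_secrets` are hypothetical, and the script assumes bash (it relies on `${!name}` indirect expansion).

```shell
#!/usr/bin/env bash
# Sketch: list every required variable that is unset or empty.
# `missing_secrets` and `check_secrets` are hypothetical helper names.
missing_secrets() {
  local name missing=()
  for name in "$@"; do
    # ${!name} is bash indirect expansion: the value of the variable named by $name.
    if [ -z "${!name:-}" ]; then
      missing+=("$name")
    fi
  done
  # Print the missing names space-separated; print nothing when all are present.
  if [ "${#missing[@]}" -gt 0 ]; then
    echo "${missing[*]}"
  fi
}

# Emit one Actions error annotation per missing name, then fail.
check_secrets() {
  local missing name
  missing=$(missing_secrets "$@")
  if [ -n "$missing" ]; then
    for name in $missing; do
      echo "::error::Missing $name"
    done
    return 1
  fi
}
```

In a workflow step this could be called as `check_secrets KAMAL_REGISTRY_PASSWORD RAILS_MASTER_KEY SSH_KEY ... || exit 1`, with HOST_IP added to the list for production.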

Required DNS

| Type | Name | Target | Purpose |
| --- | --- | --- | --- |
| A | techub.life | Droplet IP | Production |
| A | staging.techub.life | Droplet IP | Staging |
| A | *.preview.techub.life | Droplet IP | PR previews |

Cloudflare provides DNS only; Kamal handles proxying via kamal-proxy.


Kamal Configs

config/deploy.yml (production)

Existing config; must include:

registry:
  server: ghcr.io
  username: <%= ENV.fetch("REGISTRY_USERNAME", "loftwah") %>
  password:
    - KAMAL_REGISTRY_PASSWORD

config/deploy.staging.yml

service: techub
image: techub-life/techub  # Kamal prepends registry.server (ghcr.io) to this name
servers:
  web:
    hosts: <%= ENV.fetch("WEB_HOSTS", "staging.techub.life").split(",") %>
  job:
    hosts: <%= ENV.fetch("JOB_HOSTS", ENV.fetch("WEB_HOSTS", "staging.techub.life")).split(",") %>
    cmd: bin/jobs start
proxy:
  ssl: true
  host: <%= ENV.fetch("APP_HOST", "staging.techub.life") %>
registry:
  server: ghcr.io
  username: <%= ENV.fetch("REGISTRY_USERNAME", "loftwah") %>
  password:
    - KAMAL_REGISTRY_PASSWORD
env:
  secret:
    - RAILS_MASTER_KEY
  clear:
    RAILS_ENV: staging
    RACK_ENV: staging
    NODE_ENV: production
    SOLID_QUEUE_IN_PUMA: true
builder:
  arch: amd64

Workflows

.github/workflows/deploy-staging.yml

A push to staging triggers build, push, deploy, and a smoke check; a failed smoke check triggers a rollback. Telemetry is sent on every run, and email only on failure.

name: Deploy — Staging
on:
  push:
    branches: [staging]
permissions:
  contents: read
  packages: write

jobs:
  deploy:
    runs-on: ubuntu-latest
    concurrency:
      group: deploy-staging
      # Queue newer runs instead of cancelling a deploy midway, which could leave a stale lock.
      cancel-in-progress: false
    env:
      WEB_HOSTS: staging.techub.life
      APP_HOST: staging.techub.life
      REGISTRY_USERNAME: ${{ github.actor }}
      KAMAL_REGISTRY_PASSWORD: ${{ secrets.KAMAL_REGISTRY_PASSWORD }}
      RAILS_MASTER_KEY: ${{ secrets.RAILS_MASTER_KEY }}
      AXIOM_TOKEN: ${{ secrets.AXIOM_TOKEN }}
      AXIOM_DATASET: ${{ secrets.AXIOM_DATASET }}
      AXIOM_BASE_URL: ${{ secrets.AXIOM_BASE_URL }}
      RESEND_API_KEY: ${{ secrets.RESEND_API_KEY }}
      TO_EMAILS: ${{ secrets.TO_EMAILS }}
      SSH_KEY: ${{ secrets.SSH_KEY }}

    steps:
      - uses: actions/checkout@v5

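      - name: Write SSH key
        run: |
          install -m 600 -D /dev/null ~/.ssh/id_ed25519
          # End the file with a newline; OpenSSH can reject a key that lacks one.
          printf '%s\n' "$SSH_KEY" > ~/.ssh/id_ed25519
          # accept-new trusts a host on first contact but refuses a changed host key.
          printf 'Host *\n  StrictHostKeyChecking accept-new\n' >> ~/.ssh/config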

      - uses: ruby/setup-ruby@v1
        with:
          ruby-version: .ruby-version
          bundler-cache: true

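      # kamal deploy takes and releases the deploy lock itself. Acquiring it in a
      # separate step first would make the deploy fail because the lock is already held.
      - name: Show Kamal lock status
        run: bin/kamal lock status -d staging

      # -d takes a destination name; "staging" loads config/deploy.staging.yml.
      - name: Build & Push
        run: bin/kamal build push -d staging

      - name: Deploy
        run: bin/kamal deploy -d staging --skip-push

      - name: Smoke check
        id: smoke
        run: bin/kamal app exec -d staging -r web --reuse "curl -fsS http://localhost/up"

      # failure() is required: without a status function this step is skipped once smoke fails.
      # kamal rollback needs a version. Kamal tags images with the git SHA, so this assumes
      # the commit before the push was deployed and its image is still on the host.
      - name: Rollback on failure
        if: ${{ failure() && steps.smoke.outcome == 'failure' }}
        run: bin/kamal rollback -d staging "${{ github.event.before }}"

      # Clears a lock left behind when a run is cancelled mid-deploy.
      - name: Release Kamal lock
        if: cancelled()
        run: bin/kamal lock release -d staging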

      - name: Emit Axiom telemetry
        if: always()
        run: |
          msg=$([ "${{ job.status }}" = "success" ] && echo deploy_success || echo deploy_failed)
          run_url="${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}/actions/runs/${GITHUB_RUN_ID}"
          payload=$(jq -n --arg m "$msg" --arg sha "$GITHUB_SHA" --arg env "staging" --arg url "$run_url" \
            '{ts:(now|strftime("%Y-%m-%dT%H:%M:%SZ")),level:"INFO",message:$m,app:"techub",env:$env,sha:$sha,run_url:$url}')
          curl -sfS "${AXIOM_BASE_URL:-https://api.axiom.co}/v1/datasets/${AXIOM_DATASET}/ingest" \
            -H "Authorization: Bearer ${AXIOM_TOKEN}" -H "Content-Type: application/json" -d "[$payload]"

      - name: Email on failure
        if: ${{ failure() && env.RESEND_API_KEY != '' && env.TO_EMAILS != '' }}
        run: |
          # Read the secret from the environment instead of interpolating it into the script text.
          recipients=$(jq -c . <<<"$TO_EMAILS")
          payload=$(jq -n --argjson to "$recipients" \
            --arg subject "[Staging Deploy Failure] ${GITHUB_REPOSITORY}" \
            --arg text "Deployment failed: ${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}/actions/runs/${GITHUB_RUN_ID}" \
            '{from:"ops@techub.life", to:$to, subject:$subject, text:$text}')
          curl -sfS https://api.resend.com/emails \
            -H "Authorization: Bearer $RESEND_API_KEY" -H "Content-Type: application/json" -d "$payload"

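Production workflow is identical except:

  • Trigger: branches: [main]
  • Kamal commands target config/deploy.yml (the default config, no staging destination).
  • WEB_HOSTS: ${{ secrets.HOST_IP }}; APP_HOST: techub.life (proxy.host must be the public domain for SSL certificates, not the droplet IP).
  • env:"production" in Axiom payloads and email subject.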

Rollback

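Smoke failure → automatic kamal rollback.
Manual (kamal rollback requires the version to return to, normally the git SHA of an earlier deploy whose image is still on the host; production reads config/deploy.yml by default, so no -d is needed):

bin/kamal rollback <version>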

Previews

  • Trigger: PR open or updated.
  • Deploy to: pr-<num>.preview.techub.life.
  • Comment: post Markdown table with commit and PR preview URLs.
  • Remove on: PR close or merge (kamal app remove -d pr-<num>, which reads config/deploy.pr-<num>.yml).
  • Cleanup: nightly job or cron removes previews older than TTL (default 7 days).
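
The naming and TTL rules in the list above could be captured in two small helpers, sketched below. The function names and the `PREVIEW_TTL_DAYS` variable are hypothetical; how a preview's creation time is recorded is left open here and is simply passed in as epoch seconds.

```shell
#!/usr/bin/env bash
# Sketch of the preview naming and TTL rules; all names here are hypothetical.

# Hostname for a PR preview: pr-<num>.preview.techub.life
preview_host() {
  printf 'pr-%s.preview.techub.life\n' "$1"
}

# Succeeds (exit 0) when a preview created at $1 (epoch seconds) is older than
# the TTL as of $2 (epoch seconds, defaults to the current time).
preview_expired() {
  local created=$1 now=${2:-$(date +%s)}
  local ttl_days=${PREVIEW_TTL_DAYS:-7}
  [ $(( now - created )) -gt $(( ttl_days * 86400 )) ]
}
```

A nightly cleanup job could loop over known previews and run the removal command only for those where `preview_expired` succeeds.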

Example PR Comment

| Type | URL |
| --- | --- |
| PR Preview | https://pr-123.preview.techub.life |
| Commit Preview | https://commit-abcd123.preview.techub.life |
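
A comment body in this shape could be assembled with a helper along these lines. `preview_comment` is a hypothetical name, and shortening the commit SHA to seven characters is an assumption based on the example URL above.

```shell
#!/usr/bin/env bash
# Sketch: render the PR comment as a Markdown table.
# `preview_comment` is a hypothetical helper; $1 is the PR number, $2 the commit SHA.
preview_comment() {
  local pr=$1 sha=$2
  local short=${sha:0:7}  # assumed: commit previews use the 7-character short SHA
  printf '| Type | URL |\n'
  printf '| --- | --- |\n'
  printf '| PR Preview | https://pr-%s.preview.techub.life |\n' "$pr"
  printf '| Commit Preview | https://commit-%s.preview.techub.life |\n' "$short"
}
```

On a GitHub-hosted runner the result could then be posted with the GitHub CLI, for example `gh pr comment "$PR_NUMBER" --body "$(preview_comment "$PR_NUMBER" "$GITHUB_SHA")"`, where `PR_NUMBER` is a variable the workflow would need to set.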

Staging Toggle

  • On: bin/kamal deploy -d staging
  • Off: bin/kamal app remove -d staging
  • Optional rake tasks:
namespace :ops do
  # -d takes the destination name; Kamal loads config/deploy.staging.yml for "staging".
  # Parentheses are needed: `task :name { ... }` is a syntax error because the braces bind to the symbol.
  task(:staging_on)  { sh "bin/kamal deploy -d staging" }
  task(:staging_off) { sh "bin/kamal app remove -d staging || true" }
end

Observability (Axiom events)

Emit structured events (INFO level JSON):

  • deploy_start, deploy_success, deploy_failed
  • rollback_triggered
  • lock_acquired, lock_released
  • deploy_skipped_locked (if lock prevents run)
  • preview_created, preview_removed
  • staging_on, staging_off

All events include:

{ "ts": "2025-11-07T02:15:00Z", "app": "techub", "env": "staging", "sha": "abc123", "run_url": "" }
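
To keep the event shape identical across workflows and local runs, the common fields could be produced by one helper. The sketch below uses plain bash so it runs without jq; `json_escape` and `build_event` are hypothetical names, and the `level` and `message` fields follow the staging workflow's payload rather than the field list above. Only backslashes and double quotes are escaped, so values are assumed to contain no control characters.

```shell
#!/usr/bin/env bash
# Sketch: build one event object with the common fields.
# `json_escape` and `build_event` are hypothetical helpers.

# Escape backslashes and double quotes for use inside a JSON string.
json_escape() {
  local s=$1
  s=${s//\\/\\\\}
  s=${s//\"/\\\"}
  printf '%s' "$s"
}

# Usage: build_event <message> <env> <sha> [run_url]
build_event() {
  local message=$1 env=$2 sha=$3 run_url=${4:-}
  local ts
  ts=$(date -u +%Y-%m-%dT%H:%M:%SZ)
  printf '{"ts":"%s","level":"INFO","message":"%s","app":"techub","env":"%s","sha":"%s","run_url":"%s"}\n' \
    "$ts" "$(json_escape "$message")" "$(json_escape "$env")" \
    "$(json_escape "$sha")" "$(json_escape "$run_url")"
}
```

The output can be wrapped in a JSON array and sent to the ingest endpoint in the same way as the workflow step does, for example `-d "[$(build_event rollback_triggered staging "$GITHUB_SHA" "$run_url")]"`.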

Guardrails

  • Use Kamal’s built-in lock (bin/kamal lock); never edit the lock on the server by hand unless Kamal itself fails to release it.
  • Always run deploys through Actions or the Kamal CLI (no ad-hoc SSH commands).
  • Abort immediately if required secrets are missing.
  • Never push to any registry other than ghcr.io/techub-life/techub.
  • Keep main branch protected (PR review required).
  • staging branch remains open for testing.

Success Criteria

  1. Push to staging → automatic deploy to staging.techub.life.
  2. Push/merge to main → automatic deploy to techub.life.
  3. Locking prevents overlapping deploys.
  4. Rollback runs automatically on failed smoke check.
  5. Axiom receives structured logs for every lifecycle event.
  6. Resend notifies recipients on any failed deploy.
  7. Staging can be turned off/on manually.
  8. PR previews spin up and self-destruct correctly.
  9. No missing secrets or hard-coded credentials.

Runbook

  • Force unlock (only if Kamal crashed): bin/kamal lock release -d staging
  • Local deploy: bin/kamal deploy -d staging (deploy takes and releases the lock itself; a separate bin/kamal lock acquire beforehand would block it)
  • Rollback manually: bin/kamal rollback <version> (add -d staging for staging)
  • Preview cleanup (if orphaned): run nightly or manually bin/kamal app remove -d pr-<num>
  • Staging off: bin/kamal app remove -d staging
