
refactor ghx batch publish contract and docs#744

Merged
jolestar merged 4 commits into main from refactor/ghx-batch-public-contract
Feb 27, 2026

Conversation

@jolestar
Collaborator

Summary

  • rename ghx publish multi-action mode from intent to batch (no compatibility alias)
  • switch CLI/API surface from --intent/publish-intent.json to --batch/publish-batch.json
  • document publish-batch.json as a public schema (not internal-only), including required/optional fields and action types
  • clarify mode selection guidance: single action uses direct commands, multi-action uses batch mode
  • add explicit payload-safety guidance: large multiline text must use --body-file/--comments-file
  • update related docs/tests/examples to match batch naming and behavior
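As a rough illustration of the payload-safety bullet, the sketch below shows why multiline text is safer in a file than inline on the command line. The file name `review-body.md` and the commented-out `ghx.sh` invocation shape are assumptions for illustration; only the `--body-file` flag itself comes from this PR.

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir=$(mktemp -d)
body_file="$workdir/review-body.md"   # hypothetical file name

# A quoted heredoc delimiter ('EOF') disables expansion, so backticks,
# dollar signs and quotes land in the file exactly as written.
cat > "$body_file" <<'EOF'
Found an issue in `publish.sh`:
the value of $PR_REF is "unset" here.
EOF

# Inside a double-quoted argument the shell would rewrite the same text
# first: `publish.sh` would run as a command and $PR_REF would be expanded.

line_count=$(grep -c '' "$body_file")
echo "wrote $line_count lines to $body_file"

# Assumed invocation shape (not executed here):
#   ghx.sh pr comment --pr=<pr-ref> --body-file="$body_file"
```

Writing the text to a file first keeps the shell out of the payload entirely, which is presumably the motivation for the guidance.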

Validation

  • ./tests/skill-mode/test_publisher.sh (pass)

Notes

  • intentionally excluded unrelated untracked docs files from this PR

Copilot AI review requested due to automatic review settings February 27, 2026 08:03
@vercel

vercel bot commented Feb 27, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
holon | Ready | Preview, Comment | Feb 27, 2026 2:40pm

@cloudflare-workers-and-pages

cloudflare-workers-and-pages bot commented Feb 27, 2026

Deploying holon with Cloudflare Pages

Latest commit: b6e417d
Status: ✅  Deploy successful!
Preview URL: https://7de897dd.holon-1dl.pages.dev
Branch Preview URL: https://refactor-ghx-batch-public-co.holon-1dl.pages.dev

Contributor

Copilot AI left a comment

Pull request overview

Refactors the GHX multi-action publish flow by renaming the “intent” mode/schema to “batch”, updating the CLI surface and documentation to treat publish-batch.json as the public contract, and clarifying safer patterns for large text payloads.

Changes:

  • Rename ghx.sh intent run --intent=... → ghx.sh batch run --batch=... (and update tests/examples/docs accordingly).
  • Update publish.sh internals from INTENT_FILE/intent terminology to BATCH_FILE/batch terminology.
  • Document publish-batch.json as a public schema and add explicit “text payload safety” guidance (use --body-file / --comments-file).

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

File | Description
--- | ---
tests/skill-mode/test_publisher.sh | Updates skill-mode publisher tests to use batch run and publish-batch.json.
tests/skill-mode/README.md | Updates test documentation references from intent mode to batch mode.
skills/github-review/references/SCRIPTS.md | Updates examples to reference batch mode for multi-action publishing.
skills/ghx/scripts/publish.sh | Renames intent parsing/execution to batch parsing/execution and updates CLI flag to --batch.
skills/ghx/scripts/ghx.sh | Updates top-level CLI routing from intent to batch and forwards --batch flags.
skills/ghx/references/github-publishing.md | Updates public publishing guide: batch schema, mode selection, and payload safety rules.
skills/ghx/SKILL.md | Updates GHX skill contract docs to describe batch mode and publish-batch schema as public.
docs/skills.md | Updates recommended artifact names from publish-intent.json to publish-batch.json.
docs/manifest-format.md | Updates manifest documentation examples to reference publish-batch.json.
Comments suppressed due to low confidence (2)

skills/ghx/scripts/publish.sh:145

  • parse_pr_ref_from_batch() reads .pr_ref from the global $BATCH_FILE instead of the batch file passed into execute_batch(). This makes execute_batch <path> fragile (it can parse the wrong file or fail if $BATCH_FILE is unset) and is inconsistent with validate_batch, which correctly uses its argument. Pass the batch file path into parse_pr_ref_from_batch(batch_file) (or set BATCH_FILE="$batch_file" inside execute_batch) and have the function use that value for the jq read.
parse_pr_ref_from_batch() {
  if [[ -z "$PR_REF" ]]; then
    PR_REF=$(jq -r '.pr_ref' "$BATCH_FILE")
  fi
  if [[ "$PR_REF" == "null" || -z "$PR_REF" ]]; then
    log_error "No PR reference specified and not found in batch file"
    return 1
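
One way to act on this suggestion is sketched below: the function takes the batch file as an argument rather than reading the global `$BATCH_FILE`. This is an illustrative rewrite, not the code that was merged; the `log_error` stand-in, the `// empty` jq fallback, and the `owner/repo#1` value are assumptions made for the demo.

```shell
#!/usr/bin/env bash
set -euo pipefail

PR_REF=""

log_error() { echo "ERROR: $*" >&2; }   # stand-in for the script's logger

# Takes the batch file as an argument instead of reading a global.
parse_pr_ref_from_batch() {
  local batch_file="$1"
  if [[ -z "$PR_REF" ]]; then
    # '// empty' turns a missing or null pr_ref into an empty string.
    PR_REF=$(jq -r '.pr_ref // empty' "$batch_file")
  fi
  if [[ -z "$PR_REF" ]]; then
    log_error "No PR reference specified and not found in batch file"
    return 1
  fi
}

# Demo with a throwaway batch file; the pr_ref value is made up.
demo_batch=$(mktemp)
printf '{"pr_ref": "owner/repo#1", "actions": []}\n' > "$demo_batch"
parse_pr_ref_from_batch "$demo_batch"
echo "PR_REF=$PR_REF"
```

With this shape, execute_batch would call `parse_pr_ref_from_batch "$batch_file"`, so both helpers read the same file.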

skills/ghx/scripts/ghx.sh:62

  • ghx.sh pr ... currently treats --batch=* as a "global" option and forwards it to publish.sh along with the direct subcommand. Because publish.sh prioritizes batch mode when --batch is set, a user who accidentally provides --batch to pr create|update|comment will silently run batch mode and ignore the direct command. Consider removing --batch=* from the pr option passthrough, or make publish.sh error out when both a direct command and --batch are provided.
      case "$arg" in
        --pr=*|--repo=*|--dry-run|--from=*|--batch=*)
          global_opts+=("$arg")
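
The second option mentioned above (erroring out when both modes are requested) could look roughly like the following. The function name `check_mode_conflict` is hypothetical, and the argument handling is a simplified stand-in for whatever publish.sh actually does.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Returns 0 when the arguments are unambiguous, 1 when --batch is mixed
# with a direct command. Command names mirror those named in the review.
check_mode_conflict() {
  local batch_file="" direct_cmd="" arg
  for arg in "$@"; do
    case "$arg" in
      --batch=*) batch_file="${arg#--batch=}" ;;
      create|update|comment) direct_cmd="$arg" ;;
    esac
  done
  if [[ -n "$batch_file" && -n "$direct_cmd" ]]; then
    echo "error: --batch cannot be combined with direct command '$direct_cmd'" >&2
    return 1
  fi
  return 0
}

check_mode_conflict --batch=publish-batch.json && echo "batch only: ok"
check_mode_conflict comment --pr=123 && echo "direct only: ok"
check_mode_conflict comment --batch=publish-batch.json 2>/dev/null \
  || echo "mixed: rejected"
```

Failing loudly avoids the silent mode switch the comment describes, at the cost of rejecting some invocations that previously ran.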


Comment on lines +219 to 235
  execute_batch() {
    local batch_file="$1"
    validate_batch "$batch_file" || return 1
    parse_pr_ref_from_batch || return 1

    local action_count
-   action_count=$(jq '.actions | length' "$intent_file")
+   action_count=$(jq '.actions | length' "$batch_file")
    local results_json='[]'
    local total=0 completed=0 failed=0

    source "${SCRIPT_DIR}/lib/publish.sh"

    for ((i=FROM_INDEX; i<action_count; i++)); do
      local action_type action_params
-     action_type=$(jq -r ".actions[$i].type" "$intent_file")
-     action_params=$(jq ".actions[$i].params // (.actions[$i] | del(.type, .description))" "$intent_file")
+     action_type=$(jq -r ".actions[$i].type" "$batch_file")
+     action_params=$(jq ".actions[$i].params // (.actions[$i] | del(.type, .description))" "$batch_file")
      total=$((total + 1))

Copilot AI Feb 27, 2026

execute_batch passes untrusted publish-batch.json data as action_params into action handlers (e.g. action_post_review) without constraining any path-like fields. In lib/publish.sh these params are used to build body and comments_file paths relative to GITHUB_OUTPUT_DIR but only reject absolute paths, so values containing ../ can escape the output directory and cause arbitrary Markdown/JSON files to be read and their contents posted back to GitHub. To avoid path traversal and potential secret exfiltration, normalize and strictly validate these path parameters (e.g., reject .. segments and enforce that resolved paths stay under GITHUB_OUTPUT_DIR or come from a small allowed filename set) before using them.
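
For context on what ends up in action_params, the jq expression from the diff accepts two action layouts: an explicit `params` object, or fields placed directly on the action. The batch content below is hypothetical (the `post_comment` type name and the `summary.md` value are made up for the demo); only the jq expression is taken from the diff.

```shell
#!/usr/bin/env bash
set -euo pipefail

batch_file=$(mktemp)
# Hypothetical batch content; type names and values are illustrative only.
cat > "$batch_file" <<'EOF'
{
  "pr_ref": "owner/repo#1",
  "actions": [
    {"type": "post_comment", "description": "nested", "params": {"body": "summary.md"}},
    {"type": "post_comment", "description": "flat", "body": "summary.md"}
  ]
}
EOF

# Same expression as in execute_batch: prefer .params, otherwise use the
# action object itself minus its type and description fields.
for i in 0 1; do
  jq -c ".actions[$i].params // (.actions[$i] | del(.type, .description))" "$batch_file"
done
```

Either layout yields the same params object, so any path validation has to happen after this extraction, inside the handlers that interpret `body` and `comments_file`.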

@holonbot
Contributor

holonbot bot commented Feb 27, 2026

Review: PR #744 - Refactor ghx batch publish contract and docs

Summary

This PR refactors the ghx publish system by renaming "intent mode" to "batch mode" and promoting the schema from internal-only to a public contract. It also adds stdin support via --body-file=- and improves documentation around payload safety.

Overall Assessment: APPROVED with minor suggestions

The refactoring is well-executed with comprehensive test coverage. The documentation improvements are substantial and valuable. The renaming makes the terminology clearer for external consumers.

Key Findings

1. Breaking Change Without Migration Path (informational)

Severity: warn

The rename from intent to batch is a breaking change for any external skills or workflows using publish-intent.json or the intent run command. The PR description intentionally excludes a compatibility alias, which is acceptable given this is early in the project's evolution, but be aware that this may require updates to any external consumers.

Recommendation: Consider adding a brief migration note in the commit message or a CHANGES entry if external consumers exist.

2. Stdin State Management in Batch Mode (low priority)

Severity: nit

In skills/ghx/scripts/lib/publish.sh:327-340, the stdin capture mechanism uses global environment variables (GHX_STDIN_BODY, GHX_STDIN_BODY_CAPTURED). In batch mode, if multiple actions in the same batch use --body-file=-, they will all share the same stdin content (which is captured once on first use). This is likely intentional but not documented.

Recommendation: Add a comment clarifying this behavior in the batch schema documentation, or consider whether per-action stdin is needed (which would require a different design).
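
The capture-once behavior described in this finding can be sketched as follows. The two variable names come from the review text; the helper name `read_stdin_body_once` and the surrounding structure are assumptions, not the actual lib/publish.sh code.

```shell
#!/usr/bin/env bash
set -euo pipefail

GHX_STDIN_BODY=""
GHX_STDIN_BODY_CAPTURED=""

# Reads stdin on first use and caches it; later calls reuse the cache
# because stdin has already been drained.
read_stdin_body_once() {
  if [[ -z "$GHX_STDIN_BODY_CAPTURED" ]]; then
    GHX_STDIN_BODY=$(cat)
    GHX_STDIN_BODY_CAPTURED=1
  fi
}

# Simulates two actions in one batch that both ask for stdin.
run_two_actions() {
  read_stdin_body_once; first="$GHX_STDIN_BODY"
  read_stdin_body_once; second="$GHX_STDIN_BODY"
}

run_two_actions <<< "shared body text"
echo "action 1 body: $first"
echo "action 2 body: $second"
```

Both actions see the same text. Without the cache, the second read would find stdin empty, which is likely why a shared capture is the simpler design.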

3. Existing Security Issue (already raised by Copilot)

Severity: error (pre-existing)

The Copilot review correctly identified a path traversal vulnerability in execute_batch() where action parameters are used to construct file paths without sufficient validation. This is a pre-existing issue (not introduced by this PR) but should be addressed before the batch schema is widely used by external skills.

Recommendation: File a follow-up issue to add path normalization and validation to parse_body_param() and similar functions that construct file paths from batch parameters.

Positive Aspects

  1. Excellent test coverage - The new test_body_file_stdin() test is comprehensive, validating both successful execution and captured content.

  2. Clear documentation - The batch schema is now well-documented with required/optional fields, action types, and third-party integration guidance.

  3. Safety guidance - The payload safety section (skills/ghx/SKILL.md:126-151) provides valuable guidance to prevent common shell escaping issues.

  4. Consistent renaming - The intent → batch rename is applied consistently across all files, including tests and documentation.

  5. Backward compatibility marker - The marker default change in action_post_comment() (skills/ghx/scripts/lib/publish.sh:349-353) from jq -r '.marker // "holon-publish-marker"' to checking for empty is an improvement.
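
One behavioral difference between the two marker approaches is how an empty string is treated. The sketch below contrasts them under the assumption that the new code falls back to the default when the marker is empty; the function names are hypothetical and the actual change in action_post_comment() may differ in detail.

```shell
#!/usr/bin/env bash
set -euo pipefail

default_marker="holon-publish-marker"

# jq's // operator falls back only on null or false, so "" passes through.
with_jq_default() {
  jq -r --arg d "$default_marker" '.marker // $d' <<< "$1"
}

# Explicit check: missing, null, and empty string are all treated alike.
with_empty_check() {
  local marker
  marker=$(jq -r '.marker // empty' <<< "$1")
  if [[ -z "$marker" ]]; then
    marker="$default_marker"
  fi
  printf '%s\n' "$marker"
}

for params in '{}' '{"marker": ""}' '{"marker": "custom"}'; do
  printf '%-22s jq-default=[%s] empty-check=[%s]\n' \
    "$params" "$(with_jq_default "$params")" "$(with_empty_check "$params")"
done
```

The two variants only disagree on `{"marker": ""}`, where the jq one-liner returns an empty marker and the explicit check returns the default.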

Testing Status

  • Unit tests: ✅ Passing (Test Unit, Test Agent)
  • Integration tests: 🔄 In progress (Test Integration)
  • CodeQL: ✅ No new alerts
  • Manual validation: ✅ Per PR description

Recommendations for Merge

  1. ✅ Merge is acceptable as-is
  2. Consider creating a follow-up issue for the path traversal hardening
  3. Consider documenting the stdin behavior in batch mode more explicitly

Files Changed

  • skills/ghx/scripts/publish.sh: Core batch mode implementation
  • skills/ghx/scripts/lib/publish.sh: Stdin support and marker handling
  • skills/ghx/scripts/ghx.sh: CLI routing updates
  • skills/ghx/SKILL.md: Public contract documentation
  • skills/ghx/references/github-publishing.md: Publishing guide updates
  • tests/skill-mode/test_publisher.sh: Comprehensive test additions
  • Documentation updates for consistency

Reviewed commit: bb77744795a3d6f9aa09db03cbf12f3d3763d62a
Context coverage: Complete (all core artifacts available)

Contributor

@holonbot holonbot bot left a comment

Review: #744

Summary

This PR successfully refactors the ghx publish multi-action mode from "intent" to "batch", with clear naming, documentation, and test updates. The changes are well-structured, properly tested, and address the stated goals effectively.

Overall Assessment: APPROVE with minor suggestions

Key Strengths

  1. Clear Naming: The "batch" terminology is more intuitive and better describes the multi-action behavior than "intent"
  2. Comprehensive Testing: Added new test coverage including:
    • stdin support via --body-file -
    • batch/direct mode conflict detection
    • Proper validation of batch file parsing
  3. Public Contract Documentation: publish-batch.json is now properly documented as a public schema with clear field specifications
  4. Safety Guidance: Added explicit guidance about using --body-file for large text payloads to avoid shell escaping issues
  5. Good Validation: Added check to prevent combining --batch with direct commands (lines 519-523 in publish.sh)

Security Consideration

⚠️ Path Traversal Risk in Batch Execution (Already Reported)

The existing GitHub Copilot reviewer correctly identified a path traversal vulnerability at skills/ghx/scripts/publish.sh:219-236:

Issue: execute_batch passes untrusted publish-batch.json data as action_params into action handlers. These params are used to build body and comments_file paths relative to GITHUB_OUTPUT_DIR, but the validation only rejects absolute paths. Values containing ../ can escape the output directory and cause arbitrary Markdown/JSON files to be read and their contents posted to GitHub.

Recommendation: Normalize and strictly validate path parameters before use:

  • Reject .. segments in path validation
  • Enforce that resolved paths stay under GITHUB_OUTPUT_DIR
  • Consider allowing only a small set of known safe filenames

This should be addressed before merge or documented as a known limitation if the threat model is acceptable.

Findings

Low Priority Improvements

  1. CLI Option Passthrough Consistency (ghx.sh:61)

    • The --batch=* option was correctly removed from pr subcommand passthrough in commit 596bb35
    • This prevents accidental batch mode invocation when users meant direct commands
    • Good fix that matches the new validation in publish.sh
  2. Marker Default Value Change (lib/publish.sh:349-353)

    • Changed from using jq -r '.marker // "holon-publish-marker"' to explicit empty check
    • This is more explicit and correct, but could have been kept as the original one-liner
    • Not a functional issue, just a style difference
  3. Log Output Redirection (lib/publish.sh:300, 306, 314, 318)

    • Added >&2 to several log_info/log_warn calls
    • This is correct for ensuring logs go to stderr
    • Inconsistent application (some calls still missing redirection)
  4. Documentation Completeness

    • SKILL.md and github-publishing.md now have excellent documentation of the batch schema
    • Good examples of stdin usage and mode selection guidance
    • Clear explanation of when to use batch vs direct mode

Validation

✅ All CI checks passing (CodeQL, Test Unit, Test Run Safety, skill-mode tests)
✅ Test coverage expanded appropriately for new functionality
✅ Documentation updated consistently across all referenced files
✅ Breaking change handled cleanly (no compatibility alias needed as noted in PR description)

Recommendations

Before Merge

  1. Address the path traversal vulnerability identified in the Copilot review or document the threat model decision

After Merge (Optional)

  1. Consider adding integration tests for the batch mode with actual GitHub API mocks
  2. Consider documenting the security model for publish-batch.json path handling explicitly

Conclusion

This is a well-executed refactoring that improves clarity and documentation of the ghx batch publish mechanism. The code quality is good, tests are comprehensive, and the public contract is now properly documented.

The primary concern is the path traversal vulnerability in batch execution, which should be addressed or explicitly accepted as a design decision within the documented threat model.

Recommendation: APPROVE after addressing the security concern

- action_params=$(jq ".actions[$i].params // (.actions[$i] | del(.type, .description))" "$intent_file")
+ action_type=$(jq -r ".actions[$i].type" "$batch_file")
+ action_params=$(jq ".actions[$i].params // (.actions[$i] | del(.type, .description))" "$batch_file")
  total=$((total + 1))
Contributor

Path traversal vulnerability: execute_batch passes untrusted batch params to action handlers without sufficient validation. The action_params are used to construct body/comments_file paths, but validation only rejects absolute paths. Values containing ../ can escape GITHUB_OUTPUT_DIR and read arbitrary files. Normalize paths, reject .. segments, and enforce that resolved paths stay under GITHUB_OUTPUT_DIR before using them in action handlers.

Suggestion: In parse_body_param() and similar path-parsing functions, add strict validation: 1) resolve paths to absolute using realpath, 2) check that the resolved path starts with GITHUB_OUTPUT_DIR, 3) reject paths containing .. components. Alternatively, restrict inputs to a whitelist of safe filenames.

@jolestar
Collaborator Author

Thanks, this is valid and I fixed it in b6e417d.

What changed:

  • Added strict path validation helper in skills/ghx/scripts/lib/publish.sh:
    • reject absolute paths
    • reject .. traversal segments
    • for existing paths, enforce canonical resolved path stays under GITHUB_OUTPUT_DIR
  • Switched both body file resolution and comments_file resolution to use that helper.
  • Added regression test test_batch_path_traversal_rejected in tests/skill-mode/test_publisher.sh to assert ../outside/secret.md is rejected in batch mode.
  • Updated publishing docs to include path-safety requirements for file-like params.

Validation:

  • ./tests/skill-mode/test_publisher.sh passed (16/16).

Contributor

@holonbot holonbot bot left a comment

Review: #744

Summary

This PR successfully refactors the ghx publish multi-action mode from "intent" to "batch" terminology, with improved documentation, comprehensive testing, and properly addresses the path traversal security concern. The changes are well-structured, properly tested, and represent a quality improvement to the codebase.

Overall Assessment: ✅ APPROVE


Key Strengths

  1. Clear Naming: The "batch" terminology is more intuitive and better describes the multi-action behavior than "intent"

  2. Security Fix: The path traversal vulnerability identified in previous reviews has been properly addressed through the new resolve_output_path_for_read() function in lib/publish.sh:

    • Rejects absolute paths
    • Validates against .. traversal using regex: (^|/)\.\.(/|$)
    • Verifies canonical paths stay under GITHUB_OUTPUT_DIR
    • Proper test coverage: test_batch_path_traversal_rejected()
  3. Public Contract Documentation: publish-batch.json is now properly documented as a public schema with:

    • Clear field specifications (required vs optional)
    • Supported action types enumerated
    • Path safety rules explicitly documented
    • Third-party integration guidance
  4. Safety Guidance: Added explicit guidance about using --body-file for large text payloads to avoid shell escaping issues, including stdin support (--body-file -)

  5. Comprehensive Testing: Added excellent test coverage:

    • Stdin support via --body-file -
    • Batch/direct mode conflict detection
    • Path traversal rejection
    • Validation of batch file parsing
  6. CLI Consistency: Removed --batch from pr subcommand passthrough (commit 596bb35), preventing accidental batch mode invocation


Security Analysis

✅ Path Traversal - FIXED

The existing review threads correctly identified a path traversal vulnerability. This PR addresses it properly:

New protection in lib/publish.sh:310-356:

resolve_output_path_for_read() {
    # Rejects absolute paths
    if [[ "$rel_path" =~ ^/ ]]; then
        return 1
    fi

    # Rejects .. traversal
    if [[ "$rel_path" =~ (^|/)\.\.(/|$) ]]; then
        return 1
    fi

    # Verifies canonical path stays under GITHUB_OUTPUT_DIR
    # For existing paths, resolves with pwd -P and validates
}

This function is now used in:

  • parse_body_param() (line 426)
  • action_post_review() (line 452)

Test coverage: test_batch_path_traversal_rejected() validates the protection works.
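
To make the three checks concrete, here is a minimal self-contained sketch of the same idea. It is not the merged resolve_output_path_for_read(); the function name `resolve_under_dir`, its two-argument signature, and the demo directory layout are assumptions, while the `..` regex and the use of `pwd -P` follow the summary above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Prints the joined path on success; returns 1 if the path is unsafe.
resolve_under_dir() {
  local base_dir="$1" rel_path="$2"

  # Reject absolute paths.
  if [[ "$rel_path" == /* ]]; then
    return 1
  fi

  # Reject any '..' path segment (but not names like 'notes..md').
  local dotdot_re='(^|/)\.\.(/|$)'
  if [[ "$rel_path" =~ $dotdot_re ]]; then
    return 1
  fi

  local candidate="$base_dir/$rel_path"
  if [[ -e "$candidate" ]]; then
    # Canonicalize the parent directory (resolving directory symlinks)
    # and confirm it is still inside the base directory.
    local base_real parent_real
    base_real=$(cd "$base_dir" && pwd -P)
    parent_real=$(cd "$(dirname "$candidate")" && pwd -P)
    case "$parent_real/" in
      "$base_real"/*) ;;
      *) return 1 ;;
    esac
  fi
  printf '%s\n' "$candidate"
}

# Demo against a throwaway directory standing in for GITHUB_OUTPUT_DIR.
out_dir=$(mktemp -d)
mkdir -p "$out_dir/sub"
echo "ok" > "$out_dir/sub/summary.md"

resolve_under_dir "$out_dir" "sub/summary.md" >/dev/null && echo "accepted: sub/summary.md"
resolve_under_dir "$out_dir" "../outside/secret.md" || echo "rejected: ../outside/secret.md"
resolve_under_dir "$out_dir" "/etc/passwd" || echo "rejected: /etc/passwd"
```

Note that this sketch only canonicalizes the parent directory, so a file that is itself a symlink pointing outside the base directory would still pass; a stricter version would resolve the final component as well.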


Code Quality Observations

Positive Changes

  1. Marker handling (lib/publish.sh:434-438):

    • Changed from jq default to explicit empty check
    • More readable and explicit
  2. Stderr redirection (lib/publish.sh:300-385):

    • Added >&2 to log_info/log_warn calls
    • Ensures logs go to stderr (correct)
  3. Batch/direct mode validation (publish.sh:518-522):

    • Added check to prevent combining --batch with direct commands
    • Good defensive programming

Minor Style Notes

These are not issues, just observations:

  1. Log redirection inconsistency: Some log calls have >&2, others don't. This is minor and doesn't affect functionality.

  2. Marker default logic: Could have kept the jq one-liner, but the explicit check is more readable.
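
On the log redirection note, the reason `>&2` matters is that helper functions often return values through stdout, and any log line written there gets captured along with the value. A minimal sketch, with hypothetical function names standing in for the script's own log_info and path helpers:

```shell
#!/usr/bin/env bash
set -euo pipefail

log_info_stdout() { echo "[INFO] $*"; }        # log leaks into stdout
log_info_stderr() { echo "[INFO] $*" >&2; }    # log kept out of stdout

# Both helpers "return" a file name by printing it.
resolve_noisy() { log_info_stdout "resolving body"; echo "body.md"; }
resolve_clean() { log_info_stderr "resolving body"; echo "body.md"; }

noisy=$(resolve_noisy)
clean=$(resolve_clean 2>/dev/null)

printf 'noisy capture: %q\n' "$noisy"
printf 'clean capture: %q\n' "$clean"
```

The noisy capture contains the log line plus the file name, while the clean capture contains only the file name. Calls that still lack the redirection are harmless only as long as their output is never captured.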


Validation

CI Status: All critical checks passing

  • CodeQL: success (no new alerts)
  • Test Unit: success
  • Test Run Safety: success
  • skill-mode tests: passing per PR description

Documentation: Consistently updated across all referenced files

Breaking Change: Properly handled without compatibility alias (as noted in PR description)


Recommendations

Before Merge - None

The path traversal vulnerability has been properly addressed, tests are comprehensive, and documentation is excellent.

Future Enhancements (Optional)

  1. Consider adding integration tests with GitHub API mocks for batch mode
  2. Consider documenting the security model for path handling explicitly in SKILL.md

Conclusion

This is a well-executed refactoring that:

  • Improves clarity with better terminology
  • Fixes the security vulnerability identified in previous reviews
  • Adds comprehensive test coverage
  • Documents the public contract properly
  • Provides excellent safety guidance for text payloads

The code quality is high, tests are thorough, and the public contract is now properly documented. The primary security concern has been addressed through proper path validation.

Recommendation: ✅ APPROVE

This PR is ready to merge.

@jolestar jolestar merged commit 1aa8631 into main Feb 27, 2026
16 checks passed
@jolestar jolestar deleted the refactor/ghx-batch-public-contract branch February 27, 2026 15:16