refactor ghx batch publish contract and docs#744
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
Deploying holon with
|
| Latest commit: |
b6e417d
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://7de897dd.holon-1dl.pages.dev |
| Branch Preview URL: | https://refactor-ghx-batch-public-co.holon-1dl.pages.dev |
There was a problem hiding this comment.
Pull request overview
Refactors the GHX multi-action publish flow by renaming the “intent” mode/schema to “batch”, updating the CLI surface and documentation to treat publish-batch.json as the public contract, and clarifying safer patterns for large text payloads.
Changes:
- Rename
ghx.sh intent run --intent=...→ghx.sh batch run --batch=...(and update tests/examples/docs accordingly). - Update
publish.shinternals fromINTENT_FILE/intent terminology toBATCH_FILE/batch terminology. - Document
publish-batch.jsonas a public schema and add explicit “text payload safety” guidance (use--body-file/--comments-file).
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| tests/skill-mode/test_publisher.sh | Updates skill-mode publisher tests to use batch run and publish-batch.json. |
| tests/skill-mode/README.md | Updates test documentation references from intent mode to batch mode. |
| skills/github-review/references/SCRIPTS.md | Updates examples to reference batch mode for multi-action publishing. |
| skills/ghx/scripts/publish.sh | Renames intent parsing/execution to batch parsing/execution and updates CLI flag to --batch. |
| skills/ghx/scripts/ghx.sh | Updates top-level CLI routing from intent to batch and forwards --batch flags. |
| skills/ghx/references/github-publishing.md | Updates public publishing guide: batch schema, mode selection, and payload safety rules. |
| skills/ghx/SKILL.md | Updates GHX skill contract docs to describe batch mode and publish-batch schema as public. |
| docs/skills.md | Updates recommended artifact names from publish-intent.json to publish-batch.json. |
| docs/manifest-format.md | Updates manifest documentation examples to reference publish-batch.json. |
Comments suppressed due to low confidence (2)
skills/ghx/scripts/publish.sh:145
parse_pr_ref_from_batch()reads.pr_reffrom the global$BATCH_FILEinstead of the batch file passed intoexecute_batch(). This makesexecute_batch <path>fragile (it can parse the wrong file or fail if$BATCH_FILEis unset) and is inconsistent withvalidate_batch, which correctly uses its argument. Pass the batch file path intoparse_pr_ref_from_batch(batch_file)(or setBATCH_FILE="$batch_file"insideexecute_batch) and have the function use that value for thejqread.
parse_pr_ref_from_batch() {
if [[ -z "$PR_REF" ]]; then
PR_REF=$(jq -r '.pr_ref' "$BATCH_FILE")
fi
if [[ "$PR_REF" == "null" || -z "$PR_REF" ]]; then
log_error "No PR reference specified and not found in batch file"
return 1
skills/ghx/scripts/ghx.sh:62
ghx.sh pr ...currently treats--batch=*as a "global" option and forwards it topublish.shalong with the direct subcommand. Becausepublish.shprioritizes batch mode when--batchis set, a user who accidentally provides--batchtopr create|update|commentwill silently run batch mode and ignore the direct command. Consider removing--batch=*from theproption passthrough, or makepublish.sherror out when both a direct command and--batchare provided.
case "$arg" in
--pr=*|--repo=*|--dry-run|--from=*|--batch=*)
global_opts+=("$arg")
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| execute_batch() { | ||
| local batch_file="$1" | ||
| validate_batch "$batch_file" || return 1 | ||
| parse_pr_ref_from_batch || return 1 | ||
|
|
||
| local action_count | ||
| action_count=$(jq '.actions | length' "$intent_file") | ||
| action_count=$(jq '.actions | length' "$batch_file") | ||
| local results_json='[]' | ||
| local total=0 completed=0 failed=0 | ||
|
|
||
| source "${SCRIPT_DIR}/lib/publish.sh" | ||
|
|
||
| for ((i=FROM_INDEX; i<action_count; i++)); do | ||
| local action_type action_params | ||
| action_type=$(jq -r ".actions[$i].type" "$intent_file") | ||
| action_params=$(jq ".actions[$i].params // (.actions[$i] | del(.type, .description))" "$intent_file") | ||
| action_type=$(jq -r ".actions[$i].type" "$batch_file") | ||
| action_params=$(jq ".actions[$i].params // (.actions[$i] | del(.type, .description))" "$batch_file") | ||
| total=$((total + 1)) |
There was a problem hiding this comment.
execute_batch passes untrusted publish-batch.json data as action_params into action handlers (e.g. action_post_review) without constraining any path-like fields. In lib/publish.sh these params are used to build body and comments_file paths relative to GITHUB_OUTPUT_DIR but only reject absolute paths, so values containing ../ can escape the output directory and cause arbitrary Markdown/JSON files to be read and their contents posted back to GitHub. To avoid path traversal and potential secret exfiltration, normalize and strictly validate these path parameters (e.g., reject .. segments and enforce that resolved paths stay under GITHUB_OUTPUT_DIR or come from a small allowed filename set) before using them.
Review: PR #744 - Refactor ghx batch publish contract and docsSummaryThis PR refactors the ghx publish system by renaming "intent mode" to "batch mode" and promoting the schema from internal-only to a public contract. It also adds stdin support via Overall Assessment: APPROVED with minor suggestions The refactoring is well-executed with comprehensive test coverage. The documentation improvements are substantial and valuable. The renaming makes the terminology clearer for external consumers. Key Findings1. Breaking Change Without Migration Path (informational)Severity: The rename from Recommendation: Consider adding a brief migration note in the commit message or a CHANGES entry if external consumers exist. 2. Stdin State Management in Batch Mode (low priority)Severity: In Recommendation: Add a comment clarifying this behavior in the batch schema documentation, or consider whether per-action stdin is needed (which would require a different design). 3. Existing Security Issue (already raised by Copilot)Severity: The Copilot review correctly identified a path traversal vulnerability in Recommendation: File a follow-up issue to add path normalization and validation to Positive Aspects
Testing Status
Recommendations for Merge
Files Changed
Reviewed commit: |
There was a problem hiding this comment.
Review: #744
Summary
This PR successfully refactors the ghx publish multi-action mode from "intent" to "batch", with clear naming, documentation, and test updates. The changes are well-structured, properly tested, and address the stated goals effectively.
Overall Assessment: APPROVE with minor suggestions
Key Strengths
- Clear Naming: The "batch" terminology is more intuitive and better describes the multi-action behavior than "intent"
- Comprehensive Testing: Added new test coverage including:
- stdin support via
--body-file - - batch/direct mode conflict detection
- Proper validation of batch file parsing
- stdin support via
- Public Contract Documentation:
publish-batch.jsonis now properly documented as a public schema with clear field specifications - Safety Guidance: Added explicit guidance about using
--body-filefor large text payloads to avoid shell escaping issues - Good Validation: Added check to prevent combining
--batchwith direct commands (lines 519-523 in publish.sh)
Security Consideration
⚠️ Path Traversal Risk in Batch Execution (Already Reported)
The existing GitHub Copilot reviewer correctly identified a path traversal vulnerability at skills/ghx/scripts/publish.sh:219-236:
Issue: execute_batch passes untrusted publish-batch.json data as action_params into action handlers. These params are used to build body and comments_file paths relative to GITHUB_OUTPUT_DIR, but the validation only rejects absolute paths. Values containing ../ can escape the output directory and cause arbitrary Markdown/JSON files to be read and their contents posted to GitHub.
Recommendation: Normalize and strictly validate path parameters before use:
- Reject
..segments in path validation - Enforce that resolved paths stay under
GITHUB_OUTPUT_DIR - Consider allowing only a small set of known safe filenames
This should be addressed before merge or documented as a known limitation if the threat model is acceptable.
Findings
Low Priority Improvements
-
CLI Option Passthrough Consistency (ghx.sh:61)
- The
--batch=*option was correctly removed fromprsubcommand passthrough in commit 596bb35 - This prevents accidental batch mode invocation when users meant direct commands
- Good fix that matches the new validation in publish.sh
- The
-
Marker Default Value Change (lib/publish.sh:349-353)
- Changed from using
jq -r '.marker // "holon-publish-marker"'to explicit empty check - This is more explicit and correct, but could have been kept as the original one-liner
- Not a functional issue, just a style difference
- Changed from using
-
Log Output Redirection (lib/publish.sh:300, 306, 314, 318)
- Added
>&2to several log_info/log_warn calls - This is correct for ensuring logs go to stderr
- Inconsistent application (some calls still missing redirection)
- Added
-
Documentation Completeness
- SKILL.md and github-publishing.md now have excellent documentation of the batch schema
- Good examples of stdin usage and mode selection guidance
- Clear explanation of when to use batch vs direct mode
Validation
✅ All CI checks passing (CodeQL, Test Unit, Test Run Safety, skill-mode tests)
✅ Test coverage expanded appropriately for new functionality
✅ Documentation updated consistently across all referenced files
✅ Breaking change handled cleanly (no compatibility alias needed as noted in PR description)
Recommendations
Before Merge
- Address the path traversal vulnerability identified in the Copilot review or document the threat model decision
After Merge (Optional)
- Consider adding integration tests for the batch mode with actual GitHub API mocks
- Consider documenting the security model for
publish-batch.jsonpath handling explicitly
Conclusion
This is a well-executed refactoring that improves clarity and documentation of the ghx batch publish mechanism. The code quality is good, tests are comprehensive, and the public contract is now properly documented.
The primary concern is the path traversal vulnerability in batch execution, which should be addressed or explicitly accepted as a design decision within the documented threat model.
Recommendation: APPROVE after addressing the security concern
| action_params=$(jq ".actions[$i].params // (.actions[$i] | del(.type, .description))" "$intent_file") | ||
| action_type=$(jq -r ".actions[$i].type" "$batch_file") | ||
| action_params=$(jq ".actions[$i].params // (.actions[$i] | del(.type, .description))" "$batch_file") | ||
| total=$((total + 1)) |
There was a problem hiding this comment.
Path traversal vulnerability: execute_batch passes untrusted batch params to action handlers without sufficient validation. The action_params are used to construct body/comments_file paths, but validation only rejects absolute paths. Values containing ../ can escape GITHUB_OUTPUT_DIR and read arbitrary files. Normalize paths, reject .. segments, and enforce resolved paths stay under GITHUB_OUTPUT_DIR before using in action handlers.\n\nSuggestion: In parse_body_param() and similar path parsing functions, add strict validation: 1) Resolve paths to absolute using realpath, 2) Check resolved path starts with GITHUB_OUTPUT_DIR, 3) Reject paths containing .. components. Or restrict to a whitelist of safe filenames.
|
Thanks, this is valid and I fixed it in What changed:
Validation:
|
There was a problem hiding this comment.
Review: #744
Summary
This PR successfully refactors the ghx publish multi-action mode from "intent" to "batch" terminology, with improved documentation, comprehensive testing, and properly addresses the path traversal security concern. The changes are well-structured, properly tested, and represent a quality improvement to the codebase.
Overall Assessment: ✅ APPROVE
Key Strengths
-
Clear Naming: The "batch" terminology is more intuitive and better describes the multi-action behavior than "intent"
-
Security Fix: The path traversal vulnerability identified in previous reviews has been properly addressed through the new
resolve_output_path_for_read()function inlib/publish.sh:- Rejects absolute paths
- Validates against
..traversal using regex:(^|/)\.\.(/|$) - Verifies canonical paths stay under
GITHUB_OUTPUT_DIR - Proper test coverage:
test_batch_path_traversal_rejected()
-
Public Contract Documentation:
publish-batch.jsonis now properly documented as a public schema with:- Clear field specifications (required vs optional)
- Supported action types enumerated
- Path safety rules explicitly documented
- Third-party integration guidance
-
Safety Guidance: Added explicit guidance about using
--body-filefor large text payloads to avoid shell escaping issues, including stdin support (--body-file -) -
Comprehensive Testing: Added excellent test coverage:
- Stdin support via
--body-file - - Batch/direct mode conflict detection
- Path traversal rejection
- Validation of batch file parsing
- Stdin support via
-
CLI Consistency: Removed
--batchfromprsubcommand passthrough (commit 596bb35), preventing accidental batch mode invocation
Security Analysis
✅ Path Traversal - FIXED
The existing review threads correctly identified a path traversal vulnerability. This PR addresses it properly:
New protection in lib/publish.sh:310-356:
resolve_output_path_for_read() {
# Rejects absolute paths
if [[ "$rel_path" =~ ^/ ]]; then
return 1
fi
# Rejects .. traversal
if [[ "$rel_path" =~ (^|/)\.\.(/|$) ]]; then
return 1
fi
# Verifies canonical path stays under GITHUB_OUTPUT_DIR
# For existing paths, resolves with pwd -P and validates
}This function is now used in:
parse_body_param()(line 426)action_post_review()(line 452)
Test coverage: test_batch_path_traversal_rejected() validates the protection works.
Code Quality Observations
Positive Changes
-
Marker handling (lib/publish.sh:434-438):
- Changed from
jqdefault to explicit empty check - More readable and explicit
- Changed from
-
Stderr redirection (lib/publish.sh:300-385):
- Added
>&2to log_info/log_warn calls - Ensures logs go to stderr (correct)
- Added
-
Batch/direct mode validation (publish.sh:518-522):
- Added check to prevent combining
--batchwith direct commands - Good defensive programming
- Added check to prevent combining
Minor Style Notes
These are not issues, just observations:
-
Log redirection inconsistency: Some log calls have
>&2, others don't. This is minor and doesn't affect functionality. -
Marker default logic: Could have kept the
jqone-liner, but the explicit check is more readable.
Validation
✅ CI Status: All critical checks passing
- CodeQL: success (no new alerts)
- Test Unit: success
- Test Run Safety: success
- skill-mode tests: passing per PR description
✅ Documentation: Consistently updated across all referenced files
✅ Breaking Change: Properly handled without compatibility alias (as noted in PR description)
Recommendations
Before Merge - None
The path traversal vulnerability has been properly addressed, tests are comprehensive, and documentation is excellent.
Future Enhancements (Optional)
- Consider adding integration tests with GitHub API mocks for batch mode
- Consider documenting the security model for path handling explicitly in SKILL.md
Conclusion
This is a well-executed refactoring that:
- Improves clarity with better terminology
- Fixes the security vulnerability identified in previous reviews
- Adds comprehensive test coverage
- Documents the public contract properly
- Provides excellent safety guidance for text payloads
The code quality is high, tests are thorough, and the public contract is now properly documented. The primary security concern has been addressed through proper path validation.
Recommendation: ✅ APPROVE
This PR is ready to merge.
Summary
intenttobatch(no compatibility alias)--intent/publish-intent.jsonto--batch/publish-batch.jsonpublish-batch.jsonas a public schema (not internal-only), including required/optional fields and action types--body-file/--comments-fileValidation
./tests/skill-mode/test_publisher.sh(pass)Notes