Skip to content

fix(codex): fail closed on wrapped hook errors#113

Merged
majiayu000 merged 4 commits intomainfrom
feat/issue-109-codex-hook-fail-closed
Apr 21, 2026
Merged

fix(codex): fail closed on wrapped hook errors#113
majiayu000 merged 4 commits intomainfrom
feat/issue-109-codex-hook-fail-closed

Conversation

@majiayu000
Copy link
Copy Markdown
Owner

Closes #109

Summary

  • fail closed in hooks/run-hook-codex.sh when a wrapped PreToolUse hook exits nonzero or returns invalid JSON
  • keep valid empty pass responses silent and add Codex runtime regressions for both the silent-pass and failure paths
  • make the vg-helper session metrics test helper clear its temp directory so the required Cargo test run is deterministic

Test plan

  • bash tests/test_codex_runtime.sh
  • cargo check --manifest-path \"vg-helper/Cargo.toml\"
  • cargo test --manifest-path \"vg-helper/Cargo.toml\"

What/Why

Codex hook wrapper failures were being converted into a successful no-op path, which disabled VibeGuard enforcement instead of surfacing a deny result. This makes the wrapper fail closed for those error cases while preserving the existing silent pass behavior for legitimate empty hook output.

Proof

  • bash tests/test_codex_runtime.sh
  • cargo check --manifest-path \"vg-helper/Cargo.toml\"
  • cargo test --manifest-path \"vg-helper/Cargo.toml\"

AI Role

  • AI-generated and verified the wrapper fix, regression tests, and the minimal vg-helper test-helper cleanup needed to get the required Rust verification green.
  • Risk level: medium

Review Focus

  • Confirm Codex expects the emitted PreToolUse deny payload plus nonzero exit on wrapped-hook failure.
  • Confirm the vg-helper temp-dir cleanup is the right boundary for keeping JSONL append semantics while stabilizing tests.

Codex PreToolUse wrapper execution was failing open when the wrapped hook exited
nonzero or returned malformed JSON, which disabled guard enforcement instead of
surfacing a deny decision. Fail closed for those wrapper errors, keep valid
empty pass output silent, and cover the failure path with runtime regression
checks.

A small vg-helper test-helper cleanup is included so the required cargo test run
is deterministic: session metrics tests append JSONL, so reusing a fixed temp
folder could leave trailing records from earlier runs.

Signed-off-by: majiayu000 <1835304752@qq.com>
Copy link
Copy Markdown

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hook Latency (P95)

Details
Benchmark suite Current: de71a84 Previous: 0bb421d Ratio
pre-edit-guard (P95) 186 ms 195 ms 0.95
pre-write-guard (P95) 223 ms 219 ms 1.02
pre-bash-guard (P95) 267 ms 274 ms 0.97
post-edit-guard (100) (P95) 306 ms 299 ms 1.02
post-write-guard (100) (P95) 217 ms 208 ms 1.04
post-edit-guard (5000) (P95) 323 ms 316 ms 1.02
post-write-guard (5000) (P95) 219 ms 217 ms 1.01
stop-guard (5000) (P95) 144 ms 130 ms 1.11
learn-evaluator (5000) (P95) 142 ms 136 ms 1.04

This comment was automatically generated by workflow using github-action-benchmark.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 9515708946

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread hooks/run-hook-codex.sh Outdated
}, ensure_ascii=False))
PY
fi
exit "$HOOK_EXIT"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Return success when emitting PreToolUse deny JSON

After printing a hookSpecificOutput.permissionDecision: "deny" payload, this branch exits with the wrapped hook’s nonzero code. In Codex hook handling, nonzero exits are treated as hook failures (Some(exit_code) => Failed) rather than parsing stdout as a deny decision, so the tool call is not actually blocked in this error path. This means wrapped-hook crashes still fail open for PreToolUse despite emitting a deny payload.

Useful? React with 👍 / 👎.

Limit the wrapper hardening to PreToolUse so Stop and other best-effort
Codex hooks still fail open when they are only advisory, while keeping the
approval boundary fail closed when hook adaptation breaks.

Constraint: PreToolUse must deny on hook/adaptation failure without changing documented non-blocking Stop/PostToolUse behavior
Rejected: Propagate every nonzero hook exit | it breaks best-effort hook contracts outside the approval boundary
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: When changing Codex hook adaptation, keep explicit tests for both fail-closed PreToolUse and fail-open best-effort hook events
Tested: bash tests/test_codex_runtime.sh; bash tests/test_hook_health.sh; bash tests/test_hooks.sh; cargo check --manifest-path vg-helper/Cargo.toml; cargo test --manifest-path vg-helper/Cargo.toml
Not-tested: Full CI matrix on GitHub
Signed-off-by: majiayu000 <1835304752@qq.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3984d26379

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread hooks/run-hook-codex.sh
'permissionDecisionReason': 'VIBEGUARD hook failed: wrapped hook produced invalid JSON.'
}
}, ensure_ascii=False))
sys.exit(3)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Return success when emitting PreToolUse deny for bad JSON

In the invalid-JSON adapter path, the script prints a valid hookSpecificOutput.permissionDecision:"deny" payload but then exits with code 3; Codex’s PreToolUse handler treats non-2 nonzero exits as Failed (not blocked), so this path can still fail open and allow the tool call despite the deny JSON. Fresh evidence in this commit is the new sys.exit(3) branch for malformed wrapped-hook output, which reproduces the same fail-open behavior under a newly added error path.

Useful? React with 👍 / 👎.

Decline app-server command approvals when pre-bash hooks fail to launch or exit nonzero so broken guards cannot silently bypass the approval boundary.

Signed-off-by: majiayu000 <1835304752@qq.com>
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f7ee300e52

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread tests/test_codex_runtime.sh Outdated
assert_contains "${failing_out}" 'hook failed' "run-hook-codex explains wrapped hook failures"
assert_not_contains "${failing_out}" '"permissionDecision":"allow"' "run-hook-codex does not convert wrapped hook failure into allow"
TOTAL=$((TOTAL + 1))
if [[ ${failing_rc} -ne 0 ]]; then
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Expect zero exit when deny payload is emitted

This assertion encodes the wrong Codex hook contract for PreToolUse: upstream parse_completed logic in codex-rs/hooks/src/events/pre_tool_use.rs only blocks on either exit 0 with a deny JSON payload on stdout or exit 2 with a blocking reason on stderr, while other nonzero exits are marked Failed (fail-open). By requiring failing_rc != 0 here (and similarly in the invalid_json_rc check later), the regression test bakes in the fail-open behavior and will fail once the wrapper is corrected to return a successful exit when emitting deny JSON.

Useful? React with 👍 / 👎.

Codex only blocks PreToolUse requests when deny JSON is emitted with a successful exit, so keep wrapped-hook and adapter failures fail-closed instead of surfacing them as generic hook failures.

Signed-off-by: majiayu000 <1835304752@qq.com>
@majiayu000
Copy link
Copy Markdown
Owner Author

/gemini review

@majiayu000 majiayu000 merged commit 075c3ed into main Apr 21, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[P1] U-23: Codex CLI hook wrapper suppresses hook failures and allows execution

1 participant