fix(codex): fail closed on wrapped hook errors#113
Conversation
Codex PreToolUse wrapper execution was failing open when the wrapped hook exited nonzero or returned malformed JSON, which disabled guard enforcement instead of surfacing a deny decision. Fail closed for those wrapper errors, keep valid empty pass output silent, and cover the failure path with runtime regression checks. A small vg-helper test-helper cleanup is included so the required cargo test run is deterministic: session metrics tests append JSONL, so reusing a fixed temp folder could leave trailing records from earlier runs. Signed-off-by: majiayu000 <1835304752@qq.com>
There was a problem hiding this comment.
Hook Latency (P95)
Details
| Benchmark suite | Current: de71a84 | Previous: 0bb421d | Ratio |
|---|---|---|---|
pre-edit-guard (P95) |
186 ms |
195 ms |
0.95 |
pre-write-guard (P95) |
223 ms |
219 ms |
1.02 |
pre-bash-guard (P95) |
267 ms |
274 ms |
0.97 |
post-edit-guard (100) (P95) |
306 ms |
299 ms |
1.02 |
post-write-guard (100) (P95) |
217 ms |
208 ms |
1.04 |
post-edit-guard (5000) (P95) |
323 ms |
316 ms |
1.02 |
post-write-guard (5000) (P95) |
219 ms |
217 ms |
1.01 |
stop-guard (5000) (P95) |
144 ms |
130 ms |
1.11 |
learn-evaluator (5000) (P95) |
142 ms |
136 ms |
1.04 |
This comment was automatically generated by workflow using github-action-benchmark.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 9515708946
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| }, ensure_ascii=False)) | ||
| PY | ||
| fi | ||
| exit "$HOOK_EXIT" |
There was a problem hiding this comment.
Return success when emitting PreToolUse deny JSON
After printing a hookSpecificOutput.permissionDecision: "deny" payload, this branch exits with the wrapped hook’s nonzero code. In Codex hook handling, nonzero exits are treated as hook failures (Some(exit_code) => Failed) rather than parsing stdout as a deny decision, so the tool call is not actually blocked in this error path. This means wrapped-hook crashes still fail open for PreToolUse despite emitting a deny payload.
Useful? React with 👍 / 👎.
Limit the wrapper hardening to PreToolUse so Stop and other best-effort Codex hooks still fail open when they are only advisory, while keeping the approval boundary fail closed when hook adaptation breaks. Constraint: PreToolUse must deny on hook/adaptation failure without changing documented non-blocking Stop/PostToolUse behavior Rejected: Propagate every nonzero hook exit | it breaks best-effort hook contracts outside the approval boundary Confidence: high Scope-risk: narrow Reversibility: clean Directive: When changing Codex hook adaptation, keep explicit tests for both fail-closed PreToolUse and fail-open best-effort hook events Tested: bash tests/test_codex_runtime.sh; bash tests/test_hook_health.sh; bash tests/test_hooks.sh; cargo check --manifest-path vg-helper/Cargo.toml; cargo test --manifest-path vg-helper/Cargo.toml Not-tested: Full CI matrix on GitHub Signed-off-by: majiayu000 <1835304752@qq.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3984d26379
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| 'permissionDecisionReason': 'VIBEGUARD hook failed: wrapped hook produced invalid JSON.' | ||
| } | ||
| }, ensure_ascii=False)) | ||
| sys.exit(3) |
There was a problem hiding this comment.
Return success when emitting PreToolUse deny for bad JSON
In the invalid-JSON adapter path, the script prints a valid hookSpecificOutput.permissionDecision:"deny" payload but then exits with code 3; Codex’s PreToolUse handler treats non-2 nonzero exits as Failed (not blocked), so this path can still fail open and allow the tool call despite the deny JSON. Fresh evidence in this commit is the new sys.exit(3) branch for malformed wrapped-hook output, which reproduces the same fail-open behavior under a newly added error path.
Useful? React with 👍 / 👎.
Decline app-server command approvals when pre-bash hooks fail to launch or exit nonzero so broken guards cannot silently bypass the approval boundary. Signed-off-by: majiayu000 <1835304752@qq.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: f7ee300e52
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| assert_contains "${failing_out}" 'hook failed' "run-hook-codex explains wrapped hook failures" | ||
| assert_not_contains "${failing_out}" '"permissionDecision":"allow"' "run-hook-codex does not convert wrapped hook failure into allow" | ||
| TOTAL=$((TOTAL + 1)) | ||
| if [[ ${failing_rc} -ne 0 ]]; then |
There was a problem hiding this comment.
Expect zero exit when deny payload is emitted
This assertion encodes the wrong Codex hook contract for PreToolUse: upstream parse_completed logic in codex-rs/hooks/src/events/pre_tool_use.rs only blocks on either exit 0 with a deny JSON payload on stdout or exit 2 with a blocking reason on stderr, while other nonzero exits are marked Failed (fail-open). By requiring failing_rc != 0 here (and similarly in the invalid_json_rc check later), the regression test bakes in the fail-open behavior and will fail once the wrapper is corrected to return a successful exit when emitting deny JSON.
Useful? React with 👍 / 👎.
Codex only blocks PreToolUse requests when deny JSON is emitted with a successful exit, so keep wrapped-hook and adapter failures fail-closed instead of surfacing them as generic hook failures. Signed-off-by: majiayu000 <1835304752@qq.com>
|
/gemini review |
Closes #109
Summary
hooks/run-hook-codex.shwhen a wrapped PreToolUse hook exits nonzero or returns invalid JSONvg-helpersession metrics test helper clear its temp directory so the required Cargo test run is deterministicTest plan
bash tests/test_codex_runtime.shcargo check --manifest-path \"vg-helper/Cargo.toml\"cargo test --manifest-path \"vg-helper/Cargo.toml\"What/Why
Codex hook wrapper failures were being converted into a successful no-op path, which disabled VibeGuard enforcement instead of surfacing a deny result. This makes the wrapper fail closed for those error cases while preserving the existing silent pass behavior for legitimate empty hook output.
Proof
bash tests/test_codex_runtime.shcargo check --manifest-path \"vg-helper/Cargo.toml\"cargo test --manifest-path \"vg-helper/Cargo.toml\"AI Role
vg-helpertest-helper cleanup needed to get the required Rust verification green.Review Focus
vg-helpertemp-dir cleanup is the right boundary for keeping JSONL append semantics while stabilizing tests.