M7: surface unclassifiedFallback in audit + validate/doctor lint by LanNguyenSi · Pull Request #314 · LanNguyenSi/harness

LanNguyenSi · 2026-06-27T14:59:49Z

What

M7 (discovery 2026-06-10): surface the Risk Gate's unclassifiedFallback signal end to end, and lint the risk-without-env-scope footgun.

The when: evaluator already computed unclassifiedFallback, which is true when a risk clause matched ONLY because the action was unclassified (the "unknown is not safe" fail-close), but the flag surfaced only in explain-policy. An operator reviewing a deny could not tell a genuine critical-severity match from a fail-closed unclassified command.
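
To make the distinction concrete, here is a minimal sketch of a fail-closed severity clause. It is not the harness evaluator: the types `Severity`, `Classification`, `ClauseResult` and the function `evalSeverityAtLeast` are hypothetical names chosen for illustration, and the four-level severity scale is an assumption.

```typescript
// Hypothetical shapes for illustration only; not the harness types.
type Severity = "low" | "medium" | "high" | "critical";

interface Classification {
  severity: Severity;
}

interface ClauseResult {
  matched: boolean;
  // True only when the clause matched because the action had no classification.
  unclassifiedFallback: boolean;
}

// Assumed ordering of severities, lowest first.
const SEVERITY_ORDER: readonly Severity[] = ["low", "medium", "high", "critical"];

function evalSeverityAtLeast(
  threshold: Severity,
  classification: Classification | undefined,
): ClauseResult {
  if (classification === undefined) {
    // "Unknown is not safe": an unclassified action matches the clause,
    // but the result records that the match was a fail-close.
    return { matched: true, unclassifiedFallback: true };
  }
  const matched =
    SEVERITY_ORDER.indexOf(classification.severity) >=
    SEVERITY_ORDER.indexOf(threshold);
  return { matched, unclassifiedFallback: false };
}
```

Both an unclassified command and a genuinely critical one yield `matched: true` here; only the second field separates them, which is the signal this PR carries through to the audit record.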

Changes

Audit visibility. PolicyDecision.whenUnclassifiedFallback is set when a match was fail-closed, serialised into the policy_decision ledger row, and rendered on three surfaces: harness audit (table annotation [unclassified-fallback]), harness audit --json (field), harness explain <policy> --trace [--json] (trace projection). The non-ux deny message appends a cause note before the "To satisfy" hint. The ux surface is left operator-curated; the flag still rides the audit record.
New validate lint checkPolicyRiskWithoutEnvScope: warns when a when: gates on risk.severity_at_least / risk.category_in / action.reversible without an environment.name scope (those clauses fail-closed to match every unclassified command in every environment).
doctor parity. doctor delegates to the same shared check; this corrects a stale doctor walker plus comment that wrongly excluded action.reversible (when-eval treats it like the other arms).
docs/risk-gate.md: deny-unclassified-in-production example.
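
The lint described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code of checkPolicyRiskWithoutEnvScope: the `Policy` shape with a flat `when` record keyed by dotted clause names, the function name `lintRiskWithoutEnvScope`, and the warning text are all hypothetical. The real check presumably walks whatever nested structure the policy schema allows.

```typescript
// Hypothetical, simplified policy shape: `when` is assumed to be a flat
// record keyed by dotted clause name.
interface Policy {
  name: string;
  when: Record<string, unknown>;
}

// Clauses that, per the PR description, fail closed on unclassified commands.
const FAIL_CLOSED_CLAUSES = [
  "risk.severity_at_least",
  "risk.category_in",
  "action.reversible",
] as const;

function lintRiskWithoutEnvScope(policies: readonly Policy[]): string[] {
  const warnings: string[] = [];
  for (const policy of policies) {
    const keys = Object.keys(policy.when);
    const risky = FAIL_CLOSED_CLAUSES.filter((clause) => keys.includes(clause));
    // Warn only when a fail-closed clause is present and nothing narrows
    // the policy to a named environment.
    if (risky.length > 0 && !keys.includes("environment.name")) {
      warnings.push(
        `${policy.name}: ${risky.join(", ")} without an environment.name scope`,
      );
    }
  }
  return warnings;
}
```

Keeping the check in one shared function is what lets doctor and validate stay in parity: both callers get the same clause list, including action.reversible.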

Verification

build plus full suite green (2550 passed, 1 skipped).
New tests: intercept (flag set/absent, ux carve-out), ledger-record (encode/decode round-trip), validate (3 positive, 3 negative), doctor (action.reversible parity), audit (json plus table render), explain (--trace --json). All mutation-validated.
Two independent reviewer passes; all findings (MEDIUM render-gap, MEDIUM audit test-gap, LOWs) addressed.
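
For readers unfamiliar with the round-trip style of test mentioned above, a minimal sketch follows. The ledger's real encoding is not shown in this PR, so JSON is an assumption here, and `DecisionPayload`, `encodeRow` and `decodeRow` are hypothetical names; only the optional whenUnclassifiedFallback field comes from the description.

```typescript
// Hypothetical payload shape; only whenUnclassifiedFallback is from the PR.
interface DecisionPayload {
  policy: string;
  outcome: "allow" | "deny";
  whenUnclassifiedFallback?: boolean;
}

function encodeRow(payload: DecisionPayload): string {
  // JSON.stringify omits properties whose value is undefined, so rows
  // without the flag stay byte-identical to the pre-change format.
  return JSON.stringify(payload);
}

function decodeRow(raw: string): DecisionPayload {
  const parsed = JSON.parse(raw) as Record<string, unknown>;
  const policy = parsed["policy"];
  const outcome = parsed["outcome"];
  if (typeof policy !== "string") {
    throw new Error("policy_decision row is missing a policy name");
  }
  if (outcome !== "allow" && outcome !== "deny") {
    throw new Error("policy_decision row has an unknown outcome");
  }
  const row: DecisionPayload = { policy, outcome };
  // Only a literal `true` sets the flag; absent or malformed values leave it unset.
  if (parsed["whenUnclassifiedFallback"] === true) {
    row.whenUnclassifiedFallback = true;
  }
  return row;
}
```

A round-trip test then asserts both directions: a flagged decision decodes with the flag set, and an unflagged one decodes without the key at all.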

Refs: harness-discovery-2026-06-10/M7 (b519df5c)

…tor lint

M7 (discovery 2026-06-10). The Risk Gate now distinguishes a genuine classification hit from a fail-closed "unknown is not safe" match:

- PolicyDecision and the serialised audit row carry an optional whenUnclassifiedFallback flag; the neutral deny message appends a note when a block was caused by an unclassified command rather than a real risk classification. The ux surface is unchanged (operator-curated).
- New validate lint checkPolicyRiskWithoutEnvScope warns when a policy gates on risk.severity_at_least / risk.category_in / action.reversible without an environment.name scope (those clauses fail-closed to match every unclassified command in every environment).
- doctor delegates to the same check so doctor and validate stay in parity; this also corrects a stale doctor comment + walker that wrongly excluded action.reversible (when-eval treats it like the other arms).
- docs/risk-gate.md gains a deny-unclassified-in-production example.

Refs: harness-discovery-2026-06-10/M7 (b519df5c)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Address review findings on the M7 change:

- explain --trace [--json] and harness audit now surface the whenUnclassifiedFallback flag (TraceProjection field; AuditDecisionRow field + [unclassified-fallback] annotation on the reason column). The flag was persisted but no read surface rendered it, so a ux-block deny had no operator-visible fail-closed signal; CHANGELOG/docs now name the three render surfaces accurately instead of overclaiming.
- intercept: build whenFallbackMap in an explicit loop instead of as a side effect inside the .filter() predicate (a future filter refactor could otherwise silently drop the audit flag).
- intercept: place the fail-closed note before the "To satisfy" hint so the cause precedes the remedy in the deny message.
- tests: explain --trace --json renders/omits the flag; ux-path carve-out (flag on the decision, clause NOT in the agent-facing reason); a direct payloadFromDecision->encode->decode round-trip.

Refs: harness-discovery-2026-06-10/M7 (b519df5c)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
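
The intercept refactor in this commit (building whenFallbackMap in an explicit loop rather than inside a .filter() predicate) might look roughly like the sketch below. Only the name whenFallbackMap and its keying by policy.name come from the commit messages; `Policy`, `WhenResult`, `selectMatching` and the `evaluate` callback are hypothetical.

```typescript
// Hypothetical shapes; only `whenFallbackMap` keyed by policy name is from the PR.
interface Policy {
  name: string;
}

interface WhenResult {
  matched: boolean;
  unclassifiedFallback: boolean;
}

function selectMatching(
  policies: readonly Policy[],
  evaluate: (policy: Policy) => WhenResult,
): { matching: Policy[]; whenFallbackMap: Map<string, boolean> } {
  const matching: Policy[] = [];
  const whenFallbackMap = new Map<string, boolean>();
  // One explicit pass: selection and flag recording are visibly coupled,
  // so rewriting the selection cannot silently drop the audit flag the way
  // a side effect hidden inside a filter predicate could.
  for (const policy of policies) {
    const result = evaluate(policy);
    if (!result.matched) continue;
    matching.push(policy);
    whenFallbackMap.set(policy.name, result.unclassifiedFallback);
  }
  return { matching, whenFallbackMap };
}
```

Keying the map by name is safe only while policy names are unique, which the final commit notes is enforced by schema/policies.ts.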

…tor parity

Final review notes:

- Add two audit tests (harness audit --json field, table reason-column [unclassified-fallback] annotation), each with a classified negative control. The audit render surface was advertised in CHANGELOG/docs but had no mutation-test guard; reverting either audit.ts render line now fails a test.
- CHANGELOG: note that harness doctor delegates to the same shared check and now also warns on action.reversible-unscoped policies (doctor previously excluded it on an incorrect fail-close assumption).

The remaining LOW (whenFallbackMap keyed by policy.name) is consciously accepted: policy-name uniqueness is enforced by schema/policies.ts.

Refs: harness-discovery-2026-06-10/M7 (b519df5c)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

nguyen-si-pp and others added 3 commits June 27, 2026 16:28

LanNguyenSi merged commit 77050a5 into master Jun 27, 2026
1 check passed

LanNguyenSi mentioned this pull request Jun 27, 2026

chore(release): 0.38.0 #315

Merged

LanNguyenSi merged 3 commits into master from fix/m7-risk-gate-unclassified-audit-validate-lint

2 participants
