ACL: strict classification mode for ambiguous tools #59

The tool classifier introduced in #54 produces three possible outcomes: `Read`, `Write`, and `Ambiguous`. The default fail-safe treats `Ambiguous` as `Write` — tools land in the write bucket unless the classifier is confident they are read-only or the user added an override.

For most users this is fine: a `dev` role with `access: "write"` on a given server can still reach ambiguous tools because "write bucket" includes them. But for security-sensitive deployments, that default is too permissive: an ambiguous tool is, by definition, a tool the proxy does not understand, and silently granting it to anyone with `write` access is a blind spot.

`strictClassification: true` (already accepted by the new ACL schema parser in #55, but with no runtime effect yet) should change that behavior: in strict mode, an ambiguous tool is **blocked entirely** until it has an explicit override in the server config.

This issue is intentionally last in the sequence because its value only becomes clear once operators have run the classifier against their real environment, seen which tools come out ambiguous, and can make an informed decision about whether strict mode is worth enabling.

Depends on: #54 (classifier), #55 (schema flag), #56 (so denials caused by strict mode show up in audit with a clear reason).

## Goal

Wire the `strictClassification` flag into the ACL evaluator so that when enabled, ambiguous tools are unreachable until the operator acknowledges them via an override in `servers.json`.

## Expected behavior

- `acl.strictClassification: false` (default) → current behavior from #55. Ambiguous tools are evaluated as `Write` against grants. A role with `access: "write"` can reach them.
- `acl.strictClassification: true` → any tool whose classification is `Ambiguous` is **denied** regardless of grants. The only way to make it reachable is a manual override (`tools: { read: [...] }` or `tools: { write: [...] }` in the server config from #54), which replaces the ambiguous result with an explicit `Read` or `Write`. In other words: strict mode requires the operator to have explicitly stated what the tool is.
- Denials caused by strict mode produce an audit entry (#56) with a distinct, recognizable reason — something like `matched_rule: "strict_classification"` or an equivalent unambiguous marker — so operators can easily find them in logs and decide which tools to override.
- A clear `WARN` log at proxy startup summarizes the ambiguous tools currently being blocked due to strict mode, grouped by upstream server, with a count and the tool names for each server. This is the operator's cue to review `mcp acl classify` and decide which ambiguous tools to pin.
- `mcp acl check` (from #56) must report strict-mode denials with their specific reason, not a generic "default deny". The operator needs to see "blocked by strictClassification" as distinct from "no matching grant".
- `mcp acl check --subject alice --server x --all-tools` should clearly flag the strict-mode-blocked tools in its output when strict mode is on.
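The decision order described above could be sketched roughly as follows. This is a minimal illustration, not the proxy's actual evaluator: the `evaluate` function, the `Decision` type, the `"read"` access level, and the `grant` / `default_deny` marker values are all assumptions made for the example. Only the classification states, the source names, and the suggested `strict_classification` marker come from this issue.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Classification(Enum):
    READ = "read"
    WRITE = "write"
    AMBIGUOUS = "ambiguous"


class Source(Enum):
    OVERRIDE = "override"
    ANNOTATION = "annotation"
    CLASSIFIER = "classifier"
    FALLBACK = "fallback"


# Hypothetical marker values; the issue only asks that the strict one be distinct.
STRICT_CLASSIFICATION = "strict_classification"
DEFAULT_DENY = "default_deny"
GRANT = "grant"


@dataclass(frozen=True)
class ToolClassification:
    classification: Classification
    source: Source


@dataclass(frozen=True)
class Decision:
    allowed: bool
    matched_rule: str


def evaluate(tool: ToolClassification, access: Optional[str], strict: bool) -> Decision:
    """Evaluate one tool against one grant (`access` is "read", "write", or None)."""
    ambiguous = tool.classification is Classification.AMBIGUOUS
    # Strict mode is checked before grants, so no grant can rescue the tool.
    if strict and ambiguous and tool.source is not Source.OVERRIDE:
        return Decision(False, STRICT_CLASSIFICATION)
    # Non-strict fail-safe: an ambiguous tool is treated as a write tool.
    needs_write = ambiguous or tool.classification is Classification.WRITE
    if access == "write" or (access == "read" and not needs_write):
        return Decision(True, GRANT)
    return Decision(False, DEFAULT_DENY)


# Usage: the same ambiguous tool flips from allow to deny under strict mode,
# while a tool pinned by an override is allowed in both modes.
unknown = ToolClassification(Classification.AMBIGUOUS, Source.CLASSIFIER)
pinned = ToolClassification(Classification.WRITE, Source.OVERRIDE)
loose = evaluate(unknown, "write", strict=False)
strict_result = evaluate(unknown, "write", strict=True)
pinned_result = evaluate(pinned, "write", strict=True)
```

Checking strict mode first is what keeps the denial reason distinct: a strict-mode denial never falls through to the generic "no matching grant" path.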

## Out of scope

- **Changing the default to `true`.** Strict mode stays opt-in. Making it the default would be a breaking change and is a separate policy decision.
- **A third classification state** (e.g., "suspicious"). The model stays `Read` / `Write` / `Ambiguous`.
- **Per-server `strictClassification` overrides.** One global flag is enough for v1. If a real use case shows up later for per-server strict mode, it can be added without breaking the flag semantics.

## Technical pointers

- The flag already exists in the schema parser from #55 — this issue only needs to consume it in the evaluator.
- The evaluator needs access to the classification source (`override` / `annotation` / `classifier` / `fallback`) to distinguish a tool the operator has pinned via an override from one the classifier left ambiguous. Both the classification and its source come from #54's `ToolClassification`.
- The distinct audit marker should be defined in a single place and reused by both the evaluator and the CLI output, so they never drift.
- Unit tests: a policy that would otherwise allow an ambiguous tool via `access: "write"`, evaluated with and without strict mode, must flip from allow to deny. A policy with an explicit override on the same tool must allow in both modes.
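One way to keep the audit marker and the CLI wording from drifting is to define both in one module, as sketched below. The names `DenyReason`, `audit_fields`, and `cli_line`, the `default_deny` value, and the exact output strings are hypothetical; only the `strict_classification` marker and the "blocked by strictClassification" / "no matching grant" wording are taken from this issue.

```python
from enum import Enum


class DenyReason(str, Enum):
    """Single source of truth for denial markers (hypothetical names)."""
    STRICT_CLASSIFICATION = "strict_classification"
    DEFAULT_DENY = "default_deny"


# Human-readable wording for CLI output, keyed by the same enum members.
CLI_MESSAGES = {
    DenyReason.STRICT_CLASSIFICATION: "blocked by strictClassification",
    DenyReason.DEFAULT_DENY: "no matching grant",
}


def audit_fields(reason: DenyReason) -> dict:
    """Fields the evaluator would attach to an audit entry for a denial."""
    return {"decision": "deny", "matched_rule": reason.value}


def cli_line(tool: str, reason: DenyReason) -> str:
    """One line of check-style output for a denied tool."""
    return f"{tool}: DENY ({CLI_MESSAGES[reason]})"
```

Because both functions take a `DenyReason`, adding a new reason without a CLI message fails with a `KeyError` the first time it is rendered, and a test asserting `set(CLI_MESSAGES) == set(DenyReason)` would catch it earlier.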

## Success criteria

- Operators enabling `strictClassification: true` see clear startup warnings listing every ambiguous tool currently being blocked.
- `mcp acl check` distinguishes strict-mode denials from generic default denials in its output.
- Audit log queries can filter specifically for strict-mode denials to produce a prioritized list of tools needing overrides.
- A tool that was ambiguous but receives an override in `servers.json` becomes reachable again after a proxy restart, without any ACL change needed.
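As a rough illustration of the startup warning, the sketch below groups blocked tools by upstream server and formats one `WARN` line per server. The input shape (tuples of server, tool name, classification, source), the function names, the logger name, the example server and tool names, and the message wording are all assumptions for the example.

```python
import logging
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (server, tool name, classification, source); a hypothetical flat shape.
ToolRow = Tuple[str, str, str, str]


def blocked_by_server(tools: Iterable[ToolRow]) -> Dict[str, List[str]]:
    """Collect tools strict mode would block, keyed by upstream server."""
    blocked = defaultdict(list)
    for server, name, classification, source in tools:
        if classification == "ambiguous" and source != "override":
            blocked[server].append(name)
    return {server: sorted(names) for server, names in sorted(blocked.items())}


def startup_warnings(blocked: Dict[str, List[str]]) -> List[str]:
    """One warning line per server: a count followed by the tool names."""
    return [
        f"strictClassification: {len(names)} ambiguous tool(s) blocked "
        f"on '{server}': {', '.join(names)}"
        for server, names in blocked.items()
    ]


def log_startup_warnings(tools: Iterable[ToolRow]) -> None:
    """Emit the warnings through the standard logging module."""
    log = logging.getLogger("acl")  # hypothetical logger name
    for line in startup_warnings(blocked_by_server(tools)):
        log.warning(line)


# Usage with made-up servers and tools.
example = [
    ("build", "run_job", "ambiguous", "classifier"),
    ("build", "list_jobs", "read", "classifier"),
    ("files", "sync", "ambiguous", "classifier"),
    ("files", "touch", "write", "override"),
]
summary = blocked_by_server(example)
lines = startup_warnings(summary)
```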

See the full redesign plan at `docs/acl-redesign-plan.md`.

