P2 3-way-audit follow-ups: audit truncation + numbers/Origin/OAuth + count reconcile#306
Draft
heznpc wants to merge 6 commits into
Draft
P2 3-way-audit follow-ups: audit truncation + numbers/Origin/OAuth + count reconcile#306heznpc wants to merge 6 commits into
heznpc wants to merge 6 commits into
Conversation
…ce comment audit_summary already computed verified / verifiedFirstBreak / auditDisabled but omitted them from outputSchema, so the MCP SDK strips the tamper-evidence signal out of structuredContent. Declared all three; added a strict-contract test so a future drift is caught at the runtime contract, not just registration shape. oauth-verifier's header comment claimed the RFC 8707 resource claim is accepted as an audience fallback, but the impl only checks aud via jose. Made the comment match reality (resource rides untouched in raw, never gates audience) and dropped the now-unused claimsFromPayload audience parameter.
The HMAC chain catches edits, inserts, reorders, and genesis-reroot, but a tail-truncated chain is still a valid shorter chain — removing the most recent lines went undetected. Each sealed line now carries a monotonic seq, and every flush overwrites a single signed checkpoint anchoring the highest seq + head. verifyAuditChain reports truncated when the chain falls short of the checkpoint and checkpoint_forged when the checkpoint MAC fails. A corrupt or absent checkpoint degrades to chain-only verification (no false alarm); an attacker who removes both the tail and the checkpoint is a documented limit, not a silent gap. Also emit a one-time warning on first flush when the chain is keyed off the host-derived fallback (tamper-evident only, not strong auth) so an operator knows to set AIRMCP_AUDIT_HMAC_KEY for cross-machine integrity. Regenerated tool-manifest.json + MCPIntents.swift for the extended audit_summary outputSchema (verified / verifiedFirstBreak / auditDisabled).
numbers_set_cell forced every value through a single-quoted JXA string literal, so 42 landed as the text "42" (left-aligned, unsortable, invisible to formulas) and only Numbers' own coercion rescued some cases. The value schema is now a string | number | boolean union and setCellScript emits a native JS literal for numbers/booleans (quoted + escaped only for strings, where a leading '=' is still interpreted by Numbers as a formula). A scalar-union param previously returned null from swiftTypeFor and was silently dropped from the generated AppIntent, which would have removed value from the Siri/Shortcuts "Set Numbers Cell" intent. swiftTypeFor now projects a scalar union to its string member (String) so the intent keeps a usable text parameter. Regenerated tool-manifest.json + MCPIntents.swift.
…opt-in isOriginAllowed has always allowed requests with no Origin header, but RFC 0002 documented the opposite (default-deny + an AIRMCP_TRUST_NO_ORIGIN bypass that was never implemented). Allow-by-default is the correct call: a browser always attaches Origin to a cross-origin request, so a missing Origin is a non-browser client (curl, native MCP) already gated by the token/OAuth policy — denying it would break those clients for no security gain. Made the rationale explicit in code, reconciled RFC 0002 section 5.2 to match, and added an opt-in strict mode (AIRMCP_DENY_NO_ORIGIN=1) for browser-only deployments that genuinely want to reject Origin-less requests.
The public "272 tools" headline counted only server.registerTool() call sites and undercounted the real runtime surface — the generated manifest exposes 286 (it also includes the dynamically-registered, skill_*, and MCP-app tools). count-stats now reads manifest.toolCount as the single source of truth and propagates 286 across README, the docs site, registry manifests, the landing page, and locales. Left as-is on purpose: the "~111 tools" starter-preset figure, the v2.7 "262 tools across 27 modules" pitch quoted in REGISTRY_SUBMISSIONS, and the point-in-time counts inside RFC 0013/0014 (historical context, not current claims). tool-count-drift now asserts README advertises exactly manifest.toolCount, instead of carrying a stale "superset" comment — guarding the headline against future drift.
…policy Two issues an adversarial review of this branch surfaced before merge: - llms.txt (the root AI-discovery manifest) still advertised "272 tools" while README / registry manifests moved to 286 — gen-llms-txt derived the headline from a registerTool( regex (272) instead of manifest.toolCount (286), and the llms:check gate compared against its own stale number so the contradiction shipped green. Headline now reads manifest.toolCount; the per-module list stays a documented presentation projection. - The no-Origin allow-by-default comment claimed the request is "already gated by the token/OAuth policy" — true for with-token*/with-oauth*, but the DEFAULT loopback-only policy has no such gate (it relies on the 127.0.0.1 binding). The behavior is safe either way; corrected the comment (code + RFC 0002 section 5.2) to attribute the gate to the active policy so it can't mislead a maintainer.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
The verifiable half of the P2 backlog from the 3-way completeness audit. Five self-contained fixes, each green on its own (typecheck, full suite of 2030 tests, gen/manifest/intents,
mcp:validate,stats:check,npm audit).audit_summary exposes
verified— the HMAC-chain tamper verdict was computed but omitted fromoutputSchema, so the SDK stripped it fromstructuredContent. Declared it (+verifiedFirstBreak/auditDisabled) with a strict-contract test. Also corrected the OAuthresourcecomment (impl isaud-only).audit-log tail-truncation detection — the HMAC chain caught edits / inserts / reorders but not removal of the most recent lines (a truncated chain is a valid shorter chain). Each line now carries a monotonic
seq+ a signed per-flush checkpoint; verify reportstruncated/checkpoint_forged. A corrupt or absent checkpoint degrades to chain-only (no false alarm). Plus a one-time warning when the chain is keyed off the host-derived fallback.Numbers cells write native types —
set_cellforced every value through a quoted JXA string, so42landed as the text"42". The value schema is nowstring | number | booleanand emits native literals;swiftTypeForgained scalar-union support so the AppIntent parameter isn't silently dropped.no-Origin policy —
isOriginAllowedhas always allowed Origin-less requests (correct: a browser always sends Origin, so a missing one is a token-gated non-browser client), but RFC 0002 documented the opposite plus an unimplemented bypass. Reconciled the RFC, made the rationale explicit, and added an opt-inAIRMCP_DENY_NO_ORIGINstrict mode.tool-count reconciliation — the public "272 tools" headline undercounted the real surface;
count-statsnow readsmanifest.toolCount(286) as the single source of truth and propagates it across README / docs site / registry manifests / landing page / locales. Historical figures preserved on purpose (the~111starter, the v2.7 "262 tools" pitch quote, RFC 0013/0014 point-in-time counts).Not included — the dual-provider AppShortcutsProvider guard, the FoundationModels fail-closed runtime test + tool-loop pre-check, and app/FM-flag CI. These live behind
#if canImport(FoundationModels)and cannot even be compile-checked without the macOS 26 SDK + a real device, so they stay in the separate macOS-verification bucket rather than being written blind and claimed done.