Skip to content

security(evidence): add #335 spending-policy execution evidence verifier#338

Merged
omarespejel merged 10 commits intomainfrom
codex/issue-335-evidence
Mar 6, 2026
Merged

security(evidence): add #335 spending-policy execution evidence verifier#338
omarespejel merged 10 commits intomainfrom
codex/issue-335-evidence

Conversation

@omarespejel
Copy link
Collaborator

@omarespejel omarespejel commented Mar 6, 2026

Summary

  • add scripts/security/spending-policy-evidence.mjs to initialize and verify reproducible #335 execution reports
  • add unit tests for schema validation, safety checks, and closure gating (--require-closed)
  • add canonical evidence template + runbook under docs/security/evidence/spending-policy/
  • map unresolved SPENDING_POLICY_AUDIT.md checklist items to explicit SP-xx checks and owner roles
  • wire new test suite into CI and strict proof workflow

Why

#335 requires reproducible evidence links, explicit owners, and enforceable closure criteria for no-backend launch readiness.

Validation

  • node --test scripts/security/check-session-signature-parity.test.mjs
  • node --test scripts/security/verify-secure-defi-claims.test.mjs
  • node --test scripts/security/evidence-manifest.test.mjs
  • node --test scripts/security/spending-policy-evidence.test.mjs
  • node scripts/security/spending-policy-evidence.mjs --report docs/security/evidence/spending-policy/execution-report.template.json --bundle-dir docs/security/evidence/spending-policy

Closes #335

Summary by CodeRabbit

  • New Features

    • CLI tooling to create, validate, and manage spending-policy execution reports for security closure.
  • Documentation

    • Added spending-policy audit guidance, evidence README, execution-report template, and launch-readiness tracker entries for spending-policy closure.
  • Tests

    • Comprehensive test suite for spending-policy evidence; CI now runs the new evidence validation tests.
  • Chores

    • CI workflows and package scripts updated to invoke the spending-policy evidence tooling.

@coderabbitai
Copy link

coderabbitai bot commented Mar 6, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 0632e59c-b239-4f4c-969c-6552e2c35f30

📥 Commits

Reviewing files that changed from the base of the PR and between 031d611 and 87fcbf3.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml
📒 Files selected for processing (1)
  • package.json
📜 Recent review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2026-02-17T04:43:12.245Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: website/CLAUDE.md:0-0
Timestamp: 2026-02-17T04:43:12.245Z
Learning: Install dependencies with `pnpm install`, start development server with `pnpm dev`, build with `pnpm build`, and run ESLint with `pnpm lint`

Applied to files:

  • package.json
📚 Learning: 2026-02-17T04:43:12.245Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: website/CLAUDE.md:0-0
Timestamp: 2026-02-17T04:43:12.245Z
Learning: Install dependencies with pnpm install, start dev server with pnpm dev, build for production with pnpm build, start production with pnpm start, and run linting with pnpm lint

Applied to files:

  • package.json
📚 Learning: 2026-02-15T17:03:57.943Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: AGENT.md:0-0
Timestamp: 2026-02-15T17:03:57.943Z
Learning: Applies to packages/create-starknet-agent/**/*.{ts,tsx,js} : CLI scaffolding tool (`create-starknet-agent`) must provide E2E onboarding examples for agent setup on Starknet

Applied to files:

  • package.json
🔇 Additional comments (2)
package.json (2)

22-23: LGTM!

The new scripts are well-structured and align with the PR objectives. The spending:evidence:init script appropriately uses fixed values (--run-id sp-template, hardcoded timestamp) to generate a reproducible canonical template, while spending:evidence:verify validates the report against the bundle directory. The --force flag on init enables template regeneration during development.


44-44: Version express-rate-limit@8.3.0 is valid and current.

The version exists on npm and is the latest available release. It is not deprecated. The override is using an up-to-date version.


Walkthrough

Adds a spending-policy evidence feature: new verifier CLI with exports, tests, JSON report template and docs, npm scripts, and CI workflow steps to run the verifier; also updates launch-readiness tracker and spending-policy audit docs for closure and sign-off. (50 words)

Changes

Cohort / File(s) Summary
CI Workflow Integration
/.github/workflows/ci.yml, /.github/workflows/strict-security-proof.yml
Invoke scripts/security/spending-policy-evidence.test.mjs as an additional security/evidence test step in the Test / Run evidence/proof validator job.
Launch Readiness & Audit Docs
docs/security/LAUNCH_READINESS_TRACKER.md, docs/security/SPENDING_POLICY_AUDIT.md
Add spending-policy closure checklist, evidence mapping, closure procedure, renumber sections, update sign-off criteria and limitations.
Evidence Docs & Template
docs/security/evidence/spending-policy/README.md, docs/security/evidence/spending-policy/execution-report.template.json
New README describing the execution evidence workflow and a JSON execution-report template with checks SP-01..SP-10, signoffs, and residualRisks.
Package Scripts
package.json
Add npm scripts spending:evidence:init and spending:evidence:verify; add pnpm.overrides entry for express-rate-limit.
Verifier Implementation
scripts/security/spending-policy-evidence.mjs
New CLI verifier implementing schema/version, CHECK_IDS, required-check definitions, argument parsing, template creation, strict validation (checks, evidence path safety, signoffs, residual risks), I/O helpers, exports and main entry.
Test Suite
scripts/security/spending-policy-evidence.test.mjs
New comprehensive tests covering arg parsing, template creation, many verification failure cases, closed-report success path, path-safety checks, duplicate detection, and CLI init/require-closed behavior.

Sequence Diagram(s)

sequenceDiagram
    actor Developer
    participant CI as CI Workflow
    participant TR as Test Runner
    participant Verifier as spending-policy-evidence.test.mjs
    participant CLI as spending-policy-evidence.mjs
    participant FS as Filesystem

    Developer->>CI: push changes (scripts, tests, docs, package.json)
    CI->>TR: run security test job
    TR->>Verifier: execute test suite
    Verifier->>CLI: import/run verifier module functions
    CLI->>FS: read/write report templates and bundle files
    CLI->>CLI: validate checks, evidence paths, signoffs, residual risks
    CLI-->>Verifier: return pass/fail to tests
    Verifier-->>TR: report test outcomes
    TR-->>CI: publish job result
    CI-->>Developer: status (PASS/FAIL)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs


Caution

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

  • Ignore

❌ Failed checks (1 error, 1 warning)

Check name Status Explanation Resolution
Spec Impact Declaration ❌ Error PR modifies .github/workflows/ and docs/ paths but lacks required 'Spec impact' section in description. Add 'Spec impact' section to PR description stating 'none' or detailing any specification impacts.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (8 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title uses the required scope prefix 'security(evidence)' and clearly summarizes the behavioral impact: adding a spending-policy execution evidence verifier for issue #335.
Linked Issues check ✅ Passed The PR fully implements objectives from #335: enumeration of checklist items as SP-xx checks with owners, reproducible evidence validation infrastructure, comprehensive test coverage, audit documentation updates with closure procedures, and sign-off tracking.
Out of Scope Changes check ✅ Passed All changes are directly aligned with #335 scope: spending-policy evidence infrastructure, validation scripts, documentation, tests, and CI workflow integration. The express-rate-limit override is a minor dependency management task with no functional impact.
Cross-Repo Boundary Awareness ✅ Passed PR does not modify files in restricted directories (contracts/session-account/** or packages/**).
Security Rationale For Account Semantics ✅ Passed This PR does not include changes to contracts/session-account/** directory, so the security rationale requirement does not apply.
Cairo Contract Safety Gate ✅ Passed No Cairo contract files were modified in this PR; all changes are limited to workflow configurations, documentation, and Node.js scripts.
Ci/Cd Security Gate ✅ Passed Workflow changes add test script invocations without introducing new permissions, unpinned action versions, or secrets handling issues.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch codex/issue-335-evidence

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps
Copy link

greptile-apps bot commented Mar 6, 2026

Greptile Summary

This PR introduces a structured CLI evidence verifier (spending-policy-evidence.mjs) for the #335 spending-policy launch gate, together with a canonical report template, a runbook, and supporting CI wiring. The tool covers report initialization, multi-layer path-traversal prevention, required-check completeness validation, and a hard --require-closed gate that demands all SP-01..SP-10 checks to "pass" and all three sign-offs to be "approved" before closure.

Key observations:

  • Three String() coercion bugs remain in validateEvidenceEntry and validateSignoff (lines ~578, ~582, ~728–739). Because String(["tx"]) === "tx", String(["0xdeadbeef"]) === "0xdeadbeef", and String(["approved"]) === "approved", single-element array values pass the set-membership guards. The most impactful case is signoff.*.status: ["approved"]: it passes the allowed-values check but causes the subsequent === "approved" block to be skipped, so name and signedAt are never enforced during non---require-closed validation. The --require-closed closure gate itself is safe (it fails closed for non-string statuses), but the relaxed validation path is misleading.
  • The url coercion issue identified in the previous review thread has been correctly resolved.
  • The validateResidualRisks concern from the previous thread has also been addressed — description and owner are now required.
  • CI workflows correctly pin all actions to full commit SHAs; permissions are minimal.
  • Test coverage is thorough but does not include tests for the array-value coercion bypass.

Confidence Score: 3/5

  • Safe to merge with caveats: the --require-closed closure gate is secure, but three String() coercion inconsistencies in the schema validator allow structurally malformed array values to pass non-closure validation silently.
  • The hard --require-closed path (the actual launch-blocking check) correctly rejects non-"approved" and non-"pass" values, so the critical closure gate is not bypassable. However, three String() coercion bugs in validateEvidenceEntry and validateSignoff allow single-element arrays to pass the set-membership checks while failing subsequent strict-equality logic — most notably allowing status: ["approved"] to skip name/signedAt enforcement during ordinary schema validation, undermining the evidence integrity guarantee outside the --require-closed path. The tests also lack coverage for these cases.
  • Focus attention on scripts/security/spending-policy-evidence.mjs lines ~576–584 (evidence entry type/txHash coercion) and lines ~728–741 (signoff status coercion). All other files are clean.

Important Files Changed

Filename Overview
scripts/security/spending-policy-evidence.mjs Core CLI evidence verifier. Three String() coercion bugs allow single-element arrays to pass set-membership guards for type, txHash, and status fields; the status coercion is most impactful because it lets a signoff entry with status: ["approved"] skip name/signedAt enforcement when --require-closed is not set.
scripts/security/spending-policy-evidence.test.mjs Comprehensive unit test coverage for schema validation, path traversal, closure gating, and edge cases. Missing tests for array-value coercion bypass (status: ["approved"], type: ["tx"], txHash: ["0x1"]).
.github/workflows/ci.yml Single-line addition of the new test invocation; all actions are correctly pinned to full commit SHAs. No issues.
.github/workflows/strict-security-proof.yml Single-line addition of the new test; action is pinned to full SHA, permissions are minimal (contents: read). No issues.
docs/security/evidence/spending-policy/execution-report.template.json Well-structured template covering all 10 SP-xx checks with correct owner roles and pending statuses. No issues.
docs/security/SPENDING_POLICY_AUDIT.md Updated with SP-01..SP-10 ownership map, closure procedure (Section 10), and corrected checklist items. Documentation is clear and accurate.
package.json Two new convenience scripts and an express-rate-limit override bump to 8.3.0. No issues.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    CLI["CLI: spending-policy-evidence.mjs"] --> PARSE["parseArgs(argv)"]
    PARSE --> INIT_MODE{--init flag?}

    INIT_MODE -->|yes| CREATE["createReportTemplate()"]
    CREATE --> WRITE["writeJson(reportPath, report)"]
    WRITE --> PASS0["✅ PASS (template initialized)"]

    INIT_MODE -->|no| READ["readJson(reportPath)"]
    READ --> VERIFY["verifySpendingPolicyReport(report, options)"]

    VERIFY --> SHAPE["validateBaseShape(report)\nschemaVersion, issue, profile, network,\nrunId, generatedAt, checks[], signoff, residualRisks[]"]
    SHAPE --> CHECKS["For each check entry:\n• checkId non-empty, no duplicates\n• status in allowed set (String() coercion ⚠️)\n• owner non-empty, evidence is array\n• if status==='pass': evidence.length > 0\n• validateEvidenceEntry(entry)"]
    CHECKS --> EVIDENCE["validateEvidenceEntry:\n• at least one of txHash/path/url\n• type in allowed set (String() coercion ⚠️)\n• txHash: hex regex (String() coercion ⚠️)\n• path: toSafeRelativePath + fs.existsSync\n• url: isNonEmptyString"]
    CHECKS --> MISSING["missingCheckIds = CHECK_IDS\n  .filter(id => !checkMap.has(id))\nfail if any missing"]
    MISSING --> CLOSED_CHECK{requireClosed?}
    CLOSED_CHECK -->|yes| UNRESOLVED["unresolved = checks\n  where status !== 'pass'\nfail if any unresolved"]
    CLOSED_CHECK -->|no| SIGNOFF
    UNRESOLVED --> SIGNOFF

    SIGNOFF["validateSignoff(signoff, options)\nFor each of leadDeveloper, securityReviewer, qaEngineer:\n• status in allowed set (String() coercion ⚠️)\n• if status==='approved': name + signedAt required\n• if requireClosed: status must be 'approved'"]
    SIGNOFF --> RISKS["validateResidualRisks(residualRisks)\neach entry must have description + owner\ndueDate optional ISO-8601"]
    RISKS --> RESULT["Return summary:\nrequiredChecks / passedChecks /\nunresolvedChecks / runId"]
    RESULT --> PASS["✅ PASS"]
    SHAPE -->|fail| BLOCK["❌ BLOCK (exit 1)"]
    CHECKS -->|fail| BLOCK
    EVIDENCE -->|fail| BLOCK
    MISSING -->|fail| BLOCK
    UNRESOLVED -->|fail| BLOCK
    SIGNOFF -->|fail| BLOCK
    RISKS -->|fail| BLOCK
Loading

Comments Outside Diff (3)

  1. scripts/security/spending-policy-evidence.mjs, line 728-741 (link)

    String() coercion lets array status values bypass approved enforcement

    In validateSignoff, the set-membership guard uses String(entry.status):

    if (!ALLOWED_SIGNOFF_STATUSES.has(String(entry.status))) {  }

    String(["approved"]) evaluates to "approved", which is in the set, so an array value passes. However the next check is a strict equality:

    if (entry.status === "approved") {   // false for ["approved"]
      if (!isNonEmptyString(entry.name)) {  }   // skipped
      validateIsoDate(entry.signedAt, );          // skipped
    }

    Because ["approved"] !== "approved", the name/signedAt enforcement block is silently skipped. A report with "status": ["approved"] (a one-element array) would therefore pass schema validation without any name or timestamp — contradicting the sign-off intent — whenever the caller omits --require-closed.

    The same coercion pattern affects check-status validation on lines ~785 and ~739. Consider replacing String(entry.status) with isNonEmptyString(entry.status) && ALLOWED_SIGNOFF_STATUSES.has(entry.status) (and the analogous guard for check statuses) to ensure only proper string values are accepted before any downstream strict-equality logic runs.

  2. scripts/security/spending-policy-evidence.mjs, line 576-580 (link)

    String() coercion on entry.type accepts single-element arrays

    String(entry.type) converts an array ["tx"] to the string "tx" via Array.prototype.toString. Because "tx" is in ALLOWED_EVIDENCE_TYPES, the evidence entry passes type validation even though entry.type is not a string.

    While type is currently only used as a label and does not gate further logic in this function, accepting a structurally malformed value quietly undermines the integrity guarantee of the schema validator.

    Replace with a strict string-first check:

  3. scripts/security/spending-policy-evidence.mjs, line 582-584 (link)

    String() coercion allows array txHash to pass hex validation

    String(["0x1234"]) returns "0x1234" because Array.prototype.toString calls join(","). A single-element array like ["0xdeadbeef"] therefore satisfies the regex, yet the stored value is an array, not a hex string. For a security-audit evidence record, a txHash that is not a plain string is structurally invalid.

    Add an explicit string check before the regex test:

Last reviewed commit: 87fcbf3

coderabbitai[bot]
coderabbitai bot previously requested changes Mar 6, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/security/spending-policy-evidence.mjs`:
- Around line 108-140: The current toSafeRelativePath check only rejects
Unix-style absolute paths (leading "/") but misses Windows drive-letter
absolutes (e.g., "C:\\" or "C:/"); update toSafeRelativePath to also detect and
fail on drive-letter prefixes after normalization (e.g., strings matching a
pattern like "^[A-Za-z]:/") so Windows absolute paths are rejected; keep the
existing normalize (replace backslashes with "/") and the same fail message, and
ensure resolveEvidencePath still performs its semantic check as-is.

In `@scripts/security/spending-policy-evidence.test.mjs`:
- Around line 78-192: Add a unit test that ensures verifySpendingPolicyReport
rejects reports with duplicate checkId values: create a report via
createReportTemplate, duplicate one of its checks (e.g., push
{...report.checks[0]}), then assert that calling
verifySpendingPolicyReport(report, { bundleDir: process.cwd() }) throws an error
matching /duplicate checkId/i; this covers the duplicate-checkId validation in
verifySpendingPolicyReport and uses existing helpers CHECK_IDS and
createReportTemplate to construct the scenario.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 01734320-dcd2-42fb-a4d0-111dacd09a5c

📥 Commits

Reviewing files that changed from the base of the PR and between 8ba02ec and 928af9a.

📒 Files selected for processing (10)
  • .github/workflows/ci.yml
  • .github/workflows/strict-security-proof.yml
  • docs/security/LAUNCH_READINESS_TRACKER.md
  • docs/security/SPENDING_POLICY_AUDIT.md
  • docs/security/evidence/spending-policy/README.md
  • docs/security/evidence/spending-policy/execution-report.template.json
  • docs/security/evidence/spending-policy/runs/.gitkeep
  • package.json
  • scripts/security/spending-policy-evidence.mjs
  • scripts/security/spending-policy-evidence.test.mjs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (3)
docs/**

⚙️ CodeRabbit configuration file

docs/**: Security documentation must accurately reflect the current implementation. Flag any discrepancy between documented and actual behavior. API docs must include authentication and authorization requirements.

Files:

  • docs/security/LAUNCH_READINESS_TRACKER.md
  • docs/security/evidence/spending-policy/README.md
  • docs/security/evidence/spending-policy/execution-report.template.json
  • docs/security/SPENDING_POLICY_AUDIT.md
scripts/**

⚙️ CodeRabbit configuration file

scripts/**: Flag any script that runs with elevated privileges without justification. Verify input sanitization in deployment scripts. Flag hardcoded credentials, URLs, or environment-specific assumptions.

Files:

  • scripts/security/spending-policy-evidence.mjs
  • scripts/security/spending-policy-evidence.test.mjs
.github/workflows/**

📄 CodeRabbit inference engine (Custom checks)

.github/workflows/**: For changes in .github/workflows/, flag any new or expanded permissions in CI/CD configurations
For changes in .github/workflows/
, flag unpinned action versions or secrets referenced without need-to-know scope

Files:

  • .github/workflows/ci.yml
  • .github/workflows/strict-security-proof.yml

⚙️ CodeRabbit configuration file

.github/workflows/**: Ensure least-privilege permissions and safe CI execution patterns. Flag any workflow with write permissions that doesn't strictly need them. Verify secrets are not exposed in logs or artifacts. Dependency actions must use pinned SHA versions, not tags.

Files:

  • .github/workflows/ci.yml
  • .github/workflows/strict-security-proof.yml
🧠 Learnings (18)
📓 Common learnings
Learnt from: omarespejel
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: :0-0
Timestamp: 2026-03-06T10:42:07.959Z
Learning: In the keep-starknet-strange/starknet-agentic repository, the canonical SISNA parity source for session-signature-v2 conformance workflows is `keep-starknet-strange/SISNA` (not `omarespejel/SISNA`), as established by PR `#329` closing issue `#256`. The spec files live at `spec/session-signature-v2.schema.json` and `spec/session-signature-v2.json` on the `main` branch.
📚 Learning: 2026-03-06T10:42:07.959Z
Learnt from: omarespejel
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: :0-0
Timestamp: 2026-03-06T10:42:07.959Z
Learning: In the keep-starknet-strange/starknet-agentic repository, the canonical SISNA parity source for session-signature-v2 conformance workflows is `keep-starknet-strange/SISNA` (not `omarespejel/SISNA`), as established by PR `#329` closing issue `#256`. The spec files live at `spec/session-signature-v2.schema.json` and `spec/session-signature-v2.json` on the `main` branch.

Applied to files:

  • docs/security/LAUNCH_READINESS_TRACKER.md
  • docs/security/evidence/spending-policy/execution-report.template.json
📚 Learning: 2026-02-16T05:58:24.900Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2026-02-16T05:58:24.900Z
Learning: Applies to contracts/session-account/** : For changes in contracts/session-account/**, require a concise security rationale and explicit mention of invariants preserved or changed

Applied to files:

  • docs/security/LAUNCH_READINESS_TRACKER.md
  • docs/security/SPENDING_POLICY_AUDIT.md
  • .github/workflows/strict-security-proof.yml
📚 Learning: 2026-02-24T20:25:18.742Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2026-02-24T20:25:18.742Z
Learning: Applies to contracts/session-account/** : For changes in contracts/session-account/**, require a concise security rationale and explicit mention of invariants preserved or changed in documentation

Applied to files:

  • docs/security/LAUNCH_READINESS_TRACKER.md
  • docs/security/SPENDING_POLICY_AUDIT.md
📚 Learning: 2026-02-16T05:58:24.900Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-16T05:58:24.900Z
Learning: Applies to +(contracts,packages)/**/{**/*account*,**/*session*,**/*signature*,**/*key*}.{cairo,ts,tsx} : Require human review for security-sensitive code (key handling, signature verification, spending limits)

Applied to files:

  • docs/security/LAUNCH_READINESS_TRACKER.md
  • scripts/security/spending-policy-evidence.mjs
  • docs/security/SPENDING_POLICY_AUDIT.md
📚 Learning: 2026-02-24T20:25:18.742Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2026-02-24T20:25:18.742Z
Learning: Applies to .github/workflows/** : For changes in .github/workflows/**, flag any new or expanded permissions in CI/CD configurations

Applied to files:

  • .github/workflows/ci.yml
📚 Learning: 2026-02-17T04:43:12.245Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: website/CLAUDE.md:0-0
Timestamp: 2026-02-17T04:43:12.245Z
Learning: Install dependencies with `pnpm install`, start development server with `pnpm dev`, build with `pnpm build`, and run ESLint with `pnpm lint`

Applied to files:

  • .github/workflows/ci.yml
  • package.json
📚 Learning: 2026-02-17T04:43:12.245Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: website/CLAUDE.md:0-0
Timestamp: 2026-02-17T04:43:12.245Z
Learning: Install dependencies with pnpm install, start dev server with pnpm dev, build for production with pnpm build, start production with pnpm start, and run linting with pnpm lint

Applied to files:

  • .github/workflows/ci.yml
  • package.json
📚 Learning: 2026-02-15T17:03:57.943Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: AGENT.md:0-0
Timestamp: 2026-02-15T17:03:57.943Z
Learning: Applies to packages/create-starknet-agent/**/*.{ts,tsx,js} : CLI scaffolding tool (`create-starknet-agent`) must provide E2E onboarding examples for agent setup on Starknet

Applied to files:

  • package.json
📚 Learning: 2026-02-16T23:56:29.389Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-16T23:56:29.389Z
Learning: Perform Sepolia testing before any mainnet deployment

Applied to files:

  • docs/security/evidence/spending-policy/execution-report.template.json
📚 Learning: 2026-02-24T20:25:10.945Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-24T20:25:10.945Z
Learning: Contract deployments require human review before pushing to Sepolia or mainnet

Applied to files:

  • docs/security/evidence/spending-policy/execution-report.template.json
📚 Learning: 2026-02-24T20:27:50.560Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2026-02-24T20:27:50.560Z
Learning: PR description must include 'Spec impact' with either 'none' or concrete compatibility/migration notes for changes in contracts/session-account/**, packages/**, docs/**, or .github/workflows/**

Applied to files:

  • docs/security/SPENDING_POLICY_AUDIT.md
📚 Learning: 2026-02-15T17:03:57.943Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: AGENT.md:0-0
Timestamp: 2026-02-15T17:03:57.943Z
Learning: Applies to contracts/**/*{identity,reputation,validation}*.{cairo,test.*} : Implement ERC-8004 identity, reputation, and validation registries with production-ready unit and E2E testing coverage

Applied to files:

  • docs/security/SPENDING_POLICY_AUDIT.md
  • .github/workflows/strict-security-proof.yml
📚 Learning: 2026-02-15T18:44:13.849Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2026-02-15T18:44:13.849Z
Learning: If this PR changes contracts/session-account/**, packages/**, docs/**, or .github/workflows/**, the PR description must include 'Spec impact' with either 'none' or concrete compatibility/migration notes

Applied to files:

  • docs/security/SPENDING_POLICY_AUDIT.md
📚 Learning: 2026-02-16T05:58:24.900Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: AGENT.md:0-0
Timestamp: 2026-02-16T05:58:24.900Z
Learning: Applies to **/contracts/agent-account/**/*.cairo : Agent Account contracts must support session keys, spending limits, time bounds, kill switches, multi-call batching, and nonce abstraction according to the Agent Wallet Architecture specification

Applied to files:

  • docs/security/SPENDING_POLICY_AUDIT.md
📚 Learning: 2026-02-15T17:03:57.943Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: AGENT.md:0-0
Timestamp: 2026-02-15T17:03:57.943Z
Learning: Applies to contracts/agent-account/**/*.cairo : Agent Account contracts must implement owner and session key policies with allowed contracts, spending limits, time bounds, and kill switch functionality

Applied to files:

  • docs/security/SPENDING_POLICY_AUDIT.md
📚 Learning: 2026-02-16T20:44:56.184Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: AGENT.md:0-0
Timestamp: 2026-02-16T20:44:56.184Z
Learning: Applies to **/contracts/agent-account/**/*.{test,spec}.{ts,tsx,cairo} : Agent Account contract must be tested with at least 110 unit tests and E2E tests before production deployment

Applied to files:

  • .github/workflows/strict-security-proof.yml
📚 Learning: 2026-02-16T20:44:56.184Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: AGENT.md:0-0
Timestamp: 2026-02-16T20:44:56.184Z
Learning: Applies to contracts/**/*.test.cairo : All smart contracts must include comprehensive unit tests (110+ tests for Agent Account) and E2E tests (47+ tests) before deployment

Applied to files:

  • .github/workflows/strict-security-proof.yml
🪛 markdownlint-cli2 (0.21.0)
docs/security/evidence/spending-policy/README.md

[warning] 26-26: Ordered list item prefix
Expected: 1; Actual: 2; Style: 1/1/1

(MD029, ol-prefix)


[warning] 32-32: Ordered list item prefix
Expected: 1; Actual: 3; Style: 1/1/1

(MD029, ol-prefix)


[warning] 40-40: Ordered list item prefix
Expected: 1; Actual: 4; Style: 1/1/1

(MD029, ol-prefix)

🔇 Additional comments (17)
package.json (1)

22-23: LGTM: New npm scripts for spending policy evidence workflow.

The scripts correctly wire the new spending-policy-evidence.mjs module for template initialization and verification. The hardcoded values (--run-id sp-template, --generated-at, --network starknet-sepolia) are appropriate defaults for the static template file, and --force is necessary since the template already exists in the repository.

scripts/security/spending-policy-evidence.mjs (4)

1-7: LGTM: Clean imports using Node.js built-in modules.

The script correctly imports only from node:* namespaced modules, avoiding external dependencies for this security-critical validation tool.


8-76: LGTM: Well-defined constants align with audit documentation.

The check definitions (SP-01 through SP-10) correctly match the launch-blocking checks documented in SPENDING_POLICY_AUDIT.md section 7.5, and the status/signoff enumerations provide appropriate validation constraints.


329-398: LGTM: Comprehensive report verification with proper closure gating.

The verification logic correctly:

  • Validates schema structure and required fields
  • Detects duplicate check IDs
  • Enforces evidence requirement for passed checks
  • Validates evidence paths against traversal attacks
  • Gates closure on all checks passing and all signoffs approved

415-473: LGTM: Clean CLI implementation with appropriate safeguards.

The main function properly handles both initialization and verification modes, with sensible defaults (bundleDir defaults to report's directory) and safety checks (--force required to overwrite). Exit codes follow convention (0=success, 1=error, 2=usage).

docs/security/evidence/spending-policy/execution-report.template.json (1)

1-108: LGTM: Template correctly matches script schema and audit documentation.

The template structure aligns with:

  • REPORT_SCHEMA_VERSION and validation requirements in spending-policy-evidence.mjs
  • Check definitions (SP-01 through SP-10) with correct owner role assignments
  • Required signoff keys (leadDeveloper, securityReviewer, qaEngineer)
  • Launch-blocking checks documented in SPENDING_POLICY_AUDIT.md section 7.5
docs/security/evidence/spending-policy/README.md (1)

1-73: LGTM: Comprehensive workflow documentation with accurate command examples.

The README correctly documents:

  • The canonical 4-step evidence workflow (init → execute → validate → verify closure)
  • CLI commands matching the script interface
  • Required check IDs (SP-01..SP-10) and signoff keys
  • Path safety and evidence requirements

The markdownlint warnings about ordered list prefixes (using 2., 3., 4. vs 1. throughout) are a stylistic preference and don't affect documentation correctness.

scripts/security/spending-policy-evidence.test.mjs (2)

1-14: LGTM: Clean test setup with proper imports.

The test file correctly imports from Node.js built-in test infrastructure and the module under test.


15-76: LGTM: Well-designed test helpers with proper cleanup patterns.

The helper functions provide good test isolation:

  • makeTempDir() creates unique temp directories
  • makeClosedReport() generates a realistic closed report with actual evidence files
  • writeTempReport() handles file setup for verification tests
.github/workflows/ci.yml (1)

185-185: LGTM: Test integration follows established CI patterns.

The new test is added alongside existing security tests in the same step, maintaining consistency. The workflow retains least-privilege permissions (contents: read) and all actions use pinned SHA versions.

.github/workflows/strict-security-proof.yml (1)

34-34: LGTM: Consistent test integration with CI workflow.

The addition mirrors the ci.yml change, ensuring the spending-policy-evidence tests run in both regular CI and the strict security proof gate. Workflow maintains contents: read permissions.

docs/security/SPENDING_POLICY_AUDIT.md (4)

633-655: LGTM: Ownership map integrates evidence workflow into audit documentation.

Section 7.5 correctly documents:

  • Reference to the canonical evidence template
  • Validation command matching the script CLI
  • Launch-blocking checks (SP-01..SP-10) with owner assignments that align with the script's REQUIRED_CHECK_DEFINITIONS

696-709: LGTM: Sign-off criteria updated with evidence check references.

The checklist items now reference specific check IDs (SP-01..SP-10) and signoff keys, providing clear traceability between the audit documentation and the evidence validation schema.


713-723: LGTM: Closure procedure provides clear, actionable workflow.

The 5-step procedure correctly documents:

  1. Run bundle initialization with proper CLI flags
  2. Evidence attachment guidance
  3. Structure validation command
  4. Closure readiness check with --require-closed
  5. Cross-reference requirements to issues #335 and #273

Commands match the script's CLI interface. As per coding guidelines: "Security documentation must accurately reflect the current implementation."


726-764: LGTM: Conclusion section provides comprehensive status summary.

The updated conclusion effectively summarizes:

  • Completed security work (R1, R2, R3 items resolved)
  • Remaining E2E/load testing requirements
  • Known limitations with rationale
  • Clear next steps for deployment readiness
docs/security/LAUNCH_READINESS_TRACKER.md (2)

14-14: Scope update correctly includes #335 closure evidence.

This addition cleanly aligns the tracker scope with the new spending-policy evidence workflow introduced in this PR.


53-61: #335 evidence map is well-wired and implementation-aligned.

The checklist/verifier/template references are precise and traceable, which improves closure reproducibility for launch gating.

As per coding guidelines, this reflects the rule: “Security documentation must accurately reflect the current implementation.”

@omarespejel
Copy link
Collaborator Author

Addressed all open feedback threads in this PR:

  • hardened toSafeRelativePath to reject Windows drive-letter absolute paths (e.g. C:/...)
  • fixed URL type-confusion by removing String(entry.url) coercion and requiring non-empty string directly
  • tightened residual risk validation: each entry now requires non-empty description and owner
  • added explicit duplicate-checkId unit test
  • added regression tests for Windows absolute path rejection, non-string URL rejection, and empty residual-risk object rejection

Validation run:

  • node --test scripts/security/spending-policy-evidence.test.mjs (pass: 12/12)

All previously open review threads for this PR were resolved.

coderabbitai[bot]
coderabbitai bot previously requested changes Mar 6, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/security/spending-policy-evidence.mjs`:
- Around line 93-101: The validateIsoDate function currently accepts any string
parseable by new Date; replace this with a strict ISO-8601 check: first ensure
value matches a rigorous ISO-8601 datetime regex (e.g.
YYYY-MM-DDTHH:MM:SS(.sss)? with Z or ±HH:MM timezone) so date-only or
human-readable forms are rejected, then parse with new Date(value) and call fail
via the existing fail(...) message if either the regex test fails or the parsed
date is NaN; update error messages in validateIsoDate to reflect ISO-8601
requirement so callers know the exact contract.

In `@scripts/security/spending-policy-evidence.test.mjs`:
- Around line 118-142: The test creates a temp dir with makeTempDir() and
currently removes it at the end of the test body, which can leak if an assertion
throws; change both path-traversal tests to register cleanup with t.after(() =>
fs.rmSync(bundleDir, { recursive: true, force: true })) immediately after
creating bundleDir so cleanup always runs, and remove the manual fs.rmSync(...)
at the end of the test bodies; the affected symbols are makeTempDir, bundleDir,
verifySpendingPolicyReport and t.after.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 29d6fd3c-0ad6-4c15-b034-6e9b9b674eed

📥 Commits

Reviewing files that changed from the base of the PR and between 928af9a and b9cbc03.

📒 Files selected for processing (2)
  • scripts/security/spending-policy-evidence.mjs
  • scripts/security/spending-policy-evidence.test.mjs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (1)
scripts/**

⚙️ CodeRabbit configuration file

scripts/**: Flag any script that runs with elevated privileges without justification. Verify input sanitization in deployment scripts. Flag hardcoded credentials, URLs, or environment-specific assumptions.

Files:

  • scripts/security/spending-policy-evidence.mjs
  • scripts/security/spending-policy-evidence.test.mjs
🧠 Learnings (5)
📚 Learning: 2026-02-16T05:58:24.900Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-16T05:58:24.900Z
Learning: Applies to +(contracts,packages)/**/{**/*account*,**/*session*,**/*signature*,**/*key*}.{cairo,ts,tsx} : Require human review for security-sensitive code (key handling, signature verification, spending limits)

Applied to files:

  • scripts/security/spending-policy-evidence.mjs
📚 Learning: 2026-02-16T20:44:56.184Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: AGENT.md:0-0
Timestamp: 2026-02-16T20:44:56.184Z
Learning: Applies to contracts/**/*.test.cairo : All smart contracts must include comprehensive unit tests (110+ tests for Agent Account) and E2E tests (47+ tests) before deployment

Applied to files:

  • scripts/security/spending-policy-evidence.test.mjs
📚 Learning: 2026-02-16T20:44:56.184Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: AGENT.md:0-0
Timestamp: 2026-02-16T20:44:56.184Z
Learning: Applies to **/contracts/agent-account/**/*.{test,spec}.{ts,tsx,cairo} : Agent Account contract must be tested with at least 110 unit tests and E2E tests before production deployment

Applied to files:

  • scripts/security/spending-policy-evidence.test.mjs
📚 Learning: 2026-02-15T17:03:57.943Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: AGENT.md:0-0
Timestamp: 2026-02-15T17:03:57.943Z
Learning: Applies to contracts/**/*{identity,reputation,validation}*.{cairo,test.*} : Implement ERC-8004 identity, reputation, and validation registries with production-ready unit and E2E testing coverage

Applied to files:

  • scripts/security/spending-policy-evidence.test.mjs
📚 Learning: 2026-02-16T23:56:29.389Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-16T23:56:29.389Z
Learning: Applies to packages/starknet-mcp-server/src/tools/**/*.ts : All MCP tools must have Zod-validated input schemas and Vitest test coverage

Applied to files:

  • scripts/security/spending-policy-evidence.test.mjs
🧬 Code graph analysis (1)
scripts/security/spending-policy-evidence.test.mjs (1)
scripts/security/spending-policy-evidence.mjs (15)
  • bundleDir (331-331)
  • bundleDir (454-454)
  • CHECK_IDS (63-63)
  • CHECK_IDS (63-63)
  • report (441-445)
  • report (453-453)
  • reportPath (424-424)
  • args (178-178)
  • args (417-417)
  • parseArgs (177-192)
  • createReportTemplate (194-242)
  • entry (288-288)
  • verifySpendingPolicyReport (330-399)
  • summary (455-458)
  • main (416-469)
🔇 Additional comments (2)
scripts/security/spending-policy-evidence.test.mjs (1)

170-178: Great addition: duplicate checkId is now explicitly covered.

This directly exercises the guard in report verification and closes a meaningful regression gap.

scripts/security/spending-policy-evidence.mjs (1)

108-121: Path-safety hardening looks good.

Drive-letter absolute paths and traversal patterns are explicitly rejected before resolution, which is the right defense-in-depth for evidence path input.

@omarespejel
Copy link
Collaborator Author

Follow-up updates applied after the latest bot round:

  • validateIsoDate now enforces strict ISO-8601 UTC format (YYYY-MM-DDTHH:mm:ss.sssZ) with regex + roundtrip check.
  • Added regression test for non-ISO generatedAt rejection.
  • Updated temp-dir cleanup in path traversal tests to use t.after(...) to avoid leaks on assertion failure.

Validation rerun:

  • node --test scripts/security/spending-policy-evidence.test.mjs (pass: 13/13)

Resolved the new review threads.

coderabbitai[bot]
coderabbitai bot previously requested changes Mar 6, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/security/spending-policy-evidence.mjs`:
- Around line 147-171: The validator currently allows evidence entries with just
a type and no locator; update validateEvidenceEntry to require a concrete
locator by adding a check that at least one of entry.path, entry.url, or
entry.txHash is present (and failing with a clear message otherwise). Keep the
existing validations for path (use resolveEvidencePath + fs.existsSync), url
(isNonEmptyString) and txHash (hex regex) but add the new presence check
(referencing validateEvidenceEntry and the entry object) before or after the
individual field validations so an entry like { type: "other" } will fail unless
it supplies path, url, or txHash.

In `@scripts/security/spending-policy-evidence.test.mjs`:
- Line 41: The fixture in spending-policy-evidence.test.mjs contains a hardcoded
network-specific URL ("https://sepolia.voyager.online/tx/0x1") assigned to the
url property; replace it with a neutral test constant (e.g., TEST_TX_URL) or a
placeholder string ("https://example.test/tx/0x1") and define that constant at
the top of the test file, then update the fixture's url reference so the test no
longer assumes a specific network endpoint.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: e22d0f62-f9b5-4206-b2b6-de3520b4a1df

📥 Commits

Reviewing files that changed from the base of the PR and between b9cbc03 and 98b9f6b.

📒 Files selected for processing (2)
  • scripts/security/spending-policy-evidence.mjs
  • scripts/security/spending-policy-evidence.test.mjs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (1)
scripts/**

⚙️ CodeRabbit configuration file

scripts/**: Flag any script that runs with elevated privileges without justification. Verify input sanitization in deployment scripts. Flag hardcoded credentials, URLs, or environment-specific assumptions.

Files:

  • scripts/security/spending-policy-evidence.test.mjs
  • scripts/security/spending-policy-evidence.mjs
🧠 Learnings (5)
📚 Learning: 2026-02-16T20:44:56.184Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: AGENT.md:0-0
Timestamp: 2026-02-16T20:44:56.184Z
Learning: Applies to contracts/**/*.test.cairo : All smart contracts must include comprehensive unit tests (110+ tests for Agent Account) and E2E tests (47+ tests) before deployment

Applied to files:

  • scripts/security/spending-policy-evidence.test.mjs
📚 Learning: 2026-02-16T20:44:56.184Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: AGENT.md:0-0
Timestamp: 2026-02-16T20:44:56.184Z
Learning: Applies to **/contracts/agent-account/**/*.{test,spec}.{ts,tsx,cairo} : Agent Account contract must be tested with at least 110 unit tests and E2E tests before production deployment

Applied to files:

  • scripts/security/spending-policy-evidence.test.mjs
📚 Learning: 2026-02-15T17:03:57.943Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: AGENT.md:0-0
Timestamp: 2026-02-15T17:03:57.943Z
Learning: Applies to contracts/**/*{identity,reputation,validation}*.{cairo,test.*} : Implement ERC-8004 identity, reputation, and validation registries with production-ready unit and E2E testing coverage

Applied to files:

  • scripts/security/spending-policy-evidence.test.mjs
📚 Learning: 2026-02-16T23:56:29.389Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-16T23:56:29.389Z
Learning: Applies to packages/starknet-mcp-server/src/tools/**/*.ts : All MCP tools must have Zod-validated input schemas and Vitest test coverage

Applied to files:

  • scripts/security/spending-policy-evidence.test.mjs
📚 Learning: 2026-02-16T05:58:24.900Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-16T05:58:24.900Z
Learning: Applies to +(contracts,packages)/**/{**/*account*,**/*session*,**/*signature*,**/*key*}.{cairo,ts,tsx} : Require human review for security-sensitive code (key handling, signature verification, spending limits)

Applied to files:

  • scripts/security/spending-policy-evidence.mjs
🧬 Code graph analysis (2)
scripts/security/spending-policy-evidence.test.mjs (1)
scripts/security/spending-policy-evidence.mjs (15)
  • bundleDir (335-335)
  • bundleDir (458-458)
  • CHECK_IDS (63-63)
  • CHECK_IDS (63-63)
  • report (445-449)
  • report (457-457)
  • reportPath (428-428)
  • args (182-182)
  • args (421-421)
  • parseArgs (181-196)
  • createReportTemplate (198-246)
  • entry (292-292)
  • verifySpendingPolicyReport (334-403)
  • summary (459-462)
  • main (420-473)
scripts/security/spending-policy-evidence.mjs (3)
scripts/security/spending-policy-evidence.test.mjs (16)
  • parsed (253-253)
  • bundleDir (130-130)
  • bundleDir (155-155)
  • bundleDir (224-224)
  • report (86-86)
  • report (106-106)
  • report (116-116)
  • report (132-132)
  • report (157-157)
  • report (180-180)
  • report (190-190)
  • report (214-214)
  • report (227-227)
  • report (258-258)
  • reportPath (73-73)
  • reportPath (241-241)
scripts/security/evidence-manifest.mjs (1)
  • relativePath (171-171)
scripts/security/check-session-signature-parity.mjs (1)
  • error (18-18)
🔇 Additional comments (2)
scripts/security/spending-policy-evidence.mjs (1)

94-105: Strong validation hardening for timestamps and evidence paths.

Strict ISO-8601 UTC enforcement and dual-layer path safety (toSafeRelativePath + resolveEvidencePath) materially improve reproducibility and traversal resistance.

Also applies to: 112-145

scripts/security/spending-policy-evidence.test.mjs (1)

94-103: Test coverage is comprehensive for security-critical validation paths.

The suite now directly covers strict ISO rejection, traversal blocking (including Windows absolute), duplicate checkId, URL type safety, residual-risk validation, and CLI exit semantics with robust temp-dir cleanup.

Also applies to: 129-177, 179-271

@omarespejel
Copy link
Collaborator Author

Applied the latest feedback in this PR:

  • validateEvidenceEntry now requires at least one concrete locator (txHash, path, or url)
  • added explicit non-empty string validation for path before path resolution
  • replaced network-specific fixture URL with neutral https://example.invalid/... constant
  • added regression test to ensure locator-less evidence entries are rejected

Validation rerun:

  • node --test scripts/security/spending-policy-evidence.test.mjs -> pass (14/14)

All currently open review threads on #338 are resolved.

coderabbitai[bot]
coderabbitai bot previously requested changes Mar 6, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scripts/security/spending-policy-evidence.mjs`:
- Around line 188-203: The parseArgs function currently treats a missing value
for any flag as the string "true", which lets value-bearing flags like --report,
--bundle-dir, and --generated-at silently accept no value; update parseArgs to
reject missing values for those flags by validating when a flag is present that
the next argv item exists and does not start with "--", and if it does, throw or
exit with a clear error mentioning the flag name (use the local variable name
and key to build the message). Specifically, inside parseArgs (function name)
replace the branch that sets args[name] = "true" for missing values with logic
that: for known value-bearing flags (--report, --bundle-dir, --generated-at)
treat a missing value as an error and return/throw a helpful message, and for
true/boolean flags still allow explicit --flag without value if desired; ensure
tests or callers handle the thrown error/exit consistently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5add9570-7459-445d-b344-eab720095a6b

📥 Commits

Reviewing files that changed from the base of the PR and between 98b9f6b and 0906cfb.

📒 Files selected for processing (2)
  • scripts/security/spending-policy-evidence.mjs
  • scripts/security/spending-policy-evidence.test.mjs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Type Check
  • GitHub Check: Greptile Review
🧰 Additional context used
📓 Path-based instructions (1)
scripts/**

⚙️ CodeRabbit configuration file

scripts/**: Flag any script that runs with elevated privileges without justification. Verify input sanitization in deployment scripts. Flag hardcoded credentials, URLs, or environment-specific assumptions.

Files:

  • scripts/security/spending-policy-evidence.test.mjs
  • scripts/security/spending-policy-evidence.mjs
🧠 Learnings (5)
📚 Learning: 2026-02-16T20:44:56.184Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: AGENT.md:0-0
Timestamp: 2026-02-16T20:44:56.184Z
Learning: Applies to contracts/**/*.test.cairo : All smart contracts must include comprehensive unit tests (110+ tests for Agent Account) and E2E tests (47+ tests) before deployment

Applied to files:

  • scripts/security/spending-policy-evidence.test.mjs
📚 Learning: 2026-02-16T20:44:56.184Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: AGENT.md:0-0
Timestamp: 2026-02-16T20:44:56.184Z
Learning: Applies to **/contracts/agent-account/**/*.{test,spec}.{ts,tsx,cairo} : Agent Account contract must be tested with at least 110 unit tests and E2E tests before production deployment

Applied to files:

  • scripts/security/spending-policy-evidence.test.mjs
📚 Learning: 2026-02-15T17:03:57.943Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: AGENT.md:0-0
Timestamp: 2026-02-15T17:03:57.943Z
Learning: Applies to contracts/**/*{identity,reputation,validation}*.{cairo,test.*} : Implement ERC-8004 identity, reputation, and validation registries with production-ready unit and E2E testing coverage

Applied to files:

  • scripts/security/spending-policy-evidence.test.mjs
📚 Learning: 2026-02-16T23:56:29.389Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-16T23:56:29.389Z
Learning: Applies to packages/starknet-mcp-server/src/tools/**/*.ts : All MCP tools must have Zod-validated input schemas and Vitest test coverage

Applied to files:

  • scripts/security/spending-policy-evidence.test.mjs
📚 Learning: 2026-02-16T05:58:24.900Z
Learnt from: CR
Repo: keep-starknet-strange/starknet-agentic PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-02-16T05:58:24.900Z
Learning: Applies to +(contracts,packages)/**/{**/*account*,**/*session*,**/*signature*,**/*key*}.{cairo,ts,tsx} : Require human review for security-sensitive code (key handling, signature verification, spending limits)

Applied to files:

  • scripts/security/spending-policy-evidence.mjs
🧬 Code graph analysis (1)
scripts/security/spending-policy-evidence.mjs (4)
contracts/erc8004-cairo/scripts/verify_owners.js (1)
  • message (161-161)
scripts/security/spending-policy-evidence.test.mjs (18)
  • parsed (277-277)
  • bundleDir (132-132)
  • bundleDir (157-157)
  • bundleDir (248-248)
  • report (88-88)
  • report (108-108)
  • report (118-118)
  • report (134-134)
  • report (159-159)
  • report (182-182)
  • report (192-192)
  • report (216-216)
  • report (238-238)
  • report (251-251)
  • report (282-282)
  • reportPath (75-75)
  • reportPath (265-265)
  • summary (252-255)
scripts/security/evidence-manifest.mjs (2)
  • relativePath (171-171)
  • process (452-455)
scripts/security/check-session-signature-parity.mjs (1)
  • error (18-18)
🔇 Additional comments (4)
scripts/security/spending-policy-evidence.test.mjs (3)

131-179: Strong path-safety regression coverage.

These tests cleanly validate traversal rejection and Windows absolute-path rejection, with reliable t.after cleanup.


181-245: Validation hardening cases are well covered.

Good coverage for duplicate checkId, non-string URL, locator-required evidence entries, and residual-risk field enforcement.


247-295: Closed-gating and CLI behavior tests look solid.

The positive closed-report case and --require-closed failure path provide useful end-to-end confidence.

scripts/security/spending-policy-evidence.mjs (1)

94-178: Validation hardening is well implemented.

Strict ISO-8601 UTC validation, safe-relative-path enforcement (including Windows drive paths), and locator/URL/path evidence checks are robust.

@omarespejel
Copy link
Collaborator Author

Follow-up clean summary of the resolved thread:

  • parseArgs now rejects missing values for value-bearing flags (report, bundle-dir, generated-at, run-id, network) instead of coercing them to true.
  • Added regression tests for missing-value handling.
  • Validation run passed: node --test scripts/security/spending-policy-evidence.test.mjs (15/15 passing).

Latest commit on this PR: 031d611.

@omarespejel
Copy link
Collaborator Author

@coderabbitai review

@coderabbitai
Copy link

coderabbitai bot commented Mar 6, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@omarespejel omarespejel dismissed stale reviews from coderabbitai[bot], coderabbitai[bot], coderabbitai[bot], and coderabbitai[bot] March 6, 2026 20:01

Dismissing stale bot change request: all actionable comments have been addressed on latest head.

@omarespejel
Copy link
Collaborator Author

Applied all actionable fixes needed to make this PR merge-ready on latest base:

  • Updated branch with latest main to remove BEHIND status.
  • Fixed CI Test audit-gate blocker by pinning patched express-rate-limit via pnpm overrides and lockfile refresh.
    • package.json: add pnpm.overrides["express-rate-limit"] = "8.3.0"
    • pnpm-lock.yaml: now resolves express-rate-limit@8.3.0
  • Confirmed no unresolved review threads remain.
  • Dismissed stale CodeRabbit CHANGES_REQUESTED reviews after all comments were addressed on latest head.

Validation run locally on this branch:

  • node scripts/security/audit-gate.mjs --report audit-report.json --allowlist security/audit-allowlist.json --failLevel high => PASS
  • node --test scripts/security/spending-policy-evidence.test.mjs => 15/15 passing

Current remaining blocker is only required human approval (self-approval is not allowed for PR author).

@omarespejel omarespejel merged commit 3092b54 into main Mar 6, 2026
22 checks passed
@omarespejel omarespejel deleted the codex/issue-335-evidence branch March 6, 2026 20:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

P1-validation: close SPENDING_POLICY_AUDIT E2E/load/sign-off items

1 participant