
Safe Executor v1 + MCP utilization standard (internal rollout)#356

Open
fvn946zh9w-crypto wants to merge 1 commit into wonderwhy-er:main from fvn946zh9w-crypto:codex/mcp-utilization-skills-standard

Conversation


@fvn946zh9w-crypto fvn946zh9w-crypto commented Feb 23, 2026

User description

Title

Safe Executor v1 + MCP utilization standard (internal rollout)

Summary

  • Packages Safe Executor runtime, skill resources/views, and governance/rollout standards into a single integration set.
  • Keeps the single control-plane architecture rooted in /Users/test1/DesktopCommanderMCP.
  • Uses strict safety defaults and reason-coded guardrails.

Included

  • Skills tools and guarded execution flow (run_skill, approve_skill_run, get_skill_run, cancel_skill_run).
  • Read-only resources (dc://skills/catalog, dc://skills/eval-gate, dc://skills/runs/{runId}).
  • Governance and rollout docs under operations/rollout/.
  • Pilot evidence and baseline snapshots under operations/rollout/2026-02-23/.

Validation

  • Full test run result: see operations/rollout/2026-02-23/npm_test.log and summary file.
  • Pilot run evidence: operations/rollout/2026-02-23/pilot_run_summary.md.
  • Eval gate snapshot: operations/rollout/2026-02-23/skills_eval_gate_snapshot.json.

Security Defaults Confirmed

  • commandValidationMode = strict
  • skillExecutionMode = confirm
  • toolCallLoggingMode = redacted
  • skillExecuteEvalGateEnabled = true (enforced post-sampling)

Known Environment-specific Notes

  • The PDF creation test may hit a listen EPERM error in restricted sandboxes; the test includes an environment-safe skip path.

Rollout Scope

  • Internal opt-in only for this phase.

CodeAnt-AI Description

Add Safe Executor skills runtime with guarded skill discovery and execution

What Changed

  • Exposes read-only MCP resources for skills (dc://skills/catalog, dc://skills/eval-gate, dc://skills/runs/{runId}) and new tools for listing, inspecting, running, approving and cancelling skills (list_skills, get_skill, run_skill, approve_skill_run, cancel_skill_run, get_skill_run).
  • Introduces a guarded skill execution flow: plan mode completes immediately; confirm mode queues runs for manual approval; auto-safe mode can execute. Execute attempts are blocked by configuration checks and an eval gate (sample size + pass rate) with clear reason codes when blocked.
  • Adds pre-execution guardrails that block high-risk commands, validate config values, and enforce command validation mode (strict vs legacy) so dangerous or misconfigured executions are rejected with actionable messages.
  • Adds skill discovery and parsing (SKILL.md frontmatter), a skill runner that produces deterministic plans, step outcomes, safe command/script execution within skill scope, and per-run redacted views exposed via resources.
  • Adds configuration keys and safe defaults to opt-in skills (skillsEnabled=false), including skillsDirectories, commandValidationMode=strict, toolCallLoggingMode=redacted, skillExecutionMode=confirm, eval-gate defaults, and backfill logic to preserve existing user settings.
  • Changes command validation to fail-closed in strict mode (legacy preserves fail-open) and records/redacts tool call metadata to avoid logging raw secrets; a telemetry warning is printed when telemetry is enabled but no endpoints configured.
  • Adds comprehensive unit and integration tests and resource/tool fallbacks so clients that don't surface MCP resources can still read skill views.
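The three-mode execution flow described above (plan completes immediately, confirm queues for approval, auto-safe may execute behind the eval gate) can be sketched roughly as follows. All identifiers here are illustrative, not the PR's actual API:

```typescript
// Hypothetical sketch of the guarded skill execution flow. Real names
// (run_skill handler types, status values) may differ in the PR.
type SkillRunMode = 'plan' | 'confirm' | 'auto_safe';

interface SkillRun {
  id: string;
  status: 'planned' | 'waiting_approval' | 'running' | 'blocked';
  reasonCode?: string;
}

function startRun(mode: SkillRunMode, evalGatePassed: boolean): SkillRun {
  const id = `run-${Date.now()}`;
  if (mode === 'plan') {
    // Plan mode completes immediately: only the deterministic plan, no execution.
    return { id, status: 'planned' };
  }
  if (mode === 'confirm') {
    // Confirm mode queues the run until approve_skill_run is called.
    return { id, status: 'waiting_approval' };
  }
  // auto_safe may execute, but only once the eval gate (sample size + pass
  // rate) is satisfied; otherwise the run is blocked with a reason code.
  if (!evalGatePassed) {
    return { id, status: 'blocked', reasonCode: 'eval_gate_not_met' };
  }
  return { id, status: 'running' };
}
```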

Impact

✅ Fewer accidental destructive commands
✅ Clearer eval-gate blocks for skill execution
✅ Fewer raw secrets in tool call logs

💡 Usage Guide

Checking Your Pull Request

Every pull request you open is reviewed automatically. We check for security issues, infrastructure misconfiguration, and common code problems, so your changes are solid and won't cause trouble later.

Talking to CodeAnt AI

Got a question or need a hand with something in your pull request? You can easily get in touch with CodeAnt AI right here. Just type the following in a comment on your pull request, and replace "Your question here" with whatever you want to ask:

@codeant-ai ask: Your question here

This lets you have a chat with CodeAnt AI about your pull request, making it easier to understand and improve your code.

Example

@codeant-ai ask: Can you suggest a safer alternative to storing this secret?

Preserve Org Learnings with CodeAnt

You can record team preferences so CodeAnt AI applies them in future reviews. Reply directly to the specific CodeAnt AI suggestion (in the same thread) and replace "Your feedback here" with your input:

@codeant-ai: Your feedback here

This helps CodeAnt AI learn and adapt to your team's coding style and standards.

Example

@codeant-ai: Do not flag unused imports.

Retrigger review

Ask CodeAnt AI to review the PR again by typing:

@codeant-ai: review

Check Your Repository Health

To analyze the health of your code repository, visit our dashboard at https://app.codeant.ai. This tool helps you identify potential issues and areas for improvement in your codebase, ensuring your repository maintains high standards of code health.


codeant-ai bot commented Feb 23, 2026

CodeAnt AI is reviewing your PR.



@codeant-ai codeant-ai bot added the size:XXL This PR changes 1000+ lines, ignoring generated files label Feb 23, 2026

codeant-ai bot commented Feb 23, 2026

Nitpicks 🔍

🔒 No security issues identified
⚡ Recommended areas for review

  • Guardrail coverage
    The new preExecutionGuardrail checks for destructive commands using a simple regex and ad-hoc arg inspection. That approach can miss command variations (e.g. 'rm -rf /*', 'rm -rf --no-preserve-root /', chained commands with && or ;, obfuscated whitespace, shell expansions) or different argument shapes. Also the guard assumes args contains command as a top-level string which may not match all tool argument shapes. Verify that detection is robust (tokenize/normalize input, canonicalize, or use whitelist+explicit parsing) and that it's applied consistently before any execution handlers.

  • Skill resource exposure
    New resource/tool handlers expose skill artifacts (dc://skills/*) and return raw view.contents?.[0]?.text. These resources may contain large payloads or sensitive/unredacted data. Validate, sanitize, and size-limit outputs and ensure redaction of secrets before returning to clients or exposing via tool fallbacks.

  • Telemetry leakage risk
    Telemetry captures extra metadata (e.g. set_config_value key name and guardrail capture events). Ensure telemetry does not include secrets or sensitive configuration values. Confirm sanitization rules are sufficient for new properties and consider stricter filtering/explicit allow-listing of telemetry fields before calling capture/capture_call_tool.

  • Environment exposure
    Child processes created in runChildProcess inherit the full process environment via env: process.env. Untrusted skill scripts may read environment variables (secrets, credentials). Consider limiting or whitelisting environment variables for spawned processes or making this configurable per deployment.

  • Command validation false positives
    The disallowed-character pattern used to reject shell-like operators includes characters such as '$' which may appear in otherwise harmless commands or in version/flag output. This could produce false positives (blocked commands) when the intention is to run simple binaries with benign arguments. Re-evaluate the regex and ensure legitimate safe commands aren't inadvertently blocked.

  • Global mutation in test
    The test temporarily replaces commandManager.extractCommands with a throwing stub to simulate parser failure. Mutating global module state can leak between tests or threads. Even though it restores in finally, concurrent test execution or exceptions before assignment could still cause side effects. Consider using a dedicated stub/mocking helper or dependency injection.

  • Flaky timing / race
    The test relies on precise timing (setTimeout 400ms vs a 5s script) to cancel a running process and then awaits an approval promise. This is fragile across CI, different machines, and scheduling jitter; it can produce intermittent failures. Prefer deterministically waiting for the run to reach a known state (e.g. "running" or "started") before cancelling or use an explicit hook/event from the runner to signal process start.

  • Accessing private internals
    The test reads server._requestHandlers (a private field). Tests that rely on private/internal APIs are brittle against refactors. Prefer public test hooks or exported handler registry accessors.

  • Environment-dependent path
    The test sets skillsDirectories to a hard-coded absolute path (/Users/test1/.codex/skills). This makes tests non-portable and likely to fail on CI or developer machines. Use a temp directory, fixture, or mock the skill registry instead.

  • Hardcoded Path / Flaky Environment
    The test sets skillsDirectories to the absolute path /Users/test1/.codex/skills. This hard-coded user-specific path creates environment-dependent tests and will fail on other machines. Use a portable approach (temp dir, env var, or os.homedir) or make the path configurable in CI.

});
}

const run = await skillRunner.approveRun(parsed.data.runId);

Suggestion: The concurrency limit configured via skillMaxConcurrentRuns is only enforced for run_skill in execute mode, but not when a run transitions to execution via approve_skill_run, so a caller can bypass the limit by creating many planned runs and then approving them, leading to more concurrent executions than configured; you should apply the same getPendingOrActiveCount vs maxConcurrentRuns check in the approve handler before calling into the runner. [logic error]

Severity Level: Major ⚠️
- ❌ Concurrency limit ignored for approve_skill_run execution path.
- ⚠️ Safe Executor may spawn more runs than configured.
- ⚠️ Resource usage spikes despite skillMaxConcurrentRuns protection.
- ⚠️ Governance expectations on execution caps silently violated.
Suggested change
const run = await skillRunner.approveRun(parsed.data.runId);
if (skillRunner.getPendingOrActiveCount() >= settings.maxConcurrentRuns) {
  return errorResponse(`Max concurrent skill runs reached (${settings.maxConcurrentRuns}).`, 'concurrency_limit_reached');
}
const run = await skillRunner.approveRun(parsed.data.runId);
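If the suggestion is adopted, the shape of the fix is to check the cap before delegating to the runner. A minimal hedged sketch (the real handler signatures and `errorResponse` helper may differ):

```typescript
// Sketch of an approve handler that enforces the concurrency cap BEFORE
// starting execution, closing the approve_skill_run bypass. The SkillRunner
// interface here is illustrative.
interface SkillRunner {
  getPendingOrActiveCount(): number;
  approveRun(runId: string): Promise<{ id: string; status: string }>;
}

async function approveWithCap(
  runner: SkillRunner,
  runId: string,
  maxConcurrentRuns: number
): Promise<{ ok: boolean; reason?: string }> {
  // Same check run_skill applies in execute mode: queued runs must not
  // be able to exceed the configured cap via approval.
  if (runner.getPendingOrActiveCount() >= maxConcurrentRuns) {
    return { ok: false, reason: 'concurrency_limit_reached' };
  }
  await runner.approveRun(runId);
  return { ok: true };
}
```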
Steps of Reproduction ✅
1. Start the MCP server that wires the `approve_skill_run` tool to `handleApproveSkillRun`
in `src/handlers/skills-handlers.ts:157-193` (this module is imported by the tools layer
via `../tools/schemas.js`).

2. Configure the server so that `configManager.getConfig()` (used by `getSkillConfig` at
`src/handlers/skills-handlers.ts:34-37`) returns a `ServerConfig` with
`skillsEnabled=true`, `commandValidationMode='strict'`, `skillMaxConcurrentRuns=1`, and
`skillExecuteEvalGateEnabled=true`, so `normalizeSkillRuntimeConfig` at
`src/skills/runtime-config.ts:79-147` yields `maxConcurrentRuns = 1` and `evalGateEnabled
= true`.

3. From a client (DesktopCommander / MCP client), call the `run_skill` tool so it routes
to `handleRunSkill` at `src/handlers/skills-handlers.ts:107-155` with arguments that
create multiple runs in a non-`execute` mode (for example a planning/confirm flow defined
by `RunSkillArgsSchema`), so the concurrency check `if (parsed.data.mode === 'execute' &&
skillRunner.getPendingOrActiveCount() >= settings.maxConcurrentRuns)` at lines 134-137 is
not executed, producing several `waiting_approval` runs.

4. While at least one other run is already executing (started earlier via `run_skill` with
`mode='execute'` and allowed by the check at `skills-handlers.ts:134-137`, bringing real
concurrent executions to the configured limit of 1), approve each queued run by repeatedly
invoking the `approve_skill_run` tool, which reaches `handleApproveSkillRun` at
`src/handlers/skills-handlers.ts:157-193`; observe that no concurrency check is performed
before `skillRunner.approveRun(...)` is called, so additional runs start executing even
though `skillRunner.getPendingOrActiveCount()` is already ≥ `settings.maxConcurrentRuns`,
allowing more concurrent executions than the configured limit.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/handlers/skills-handlers.ts
**Line:** 183:183
**Comment:**
	*Logic Error: The concurrency limit configured via `skillMaxConcurrentRuns` is only enforced for `run_skill` in `execute` mode, but not when a run transitions to execution via `approve_skill_run`, so a caller can bypass the limit by creating many planned runs and then approving them, leading to more concurrent executions than configured; you should apply the same `getPendingOrActiveCount` vs `maxConcurrentRuns` check in the approve handler before calling into the runner.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

Comment on lines +529 to +533
text.includes('Started') || text.includes('No'),
['search_session_started_or_no_results'],
[text.substring(0, 200)],
'Search response did not include expected markers'
);

Suggestion: In the search step, the verification object always includes a failureReason string even when the step passes, which leads to confusing and inconsistent state where successful steps appear to have a failure reason in downstream views; adjust the logic so failureReason is only populated when the verification actually fails. [logic error]

Severity Level: Major ⚠️
- ⚠️ Successful search steps carry misleading failureReason metadata.
- ⚠️ Consumers of run summaries may misinterpret search step status.
Suggested change
text.includes('Started') || text.includes('No'),
['search_session_started_or_no_results'],
[text.substring(0, 200)],
'Search response did not include expected markers'
);
const searchSessionOk = text.includes('Started') || text.includes('No');
const verification = createVerification(
  searchSessionOk,
  ['search_session_started_or_no_results'],
  [text.substring(0, 200)],
  searchSessionOk ? undefined : 'Search response did not include expected markers'
);
Steps of Reproduction ✅
1. In `src/skills/runner.ts:333-383`, invoke `skillRunner.runSkill()` with `mode:
'execute'` and `executionMode: 'auto_safe'` on a valid `SkillDescriptor` so that a plan is
built including the `search` step (`buildDeterministicPlan()` at
`src/skills/runner.ts:114-155` always adds a `type: 'search'` step as `step-2`).

2. Ensure the run is executed (i.e., not `plan` or `confirm` mode) so `runSkill()` calls
`this.executePlanSteps(runId)` at `src/skills/runner.ts:382`, which in turn calls
`this.executeSingleStep()` for each step at `src/skills/runner.ts:418`.

3. When the `search` step is processed in `executeSingleStep()`
(`src/skills/runner.ts:510-536`), the call to `handleStartSearch()` returns a non-error
result where `result.content[0].text` includes either `"Started"` or `"No"` (the expected
success markers implied by the verification checks), making `verification.passed === true`
in the block at `src/skills/runner.ts:528-533`.

4. Observe the returned `SkillRun` from `runSkill()` (or via `getRun()` at
`src/skills/runner.ts:659-662`): in `run.executionSummary.stepOutcomes`, the `search` step
has `verification.passed === true` but `verification.failureReason === 'Search response
did not include expected markers'`, demonstrating a successful step carrying a non-empty
failureReason.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** src/skills/runner.ts
**Line:** 529:533
**Comment:**
	*Logic Error: In the search step, the verification object always includes a failureReason string even when the step passes, which leads to confusing and inconsistent state where successful steps appear to have a failure reason in downstream views; adjust the logic so failureReason is only populated when the verification actually fails.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

Comment on lines +65 to +70
if (prevSkillsEnabled !== undefined) {
try {
await configManager.setValue('skillsEnabled', prevSkillsEnabled);
} catch {
// ignore
}

Suggestion: The test only restores skillsEnabled if it was previously defined, so if the config key did not exist before the test, the test will persistently add skillsEnabled: false to the user's config and violate the "should not leave config mutated" expectation. [logic error]

Severity Level: Major ⚠️
- ❌ User skills tools remain hidden across MCP sessions.
- ⚠️ Test mutates shared config despite 'no mutation' comment.
Suggested change
if (prevSkillsEnabled !== undefined) {
try {
await configManager.setValue('skillsEnabled', prevSkillsEnabled);
} catch {
// ignore
}
try {
  await configManager.setValue('skillsEnabled', prevSkillsEnabled);
} catch {
  // ignore
}
Steps of Reproduction ✅
1. Ensure the configuration managed by `src/config-manager.ts` has no `skillsEnabled` key
so that `configManager.getValue('skillsEnabled')` (implemented at lines 215-218) will
return `undefined`.

2. Run `test/test-combined-tool-filtering.js`, which imports `{ configManager }` from
`../dist/config-manager.js` and calls `prevSkillsEnabled = await
configManager.getValue('skillsEnabled');` followed by `await
configManager.setValue('skillsEnabled', false);` inside `run()` (lines 32-34).

3. In the `finally` block at `test/test-combined-tool-filtering.js:63-79`, the code checks
`if (prevSkillsEnabled !== undefined)` before restoring; since `prevSkillsEnabled` is
`undefined` in this scenario, the call to `configManager.setValue('skillsEnabled',
prevSkillsEnabled)` is skipped entirely.

4. As a result, `src/config-manager.ts:setValue` (lines 223-248) has already persisted
`skillsEnabled: false` to the backing config file, and this value is not reverted after
the test, leaving `skillsEnabled` permanently set to `false` for subsequent server runs
initiated via `dist/index.js`, which drives skill tool filtering validated by this test.
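The general restore pattern the review is pointing at also has to cover the key-was-absent case: restore the old value when one existed, and remove the key when it did not. A sketch against a plain object store (the real config-manager may only expose setValue, in which case an explicit delete/unset operation would need to be added):

```typescript
// Sketch: always restore, deleting the key when it did not exist before.
// Assumes direct access to the config record; the real configManager API
// may require a dedicated unset method instead.
type Config = Record<string, unknown>;

function restoreKey(config: Config, key: string, prev: unknown): void {
  if (prev === undefined) {
    // Key was absent before the test: remove it again rather than
    // persisting an explicit value the user never set.
    delete config[key];
  } else {
    // Key existed: put the old value back.
    config[key] = prev;
  }
}
```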
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** test/test-combined-tool-filtering.js
**Line:** 65:70
**Comment:**
	*Logic Error: The test only restores `skillsEnabled` if it was previously defined, so if the config key did not exist before the test, the test will persistently add `skillsEnabled: false` to the user's config and violate the "should not leave config mutated" expectation.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

try {
skillRunner.resetExecuteEvalStats();
await configManager.setValue('skillsEnabled', true);
await configManager.setValue('skillsDirectories', ['/Users/test1/.codex/skills']);

Suggestion: The test hardcodes skillsDirectories to /Users/test1/.codex/skills, which will not exist on most machines or CI environments; this can cause handleListSkills to return an empty skills array, leading to a runtime error when accessing listPayload.skills[0].id and making the test environment-dependent and brittle. Reusing the previously configured skillsDirectories avoids the hard-coded, user-specific path while still allowing the test to operate on whatever skill directories are actually configured. [logic error]

Severity Level: Major ⚠️
- ⚠️ Eval gate test fails on non-`test1` developer machines.
- ⚠️ CI on shared runners may intermittently fail tests.
- ⚠️ Hard-coded user path reduces portability of test suite.
Suggested change
await configManager.setValue('skillsDirectories', ['/Users/test1/.codex/skills']);
if (prevDirs !== undefined) {
  await configManager.setValue('skillsDirectories', prevDirs);
}
Steps of Reproduction ✅
1. Clone the repository to a machine where the directory `/Users/test1/.codex/skills` does
not exist or does not contain valid skills for DesktopCommanderMCP (this is a
user-specific absolute path).

2. Run the eval gate test directly with `node test/test-skill-eval-gate.js`, which invokes
the `run()` function defined in `test/test-skill-eval-gate.js:11-68`.

3. Inside `run()`, the test overwrites the skills directory config with `await
configManager.setValue('skillsDirectories', ['/Users/test1/.codex/skills']);` at
`test/test-skill-eval-gate.js:22`, and then calls `handleListSkills({ limit: 1 })` at
`test/test-skill-eval-gate.js:28`.

4. `handleListSkills` (in `src/handlers/skills-handlers.ts:47-79`) uses `getSkillConfig()`
and then `skillRegistry.scanSkills(settings.skillDirs)` with the now hard-coded
`/Users/test1/.codex/skills`. Because that path has no usable skills on this machine,
`handleListSkills` returns a result with no skills (or a config error). Back in the test,
`parseTextPayload(listed)` at `test/test-skill-eval-gate.js:30` yields
`listPayload.skills` as an empty array, so `listPayload.skills[0].id` at
`test/test-skill-eval-gate.js:31` is `undefined.id`, causing a TypeError or failing the
earlier assertion `assert.ok(!listed.isError, ...)`, and the test fails in otherwise valid
environments.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** test/test-skill-eval-gate.js
**Line:** 22:22
**Comment:**
	*Logic Error: The test hardcodes `skillsDirectories` to `/Users/test1/.codex/skills`, which will not exist on most machines or CI environments; this can cause `handleListSkills` to return an empty skills array, leading to a runtime error when accessing `listPayload.skills[0].id` and making the test environment-dependent and brittle. Reusing the previously configured `skillsDirectories` avoids the hard-coded, user-specific path while still allowing the test to operate on whatever skill directories are actually configured.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.

try {
skillRunner.resetExecuteEvalStats();
await configManager.setValue('skillsEnabled', true);
await configManager.setValue('skillsDirectories', ['/Users/test1/.codex/skills']);

Suggestion: The test hardcodes skillsDirectories to /Users/test1/.codex/skills, which is a user-specific absolute path; on any machine where that path does not exist or contain skills, handleListSkills will return no skills and the assertions expecting discovered skills will fail, so the test should instead reuse the existing configuration or a configurable directory. [logic error]

Severity Level: Major ⚠️
- ❌ Skills workflow test fails on non-author environments.
- ⚠️ CI/npm test pipeline may be red by default.
- ⚠️ Contributors cannot reliably run full test-suite locally.
Suggested change
await configManager.setValue('skillsDirectories', ['/Users/test1/.codex/skills']);
const skillsDirs = Array.isArray(prevDirs) && prevDirs.length > 0
  ? prevDirs
  : (process.env.DESKTOP_COMMANDER_SKILLS_DIR ? [process.env.DESKTOP_COMMANDER_SKILLS_DIR] : []);
await configManager.setValue('skillsDirectories', skillsDirs);
Steps of Reproduction ✅
1. On any machine where `/Users/test1/.codex/skills` does not exist or contains no skill
definitions, build the project so that `dist/handlers/skills-handlers.js` is present.

2. Run the test entrypoint `node test/test-skills-workflow.js`, which calls the `run()`
function defined at `test/test-skills-workflow.js:18`.

3. Inside `run()`, at `test/test-skills-workflow.js:26-33`, the test sets configuration
via `configManager.setValue`, including `skillsDirectories` to
`['/Users/test1/.codex/skills']` at line 29, then calls `handleListSkills({ limit: 5 })`
at line 35.

4. `handleListSkills` (implemented in `src/handlers/skills-handlers.ts:47-79`) calls
`skillRegistry.scanSkills(settings.skillDirs)` using that directory; with no skills found,
it returns an empty `skills` array, causing the assertion
`assert.ok(listPayload.skills.length > 0, 'at least one skill should be discovered');` at
`test/test-skills-workflow.js:39` to fail and the test process to exit with a non-zero
status.
Prompt for AI Agent 🤖
This is a comment left during a code review.

**Path:** test/test-skills-workflow.js
**Line:** 29:29
**Comment:**
	*Logic Error: The test hardcodes `skillsDirectories` to `/Users/test1/.codex/skills`, which is a user-specific absolute path; on any machine where that path does not exist or contain skills, `handleListSkills` will return no skills and the assertions expecting discovered skills will fail, so the test should instead reuse the existing configuration or a configurable directory.

Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.


codeant-ai bot commented Feb 23, 2026

CodeAnt AI finished reviewing your PR.

@wonderwhy-er
Owner

Ouh wow that is a big one
Do you have a linkedin or something so we can discuss


Labels

size:XXL This PR changes 1000+ lines, ignoring generated files
