How MCTS works internally: discovery → analyzers → scoring → report.
| Read this if you… | Jump to |
|---|---|
| Just want to scan | Getting started — skip this doc |
| Need to understand a finding | Security checks |
| Are contributing code | Quick start for contributors + Extension points |
| Are debugging a scan | Scan lifecycle + Debugging |
Terms: Glossary
- At a glance
- End-to-end pipeline
- Entry points
- Scan lifecycle
- Core data models
- Layers
- Analyzers
- Scoring and reporting
- Supporting commands
- Package layout
- Extension points
- Debugging scans
Roadmap and planned work: Product roadmap (not covered here).
When you run mcts scan ./server.py:
- Discover: Build an `MCPServerInfo` snapshot (tools, prompts, resources, handler source, repo markdown instructions, optional live schemas)
- Analyze: Run security analyzers; each returns `Finding` objects
- Post-process: Dedupe, enrich with MCTS-T IDs, append OWASP compliance meta-findings
- Score: Legacy 0–100 `score.overall` (always) plus v2 `score_v2` when `scoring_mode` is `v2` or `both` (default); compliance excluded from both sums; `attack_chains` meta-rows excluded from v2 only
- Report: Terminal UI, JSON, SARIF (incl. `mcts/scoreV2`), or HTML via `mcts report`
Orchestrator: Scanner in src/mcts/core/scanner.py
Config: ScanConfig in src/mcts/core/config.py
CLI: mcts scan in src/mcts/cli/main.py
flowchart LR
subgraph input [Input]
CLI["CLI / API"]
CFG["ScanConfig"]
end
subgraph discover [Discovery]
STATIC["Static Py/TS"]
LIVE["Live stdio / HTTP"]
SNAP["JSON snapshot"]
STATIC --> MERGE["MCPServerInfo"]
LIVE --> MERGE
SNAP --> MERGE
end
subgraph analyze [Analysis]
ANA["Analyzers"]
DEDUPE["Dedupe + enrich"]
COMP["Compliance OWASP"]
GRAPH["Attack graph + scan scope"]
ANA --> DEDUPE --> COMP --> GRAPH
end
subgraph output [Output]
V1["RiskScoringEngine (legacy)"]
V2["RiskScoringEngineV2 (optional)"]
REP["ScanReport"]
OUT["Terminal · JSON · SARIF · HTML"]
GRAPH --> V1
V1 --> V2
V2 --> REP --> OUT
end
CLI --> CFG --> STATIC
CFG --> LIVE
CFG --> SNAP
MERGE --> ANA
ASCII equivalent:
ScanConfig ──► Discovery (static / live / snapshot) ──► MCPServerInfo
│
▼
Analyzers (parallel, sequential loop)
│
▼
filters → dedupe → enrich (MCTS-T) → compliance
│
▼
attack_graph + scan_scope (paths when v2/both)
│
▼
RiskScoringEngine (always) → RiskScoringEngineV2 (v2/both)
│
▼
ScanReport → terminal · JSON · SARIF · HTML
| Surface | Module | Typical use |
|---|---|---|
| CLI | `cli/main.py` | `mcts scan`, `inventory`, `fuzz`, `vet`, `pentest` |
| REST API | `api/app.py` | `mcts serve`: same `Scanner`, JSON in/out |
| Python API | `core/scanner.py` | `Scanner(ScanConfig).run()` or `.analyze_server(info)` |
| MCP server mode | `mcp_server/` | `mcts-mcp` tools for IDE agents |
All scan paths converge on Scanner.analyze_server(MCPServerInfo).
Scanner.run() discovers first; Scanner.analyze_server() runs the analysis pipeline (also used by API and tests with pre-built snapshots).
| Source | When | Module |
|---|---|---|
| Python static AST | Default repo/file scan | `discovery/static.py` |
| TypeScript/JS patterns | `--languages typescript` | `discovery/static_js.py` |
| Repository markdown | Default repo scan (`--discover-instructions`, on) | `discovery/instruction_files.py` |
| Live stdio MCP | `--live` | `discovery/live.py`, `probe/session.py` |
| Remote HTTP/SSE | `--url` | `probe/http_session.py` |
| Exported JSON | `--snapshot` | `discovery/static_json.py` |
| Client config launch | `--config` + `--server` | `discovery/live_config.py` |
Static + live can merge when merge_static_live is true (default). Repository markdown discovery merges with Python/JS static results via discovery/static_merge.py. See Scanning overview.
Repository instruction discovery (discovery/instruction_files.py) walks the scan target for agent prompt content outside MCP prompts/list:
| Pattern | Loaded as |
|---|---|
| `**/SKILL.md` | Prompt surface + `agent_skills` entry (for `skill_md`) |
| `**/*prompt*.md`, `**/system_prompt.md` | Prompt and/or instruction surfaces |
| `skills/`, `agent/skills/` | Project-local skill roots (no symlink required) |
Explicit paths: --instruction-file, --instruction-glob, --skills-dir. Disabled with --no-discover-instructions. Skipped when --live, --url, or --snapshot is used (MCP protocol takes precedence).
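The pattern matching in the table above can be sketched with the standard library alone. This is a minimal illustration, not the code in `discovery/instruction_files.py`: `find_instruction_files` and its return shape are hypothetical, and the `skills/` roots, explicit `--instruction-glob` paths, and any directory exclusions are not modeled.

```python
import tempfile
from pathlib import Path

# Glob patterns taken from the table above; the kind labels reuse the
# discovered_via values, but the pairing here is an assumption.
PATTERNS = [
    ("**/SKILL.md", "skill-md"),
    ("**/*prompt*.md", "instruction-file"),
]


def find_instruction_files(root: Path) -> dict[str, str]:
    """Map each matching file (relative POSIX path) to the first kind that matched."""
    found: dict[str, str] = {}
    for pattern, kind in PATTERNS:
        for path in sorted(root.glob(pattern)):
            if path.is_file():
                found.setdefault(path.relative_to(root).as_posix(), kind)
    return found


def demo() -> dict[str, str]:
    """Build a throwaway repo layout and run the sketch over it."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "skills" / "deploy").mkdir(parents=True)
        (root / "skills" / "deploy" / "SKILL.md").write_text("# Deploy skill\n")
        (root / "docs").mkdir()
        (root / "docs" / "system_prompt.md").write_text("You are a helpful agent.\n")
        (root / "README.md").write_text("not an instruction file\n")
        return find_instruction_files(root)


if __name__ == "__main__":
    print(demo())
```

Note that `**/system_prompt.md` is already covered by `**/*prompt*.md`, so the sketch needs only two patterns.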
- Merge `--runtime-events` file rows with live / behavioral probe events
- Attach `surface_scan` options (`--surfaces`, MIME allowlist)
- Emit `discovery_meta` findings if live discovery was incomplete
Loop registered analyzers; skip disabled or filtered ones (--analyzers, config toggles). When --surface-scoped-analyzers is on (default) and --surfaces is a strict subset, only analyzers relevant to those surfaces run — e.g. mcts scan-prompts skips supply_chain on pyproject.toml. See Analyzers and core/surface_analyzers.py.
Optional: probe_protocol_security() when --protocol-probe + --url.
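The selection rules described above (skip disabled analyzers, honor an explicit subset, and drop analyzers irrelevant to a strict subset of surfaces) might look like the following sketch. `AnalyzerSpec`, `select_analyzers`, and the surface sets assigned to each analyzer are hypothetical; the real mapping lives in `core/surface_analyzers.py`.

```python
from dataclasses import dataclass, field

ALL_SURFACES = frozenset({"tools", "prompts", "resources", "instructions"})


@dataclass(frozen=True)
class AnalyzerSpec:
    """Hypothetical registry entry: which surfaces an analyzer cares about."""
    name: str
    enabled: bool = True
    # Empty set: the analyzer is not tied to an MCP surface (e.g. dependency files).
    surfaces: frozenset[str] = field(default_factory=frozenset)


def select_analyzers(registry, requested_surfaces, only=None, surface_scoped=True):
    """Return the analyzer names that should run, in registry order."""
    requested = frozenset(requested_surfaces)
    strict_subset = requested < ALL_SURFACES
    selected = []
    for spec in registry:
        if not spec.enabled:
            continue  # disabled by a config toggle
        if only is not None and spec.name not in only:
            continue  # excluded by an explicit analyzer subset
        if surface_scoped and strict_subset and not (spec.surfaces & requested):
            continue  # irrelevant to the requested surfaces
        selected.append(spec.name)
    return selected


# Illustrative registry; the surface assignments are assumptions.
REGISTRY = [
    AnalyzerSpec("permission_analyzer", surfaces=frozenset({"tools"})),
    AnalyzerSpec("prompt_injection", surfaces=frozenset({"tools", "prompts", "instructions"})),
    AnalyzerSpec("supply_chain"),
    AnalyzerSpec("jailbreak", enabled=False, surfaces=frozenset({"prompts"})),
]

if __name__ == "__main__":
    print(select_analyzers(REGISTRY, {"prompts"}))
```

With only the prompt surface requested, `supply_chain` drops out, which mirrors the `mcts scan-prompts` example in the text.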
| Step | Function | Purpose |
|---|---|---|
| Filter | `_apply_filters()` | `--tool-filter`, `--analyzer-filter`, `--severity-filter`, `--technique` |
| Metadata dedupe | `dedupe_metadata_findings()` | Collapse duplicate metadata hits |
| Sigma dedupe | `dedupe_sigma_findings()` | Collapse duplicate Sigma matches |
| Enrich | `enrich_findings()` | Attach `technique_id`, `mitigation_ids`, crosswalk evidence |
| Compliance | `ComplianceChecker.check()` | OWASP LLM + MCP meta-findings (non-scoring) |
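The first steps in the table (filter, dedupe, enrich) can be illustrated on plain dictionaries. This is a sketch under several assumptions: the dedupe key, the reading of the severity filter as a minimum severity, and the catalog entry `MCTS-T-EXAMPLE-1` are all placeholders and not taken from the MCTS source.

```python
SEVERITY_ORDER = ["low", "medium", "high", "critical"]

# Placeholder catalog; real technique IDs come from taxonomy/techniques.json.
EXAMPLE_CATALOG = {"prompt_injection": "MCTS-T-EXAMPLE-1"}


def apply_filters(findings, min_severity="low"):
    """Keep findings at or above a minimum severity (assumed filter semantics)."""
    floor = SEVERITY_ORDER.index(min_severity)
    return [f for f in findings if SEVERITY_ORDER.index(f["severity"]) >= floor]


def dedupe(findings):
    """Collapse findings sharing (analyzer, tool, evidence); first occurrence wins."""
    seen = set()
    unique = []
    for f in findings:
        key = (f["analyzer"], f.get("tool"), f.get("evidence"))
        if key not in seen:
            seen.add(key)
            unique.append(f)
    return unique


def enrich(findings, catalog):
    """Return copies with technique_id attached (None when the catalog has no entry)."""
    return [{**f, "technique_id": catalog.get(f["analyzer"])} for f in findings]


def post_process(findings, min_severity="low", catalog=EXAMPLE_CATALOG):
    """Run the steps in the same order as the table: filter, dedupe, enrich."""
    return enrich(dedupe(apply_filters(findings, min_severity)), catalog)


RAW = [
    {"analyzer": "prompt_injection", "severity": "high", "tool": "read_file", "evidence": "ignore previous"},
    {"analyzer": "prompt_injection", "severity": "high", "tool": "read_file", "evidence": "ignore previous"},
    {"analyzer": "tool_abuse", "severity": "low", "tool": "read_file", "evidence": "../"},
]

if __name__ == "__main__":
    for finding in post_process(RAW, min_severity="medium"):
        print(finding)
```

Each step returns a new list, so the input findings are never mutated and the order of steps stays easy to audit.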
Before scoring: attack_graph (with paths when chains ran) and scan_scope are set. Under v2/both, AttackChainAnalyzer always runs (whitelist/surface bypass).
- `RiskScoringEngine.score()` → legacy `ScoreBasis`; `verify()` regression guard (always).
- When `scoring_mode` is `v2` or `both`: `build_scoring_context()` → `RiskScoringEngineV2.score()` → optional `score_v2`; `verify()` on deterministic core.
Includes canonical attack_graph, optional score_v2, partitioned legacy score_breakdown, scan scope notes, and analyzers_executed audit list.
Optional: --save-baseline writes tool metadata snapshot for rug-pull detection on future scans.
Defined in mcp/models.py and reporting/models.py.
The single input every analyzer receives.
| Field | Purpose |
|---|---|
| `tools`, `prompts`, `resources` | MCP surfaces with schemas and text |
| `instructions` | Server system instructions when exposed |
| `instruction_sources` | Source paths for repo-discovered system instruction markdown |
| `agent_skills` | Discovered SKILL.md files (name, path, content) for `skill_md` |
| `source_files` | Path → content cache for SAST |
| `runtime_events` | Telemetry rows for runtime analyzers |
| `discovery_mode` | `static`, `live`, `merged`, `static+instruction-files`, `instruction-files`, etc. |
| `surface_scan` | Per-scan surface and MIME options |

Prompt entries:

| Field | Purpose |
|---|---|
| `name`, `description` | Prompt metadata or full markdown body when repo-discovered |
| `source_file`, `source_line` | Path to SKILL.md, `*prompt*.md`, etc. |
| `discovered_via` | `mcp`, `skill-md`, or `instruction-file` |

Tool entries:

| Field | Purpose |
|---|---|
| `name`, `description`, `input_schema` | Tool metadata |
| `handler_snippet`, `source_file`, `source_line` | Static SAST context |
| `capability` | `CapabilityProfile` for attack chains |

Finding:

| Field | Purpose |
|---|---|
| `analyzer` | Source check (e.g. `permission_analyzer`) |
| `severity` | `critical` / `high` / `medium` / `low` |
| `technique_id` | `MCTS-T-*` (after enrichment) |
| `evidence`, `location` | Proof and source line |
| `tool` | Related tool name when applicable |

ScanReport:

| Field | Purpose |
|---|---|
| `findings`, `summary` | All issues and severity counts |
| `score` | Overall score + auditable basis |
| `server` | Full discovery snapshot |
| `attack_graph` | Capability-graph paths for UI |
| `analyzers_executed` | Which checks ran |
Full field lists: source models in reporting/models.py.
Turns a filesystem path, live process, or JSON file into MCPServerInfo.
| Module | Role |
|---|---|
| `static.py` | Python `@tool` AST, schemas, handler snippets |
| `static_js.py` | TS/JS `registerTool` / handler patterns |
| `static_merge.py`, `static_runner.py` | Multi-language repo walks |
| `live.py`, `live_config.py` | Stdio / config-based live probe |
| `static_json.py` | Air-gapped snapshot load |
| `merge.py` | Static + live merge |
| `env_expand.py`, `json5_util.py` | IDE config parsing |
Guides: Live · Remote · Snapshot · TypeScript
Transport, consent, and event extraction for live modes.
| Module | Role |
|---|---|
| `session.py` | Async stdio MCP session |
| `http_session.py` | Streamable HTTP / SSE |
| `auth.py` | Bearer, headers, OAuth client credentials |
| `consent.py` | `--i-understand-live-risk` / `MCTS_LIVE_OK` |
| `behavioral.py`, `events.py` | Runtime event rows for analyzers |
| `protocol_checks.py` | Active MCPS HTTP checks |
The Scanner builds the analyzer list in _build_analyzers() and runs the lifecycle above. Filters and technique mode are applied after the analyzers complete.
See Analyzers below.
| Asset | Role |
|---|---|
| `techniques.json` | MCTS-T catalog |
| `mapper.py` | `enrich_findings()` |
| `crosswalk.json` | External framework IDs in evidence |
| `sigma/metadata_rules.json` | Bundled metadata Sigma rules |
Legacy exponential decay (engine.py); v2 multi-factor engine (engine_v2.py, graph.py, chains.py, packaged corpus stats). Compliance excluded from both; attack_chains meta-rows excluded from v2 sum. Details: Scoring spec · Scoring v2.
| Output | Path |
|---|---|
| JSON / SARIF | reporting/models.py, reporting/sarif.py |
| Terminal | ui/dashboard.py, themes in ui/theme.py |
| HTML | report/generators/html_report.py |
All analyzers implement BaseAnalyzer.analyze(server: MCPServerInfo) -> list[Finding] (analyzers/base.py).
Registered in Scanner._build_analyzers(). User-facing descriptions: Security checks.
Run unless disabled by _is_enabled() or --analyzers subset.
| Key | Focus |
|---|---|
| `permission_analyzer` | Destructive / high-risk tool names and descriptions |
| `metadata_integrity` | Description poisoning patterns |
| `prompt_injection` | Injection heuristics in metadata |
| `tool_shadowing` | Duplicate tool names within server |
| `line_jumping` | Context precedence attacks |
| `tool_abuse` | Path traversal in metadata |
| `schema_surface` | Full schema poisoning (FSP) |
| `data_leakage` | Secrets in source + metadata |
| `command_execution` | Shell/exec in handlers |
| `path_validation` | Missing path checks |
| `runtime_events` | Telemetry cluster (delegates to sub-detectors) |
| `sigma_metadata` | Bundled + custom Sigma YAML |
| `oauth_config` | OAuth misconfiguration |
| `supply_chain` | Dependency posture heuristics |
| `skill_md` | Agent SKILL.md patterns (W007–W014) when repo skills discovered |

| Key | Toggle | Focus |
|---|---|---|
| `surface_metadata` | `enable_surface_metadata` (default on) | Multi-surface poisoning |
| `prompt_defense` | `enable_prompt_defense` (default on) | Missing defensive language |
| `skill_md` | always registered | SKILL.md W007–W014 when `agent_skills` populated |
| `behavioral_static` | `enable_behavioral_static` (default on) | Description vs handler + taint |
| `jailbreak` | `enable_jailbreak` (default on) | Manipulation resistance |
| `attack_chains` | `enable_attack_chains` (default on) | Capability-graph BFS |

| Key | Enabled when |
|---|---|
| `metadata_diff` | `--baseline` provided |
| `embedding_secrets` | `--semantic-secrets` |
| `cross_server` | Inventory with ≥2 servers (e.g. `inventory --scan`) |
| `toxic_flows` | Inventory with ≥2 servers + `--full-toxic-flows` |

| Key | Flag | Notes |
|---|---|---|
| `vulnerable_package` | `--pip-audit` | Requires `supplychain` extra |
| `npm_audit` | `--npm-audit` | `npm audit` subprocess |
| `yara_metadata` | `--yara` | Requires `yara` extra |
| `llm_judge` | `--llm-judge` | Requires `MCTS_LLM_API_KEY` |
| `llm_metadata_triage` | `--llm-triage` | malicious / safe / suspect |
| `semgrep_sast` | `--semgrep` | Requires `semgrep` CLI on PATH |
| `cloud_inspect` | `--cloud-inspect` | Requires cloud API key |
| `virustotal` | `--virustotal` | Hash lookup |
analyzers/surfaces.py defines ScanSurface for tools, prompts, resources, and instructions. Surface-aware analyzers iterate via scan_surfaces(server). Repo-discovered markdown is loaded into prompts / instructions before surface iteration, so prompt_injection, jailbreak, and prompt_defense analyze real file content — not only MCP prompts/list from live probes.
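As a rough picture of surface iteration, the sketch below flattens a server snapshot into one stream of text-bearing surfaces. It is not the `scan_surfaces()` implementation: the `Surface` dataclass, the dictionary-shaped snapshot, and the function name `scan_surfaces_sketch` are all stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass(frozen=True)
class Surface:
    """Hypothetical stand-in for ScanSurface."""
    kind: str  # "tool", "prompt", "resource" or "instruction"
    name: str
    text: str


def scan_surfaces_sketch(server: dict) -> Iterator[Surface]:
    """Yield every text-bearing surface of a snapshot in a fixed order."""
    for kind, key in (("tool", "tools"), ("prompt", "prompts"), ("resource", "resources")):
        for item in server.get(key, []):
            yield Surface(kind, item["name"], item.get("description", ""))
    if server.get("instructions"):
        yield Surface("instruction", "instructions", server["instructions"])


SNAPSHOT = {
    "tools": [{"name": "read_file", "description": "Read a file from disk"}],
    # A repo-discovered prompt carries the markdown body as its description.
    "prompts": [{"name": "SKILL.md", "description": "# Deploy skill\nAlways confirm first."}],
    "instructions": "You are a careful assistant.",
}

if __name__ == "__main__":
    for surface in scan_surfaces_sketch(SNAPSHOT):
        print(surface.kind, surface.name)
```

Because repo-discovered markdown is already present in the snapshot, an analyzer written against this stream sees file content and live MCP prompts the same way.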
Used by behavioral_static. Python AST taint + optional tree-sitter for TS/Go/Rust. Semgrep rules in sast/semgrep/rules/. Install deep parsers: uv sync --extra sast.
RuntimeEventsAnalyzer routes telemetry rows to focused modules (e.g. rug_pull.py, command_injection.py, tool_redefinition.py). Technique mapping: Threat taxonomy and tests/fixtures/regression/MCTS-T-*/.
capability/inferrer.py assigns per-tool flags (reads_untrusted_input, egresses_network, executes_commands, …). BFS finds paths like read → exfiltrate. Graph stored on ScanReport.attack_graph.
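A breadth-first search over capability flags can be sketched as follows. Only the flag names `reads_untrusted_input` and `egresses_network` come from the text; `ToolCaps`, `find_read_to_exfil_paths`, the edge list, and the `max_hops` limit are assumptions made for this example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCaps:
    """Hypothetical stand-in for a per-tool capability profile."""
    name: str
    reads_untrusted_input: bool = False
    egresses_network: bool = False


def find_read_to_exfil_paths(tools, edges, max_hops=3):
    """BFS from every reading tool to an egress tool along directed edges."""
    caps = {t.name: t for t in tools}
    adjacency = {}
    for src, dst in edges:
        adjacency.setdefault(src, []).append(dst)

    paths = []
    for start in (t.name for t in tools if t.reads_untrusted_input):
        queue = deque([[start]])
        while queue:
            path = queue.popleft()
            node = path[-1]
            if len(path) > 1 and node in caps and caps[node].egresses_network:
                paths.append(path)
                continue  # stop at the first sink on this branch
            if len(path) - 1 >= max_hops:
                continue  # hop budget exhausted
            for nxt in adjacency.get(node, []):
                if nxt not in path:  # avoid cycles
                    queue.append(path + [nxt])
    return paths


TOOLS = [
    ToolCaps("read_file", reads_untrusted_input=True),
    ToolCaps("get_env"),
    ToolCaps("send_webhook", egresses_network=True),
]
EDGES = [("read_file", "get_env"), ("get_env", "send_webhook"), ("read_file", "send_webhook")]

if __name__ == "__main__":
    for p in find_read_to_exfil_paths(TOOLS, EDGES):
        print(" -> ".join(p))
```

BFS visits shorter paths first, so the direct read → exfiltrate chain is reported before the three-tool variant.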
When scoring_mode is v2 or both, paths are built at scan time via scoring/graph.build_paths() and stored on the canonical graph:
{
"nodes": [{"id": "read_file", "label": "read_file", "type": "tool"}],
"edges": [{"from": "read_file", "to": "send_webhook", "label": "read→exfil"}],
"paths": [{
"id": "path-chain-credential-theft-2",
"nodes": ["read_file", "get_env", "send_webhook"],
"tools_on_path": ["read_file", "get_env", "send_webhook"],
"hop_count": 2,
"finding_ids": ["chain-credential-theft"]
}]
}

hop_count is validated edge hops only (len(nodes) - 1). Scanner, v2 engine, and HTML dashboard all use canonical_attack_graph(report) (invariant I3/I11).
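The hop-count rule is simple enough to check by hand. The helper below is a hypothetical consistency check, not part of MCTS; it only encodes the stated relation `hop_count == len(nodes) - 1` for a single path entry.

```python
def validate_path(path: dict) -> list[str]:
    """Return a list of problems found in one attack-graph path entry."""
    problems = []
    nodes = path.get("nodes", [])
    if len(nodes) < 2:
        problems.append("path needs at least two nodes")
    expected_hops = max(len(nodes) - 1, 0)
    if path.get("hop_count") != expected_hops:
        problems.append(
            f"hop_count {path.get('hop_count')} != len(nodes) - 1 ({expected_hops})"
        )
    return problems


# The path entry from the example graph above.
SAMPLE_PATH = {
    "id": "path-chain-credential-theft-2",
    "nodes": ["read_file", "get_env", "send_webhook"],
    "tools_on_path": ["read_file", "get_env", "send_webhook"],
    "hop_count": 2,
    "finding_ids": ["chain-credential-theft"],
}

if __name__ == "__main__":
    print(validate_path(SAMPLE_PATH) or "ok")
```

Three nodes give two hops, so the sample path passes.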
Always runs. Populates ScanReport.score (invariant I1).
| Metric | Formula | Notes |
|---|---|---|
| Raw risk | C×25 + H×10 + M×3 + L×1 | Linear weighted sum |
| Overall score | `round(100 × e^(-raw/50))` | Higher is better |
| Risk index | `min(100, raw_risk)` | Higher is worse |
compliance analyzer findings are informational only — they do not affect legacy or v2 sums.
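The legacy formulas translate directly into a few lines of Python. This is a worked sketch of the table, not the code in `engine.py`: the function name, the dictionary-shaped findings, and the use of the analyzer key `compliance` to mark non-scoring rows are assumptions.

```python
import math

WEIGHTS = {"critical": 25, "high": 10, "medium": 3, "low": 1}
# Assumed analyzer key for compliance meta-findings, which never score.
NON_SCORING_ANALYZERS = {"compliance"}


def legacy_score(findings):
    """Compute raw risk, overall score, and risk index from severity counts."""
    scored = [f for f in findings if f["analyzer"] not in NON_SCORING_ANALYZERS]
    raw = sum(WEIGHTS[f["severity"]] for f in scored)
    return {
        "raw_risk": raw,
        "overall": round(100 * math.exp(-raw / 50)),  # higher is better
        "risk_index": min(100, raw),                  # higher is worse
    }


if __name__ == "__main__":
    one_critical = [{"analyzer": "command_execution", "severity": "critical"}]
    print(legacy_score(one_critical))
```

A single critical finding gives a raw risk of 25 and an overall score of round(100 × e^(-0.5)) = 61, so the exponential decay penalizes the first findings most heavily and flattens out as raw risk grows.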
Runs when scoring_mode is v2 or both (default). Populates ScanReport.score_v2.
Pipeline order (PR-1e): analyzers → compliance → attack graph + scan scope → legacy score → build_scoring_context() → v2 score. Canonical graph stored on report (I11).
| Output | Notes |
|---|---|
| `absolute_risk` | Multi-factor bracket sum × `chain_factor` on tool-attributed findings |
| `security_score` | Corpus percentile (packaged `scoring_v2_corpus_stats.json`) |
| `dimension_scores` | Eight RFC factor axes for radar chart |
| `top_contributors` | Finding + attack-chain explainability rows |
| `category_scores_v2` | OWASP tiles (100=good), separate from legacy categories |
attack_chains meta-findings appear in the report but are excluded from v2 sum (NON_SCORING_V2). Chain signal is chain_factor on tool rows via scoring/chains.py and scoring/graph.py.
Gates: governance/scan_gates.py (CLI exit codes + API gate_violations). Docs: Scoring developer guide · v2 spec.
- Terminal: Rich dashboard (`ui/`); legacy + v2 lines when `both`
- JSON: full `ScanReport` with optional `score_v2`
- SARIF: `--format sarif`; run-level `mcts/scoreV2` when v2 present
- HTML: `mcts report` executive dashboard with v2 primary header
These share discovery/models but use separate entry paths:
| Command | Module | Role |
|---|---|---|
| `mcts fuzz` | `fuzz/` | Protocol probes → `runtime_events` JSON |
| `mcts inventory` | `inventory/` | Client config discovery; feeds cross-server / toxic-flow analyzers |
| `mcts vet` | `vet/` | Pre-install PyPI/npm/OCI checks |
| `mcts pentest` | `pentest/` | Structured recon + attack chains; `absolute_risk` + v2 `risk_level` verdict when v2/both |
| `mcts readiness` | `readiness/` | HEUR-001–020 (separate from security score) |
| `mcts serve` | `api/` | REST wrapper around `Scanner` |
src/mcts/
├── cli/ # Typer commands and flag wiring
├── core/ # Scanner, ScanConfig, target resolution
├── mcp/ # MCPServerInfo models, MCPClient.discover()
├── discovery/ # Static, live, snapshot, merge
├── probe/ # Live transport, consent, behavioral events
├── analyzers/ # Security checks (subclass BaseAnalyzer)
├── sast/ # Taint analysis + Semgrep rule pack
├── capability/ # Tool capability profiles
├── scoring/ # engine.py (v1), engine_v2.py, graph.py, chains.py, corpus stats
├── compliance/ # OWASP mapping (non-scoring)
├── taxonomy/ # MCTS-T/M, Sigma, crosswalk, enrichment
├── reporting/ # Pydantic models, SARIF
├── report/ # HTML dashboard templates
├── ui/ # Terminal Rich UI
├── inventory/ # Client config + skills discovery
├── fuzz/ # Fuzz runner
├── vet/ # Package vetting
├── pentest/ # Pentest phases
├── governance/ # policy.py, scan_gates.py (legacy + v2 YAML/CLI gates)
├── readiness/ # Production heuristics + OPA
├── api/ # FastAPI (mcts serve)
├── mcp_server/ # mcts-mcp stdio tools
├── output/ # Analysis dir, history, artifacts
└── testing/ # Regression harness
- Create `src/mcts/analyzers/your_check.py`:

      from mcts.analyzers.base import BaseAnalyzer
      from mcts.mcp.models import MCPServerInfo
      from mcts.reporting.models import Finding, Severity


      class YourAnalyzer(BaseAnalyzer):
          name = "your_check"

          def analyze(self, server: MCPServerInfo) -> list[Finding]:
              ...

- Register in `Scanner._build_analyzers()` (`core/scanner.py`).
- Add CLI flag in `cli/main.py` + field on `ScanConfig` if opt-in.
- Add tests in `tests/`; regression fixture under `tests/fixtures/regression/MCTS-T-*/` when adding a technique.
- Document in Security checks.
If the check needs inventory context, pass inventory= like OAuthConfigAnalyzer.
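To show what a filled-in `analyze()` might look like, here is a self-contained sketch that runs without MCTS installed. The classes below are local stand-ins, not the real `BaseAnalyzer`, `MCPServerInfo`, or `Finding`; only the field names (`analyzer`, `severity`, `tool`, `evidence`) follow the model tables, and the zero-width-character check itself is a made-up example.

```python
from dataclasses import dataclass, field


# Local stand-ins; the real classes live in mcts.analyzers.base,
# mcts.mcp.models and mcts.reporting.models and may differ in shape.
@dataclass
class Tool:
    name: str
    description: str = ""


@dataclass
class ServerInfo:
    tools: list[Tool] = field(default_factory=list)


@dataclass
class Finding:
    analyzer: str
    severity: str
    tool: str
    evidence: str


class BaseAnalyzer:
    name = "base"

    def analyze(self, server: ServerInfo) -> list[Finding]:
        raise NotImplementedError


class HiddenUnicodeAnalyzer(BaseAnalyzer):
    """Hypothetical check: flag zero-width characters in tool descriptions."""

    name = "hidden_unicode"
    ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060"}

    def analyze(self, server: ServerInfo) -> list[Finding]:
        findings = []
        for tool in server.tools:
            # Collect the distinct code points found, e.g. "U+200B".
            hits = sorted({f"U+{ord(ch):04X}" for ch in tool.description if ch in self.ZERO_WIDTH})
            if hits:
                findings.append(Finding(self.name, "medium", tool.name, ", ".join(hits)))
        return findings


if __name__ == "__main__":
    server = ServerInfo(tools=[Tool("ok", "Reads a file"), Tool("odd", "Reads\u200b a file")])
    print(HiddenUnicodeAnalyzer().analyze(server))
```

The analyzer holds no state between calls and returns an empty list when nothing matches, which keeps it easy to cover with a regression fixture.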
To add a runtime sub-detector, add a module under analyzers/ and wire it from RuntimeEventsAnalyzer (runtime_events.py) for the relevant event types.
To support another language in static discovery, extend discovery/static_runner.py and add a parser module (see static_js.py pattern).
To add a Semgrep rule, add YAML under sast/semgrep/rules/ and map metadata.technique_id to MCTS-T.
Contributor quick start: CONTRIBUTING.md
| Symptom | Where to look |
|---|---|
| No tools discovered | Discovery logs; try --auto; check --languages, exclude dirs |
| Score seems wrong | Legacy: score.basis in JSON. v2: score_v2.basis + top_contributors. Compliance non-scoring; attack_chains meta-rows excluded from v2 only. Dual scores diverging is expected — see Scoring developer guide. |
| Analyzer missing from report | analyzers_executed on ScanReport; check --analyzers subset and opt-in flags |
| Live scan incomplete | discovery_warnings → live_discovery findings; --strict-live |
| False positive | Analyzer module + fixture in tests/fixtures/regression/ |
| Technique ID missing | taxonomy/mapper.py catalog + enrich_findings() |
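The report fields named in the table can be pulled out with a short script. The sketch below assumes only the JSON keys mentioned in this document (`analyzers_executed`, `score.overall`, `score_v2`, `findings[].analyzer`); `summarize_report` and the sample data are hypothetical.

```python
import json
from collections import Counter


def load_report(path: str) -> dict:
    """Read a JSON report written by a scan (e.g. with -o /tmp/report.json)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def summarize_report(report: dict) -> dict:
    """Collect the fields most useful when a scan result looks wrong."""
    findings = report.get("findings", [])
    return {
        "analyzers_executed": report.get("analyzers_executed", []),
        "legacy_overall": report.get("score", {}).get("overall"),
        "has_score_v2": report.get("score_v2") is not None,
        "by_analyzer": dict(Counter(f.get("analyzer") for f in findings)),
    }


# Made-up sample so the sketch runs without a real report file.
SAMPLE_REPORT = {
    "analyzers_executed": ["prompt_injection", "compliance"],
    "score": {"overall": 61},
    "findings": [
        {"analyzer": "prompt_injection", "severity": "high"},
        {"analyzer": "prompt_injection", "severity": "low"},
        {"analyzer": "compliance", "severity": "medium"},
    ],
}

if __name__ == "__main__":
    print(summarize_report(SAMPLE_REPORT))
```

For a real scan, pass the result of `load_report("/tmp/report.json")` to `summarize_report` instead of the sample.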
Useful commands:
uv run mcts scan ./server.py -o /tmp/report.json
uv run python -c "import json; r=json.load(open('/tmp/report.json')); print(r['analyzers_executed'])"
uv run pytest tests/test_scanner.py -q
uv run pytest tests/fixtures/regression/ -q  # if applicable

- Security checks reference: what each analyzer looks for
- Scoring specification (legacy)
- Scoring v2 · Migration
- Threat taxonomy
- CLI reference
- CONTRIBUTING.md