[FEATURE] Findings trust layer — validator, display severity, and honest attack-chain evidence (Phase 0) #258

@hello-args

Description

Summary

Add a findings trust layer so that MCTS exposes honest, independently verifiable findings. The layer consists of parallel display_severity fields, a central validate_findings() post-scan step, a dual summary (template + display), and graph/path honesty for attack chains. Phase A must not break legacy CI (--fail-on-critical).

Problem

Engineers cannot trust or act on findings without reverse-engineering heuristics. The sa-mcp-server pattern illustrates the gap:

  • Single-tool capability overlap is labeled CRITICAL — Credential theft chain
  • evidence_tags.py fabricates hop_count: 1 / path: read_tools when no graph path exists (enrich_graph_dependent_evidence, lines 201–203)
  • Finding.severity is load-bearing in 23+ consumers (CI gates, SARIF, dashboard, history, scoring, compliance) — title-only fixes do not help
  • No provenance upstream (capability/inferrer.py returns booleans only; attack chains emit tool name lists without location)

Trust = accuracy + explainability + consistency. Today MCTS exposes conclusions without verifiable facts.

Proposed solution

Implement Phase 0 (attack chain honesty) in a 10-PR train. Core pattern:

Parallel display fields + central validate_findings() + consumer migration in strict order

Phase A (display trust) — ship first:

| PR | Delivers |
|----|----------|
| 1 | Optional Finding fields (display_severity, evidence_type, impact, evidence_strength, …) + reporting/display.py helpers; behavior-neutral |
| 2 | reporting/finding_validator.py + scanner hook after enrich_scoring_evidence; reorder pipeline to enrich → validate → compliance → score; findings_trust_mode config |
| 3 | Graph honesty: remove fake hop/path fallback; path_status: unproven; skip self-loop edges in _build_graph_from_chain_findings; suppress overlap-only v2 attack_chain top contributor |
| 4 | Dual summary: keep report.summary (template) for legacy CI; add report.display_summary from effective_severity() |
| 5 | Dashboard badges (HTML dashboard.js + terminal ui/dashboard.py) on display_severity |
| 6 | examples/single-tool-agent-server/ regression fixture (sa-mcp-server overlap pattern) |
| 7 | SARIF level from display_severity when trust on |
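
A rough sketch of what PR 3 (graph honesty) could look like. The function names `graph_evidence` and `chain_edges`, the `"proven"` status value, and the idea that edges are built from consecutive tool names are assumptions for illustration only; the real logic lives in enrich_graph_dependent_evidence and _build_graph_from_chain_findings and may be shaped differently.

```python
from typing import Dict, List, Optional, Tuple


def graph_evidence(path: Optional[List[str]]) -> Dict[str, object]:
    """Emit hop/path tags only when a real graph path exists.

    Hypothetical helper: instead of falling back to hop_count: 1 /
    path: read_tools, an absent path is reported as path_status: unproven.
    """
    if not path or len(path) < 2:
        return {"path_status": "unproven"}
    return {
        "path_status": "proven",  # assumed counterpart to "unproven"
        "hop_count": len(path) - 1,
        "path": list(path),
    }


def chain_edges(tools: List[str]) -> List[Tuple[str, str]]:
    """Build edges between consecutive tools, skipping self-loops."""
    edges = []
    for src, dst in zip(tools, tools[1:]):
        if src == dst:
            # A single tool overlapping with itself is not a chain hop.
            continue
        edges.append((src, dst))
    return edges
```

With this shape, a single-tool server produces no edges and an `unproven` path status, so downstream consumers cannot mistake a capability overlap for a demonstrated chain.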

Phase A invariants (non-negotiable):

  • Do not mutate finding.severity until Phase B — preserves RiskScoringEngine.verify() and --fail-on-critical default
  • Validator is the single mutation point for display fields
  • Stable finding.id (e.g. chain-credential-theft); rewrite display titles only
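
The invariants above might translate into a data model along these lines. This is a minimal sketch, not the actual Finding class: the field set is reduced, `display_title` and the `trust_on` parameter are assumed names, and the real effective_severity() signature may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    id: str                                   # stable, e.g. "chain-credential-theft"
    title: str
    severity: str                             # template severity; untouched in Phase A
    display_severity: Optional[str] = None    # written only by the validator
    display_title: Optional[str] = None       # assumed name for the rewritten title
    evidence_type: Optional[str] = None


def effective_severity(finding: Finding, trust_on: bool = True) -> str:
    """Prefer the display severity when trust mode is on and it is set."""
    if trust_on and finding.display_severity is not None:
        return finding.display_severity
    return finding.severity
```

Because the new fields default to None, a finding that the validator never touches behaves exactly as before, which is what keeps PR 1 behavior-neutral.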

Target scanner order:

dedupe → enrich_findings → enrich_scoring_evidence → validate_findings()
  → _apply_filters (post-validator) → compliance (effective_severity) → score → dual summary
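
The last step of this order, the dual summary, could be sketched as follows. Findings are modeled as plain dicts and `build_summaries` is a hypothetical name; only report.summary, report.display_summary, and effective_severity() come from the proposal itself.

```python
from collections import Counter
from typing import Dict, List, Tuple


def effective_severity(finding: dict) -> str:
    """Display severity if the validator set one, else template severity."""
    return finding.get("display_severity") or finding["severity"]


def build_summaries(findings: List[dict]) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Return (summary, display_summary) severity counts.

    summary counts template severities and feeds legacy CI gates;
    display_summary counts effective severities for the dashboard.
    """
    summary = Counter(f["severity"] for f in findings)
    display_summary = Counter(effective_severity(f) for f in findings)
    return dict(summary), dict(display_summary)
```

Keeping the two counts separate is what lets --fail-on-critical continue to see the template critical count while the dashboard shows the capped one.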

Later phases (separate issues/PRs): Phase 1 provenance (CapabilitySignal + signals[]), Phase 2 --ci-trust gates, Phase B scoring migration.

Alternatives considered

| Alternative | Why rejected |
|-------------|--------------|
| Rename attack chain titles only | CI, SARIF, summary.critical unchanged; fixes marketing, not trust |
| Mutate finding.severity on day 1 | Breaks corpus Spearman, verify(), GitHub Action defaults without opt-in |
| Gate in dashboard.js only | evaluate_scan_gate_violations is Python-only |
| Facts on attack chains without inferrer changes | Chains copy tool signals; no location/rule provenance otherwise |
| One big PR | Too risky across 23 consumers; 10-PR train with gates |

Component (suggested)

component:reporting (primary), component:ui, component:ci

Priority (suggested)

priority:P1 — high value; should land soon

Adoption bottleneck is trust/explainability, not detection recall. Phase 0 alone targets ~50% of related complaints.

Acceptance criteria

Phase 0 done when:

  • validate_findings() caps overlap chains: evidence_type: capability_overlap, display_severity ≤ medium, honest display title (no "theft" / "attack chain" without "potential overlap")
  • finding.severity unchanged in Phase A; RiskScoringEngine.verify() still passes
  • report.summary remains template counts; display_summary populated for dashboard
  • No fake hop_count: 1 / path: read_tools fallback in enrich_graph_dependent_evidence
  • examples/single-tool-agent-server/ added; scan yields zero display CRITICAL overlap chains
  • tests/reporting/test_finding_validator.py covers overlap cap, warn/enforce modes
  • Executive summary (build_executive_summary) does not alarm on overlap-only chains
  • --fail-on-critical / Action default behavior unchanged unless --ci-trust (Phase 2)
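
The overlap cap in the first criterion could be prototyped roughly as below. Several things here are assumptions rather than the planned implementation: findings are plain dicts, overlap-only chains are detected by a `tools` list with at most one distinct tool, `warn` mode only returns messages while `enforce` mode writes display fields, and the `display_title` wording is illustrative.

```python
from typing import List

SEVERITY_RANK = {"info": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}


def cap_severity(severity: str, ceiling: str = "medium") -> str:
    """Return severity, lowered to ceiling if it ranks above it."""
    if SEVERITY_RANK[severity] <= SEVERITY_RANK[ceiling]:
        return severity
    return ceiling


def is_overlap_only(finding: dict) -> bool:
    """Assumed heuristic: a chain finding backed by a single distinct tool."""
    return finding["id"].startswith("chain-") and len(set(finding.get("tools", []))) <= 1


def validate_findings(findings: List[dict], mode: str = "enforce") -> List[str]:
    """Cap overlap-only chains; never touches finding['severity']."""
    warnings = []
    for f in findings:
        if not is_overlap_only(f):
            continue
        capped = cap_severity(f["severity"])
        if capped != f["severity"]:
            warnings.append(f"{f['id']}: overlap-only chain rated {f['severity']}")
        if mode == "enforce":
            f["evidence_type"] = "capability_overlap"
            f["display_severity"] = capped
            f["display_title"] = f"Potential overlap: {f['title']}"
    return warnings
```

A test in tests/reporting/test_finding_validator.py could then assert that a single-tool `chain-credential-theft` finding keeps `severity == "critical"` while its `display_severity` becomes `"medium"` in enforce mode, and that warn mode leaves the finding untouched.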

Code touchpoints (verified 2026-06-13):

  • src/mcts/core/scanner.py (~line 220, after enrich_scoring_evidence)
  • src/mcts/scoring/evidence_tags.py (fake path fallback)
  • src/mcts/scoring/graph.py (canonical_attack_graph_from_scan, _build_graph_from_chain_findings)
  • src/mcts/analyzers/attack_chains.py
  • src/mcts/compliance/checks.py:106 (multiple-critical uses template severity today)
  • src/mcts/report/data.py, report/assets/dashboard.js, ui/dashboard.py

Contribution

  • I am willing to open a PR for this (comment on the issue to claim it)

References

  • User-facing guide (draft): docs/reporting/interpreting-findings.md
  • Scoring gaps: docs/scoring_improvements/scoring-v2-relevant-gaps.md (if published) / local/scoring_improvements/scoring-v2-relevant-gaps.md
  • Related: attack chains forced when scoring_mode in {v2, both} (scanner.py:302–315); v2 excludes chain meta via NON_SCORING_V2
