fix: resolve self-scan CI-02 findings; baseline accepted detector complexity#24
Merged
Conversation
The native detector lane scanned bundled third-party code (pip/_vendor, vendor/, third_party/, site-packages). On pip, 280 of 362 native findings (77%) came from _vendor/ — code the project ships but does not own and cannot fix. Add _vendor, vendor, vendored, third_party, third-party, site-packages, bower_components to DEFAULT_GENERATED_PATH_SEGMENTS. After the fix pip reports 74 findings, 0 from _vendor, while its own code is still scanned. This is the native twin of the eslint built-JS fix: both lanes were scanning code that is not the project's own source. Locked by test_vendored_third_party_code_is_not_scanned; sample-report mirrors and ACI_EVIDENCE.md (§2i) updated. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…policy §2h (eslint) and §2i (native) were the same defect — scanning code that is not the project's own source — fixed twice because each lane kept its own skip set and they had drifted apart. The borrowed-analyzer skip sets were each missing ten segments the native policy already had, including every vendored one (_vendor, vendor, vendored, third_party, site-packages, bower_components). The report already dropped these (external findings are bounded to the native-discovered file set), but the borrowed tools still processed the code: on a pip-scale _vendor/ tree, handing hundreds of bundled files to mypy can exhaust the per-analyzer timeout and starve the project's own code of findings. Derive _PYTHON_ANALYZER_SKIP_SEGMENTS / _SEMGREP_SKIP_SEGMENTS / _ESLINT_SKIP_SEGMENTS from DEFAULT_GENERATED_PATH_SEGMENTS plus lane- specific extras, so they cannot drift again. Add a drift-guard test and a vendored file-discovery test; ACI_EVIDENCE.md §2j records the root cause. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Reviewing a branch that deletes files reports the deleted path in git's changed set while it is gone from the working tree. Verified by a real git rename+delete probe that the scan handles this gracefully (the deleted file is simply absent from the rglob'd target set); this test locks that guarantee so a future refactor cannot reintroduce a crash or a phantom finding for a vanished file. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The operations baseline is the backbone of the "only new issues" review, and the source of three earlier defects (all from encoding a finding's identity by line number). Verified by a real scan->baseline->edit->rescan probe that the workflow is sound: - existing findings keep their identity after an unrelated line shift (fingerprint match short-circuits the stale line in the baseline entry), - a genuinely new finding is flagged new, - a fixed finding is reported resolved. This end-to-end test ties fingerprint stability to operations matching in one guarantee, where the prior tests covered each half separately. The waiver path (active-waiver -> accepted/accepted-residual) was probed too and is already covered by existing tests. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…scan crash ACI ingests JSON/SARIF from 13 external tools. The parse sites caught only json.JSONDecodeError, so output that was well-formed but shape-shifted (a renamed key, a null where a table was, a list that became an object -- the ordinary result of a tool version bump) raised KeyError/TypeError/ AttributeError instead. That escaped the parse-failure path and aborted the whole scan, taking the native findings and every other analyzer down with one borrowed tool's drift. Verified: raw ruff/eslint/trivy/osv output in a drifted shape, plus codeql SARIF with a wrong-typed `runs` and a gitleaks report with a wrong-typed field, all crashed before the fix. Broaden the three parse sites (stdout JSON, gitleaks report, codeql SARIF) to a shared _ANALYZER_PARSE_ERRORS set so any structural mismatch becomes a recorded parse-failure for that one analyzer while the scan continues. One definition, used at all three sites, so the lanes cannot drift apart. Regression tests cover the drift cases and confirm valid output still parses. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
ACI's purpose includes making users aware that even its declared checks are not 100% -- a clean result does not prove the absence of issues. That disclosure already rides in the JSON report, but the GitHub PR summary, the surface where a reviewer actually concludes "ACI passed, the code is clean," omitted it. A zero-finding pass is exactly where that false confidence is highest, yet the summary said only "Gate: pass / Findings: 0." Add the disclosure as a footer to build_github_summary_markdown, sourced from the report's own detection_disclosure key (single source of truth), so the honest bound travels to the point of use. Reports without the key still render. This serves the "make the bound explicit, at the point of use" half of ACI's purpose, which until now lived only in docs and the machine report. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… report Adopting ACI on an existing codebase means accepting today's findings as pre-existing so future scans surface only NEW ones. That baseline was hand-authored TOML -- run a scan, read the JSON, copy each fingerprint by hand. For the tool's central adoption workflow, the first step was manual TOML editing, which is the opposite of "easy to use." Add `aci emit-baseline --report report.json --output ops.toml`, following the existing emit-sarif/emit-annotations/emit-github-summary pattern (report in, artifact out, same --report-scope-class/--report-owner-lane filters). The whole adoption loop is now two commands: aci scan --target . --output report.json aci emit-baseline --report report.json --output ops.toml aci scan --target . --operations-file ops.toml --fail-on-new-findings Each entry is keyed by the finding's fingerprint (stable across line shifts), never the line number -- encoding identity by line was the root of three earlier defects. Output is sorted for a clean diff on regeneration, and the TOML is escaped so paths with metacharacters still parse. Verified end to end: generate -> load_operations_state -> re-scan marks findings existing-baseline. QUICKSTART and CONFIGURATION updated to lead with the generated baseline. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ooding) Found by running ACI on its own source: enabling external analyzers DROPPED native findings. `--profile full` (native only) reported 8 findings incl. 2 CI02_LONG_FUNCTION; the same scan with external analyzers reported 6, the long-function findings gone. Root cause: ACI pre-emptively suppressed five native signals (CI02_LONG_FUNCTION, CI03_TODO_HACK, CI18_PARAMETER_CLUSTER, CI21_BROAD_EXCEPTION_SWALLOW, CI26_RACE_HAZARD) whenever ruff was merely *ready*, on the assumption ruff covered them. But ACI invokes ruff with its default rule set, which emits none of C901/PLR0915/PLR0913/BLE001/TD — so the delegation suppressed native coverage with nothing replacing it. Turning on more analysis silently produced fewer findings. The post-hoc cross-lane dedup (_deduplicate_findings) is the correct and already-tested mechanism: it removes a native finding only where the external lane actually reported the same (ci_id, file, line). Remove the pre-emptive _delegated_native_signals / _RUFF_DELEGATED_SIGNALS machinery and rely on the dedup. Native now always runs; ruff dedupes only what it genuinely emits. Verified by dogfooding: ACI's self-scan with external analyzers now matches native-only (8 findings incl. both CI02_LONG_FUNCTION). Rewrote the test that encoded the buggy behavior; CI_REFERENCE wording tightened. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… own code Dogfooding follow-up: a self-scan's CI-05 (copy-paste) findings pointed at three small report-reading helpers -- _report_map, _scope_class, and _gate_scope_classes -- copy-pasted byte-identical across the SARIF, annotations, GitHub-summary, and report-view emitters "without a shared abstraction." Add that shared abstraction, aci_report_helpers, and import it (aliased to the former private names) from the four emitters instead of redefining it. This is ACI acting on its own true-positive finding: the self-scan's CI-05 count drops from 4 to 2, with no behavior change (full suite, ruff, mypy all green). The two remaining CI-05 findings are a different, rename-variant pair in the taint detectors (deferred -- it needs semantic, not just structural, equivalence checking) and a residual structural match on the new helper. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…plexity Dogfooding: ACI found 4 CI-02 findings in its own code. Two are fixed, two are baselined with explicit justification. Fixed: - build_github_summary_markdown (CI02_LONG_FUNCTION, complexity 26→resolved): extract _append_list_section helper to remove the 5-times repeated "check list, render section header/items/blank-line" pattern. - _build_gate (CI02_LONG_FUNCTION, complexity 35→resolved): first pass: combine reasons + reason_details into single if-blocks; second pass: extract _gate_fail_reasons() so _build_gate drops below both line and complexity thresholds. Baselined (aci.self-scan-operations.toml): - _iter_pep621_dependency_specs: depth-5 nesting mirrors the literal depth of PEP 621 TOML schema (project → optional-dependencies → group → item). - _collect_tainted_names: fixpoint dataflow × AST walk × type dispatch is the algorithm — nesting is structural, not incidental tangling. Self-scan with operations file: gate pass, 2 findings (all existing-baseline). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Dogfooding の続き — ACI が自分のコードに対して出した CI-02 findings (4件) を全件決定状態にします。
Fix (2件)
build_github_summary_markdown_append_list_sectionhelper を導入し、6回繰り返されていた「リスト確認 → セクション描画」パターンを除去。CI02_LONG_FUNCTION 解消。_build_gate_gate_fail_reasons()を分離して_build_gate自体の行数・複雑度を閾値以下に削減。CI02_LONG_FUNCTION 解消。Baseline (2件) —
aci.self-scan-operations.toml_iter_pep621_dependency_specsproject → optional-dependencies → group → item)。分割するとトラバーサルが読みにくくなる。_collect_tainted_names検証
pytest: 831 passed, mypy: no issues
Test plan
aci scan --target shared/python --operations-file aci.self-scan-operations.toml→ gate pass, blockers 0pytest shared/tests/→ 831 passedmypy shared/python/aci_github_summary.py shared/python/aci_report_view.py→ no issues🤖 Generated with Claude Code