code-review-graph is a persistent, incrementally updated, local-first knowledge graph for token-efficient code review through MCP and the CLI. It parses codebases using Tree-sitter and targeted fallbacks, builds a structural graph in SQLite, and exposes compact context to AI coding tools including Claude Code, Codex, Cursor, Windsurf, Zed, Continue, OpenCode, Gemini CLI, Qwen, Kiro, Qoder, and GitHub Copilot.
When using code-review-graph MCP tools, follow these rules:
- First call: `get_minimal_context(task="<description>")` costs ~100 tokens and gives you the full picture.
- All subsequent calls: use `detail_level="minimal"` unless you need more.
- Prefer `query_graph` with a specific target over broad `list_*` calls.
- The `next_tool_suggestions` field in every response tells you the optimal next step.
- Target: ≤5 tool calls per task, ≤800 total tokens of graph context.
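The call and token targets above can be tracked with a small counter on the client side. This is a minimal sketch, not part of code-review-graph: the `GraphBudget` class and its method names are hypothetical, and the token counts passed in are whatever estimate the caller has available.

```python
from dataclasses import dataclass


@dataclass
class GraphBudget:
    """Hypothetical helper that tracks graph tool calls against the stated targets."""

    max_calls: int = 5      # target from the rules: at most 5 tool calls per task
    max_tokens: int = 800   # target from the rules: at most 800 tokens of graph context
    calls: int = 0
    tokens: int = 0

    def record(self, tokens_used: int) -> None:
        """Register one completed tool call and the tokens it returned."""
        self.calls += 1
        self.tokens += tokens_used

    def can_call(self, estimated_tokens: int = 0) -> bool:
        """Return True if one more call of the estimated size stays within budget."""
        within_calls = self.calls < self.max_calls
        within_tokens = self.tokens + estimated_tokens <= self.max_tokens
        return within_calls and within_tokens


budget = GraphBudget()
budget.record(100)            # e.g. the first get_minimal_context call
print(budget.can_call(700))   # exactly reaches the 800-token target
print(budget.can_call(701))   # would exceed it
```

A check like this only enforces the stated targets; choosing which tool to call next is still guided by `next_tool_suggestions`.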
- Core Package: `code_review_graph/` (Python 3.10+)
  - `parser.py`: Tree-sitter multi-language AST parser plus targeted fallbacks for broad source-language and notebook support
  - `graph.py`: SQLite-backed graph store (nodes, edges, BFS impact analysis)
  - `tools/`: 30 MCP tool implementations split by domain
  - `main.py`: FastMCP server entry point, registers 30 tools + 5 prompts
  - `incremental.py`: Git-based change detection, file watching
  - `embeddings.py`: Optional vector embeddings (local sentence-transformers, OpenAI-compatible endpoints, Google Gemini, MiniMax)
  - `visualization.py`: D3.js interactive HTML graph generator
  - `cli.py`: CLI entry point (install, build, update, postprocess, watch, status, visualize, serve/mcp, wiki, detect-changes, register, unregister, repos, eval, daemon)
  - `flows.py`: Execution flow detection and criticality scoring
  - `communities.py`: Community detection (Leiden algorithm or file-based grouping) and architecture overview
  - `search.py`: FTS5 hybrid search (keyword + vector)
  - `changes.py`: Risk-scored change impact analysis (detect-changes)
  - `refactor.py`: Rename preview, dead code detection, refactoring suggestions
  - `hints.py`: Review hint generation
  - `prompts.py`: 5 MCP prompt templates (review_changes, architecture_map, debug_issue, onboard_developer, pre_merge_check)
  - `wiki.py`: Markdown wiki generation from community structure
  - `skills.py`: Multi-platform install/config generation and shipped skill metadata
  - `registry.py`: Multi-repo registry helpers
  - `migrations.py`: Database schema migrations (v1-v9)
  - `tsconfig_resolver.py`: TypeScript path alias resolution
- VS Code Extension: `code-review-graph-vscode/` (TypeScript)
  - Separate subproject with its own `package.json`, `tsconfig.json`
  - Reads from `.code-review-graph/graph.db` via SQLite
- Database: `.code-review-graph/graph.db` (SQLite, WAL mode)
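To illustrate what "BFS impact analysis" over a SQLite-backed graph can look like, here is a minimal sketch. It is not the project's implementation: the `edges(source, target)` table, the `impact_radius` function, and the node names are assumptions made for this example, and the real schema in `graph.py` may differ.

```python
import sqlite3
from collections import deque


def impact_radius(conn: sqlite3.Connection, start: str, max_depth: int = 2) -> dict[str, int]:
    """Breadth-first walk over reverse edges: which nodes depend on `start`?

    Returns a mapping of dependent node -> distance from `start`.
    """
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        depth = seen[node]
        if depth >= max_depth:
            continue  # do not expand beyond the requested radius
        # Parameterized query: find everything that points at `node`.
        rows = conn.execute(
            "SELECT source FROM edges WHERE target = ?", (node,)
        ).fetchall()
        for (source,) in rows:
            if source not in seen:
                seen[source] = depth + 1
                queue.append(source)
    del seen[start]  # the changed node itself is not part of its own impact set
    return seen


# Hypothetical in-memory graph: an edge (source, target) means "source depends on target".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE edges (source TEXT, target TEXT)")
conn.executemany(
    "INSERT INTO edges VALUES (?, ?)",
    [
        ("api.handler", "service.save"),
        ("service.save", "db.write"),
        ("cli.main", "api.handler"),
        ("tests.test_save", "service.save"),
    ],
)
print(impact_radius(conn, "db.write", max_depth=2))
```

With `max_depth=2`, `cli.main` is left out because it sits three hops away from `db.write`; raising the depth widens the reported blast radius.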
```bash
# Development
uv run pytest tests/ --tb=short -q              # Run tests
uv run ruff check code_review_graph/            # Lint
uv run mypy code_review_graph/ --ignore-missing-imports --no-strict-optional

# Build & test
uv run code-review-graph build                  # Full graph build
uv run code-review-graph update                 # Incremental update
uv run code-review-graph status                 # Show stats
uv run code-review-graph serve                  # Start MCP server
uv run code-review-graph wiki                   # Generate markdown wiki
uv run code-review-graph detect-changes         # Risk-scored change analysis
uv run code-review-graph register <path>        # Register repo in multi-repo registry
uv run code-review-graph repos                  # List registered repos
uv run code-review-graph eval                   # Run evaluation benchmarks
```

- Line length: 100 chars (ruff)
- Python target: 3.10+
- SQL: Always use parameterized queries (`?` placeholders), never f-string values
- Error handling: Catch specific exceptions, log with `logger.warning`/`error`
- Thread safety: `threading.Lock` for shared caches, `check_same_thread=False` for SQLite
- Node names: Always sanitize via `_sanitize_name()` before returning to MCP clients
- File reads: Read bytes once, hash, then parse (TOCTOU-safe pattern)
- No `eval()`, `exec()`, `pickle`, or `yaml.unsafe_load()`
- No `shell=True` in subprocess calls
- `_validate_repo_root()` prevents path traversal via repo_root parameter
- `_sanitize_name()` strips control characters, caps at 256 chars (prompt injection defense)
- `escH()` in visualization escapes HTML entities including quotes and backticks
- SRI hash on D3.js CDN script tag
- API keys only from environment variables, never hardcoded
- `tests/test_parser.py`: Parser correctness, cross-file resolution
- `tests/test_graph.py`: Graph CRUD, stats, impact radius
- `tests/test_tools.py`: MCP tool integration tests
- `tests/test_visualization.py`: Export, HTML generation, C++ resolution
- `tests/test_incremental.py`: Build, update, migration, git ops
- `tests/test_multilang.py`: Broad language parsing tests, including SFCs, notebooks, SQL, Perl XS, and modern systems/web languages
- `tests/test_embeddings.py`: Vector encode/decode, similarity, store
- `tests/test_flows.py`: Execution flow detection and criticality
- `tests/test_communities.py`: Community detection, architecture overview
- `tests/test_changes.py`: Risk-scored change analysis
- `tests/test_refactor.py`: Rename preview, dead code, suggestions
- `tests/test_search.py`: FTS5 hybrid search
- `tests/test_hints.py`: Review hint generation
- `tests/test_prompts.py`: MCP prompt template tests
- `tests/test_wiki.py`: Wiki generation
- `tests/test_context_savings.py`: Estimated context-savings metadata
- `tests/test_skills.py`: Install/config generation and shipped skill metadata
- `tests/test_registry.py`: Multi-repo registry
- `tests/test_migrations.py`: Database migrations
- `tests/test_eval.py`: Evaluation framework
- `tests/test_tsconfig_resolver.py`: TypeScript path resolution
- `tests/test_integration_v2.py`: v2 pipeline integration test
- `tests/fixtures/`: Sample files for each supported language
- lint: ruff on Python 3.10
- type-check: mypy
- security: bandit scan
- test: pytest matrix (3.10, 3.11, 3.12, 3.13) with 65% coverage minimum
This project uses `bd` (beads) for issue tracking. Run `bd prime` to see full workflow context and commands.
```bash
bd ready                # Find available work
bd show <id>            # View issue details
bd update <id> --claim  # Claim work
bd close <id>           # Complete work
```

- Use `bd` for ALL task tracking; do NOT use TodoWrite, TaskCreate, or markdown TODO lists
- Run `bd prime` for detailed command reference and session close protocol
- Use `bd remember` for persistent knowledge; do NOT use MEMORY.md files
When ending a work session, you MUST complete ALL steps below. Work is NOT complete until `git push` succeeds.
MANDATORY WORKFLOW:
- File issues for remaining work - Create issues for anything that needs follow-up
- Run quality gates (if code changed) - Tests, linters, builds
- Update issue status - Close finished work, update in-progress items
- PUSH TO REMOTE - This is MANDATORY:
  ```bash
  git pull --rebase
  bd dolt push
  git push
  git status  # MUST show "up to date with origin"
  ```
- Clean up - Clear stashes, prune remote branches
- Verify - All changes committed AND pushed
- Hand off - Provide context for next session
CRITICAL RULES:
- Work is NOT complete until `git push` succeeds
- NEVER stop before pushing - that leaves work stranded locally
- NEVER say "ready to push when you are" - YOU must push
- If push fails, resolve and retry until it succeeds
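The push sequence can also be scripted so that a failing step is reported instead of silently skipped. This is a rough sketch under stated assumptions: `PUSH_STEPS`, `run_step`, and `push_to_remote` are hypothetical names, the commands are copied from the workflow above, and resolving a failure (for example a rebase conflict) still has to happen by hand before retrying.

```python
import subprocess
from typing import Callable, Optional, Sequence

# Commands from the mandatory workflow, as argument lists (no shell=True).
PUSH_STEPS: list[list[str]] = [
    ["git", "pull", "--rebase"],
    ["bd", "dolt", "push"],
    ["git", "push"],
    ["git", "status"],
]


def run_step(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code; 127 if the executable is missing."""
    try:
        return subprocess.run(list(cmd), check=False).returncode
    except FileNotFoundError:
        return 127


def push_to_remote(runner: Callable[[Sequence[str]], int] = run_step) -> Optional[list[str]]:
    """Run the steps in order and stop at the first failure.

    Returns the failing command, or None if every step succeeded.
    """
    for cmd in PUSH_STEPS:
        if runner(cmd) != 0:
            return cmd
    return None


# Dry run with a fake runner that pretends `bd dolt push` failed.
failed = push_to_remote(lambda cmd: 1 if cmd[0] == "bd" else 0)
print(failed)  # ['bd', 'dolt', 'push']
```

Passing the runner as a parameter keeps the sequence testable without touching a real repository; calling `push_to_remote()` with no arguments runs the actual commands.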
IMPORTANT: This project has a knowledge graph. ALWAYS use the code-review-graph MCP tools BEFORE using Grep/Glob/Read to explore the codebase. The graph is faster, cheaper (fewer tokens), and gives you structural context (callers, dependents, test coverage) that file scanning cannot.
- Exploring code: `semantic_search_nodes_tool` or `query_graph_tool` instead of Grep
- Understanding impact: `get_impact_radius_tool` instead of manually tracing imports
- Code review: `detect_changes_tool` + `get_review_context_tool` instead of reading entire files
- Finding relationships: `query_graph_tool` with callers_of/callees_of/imports_of/tests_for
- Architecture questions: `get_architecture_overview_tool` + `list_communities_tool`
Fall back to Grep/Glob/Read only when the graph doesn't cover what you need.
| Tool | Use when |
|---|---|
| `detect_changes_tool` | Reviewing code changes; gives risk-scored analysis |
| `get_review_context_tool` | Need source snippets for review; token-efficient |
| `get_impact_radius_tool` | Understanding blast radius of a change |
| `get_affected_flows_tool` | Finding which execution paths are impacted |
| `query_graph_tool` | Tracing callers, callees, imports, tests, dependencies |
| `semantic_search_nodes_tool` | Finding functions/classes by name or keyword |
| `get_architecture_overview_tool` | Understanding high-level codebase structure |
| `refactor_tool` | Planning renames, finding dead code |
- The graph auto-updates on file changes (via hooks).
- Use `detect_changes_tool` for code review.
- Use `get_affected_flows_tool` to understand impact.
- Use `query_graph_tool` with `pattern="tests_for"` to check coverage.