// the find

tirth8205/code-review-graph

★ 19,136 · Python · MIT · updated Jun 2026

Local-first code intelligence graph for MCP and CLI. Builds a persistent map of your codebase so AI coding tools read only what matters, with benchmarked context reductions on reviews and large-repo workflows.

code-review-graph builds a persistent AST-based dependency graph of your codebase using Tree-sitter, stores it in local SQLite, and exposes it to AI coding tools via MCP so they query the graph instead of reading every file. The core pitch is token reduction: for large repos, blast-radius queries return a few thousand tokens where naive full-corpus reads would cost hundreds of thousands. Aimed at developers who spend real money on AI coding tools and are hitting context or cost limits.
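To make the "query the graph instead of reading every file" idea concrete, here is a minimal sketch of a blast-radius lookup over a dependency graph stored in SQLite. The `edges(src, dst)` table, the symbol names, and the `blast_radius` function are all hypothetical illustrations, not the project's actual schema or API; the only real technique used is a recursive CTE in the standard-library `sqlite3` module.

```python
import sqlite3


def build_demo_graph() -> sqlite3.Connection:
    """Create an in-memory graph. Hypothetical schema: src depends on (calls) dst."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edges (src TEXT NOT NULL, dst TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO edges (src, dst) VALUES (?, ?)",
        [
            ("api.handler", "service.create_user"),
            ("service.create_user", "db.insert"),
            ("cli.main", "service.create_user"),
            ("service.list_users", "db.query"),
        ],
    )
    return conn


def blast_radius(conn: sqlite3.Connection, changed: str, max_depth: int = 5) -> list[str]:
    """Return every symbol that transitively depends on `changed`.

    Walks the edges backwards (dst -> src) with a recursive CTE. The depth
    bound keeps the walk finite even if the graph contains cycles.
    """
    rows = conn.execute(
        """
        WITH RECURSIVE impacted(name, depth) AS (
            SELECT ?, 0
            UNION
            SELECT e.src, i.depth + 1
            FROM edges AS e
            JOIN impacted AS i ON e.dst = i.name
            WHERE i.depth < ?
        )
        SELECT DISTINCT name FROM impacted WHERE name != ? ORDER BY name
        """,
        (changed, max_depth, changed),
    ).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    graph = build_demo_graph()
    # A change to db.insert touches its direct caller and that caller's callers.
    print(blast_radius(graph, "db.insert"))
```

The point of the sketch is the shape of the answer: a short list of affected symbols that an agent can read selectively, rather than the whole corpus.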

The benchmark methodology is unusually honest: the authors explicitly call out that recall=1.0 is circular (the ground truth is derived from the same graph), publish the co-change mode alongside it, and separate the 528x headline (the fastapi best case) from the 82x median. Incremental updates are fast: only files whose SHA-256 hash has changed are re-parsed, so a 2,900-file project re-indexes in under 2 seconds. Custom language support via `.code-review-graph/languages.toml` is a clean escape hatch that doesn't require forking or patching the library. The GitHub Action integration is self-contained: it builds and queries the graph on your CI runner with no source code sent externally, which is a real selling point for anyone with IP concerns.
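The incremental-update idea is simple enough to sketch. The snippet below is a rough illustration of hash-based change detection, assuming a stored mapping of relative path to SHA-256 digest from the previous index run; the function names and the `*.py` filter are hypothetical and are not taken from the project's code.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of the file's bytes, as a hex string."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(
    root: Path, known: dict[str, str], pattern: str = "*.py"
) -> tuple[list[str], list[str], dict[str, str]]:
    """Compare the tree under `root` with the digests from the last run.

    Returns (files to re-parse, files that disappeared, new digest snapshot).
    `known` maps relative POSIX paths to digests from the previous index.
    """
    current = {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in root.rglob(pattern)
        if p.is_file()
    }
    # New files and files whose content hash differs both need re-parsing.
    to_reparse = sorted(k for k, digest in current.items() if known.get(k) != digest)
    # Files in the old snapshot but not on disk should be dropped from the graph.
    removed = sorted(k for k in known if k not in current)
    return to_reparse, removed, current
```

On a typical commit only a handful of digests differ, so the expensive step (parsing) runs on a small subset of files while hashing covers the rest.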

Flow detection has 33% recall and only works reliably on Python; in a Go or JavaScript monorepo you get the graph edges but not the execution-flow analysis that makes blast-radius most useful. Search quality is weak: an MRR of 0.35 is roughly equivalent to the first correct result sitting near rank 3, and it means at most about a third of queries can have the right result at rank 1. The README quietly notes that Express queries return 0 hits due to module-pattern naming, which is a significant gap for the JS ecosystem this tool is supposed to help. The token savings numbers compare against a naive full-corpus baseline that no real agent would use, and the more honest agent_baseline comparison (grep top-3 files) is buried in the eval CSVs rather than shown in the headline diagram, so savings in practice are probably much smaller. Embeddings are opt-in and index only function signatures, not bodies or docstrings, so semantic search is weaker than the feature list suggests.
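Since the MRR figure is easy to misread, here is a small sketch of how mean reciprocal rank relates to the rank-1 hit rate. The rank lists are made-up illustrations, not the project's eval data; `None` marks a query where the correct result never appeared.

```python
from typing import Optional, Sequence


def mean_reciprocal_rank(ranks: Sequence[Optional[int]]) -> float:
    """Average of 1/rank of the first correct result; misses (None) score 0."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1.0 / r if r is not None else 0.0 for r in ranks) / len(ranks)


def hit_rate_at_1(ranks: Sequence[Optional[int]]) -> float:
    """Fraction of queries whose correct result is at rank 1."""
    if not ranks:
        raise ValueError("ranks must not be empty")
    return sum(1 for r in ranks if r == 1) / len(ranks)


if __name__ == "__main__":
    # Illustrative only: one hit at rank 1, two lower hits, one miss.
    mixed = [1, 2, 4, None]
    print(mean_reciprocal_rank(mixed), hit_rate_at_1(mixed))

    # Every query answered at rank 3: MRR is about 0.33 with zero rank-1 hits.
    all_third = [3, 3, 3]
    print(mean_reciprocal_rank(all_third), hit_rate_at_1(all_third))
```

Because every rank-1 hit contributes a full 1.0 to the sum, MRR is always at least the rank-1 hit rate. An MRR of 0.35 therefore caps rank-1 accuracy at 35% and is compatible with it being far lower.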
