// the find
StarTrail-org/LEANN
[MLsys2026]: RAG on Everything with LEANN. Enjoy 97% storage savings while running a fast, accurate, and 100% private RAG application on your personal device.
LEANN is a graph-based vector index that trades embedding storage for on-demand recomputation, claiming 97% storage reduction with no accuracy loss. It's a research prototype (MLSys 2026 paper) that ships as a Python library with pre-built apps for personal RAG — emails, browser history, chat logs, codebases. The target user is someone who wants local-first semantic search over their personal data without spinning up a dedicated vector DB.
The core idea is technically sound: store the graph structure and recompute embeddings at query time for the nodes a search actually visits, rather than storing every vector. The approach is genuinely novel, and the paper backs it up. The API is clean: `LeannBuilder`, `LeannSearcher`, and `LeannChat` cover most use cases without fighting abstractions. MCP integration means it drops into Claude Code as a semantic search layer with minimal setup. The benchmark suite is thorough and honest: they compare against FAISS and DiskANN, not strawmen.
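To make the store-the-graph, recompute-the-vectors idea concrete, here is a minimal toy sketch. It is not LEANN's implementation or API: `embed`, `ToyRecomputeIndex`, and the parameters `degree` and `ef` are hypothetical names, the "embedding model" is a hashed character count, and the graph is a brute-force nearest-neighbor graph rather than anything pruned or tuned. It only shows that a best-first graph search can run while keeping nothing but raw text and adjacency lists.

```python
import heapq
import math


def embed(text, dim=16):
    """Hypothetical stand-in for an embedding model: hashed character counts, L2-normalized."""
    vec = [0.0] * dim
    for ch in text.lower():
        if ch.isalnum():
            vec[ord(ch) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def cosine(a, b):
    # Inputs are unit-length (or all-zero), so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))


class ToyRecomputeIndex:
    """Keeps raw text plus a neighbor graph; embeddings are never stored."""

    def __init__(self, texts, degree=2):
        self.texts = list(texts)
        vectors = [embed(t) for t in self.texts]  # used only to wire the graph
        self.neighbors = []
        for i, v in enumerate(vectors):
            others = [j for j in range(len(vectors)) if j != i]
            others.sort(key=lambda j: cosine(v, vectors[j]), reverse=True)
            self.neighbors.append(others[:degree])
        self.recomputed = 0  # embeddings recomputed across all searches
        # `vectors` is dropped here: only texts and adjacency lists persist.

    def _node_vector(self, i, cache):
        # Recompute on demand, at most once per node per query.
        if i not in cache:
            cache[i] = embed(self.texts[i])
            self.recomputed += 1
        return cache[i]

    def search(self, query, top_k=1, entry=0, ef=4):
        q = embed(query)
        cache = {}

        def score(i):
            return cosine(q, self._node_vector(i, cache))

        visited = {entry}
        first = score(entry)
        candidates = [(-first, entry)]  # max-heap on similarity (negated)
        best = [(first, entry)]         # min-heap holding the ef best nodes seen
        while candidates:
            neg_sim, node = heapq.heappop(candidates)
            if len(best) >= ef and -neg_sim < best[0][0]:
                break  # closest unexplored node is worse than the worst kept result
            for nb in self.neighbors[node]:
                if nb in visited:
                    continue
                visited.add(nb)
                s = score(nb)
                if len(best) < ef or s > best[0][0]:
                    heapq.heappush(candidates, (-s, nb))
                    heapq.heappush(best, (s, nb))
                    if len(best) > ef:
                        heapq.heappop(best)
        top = sorted(best, reverse=True)[:top_k]
        return [(self.texts[i], s) for s, i in top]


docs = [
    "invoice from the cloud hosting provider",
    "meeting notes about the vector index",
    "recipe for sourdough bread",
    "graph search latency benchmarks",
    "holiday photos from the coast",
]
index = ToyRecomputeIndex(docs)
hits = index.search(docs[0], top_k=2)
print(hits, "embeddings recomputed:", index.recomputed)
```

The `recomputed` counter is the point of the sketch: the number of embedding calls per query scales with how many nodes the search visits, not with the size of the corpus, which is where both the storage saving and the query-time cost come from.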
The storage savings come with a CPU cost at query time that the README barely addresses. Recomputing embeddings on every search is fine for personal-scale data, but the latency story for larger corpora is buried in the paper, not surfaced where users will hit it. Build-from-source is a nightmare: DiskANN pulls in MKL, Abseil, protobuf, and zeromq, each with platform-specific gotchas, and the issues page has multiple threads just on Ubuntu 20.04. Windows support is listed but requires vcpkg and a wall of environment variables. The `no-recompute` mode (store embeddings normally) undercuts the core value prop, and the interaction between the `--compact` and `--recompute` flags is confusing enough that they had to add a warning in the docs.
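The trade-off can be put into rough numbers with a back-of-envelope sketch. Everything here is an assumption made for illustration: the function names, the float32 vectors and 32-bit neighbor ids, and the workload figures (corpus size, embedding dimension, graph degree, nodes visited, embedding throughput). None of it comes from the paper or the README, and it is not meant to reproduce the 97% figure. Raw text is left out because both approaches have to keep it.

```python
def stored_vectors_bytes(num_chunks, dim, bytes_per_value=4):
    """Size of keeping every embedding as float32 (the conventional approach)."""
    return num_chunks * dim * bytes_per_value


def graph_only_bytes(num_chunks, avg_degree, bytes_per_edge=4):
    """Size of keeping only adjacency lists with 32-bit neighbor ids."""
    return num_chunks * avg_degree * bytes_per_edge


def recompute_seconds(nodes_visited, embeddings_per_second):
    """Query-time cost of re-embedding every node the search touches."""
    return nodes_visited / embeddings_per_second


# Hypothetical workload, chosen only to make the arithmetic concrete.
num_chunks, dim, avg_degree = 1_000_000, 768, 32
full = stored_vectors_bytes(num_chunks, dim)
graph = graph_only_bytes(num_chunks, avg_degree)
saving = 1 - graph / full
latency = recompute_seconds(nodes_visited=500, embeddings_per_second=1000)

print(f"vectors: {full / 1e9:.2f} GB, graph only: {graph / 1e9:.2f} GB, saving: {saving:.1%}")
print(f"recompute cost per query: {latency:.2f} s")
```

Under this model the saving depends only on the ratio of graph degree to embedding dimension, while the latency term depends on how many nodes a search visits and how fast the local machine can embed them. That second term is the one worth measuring on your own hardware before indexing a large corpus.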