// the find

HKUDS/LightRAG

★ 36,457 · Python · MIT · updated Jun 2026

[EMNLP2025] "LightRAG: Simple and Fast Retrieval-Augmented Generation"

LightRAG is a graph-augmented RAG framework from the Data Intelligence Lab at the University of Hong Kong (HKUDS). It builds a knowledge graph during indexing and uses it alongside vector search at query time. It positions itself as a lighter alternative to Microsoft's GraphRAG: the key claim is that it avoids expensive community-report generation and multi-hop reasoning chains. The target audience is anyone building RAG over domain-specific document corpora (legal, medical, financial) where cross-document reasoning matters.

The dual-level retrieval (local entity lookup plus global relationship traversal, with a naive chunk fallback, all mergeable) is a real architectural win over naive chunked RAG for complex queries. Role-specific LLM configuration, with separate models for extraction, querying, keyword generation, and VLM, lets you optimize cost and latency instead of settling for a one-size-fits-all model. Storage backend flexibility is genuinely good: PostgreSQL, MongoDB, Neo4j, Milvus, OpenSearch, and Qdrant all work, and you can mix and match vector, graph, and KV stores independently. The incremental update story (new subgraphs are merged into the existing graph via set operations, and a delete triggers an LLM-cache-assisted rebuild of the affected relationships) avoids the full-reindex penalty that kills GraphRAG for dynamic data.
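To make the incremental-update point concrete, here is a minimal sketch of merging a newly extracted subgraph into an existing graph with set unions. It is a toy model, not LightRAG's code: the `Graph` class, its fields, and the sample entities are all made up for illustration, and a real system would presumably also need to reconcile descriptions for duplicate entities, which this skips.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (source entity, relation, target entity)


@dataclass
class Graph:
    """Toy knowledge graph (hypothetical, for illustration only)."""
    entities: Set[str] = field(default_factory=set)
    relations: Set[Triple] = field(default_factory=set)

    def merge(self, other: "Graph") -> "Graph":
        # Set union drops entities and triples that are already present,
        # so only the new document has to go through extraction.
        return Graph(
            entities=self.entities | other.entities,
            relations=self.relations | other.relations,
        )


# Graph built from the documents indexed so far.
existing = Graph(
    entities={"Acme Corp", "Jane Doe"},
    relations={("Jane Doe", "works_at", "Acme Corp")},
)

# Subgraph extracted from one newly added document.
new_doc = Graph(
    entities={"Jane Doe", "Widget X"},
    relations={("Jane Doe", "designed", "Widget X")},
)

merged = existing.merge(new_doc)
print(sorted(merged.entities))
print(sorted(merged.relations))
```

The point of the sketch is the cost profile: the merge touches only the new subgraph, so adding a document does not require re-extracting anything already indexed.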

The default file-persisted in-memory storage is a trap for anyone who doesn't read the config docs carefully: you'll think it works fine until you restart and lose everything or hit memory walls on a real corpus. Embedding model lock-in is sharp: pick the wrong model at the start and re-embedding requires manual table drops, with no built-in migration tooling. The evaluation benchmarks in the README are LLM-judged pairwise comparisons (not held-out ground truth), which makes the big win percentages over NaiveRAG and GraphRAG less meaningful than they look. The `mix` query mode that gives the best results also carries a non-trivial latency cost with reranking enabled, so for latency-sensitive applications you're trading quality for speed in ways the docs undersell.
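One cheap way to soften the embedding lock-in is to record which model built an index and refuse to open it with a different one. The sketch below is a guard you would write yourself, not something LightRAG ships: the function name, the `embedding_fingerprint.json` file, and the model names are all hypothetical.

```python
import json
from pathlib import Path


def check_embedding_fingerprint(index_dir, model_name, dim):
    """Record the embedding model on first use; raise if it later changes.

    Hypothetical helper: the marker file name and layout are assumptions.
    """
    marker = Path(index_dir) / "embedding_fingerprint.json"
    current = {"model": model_name, "dim": dim}

    if not marker.exists():
        # First run against this index: remember what built it.
        marker.parent.mkdir(parents=True, exist_ok=True)
        marker.write_text(json.dumps(current))
        return

    stored = json.loads(marker.read_text())
    if stored != current:
        # Vectors from different models are not comparable, so fail loudly
        # instead of silently returning poor matches.
        raise ValueError(
            f"index was built with {stored}, but the config says {current}; "
            "re-embed the corpus before querying"
        )
```

Calling this once at startup, before the vector store is opened, turns a silent quality regression into an immediate error. It does nothing to make the re-embedding itself easier.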
