finds.dev

// the find

Raudaschl/rag-fusion

★ 941 · Python · MIT · updated Apr 2026

RAG-Fusion: multi-query generation + Reciprocal Rank Fusion for better retrieval-augmented generation. Includes evaluation harness with NFCorpus/BEIR.

RAG-Fusion extends standard RAG by generating multiple query variations with an LLM and combining the retrieval results using Reciprocal Rank Fusion. The core idea addresses vocabulary mismatch: the way you phrase a query rarely matches the terms the corpus is indexed under. The repo is aimed at developers building retrieval systems who want evidence of when, and whether, fusion actually helps, rather than just a conceptual overview.
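For readers who haven't seen Reciprocal Rank Fusion written out, here is a minimal sketch of the fusion step. It is not taken from the repo: the function name and the input shape (one ranked list of document IDs per query variation) are hypothetical, and k=60 is the constant commonly used in the RRF literature, so treat it as an assumption about what the repo does.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list (illustrative sketch).

    rankings: list of lists, each ordered best-first (one per query variation).
    k: smoothing constant; 60 is a common default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based, so the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


if __name__ == "__main__":
    per_query_results = [
        ["a", "b", "c"],  # results for the original query
        ["b", "a", "d"],  # results for LLM-generated variation 1
        ["b", "c", "a"],  # results for LLM-generated variation 2
    ]
    print(reciprocal_rank_fusion(per_query_results))
```

Documents that rank well across several variations float to the top, even if no single variation ranks them first. That is also why fusion can hurt when the generated variations drift off-topic.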

The evaluation harness is the real value here — paired-bootstrap CIs at n=200 on NFCorpus with six method variants and three rerankers, not vibes. The README is unusually honest: it tells you the vector-only fusion variant is 'roughly a wash' and net-negative on rich queries after reranking, which most repos would bury. The adaptive routing recommendation (run baseline+rerank always, fire fusion only when a weakness signal trips) is practical advice that most papers skip. The 'when to use / when not to use' section is concrete and gives you real examples on both sides.
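To make "paired-bootstrap CIs" concrete, here is a rough sketch of a percentile paired bootstrap over per-query metric scores. The repo's exact procedure is not shown here, so this is an assumption about the general technique rather than a copy of the harness; the function name, the resample count, and the seed are all hypothetical.

```python
import random


def paired_bootstrap_ci(baseline, variant, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean per-query difference (variant - baseline).

    baseline, variant: per-query scores (e.g. nDCG@10) on the SAME queries, same order.
    Returns (mean_diff, ci_low, ci_high). Illustrative sketch, not the repo's code.
    """
    if len(baseline) != len(variant) or not baseline:
        raise ValueError("need two non-empty score lists of equal length")
    rng = random.Random(seed)
    # Pairing: resample per-query differences, not the two systems independently.
    diffs = [v - b for b, v in zip(baseline, variant)]
    n = len(diffs)
    means = []
    for _ in range(n_resamples):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    # Percentile interval, with indices clamped to the valid range.
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    return sum(diffs) / n, means[lo_idx], means[hi_idx]
```

If the resulting interval straddles zero, the variant's gain over the baseline is not distinguishable from noise at that sample size, which is the kind of result a phrase like 'roughly a wash' describes.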

The core pipeline in main.py is hardwired to OpenAI + ChromaDB, so swapping either means editing implementation code rather than config; this is a script, not a library. The query cache is a JSON file on disk, which breaks under any parallelism and won't survive a multi-node setup. Despite all the evaluation rigor, there's no production integration guide: the experiments live in a research directory, and the bridge to 'here's how you wire this into your actual stack' is thin. The NFCorpus test domain is biomedical, so all the quantitative claims rest on specialist vocabulary; generalization to general-purpose corpora is plausible but unverified in the repo.

