// the find
NirDiamant/RAG_Techniques
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. Each technique has a detailed notebook tutorial.
A collection of 42+ Jupyter notebooks covering RAG techniques, from basic vector search to RAPTOR, GraphRAG, Self-RAG, and CRAG. Aimed at ML engineers and developers who want working code for specific retrieval patterns rather than digging through papers. Each notebook is self-contained, includes explanations, and runs on Colab.
- Breadth of coverage is genuinely useful - techniques are organized by category (chunking, query enhancement, retrieval, evaluation) so you can jump to what's relevant rather than reading everything linearly
- Runnable Python scripts exist alongside the notebooks for most techniques, which matters when you want to integrate something into actual code rather than just read cells
- Evaluation section covers several approaches (DeepEval, GroUSE, end-to-end evaluation); evaluation is often the part people skip and later regret
- Both LangChain and LlamaIndex implementations for several core techniques, so you're not locked into one abstraction
- Heavy OpenAI dependency throughout - most notebooks assume you have an API key and will cost money to run; there's minimal coverage of local/open-source model alternatives despite the llms topic tag
- No benchmarks or comparative results between techniques on the same dataset - you get implementations but no honest answer to 'does proposition chunking actually beat semantic chunking for my use case'
- The repo is essentially a tutorial collection with newsletter/book upsell woven in; the README is long on marketing links and short on guidance about when to pick which technique
- Notebook dependency management is inconsistent - some notebooks run !pip install mid-cell, versions aren't pinned, and notebooks written at different times likely have conflicting requirements that can break silently