// the find

GiovanniPasq/agentic-rag-for-dummies

★ 3,475 · Jupyter Notebook · MIT · updated Jun 2026

A modular Agentic RAG built with LangGraph — learn Retrieval-Augmented Generation Agents in minutes.

A teaching repo that shows how to build an agentic RAG system with LangGraph — hierarchical indexing, multi-agent map-reduce, self-correction, and context compression. Aimed at developers learning how RAG pipelines actually work beyond the toy tutorial level. The Gradio app makes it runnable without writing a line of code.

Parent/child chunk splitting is implemented correctly: small chunks for retrieval precision, large parents for answer context, with sensible merge/split logic to cope with inconsistent markdown headers. The context compression step is a real solution to a real problem, since long retrieval loops otherwise blow past the context window. Budgeting tokens with tiktoken before compressing is the right call. Deduplicating retrieval keys (tracking already-fetched parent IDs and search queries across iterations) stops the agent from spinning in circles re-fetching the same data.
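To make the parent/child split and the deduplication concrete, here is a minimal sketch of the idea, not the repo's code. It splits on fixed character counts instead of markdown headers, and the names `Chunk`, `split_parent_child`, and `RetrievalState` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: str
    parent_id: str
    text: str


def split_parent_child(doc_id, text, parent_size=400, child_size=100):
    """Split text into large parents and small children that point back to them.

    Assumption: fixed-size character windows stand in for header-aware splitting.
    """
    parents, children = {}, []
    for p_idx, p_start in enumerate(range(0, len(text), parent_size)):
        parent_id = f"{doc_id}_p{p_idx}"
        parent_text = text[p_start:p_start + parent_size]
        parents[parent_id] = parent_text
        for c_idx, c_start in enumerate(range(0, len(parent_text), child_size)):
            child_text = parent_text[c_start:c_start + child_size]
            children.append(Chunk(f"{parent_id}_c{c_idx}", parent_id, child_text))
    return parents, children


class RetrievalState:
    """Remembers queries and parent IDs already used across agent iterations."""

    def __init__(self):
        self.seen_queries = set()
        self.seen_parent_ids = set()

    def is_new_query(self, query):
        # Normalize case and whitespace so trivial rephrasings count as repeats.
        key = " ".join(query.lower().split())
        if key in self.seen_queries:
            return False
        self.seen_queries.add(key)
        return True

    def new_parents(self, hits, parents):
        """Return parent texts for child hits whose parent was not fetched before."""
        fresh = []
        for hit in hits:
            if hit.parent_id not in self.seen_parent_ids:
                self.seen_parent_ids.add(hit.parent_id)
                fresh.append(parents[hit.parent_id])
        return fresh
```

Children are what you would embed and search; a hit on any child resolves to its parent, and the state object ensures each parent enters the context only once per run.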

Parent chunks are stored as flat JSON files in a directory, which is fine for a demo but stops scaling as soon as the document count becomes meaningful. No persistence layer means re-indexing from scratch on every restart. Token estimation uses tiktoken's GPT-4 tokenizer as a proxy for whatever LLM you actually configured, so the counts are off for Ollama models or Gemini. The `score_threshold` in `similarity_search` is hardcoded at 0.4 and is not calibrated per embedding model: the Qwen3-Embedding-0.6B model will have a different score distribution than a model trained on MSMARCO, so depending on your docs you'll get either too many results or none.
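One way to replace a hardcoded cutoff is to calibrate it against a small labeled sample for the embedding model in use. The sketch below is an illustration under that assumption, not something the repo provides; `calibrate_threshold` and the sample scores are hypothetical.

```python
def calibrate_threshold(relevant_scores, irrelevant_scores):
    """Pick the cutoff that classifies the most labeled scores correctly.

    A score >= cutoff is treated as a hit. Ties keep the lowest cutoff.
    """
    candidates = sorted(set(relevant_scores) | set(irrelevant_scores))
    best_threshold, best_correct = candidates[0], -1
    for t in candidates:
        kept = sum(s >= t for s in relevant_scores)
        rejected = sum(s < t for s in irrelevant_scores)
        if kept + rejected > best_correct:
            best_threshold, best_correct = t, kept + rejected
    return best_threshold


# Hypothetical similarity scores from two embedding models on the same labeled pairs.
model_a = calibrate_threshold([0.82, 0.78, 0.90], [0.60, 0.65, 0.70])
model_b = calibrate_threshold([0.35, 0.30, 0.38], [0.10, 0.15, 0.20])
```

With these made-up numbers, a fixed 0.4 would let every irrelevant pair through for the first model and return nothing for the second, which is the failure mode described above.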

View on GitHub →