// the find

snexus/llm-search

★ 658 · Jupyter Notebook · MIT · updated Jan 2026

Querying local documents, powered by LLM

pyLLMSearch is a local RAG system for querying your own documents, configured through YAML. It goes well beyond basic retrieval: hybrid search (dense + sparse via SPLADE), HyDE, reranking, and MCP server exposure are all built in. It is aimed at developers and knowledge workers who want serious local document search without building it from scratch.
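For readers new to hybrid search, here is a minimal sketch of how a dense ranking and a sparse ranking can be merged into one list. It uses reciprocal rank fusion as a stand-in; how pyLLMSearch itself combines the two result sets is not stated here, and the function name, the document ids, and the constant k=60 are all assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document earns 1 / (k + rank) from every list it appears in,
    so documents ranked well by both retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a dense retriever and a sparse (SPLADE-style) one.
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Rank-based fusion is a common choice because dense and sparse scores live on different scales, and ranks sidestep the need to normalize them.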

Combining SPLADE sparse embeddings with dense vectors is a real differentiator; most hobby RAG projects skip hybrid search entirely. HyDE support is useful in unfamiliar domains where you can't yet phrase queries in the right terminology. The YAML-driven config keeps everything reproducible without touching code. MCP server exposure means it plugs directly into Cursor/Windsurf as a context source, a practical use case most similar tools don't cover.
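To make the HyDE point concrete, here is a toy sketch of the idea: instead of embedding the user's query, embed a hypothetical answer written in the domain's vocabulary and search with that. Everything in it is an assumption for illustration: fake_llm is a hard-coded stand-in for a real LLM call, embed is a bag-of-words placeholder for a real dense encoder, and none of it reflects pyLLMSearch's actual code.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding; a placeholder for a real dense encoder."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def fake_llm(query):
    """Hypothetical stand-in for an LLM that drafts a plausible answer."""
    return "caffeine intake can cause tachycardia and palpitations"


def hyde_search(query, docs, generate=fake_llm):
    """Search with the embedding of a generated answer, not the raw query."""
    hypothetical_answer = generate(query)
    query_vec = embed(hypothetical_answer)
    return max(docs, key=lambda doc: cosine(query_vec, embed(doc)))


docs = [
    "tachycardia and palpitations are common after caffeine intake",
    "coffee beans are roasted at high temperature",
]
query = "why does my heart race after coffee"

# Plain search matches on the layperson's wording; HyDE matches on domain terms.
best_plain = max(docs, key=lambda doc: cosine(embed(query), embed(doc)))
best_hyde = hyde_search(query, docs)
print("plain:", best_plain)
print("hyde: ", best_hyde)
```

In this toy setup the plain query latches onto the word "coffee" and picks the roasting document, while the hypothetical answer shares the medical vocabulary of the relevant one.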

GitHub lists the primary language as Jupyter Notebook, which suggests the actual Python package takes a back seat to demos, not a great sign for production reliability. ChromaDB as the only vector store option is limiting; there's no Qdrant, Weaviate, or even SQLite-vec path if you outgrow it or want a lighter footprint. The dependency surface is enormous (LangChain, HuggingFace, SPLADE, gmft, Unstructured, optional Gemini), so expect painful environment setup, and the Linux-only install script reinforces that this isn't a polished cross-platform experience. Last meaningful activity was January 2026 and the repo sits at 658 stars, so community momentum is low.
