// the find
LearningCircuit/local-deep-research
~95% on SimpleQA (e.g. Qwen3.6-27B on a 3090). Supports all local and cloud LLMs (llama.cpp, Ollama, Google, ...). 10+ search engines - arXiv, PubMed, your private documents. Everything Local & Encrypted.
Local Deep Research is a self-hosted AI research assistant that orchestrates multiple LLMs and search engines (arXiv, PubMed, SearXNG, Brave, etc.) to produce cited research reports. It targets privacy-conscious researchers, academics, and home-server enthusiasts who want a Perplexity/Deep Research alternative with no data leaving their machine. Deployable via Docker or pip, with an MCP server for Claude integration.
- Genuine breadth of search source integrations: arXiv, PubMed, Semantic Scholar, Wikipedia, Wayback Machine, Elasticsearch, local documents, and premium APIs all sit under one interface, with a LangChain retriever hook so you can plug in any vector store.
- Security posture is unusually thorough for an open-source project this age: SQLCipher AES-256 per-user databases, Cosign-signed Docker images with SLSA provenance, SBOM attachments, and a wall of automated scanners (CodeQL, Semgrep, OWASP ZAP, Trivy, OSV) actually wired into CI rather than just badgeware.
- The pre-commit hook collection is surprisingly opinionated and useful: custom checks for sensitive logging, silent exception swallowing, settings manager thread safety, and layer import violations catch real architectural drift before it lands.
- MCP server integration lets Claude Desktop/Code trigger full research pipelines directly, and the raw `search` tool (no LLM, no cost) is a practical addition for automated monitoring workflows.
- The ~95% SimpleQA claim is prominently marketed, but the fine print admits a 'limited sample size' and that the results come from gpt-4.1-mini (a cloud model), not the local Qwen3.6-27B called out in the description. The headline is misleading for anyone planning to run this fully locally.
- SQLCipher on Windows requires pre-built wheels, and the app can silently fall back to unencrypted SQLite if you set `LDR_BOOTSTRAP_ALLOW_UNENCRYPTED=true`. That is a footgun for anyone who enables the flag for convenience and forgets the security implications.
- The repo has 60+ GitHub Actions workflow files and dozens of custom pre-commit hooks, but the actual test suite appears thin based on the directory structure: heavy on tooling ceremony, lighter on functional coverage of the research pipelines themselves.
- The LangGraph agent strategy is labeled 'early results are promising' in the README, which suggests the flagship agentic mode is not production-ready, and there's no clear signal about when it will be.
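
One way to defuse the unencrypted-fallback footgun noted above is a small pre-launch check in your own deployment script. The sketch below is not part of the project: only the flag name `LDR_BOOTSTRAP_ALLOW_UNENCRYPTED` comes from the review, while the helper names and the set of accepted truthy spellings are assumptions, since the project's actual parsing rules are not covered here.

```python
import os

# Flag name is taken from the review; everything else here is hypothetical.
FLAG = "LDR_BOOTSTRAP_ALLOW_UNENCRYPTED"

# Assumed truthy spellings; the project may accept a different set.
TRUTHY = {"1", "true", "yes", "on"}


def unencrypted_fallback_enabled(env=None):
    """Return True if the fallback flag looks enabled in the given mapping."""
    env = os.environ if env is None else env
    return env.get(FLAG, "").strip().lower() in TRUTHY


def assert_encrypted_or_exit(env=None):
    """Hypothetical guard: stop the launch if the fallback flag is enabled."""
    if unencrypted_fallback_enabled(env):
        raise SystemExit(
            f"{FLAG} is enabled: data may be stored in plain SQLite. "
            "Unset it before handling sensitive material."
        )


if __name__ == "__main__":
    # Run this before starting the service, e.g. from a wrapper script.
    assert_encrypted_or_exit()
    print("Encryption fallback flag is not set; safe to continue.")
```

Running the check from a wrapper script or container entrypoint turns a forgotten convenience setting into a loud startup failure rather than a quiet downgrade.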