
// the find

MODSetter/SurfSense

★ 14,433 · Python · Apache-2.0 · updated Jun 2026

An open source, privacy focused alternative to NotebookLM for teams with no data limits. Join our Discord: https://discord.gg/ejRNvftDp9

SurfSense is a self-hostable knowledge base and RAG platform for teams who've outgrown NotebookLM's limits. It wraps FastAPI, LangGraph, and Next.js into a system that ingests from 27+ connectors, supports local LLMs via Ollama/vLLM, and adds multiplayer chat, automations, and a desktop app. It's aimed at privacy-conscious teams or individuals who want NotebookLM-style functionality without Google's quotas and lock-in.

- Genuine LLM flexibility: supports LiteLLM's full provider list plus local Ollama/vLLM endpoints, so you can actually run this air-gapped if needed

- Hybrid search implementation (semantic + BM25 with Reciprocal Rank Fusion) is a meaningful improvement over typical single-vector RAG setups

- Docker one-liner install with Watchtower auto-updates lowers the ops burden for self-hosters considerably

- Connector breadth (Slack, Linear, Jira, GitHub, Google Drive, Notion, Discord, etc.) is real and not just aspirational — the connector code is present in the repo

- The repo README itself says 'not yet production-ready', and for a project of this scope (RAG pipeline, real-time collab, desktop app, 27+ connectors, automations) that caveat carries real weight: expect breakage when features are combined in unexpected ways

- The .cursor/skills directory holds a large volume of SEO and marketing content checked directly into the codebase, which is a red flag about how heavily AI-assisted the development process is and raises questions about code review quality

- No visible test coverage for the connector integrations or agent workflows: backend-tests.yml and e2e-tests.yml exist, but there's no evidence of meaningful fixture coverage for the complex async LangGraph agent paths where bugs will actually surface

- The multi-tenant data isolation and security model for self-hosted deployments isn't clearly documented. Teams sharing a deployment need to understand exactly how search space isolation is enforced before putting sensitive data in
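
For readers unfamiliar with the hybrid search point above, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. It is not taken from the SurfSense codebase: the function name, the document ids, and the smoothing constant `k=60` (a commonly used default) are all assumptions made for illustration. The idea is that each retriever contributes `1 / (k + rank)` per document, so a document ranked well by both the semantic and the BM25 list rises to the top.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    `rankings` is a list of lists, each ordered best-first.
    `k` is a smoothing constant; 60 is a common choice, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks start at 1, so the top hit contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties fall back to doc id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results from a vector search and a BM25 keyword search.
semantic_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([semantic_hits, bm25_hits])
print(fused)
```

Because the fusion uses only rank positions, it sidesteps the problem of cosine similarities and BM25 scores living on incomparable scales, which is the usual argument for it over weighted score blending.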

