// the find

h2oai/h2ogpt

★ 11,985 · Python · Apache-2.0 · updated Oct 2025

Private chat with local GPT with document, images, video, etc. 100% private, Apache 2.0. Supports oLLaMa, Mixtral, llama.cpp, and more. Demo: https://gpt.h2o.ai/ https://gpt-docs.h2o.ai/

h2oGPT is a self-hosted chat interface that lets you run local LLMs against your own documents. It sits in the same space as PrivateGPT and Open WebUI but goes significantly wider — RAG, voice, image generation, agents, and an OpenAI-compatible API proxy, all from one Python monolith. Best suited for teams that want a feature-rich private AI stack and have the GPU budget to run it.

The OpenAI-compatible proxy layer is genuinely useful — you can point existing tools at it and get local-model behavior without code changes. The document ingestion pipeline covers an unusual breadth of formats (video frames, audio, Excel, code) and uses HYDE for retrieval quality rather than naive top-k. The test suite is serious: 1000+ tests, 24 GPU-hours, which is rare for projects in this space. Helm charts and cloud packer scripts mean it's actually deployable, not just demo-ware.

The codebase is a sprawling Python monolith — generate.py alone is the kind of file you dread touching because you don't know what breaks. Dependency management is a mess of optional requirements files that make reproducible installs painful; Docker is essentially required, not optional. Last meaningful commit was October 2025, and the LLM landscape has moved significantly since then — model support docs reference Falcon and LLaMa-2 prominently while newer architectures are an afterthought. The Gradio UI is functional but shows its age; Open WebUI has lapped it on UX.

View on GitHub → Homepage ↗