// the find
eosphoros-ai/DB-GPT
open-source agentic AI data assistant for the next generation of AI + Data products.
DB-GPT is a self-hostable platform that connects LLMs to your databases, CSVs, and document stores, letting you ask questions in natural language, get back SQL or Python, and generate charts and reports. It's aimed at teams that want a private, on-premises alternative to tools like Databricks Assistant or Mode, with support for local models via vLLM or llama.cpp as well as the major API providers. The scope is large: it ships a full agent framework (AWEL), RAG pipelines, a sandboxed code executor, and a web UI.
- The AWEL (Agentic Workflow Expression Language) pipeline system is well thought out: it models data flow as typed operators (map, reduce, join, branch, stream), which makes complex multi-step agent workflows inspectable and composable rather than a bag of callbacks.
- Local model support is first-class, not an afterthought: vLLM, llama.cpp, and Ollama are all supported with their own config profiles, so you can run the whole thing air-gapped with DeepSeek-R1 or Qwen3.
- The sandboxed code execution is a real differentiator — generated Python actually runs in an isolated environment, so you get verifiable outputs rather than just plausible-looking code snippets.
- The Text2SQL fine-tuning hub (DB-GPT-Hub) covers a wide range of base models and is maintained separately, giving you a path to actually improve SQL accuracy on your schema rather than hoping the base model generalises.
- The schema migration story is raw SQL files under `assets/schema/upgrade/`, manually versioned with names like `v0_5_1`. There's no Alembic, no rollback scripts, and the upgrade path from anything pre-0.5.1 is undocumented — this will bite anyone running it in production across releases.
- The codebase is clearly Chinese-team-first: file names contain typos (`READMR.hi.md`, `DISCKAIMER.md`), some documentation exists only in Chinese or is machine-translated, and the English issue tracker gets significantly less attention than the WeChat group.
- AWEL is powerful but the learning curve is steep and the documentation is thin — the tutorials cover basic operators but there's almost nothing on debugging a failed multi-agent flow or understanding why a particular plan step got stuck.
- The one-line install script pipes curl directly to bash and pulls from GitHub at runtime with no pinned versions or checksum verification. For a tool explicitly marketed on privacy and security, this is an uncomfortable way to get started.
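If the curl-to-bash install worries you, one workaround is to download the script yourself, check it against a digest you recorded after reading it, and only then run it. The sketch below is a rough illustration of that idea, not anything DB-GPT ships: `INSTALLER_URL` and `EXPECTED_SHA256` are hypothetical placeholders, and you would substitute a URL pinned to a specific tag or commit and the digest of the script you actually reviewed.

```python
import hashlib
import hmac
import subprocess
import tempfile
import urllib.request

# Hypothetical placeholders (assumptions, not real DB-GPT values): use the
# installer URL at a pinned tag or commit, and a digest you recorded after
# reviewing that exact script.
INSTALLER_URL = "https://example.com/pinned/install.sh"
EXPECTED_SHA256 = "0" * 64


def verify_sha256(data: bytes, expected_hex: str) -> bool:
    """Return True only if data hashes to the expected SHA-256 digest."""
    actual = hashlib.sha256(data).hexdigest()
    # Constant-time comparison; the expected digest is normalised to lowercase.
    return hmac.compare_digest(actual, expected_hex.strip().lower())


def fetch_verified(url: str, expected_hex: str) -> bytes:
    """Download a script and refuse to return it if the digest differs."""
    with urllib.request.urlopen(url, timeout=30) as response:
        data = response.read()
    if not verify_sha256(data, expected_hex):
        raise ValueError("installer digest mismatch; refusing to run")
    return data


def run_installer(url: str, expected_hex: str) -> None:
    """Run the script with bash only after the digest check has passed."""
    script = fetch_verified(url, expected_hex)
    with tempfile.NamedTemporaryFile(suffix=".sh") as handle:
        handle.write(script)
        handle.flush()
        subprocess.run(["bash", handle.name], check=True)
```

Calling `run_installer(INSTALLER_URL, EXPECTED_SHA256)` would then replace the one-liner. This only pins the top-level script; anything that script fetches at runtime is still unpinned, so it narrows the problem rather than solving it.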