// the find

vanna-ai/vanna

★ 23,701 · Python · MIT · updated Feb 2026

🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using Agentic Retrieval 🔄.

Vanna is a text-to-SQL library that uses RAG to convert natural language questions into SQL queries, run them, and return tables and charts. Version 2.0 is effectively a complete rewrite targeting enterprise deployments with per-user permissions and a streaming web component. It supports most mainstream LLM providers and SQL databases.

- The RAG approach is the right one: storing question-SQL pairs as retrieval examples improves accuracy far more than just shoving schema into context, and the maintainers published a paper with benchmarks to back this up rather than just claiming it works.

- The integration matrix is genuinely wide — Postgres, BigQuery, Snowflake, DuckDB, ClickHouse, and a dozen others, paired with OpenAI, Anthropic, Ollama, Gemini, Bedrock. You can swap LLM or database without rewriting your tool setup.

- User-aware permissions flowing through the entire agent stack (UserResolver → ToolContext → SQL runner) is architecturally correct for multi-tenant use, rather than bolting on a WHERE clause filter as an afterthought.

- The drop-in `<vanna-chat>` web component that works with plain HTML, React, or Vue and reuses existing cookies/JWTs is a real time saver for teams that want a chat interface without building one.

- v2.0 is a complete rewrite and effectively a different product from the 0.x that earned 23k stars. The `src/vanna/legacy/` subtree and the migration guide both signal non-trivial breakage — if you built on 0.x, assume a port, not an upgrade.

- Row-level security that depends on LLM-generated SQL and prompt engineering is not a real security boundary. Prompt injection through the natural-language input can sidestep WHERE-clause filters, so do not rely on it for data isolation in regulated environments.

- Cold-start accuracy is weak. Retrieval is only as good as the question-SQL examples you provide. A fresh install with just schema context generates mediocre SQL for anything beyond simple selects, and the README glosses over the training data requirement entirely.

- Chart generation appears to involve executing LLM-produced Plotly code, which is a code execution surface. No sandboxing is described in the README or visible in the directory tree. That matters a lot if the database holds user-controlled data, since a column name or value could end up in the Python the LLM writes.
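
To make the retrieval point from the first bullet (and the cold-start caveat) concrete, here is a minimal sketch of the general idea: pick the stored question-SQL pairs most similar to the new question and place them in the prompt next to the schema. This is not Vanna's API. `EXAMPLES`, `retrieve`, and `build_prompt` are hypothetical names, and word-overlap (Jaccard) similarity stands in for the embedding search a real system would more likely use.

```python
import re

# Hypothetical, hand-verified question-SQL pairs (the "training data").
EXAMPLES = [
    ("How many orders were placed last month?",
     "SELECT COUNT(*) FROM orders WHERE order_date >= '2026-01-01';"),
    ("What is the total revenue per customer?",
     "SELECT customer_id, SUM(amount) AS revenue FROM orders GROUP BY customer_id;"),
    ("List the ten most recent signups",
     "SELECT id, email, created_at FROM users ORDER BY created_at DESC LIMIT 10;"),
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two token sets, in [0, 1]."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve(question, examples, k=2):
    """Return the k stored pairs whose question best matches the new one."""
    q_tokens = tokenize(question)
    ranked = sorted(
        examples,
        key=lambda pair: jaccard(q_tokens, tokenize(pair[0])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, schema, examples, k=2):
    """Assemble a few-shot prompt: schema, retrieved examples, new question."""
    parts = ["Schema:", schema, "", "Examples:"]
    for example_question, example_sql in retrieve(question, examples, k):
        parts += [f"Q: {example_question}", f"SQL: {example_sql}", ""]
    parts += [f"Q: {question}", "SQL:"]
    return "\n".join(parts)


if __name__ == "__main__":
    schema = "orders(id, customer_id, amount, order_date); users(id, email, created_at)"
    print(build_prompt("What is the total revenue for each customer?", schema, EXAMPLES))
```

With an empty `EXAMPLES` list the prompt falls back to schema only, which is the cold-start situation described above.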
