// the find

georgia-tech-db/evadb

★ 2,676 · Python · Apache-2.0 · updated May 2024

Database system for AI-powered apps

EvaDB is a Python query engine that extends SQL with AI function calls: you write `SELECT ChatGPT(prompt, text) FROM documents` and it handles model dispatch, batching, and result caching. It comes out of the Georgia Tech database group and targets developers who want AI inference in their data pipelines without leaving SQL. The project is effectively dead: the last commit landed in May 2024 and the GitHub org has gone quiet.

It has a real query optimizer, not a thin wrapper: predicate pushdown, cost-based function ordering, and parallel execution are actually implemented in the executor layer. Function result caching is built into the catalog, so the same model call on the same input doesn't re-run across queries. The connector list is genuinely broad: Postgres, MySQL, Snowflake, S3, local filesystem, and vector stores like FAISS, pgvector, Chroma, and Pinecone. The architecture separates binder, optimizer, and executor cleanly, which makes adding a new data source or AI backend a scoped change rather than surgery.
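To make the caching claim concrete, here is a minimal sketch of function-result caching keyed on the function name and its inputs. It is not EvaDB's code: `FunctionCache`, `fake_model`, and `run_query` are hypothetical names invented for illustration, and the cache lives in memory rather than in a persistent catalog.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List, Tuple


class FunctionCache:
    """Hypothetical cache: memoizes results by (function name, hashed args)."""

    def __init__(self) -> None:
        self._store: Dict[Tuple[str, str], Any] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(name: str, args: tuple) -> Tuple[str, str]:
        # Hash the serialized arguments so large inputs give small keys.
        payload = json.dumps(args, sort_keys=True, default=str)
        return name, hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def call(self, name: str, fn: Callable[..., Any], *args: Any) -> Any:
        key = self._key(name, args)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: run the function once and remember the result.
        self.misses += 1
        result = fn(*args)
        self._store[key] = result
        return result


def fake_model(prompt: str, text: str) -> str:
    """Stand-in for an expensive model call (assumption, not a real API)."""
    return f"{prompt}: {len(text.split())} words"


def run_query(cache: FunctionCache, prompt: str, rows: List[str]) -> List[str]:
    """Applies the model to every row, going through the cache each time."""
    return [cache.call("fake_model", fake_model, prompt, row) for row in rows]


cache = FunctionCache()
documents = ["a short note", "another slightly longer note", "a short note"]

first = run_query(cache, "count", documents)   # two distinct inputs are computed
second = run_query(cache, "count", documents)  # served entirely from the cache
```

The duplicate row in `documents` is served from the cache within the first query, and the second query never calls `fake_model` at all. A real engine would also need to persist entries and invalidate them when the function definition changes, which this sketch leaves out.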

The project is abandoned: nothing has been pushed since May 2024, Python support caps at 3.11, and the roadmap board hasn't moved. You'd be adopting a dead dependency. EvaQL is a custom SQL dialect, which means your queries won't run on anything else and you're locked into whatever subset of SQL the authors bothered to implement. The dependency graph is a nightmare: to use all features you need Ludwig, XGBoost, statsforecast, various HuggingFace packages, and multiple vector DB clients installed simultaneously, with no clear guidance on which versions are actually compatible. It never shipped a stable 1.0, so the API surface has shifted repeatedly and migration notes are sparse.
