// the find
dropbox/PyHive
Python interface to Hive and Presto. 🐝
PyHive provides DB-API 2.0 drivers and SQLAlchemy dialects for Hive, Presto, and Trino. It was the standard way to talk to HiveServer2 from Python for years. The project has been donated to Apache Kyuubi and is effectively in maintenance mode at the Dropbox repo.
Because it ships SQLAlchemy dialects, it plugs into the existing Python data ecosystem (pandas, SQLAlchemy ORM, Alembic) without custom glue. Async query polling with log streaming is genuinely useful for long-running Hive queries where you want progress feedback. The pure-sasl variant (the hive_pure_sasl extra) is a practical workaround for the C sasl library, which breaks on Python 3.11+. Session/configuration passthrough is clean: you get a single generic hook rather than a dozen bespoke parameters.
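The async polling and configuration passthrough can be sketched roughly as follows. This is a minimal illustration modeled on the polling loop in the PyHive README, not the project's own helper: `run_with_logs`, `example`, the host, the query, and the `hive.exec.parallel` setting are all placeholder choices, and the final log drain after the loop is an addition of this sketch. The set of "still running" states is passed in by the caller so the helper itself has no PyHive import.

```python
import time


def run_with_logs(cursor, sql, running_states, on_log=print, interval=1.0):
    """Run `sql` asynchronously and forward server log lines to `on_log`
    until the operation state is no longer in `running_states`.

    Hypothetical helper: `cursor` is expected to behave like a PyHive
    Hive cursor (execute(async_=True), poll(), fetch_logs()).
    Returns the final operation state.
    """
    cursor.execute(sql, async_=True)
    state = cursor.poll().operationState
    while state in running_states:
        for line in cursor.fetch_logs():
            on_log(line)
        time.sleep(interval)
        state = cursor.poll().operationState
    # Assumption of this sketch: pick up log lines emitted between the
    # last fetch and completion.
    for line in cursor.fetch_logs():
        on_log(line)
    return state


def example():
    """Not called here: needs PyHive installed and a reachable HiveServer2."""
    from pyhive import hive
    from TCLIService.ttypes import TOperationState

    conn = hive.connect(
        host="localhost",  # placeholder host
        # Generic session settings hook; the key shown is only an example.
        configuration={"hive.exec.parallel": "true"},
    )
    cursor = conn.cursor()
    final = run_with_logs(
        cursor,
        "SELECT COUNT(*) FROM some_table",  # placeholder query
        running_states=(
            TOperationState.INITIALIZED_STATE,
            TOperationState.RUNNING_STATE,
        ),
    )
    if final == TOperationState.FINISHED_STATE:
        print(cursor.fetchall())
```

Passing `on_log` as a callback keeps the loop reusable: the same helper can print to a terminal, feed a logger, or update a progress widget. Depending on the server, other transient states may need to be added to `running_states`.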
The project is abandoned at this location: the README says to go to Apache Kyuubi, and the CI badge points to Travis CI, which hasn't been active for years. The sasl dependency outright breaks on Python 3.11+ unless you specifically install the hive_pure_sasl extra, which is an easy way to waste an afternoon. Presto support is questionable given that Presto and Trino split years ago and the dialects may have drifted; you'd be betting on a frozen snapshot. The autogenerated TCLIService Thrift bindings are pinned to Hive 2.3, so newer HiveServer2 protocol features are just missing.