// the find
ibis-project/ibis
the portable Python dataframe library
Ibis is a Python dataframe library that exposes a single pandas-like API and compiles it to SQL or native dataframe operations across 20+ backends, including DuckDB, BigQuery, Snowflake, Spark, and Polars. The core pitch: write your data transformation once locally against DuckDB, then point it at Snowflake in production by changing the one line that creates the connection. It's aimed at data engineers and analysts who are tired of rewriting the same logic in pandas, SQL, and PySpark depending on where the data lives.
- The lazy expression model is genuinely well-designed — expressions don't execute until you ask, which lets Ibis push the full query to the backend rather than pulling data to Python. This is the right call and it shows.
- Python-SQL interop is useful in practice: you can drop into raw SQL mid-chain and keep composing Python operations on top, which handles the 20% of cases where the dataframe API falls short without forcing you to abandon it entirely.
- Backend coverage is serious — not just a thin shim over DuckDB. The CI matrix runs against real instances of BigQuery, Snowflake, ClickHouse, Trino, etc., which means backend-specific quirks actually get caught.
- The ibis.to_sql() escape hatch is good engineering — you can inspect exactly what SQL gets generated, which makes debugging backend-specific failures tractable instead of magical.
- The abstraction leaks constantly in practice. Operations that exist in one backend don't exist in others, and the 'write once, run anywhere' promise quietly breaks on anything beyond basic group-by/filter/join. The backend compatibility matrix is a maze of footnotes.
- Error messages when an operation isn't supported on your backend are often cryptic — you get a Python exception deep in the compilation layer, not a clear 'this backend doesn't support window functions with RANGE BETWEEN'.
- Twenty-plus backends means the maintenance surface is enormous and backend implementations vary in quality. The DuckDB backend is first-class; the Oracle and Druid backends feel like they were contributed once and lightly maintained since.
- No real streaming story, even for backends that support it well. A Flink backend exists, but it is experimental and its API surface is limited. If you're doing streaming, you'll hit the ceiling fast.