// the find
apache/polaris
Apache Polaris, the interoperable, open source catalog for Apache Iceberg
Apache Polaris is an open-source Iceberg REST catalog: the metadata layer that lets Spark, Flink, Trino, and other engines read and write the same Iceberg tables without vendor lock-in. It was donated to Apache by Snowflake and implements the Iceberg REST catalog spec. The target audience is platform and data engineering teams that are building or migrating to a lakehouse architecture and don't want to depend on a cloud vendor's managed catalog.
Implements the Iceberg REST catalog spec faithfully, so any compliant engine connects without custom plugins. The extension points for authorization (OPA and Ranger are both included) are real interfaces rather than bolted-on hooks; the OPA integration ships with a JSON schema for the policy input, which saves hours of guessing. Quarkus as the runtime means fast startup and a sane configuration model via application.properties, and the Docker/Helm packaging is production-grade rather than demo-quality. The catalog federation extensions (Hive Metastore, Hadoop, BigQuery) mean you can bridge legacy catalogs without a full migration.
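To make the "no custom plugins" point concrete, here is a minimal sketch of the Spark properties that point Iceberg's stock SparkCatalog at a REST catalog. The property keys are the standard Iceberg REST catalog settings; the helper name rest_catalog_conf, the catalog name, the local URI, the warehouse name, and the credential placeholder are all assumptions for illustration, not values taken from the Polaris repo.

```python
def rest_catalog_conf(name, uri, warehouse, credential):
    """Build Spark properties for an Iceberg REST catalog (hypothetical helper)."""
    prefix = f"spark.sql.catalog.{name}"
    return {
        # Iceberg's built-in Spark catalog class; no vendor plugin involved.
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": uri,
        f"{prefix}.warehouse": warehouse,
        # OAuth2 client credentials in "client_id:client_secret" form.
        f"{prefix}.credential": credential,
        # Assumed scope; adjust to the principal roles your deployment defines.
        f"{prefix}.scope": "PRINCIPAL_ROLE:ALL",
    }


# All values below are placeholders for a local test deployment.
conf = rest_catalog_conf(
    name="polaris",
    uri="http://localhost:8181/api/catalog",
    warehouse="demo_catalog",
    credential="<client_id>:<client_secret>",
)

# Print the properties as spark-submit / spark-sql flags.
for key, value in sorted(conf.items()):
    print(f"--conf {key}={value}")
```

The same handful of properties (type, uri, warehouse, credential) is what other Iceberg-aware engines ask for in their own config syntax, which is the practical payoff of a spec-compliant catalog.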
Persistence story is limited: the only production-ready backend is JDBC, meaning you're running a relational DB to store table metadata. That's fine, but there's no native pluggable key-value or object-store backend for teams that want to avoid another stateful service. The Python client is fully code-generated from OpenAPI, so the ergonomics are mechanical and the CLI feels like a thin wrapper around HTTP calls rather than something you'd want to use day-to-day. Integration tests require Docker and are slow enough that the CI explicitly offers a flag to skip them, a sign that the local dev loop is painful. The project is young (donated to Apache in 2024) and the API surface, especially around policy management, has been shifting, so check the CHANGELOG before committing to a version.
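For a sense of what the CLI and generated client are wrapping, the underlying calls are plain HTTP. Below is a rough sketch of the OAuth2 client-credentials token request that a REST catalog client makes first, built with the standard library and deliberately not sent. The base URL, the client ID and secret, the scope value, and the helper name token_request are assumptions for illustration; confirm the endpoint path against the version you deploy, given how much the API has been moving.

```python
from urllib.parse import urlencode
from urllib.request import Request


def token_request(base_url, client_id, client_secret):
    """Build (but do not send) a client-credentials token request (hypothetical helper)."""
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Assumed scope; adjust to the principal roles your deployment defines.
        "scope": "PRINCIPAL_ROLE:ALL",
    }).encode("ascii")
    return Request(
        f"{base_url}/v1/oauth/tokens",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Placeholder values for a local test deployment.
req = token_request(
    "http://localhost:8181/api/catalog",
    "my-client-id",
    "my-client-secret",
)
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it; the JSON response carries the bearer token used on subsequent catalog calls.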