// the find
airbytehq/airbyte
Open-source data movement for ELT pipelines and AI agents — from APIs, databases & files to warehouses, lakes, and AI applications. Both self-hosted and Cloud.
Airbyte is a mature open-source ELT platform with 600+ connectors for moving data between APIs, databases, warehouses, and lakes. It's aimed at data engineers who need to build and manage data pipelines without writing everything from scratch. More recently the project has been pushing into the AI agent space with an Agent SDK that gives LLMs access to connector data.
- Connector catalog breadth is genuinely impressive — 600+ connectors covering the long tail of SaaS APIs, databases, and cloud storage that competitors either don't have or charge enterprise prices for.
- The low-code CDK and no-code Connector Builder make it practical to add new sources without deep platform knowledge; the Python CDK and Kotlin bulk CDK give escape hatches for complex cases.
- CDC support for major databases (Postgres, MySQL, MSSQL) is a real feature, not an afterthought: it handles the hard parts like Debezium integration and state management.
- CI/CD infrastructure is unusually thorough for an OSS project: per-connector test workflows, progressive rollout gates, automated CDK version bumps, and connector registry generation all run in GitHub Actions.
- Self-hosting is genuinely heavy: the platform runs on Kubernetes/Helm with Temporal, multiple microservices, and a PostgreSQL backend. Spinning it up for a small team is not a weekend project. abctl hides some of this complexity but adds its own failure modes.
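- Dual licensing (MIT for connectors, ELv2 for the platform) means you cannot offer the platform to third parties as a hosted or managed service without a separate commercial license. Internal use is allowed, but embedding Airbyte in a product you sell is where ELv2 bites. This catches people off guard after they've already built on it.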
- Connector quality varies wildly across the catalog. Certified connectors are well-maintained, but a large portion are community-contributed and may be broken, unmaintained, or missing incremental sync.
- The monorepo is enormous and the build tooling is complex (Gradle + Python + Docker + Dagger), making local development of even a single connector time-consuming to set up, especially for contributors unfamiliar with the full stack.
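
For readers unsure what "missing incremental sync" costs in practice, here is a minimal sketch of the idea: keep a cursor between runs and fetch only records newer than it. This is not Airbyte's CDK API. `CursorState`, `incremental_sync`, and the `updated_at` field are hypothetical names chosen for illustration. A connector without this behavior re-reads the full source on every run.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CursorState:
    # Highest cursor value seen so far; None means no sync has run yet.
    # (Hypothetical state shape, not Airbyte's actual state format.)
    last_updated_at: Optional[str] = None


def incremental_sync(records: list, state: CursorState) -> list:
    """Return only records newer than the saved cursor, then advance the cursor."""
    fresh = [
        r for r in records
        # ISO-8601 UTC timestamps in one format compare correctly as plain strings.
        if state.last_updated_at is None or r["updated_at"] > state.last_updated_at
    ]
    if fresh:
        state.last_updated_at = max(r["updated_at"] for r in fresh)
    return fresh


source = [
    {"id": 1, "updated_at": "2024-01-01T00:00:00Z"},
    {"id": 2, "updated_at": "2024-01-02T00:00:00Z"},
]
state = CursorState()
first_run = incremental_sync(source, state)   # both records: nothing synced yet
source.append({"id": 3, "updated_at": "2024-01-03T00:00:00Z"})
second_run = incremental_sync(source, state)  # only the newly added record
```

Persisting `state` between runs is what makes the second run cheap. A full-refresh-only connector has no equivalent and pays the full read cost each time.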