// the find

raystack/optimus

★ 761 · Go · Apache-2.0 · updated Jun 2024

Optimus is an easy-to-use, reliable, and performant workflow orchestrator for data transformation, data modeling, pipelines, and data quality management.

Optimus is a data pipeline orchestrator aimed at data and analytics engineers who want to write SQL transforms without hand-coding Airflow DAGs. It parses SQL to infer table dependencies automatically and compiles jobs to run on an Airflow-compatible scheduler. Best fit for teams already on BigQuery who find a separate dbt and Airflow setup too much to maintain.

Automatic dependency resolution from SQL is the real differentiator: you write a SELECT, Optimus builds the DAG, and there are no manual upstream/downstream declarations. The plugin system is first-class: custom executors are gRPC services, not monkey-patched Python. Multi-tenant support with cross-tenant dependency tracking is something most open-source orchestrators skip entirely. YAML-driven warehouse resource management (tables, views) keeps schema definitions colocated with the jobs that use them.

The last commit was in June 2024 and the repo is still v0.x with an explicit warning about breaking API changes, so adopting this in production means accepting instability with no guaranteed migration path. BigQuery is clearly the primary target; Snowflake, Redshift, and Databricks feel like afterthoughts, with plugin coverage that lags behind. The documentation links are all relative to a hosted docs page that may or may not be current, and the getting-started guide requires standing up a Kubernetes cluster just to run a sample job. Observability is do-it-yourself: metrics exist, but you're wiring up Prometheus on your own with no pre-built dashboards.
