// the find
dagucloud/dagu
Local-first workflow engine with a Web UI for small teams. Define DAGs in a declarative YAML format. Self-contained and no DBMS required. Use any AI agent to manage your DAGs.
Dagu is a single-binary DAG workflow engine with a Web UI, designed as a self-contained alternative to Airflow for small teams who want scheduling, retries, and observability around existing scripts without standing up a database or message broker. It speaks YAML for workflow definitions, runs shell commands, Docker containers, Kubernetes Jobs, and SSH targets as steps, and stores all state in local files. The target user is the DevOps engineer tired of opaque cron jobs, not the data platform team running petabyte pipelines.
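For readers new to the model, here is a minimal Python sketch of what "retries and ordering around existing scripts" amounts to. It is not Dagu's implementation or its YAML schema. The `WORKFLOW` dictionary, the step names, and the `run_step` and `run_workflow` helpers are all hypothetical, chosen only to show dependency-ordered execution with a retry loop.

```python
import subprocess
from graphlib import TopologicalSorter

# Hypothetical workflow: step name -> (shell command, upstream step names).
# This mirrors the idea of a declarative DAG, not Dagu's actual file format.
WORKFLOW = {
    "extract": ("echo extract", []),
    "transform": ("echo transform", ["extract"]),
    "load": ("echo load", ["transform"]),
}


def run_step(command, retries=2):
    """Run one shell command, retrying when it exits non-zero."""
    for _attempt in range(retries + 1):
        result = subprocess.run(
            command, shell=True, capture_output=True, text=True
        )
        if result.returncode == 0:
            return result.stdout.strip()
    raise RuntimeError(f"step failed after {retries + 1} attempts: {command}")


def run_workflow(workflow):
    """Run each step after its dependencies and return the execution order."""
    graph = {name: set(deps) for name, (_cmd, deps) in workflow.items()}
    order = []
    for name in TopologicalSorter(graph).static_order():
        run_step(workflow[name][0])
        order.append(name)
    return order
```

Calling `run_workflow(WORKFLOW)` runs the three commands as a chain. A real engine adds scheduling, persisted run history, and a UI on top of this loop, which is where the storage questions below come from.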
The single-binary, file-backed storage model is practical: you can deploy it on a VM or even a Raspberry Pi without provisioning Postgres or Redis first. The built-in action library is unusually wide: postgres.query, redis.*, s3.*, sftp, HTTP, SSH, Docker, and Kubernetes all work as first-class step types, not shell wrappers. The coordinator/worker architecture for distributed execution is a real design, not an afterthought: mTLS between nodes, label-based routing, gRPC health checks, and heartbeat monitoring are all there. The human-in-the-loop approval step is clean and useful for ops workflows that need a gate before destructive actions.
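Label-based routing combined with heartbeat monitoring can be pictured with a short Python sketch. Everything here is an assumption made for illustration: the `Worker` fields, the exact-match selector semantics, and the 30-second timeout are not taken from Dagu's code or documentation.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    labels: dict           # e.g. {"gpu": "true", "region": "us"}
    last_heartbeat: float  # timestamp in seconds, same clock as `now`


def eligible_workers(workers, selector, now, timeout=30.0):
    """Return names of live workers whose labels satisfy every selector entry.

    A worker counts as live if its last heartbeat is at most `timeout`
    seconds old (an assumed rule, not Dagu's documented behaviour).
    """
    return [
        w.name
        for w in workers
        if now - w.last_heartbeat <= timeout
        and all(w.labels.get(key) == value for key, value in selector.items())
    ]
```

With this rule, an empty selector matches any live worker, and a worker that stops sending heartbeats drops out of the candidate set without being explicitly deregistered.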
File-backed state is the design's main liability. With no ACID transactions, a mid-write crash can corrupt run history, and distributed workers have to share a network filesystem (the Helm chart requires a ReadWriteMany volume), which turns the 'no DBMS' selling point into 'use NFS instead', arguably a worse trade. The embedded Go API is marked experimental with no stability promise, so you can't build reliable tooling on top of it without tracking upstream breakage. The MCP server and AI harness features are very new, and the documentation is thin on failure modes: what happens when the agent CLI crashes mid-step, whether stdout is buffered correctly, and how secrets are actually masked in streamed logs are all left to the reader. RBAC and SSO are paywalled behind the self-host license, so any multi-user deployment immediately hits a commercial gate for what most orchestrators treat as a basic feature.
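To make the mid-write crash concern concrete, this Python sketch shows the usual mitigation for file-backed stores: write to a temporary file, flush it to disk, then rename it over the target. It is a generic pattern and makes no claim about how Dagu persists run history. The `write_atomic` helper and the JSON record shape are hypothetical.

```python
import json
import os
import tempfile


def write_atomic(path, record):
    """Replace `path` with `record` so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory so the rename stays
    # on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(record, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomic swap on a local POSIX filesystem
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This protects a single file, not a group of related files, and the rename guarantee is generally weaker on network filesystems, which is why the shared-volume requirement matters more than it first appears.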