// the find
iterative/cml
♾️ CML - Continuous Machine Learning | CI/CD for ML
CML bolts CI/CD onto ML workflows: it posts model metrics and plots as PR comments and spins up cloud GPU instances (AWS, GCP, Azure) directly from a GitHub Actions or GitLab CI job. It's for teams that want experiment tracking without a separate MLflow server or Weights & Biases account, using just Git and their existing CI. Best fit for small-to-medium teams already living in GitHub/GitLab who don't want to run another service.
The `cml runner launch` cloud provisioning is genuinely useful — you get an ephemeral GPU instance that auto-terminates, which beats maintaining a persistent self-hosted runner for GPU jobs that run once a week. Multi-platform support (GitHub, GitLab, Bitbucket) is real, not just checkbox marketing. The DVC integration for diffs across commits (`dvc metrics diff main --show-md`) gives you actual before/after comparisons in PRs rather than just 'training finished'. Standalone binary releases mean you can use it without Node in your environment.
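To make the before/after idea concrete, here is a minimal Python sketch of the kind of report a CI job would build: a Markdown table comparing metrics between two commits, appended to a report file. This is not DVC's or CML's actual implementation or output format; the function names, column headers, and metric values are all hypothetical. In a real pipeline the table would come from `dvc metrics diff`, and a command like `cml comment create report.md` would post the file to the PR.

```python
from pathlib import Path


def metrics_diff_markdown(base: dict[str, float], head: dict[str, float]) -> str:
    """Render a before/after metrics comparison as a Markdown table.

    `base` and `head` are hypothetical metric dicts for the target branch
    and the PR branch; a metric missing on either side is shown as "n/a".
    """

    def fmt(value):
        return "n/a" if value is None else f"{value:.4f}"

    lines = ["| Metric | main | PR | Change |", "|---|---|---|---|"]
    for name in sorted(base.keys() | head.keys()):
        old, new = base.get(name), head.get(name)
        # A signed delta only makes sense when both sides have the metric.
        change = "n/a" if old is None or new is None else f"{new - old:+.4f}"
        lines.append(f"| {name} | {fmt(old)} | {fmt(new)} | {change} |")
    return "\n".join(lines) + "\n"


def append_to_report(markdown: str, report_path: Path) -> None:
    """Append a Markdown fragment to the report file, creating it if needed."""
    with report_path.open("a", encoding="utf-8") as report:
        report.write(markdown)


if __name__ == "__main__":
    # Made-up values, for illustration only.
    main_metrics = {"accuracy": 0.90, "loss": 0.35}
    pr_metrics = {"accuracy": 0.92, "loss": 0.30}
    print(metrics_diff_markdown(main_metrics, pr_metrics), end="")
```

The report is nothing more than text appended to a file, which is why it fits into any CI system with little ceremony, and also why anything beyond tables needs extra tooling on the runner.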
The Docker base images are frozen at Ubuntu 20.04 + CUDA 11.2, and the README carries a maintenance warning from 2023 about NVIDIA dropping those images; the project feels like it's in slow-motion decline. Cloud runner provisioning uses Terraform under the hood, so cloud credential management is more footgun-prone than the YAML examples suggest, and secret sprawl across AWS/GCP/Azure keys gets messy fast. The report format is just Markdown appended to a file, so any real charting requires Vega-Lite installed on the runner, which is a painful dependency chain. There's no native artifact storage: you're still responsible for where your model files actually live, which pushes you toward DVC anyway and makes CML feel like a thin wrapper.