// the find
Altinity/clickhouse-backup
Tool for easy backup and restore for ClickHouse® using object storage for backup files.
clickhouse-backup is a backup/restore tool for ClickHouse that uses ALTER TABLE FREEZE to snapshot immutable data parts and uploads them to S3, GCS, Azure Blob Storage, or a handful of other destinations (FTP, SFTP, Tencent COS). It's the de facto standard for self-managed ClickHouse installations: Altinity maintains it, and it covers most of what you'd need for production backups.
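The FREEZE snapshot is cheap because parts are immutable, so ClickHouse can hard-link them into a shadow/ directory instead of copying them. Here is a minimal Python sketch of that idea. The directory layout, the part name, and the `freeze_part` helper are hypothetical stand-ins for illustration, not code from ClickHouse or clickhouse-backup.

```python
import os
import tempfile


def freeze_part(part_dir: str, shadow_dir: str) -> list[str]:
    """Hard-link every file of an immutable part into shadow_dir (hypothetical helper)."""
    os.makedirs(shadow_dir, exist_ok=True)
    linked = []
    for name in sorted(os.listdir(part_dir)):
        src = os.path.join(part_dir, name)
        dst = os.path.join(shadow_dir, name)
        os.link(src, dst)  # new directory entry, same inode: no data is copied
        linked.append(dst)
    return linked


def demo() -> bool:
    """Build a fake part, 'freeze' it, and check that the snapshot shares storage."""
    with tempfile.TemporaryDirectory() as root:
        part = os.path.join(root, "data", "all_1_1_0")  # made-up part name
        os.makedirs(part)
        with open(os.path.join(part, "data.bin"), "wb") as f:
            f.write(b"column bytes")
        links = freeze_part(part, os.path.join(root, "shadow", "all_1_1_0"))
        # Both paths point at the same file on disk.
        return os.path.samefile(os.path.join(part, "data.bin"), links[0])
```

Because the snapshot is only extra directory entries on the same filesystem, whatever reads it has to be able to see that filesystem.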
Incremental backup support is well-implemented: it tracks which parts already exist on remote storage and only uploads new ones, which keeps costs down for large datasets. The resumable state feature means an interrupted upload doesn't restart from zero, a real practical concern with multi-hundred-GB tables. Every config option can be set through an environment variable, which makes it straightforward to deploy in Docker or Kubernetes without baking config files into images. The custom remote storage escape hatch (rclone, restic, rsync) means that if your storage backend isn't natively supported, you're not stuck.
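To make the incremental and resumable behavior concrete, here is a rough Python sketch of the general technique: diff local parts against what the remote already holds, and persist progress after each successful upload. This is not clickhouse-backup's actual implementation. The function names, the JSON state file, and the `upload` callback are all assumptions made for illustration.

```python
import json
import os


def parts_to_upload(local_parts, remote_parts, done):
    """Return local parts that are neither on remote storage nor already uploaded."""
    skip = set(remote_parts) | set(done)
    return [p for p in local_parts if p not in skip]


def upload_with_resume(local_parts, remote_parts, state_path, upload):
    """Upload missing parts, recording each success so a rerun can skip it."""
    done = []
    if os.path.exists(state_path):
        # A previous run was interrupted: reload what it already finished.
        with open(state_path) as f:
            done = json.load(f)
    for part in parts_to_upload(local_parts, remote_parts, done):
        upload(part)  # may raise; the state file still reflects earlier successes
        done.append(part)
        with open(state_path, "w") as f:
            json.dump(done, f)
    return done
```

If `upload` fails halfway through, calling `upload_with_resume` again with the same `state_path` picks up at the first part that never completed.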
The hard requirement to run on the same host as clickhouse-server is a meaningful operational constraint. You can't back up remotely, which complicates any setup where the backup process needs to live in a separate security boundary, or in a pod sidecar with read-only access to the data volume. The MergeTree-only limitation still bites: if you have Dictionary tables, MaterializedView sources on non-MergeTree engines, or Kafka tables, you're doing schema-only backups for those and hoping your data can be reconstructed. The embedded backup/restore path (CLICKHOUSE_USE_EMBEDDED_BACKUP_RESTORE=true) requires ClickHouse 22.7+ and has its own set of incompatibilities with the standard path, so you end up maintaining two mental models. Documentation for multi-shard cluster scenarios is sparse given how much complexity the sharded_operation_mode options (none, table, database, first-replica) imply.
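Before relying on it, it's worth auditing which of your tables would only get a schema backup. A small Python sketch of that check follows. It assumes the rule as described above (anything outside the MergeTree family is schema-only) and uses a hypothetical inventory of table names and engines, such as you might pull from the name and engine columns of system.tables. The tool's real engine handling may be more nuanced.

```python
def schema_only_tables(tables: dict[str, str]) -> list[str]:
    """Return names of tables whose engine is outside the MergeTree family."""
    return sorted(
        name for name, engine in tables.items()
        if not engine.endswith("MergeTree")
    )


# Hypothetical inventory: table name -> engine.
inventory = {
    "events": "ReplicatedMergeTree",
    "events_daily": "SummingMergeTree",
    "events_queue": "Kafka",
    "geo_lookup": "Dictionary",
}
print(schema_only_tables(inventory))  # ['events_queue', 'geo_lookup']
```

Anything on that list needs its own recovery story, whether that means replaying from the source system or re-deriving it from tables that are backed up.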