// the find
linkedin/cruise-control
Cruise-control is the first of its kind to fully automate the dynamic workload rebalance and self-healing of a Kafka cluster. It provides great value to Kafka users by simplifying the operation of Kafka clusters.
Cruise Control is LinkedIn's Kafka cluster autopilot: it continuously monitors broker and partition-level resource utilization, builds a workload model, and automatically rebalances replicas or kicks off self-healing when brokers die or goals are violated. It's for teams running Kafka at scale (tens to hundreds of brokers) who want automated rebalancing instead of hand-tuning partition assignments. Running it at a handful of brokers is probably overkill.
The goal system is genuinely well-designed: a prioritized, pluggable list of constraints (rack awareness, capacity caps, utilization distribution) that the optimizer works through in order, so you can define exactly which tradeoffs you accept. Self-healing for broker failures is production-proven at LinkedIn's 10K+ broker scale, which is a meaningful signal. The metrics reporter ships as a separate JAR you drop into Kafka's lib directory, keeping the instrumentation side cleanly decoupled. Integration tests cover actual broker failure and disk failure scenarios rather than just unit-testing the optimizer math.
Setup friction is real: you need to manually copy the metrics reporter JAR onto every broker, configure capacity JSON files that must reflect actual hardware, and wait for multiple sampling windows before Cruise Control has enough data to do anything useful. The README glosses over how long that warmup actually takes. The Python CLI client is a maintenance orphan sitting in the same repo as the Java service, with no indication it tracks the REST API version. KRaft support is present, but the wiki and docs still reference ZooKeeper prominently, so you'll spend time figuring out what's actually current. Authentication is off by default, so unless you enable and configure it yourself, the REST API is wide open in a multi-tenant or externally accessible environment.
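Since the capacity file has to match real hardware, generating it from an inventory beats hand-editing. The sketch below assumes the layout of the sample capacity file in the repo's config directory (a `brokerCapacities` list, string values for `DISK`, `CPU`, `NW_IN`, `NW_OUT`, and broker id `-1` as the default entry); check those keys and their units against the version you deploy. The helper names and the numbers are made up for illustration.

```python
import json

# Resource keys as assumed from the sample capacity file; verify for your version.
RESOURCES = ("DISK", "CPU", "NW_IN", "NW_OUT")


def broker_capacity(broker_id, disk, cpu, nw_in, nw_out):
    """Build one capacity entry; values are emitted as strings, as in the sample file."""
    values = dict(zip(RESOURCES, (disk, cpu, nw_in, nw_out)))
    for name, value in values.items():
        if value <= 0:
            raise ValueError(f"{name} capacity for broker {broker_id} must be positive")
    return {
        "brokerId": str(broker_id),
        "capacity": {name: str(value) for name, value in values.items()},
    }


def capacity_document(default, overrides=()):
    """Emit the default entry (broker id -1, assumed fallback) followed by per-broker overrides."""
    entries = [broker_capacity(-1, **default)]
    entries += [broker_capacity(**override) for override in overrides]
    return {"brokerCapacities": entries}


# Illustrative numbers only; substitute what your hardware actually provides.
doc = capacity_document(
    default={"disk": 500000, "cpu": 100, "nw_in": 50000, "nw_out": 50000},
    overrides=[{"broker_id": 7, "disk": 1000000, "cpu": 100, "nw_in": 100000, "nw_out": 100000}],
)
print(json.dumps(doc, indent=2))
```

Keeping the inventory as data and failing on non-positive values catches the most common mistake early: a broker entry copied from a template and never filled in.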