// the find
apache/kafka
Apache Kafka - A distributed event streaming platform
Apache Kafka is the de facto standard for distributed event streaming — a persistent, fault-tolerant log that decouples producers from consumers at massive scale. It's for teams running data pipelines, event-driven architectures, or anything that needs durable, ordered, high-throughput message delivery. If you're considering it, you already know what it is.
1. KRaft mode (ZooKeeper removal) is stable and, as of Kafka 4.0, the only supported mode. This eliminates one of the biggest operational headaches Kafka had for years: you no longer need to run and version-match a separate ZK ensemble.
2. The protocol is versioned and backward-compatible. The message definitions under clients/src/main/resources/common/message are the source from which the protocol classes are generated, so wire compatibility is taken seriously, not bolted on.
3. Kafka Streams ships as a library inside the same repo, so the streaming layer is released in lockstep with the broker and there is no version drift between your broker and your stream processing layer.
4. The test infrastructure is serious: JMH benchmarks, Trogdor for distributed workload testing, and flaky test tracking via CI. This is a project that actually cares about correctness under load.
1. Still Java/Scala. The Scala dependency (locked to 2.13) adds build complexity and JAR weight that most teams never actually need; it's there for legacy reasons and the direction is Java-only, but that migration isn't done.
2. Consumer group rebalances are still a production incident waiting to happen at scale. Cooperative incremental rebalancing helped, but a large consumer group rebalancing under load will still cause meaningful pause times and requires careful tuning of session timeouts, heartbeat intervals, and max.poll.interval.ms.
3. Operational surface area is significant: tiered storage, rack awareness, quotas, ACLs, log compaction tuning. Getting a cluster right in production requires real expertise; the 'just run the Docker image' path gets you started but won't survive anything serious.
4. No built-in schema registry. You need Confluent's or a third-party solution to enforce message contracts, which is a gap that bites every team eventually when producers start sending breaking changes.
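
To make the rebalance tuning from the list above concrete, here is a minimal sketch of the consumer settings involved. It assumes the classic consumer group protocol and uses only `java.util.Properties`, so it runs without the Kafka client on the classpath. The config keys are real consumer settings; the class name, the helper methods, the broker address, and every numeric value are illustrative assumptions, not recommendations.

```java
import java.util.Properties;

class ConsumerTuningSketch {

    // Builds consumer properties aimed at fewer and shorter rebalances.
    // All values below are assumed for illustration; tune them per workload.
    public static Properties rebalanceFriendlyConfig(String groupId, String instanceId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.setProperty("group.id", groupId);

        // Cooperative incremental rebalancing: only the partitions that move are revoked.
        props.setProperty("partition.assignment.strategy",
                "org.apache.kafka.clients.consumer.CooperativeStickyAssignor");

        // Static membership: a restart within the session timeout keeps the same assignment.
        props.setProperty("group.instance.id", instanceId);

        // How long the coordinator waits without heartbeats before evicting the member.
        props.setProperty("session.timeout.ms", "45000");
        // Heartbeat frequency; commonly kept at or below one third of the session timeout.
        props.setProperty("heartbeat.interval.ms", "15000");

        // Maximum gap between poll() calls before the consumer is considered stuck.
        props.setProperty("max.poll.interval.ms", "300000");
        // Smaller batches make it easier to finish processing inside that gap.
        props.setProperty("max.poll.records", "200");
        return props;
    }

    // Sanity check: is the heartbeat interval at most one third of the session timeout?
    public static boolean heartbeatWithinSessionBudget(Properties props) {
        long session = Long.parseLong(props.getProperty("session.timeout.ms"));
        long heartbeat = Long.parseLong(props.getProperty("heartbeat.interval.ms"));
        return heartbeat * 3 <= session;
    }

    public static void main(String[] args) {
        Properties props = rebalanceFriendlyConfig("orders", "orders-consumer-1");
        System.out.println("heartbeat ok: " + heartbeatWithinSessionBudget(props));
        System.out.println("max.poll.interval.ms = " + props.getProperty("max.poll.interval.ms"));
    }
}
```

In a real application these properties would be passed to a `KafkaConsumer` along with deserializer settings. The trade-off is the usual one: longer timeouts mean fewer spurious rebalances but slower detection of consumers that have actually died.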