// the find
danielqsj/kafka_exporter
Kafka exporter for Prometheus
A Prometheus exporter for Kafka that exposes broker, topic partition, and consumer group lag metrics via a standard /metrics endpoint. It uses Sarama under the hood and supports a wide range of auth mechanisms (SASL plain/SCRAM/Kerberos/AWS IAM/OAuthBearer, mTLS). Aimed at ops teams running Kafka who want Prometheus/Grafana monitoring without standing up a full JMX pipeline.
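To make the `/metrics` output concrete, here is a minimal Go sketch that totals per-partition lag by consumer group from scraped text. It is not code from the exporter. The metric name `kafka_consumergroup_lag` and the `consumergroup` label are assumptions about the exporter's output, and `SumLagByGroup` is a hypothetical helper, so check both against a real scrape.

```go
package main

import (
	"bufio"
	"regexp"
	"strconv"
	"strings"
)

// lagLine matches one sample of the per-partition lag metric and captures
// the label set and the value. The metric name is an assumption.
var lagLine = regexp.MustCompile(`^kafka_consumergroup_lag\{([^}]*)\}\s+(\S+)`)

// groupLabel pulls the (assumed) consumergroup label out of a label set.
var groupLabel = regexp.MustCompile(`consumergroup="([^"]*)"`)

// SumLagByGroup reads Prometheus text exposition output and returns the
// total lag per consumer group, summed over every topic and partition.
func SumLagByGroup(metrics string) map[string]float64 {
	totals := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(metrics))
	for sc.Scan() {
		m := lagLine.FindStringSubmatch(sc.Text())
		if m == nil {
			continue // comment lines and every other metric
		}
		g := groupLabel.FindStringSubmatch(m[1])
		if g == nil {
			continue // sample without a consumergroup label
		}
		v, err := strconv.ParseFloat(m[2], 64)
		if err != nil {
			continue // skip values that are not plain numbers
		}
		totals[g[1]] += v
	}
	return totals
}
```

Calling `SumLagByGroup(body)` on the response body of a scrape gives one number per group, which is the quantity to alert on when a consumer stalls. In practice a PromQL `sum by (consumergroup)` does the same job; the sketch only shows what the raw samples contain.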
The auth surface is genuinely broad: AWS IAM, Kerberos, OAuthBearer, and SCRAM SHA-256/512 are all first-class flags, which covers most enterprise Kafka deployments. Per-partition consumer group lag is the metric that actually matters for diagnosing stuck consumers, and it's exposed properly, with both per-partition and summed variants. There's a bundled Helm chart with a ServiceMonitor resource, so getting this into a kube-native Prometheus Operator setup is one values file away. The `topic.filter`/`topic.exclude` and `group.filter`/`group.exclude` regex flags are essential on clusters with hundreds of topics: they let you drop topics and groups you don't care about before they become time series, which keeps label cardinality in check.
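The include/exclude pairing can be sketched in a few lines of Go. This is an illustration of the idea, not the exporter's source: `Filter`, `NewFilter`, and `Keep` are hypothetical names, and the rule "keep a name if it matches the include pattern and does not match the exclude pattern" is an assumption about how the flags combine.

```go
package main

import "regexp"

// Filter decides which topic or group names are kept for export.
// Hypothetical type, used here only to illustrate the flag pairing.
type Filter struct {
	include *regexp.Regexp
	exclude *regexp.Regexp
}

// NewFilter compiles both patterns and fails fast on a bad regex.
func NewFilter(include, exclude string) (*Filter, error) {
	inc, err := regexp.Compile(include)
	if err != nil {
		return nil, err
	}
	exc, err := regexp.Compile(exclude)
	if err != nil {
		return nil, err
	}
	return &Filter{include: inc, exclude: exc}, nil
}

// Keep reports whether name matches the include pattern and does not
// match the exclude pattern. MatchString is unanchored, so patterns
// need explicit ^ and $ to match whole names.
func (f *Filter) Keep(name string) bool {
	return f.include.MatchString(name) && !f.exclude.MatchString(name)
}
```

With `NewFilter(`^orders\.`, `\.dlq$`)`, `Keep("orders.created")` is true while `Keep("orders.created.dlq")` and `Keep("payments.created")` are false. Because Go regex matching is unanchored, a bare pattern like `orders` would also keep `reorders-v2`, which is an easy way to export more series than intended.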
The entire codebase lives in a single file (`kafka_exporter.go`); for an exporter with this many auth modes and metrics, that makes it hard to read and harder to test, and `simple_test.go` is almost certainly not testing much of the real scrape logic. The warning on the `concurrent.enable` flag ('WARN: This should be disabled on large clusters') hints at a design where the concurrency model is an afterthought rather than something you can trust. It doesn't expose producer metrics at all (throughput, batch sizes, send errors), so you'll still need JMX for anything beyond consumer lag and partition health. The Grafana dashboard ID is hardcoded in the README but not versioned with the repo, so the dashboard JSON in the repo and the published one on grafana.com can quietly diverge.