finds.dev

// the find

kaiwaehner/kafka-streams-machine-learning-examples

★ 912 · Java · Apache-2.0 · updated Dec 2023

This project contains examples which demonstrate how to deploy analytic models to mission-critical, scalable production environments leveraging Apache Kafka and its Streams API. Models are built with Python, H2O, TensorFlow, Keras, DeepLearning4J and other technologies.

A collection of Java examples showing how to run ML model inference inside Kafka Streams processors — load a trained model (H2O, TensorFlow, DL4J, Keras via DL4J import) once at startup, then score each message in the stream topology. Aimed at Java backend engineers who already know Kafka and want to add inference without a separate model-serving tier.
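The load-once, score-per-message pattern described above can be sketched without any Kafka or ML dependency. This is a minimal illustration, not code from the repo: `Scorer`, `InProcessScoring`, `loadModelOnce`, and the placeholder weights are all hypothetical names and values, and a plain list stands in for the stream.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a trained model; not a type from the repo or any library.
interface Scorer {
    double score(double[] features);
}

final class InProcessScoring {
    // Assumed loader. A real one would deserialize an H2O, TensorFlow, or DL4J artifact.
    static Scorer loadModelOnce() {
        final double[] weights = {0.5, -0.25, 1.0}; // placeholder weights, made up for illustration
        return features -> {
            if (features.length != weights.length) {
                throw new IllegalArgumentException("expected " + weights.length + " features");
            }
            double sum = 0.0;
            for (int i = 0; i < weights.length; i++) {
                sum += weights[i] * features[i];
            }
            return sum;
        };
    }

    // Turns a comma-separated record such as "2,4,1" into a feature vector.
    static double[] parse(String csv) {
        String[] parts = csv.split(",");
        double[] features = new double[parts.length];
        for (int i = 0; i < parts.length; i++) {
            features[i] = Double.parseDouble(parts[i].trim());
        }
        return features;
    }

    // Scores every record with the same already-loaded model instance.
    static List<String> scoreAll(Scorer model, List<String> records) {
        List<String> out = new ArrayList<>();
        for (String record : records) {
            out.add(String.valueOf(model.score(parse(record))));
        }
        return out;
    }
}
```

In a Kafka Streams topology, the per-record call would typically sit inside a stateless step such as `mapValues`, with the model held in a field that is initialized before the topology starts, so the load cost is paid once rather than per message.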

Each example ships with pre-trained model binaries checked in, so you can run `mvn test` and see it work without training anything. The tests start an embedded Kafka cluster, so no local broker is needed. The H2O POJO export approach (the model compiled to a plain Java class) is genuinely clever: zero network hop, no HTTP overhead, and the model runs in-process with the stream processor. The TensorFlow example uses the SavedModel format with the official Java bindings, which is the right call rather than some hacky subprocess.

Last touched December 2023, but the dependencies are much older: Kafka 2.5 and Java 8, both well past their useful life. The TF Java bindings API has changed significantly since this was written, and the TensorFlow example will not compile against current versions. The DL4J/ND4J dependency is enormous (multiple GB including native libs), and DL4J itself is largely abandoned in favor of ONNX or TorchScript approaches. There is no model reloading at runtime: swapping a model means restarting the JVM, which is a real operational gap for production use. The repo conflates 'here is a pattern' with 'here is a production-ready library' when it's firmly the former: no backpressure handling, no latency metrics, no dead-letter topic for scoring failures.
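Two of the gaps named above, runtime model swapping and a dead-letter path for scoring failures, are small to sketch in plain Java. The following is one possible shape under stated assumptions, not something the repo provides: `SwappableScorer` is a hypothetical class, the in-memory lists stand in for an output topic and a dead-letter topic, and the sketch assumes a single processing thread (the lists are not thread-safe, only the model reference is).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.ToDoubleFunction;

// Hypothetical wrapper: holds the active model behind an atomic reference so it
// can be replaced without restarting the JVM, and diverts failures to a side list.
final class SwappableScorer {
    private final AtomicReference<ToDoubleFunction<double[]>> current;

    // Stand-ins for an output topic and a dead-letter topic (assumption).
    final List<String> scored = new ArrayList<>();
    final List<String> deadLetters = new ArrayList<>();

    SwappableScorer(ToDoubleFunction<double[]> initial) {
        this.current = new AtomicReference<>(initial);
    }

    // Publishes a new model; records processed after this call use it.
    void swap(ToDoubleFunction<double[]> next) {
        current.set(next);
    }

    // Scores one comma-separated record; anything that fails to parse or score
    // is kept as a dead letter instead of crashing the processing loop.
    void process(String record) {
        try {
            String[] parts = record.split(",");
            double[] features = new double[parts.length];
            for (int i = 0; i < parts.length; i++) {
                features[i] = Double.parseDouble(parts[i].trim());
            }
            double score = current.get().applyAsDouble(features);
            scored.add(record + " -> " + score);
        } catch (RuntimeException e) {
            deadLetters.add(record);
        }
    }
}
```

A caller might construct it with `new SwappableScorer(f -> f[0] + f[1])`, call `process` per record, and call `swap` from whatever mechanism signals that a new model is available. How that signal arrives (a control topic, a file watcher, an admin endpoint) is a separate design decision that this sketch leaves open.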

