// the find
hazelcast/hazelcast-jet
Distributed Stream and Batch Processing
Hazelcast Jet was a distributed stream and batch processing engine built on the Hazelcast IMDG platform. It was merged into the main Hazelcast codebase with the 5.0 release, so this repo is a dead end. If you're evaluating it for a new project, you're looking at the wrong repo; go to github.com/hazelcast/hazelcast instead.
The cooperative multithreading execution model was genuinely clever: it runs thousands of concurrent jobs on a fixed thread pool, and sub-10ms latency at 10M events/second on a single node is a real engineering achievement. Fault tolerance came from Chandy-Lamport-style distributed snapshots without requiring ZooKeeper or a distributed filesystem, a significant operational simplification compared to Flink. The connector library was broad (Kafka, Debezium CDC, JDBC, JMS, S3/GCS/ADLS, Elasticsearch, MongoDB), so most data pipeline needs were covered out of the box. The Pipeline API is clean; the word count and sensor aggregation examples in the README are actually readable Java, not the configuration XML nightmare you get with older Hadoop-era tools.
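To make the cooperative execution idea above concrete, here is a minimal plain-Java sketch of the general technique: many small non-blocking units of work share a fixed set of threads, and each unit does a bounded step and then yields. It uses only the JDK. Every name in it (CooperativeSketch, Tasklet, CountingTasklet, Worker, runAll) is hypothetical and chosen for illustration; it is not Jet's actual scheduler or API, and it leaves out everything a real engine needs (backpressure, idle back-off, work distribution across nodes).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative only: hypothetical names, not Jet's internal classes.
final class CooperativeSketch {

    /** A unit of work that performs one small, bounded step and never blocks. */
    interface Tasklet {
        /** @return true once the tasklet has finished all of its work */
        boolean call();
    }

    /** Emits `limit` increments into a shared counter, at most BATCH per call. */
    static final class CountingTasklet implements Tasklet {
        private static final int BATCH = 16;
        private final int limit;
        private final AtomicLong sink;
        private int done; // touched only by the owning worker thread

        CountingTasklet(int limit, AtomicLong sink) {
            this.limit = limit;
            this.sink = sink;
        }

        @Override
        public boolean call() {
            int end = Math.min(limit, done + BATCH);
            for (; done < end; done++) {
                sink.incrementAndGet();
            }
            return done == limit; // yield to the next tasklet either way
        }
    }

    /** One thread that round-robins over its own tasklets until all are done. */
    static final class Worker implements Runnable {
        private final Queue<Tasklet> tasklets = new ArrayDeque<>();

        void add(Tasklet t) {
            tasklets.add(t);
        }

        @Override
        public void run() {
            Tasklet t;
            while ((t = tasklets.poll()) != null) {
                if (!t.call()) {
                    tasklets.add(t); // unfinished: goes to the back of the queue
                }
            }
        }
    }

    /** Spreads tasklets over a fixed number of threads and returns the total count. */
    static long runAll(int taskletCount, int itemsEach, int threadCount) {
        AtomicLong sink = new AtomicLong();
        List<Worker> workers = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            workers.add(new Worker());
        }
        for (int i = 0; i < taskletCount; i++) {
            workers.get(i % threadCount).add(new CountingTasklet(itemsEach, sink));
        }
        List<Thread> threads = new ArrayList<>();
        for (Worker w : workers) {
            Thread th = new Thread(w);
            threads.add(th);
            th.start();
        }
        try {
            for (Thread th : threads) {
                th.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return sink.get();
    }

    public static void main(String[] args) {
        // 10,000 tasklets share 4 threads; each tasklet emits 100 items.
        System.out.println(runAll(10_000, 100, 4)); // prints 1000000
    }
}
```

The point of the design is that the thread count stays fixed no matter how many tasklets exist, so adding jobs adds queue entries rather than OS threads and context switches. The catch is equally visible: a tasklet that blocks or runs long inside call() stalls every other tasklet on the same worker.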
This repo is archived/superseded: the README itself says development moved to the main Hazelcast repo with 5.0, so any bug you file here goes nowhere. The last push was December 2024, but the codebase targets JDK 8 (the CI explicitly says 'Oracle JDK 8'), which means no records, no virtual threads, no modern Java. The dual-license model (Apache 2.0 for some files, Hazelcast Community License for others) is a trap: the Community License restricts competing commercial use, so you need to audit every file before shipping a product. Documentation lives on jet-start.sh, which may not survive as an independent domain now that the project is folded into Hazelcast proper.