// the find
numaproj/numaflow
Kubernetes-native platform to run massively parallel data/streaming jobs
Numaflow is a Kubernetes-native stream processing platform from the Intuit/Argo team, built to run DAG-style pipelines where each vertex auto-scales independently. It's aimed at platform teams that want Kafka Streams- or Flink-style event processing expressed as Kubernetes CRDs rather than run on a separate cluster. The core is written in Go, even though the repo's metadata now tags Rust as its primary language, which is misleading.
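To make the "pipeline as a CRD" idea concrete, here is a minimal sketch of the shape such a manifest takes, written as a Python dictionary so the topology can be checked in code. The `apiVersion`, `kind`, and the `vertices`/`edges` layout follow Numaflow's published Pipeline examples as best recalled, so verify field names against the current CRD reference. The pipeline name, vertex names, container image, and the `check_topology` helper are hypothetical.

```python
# Sketch of a Numaflow-style Pipeline manifest as plain data.
# Assumption: field names mirror the project's published examples;
# the names, image, and helper below are illustrative, not project code.
pipeline = {
    "apiVersion": "numaflow.numaproj.io/v1alpha1",
    "kind": "Pipeline",
    "metadata": {"name": "example-pipeline"},  # hypothetical name
    "spec": {
        "vertices": [
            # Source vertex: a built-in generator emitting test messages.
            {"name": "in", "source": {"generator": {"rpu": 5, "duration": "1s"}}},
            # UDF vertex: user code in its own container (hypothetical image).
            {"name": "transform",
             "udf": {"container": {"image": "example.com/my-udf:latest"}}},
            # Sink vertex: log each message.
            {"name": "out", "sink": {"log": {}}},
        ],
        # Edges wire the vertices into a DAG.
        "edges": [
            {"from": "in", "to": "transform"},
            {"from": "transform", "to": "out"},
        ],
    },
}


def check_topology(manifest):
    """Return the edges that reference a vertex not declared in the manifest."""
    names = {v["name"] for v in manifest["spec"]["vertices"]}
    return [
        e for e in manifest["spec"]["edges"]
        if e["from"] not in names or e["to"] not in names
    ]
```

Each vertex is its own unit in the spec, which is what lets the platform scale them independently.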
Exactly-once semantics with at-least-once as the floor is a real guarantee, not hand-waving: it's implemented via a WAL and tracked in the inter-step buffer service. The MonoVertex mode (single-container pipelines without an ISB) is a smart addition for simpler workloads that don't need the full pipeline overhead. Language-agnostic vertex UDFs via gRPC mean you can drop in Python or Rust code without rewriting the infrastructure. Autoscaling with back-pressure propagation is built in at the platform level, not bolted on: each vertex scales based on buffer depth, which is the right signal.
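Why buffer depth is a good scaling signal is easier to see with numbers. The sketch below is not Numaflow's actual autoscaler. It is a generic illustration of the idea, assuming a hypothetical `desired_replicas` function that sizes a vertex so its pending backlog drains within a target time, given a measured per-replica processing rate. All parameter names and defaults are assumptions.

```python
import math


def desired_replicas(pending, rate_per_replica, target_seconds,
                     min_replicas=1, max_replicas=10):
    """Replicas needed to drain `pending` messages within `target_seconds`.

    Illustrative only: parameter names and defaults are assumptions,
    not Numaflow configuration.
    """
    if rate_per_replica <= 0:
        # No usable throughput measurement: scale out fully if there is
        # a backlog, otherwise stay at the floor.
        return max_replicas if pending > 0 else min_replicas
    needed = math.ceil(pending / (rate_per_replica * target_seconds))
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, needed))


# 12,000 pending messages, 100 msg/s per replica, 20 s drain target.
print(desired_replicas(12_000, 100, 20))  # prints 6
```

A CPU-based autoscaler would miss a vertex that is idle on CPU but falling behind on I/O. Backlog per vertex captures both cases, and it is also the quantity that back-pressure acts on upstream.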
The repo language is listed as Rust but the codebase is primarily Go, a basic metadata error that signals less-than-careful housekeeping. The Inter-Step Buffer Service (JetStream or Redis) is a mandatory dependency for any multi-vertex pipeline, however simple; there's no embedded option, so outside MonoVertex mode your 'serverless' pipeline still requires operating a stateful NATS JetStream or Redis deployment. The serving/HTTP source story is thin: real-time request/response over a streaming pipeline is possible, but the docs make it look experimental. With 2.5k stars and Intuit backing, it's not obscure, but the ecosystem of pre-built connectors is sparse compared to Kafka Streams or Flink, so you'll write more custom sources and sinks than you might expect.