// the find
arkflow-rs/arkflow
High performance Rust stream processing engine seamlessly integrates AI capabilities, providing powerful real-time data processing and intelligent analysis.
ArkFlow is a YAML-configured stream processing engine built on Tokio and Apache Arrow/DataFusion, positioning itself as a Rust alternative to Benthos/Redpanda Connect. You define pipelines declaratively: pick an input source (Kafka, MQTT, HTTP, databases, Modbus), run data through processors (SQL via DataFusion, Python scripts, VRL, Protobuf), and route to outputs. The AI angle is thin: it is mainly the Python processor calling out to inference code, not built-in model execution, despite what the description implies.
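To make the input, processors, output shape concrete, here is a minimal plain-Python sketch of that flow. It is not ArkFlow's API or config schema: `run_stream`, `json_to_rows`, and `keep_hot` are hypothetical names, and lists of dicts stand in for Arrow batches.

```python
import json
from typing import Any, Callable, Iterable

Batch = list[Any]  # plain-Python stand-in for a columnar record batch
Processor = Callable[[Batch], Batch]


def json_to_rows(batch: Batch) -> Batch:
    """Decode raw JSON strings into row dicts."""
    return [json.loads(raw) for raw in batch]


def keep_hot(batch: Batch) -> Batch:
    """Filter step, similar in spirit to a SQL WHERE value >= 10."""
    return [row for row in batch if row["value"] >= 10]


def run_stream(
    source: Iterable[Batch],
    processors: list[Processor],
    sink: Callable[[Batch], None],
) -> None:
    """Pull batches from the source, apply processors in order, emit to the sink."""
    for batch in source:
        for process in processors:
            batch = process(batch)
        if batch:  # skip batches that were filtered down to nothing
            sink(batch)


# Hypothetical input: two batches of raw JSON messages.
source = [
    ['{"sensor": "temp_1", "value": 7}', '{"sensor": "temp_2", "value": 12}'],
    ['{"sensor": "temp_1", "value": 15}'],
]
collected: list[dict] = []
run_stream(source, [json_to_rows, keep_hot], collected.extend)
```

In a declarative engine the list of processors comes from the config file rather than from code, but the data flow is the same.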
Using Apache Arrow as the internal message format is the right call: passing zero-copy column batches through the pipeline is genuinely fast, and DataFusion SQL queries run directly on that memory layout without a serialization step. The windowing primitives (tumbling, sliding, session) are first-class buffer types rather than bolted-on afterthoughts, which matters for stateful aggregations. VRL support is a good pick for transformation logic, since it is a proven DSL from the Vector ecosystem with solid error-handling semantics. The plugin architecture is clean: inputs, processors, outputs, and buffers each implement their own trait, so extending it doesn't require forking core.
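For readers who don't have the three window types memorized, the sketch below shows how each one groups the same events. It illustrates the general definitions only, not ArkFlow's buffer implementation. The function names, integer timestamps, and window parameters are all assumptions made for the example.

```python
from collections import defaultdict


def tumbling(events, size):
    """Fixed, non-overlapping windows: each event lands in exactly one."""
    windows = defaultdict(list)
    for ts, value in events:
        windows[(ts // size) * size].append(value)
    return dict(windows)


def sliding(events, size, step):
    """Windows of length `size` starting every `step`: events can repeat."""
    windows = defaultdict(list)
    for ts, value in events:
        # Earliest window start (a multiple of step) that still covers ts.
        start = max(((ts - size) // step + 1) * step, 0)
        while start <= ts:
            windows[start].append(value)
            start += step
    return dict(windows)


def session(events, gap):
    """A new session opens whenever the idle time exceeds `gap`."""
    sessions = []
    last_ts = None
    for ts, value in sorted(events):
        if last_ts is None or ts - last_ts > gap:
            sessions.append([])
        sessions[-1].append(value)
        last_ts = ts
    return sessions


# Hypothetical (timestamp, payload) events.
events = [(1, "a"), (4, "b"), (12, "c"), (13, "d"), (30, "e")]

tumbling_result = tumbling(events, size=10)
sliding_result = sliding(events, size=10, step=5)
session_result = session(events, gap=5)
```

With these inputs, tumbling windows split the events into three disjoint groups, sliding windows place "c" and "d" in two overlapping windows each, and session windows cut wherever more than 5 time units pass without an event.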
The 'AI capabilities' claim in the description is marketing: there's no built-in model runtime. You call Python, embedded or as a subprocess, which means GIL contention or process overhead that undercuts the Rust performance story. Error routing exists (the error_output field), but there's no documented dead-letter queue or replay mechanism, so a bad message that keeps failing will either block the stream or get silently dropped, depending on config. Only one production user is listed (a Korean company), and the project is at v0.5: the API surface will move, and the versioned docs already show breaking changes between 0.2 and 0.5. Backpressure handling upstream to Kafka consumers isn't explained anywhere in the README or docs, and that is the failure mode that bites people first in production.
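To show what is missing on the error path, here is a minimal sketch of bounded retries with a dead-letter list and a replay hook. This is a generic pattern, not something ArkFlow provides. `RetryingStage`, `DeadLetter`, and the in-memory lists are hypothetical; a real system would persist dead letters somewhere durable.

```python
import json
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class DeadLetter:
    message: str
    error: str
    attempts: int


@dataclass
class RetryingStage:
    process: Callable[[str], str]
    max_attempts: int = 3
    delivered: list = field(default_factory=list)
    dead_letters: list = field(default_factory=list)

    def handle(self, message: str) -> None:
        """Try a message up to max_attempts times, then park it."""
        last_error = ""
        for _ in range(self.max_attempts):
            try:
                self.delivered.append(self.process(message))
                return
            except Exception as exc:  # broad on purpose: any failure counts
                last_error = repr(exc)
        # Neither blocks the stream nor drops silently: the failure is kept.
        self.dead_letters.append(
            DeadLetter(message, last_error, self.max_attempts)
        )

    def replay(self) -> None:
        """Re-run parked messages, e.g. after a fix has been deployed."""
        pending, self.dead_letters = self.dead_letters, []
        for letter in pending:
            self.handle(letter.message)


# Hypothetical processor: extract the sensor name from a JSON message.
stage = RetryingStage(process=lambda raw: json.loads(raw)["sensor"])
for raw in ['{"sensor": "temp_1"}', "not json", '{"sensor": "temp_2"}']:
    stage.handle(raw)
```

The malformed message ends up in `stage.dead_letters` with its error and attempt count, while the two valid messages still flow through. That is the behavior you would want guaranteed before putting a poison message anywhere near production.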