// the find
redpanda-data/connect
Fancy stream processing made operationally mundane
Redpanda Connect (formerly Benthos) is a declarative stream processor: you write a YAML file describing sources, transforms, and sinks, and a single static binary runs it. It covers the ETL/CDC spectrum (Kafka, NATS, RabbitMQ, S3, Postgres CDC, Iceberg), with a custom mapping language, Bloblang, handling the transformation layer. It's aimed at platform and data engineers who want Kafka Streams-level capability without the JVM or a distributed cluster.
Bloblang is well designed: a functional mapping language with proper error handling that avoids the template-string hell of most YAML-based ETL tools. The in-process transaction model for at-least-once delivery is a real architectural advantage, since it needs no external state store and no ZooKeeper, just the pipeline. CDC connector breadth is impressive (Postgres, MySQL, MongoDB, Oracle, MSSQL, Spanner, Salesforce, DynamoDB), where most competitors pick two or three. The plugin API is a set of clean Go interfaces, so adding a custom connector doesn't require forking the whole thing.
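To make the at-least-once point concrete, here is a minimal Go sketch of the general idea: the source keeps a message pending until the sink confirms the write, so no separate state store is needed to track progress. This is not Redpanda Connect's actual code; `Message`, `Source`, `FlakySink`, and `Run` are hypothetical names invented for illustration, and the whole thing runs in memory on a single goroutine.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Message pairs a payload with an ack callback owned by the source.
// Ack(nil) marks the message as delivered; Ack(err) requests redelivery.
// (Hypothetical type, not part of any real library.)
type Message struct {
	Payload string
	Ack     func(err error)
}

// Source is a hypothetical in-memory input that keeps every message
// pending until it has been acknowledged.
type Source struct {
	pending []string // not yet acknowledged, oldest first
	Acked   []string // acknowledged payloads, in ack order
}

func NewSource(items ...string) *Source {
	return &Source{pending: append([]string(nil), items...)}
}

// Next returns the oldest unacknowledged message, or false when drained.
func (s *Source) Next() (Message, bool) {
	if len(s.pending) == 0 {
		return Message{}, false
	}
	p := s.pending[0]
	return Message{
		Payload: p,
		Ack: func(err error) {
			if err != nil {
				return // nack: leave it pending so Next hands it out again
			}
			s.pending = s.pending[1:]
			s.Acked = append(s.Acked, p)
		},
	}, true
}

// FlakySink fails its first FailFirst writes, then succeeds.
type FlakySink struct {
	FailFirst int
	Attempts  int
	Written   []string
}

func (k *FlakySink) Write(p string) error {
	k.Attempts++
	if k.Attempts <= k.FailFirst {
		return errors.New("sink unavailable")
	}
	k.Written = append(k.Written, p)
	return nil
}

// Run drives source -> transform -> sink and passes the sink's result
// straight back to the source as the ack. It retries forever, so it only
// terminates if the sink eventually accepts every message.
func Run(src *Source, transform func(string) string, sink *FlakySink) {
	for {
		msg, ok := src.Next()
		if !ok {
			return
		}
		msg.Ack(sink.Write(transform(msg.Payload)))
	}
}

func main() {
	src := NewSource("a", "b")
	sink := &FlakySink{FailFirst: 1}
	Run(src, strings.ToUpper, sink)
	fmt.Println(sink.Written, src.Acked, sink.Attempts) // prints "[A B] [a b] 3"
}
```

The first write fails, so "a" is never acknowledged and is simply handed out again. The trade-off of this style is duplicates: if the process died after a successful write but before the ack, the message would be delivered a second time on restart, which is why the guarantee is at-least-once and not exactly-once.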
On the downside, Bloblang is expressive but still a bespoke language you have to learn and debug: when a mapping goes wrong in production you're reading unfamiliar error output, and IDE support is limited to a beta Claude plugin. The enterprise/free split is murky: certain connectors (CDC for some sources, Iceberg, cloud-specific inputs) live behind an enterprise license that the repo itself doesn't clearly document. Stateful operations beyond simple deduplication and windowing get awkward fast; if you need joins across streams or complex aggregations, you'll hit the ceiling and have to reach for Flink. The single-binary model is operationally simple until you need to scale one stage of a pipeline independently, at which point you're back to running multiple configs and managing them yourself.