
// the find

apache/hop

★ 1,399 · Java · Apache-2.0 · updated Jun 2026

Hop Orchestration Platform

Apache Hop is a visual ETL and workflow orchestration tool that forked from Kettle/Pentaho Data Integration. It targets data engineers who need a GUI-driven pipeline builder with broad connector support and want to avoid paying for Informatica or Talend. Think PDI with a modern plugin architecture and actual Apache governance.

The plugin system is genuinely well-structured: transforms, workflows, and run configurations are all pluggable via annotation-based discovery, so adding a new connector doesn't require touching core. The separation between pipelines (streaming row-by-row transforms) and workflows (sequential action orchestration) is a correct design that most ETL tools blur. It ships with a substantial sample library, including loops, parallel workflows, and parameterized pipelines, so you can get oriented without reading docs. Daily CI on Jenkins and SonarCloud integration suggest the codebase isn't rotting.
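To make "annotation-based discovery" concrete, here is a minimal, generic sketch of the pattern in plain Java. It is not Hop's API: the `@DemoTransform` annotation, the `RowTransform` interface, and the hard-coded candidate list are all hypothetical stand-ins (a real system would scan the classpath or plugin jars instead of a fixed list). The point is only that the registry is built from metadata on the plugin class, so core code never names a plugin directly.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PluginDiscoverySketch {

    /** Hypothetical marker annotation (an assumption, not Hop's real one). */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface DemoTransform {
        String id();
        String name();
    }

    /** Hypothetical plugin contract: one input row in, one output row out. */
    interface RowTransform {
        String apply(String row);
    }

    @DemoTransform(id = "upper", name = "Uppercase")
    static class UppercaseTransform implements RowTransform {
        public String apply(String row) { return row.toUpperCase(); }
    }

    @DemoTransform(id = "trim", name = "Trim whitespace")
    static class TrimTransform implements RowTransform {
        public String apply(String row) { return row.trim(); }
    }

    /** Implements the contract but carries no annotation, so it is skipped. */
    static class NotAPlugin implements RowTransform {
        public String apply(String row) { return row; }
    }

    /** Stand-in for a classpath scan: a fixed list of candidate classes. */
    static final List<Class<? extends RowTransform>> CANDIDATES =
            List.of(UppercaseTransform.class, TrimTransform.class, NotAPlugin.class);

    /** Builds an id -> class registry from the annotated candidates only. */
    static Map<String, Class<? extends RowTransform>> discover(
            List<Class<? extends RowTransform>> candidates) {
        Map<String, Class<? extends RowTransform>> registry = new TreeMap<>();
        for (Class<? extends RowTransform> candidate : candidates) {
            DemoTransform meta = candidate.getAnnotation(DemoTransform.class);
            if (meta != null) {
                registry.put(meta.id(), candidate);
            }
        }
        return registry;
    }

    /** Returns the discovered plugin ids in sorted order. */
    static List<String> discoveredIds() {
        return new ArrayList<>(discover(CANDIDATES).keySet());
    }

    /** Looks a plugin up by id, instantiates it reflectively, and applies it. */
    static String run(String id, String row) {
        Class<? extends RowTransform> type = discover(CANDIDATES).get(id);
        if (type == null) {
            throw new IllegalArgumentException("No plugin registered for id: " + id);
        }
        try {
            return type.getDeclaredConstructor().newInstance().apply(row);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not instantiate plugin: " + id, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(discoveredIds());     // prints [trim, upper]
        System.out.println(run("upper", "hop")); // prints HOP
    }
}
```

Adding a third transform here means writing one annotated class and making it visible to the scan; `discover` and `run` stay untouched, which is the property the review credits to Hop's plugin design.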

The PDI heritage shows: the codebase has Kettle-era patterns like `Const.java` doing everything, `BlockingRowSet` concurrency that dates to pre-Java-5 mental models, and XML everywhere under the hood. The build is a full Maven multi-module monorepo that takes 10+ minutes to compile from scratch, with no incremental dev story worth mentioning. The GUI is SWT-based, so the promised platform-native look-and-feel is a lie on anything that isn't Windows, and it requires a desktop environment, which rules out a headless dev loop. Documentation lags the code; the community mailing list and scattered AsciiDoc pages are the real source of truth, and both assume you already know PDI.
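For contrast with the "pre-Java-5" remark, this is roughly what a bounded row hand-off between two pipeline stages looks like when written directly against `java.util.concurrent`, which arrived in Java 5. It is an illustrative sketch only and makes no claim about how Hop's `BlockingRowSet` is implemented; the class name, the sentinel, and the uppercase "transform" are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class RowHandoffSketch {

    // End-of-stream marker, compared by identity so no real row can collide with it.
    private static final String END_OF_ROWS = new String("<end>");

    /** Streams rows from a producer stage to a consumer stage through a bounded buffer. */
    static List<String> runPipeline(List<String> input, int bufferSize) {
        BlockingQueue<String> buffer = new ArrayBlockingQueue<>(bufferSize);
        List<String> output = new ArrayList<>();

        // Upstream stage: put() blocks while the buffer is full (back-pressure).
        Thread producer = new Thread(() -> {
            try {
                for (String row : input) {
                    buffer.put(row);
                }
                buffer.put(END_OF_ROWS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Downstream stage: take() blocks while the buffer is empty.
        Thread consumer = new Thread(() -> {
            try {
                for (String row = buffer.take(); row != END_OF_ROWS; row = buffer.take()) {
                    output.add(row.toUpperCase());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            // join() guarantees the consumer's writes to output are visible here.
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Pipeline interrupted", e);
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(runPipeline(List.of("a", "b", "c"), 1)); // prints [A, B, C]
    }
}
```

A buffer size of 1 forces the two stages to alternate, which makes the row-by-row streaming behaviour easy to observe; a larger buffer lets the producer run ahead.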

