
// the find

apache/incubator-xtable

★ 1,193 · Java · Apache-2.0 · updated Jun 2026

Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.

Apache XTable is a format translation layer for lakehouse table formats: you write data in Hudi, Iceberg, or Delta Lake, and XTable generates the metadata so the other two formats can read the same physical files. It's an incubating Apache project that solves a real pain point: organizations locked into one lakehouse format when their query engines or partners expect another. It's aimed at data engineers running Spark/Trino/Presto stacks who can't afford to ETL data between formats.

The core idea is solid: translate metadata rather than copy data, which means zero storage duplication and near-instant cross-format availability. The SPI is well-factored: adding a new source format means implementing ConversionSource, a new target means implementing ConversionTarget, and the ConversionController wires them together, so sources and targets can be developed independently of each other. Incremental sync support (not just snapshot) is a genuine differentiator; most naive solutions do full rewrites. Multi-cloud Hadoop config is baked in: credentials for S3 (AWS), ABFS (Azure), and GCS (GCP) all work through the bundled jar without custom builds.
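To make the source/target/controller split and the incremental-sync idea concrete, here is a minimal sketch. Every type in it (`SketchSource`, `SketchTarget`, `InMemoryTarget`, `Commit`) is hypothetical and only mirrors the roles named above; these are not XTable's actual interfaces or signatures. It assumes each target remembers the last source commit it has seen, so a second sync applies only the commits made since.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: these types only mirror the roles described in the
// review. They are NOT XTable's real interfaces or method signatures.
class MetadataSyncSketch {

    /** One source commit: the data files it added, identified by path only. */
    static final class Commit {
        final long id;
        final List<String> addedFiles;

        Commit(long id, List<String> addedFiles) {
            this.id = id;
            this.addedFiles = addedFiles;
        }
    }

    /** Source role: expose the commits made after a given commit id. */
    interface SketchSource {
        List<Commit> commitsAfter(long lastSyncedId);
    }

    /** Target role: record references to files; data is never copied. */
    interface SketchTarget {
        String format();
        long lastSyncedId();
        void apply(Commit commit);
    }

    /** In-memory target that stores file paths (metadata), not bytes. */
    static final class InMemoryTarget implements SketchTarget {
        private final String format;
        final List<String> referencedFiles = new ArrayList<>();
        private long lastSyncedId = 0;

        InMemoryTarget(String format) { this.format = format; }

        public String format() { return format; }
        public long lastSyncedId() { return lastSyncedId; }

        public void apply(Commit commit) {
            referencedFiles.addAll(commit.addedFiles);
            lastSyncedId = commit.id;
        }
    }

    /** Controller role: returns how many commits each target received. */
    static Map<String, Integer> sync(SketchSource source, List<SketchTarget> targets) {
        Map<String, Integer> applied = new LinkedHashMap<>();
        for (SketchTarget target : targets) {
            // Incremental: ask only for commits this target has not seen yet.
            List<Commit> pending = source.commitsAfter(target.lastSyncedId());
            for (Commit commit : pending) {
                target.apply(commit);
            }
            applied.put(target.format(), pending.size());
        }
        return applied;
    }

    /** Runs two syncs with one new source commit in between. */
    static List<Map<String, Integer>> demo() {
        List<Commit> log = new ArrayList<>();
        log.add(new Commit(1, List.of("part-0001.parquet")));
        log.add(new Commit(2, List.of("part-0002.parquet")));

        SketchSource source = lastSyncedId -> {
            List<Commit> newer = new ArrayList<>();
            for (Commit c : log) {
                if (c.id > lastSyncedId) newer.add(c);
            }
            return newer;
        };
        List<SketchTarget> targets =
                List.of(new InMemoryTarget("ICEBERG"), new InMemoryTarget("DELTA"));

        Map<String, Integer> first = sync(source, targets);   // behaves like a snapshot
        log.add(new Commit(3, List.of("part-0003.parquet")));
        Map<String, Integer> second = sync(source, targets);  // only the new commit
        return List.of(first, second);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this toy version the first sync hands both targets two commits and the second hands them one, which is the difference between a full rewrite and an incremental sync. A real implementation also has to translate schemas, partition specs, and file statistics, which the sketch leaves out entirely.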

It's still incubating and at 0.2/0.3 releases. The features-and-limitations doc exists for a reason, and partition evolution and schema evolution edge cases bite people in production. Only three source formats are supported, and adding a new one is not as simple as the README implies: you need deep knowledge of the source format's internal commit protocol. Spark is an implicit dependency for Delta conversion even if you don't write Spark jobs, which drags in a heavy JVM footprint that makes it awkward to run as a lightweight sidecar. The YAML-driven CLI is fine for batch jobs, but there's no streaming/CDC mode. You need to schedule it externally, and drift between source commits and target metadata sync is entirely your problem to manage.
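Since managing drift is left to you, one option is to measure it explicitly from whatever scheduler runs the sync. The sketch below is a hypothetical helper, not part of XTable: it assumes you can obtain the source table's commit timestamps and the timestamp of the last commit reflected in the target metadata, and it reports how many commits are pending and how old the oldest one is.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical helper, not part of XTable. How you obtain the commit
// timestamps depends on your source and target table formats.
class SyncDriftSketch {

    /** Drift summary: unsynced commit count and age of the oldest one. */
    static final class Drift {
        final int pendingCommits;
        final Duration oldestPendingAge;

        Drift(int pendingCommits, Duration oldestPendingAge) {
            this.pendingCommits = pendingCommits;
            this.oldestPendingAge = oldestPendingAge;
        }
    }

    /** Counts source commits newer than lastSynced and ages the oldest. */
    static Drift measure(List<Instant> sourceCommits, Instant lastSynced, Instant now) {
        int pending = 0;
        Instant oldest = null;
        for (Instant commit : sourceCommits) {
            if (commit.isAfter(lastSynced)) {
                pending++;
                if (oldest == null || commit.isBefore(oldest)) {
                    oldest = commit;
                }
            }
        }
        Duration age = (oldest == null) ? Duration.ZERO : Duration.between(oldest, now);
        return new Drift(pending, age);
    }

    public static void main(String[] args) {
        Instant t0 = Instant.parse("2024-01-01T00:00:00Z");
        List<Instant> commits = List.of(t0, t0.plusSeconds(600), t0.plusSeconds(1200));
        Drift drift = measure(commits, t0, t0.plusSeconds(1800));
        System.out.println(drift.pendingCommits + " pending, oldest "
                + drift.oldestPendingAge.toMinutes() + " min old");
    }
}
```

A scheduler could alert when the oldest pending commit exceeds whatever staleness your downstream readers tolerate; the threshold is a choice for your pipeline, not something the tool prescribes.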

