
// the find

WeBankFinTech/DataSphereStudio

★ 3,261 · Java · Apache-2.0 · updated Nov 2025

DataSphereStudio is a one-stop data application development and management portal, covering scenarios including data exchange, desensitization/cleansing, analysis/mining, quality measurement, visualization, and task scheduling.

DataSphereStudio is a big-data workflow IDE from WeBank that glues together a constellation of Hadoop-ecosystem tools (Spark, Hive, Flink, Azkaban, DolphinScheduler, etc.) behind a single drag-and-drop UI. It is aimed at enterprise data teams that already run a Linkis-backed platform and want one portal across ETL, scheduling, data quality, and visualization rather than logging into six separate tools. Pointedly not useful if you don't already have a Hadoop/Linkis stack.

The AppConn plugin model is well thought out: the three-tier SSO/organization/workflow protocol means adding a new tool doesn't require forking the core. Cross-system shared context (ContextService) lets a variable defined in a Hive node get picked up downstream in a Spark node without copy-paste wiring, which is the right abstraction for multi-tool pipelines. The financial-grade origins show in the multi-tenant resource isolation and in an Azkaban-derived scheduler that has had HA baked in from day one. The compatibility matrix is explicit and versioned, which is rare for a project with this many moving parts.
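To make the shared-context idea concrete, here is a minimal sketch in plain Java. It is not DSS code: `WorkflowContext`, `Node`, `HiveNode`, `SparkNode`, the `partition_date` key, and the `events` table are all hypothetical names invented for illustration, and the real ContextService is likely more involved than an in-memory map. The only point is the shape of the abstraction: an upstream node publishes a variable once, and a downstream node on a different engine reads it by name.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: none of these names come from the DSS codebase.
public class SharedContextSketch {

    /** A per-workflow key/value store that every node in the workflow can read and write. */
    static final class WorkflowContext {
        private final Map<String, String> variables = new LinkedHashMap<>();

        void put(String key, String value) {
            variables.put(key, value);
        }

        Optional<String> get(String key) {
            return Optional.ofNullable(variables.get(key));
        }
    }

    /** A workflow node, regardless of which engine would run it. */
    interface Node {
        void run(WorkflowContext ctx);
    }

    /** Stands in for a Hive node that publishes a variable for later nodes. */
    static final class HiveNode implements Node {
        @Override
        public void run(WorkflowContext ctx) {
            ctx.put("partition_date", "2025-11-01");
        }
    }

    /** Stands in for a Spark node that consumes the variable by name. */
    static final class SparkNode implements Node {
        String renderedQuery = "";

        @Override
        public void run(WorkflowContext ctx) {
            String date = ctx.get("partition_date")
                    .orElseThrow(() -> new IllegalStateException("partition_date not set upstream"));
            renderedQuery = "SELECT * FROM events WHERE ds = '" + date + "'";
        }
    }

    /** Runs the two nodes in order against one shared context and returns the Spark query. */
    static String runPipeline() {
        WorkflowContext ctx = new WorkflowContext();
        SparkNode spark = new SparkNode();
        for (Node node : new Node[] { new HiveNode(), spark }) {
            node.run(ctx);
        }
        return spark.renderedQuery;
    }

    public static void main(String[] args) {
        System.out.println(runPipeline());
    }
}
```

Neither node references the other; they agree only on a key name, which is what removes the copy-paste wiring between tools.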

Deployment is painful: you need Linkis running first, then DSS on top, then each AppConn plugin installed separately. The standalone install script works only on a single node, and the HA path is underdocumented. The project is an open-sourced WeBank internal tool, and that shows in the English docs, which are machine-translated, incomplete, and lag behind the Chinese README by a full version. Last commit was November 2025 and the issue tracker has open bugs from 2023 with no responses, suggesting maintenance has slowed. The DolphinScheduler AppConn is listed as 'In Code Merging' in the README despite the code existing, which is a warning sign about doc hygiene generally.
