// the find

tencentmusic/cube-studio

★ 5,061 · Python · NOASSERTION · updated Jun 2026

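Cube Studio is an open-source, cloud-native, one-stop AI platform for machine learning, deep learning, and large models. It covers the full MLOps pipeline: a compute-leasing platform, online notebook development, drag-and-drop task-flow pipeline orchestration, multi-node multi-GPU distributed training, hyperparameter search, inference serving with vGPU virtualization, edge computing, a labeling platform with automated annotation, SFT fine-tuning / reward-model / reinforcement-learning training for large models such as DeepSeek, multi-node large-model inference with vllm/ollama/mindie, private knowledge bases, and an AI model marketplace. It supports domestic (Chinese) CPU/GPU/NPU hardware and the Ascend ecosystem, supports RDMA, and supports distributed frameworks including pytorch/tf/mxnet/deepspeed/paddle/colossalai/horovod/ray/volcano.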

Cube Studio is a Kubernetes-native MLOps platform from Tencent Music that covers the full ML lifecycle: notebook development, drag-and-drop pipeline orchestration, distributed training across frameworks (PyTorch, TF, DeepSpeed, ColossalAI), LLM fine-tuning, and inference serving with vGPU virtualization. It is aimed at teams running their own GPU clusters who want a self-hosted alternative to managed ML platforms. The repo was archived on 2026-05-18 and has migrated to data-infra/cube-studio.

The vGPU virtualization and RDMA scheduling support are genuinely useful for teams sharing expensive GPU hardware: you get GPU time-slicing without needing to buy into NVIDIA MIG. The breadth of distributed training support (PyTorch, MXNet, Horovod, DeepSpeed, ColossalAI, PaddlePaddle, MindSpore, plus Volcano for batch scheduling) in a single platform is rare in open-source tools. Support for domestic Chinese hardware (Huawei NPU, Hygon DCU, Cambricon MLU) makes it one of the very few open-source options in that space. Because pipeline orchestration is separated from execution (Argo Workflows runs the pipelines under the hood), pipeline DAGs are portable and can be inspected without platform lock-in.
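To make the portability point concrete, here is a minimal sketch of what an Argo Workflows DAG looks like as plain data and how it can be inspected with nothing but the Python standard library. This is not Cube Studio's generated manifest: the task names, container images, and the helper functions `build_workflow` and `execution_order` are hypothetical, chosen only for illustration. The manifest fields (`apiVersion`, `kind`, `entrypoint`, `templates`, `dag.tasks`, `dependencies`) follow the public Argo Workflows schema.

```python
import json
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical three-step pipeline: task name -> (image, upstream task names).
PIPELINE = {
    "preprocess": ("python:3.11-slim", []),
    "train": ("python:3.11-slim", ["preprocess"]),
    "evaluate": ("python:3.11-slim", ["train"]),
}


def build_workflow(generate_name, pipeline):
    """Build an Argo Workflow manifest (as a dict) with one DAG entrypoint."""
    dag_tasks = []
    templates = []
    for task, (image, upstream) in pipeline.items():
        entry = {"name": task, "template": f"{task}-tmpl"}
        if upstream:  # only emit the key when the task has dependencies
            entry["dependencies"] = list(upstream)
        dag_tasks.append(entry)
        templates.append({
            "name": f"{task}-tmpl",
            "container": {
                "image": image,
                # Placeholder command; a real step would run its own code.
                "command": ["sh", "-c", f"echo running {task}"],
            },
        })
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Workflow",
        "metadata": {"generateName": generate_name},
        "spec": {
            "entrypoint": "main",
            "templates": [{"name": "main", "dag": {"tasks": dag_tasks}}] + templates,
        },
    }


def execution_order(workflow):
    """Return one valid run order by reading the DAG out of the manifest."""
    spec = workflow["spec"]
    entry = next(t for t in spec["templates"] if t["name"] == spec["entrypoint"])
    graph = {t["name"]: t.get("dependencies", []) for t in entry["dag"]["tasks"]}
    return list(TopologicalSorter(graph).static_order())


workflow = build_workflow("demo-pipeline-", PIPELINE)
print(json.dumps(workflow, indent=2))  # JSON is also valid YAML
print(execution_order(workflow))
```

Since the manifest is ordinary declarative data, the same DAG could in principle be submitted to any cluster running Argo Workflows, or diffed and reviewed like any other file in version control.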

The repo is archived and development has moved to data-infra/cube-studio, so you're reading the old codebase and will need to migrate before you can get bug fixes or security patches. The frontend is a compiled React bundle checked into git (static/appbuilder/frontend/*.chunk.js) with no source, which makes customizing or debugging the UI essentially impossible. Installation is Kubernetes-only with a significant footprint (Argo, Istio, Harbor, Prometheus, and MinIO are all expected), and there is no Docker Compose path for local testing, so the barrier to a working dev environment is high. Documentation is almost entirely in Chinese, so teams that don't read Chinese will need translation to operate or troubleshoot it.
