// the find

logicalclocks/hopsworks

★ 1,303 · Java · AGPL-3.0 · updated Feb 2025

Hopsworks - Data-Intensive AI platform with a Feature Store

Hopsworks is a full-stack MLOps platform combining a feature store, model registry, and serving infrastructure. It targets ML teams that want one system to manage the entire pipeline from raw data to deployed models, particularly those already on Kubernetes in AWS/Azure/GCP.

The feature store design is genuinely thought through: it separates online (low-latency serving) and offline (batch training) storage paths, which is the right call and something most teams get wrong when building this themselves. The multi-tenancy model with project-scoped sandboxes lets a shared cluster serve multiple teams without data bleeding between them. Integration tests are written in Ruby and run against a real running cluster rather than mocks, so the test suite actually catches what matters. The AGPL license keeps the hosted version honest: you get the real source, not a crippled open-core stub.
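To make the online/offline split concrete, here is a toy in-memory sketch. It is not the Hopsworks API: the class ToyFeatureStore and its methods are hypothetical names invented for illustration, and a real deployment would back each path with a separate storage system.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

Row = Tuple[int, str, Dict[str, Any]]  # (timestamp, entity id, feature values)


@dataclass
class ToyFeatureStore:
    """Hypothetical illustration of dual storage paths; not Hopsworks code."""

    # Online path: only the latest row per entity, for low-latency lookups.
    online: Dict[str, Row] = field(default_factory=dict)
    # Offline path: append-only history, for building training sets.
    offline: List[Row] = field(default_factory=list)

    def write(self, ts: int, entity_id: str, features: Dict[str, Any]) -> None:
        row: Row = (ts, entity_id, dict(features))
        # Every write lands in the offline log, so history is preserved.
        self.offline.append(row)
        # The online store is overwritten only by rows at least as new.
        current = self.online.get(entity_id)
        if current is None or ts >= current[0]:
            self.online[entity_id] = row

    def get_online(self, entity_id: str) -> Dict[str, Any]:
        # Serving-time read: a single key lookup, no scan.
        return dict(self.online[entity_id][2])

    def get_training_rows(self, as_of: int) -> List[Row]:
        # Training-time read: scan history up to a cutoff timestamp.
        return [row for row in self.offline if row[0] <= as_of]


store = ToyFeatureStore()
store.write(1, "user_1", {"clicks": 3})
store.write(2, "user_1", {"clicks": 5})
print(store.get_online("user_1"))        # {'clicks': 5}
print(store.get_training_rows(as_of=1))  # [(1, 'user_1', {'clicks': 3})]
```

The point of the split is that serving reads never touch the history and training reads never depend on what happens to be the latest value, which is where home-grown single-table designs tend to go wrong.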

The 32 GB RAM minimum for a single node is a significant barrier; this is enterprise infrastructure, not something you spin up to evaluate. The last push was February 2025 on a 1.3k-star repo, which for a platform of this scope suggests the open-source activity is mostly cosmetic and the real development happens behind the managed offering's paywall. The Java backend with a Python client SDK means debugging cross-language issues is painful, and the stack trace usually terminates at the serialization boundary. On-premise installation explicitly requires collaborating with the Hopsworks engineering team, which is a sales call dressed up as a deployment guide.
