// the find

vdaas/vald

★ 1,707 · Go · Apache-2.0 · updated Jun 2026

Vald. A Highly Scalable Distributed Vector Search Engine

Vald is a distributed vector search engine built for Kubernetes, using Yahoo Japan's NGT (Neighborhood Graph and Tree) library under the hood. It targets teams that need billion-scale ANN search with automatic sharding, index backup, and horizontal scaling, and that want a purpose-built system rather than vector search bolted onto an existing database. It's listed in the CNCF Landscape and used in production at LY Corporation (Yahoo Japan).

NGT is genuinely fast: on high-dimensional data in ann-benchmarks its graph indexes often match or beat HNSW implementations, which is a real differentiator over pgvector or Weaviate. The architecture is properly decomposed. Agents own the index shards, a discoverer handles service discovery in Kubernetes, and a load-balancing gateway fans out queries, so each component scales independently. The filter gateway (ingress/egress) lets you embed transformation logic without touching the core search path, which is a clean design for pre/post-processing pipelines. The Helm operator and the nightly/stable/versioned image tagging policy show operational maturity; this isn't a research project that assumed you'd compile from source.
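To make the fan-out design concrete, here is a minimal sketch of the scatter-gather pattern a load-balancing gateway of this kind implies: query every shard concurrently, then merge the per-shard top-k lists into one global top-k. This is not Vald's gateway code. `Result`, `Shard`, `staticShard`, and `fanOutSearch` are hypothetical names invented for illustration, and real deployments also have to handle timeouts, partial failures, and duplicate IDs from replicas.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Result is a hypothetical search hit: a vector ID and its distance to the query.
type Result struct {
	ID       string
	Distance float32
}

// Shard is a hypothetical stand-in for one agent holding part of the index.
type Shard interface {
	Search(query []float32, k int) []Result
}

// staticShard returns canned, pre-sorted results; it exists only to make
// the sketch runnable without a real index.
type staticShard []Result

func (s staticShard) Search(_ []float32, k int) []Result {
	if k > len(s) {
		k = len(s)
	}
	return s[:k]
}

// fanOutSearch queries every shard concurrently, then merges the per-shard
// top-k lists into a single global top-k ordered by ascending distance.
func fanOutSearch(shards []Shard, query []float32, k int) []Result {
	perShard := make([][]Result, len(shards))
	var wg sync.WaitGroup
	for i, s := range shards {
		wg.Add(1)
		go func(i int, s Shard) {
			defer wg.Done()
			// Each goroutine writes to its own slot, so no mutex is needed.
			perShard[i] = s.Search(query, k)
		}(i, s)
	}
	wg.Wait()

	var merged []Result
	for _, rs := range perShard {
		merged = append(merged, rs...)
	}
	sort.SliceStable(merged, func(a, b int) bool {
		return merged[a].Distance < merged[b].Distance
	})
	if len(merged) > k {
		merged = merged[:k]
	}
	return merged
}

func main() {
	shards := []Shard{
		staticShard{{"a", 0.10}, {"b", 0.40}},
		staticShard{{"c", 0.05}, {"d", 0.90}},
	}
	for _, r := range fanOutSearch(shards, []float32{0, 0}, 3) {
		fmt.Printf("%s %.2f\n", r.ID, r.Distance)
	}
}
```

Because each shard only needs to return its own k best candidates, adding shards grows the merge step linearly while the per-shard search cost stays flat, which is the property that lets the agent tier scale horizontally.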

Kubernetes is a hard requirement, not an optional deployment target. There's no standalone mode, so you can't run this locally without k3d or minikube, which raises the floor considerably for evaluation. NGT requires AVX2 instructions, so it won't run on older hardware, and cloud instances (spot capacity included) are worth a CPU-flag check first. The contributor count (27 people, primarily from one company) and star count (1,707) are low for infrastructure software at this ambition level; compare Weaviate or Qdrant, which have much larger communities and ecosystems of client libraries. Documentation exists, but the getting-started experience routes you through a Helm install against a live cluster. There's no 'run this Docker Compose and try a search' path for people who just want to see if it fits before committing to the operational model.
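The CPU-flag check mentioned above can be scripted before scheduling anything onto a node. The sketch below assumes a Linux x86 host, where `/proc/cpuinfo` exposes a `flags` line per core; `hasCPUFlag` is a hypothetical helper written for this example, not part of Vald or NGT.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// hasCPUFlag reports whether flag appears on any "flags" line of the given
// /proc/cpuinfo text. It assumes the Linux x86 layout ("flags : fpu vme ...").
func hasCPUFlag(cpuinfo, flag string) bool {
	sc := bufio.NewScanner(strings.NewReader(cpuinfo))
	for sc.Scan() {
		name, value, ok := strings.Cut(sc.Text(), ":")
		if !ok || strings.TrimSpace(name) != "flags" {
			continue
		}
		for _, f := range strings.Fields(value) {
			if f == flag {
				return true
			}
		}
	}
	return false
}

func main() {
	data, err := os.ReadFile("/proc/cpuinfo")
	if err != nil {
		fmt.Println("cannot read /proc/cpuinfo:", err)
		return
	}
	fmt.Println("avx2 supported:", hasCPUFlag(string(data), "avx2"))
}
```

Matching whole fields rather than substrings matters here, since a naive search for `avx` would also match `avx2` and `avx512f`.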
