// the find
vearch/vearch
Distributed vector search for AI-native applications
Vearch is a distributed vector database built on top of Faiss, originally developed at JD.com for large-scale image search. It sits in the same space as Milvus and Weaviate — vector similarity search with scalar filtering — but with a Go/C++ split architecture where the heavy lifting happens in a C++ engine called Gamma.
1. The Gamma engine wraps Faiss directly and adds real-time indexing on top of it: you get IVF, HNSW, IVFPQfs, and even a ScaNN integration without paying the latency of a pure Python stack.
2. Raft-based replication at the partition server level is solid design; data durability isn't bolted on as an afterthought.
3. Multi-vector-field support in a single document is genuinely useful for multimodal search (image + text embeddings on the same record).
4. Deployment story is reasonable: Helm chart, Docker Compose for both standalone and cluster modes, and SDKs in Python, Go, Java, and Rust.
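To make the multi-vector-field point concrete, here is a minimal pure-Python sketch of what searching one record across two embedding fields can look like. It does not use the Vearch API; the field names (`image_vec`, `text_vec`), the weighted-sum scoring, and the `multi_field_search` helper are all assumptions for illustration, and Vearch's actual score fusion may differ.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical records: each one carries an image and a text embedding.
records = [
    {"id": "sku-1", "image_vec": [1.0, 0.0], "text_vec": [0.0, 1.0]},
    {"id": "sku-2", "image_vec": [0.0, 1.0], "text_vec": [1.0, 0.0]},
    {"id": "sku-3", "image_vec": [1.0, 1.0], "text_vec": [1.0, 1.0]},
]

def multi_field_search(records, queries, weights, k):
    """Rank records by a weighted sum of per-field cosine similarities.

    The weighted sum is an illustrative choice, not Vearch's documented
    behavior.
    """
    scored = []
    for rec in records:
        score = sum(weights[f] * cosine(queries[f], rec[f]) for f in queries)
        scored.append((score, rec["id"]))
    # Highest score first; ties broken by id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [rec_id for _, rec_id in scored[:k]]

top = multi_field_search(
    records,
    queries={"image_vec": [1.0, 0.0], "text_vec": [1.0, 0.0]},
    weights={"image_vec": 0.8, "text_vec": 0.2},
    k=2,
)
print(top)
```

The practical benefit is that both embeddings live on the same record, so a single query can weigh image and text similarity together instead of joining results from two separate collections.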
1. The C++ engine (Gamma) is bundled as a compiled dependency with a CMake build. This makes the build system fragile and means you're chasing prebuilt binaries or fighting toolchain issues before you can run a single test.
2. 2,314 stars in 2024 for a project from a major e-commerce company suggests limited community traction; compare to Milvus at 30k+ or even Qdrant at 22k+. You're betting on a project that may not have the community momentum to survive a pivot at JD.com.
3. The hybrid scalar filtering is implemented via bitmap and inverted index inside Gamma, which means filter performance characteristics are opaque and you can't use your existing database tooling to inspect or optimize them.
4. Documentation is thin and leans on ReadTheDocs pages that are clearly maintained as an afterthought: the architecture diagram is an Excalidraw PNG and several docs pages are stub-length.
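For readers unfamiliar with the bitmap-plus-inverted-index approach to scalar filtering mentioned above, the general technique can be sketched in a few lines of plain Python. This is not Gamma's code: the function names, the use of a Python integer as the bitmap, and the brute-force inner-product scoring are assumptions chosen to keep the example self-contained.

```python
from collections import defaultdict

def build_inverted_index(docs, field):
    """Map each scalar value of `field` to a bitmap of matching doc ids.

    The bitmap is a Python int: bit i is set when doc i has that value.
    """
    index = defaultdict(int)
    for doc_id, doc in enumerate(docs):
        index[doc[field]] |= 1 << doc_id
    return index

def filtered_search(docs, bitmap, query, k):
    """Brute-force inner-product search over docs whose bit is set."""
    scored = []
    for doc_id, doc in enumerate(docs):
        if (bitmap >> doc_id) & 1:  # skip docs the filter excluded
            score = sum(q * v for q, v in zip(query, doc["vec"]))
            scored.append((score, doc_id))
    # Highest score first; ties broken by doc id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:k]]

# Hypothetical toy corpus.
docs = [
    {"category": "shoes", "vec": [1.0, 0.0]},
    {"category": "bags", "vec": [0.9, 0.1]},
    {"category": "shoes", "vec": [0.0, 1.0]},
    {"category": "shoes", "vec": [0.7, 0.7]},
]

by_category = build_inverted_index(docs, "category")
result = filtered_search(docs, by_category["shoes"], [1.0, 0.0], k=2)
print(result)
```

Because the filter is resolved to a bitmap inside the engine before scoring, its cost depends on internal data structures rather than on anything a SQL-style query planner would expose, which is the opacity the third point is about.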