finds.dev← search

// the find

neuml/txtai

★ 12,650 · Python · Apache-2.0 · updated Jun 2026

💡 All-in-one AI framework for semantic search, LLM orchestration and language model workflows

txtai is a Python library that unifies vector search, sparse indexing, graph networks, RAG pipelines, and AI agents under a single embeddings database abstraction. It's built on Hugging Face Transformers and Sentence Transformers, with a FastAPI layer that exposes an OpenAI-compatible API. The target is developers who want to build semantic search or LLM-backed applications without wiring together five separate libraries.

The embeddings database combining dense vectors, sparse BM25, and a relational DB in one object is the most honest part of the design — you get hybrid search without running separate services. The pgvector backend means you can use your existing Postgres as the vector store instead of adopting yet another purpose-built DB. The 85+ Colab-ready example notebooks are genuinely dense with working code, not contrived toy demos. The built-in OpenAI-compatible API endpoint lets you swap txtai in front of existing OpenAI client code with minimal changes.

All-in-one cuts both ways: the dependency graph pulls in torch, transformers, faiss, whisper, and friends — a full install is heavy, and a breaking change in the transcription or image-captioning layer can hit teams who only wanted vector search. The agent layer is built on smolagents, which means you inherit Hugging Face's design decisions about tool interfaces and memory models; if smolagents makes a breaking change, your agents break. The distributed/cluster mode exists but production guidance is thin — the cluster notebook dates from early examples and there's no documentation on index consistency or shard rebalancing. YAML-driven pipeline configuration works for simple cases but gets opaque fast; when the schema is wrong, error messages are not your friend.

View on GitHub → Homepage ↗

// want more like this?

We dig through GitHub every week and send a few repos picked for what you actually care about — each with an honest take like this one.

Get finds in your inbox → Search again →