// the find
second-state/echokit_server
Open Source Voice Agent Platform
EchoKit Server is a WebSocket-based voice pipeline server — ASR → LLM → TTS — designed to pair with a specific ESP32 hardware device or web client. It's for embedded/IoT developers who want to run a self-hosted voice assistant with swappable model backends. The hardware dependency is real: without the EchoKit device or a DIY ESP32 running their firmware, the server is half a project.
Written in Rust with async WebSocket handling, which is the right call for a low-latency real-time audio pipeline. Every stage (VAD, ASR, LLM, TTS) is independently configurable via a single TOML file, and the example library covers Gemini, Groq, OpenAI, ElevenLabs, Alibaba Bailian — actually useful breadth. MCP client support is a meaningful differentiator: you can give your voice assistant tool use without writing custom glue. The firmware is also open source (separate repo), so you're not locked into buying their hardware.
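To make the "independently configurable stages" idea concrete, here is a minimal sketch of how a swappable ASR → LLM → TTS chain can be modeled in Rust. It is illustrative only: the trait names, the `Pipeline` struct, and the stub backends are all hypothetical and are not taken from the EchoKit codebase, which is async and selects its backends from the TOML config.

```rust
// Illustrative only: every name here is hypothetical, not EchoKit's actual API.

/// Speech-to-text stage.
trait Asr {
    fn transcribe(&self, audio: &[i16]) -> String;
}

/// Text-generation stage.
trait Llm {
    fn reply(&self, prompt: &str) -> String;
}

/// Text-to-speech stage.
trait Tts {
    fn synthesize(&self, text: &str) -> Vec<i16>;
}

/// Holds one backend per stage. Each field can be swapped for another
/// implementation of the same trait without touching the other stages.
struct Pipeline {
    asr: Box<dyn Asr>,
    llm: Box<dyn Llm>,
    tts: Box<dyn Tts>,
}

impl Pipeline {
    /// Run one utterance through ASR, then LLM, then TTS.
    fn handle_utterance(&self, audio: &[i16]) -> Vec<i16> {
        let text = self.asr.transcribe(audio);
        let answer = self.llm.reply(&text);
        self.tts.synthesize(&answer)
    }
}

// Stub backends standing in for real providers.

/// "Transcribes" by reporting how many samples it received.
struct FixedAsr;
impl Asr for FixedAsr {
    fn transcribe(&self, audio: &[i16]) -> String {
        format!("{} samples", audio.len())
    }
}

/// Replies by echoing the prompt back.
struct EchoLlm;
impl Llm for EchoLlm {
    fn reply(&self, prompt: &str) -> String {
        format!("you said: {prompt}")
    }
}

/// "Synthesizes" by widening each UTF-8 byte of the text to an i16 sample.
struct ByteTts;
impl Tts for ByteTts {
    fn synthesize(&self, text: &str) -> Vec<i16> {
        text.bytes().map(i16::from).collect()
    }
}
```

Building a `Pipeline` from `FixedAsr`, `EchoLlm`, and `ByteTts` and calling `handle_utterance` on a buffer exercises the whole chain. Replacing any one stub with a real backend changes only that field, which is the property the single-file configuration is meant to expose.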
The last push was in February 2026, and only one active contributor is visible in the repository: a low bus factor for something that manages a hardware device's network connection. The web test client is hosted on echokit.dev, not in the repo, which means it can disappear or change under you. There's no authentication on the WebSocket server, so anyone who can reach your port can talk to your LLM. Docker support exists but is tucked in a subdirectory with minimal documentation; the production deployment story is thin.
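One way to close the authentication gap is to require a shared secret before accepting the WebSocket upgrade, either in a reverse proxy or in a patched server. The sketch below shows only the check itself, using the standard library. The function names and the choice of an `Authorization: Bearer ...` header are assumptions for illustration, not part of EchoKit, and whether the stock firmware can send such a header is not something this review verified.

```rust
// Hypothetical helpers for illustration; not part of EchoKit.

/// Compare two byte strings without stopping at the first mismatch, so the
/// running time does not depend on how many leading bytes matched.
/// This is a best-effort sketch; a vetted crate is preferable in production.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y;
    }
    diff == 0
}

/// Decide whether an upgrade request should be accepted, given the raw value
/// of its Authorization header (if any) and the expected shared token.
fn is_authorized(authorization: Option<&str>, expected_token: &str) -> bool {
    match authorization.and_then(|value| value.strip_prefix("Bearer ")) {
        Some(token) => constant_time_eq(token.as_bytes(), expected_token.as_bytes()),
        None => false,
    }
}
```

A caller would run `is_authorized` on the handshake request and reject the connection before any audio reaches the ASR or LLM stages. A missing header, a wrong scheme, or a wrong token all yield `false`.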