finds.dev

// the find

akdeb/ElatoAI

★ 1,781 · TypeScript · License: NOASSERTION · updated May 2026

Realtime Voice AI with 100+ Models on Arduino ESP32 with Secure Websockets and Edge Functions for AI Toys, Companions, and Devices

ElatoAI turns an ESP32 into a voice AI client by offloading all the heavy computation to edge functions (Deno or Cloudflare Workers) that proxy to OpenAI Realtime, Gemini Live, ElevenLabs, Hume, and others. The device captures audio, streams it over WebSocket, and plays back the response — no ML runs on the microcontroller itself. It's aimed at people building AI toys, custom voice companions, or educational devices on cheap hardware.
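To make the split of responsibilities concrete, here is a minimal sketch of how an edge relay might sort incoming WebSocket frames from the device. The message shapes, the `classifyDeviceFrame` name, and the rule that binary frames carry audio while text frames carry JSON control messages are all assumptions for illustration, not ElatoAI's actual wire protocol.

```typescript
// Hypothetical action types; ElatoAI's real protocol may look different.
type UpstreamAction =
  | { kind: "forward_audio"; bytes: number }
  | { kind: "control"; event: string }
  | { kind: "ignore"; reason: string };

// Decide what a relay does with one WebSocket frame from the device.
// Assumption: binary frames are encoded audio, text frames are JSON control messages.
function classifyDeviceFrame(frame: string | Uint8Array): UpstreamAction {
  if (typeof frame !== "string") {
    // Binary path: the relay would pass these bytes on to the model provider.
    return frame.byteLength > 0
      ? { kind: "forward_audio", bytes: frame.byteLength }
      : { kind: "ignore", reason: "empty audio frame" };
  }
  try {
    const parsed: unknown = JSON.parse(frame);
    if (
      typeof parsed === "object" &&
      parsed !== null &&
      typeof (parsed as { type?: unknown }).type === "string"
    ) {
      return { kind: "control", event: (parsed as { type: string }).type };
    }
    return { kind: "ignore", reason: "control message without a type" };
  } catch {
    // Malformed text frames are dropped rather than forwarded.
    return { kind: "ignore", reason: "malformed JSON" };
  }
}
```

The point of the sketch is that the relay only routes frames; any decoding, transcription, or generation happens in the provider behind it, which matches the review's description of the device doing no ML work itself.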

- The architecture is the right call: the ESP32 handles only audio I/O and WebSocket framing, not inference. Opus at 12 kbps / 24 kHz is a good codec choice for constrained hardware: raw 16-bit mono PCM at the same sample rate would need 384 kbps, roughly 32 times the bandwidth over WiFi.

- Multi-model support is real, not a checkbox: there are separate server implementations per provider (OpenAI, Gemini, Grok, ElevenLabs, Hume), and the Cloudflare path uses Durable Objects to manage multiple concurrent device sessions, which is a sound way to keep per-session state isolated as the device count grows.

- The firmware test suite is unusually thorough for an Arduino project, with individual test files for audio streaming, Opus encode/decode, OTA, touch sensor, WiFi captive portal, and speaker radio.

- OTA firmware updates and captive-portal WiFi provisioning are built in, which removes two of the most painful parts of shipping any IoT device.

- No barge-in (speech interruption) on the ESP32 side; the maintainers list it as a known gap. This is a first-class UX problem for voice assistants: conversations feel robotic when you can't cut off the AI mid-sentence.

- The 20-minute session ceiling is a hard wall imposed by edge runtime time limits, not a soft performance constraint. Any use case positioned as an 'always-on companion' hits this and needs a reconnect strategy that the repo doesn't really address.

- API key encryption is marked 'optional', so most people will flash their keys into firmware unencrypted. The README buries the security risk — someone who clones a device or reads its flash gets full API access.

- The project has an obvious commercial layer (Kickstarter hardware, products page) woven into what's framed as an open source framework. The boundary between what you get for free and what requires their hardware or paid tier is never stated clearly.
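On the 20-minute session ceiling noted above: one possible reconnect strategy is to rotate the session shortly before the limit, preferably between conversational turns. The sketch below is an illustration under stated assumptions, not something the repo provides. The margins, the `speaking` flag, and every name here are hypothetical; only the 20-minute figure comes from the review.

```typescript
// Only the 20-minute ceiling comes from the review; both margins are assumptions.
const SESSION_LIMIT_MS = 20 * 60 * 1000;
const SOFT_MARGIN_MS = 60 * 1000; // start looking for a gap 60 s before the limit
const HARD_MARGIN_MS = 10 * 1000; // stop waiting for a gap 10 s before the limit

interface SessionState {
  startedAtMs: number;
  speaking: boolean; // hypothetical flag: true while a turn is in progress
}

type RotationDecision = "keep" | "rotate_when_idle" | "rotate_now";

// Decide whether to keep the current session or open a fresh one.
function decideRotation(
  session: SessionState,
  nowMs: number,
  limitMs: number = SESSION_LIMIT_MS,
): RotationDecision {
  const ageMs = nowMs - session.startedAtMs;
  if (ageMs < limitMs - SOFT_MARGIN_MS) return "keep";
  // Too close to the wall to wait any longer, even mid-turn.
  if (ageMs >= limitMs - HARD_MARGIN_MS) return "rotate_now";
  // Inside the soft margin: wait for the current turn to finish if there is one.
  return session.speaking ? "rotate_when_idle" : "rotate_now";
}

// Example: a session that is 19.5 minutes old and mid-turn waits for a gap.
const decision = decideRotation(
  { startedAtMs: 0, speaking: true },
  19.5 * 60 * 1000,
);
console.log(decision); // "rotate_when_idle"
```

A caller acting on `rotate_now` would open a new WebSocket and hand the conversation over to it. How conversation context carries across sessions depends on the provider and is outside the scope of this sketch.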
