// the find
coaidev/coai
🚀 Next Gen Multi-tenant AI One-Stop Solution. Builtin Admin & Billing System. Enterprise-Grade Unified LLM Gateway Support for 200+ Models And 35+ Providers, Load Balancing w/ Priority-based Routing, Cost Management, Chat Share, Cloud Sync, Credit/Subscription Billing, All File Parsing, Web Search, Built-in Model Cache.
CoAI is a self-hosted LLM gateway plus chat UI: think One API and ChatGPT-Next-Web bolted together into one Go/React monolith. It handles multi-provider routing, per-user billing (subscription and token-based), conversation sync, and file parsing. The target is someone who wants to run a commercial AI chat service for paying users without stitching together three separate projects.
The channel management system is the standout: priority-based routing with weighted load balancing across same-priority channels, automatic failover, and model aliasing are all there and read as first-class features rather than afterthoughts. The billing model is genuinely flexible: you can mix subscription and elastic billing per user group, with minimum-points detection to stop zero-balance calls from slipping through. The model cache is a smart cost-reduction feature: hash the request params, return the cached response, don't bill the hit. The deployment story is solid, with Docker Compose, Zeabur one-click, and a native binary build all working, which is more than most projects this size manage.
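To make the routing description concrete, here is a minimal Python sketch of priority tiers with weighted selection and failover. It is not CoAI's code (the backend is Go). The `Channel` fields, the "higher priority number is tried first" convention, and the `send` callable are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class Channel:
    name: str
    priority: int  # assumption: a higher number is tried first
    weight: int    # relative traffic share within one priority tier (must be > 0)


def failover_order(channels, rng=random):
    """Return the channels in the order they would be tried for one request."""
    order = []
    # Walk priority tiers from highest to lowest.
    for prio in sorted({c.priority for c in channels}, reverse=True):
        tier = [c for c in channels if c.priority == prio]
        # Weighted draw without replacement inside the tier.
        while tier:
            pick = rng.choices(tier, weights=[c.weight for c in tier])[0]
            order.append(pick)
            tier.remove(pick)
    return order


def call_with_failover(channels, send, rng=random):
    """Try each channel in turn; return (channel name, result) on first success."""
    last_error = None
    for channel in failover_order(channels, rng):
        try:
            return channel.name, send(channel)
        except Exception as err:  # a real gateway would catch narrower errors
            last_error = err
    raise RuntimeError("all channels failed") from last_error
```

With two priority-2 channels weighted 1 and 3 and one priority-1 channel, the heavier priority-2 channel is tried first about three times in four, and the priority-1 channel is reached only after both priority-2 channels have failed.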
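The cache idea (hash the request, return the stored response, skip billing on a hit) can be sketched the same way. Again this is an illustration under assumptions, not CoAI's implementation: `CachingGateway`, `cache_key`, the flat per-call cost, and a cache shared across users are all hypothetical choices.

```python
import hashlib
import json


def cache_key(model, messages, params):
    """Stable hash of everything assumed to affect the response."""
    payload = json.dumps(
        {"model": model, "messages": messages, "params": params},
        sort_keys=True,            # key order must not change the hash
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CachingGateway:
    def __init__(self, upstream, cost_per_call=1):
        self.upstream = upstream   # hypothetical callable(model, messages, params) -> str
        self.cost_per_call = cost_per_call
        self.cache = {}            # assumption: one cache shared by all users
        self.billed = {}           # user -> points charged so far

    def complete(self, user, model, messages, params):
        key = cache_key(model, messages, params)
        if key in self.cache:
            # Cache hit: no upstream call and no charge.
            return self.cache[key]
        reply = self.upstream(model, messages, params)
        self.cache[key] = reply
        self.billed[user] = self.billed.get(user, 0) + self.cost_per_call
        return reply
```

Serializing with sorted keys means two requests that differ only in parameter order map to the same entry, while any change to the model, messages, or sampling parameters produces a different key.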
MySQL as the primary store means no pgvector, and the schema is likely to get painful at any real multi-tenant scale; the README doesn't mention any sharding or read-replica story. (MySQL does have a native JSON type with path expressions, so JSON querying itself is not the gap.) The 'Pro version' upsell is prominent and walls off the things you actually want in production (model health monitoring, rate limiting, security auditing), so the open-source version is effectively a demo for the paid tier. Last commit was March 2026 on a project that was trending hard in 2024; activity has dropped off sharply, which is a risk for a dependency that touches your billing and auth. The README makes no mention of structured logging or tracing in the Go backend, so debugging why a specific channel is failing under load will likely mean grepping application logs.