// the find

QuantumNous/new-api

★ 38,605 · Go · AGPL-3.0 · updated Jun 2026

A unified AI model hub for aggregation & distribution. It supports cross-converting various LLMs into OpenAI-compatible, Claude-compatible, or Gemini-compatible formats. A centralized gateway for personal and enterprise model management. 🍥

Self-hosted LLM API gateway that translates between OpenAI, Claude, and Gemini wire formats, so any OpenAI-compatible client can swap backend providers without touching application code. A fork of One API with significant additions: multi-currency billing, weighted channel routing with retry, several OAuth/SSO integrations, and a custom billing expression engine. Aimed at teams consolidating API keys, managing costs across providers, or reselling API access.
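To make "translates between wire formats" concrete, here is a minimal sketch of one direction, an OpenAI-style chat request rewritten as a Claude-style one. It is not taken from new-api's code: the type names, `toClaude`, and the placeholder upstream model name are hypothetical, and only plain-text messages are handled. The one structural difference it illustrates is that OpenAI carries the system prompt as a message with role `system`, while the Claude Messages format takes it as a top-level `system` field and expects `max_tokens` to be set.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Message is the plain-text subset of a chat message used in this sketch.
type Message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// OpenAIChatRequest is a minimal OpenAI-style chat request body.
type OpenAIChatRequest struct {
	Model     string    `json:"model"`
	Messages  []Message `json:"messages"`
	MaxTokens int       `json:"max_tokens,omitempty"`
}

// ClaudeRequest is a minimal Claude-style messages request body.
type ClaudeRequest struct {
	Model     string    `json:"model"`
	System    string    `json:"system,omitempty"`
	Messages  []Message `json:"messages"`
	MaxTokens int       `json:"max_tokens"`
}

// toClaude is a hypothetical converter: it hoists system messages into the
// top-level System field, keeps the remaining messages in order, and falls
// back to defaultMaxTokens when the client did not set a limit.
func toClaude(in OpenAIChatRequest, upstreamModel string, defaultMaxTokens int) ClaudeRequest {
	out := ClaudeRequest{Model: upstreamModel, MaxTokens: in.MaxTokens}
	if out.MaxTokens == 0 {
		out.MaxTokens = defaultMaxTokens
	}
	var system []string
	for _, m := range in.Messages {
		if m.Role == "system" {
			system = append(system, m.Content)
			continue
		}
		out.Messages = append(out.Messages, m)
	}
	out.System = strings.Join(system, "\n")
	return out
}

func main() {
	req := OpenAIChatRequest{
		Model: "client-facing-model", // placeholder name
		Messages: []Message{
			{Role: "system", Content: "Be terse."},
			{Role: "user", Content: "Hello"},
		},
	}
	body, err := json.Marshal(toClaude(req, "claude-upstream-model", 1024))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(body))
}
```

A real gateway also has to map tool calls, streaming chunks, and multimodal content, which is where the incomplete conversions become a concern.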

Per-request cost accounting is well done: it tracks cache hits for OpenAI, Claude, and DeepSeek, handles token counting, and ships a custom expression engine in `pkg/billingexpr` for flexible billing rules. Most roll-your-own proxies get this wrong or skip it entirely. Weighted channel routing with automatic retry on failure is the right default for multi-provider setups, since it gives you failover without changing client code. SSRF protection is present (`common/ssrf_protection.go`), which matters for a gateway that fetches upstream URLs configured by potentially untrusted admins. Built-in Pyroscope profiling is unusual and suggests the team runs this at scale and cares about latency, not just about getting requests through.
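The general idea behind weighted routing with retry can be sketched as follows. This is an illustration of the technique, not new-api's implementation: `Channel`, `pickWeighted`, `route`, and `errUpstream` are hypothetical names, and the policy shown (drop a failed channel from the pool, then pick again by weight) is an assumption about one reasonable way to do it.

```go
package main

import (
	"errors"
	"fmt"
	"math/rand"
)

// errUpstream stands in for any upstream failure in this sketch.
var errUpstream = errors.New("upstream unavailable")

// Channel is a hypothetical upstream: a name, a routing weight, and a call.
type Channel struct {
	Name   string
	Weight int
	Call   func(prompt string) (string, error)
}

// pickWeighted returns the index of a channel chosen with probability
// proportional to its weight, or -1 if no channel has a positive weight.
// intn must return a value in [0, n), like math/rand.Intn.
func pickWeighted(chs []Channel, intn func(n int) int) int {
	total := 0
	for _, c := range chs {
		if c.Weight > 0 {
			total += c.Weight
		}
	}
	if total == 0 {
		return -1
	}
	n := intn(total)
	for i, c := range chs {
		if c.Weight <= 0 {
			continue
		}
		if n < c.Weight {
			return i
		}
		n -= c.Weight
	}
	return -1
}

// route picks channels by weight, removing each one that fails, until a call
// succeeds or the pool is exhausted.
func route(chs []Channel, prompt string, intn func(n int) int) (string, error) {
	pool := append([]Channel(nil), chs...) // copy so the caller's slice is untouched
	lastErr := errors.New("no channels available")
	for len(pool) > 0 {
		i := pickWeighted(pool, intn)
		if i < 0 {
			break
		}
		out, err := pool[i].Call(prompt)
		if err == nil {
			return out, nil
		}
		lastErr = fmt.Errorf("%s: %w", pool[i].Name, err)
		pool = append(pool[:i], pool[i+1:]...)
	}
	return "", lastErr
}

func main() {
	chs := []Channel{
		{Name: "primary", Weight: 8, Call: func(string) (string, error) { return "", errUpstream }},
		{Name: "backup", Weight: 2, Call: func(p string) (string, error) { return "backup handled: " + p, nil }},
	}
	out, err := route(chs, "hello", rand.Intn)
	fmt.Println(out, err)
}
```

Because the retry happens inside the gateway, the client sees a single successful response even when the heavier-weighted channel is down.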

Format conversion is incomplete in ways that will surprise you at the worst time: Gemini→OpenAI explicitly doesn't support function calling, and OpenAI Responses↔OpenAI is still marked 'in development.' If your app routes tool calls through the proxy, you may get silently wrong behavior rather than an explicit error. AGPL v3 with Section 7 additional terms means commercial users who modify the software must release their changes and also preserve specific attribution strings in the UI. The README lists a contact email for commercial licensing, which signals where the project's business model is heading. The reasoning-effort naming convention (appending `-high`/`-medium`/`-low` to model names, as in `gpt-5-high`) is a string-munging hack that will silently break if any provider ships a model whose real name ends in one of those suffixes. All configuration lives in the web admin console: there is no config-as-code, no exportable channel definitions, and no IaC path, so reproducing a deployment means clicking through the UI again.
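The suffix collision is easy to reproduce with a naive parser. The sketch below assumes the convention works by stripping a trailing `-high`, `-medium`, or `-low` from the requested model name. `splitEffort` is a hypothetical function, not new-api's actual parsing code, and `acme-vision-low` is an invented model name standing in for a provider model whose real name happens to end in one of the suffixes.

```go
package main

import (
	"fmt"
	"strings"
)

// splitEffort is a hypothetical parser for the suffix convention: it treats a
// trailing "-high", "-medium", or "-low" as a reasoning-effort setting and
// returns the remaining prefix as the model name.
func splitEffort(name string) (model, effort string) {
	for _, e := range []string{"high", "medium", "low"} {
		suffix := "-" + e
		if strings.HasSuffix(name, suffix) && len(name) > len(suffix) {
			return strings.TrimSuffix(name, suffix), e
		}
	}
	return name, ""
}

func main() {
	// The last entry is an invented model whose real name ends in "-low".
	for _, name := range []string{"gpt-5-high", "gpt-5", "acme-vision-low"} {
		model, effort := splitEffort(name)
		fmt.Printf("requested=%q model=%q effort=%q\n", name, model, effort)
	}
}
```

Under this assumption, a request for the invented `acme-vision-low` is forwarded as `acme-vision` with a low effort setting, and nothing in the parser can tell that apart from an intentional suffix.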
