Unified Gateway
One OpenAI-compatible API in front of every provider you use.
Unified Gateway is a backend-only, provider-agnostic AI gateway. Your clients keep talking the OpenAI (and Anthropic) wire format they already know; the gateway routes each request to the right upstream — OpenAI, Anthropic, Google, Azure, OpenAI-compatible providers, or your own custom models — behind a single public model catalog with one auth, rate-limit, logging, and cost layer.
The pitch in one line: point your existing OpenAI SDK at Unified Gateway and change nothing else.
from openai import OpenAI
client = OpenAI(base_url="http://localhost:4000/v1", api_key="<virtual-key>")
client.chat.completions.create(model="general", messages=[{"role": "user", "content": "Hi"}])
Here, model is a public name you define (general, image-default, …), not a provider model id. Behind it sits a pool of one or more deployments that the gateway load-balances, retries, and falls back across.
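To make the "public name in front of a pool" idea concrete, here is a minimal conceptual sketch. It is not the gateway's actual code: `Deployment`, `route`, `AllDeploymentsFailed`, and the two stand-in upstream functions are hypothetical names invented for illustration, and the sketch only shows ordered fallback, not load balancing or retry policy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Deployment:
    """Hypothetical stand-in for one upstream behind a public model name."""
    name: str
    call: Callable[[str], str]  # placeholder for a real provider request


class AllDeploymentsFailed(Exception):
    """Raised when every deployment in the pool has failed."""


def route(pools: Dict[str, List[Deployment]], model: str, prompt: str) -> str:
    """Try each deployment behind a public model name, in order, until one succeeds."""
    if model not in pools:
        raise KeyError(f"unknown public model: {model}")
    errors = []
    for deployment in pools[model]:
        try:
            return deployment.call(prompt)
        except Exception as exc:  # a real router would only catch retryable errors
            errors.append(f"{deployment.name}: {exc}")
    raise AllDeploymentsFailed("; ".join(errors))


def flaky_upstream(prompt: str) -> str:
    # Simulates a provider that is currently unavailable.
    raise TimeoutError("upstream timed out")


def healthy_upstream(prompt: str) -> str:
    # Simulates a provider that answers normally.
    return f"echo: {prompt}"


# The client only ever sees the public name "general".
pools = {
    "general": [
        Deployment("primary", flaky_upstream),
        Deployment("secondary", healthy_upstream),
    ]
}

print(route(pools, "general", "Hi"))  # echo: Hi
```

The point of the sketch is that the caller never names a provider: swapping, adding, or reordering deployments changes nothing on the client side.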
Start here
New to the project? Read these three pages in order:
- Quickstart — from git clone to your first successful response in a few minutes.
- Concepts — the mental model: public models, deployments, adapters, transports, and how a request flows through the gateway.
- Creating deployments — wire up real providers and custom models.
Set up & deploy
- Setup — requirements, environment, secrets, and choosing a database and Redis.
- Deployment — the Docker Compose stack and per-platform guides (Coolify, Portainer, Dokploy, Linux).
Guides
- Creating deployments — examples for text, image, embeddings, transcription, Azure, Google AI Studio, and custom models.
- Virtual keys — issuing client keys with model scopes, budgets, and RPM/TPM limits.
- Fallbacks — fallback chains per public model: reasons, retries, and lifecycle.
- Caching — the opt-in response cache and its headers.
Operate
- Operations — production runbook: health probes, migrations, secret rotation, partitions, and observability.
Extend
- Runtime extensions — mounted ESM extensions, manifests, hooks, and failure handling.
- Model catalog — the shape of catalog.json, per-operation profiles, pricing, and custom models.
Reference
- API reference — OpenAPI, authentication, import notes, and a typical test flow.
- Errors — error shape, status codes, and a troubleshooting table.
- Known errors — exact symptoms and fixes (e.g. self-signed TLS on Bun).
- Testing — test layers, common helpers, and the no-real-providers guarantee.
- Glossary — the project's canonical names. The source of truth for code, SQL, OpenAPI, logs, and docs.
Non-Markdown artifacts
- OpenAPI: apps/gateway/openapi.yaml
- JSON Schemas: schemas