LunarGate Gateway¶
Self-hosted AI gateway
Run one OpenAI-compatible endpoint in your infrastructure and route requests across multiple LLM providers with fallback, retries, caching, hot-reloadable config, and optional observability export.
Best entry path:
- start with Homepage quickstart if you want the same install -> config -> run -> client flow as the marketing site
- jump to Examples overview if you learn better from runnable projects
- keep Configuration overview open when you begin editing real YAML
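Whichever path you pick, the client side stays the same: apps speak the OpenAI API shape and only the base URL changes. A minimal sketch of what that request looks like (the gateway address and model name below are placeholders, not documented defaults):

```python
# Sketch: "one OpenAI-compatible endpoint" from the client's point of view.
# GATEWAY_URL and the model name are illustrative placeholders.
import json

GATEWAY_URL = "http://localhost:8080"  # wherever your gateway listens

def chat_request(model: str, user_message: str) -> tuple[str, bytes]:
    """Build the URL and JSON body for an OpenAI-style chat completion.

    Any OpenAI-compatible client (openai-python, curl, fetch) sends
    exactly this shape; pointing it at the gateway instead of
    api.openai.com is just a base-URL change.
    """
    url = f"{GATEWAY_URL}/v1/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }).encode()
    return url, body

url, body = chat_request("my-default-route", "Hello")
```

With the official openai Python package, the equivalent is constructing the client with base_url set to the gateway address.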
Why teams use it¶
- One endpoint for every app
Keep your app code on the OpenAI API shape and swap providers behind the gateway.
- Resilience built in
Route by headers, retry transient failures, and cascade to fallback targets automatically.
- Operated from config
Change routing rules, rate limits, and provider weights without rebuilding the binary.
- Observability without lock-in
Export metrics only by default, or opt into prompt and response sharing for request inspection.
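The resilience bullet above, retry transient failures and then cascade to fallback targets, can be sketched as a generic loop. Provider callables and the error type here are illustrative stand-ins, not the gateway's actual internals:

```python
# Illustrative retry-then-fallback loop; not the gateway's real code.
import time

class TransientError(Exception):
    """Stands in for timeouts, 429s, 5xx responses, etc."""

def call_with_fallback(providers, request, retries=2, backoff=0.0):
    """Try each provider in order; retry transient failures before
    cascading to the next target. Raises only if every target fails."""
    last_exc = None
    for provider in providers:
        for attempt in range(retries + 1):
            try:
                return provider(request)
            except TransientError as exc:
                last_exc = exc
                time.sleep(backoff * (2 ** attempt))  # exponential backoff
    raise RuntimeError("all providers failed") from last_exc
```

The point of doing this in the gateway rather than in each app is that the retry budget and fallback order live in one shared config instead of being reimplemented per client.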
The request path¶
Request -> Auth edge -> Rate Limit -> Cache -> Route Match -> Load Balance
-> Retry -> Circuit Breaker -> Provider Translation -> LLM Call
-> Response Translation -> Metrics -> Optional Data Sharing -> Response
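One way to picture the stages above is as a chain of handlers, each wrapping the next and free to short-circuit (a cache hit or rate-limit rejection never reaches the provider). This is a structural sketch only; the stage names follow the diagram, but the code is not the gateway's implementation:

```python
# Structural sketch of the request path as nested middleware.
# Each stage receives the next handler and either short-circuits
# or delegates onward.

def cache(next_handler):
    store = {}
    def handler(request):
        key = request["prompt"]
        if key in store:                  # cache hit short-circuits the chain
            return store[key]
        response = next_handler(request)
        store[key] = response
        return response
    return handler

def metrics(next_handler):
    def handler(request):
        handler.count += 1                # count requests that reach this stage
        return next_handler(request)
    handler.count = 0
    return handler

def llm_call(request):                    # stands in for the provider call
    return {"text": request["prompt"].upper()}

# Compose outermost-first, mirroring the diagram's ordering:
# cache sits before metrics, which sits before the LLM call.
metered = metrics(llm_call)
pipeline = cache(metered)
```

A repeated prompt is answered from the cache layer, so the metrics stage (and everything inside it) sees only the first request.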
Where to start:
- use Getting started for the shortest install -> config -> run -> client path
- use Docker Compose if you want a container-first setup
- use Examples overview if you prefer runnable projects over one-page setup instructions
What is in scope today¶
- OpenAI-compatible endpoints:
  - POST /v1/chat/completions
  - Model listing via GET /v1/models
  - Health and metrics endpoints
- Multi-provider routing and fallback
- In-memory rate limiting and caching
- Hot-reloadable YAML config
- Optional SaaS observability export
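In-memory rate limiting, as listed above, is commonly some variant of a token bucket; a minimal illustrative version follows. The gateway's actual algorithm and tuning knobs may differ, and in-memory state implies limits are per-process and reset on restart:

```python
# Illustrative token-bucket rate limiter; not the gateway's internals.
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: int):
        self.rate = rate                  # tokens refilled per second
        self.capacity = capacity          # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; otherwise reject."""
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The capacity bounds bursts while the rate bounds sustained throughput, which is why both usually appear as separate limits in rate-limit config.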
Important security note¶
Warning
The gateway currently does not implement inbound client authentication. Run it inside a trusted network or behind an auth-enforcing edge such as an API gateway, reverse proxy, or service mesh.
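Because the gateway accepts unauthenticated traffic, the edge in front of it must perform the check. The sketch below shows the kind of bearer-token gate a reverse proxy or API gateway applies before forwarding; the token set and header handling are illustrative only:

```python
# Illustrative bearer-token check an auth-enforcing edge would apply
# before forwarding to the gateway. Real deployments use a reverse
# proxy (nginx, Envoy) or API gateway rather than hand-rolled code.
import hmac

VALID_TOKENS = {"example-secret-token"}   # placeholder, not a real credential

def is_authorized(headers: dict) -> bool:
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return False
    token = auth[len("Bearer "):]
    # Constant-time comparison avoids leaking token prefixes via timing.
    return any(hmac.compare_digest(token, t) for t in VALID_TOKENS)
```

Requests failing this check should be rejected at the edge with 401, so the gateway itself only ever sees trusted traffic.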
Documentation map¶
- Start with Homepage quickstart for the same install -> config -> run -> client flow used on the main site.
- Go to Examples overview for runnable Python, Node, Streamlit, and Docker Compose apps based on gateway-examples/.
- Read lunargate/auto and autorouting if you want the gateway to choose model tiers from one stable client model.
- Use Routing and fallback for route ordering, fallback chains, and load-balancing strategy.
- Keep Configuration overview and the detailed config pages open while editing YAML.
- Read Observability and data sharing before enabling prompt or response export.