
LunarGate Gateway

Self-hosted AI gateway

Run one OpenAI-compatible endpoint in your infrastructure and route requests across multiple LLM providers with fallback, retries, caching, hot-reloadable config, and optional observability export.

Best entry path: the Quickstart, which walks the install -> config -> run -> client flow.

Why teams use it

  • One endpoint for every app

Keep your app code on the OpenAI API shape and swap providers behind the gateway.

  • Resilience built in

Route by headers, retry transient failures, and cascade to fallback targets automatically.

  • Operated from config

Change providers, routing, retry/cache behavior, rate limits, model selection, and Dashboard export settings without rebuilding the binary.

  • Observability without lock-in

By default only metrics are exported; you can opt in to prompt and response sharing for request-level inspection.
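Because the gateway keeps the OpenAI API shape, pointing an app at it is usually just a base-URL (and key) change. The sketch below builds such a request with only the standard library; the gateway address, port, and key value are assumptions for illustration, not documented defaults.

```python
import json
import urllib.request

def build_chat_request(base_url, gateway_key, model, messages):
    """Build an OpenAI-shaped chat completion request aimed at the gateway.

    base_url and gateway_key are deployment-specific; the values used
    below are illustrative assumptions."""
    url = f"{base_url}/v1/chat/completions"
    headers = {
        "Content-Type": "application/json",
        # Inbound gateway API key (optional; see the security note below):
        "Authorization": f"Bearer {gateway_key}",
    }
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

req = build_chat_request(
    "http://localhost:8080",   # hypothetical gateway address
    "my-gateway-key",          # hypothetical inbound API key
    "gpt-4o",
    [{"role": "user", "content": "Hello"}],
)
# Sending it is the same as talking to any OpenAI-compatible server:
# resp = urllib.request.urlopen(req)

print(req.full_url)  # http://localhost:8080/v1/chat/completions
```

Swapping providers then happens entirely in the gateway config; the client payload above never changes.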

The request path

Request -> Optional Inbound Auth (API key) -> Rate Limit -> Cache -> Route Match -> Load Balance
        -> Retry -> Circuit Breaker -> Provider Translation -> LLM Call
        -> Response Translation -> Metrics -> Optional Data Sharing -> Response
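The ordering above matters because outer stages can short-circuit inner ones: a cache hit never reaches retry or the provider. A toy sketch of that wrapping (illustrative only, not LunarGate's actual internals):

```python
# Toy sketch of stage ordering: outer stages run first and can
# short-circuit, e.g. a cache hit skips retry and the provider call.

calls = {"provider": 0}

def provider_call(req):
    # Stand-in for provider translation plus the real LLM call.
    calls["provider"] += 1
    return {"echo": req["prompt"]}

def with_retry(next_stage, attempts=3):
    def stage(req):
        for attempt in range(attempts):
            try:
                return next_stage(req)
            except ConnectionError:
                if attempt == attempts - 1:
                    raise  # out of retries; a fallback target would take over
    return stage

def with_cache(next_stage):
    cache = {}
    def stage(req):
        key = req["prompt"]
        if key not in cache:
            cache[key] = next_stage(req)
        return cache[key]
    return stage

# Compose inside-out so the first stage in the diagram wraps the rest:
handler = with_cache(with_retry(provider_call))

handler({"prompt": "hi"})
handler({"prompt": "hi"})  # cache hit: the provider is not called again
print(calls["provider"])   # 1
```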


What is in scope today

  • OpenAI-compatible POST /v1/chat/completions
  • Model listing via GET /v1/models
  • Health and metrics endpoints
  • Multi-provider routing and fallback
  • In-memory rate limiting and caching
  • Hot-reloadable YAML config
  • Optional Dashboard observability export
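Everything in the list above is driven from the hot-reloadable YAML file. The fragment below is purely illustrative: every field name in it is an assumption, so consult the Configuration overview for the real schema.

```yaml
# Hypothetical shape only -- field names are illustrative assumptions,
# not LunarGate's documented schema.
providers:
  - name: openai
    api_key_env: OPENAI_API_KEY
  - name: anthropic
    api_key_env: ANTHROPIC_API_KEY

routes:
  - match: { model: gpt-4o }
    targets: [openai]
    fallback: [anthropic]

rate_limit:
  requests_per_minute: 600

cache:
  enabled: true
  ttl_seconds: 300
```

Because the config is hot-reloadable, edits like reordering a fallback chain take effect without restarting or rebuilding the gateway.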

Important security note

Warning

The gateway now supports basic inbound API-key authentication, but it is still safest to run it inside a trusted network or behind an auth-enforcing edge such as an API gateway, reverse proxy, or service mesh.

Documentation map

  1. Start with Quickstart for the fastest install -> config -> run -> client flow.
  2. Go to Examples overview for runnable Python, Node, Streamlit, and Docker Compose apps based on gateway-examples/.
  3. Read lunargate/auto and autorouting if you want the gateway to choose model tiers from one stable client model.
  4. Use Routing and fallback for route ordering, fallback chains, and load-balancing strategy.
  5. Keep Configuration overview and the detailed config pages open while editing YAML.
  6. Read Observability and data sharing before enabling prompt or response export.