LunarGate Gateway¶
Self-hosted AI gateway
Run one OpenAI-compatible endpoint in your infrastructure and route requests across multiple LLM providers with fallback, retries, caching, hot-reloadable config, and optional observability export.
Best entry path:
- start with Homepage quickstart if you want the same install -> config -> run -> client flow as the marketing site
- jump to Examples overview if you learn better from runnable projects
- keep Configuration overview open when you begin editing real YAML
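Whichever path you pick, the client side stays the same: apps speak the OpenAI API shape and only the base URL changes. A minimal sketch of what that request looks like (the gateway address and model name below are placeholders, not documented defaults):

```python
# Sketch: "one OpenAI-compatible endpoint" from the client's point of view.
# GATEWAY_URL and the model name are illustrative placeholders.
import json

GATEWAY_URL = "http://localhost:8080"  # wherever your gateway listens

def chat_request(model: str, user_message: str) -> tuple[str, bytes]:
    """Build the URL and JSON body for an OpenAI-style chat completion.

    Any OpenAI-compatible client (openai-python, curl, fetch) sends
    exactly this shape; pointing it at the gateway instead of
    api.openai.com is just a base-URL change.
    """
    url = f"{GATEWAY_URL}/v1/chat/completions"
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }).encode()
    return url, body

url, body = chat_request("my-default-route", "Hello")
```

With the official openai Python package, the equivalent is constructing the client with base_url set to the gateway address.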
Why teams use it¶
- One endpoint for every app
Keep your app code on the OpenAI API shape and swap providers behind the gateway.
- Resilience built in
Route by headers, retry transient failures, and cascade to fallback targets automatically.
- Operated from config
Change routing rules, rate limits, and provider weights without rebuilding the binary.
- Observability without lock-in
Export metrics only by default, or opt into prompt and response sharing for request inspection.
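The resilience bullet above, retry transient failures and then cascade to fallback targets, can be sketched as a generic loop. Provider callables and the error type here are illustrative stand-ins, not the gateway's actual internals:

```python
# Illustrative retry-then-fallback loop; not the gateway's real code.
import time

class TransientError(Exception):
    """Stands in for timeouts, 429s, 5xx responses, etc."""

def call_with_fallback(providers, request, retries=2, backoff=0.0):
    """Try each provider in order; retry transient failures before
    cascading to the next target. Raises only if every target fails."""
    last_exc = None
    for provider in providers:
        for attempt in range(retries + 1):
            try:
                return provider(request)
            except TransientError as exc:
                last_exc = exc
                time.sleep(backoff * (2 ** attempt))  # exponential backoff
    raise RuntimeError("all providers failed") from last_exc
```

The point of doing this in the gateway rather than in each app is that the retry budget and fallback order live in one shared config instead of being reimplemented per client.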
The request path¶
Request -> Auth edge -> Rate Limit -> Cache -> Route Match -> Load Balance
-> Retry -> Circuit Breaker -> Provider Translation -> LLM Call
-> Response Translation -> Metrics -> Optional Data Sharing -> Response
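One way to picture the stages above is as a chain of handlers, each wrapping the next and free to short-circuit (a cache hit or rate-limit rejection never reaches the provider). This is a structural sketch only; the stage names follow the diagram, but the code is not the gateway's implementation:

```python
# Structural sketch of the request path as nested middleware.
# Each stage receives the next handler and either short-circuits
# or delegates onward.

def cache(next_handler):
    store = {}
    def handler(request):
        key = request["prompt"]
        if key in store:                  # cache hit short-circuits the chain
            return store[key]
        response = next_handler(request)
        store[key] = response
        return response
    return handler

def metrics(next_handler):
    def handler(request):
        handler.count += 1                # count requests that reach this stage
        return next_handler(request)
    handler.count = 0
    return handler

def llm_call(request):                    # stands in for the provider call
    return {"text": request["prompt"].upper()}

# Compose outermost-first, mirroring the diagram's ordering:
# cache sits before metrics, which sits before the LLM call.
metered = metrics(llm_call)
pipeline = cache(metered)
```

A repeated prompt is answered from the cache layer, so the metrics stage (and everything inside it) sees only the first request.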
Where to start:
- use Getting started for the shortest install -> config -> run -> client path
- use Docker Compose if you want a container-first setup
- use Examples overview if you prefer runnable projects over one-page setup instructions
What is in scope today¶
- OpenAI-compatible endpoints:
  - POST /v1/chat/completions
  - Model listing via GET /v1/models
  - Health and metrics endpoints
- Multi-provider routing and fallback
- In-memory rate limiting and caching
- Hot-reloadable YAML config
- Optional SaaS observability export
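In-memory rate limiting, as listed above, is commonly some variant of a token bucket; a minimal illustrative version follows. The gateway's actual algorithm and tuning knobs may differ, and in-memory state implies limits are per-process and reset on restart:

```python
# Illustrative token-bucket rate limiter; not the gateway's internals.
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: int):
        self.rate = rate                  # tokens refilled per second
        self.capacity = capacity          # maximum burst size
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; otherwise reject."""
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

The capacity bounds bursts while the rate bounds sustained throughput, which is why both usually appear as separate limits in rate-limit config.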
Important security note¶
Warning
The gateway currently does not implement inbound client authentication. Run it inside a trusted network or behind an auth-enforcing edge such as an API gateway, reverse proxy, or service mesh.
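Because the gateway accepts unauthenticated traffic, the edge in front of it must perform the check. The sketch below shows the kind of bearer-token gate a reverse proxy or API gateway applies before forwarding; the token set and header handling are illustrative only:

```python
# Illustrative bearer-token check an auth-enforcing edge would apply
# before forwarding to the gateway. Real deployments use a reverse
# proxy (nginx, Envoy) or API gateway rather than hand-rolled code.
import hmac

VALID_TOKENS = {"example-secret-token"}   # placeholder, not a real credential

def is_authorized(headers: dict) -> bool:
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return False
    token = auth[len("Bearer "):]
    # Constant-time comparison avoids leaking token prefixes via timing.
    return any(hmac.compare_digest(token, t) for t in VALID_TOKENS)
```

Requests failing this check should be rejected at the edge with 401, so the gateway itself only ever sees trusted traffic.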
Documentation map¶
- Start with Homepage quickstart for the same install -> config -> run -> client flow used on the main site.
- Go to Examples overview for runnable Python, Node, Streamlit, and Docker Compose apps based on gateway-examples/.
- Read lunargate/auto and autorouting if you want the gateway to choose model tiers from one stable client model.
- Use Routing and fallback for route ordering, fallback chains, and load-balancing strategy.
- Keep Configuration overview and the detailed config pages open while editing YAML.
- Read Observability and data sharing before enabling prompt or response export.