LunarGate Gateway

Self-hosted AI gateway

Run one OpenAI-compatible endpoint in your infrastructure and route requests across multiple LLM providers with fallback, retries, caching, hot-reloadable config, and optional observability export.

Why teams use it

  • One endpoint for every app

Keep your app code on the OpenAI API shape and swap providers behind the gateway.
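
In practice this means an existing OpenAI-style client only needs its base URL changed. A minimal sketch with Python's standard library (the gateway address and model name are placeholders, not fixed defaults):

```python
import json
import urllib.request

# Hypothetical gateway address; any OpenAI-compatible client works the same way.
GATEWAY_BASE = "http://localhost:8080/v1"

def chat_request(model: str, user_message: str) -> urllib.request.Request:
    """Build a standard OpenAI-shaped chat completion request aimed at the gateway."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        f"{GATEWAY_BASE}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_request("gpt-4o-mini", "Hello")
# Swapping providers later means changing gateway config, not this code.
```

The request body stays on the OpenAI shape; only the host changes when you move between providers or environments.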

  • Resilience built in

Route by headers, retry transient failures, and cascade to fallback targets automatically.
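
The retry-then-cascade behavior can be sketched as a loop over a fallback chain. This is an illustration of the pattern, not the gateway's actual implementation:

```python
import time

class TransientError(Exception):
    """Stands in for timeouts, 429s, 5xx responses, and similar retryable failures."""

def call_with_resilience(providers, request, retries=2, backoff=0.0):
    """Retry each target a few times with exponential backoff,
    then cascade to the next target in the fallback chain."""
    last_error = None
    for provider in providers:
        for attempt in range(retries + 1):
            try:
                return provider(request)
            except TransientError as err:
                last_error = err
                time.sleep(backoff * (2 ** attempt))
    raise last_error

# Simulated targets: the primary always fails, the fallback answers.
def primary(req):
    raise TransientError("upstream timeout")

def fallback(req):
    return {"provider": "fallback", "echo": req}

result = call_with_resilience([primary, fallback], "hello", backoff=0)
```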

  • Operated from config

Change routing rules, rate limits, and provider weights without rebuilding the binary.
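
A config file for this kind of gateway typically looks something like the sketch below. The key names here are illustrative only; consult the Configuration overview for the real schema:

```yaml
# Illustrative shape only -- not the gateway's actual key names.
routes:
  - match:
      header: "x-tier: fast"
    targets:
      - provider: openai
        model: gpt-4o-mini
        weight: 80
      - provider: anthropic
        model: claude-3-5-haiku
        weight: 20
rate_limit:
  requests_per_minute: 600
```

Because the config is hot-reloadable, edits like shifting the weights above take effect without rebuilding or restarting the binary.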

  • Observability without lock-in

By default the gateway exports metrics only; sharing prompts and responses for request inspection is opt-in.
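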

The request path

Request -> Auth edge -> Rate Limit -> Cache -> Route Match -> Load Balance
        -> Retry -> Circuit Breaker -> Provider Translation -> LLM Call
        -> Response Translation -> Metrics -> Optional Data Sharing -> Response
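
Ordering matters here: the same chain can be modeled as middleware layers wrapping an inner LLM call, outermost first. A toy sketch (stage names follow the diagram; this is not the gateway's code):

```python
# Each stage records its name, then delegates to the next handler inward.
def stage(name, trace):
    def middleware(next_handler):
        def handler(request):
            trace.append(name)
            return next_handler(request)
        return handler
    return middleware

def llm_call(request):
    return {"response": f"echo:{request}"}

STAGES = ["auth", "rate_limit", "cache", "route_match", "load_balance",
          "retry", "circuit_breaker", "provider_translation"]

trace = []
handler = llm_call
for name in reversed(STAGES):      # wrap inside-out so "auth" ends up outermost
    handler = stage(name, trace)(handler)

result = handler("hi")  # stages fire in the order listed in STAGES
```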

What is in scope today

  • OpenAI-compatible POST /v1/chat/completions
  • Model listing via GET /v1/models
  • Health and metrics endpoints
  • Multi-provider routing and fallback
  • In-memory rate limiting and caching
  • Hot-reloadable YAML config
  • Optional SaaS observability export
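
In-memory rate limiting of this kind is commonly a token bucket. A self-contained sketch for intuition only; the gateway's actual algorithm may differ:

```python
import time

class TokenBucket:
    """Minimal in-memory rate limiter: tokens refill at a fixed rate,
    and each admitted request spends one token."""

    def __init__(self, rate_per_sec: float, capacity: int, clock=time.monotonic):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = float(capacity)
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# With a frozen clock, a 2-token bucket admits exactly two requests.
bucket = TokenBucket(rate_per_sec=1.0, capacity=2, clock=lambda: 0.0)
decisions = [bucket.allow() for _ in range(3)]  # [True, True, False]
```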

Important security note

Warning

The gateway currently does not implement inbound client authentication. Run it inside a trusted network or behind an auth-enforcing edge such as an API gateway, reverse proxy, or service mesh.
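
One common pattern is to terminate TLS and enforce a credential check at a reverse proxy in front of the gateway. An illustrative nginx sketch, where the hostname, port, and shared-secret check are placeholders for your own auth policy:

```nginx
# Illustrative only: require a bearer token at the edge before the gateway.
server {
    listen 443 ssl;
    server_name gateway.example.com;

    location /v1/ {
        if ($http_authorization != "Bearer my-shared-secret") {
            return 401;
        }
        proxy_pass http://127.0.0.1:8080;
    }
}
```

A static shared secret is the simplest viable check; an API gateway or service mesh can enforce per-client keys, mTLS, or quotas at the same point.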

Documentation map

  1. Start with Homepage quickstart for the same install -> config -> run -> client flow used on the main site.
  2. Go to Examples overview for runnable Python, Node, Streamlit, and Docker Compose apps based on gateway-examples/.
  3. Read lunargate/auto and autorouting if you want the gateway to choose model tiers from one stable client model.
  4. Use Routing and fallback for route ordering, fallback chains, and load-balancing strategy.
  5. Keep Configuration overview and the detailed config pages open while editing YAML.
  6. Read Observability and data sharing before enabling prompt or response export.