LunarGate Gateway

Run one OpenAI-compatible endpoint in your infrastructure and route requests across multiple LLM providers with fallback, retries, caching, hot-reloadable config, and optional observability export.

Why teams use it

  • One endpoint for every app

Keep your app code on the OpenAI API shape and swap providers behind the gateway.

  • Resilience built in

Route by headers, retry transient failures, and cascade to fallback targets automatically.

  • Operated from config

Change routing rules, rate limits, and provider weights without rebuilding the binary.

  • Observability without lock-in

Export metrics only by default, or opt into prompt and response sharing for request inspection.
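To make the bullets above concrete, here is a hypothetical config sketch of header-based routing with weighted fallback targets. Every key and field name below is an illustrative assumption, not LunarGate's documented schema — consult the real configuration reference before copying it:

```yaml
# Hypothetical sketch -- key names are assumptions, not the actual schema.
routes:
  - match:
      model: "gpt-4o*"              # route by requested model name
      header: "x-tenant: acme"      # or by an inbound request header
    targets:
      - provider: openai            # primary target
        weight: 80
      - provider: anthropic         # cascade here on failure
        weight: 20
    retry:
      max_attempts: 3
      on: [timeout, rate_limited, server_error]
```

Because the config is hot-reloadable, a sketch like this could be edited in place to shift weights or reorder fallbacks without rebuilding or restarting the binary.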

The request path

Request -> Auth edge -> Rate Limit -> Cache -> Route Match -> Load Balance
        -> Retry -> Circuit Breaker -> Provider Translation -> LLM Call
        -> Response Translation -> Metrics -> Optional Data Sharing -> Response
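The retry and fallback stages of that path can be sketched in miniature. This is illustrative pseudologic, not the gateway's actual source: each target is tried with a bounded number of retries on transient errors, and the request cascades to the next target when one is exhausted.

```python
# Illustrative sketch of the Retry -> fallback cascade shown above;
# not the gateway's real implementation, just the idea in miniature.

class TransientError(Exception):
    """Stands in for timeouts, 429s, and 5xx-style upstream failures."""

def call_with_fallback(targets, request, max_retries=2):
    """Try each target in order; retry transient failures, then cascade."""
    errors = []
    for target in targets:
        for _attempt in range(max_retries + 1):
            try:
                return target(request)
            except TransientError as exc:
                errors.append(exc)
        # This target exhausted its retries; cascade to the next one.
    raise RuntimeError(f"all targets failed: {errors}")

def flaky(request):
    raise TransientError("upstream timeout")

def healthy(request):
    return {"choices": [{"message": {"content": "ok"}}]}

print(call_with_fallback([flaky, healthy], {"model": "gpt-4o-mini"}))
# -> {'choices': [{'message': {'content': 'ok'}}]}
```

A real gateway also threads a circuit breaker between the retry loop and the provider call, so repeatedly failing targets are skipped outright instead of re-probed on every request.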

Start in the mode you need

Homebrew:

brew tap lunargate-ai/tap
brew install lunargate-ai/tap/gateway

export OPENAI_API_KEY="sk-..."
lunargate --config ./config.yaml

From source:

make build

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."

./bin/lunargate --config ./configs/config.example.yaml

Docker Compose:

docker-compose up gateway

Then verify the gateway is up:

curl http://localhost:8081/health
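Once the gateway is running, clients send it the standard OpenAI chat-completions shape. A hedged sketch of building such a request — the model name is illustrative, and the port is an assumption carried over from the health-check example above:

```python
import json
from urllib import request as urlreq

# Build a standard OpenAI-style chat-completions body. The model name is
# illustrative; the gateway's routing rules decide which provider serves it.
payload = {
    "model": "gpt-4o-mini",
    "messages": [{"role": "user", "content": "Say hello."}],
}

req = urlreq.Request(
    "http://localhost:8081/v1/chat/completions",  # assumed: same port as /health
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# Against a running gateway, send it with:
#     with urlreq.urlopen(req) as resp:
#         print(json.load(resp))
print(req.get_method(), req.full_url)
# -> POST http://localhost:8081/v1/chat/completions
```

Because the request body is the plain OpenAI shape, existing OpenAI SDK clients can also be pointed at the gateway by overriding their base URL.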

What is in scope today

  • OpenAI-compatible POST /v1/chat/completions
  • Model listing via GET /v1/models
  • Health and metrics endpoints
  • Multi-provider routing and fallback
  • In-memory rate limiting and caching
  • Hot-reloadable YAML config
  • Optional SaaS observability export
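The in-memory rate limiting, caching, and hot reload from the list above might be wired together in config roughly like this. All keys here are hypothetical placeholders, not the documented schema:

```yaml
# Hypothetical sketch -- keys are assumptions, not the documented schema.
cache:
  enabled: true
  ttl: 5m              # in-memory, so entries are lost on restart
rate_limit:
  requests_per_minute: 600
  burst: 50
```

Since the YAML is hot-reloadable, adjustments to limits like these would take effect without restarting the process.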

Important security note

Warning

The gateway currently does not implement inbound client authentication. Run it inside a trusted network or behind an auth-enforcing edge such as an API gateway, reverse proxy, or service mesh.
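One common way to satisfy that requirement is a reverse proxy that terminates authentication before traffic reaches the gateway. A hypothetical nginx sketch — the hostname, credential file path, and upstream port are all assumptions for illustration:

```nginx
# Hypothetical sketch: enforce auth at an nginx edge in front of the gateway.
server {
    listen 443 ssl;
    server_name gateway.internal.example.com;   # placeholder hostname

    location / {
        auth_basic           "LLM gateway";
        auth_basic_user_file /etc/nginx/htpasswd;  # placeholder credential file
        proxy_pass           http://127.0.0.1:8081;  # assumed gateway port
        proxy_set_header     Host $host;
    }
}
```

Any equivalent auth-enforcing edge (an API gateway's key validation, a service mesh's mTLS policy) serves the same purpose: nothing unauthenticated should be able to reach the gateway's listener directly.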

Documentation map