lunargate/auto and autorouting

lunargate/auto is a stable client-side model name that tells the gateway:

you decide which concrete provider/model should handle this request.

That lets application code stay fixed while the gateway moves traffic between tiers, providers, or capabilities.

The core idea

From the client side, you keep sending one model ID:

# Assumes `client` is an OpenAI-compatible SDK client already configured to talk to the gateway.
resp = client.chat.completions.create(
    model="lunargate/auto",
    messages=[{"role": "user", "content": "Plan a migration rollout."}],
)

Inside the gateway:

  1. model_selection scores the request
  2. the gateway emits routing headers like x-lunargate-complexity
  3. routing matches those headers and selects a target model
  4. the response comes back with resolved headers such as X-LunarGate-Provider and X-LunarGate-Model

What you need in config

1. Enable model_selection

model_selection:
  enabled: true
  override_user_model: false
  output_headers:
    complexity: "x-lunargate-complexity"
    score: "x-lunargate-complexity-score"
    skill: "x-lunargate-skill"

2. Route on the emitted headers

A simplified pattern looks like this:

routing:
  default_strategy: "weighted"
  routes:
    - name: "complexity-tier-0-1"
      match:
        path: "/v1/chat/completions"
        headers:
          x-lunargate-complexity: "0-1"
      targets:
        - provider: openai
          model: gpt-5.4-nano
          weight: 100

    - name: "complexity-tier-4-5"
      match:
        path: "/v1/chat/completions"
        headers:
          x-lunargate-complexity: "4-5"
      targets:
        - provider: openai
          model: gpt-5.4-mini
          weight: 100

    - name: "complexity-tier-6plus"
      match:
        path: "/v1/chat/completions"
        headers:
          x-lunargate-complexity: "6+"
      targets:
        - provider: openai
          model: gpt-5.4
          weight: 100
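
To make the matching step concrete, here is a minimal Python sketch of how a header-based route table like the one above could be evaluated. It assumes first-match-wins evaluation, case-insensitive header names, and exact string equality on header values. The `Route` class and `select_target` function are illustrative names, not the gateway's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Route:
    """Hypothetical in-memory mirror of one entry under routing.routes."""
    name: str
    path: str
    headers: dict = field(default_factory=dict)
    target: tuple = ("", "")  # (provider, model)


ROUTES = [
    Route("complexity-tier-0-1", "/v1/chat/completions",
          {"x-lunargate-complexity": "0-1"}, ("openai", "gpt-5.4-nano")),
    Route("complexity-tier-4-5", "/v1/chat/completions",
          {"x-lunargate-complexity": "4-5"}, ("openai", "gpt-5.4-mini")),
    Route("complexity-tier-6plus", "/v1/chat/completions",
          {"x-lunargate-complexity": "6+"}, ("openai", "gpt-5.4")),
]


def select_target(path: str, headers: dict, routes=ROUTES) -> Optional[Route]:
    """Return the first route whose path and header conditions all match.

    Assumption: header names compare case-insensitively and values must
    equal the configured string exactly.
    """
    lowered = {k.lower(): v for k, v in headers.items()}
    for route in routes:
        if route.path != path:
            continue
        if all(lowered.get(k.lower()) == v for k, v in route.headers.items()):
            return route
    return None  # no route matched; a real config would end with a default route
```

Calling `select_target("/v1/chat/completions", {"x-lunargate-complexity": "6+"})` returns the `complexity-tier-6plus` route under these assumptions, while a request whose complexity header matches none of the configured values returns `None`.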

Tool-aware routing

When a request contains tools or tool_choice, the selector also injects:

x-lunargate-requires-tools: true

That lets you add tool-capability routes before your normal complexity routes.
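
The tool-detection rule can be sketched in a few lines of Python. The function name `routing_headers_for` is hypothetical, and the sketch assumes that a non-empty `tools` list or any non-null `tool_choice` value counts as a tool-bearing request; the gateway's real check may differ.

```python
def routing_headers_for(body: dict) -> dict:
    """Return the extra routing headers for an OpenAI-style chat request body.

    Assumption: a non-empty "tools" list or any non-null "tool_choice"
    marks the request as tool-bearing.
    """
    headers = {}
    if body.get("tools") or body.get("tool_choice") is not None:
        headers["x-lunargate-requires-tools"] = "true"
    return headers


plain = {"model": "lunargate/auto", "messages": []}
with_tools = {
    **plain,
    "tools": [{"type": "function", "function": {"name": "lookup"}}],  # illustrative tool
}
```

Here `routing_headers_for(plain)` is empty, and `routing_headers_for(with_tools)` carries the `x-lunargate-requires-tools` header with the string value `"true"`, which is the form the route match compares against.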

Example pattern from the runnable demo:

- name: "tools-tier-6plus"
  match:
    path: "/v1/chat/completions"
    headers:
      x-lunargate-requires-tools: "true"
      x-lunargate-complexity: "6+"
  targets:
    - provider: openai
      model: "${LUNARGATE_HEAVY_MODEL}"
      weight: 100

Route ordering matters

Put the most specific routes first:

  1. forced-route or forced-provider patterns
  2. tool-aware routes
  3. complexity-tier routes
  4. generic default route

If you reverse that order, the default route can match before the more specific autorouting rules.
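
One way to catch ordering mistakes before deploying is a small lint over the route list. The sketch below assumes first-match-wins evaluation and exact-match header conditions. Under those assumptions, an earlier route shadows a later one whenever both share a path and the earlier route's header conditions are a subset of the later route's. The `find_shadowed` helper, the dict layout, and the route named `default` are illustrative, not part of the gateway.

```python
def find_shadowed(routes: list) -> list:
    """Return (earlier, later) name pairs where `later` can never match.

    Each route is a dict with "name", "path", and an optional "headers" dict.
    Assumption: the first matching route wins and header values match exactly.
    """
    shadowed = []
    for i, earlier in enumerate(routes):
        early_headers = earlier.get("headers", {})
        for later in routes[i + 1:]:
            if earlier["path"] != later["path"]:
                continue
            late_headers = later.get("headers", {})
            # Every condition of the earlier route is also required by the
            # later one, so any request matching `later` hits `earlier` first.
            if early_headers.items() <= late_headers.items():
                shadowed.append((earlier["name"], later["name"]))
    return shadowed


PATH = "/v1/chat/completions"
default = {"name": "default", "path": PATH}
tier = {"name": "complexity-tier-6plus", "path": PATH,
        "headers": {"x-lunargate-complexity": "6+"}}
tools = {"name": "tools-tier-6plus", "path": PATH,
         "headers": {"x-lunargate-requires-tools": "true",
                     "x-lunargate-complexity": "6+"}}

good_order = [tools, tier, default]  # most specific first
bad_order = [default, tier, tools]   # reversed
```

With `good_order` the lint reports nothing. With `bad_order` it flags both specific routes as shadowed by `default`, and `tools-tier-6plus` as shadowed by `complexity-tier-6plus`.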

What override_user_model changes

  • override_user_model: false
      • explicit client model/provider choices still matter
      • the selector mainly enriches headers for routing
  • override_user_model: true
      • the gateway can replace the user model with the resolved route target more aggressively

For most lunargate/auto setups, false is a good default because the special pseudo-model already signals that the gateway should decide.

What the client can observe

Useful response headers include:

  • X-LunarGate-Provider
  • X-LunarGate-Model
  • X-LunarGate-Route

How a request is routed is influenced by the headers the selector emits, such as:

  • x-lunargate-complexity
  • x-lunargate-complexity-score
  • x-lunargate-skill
  • x-lunargate-requires-tools
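
A client can log which target actually served a request by reading the resolved response headers. The sketch below works on any header mapping. The helper name `resolved_target` is hypothetical, and header names are compared case-insensitively, as is standard for HTTP.

```python
def resolved_target(headers) -> dict:
    """Extract the gateway's resolved routing info from response headers.

    `headers` is any mapping of header name to value. Headers that are
    absent come back as None instead of raising.
    """
    lowered = {str(k).lower(): v for k, v in headers.items()}
    return {
        "provider": lowered.get("x-lunargate-provider"),
        "model": lowered.get("x-lunargate-model"),
        "route": lowered.get("x-lunargate-route"),
    }
```

If the client is the OpenAI Python SDK, which is an assumption here, `client.chat.completions.with_raw_response.create(...)` returns the raw response, so `resolved_target(raw.headers)` gives these values while `raw.parse()` still yields the usual completion object.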

Two practical patterns

Stable app, changing tiers

Keep application code fixed on lunargate/auto and update only gateway config when you want to:

  • move traffic to cheaper models
  • promote a better model into the heavy tier
  • split traffic across accounts/providers
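
Splitting traffic comes down to a route's targets and their weights. As an illustration of what weighted selection means, the sketch below picks a target with probability proportional to its weight. How the gateway implements its weighted strategy is not specified here. `pick_target`, the second provider name, and the 80/20 split are assumptions made for the example.

```python
import random


def pick_target(targets: list, rng: random.Random) -> dict:
    """Pick one target with probability proportional to its "weight"."""
    weights = [t["weight"] for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]


targets = [
    {"provider": "openai", "model": "gpt-5.4-mini", "weight": 80},
    # Hypothetical second account, used only to show a split.
    {"provider": "openai-backup", "model": "gpt-5.4-mini", "weight": 20},
]

rng = random.Random(0)  # seeded so repeated runs give the same sequence
counts = {}
for _ in range(10_000):
    provider = pick_target(targets, rng)["provider"]
    counts[provider] = counts.get(provider, 0) + 1
```

Over many requests, `counts` lands close to the configured ratio. A route with a single target at weight 100, as in the tier examples, always selects that target.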

Capability gating

Use routes that match x-lunargate-requires-tools: true to keep tool-bearing traffic away from models that do not support tool calling reliably.

Best companion material