`lunargate/auto` and autorouting

lunargate/auto is a stable client-side model name that tells the gateway: "you decide which concrete provider/model should handle this request."

That lets application code stay fixed while the gateway moves traffic between tiers, providers, or capabilities.

The core idea

From the client side, you keep sending one model ID:

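from openai import OpenAI

# Assumption: the gateway serves an OpenAI-compatible API at this placeholder URL.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="placeholder")
resp = client.chat.completions.create(
    model="lunargate/auto",
    messages=[{"role": "user", "content": "Plan a migration rollout."}],
)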

Inside the gateway:

1. model_selection scores the request
2. the gateway emits routing headers like x-lunargate-complexity
3. routing matches those headers and selects a target model
4. the response comes back with resolved headers such as X-LunarGate-Provider and X-LunarGate-Model

What you need in config

1. Enable `model_selection`

model_selection:
  enabled: true
  override_user_model: false
  output_headers:
    complexity: "x-lunargate-complexity"
    score: "x-lunargate-complexity-score"
    skill: "x-lunargate-skill"

2. Route on the emitted headers

A simplified pattern looks like this:

routing:
  default_strategy: "weighted"
  routes:
    - name: "complexity-tier-0-1"
      match:
        path: "/v1/chat/completions"
        headers:
          x-lunargate-complexity: "0-1"
      targets:
        - provider: openai
          model: gpt-5.4-nano
          weight: 100

    - name: "complexity-tier-4-5"
      match:
        path: "/v1/chat/completions"
        headers:
          x-lunargate-complexity: "4-5"
      targets:
        - provider: openai
          model: gpt-5.4-mini
          weight: 100

    - name: "complexity-tier-6plus"
      match:
        path: "/v1/chat/completions"
        headers:
          x-lunargate-complexity: "6+"
      targets:
        - provider: openai
          model: gpt-5.4
          weight: 100
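
To make the matching step concrete, here is a minimal Python sketch of how header-based route selection could work. It assumes first-match-wins evaluation and exact string equality on every header a route lists; neither is confirmed by this page. `ROUTES` and `pick_route` are hypothetical names for illustration, not gateway APIs.

```python
from typing import Optional

# Hypothetical in-memory mirror of the YAML routes above.
ROUTES = [
    {"name": "complexity-tier-0-1",
     "match": {"path": "/v1/chat/completions",
               "headers": {"x-lunargate-complexity": "0-1"}},
     "target": ("openai", "gpt-5.4-nano")},
    {"name": "complexity-tier-4-5",
     "match": {"path": "/v1/chat/completions",
               "headers": {"x-lunargate-complexity": "4-5"}},
     "target": ("openai", "gpt-5.4-mini")},
    {"name": "complexity-tier-6plus",
     "match": {"path": "/v1/chat/completions",
               "headers": {"x-lunargate-complexity": "6+"}},
     "target": ("openai", "gpt-5.4")},
]


def pick_route(path: str, headers: dict[str, str]) -> Optional[dict]:
    """Return the first route whose path and headers all match (assumed semantics)."""
    # HTTP header names are case-insensitive, so normalise before comparing.
    lowered = {k.lower(): v for k, v in headers.items()}
    for route in ROUTES:
        match = route["match"]
        if match["path"] != path:
            continue
        if all(lowered.get(name.lower()) == value
               for name, value in match["headers"].items()):
            return route
    return None  # no route matched; a real gateway would fall back to a default
```

A request tagged with a complexity value that no route lists returns `None` in this sketch, which is why a generic default route is usually kept at the end of the list.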

Tool-aware routing

When a request contains tools or tool_choice, the selector also injects:

x-lunargate-requires-tools: true

That lets you add tool-capability routes before your normal complexity routes.
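
As a rough illustration of that rule, the sketch below derives the header from an OpenAI-style request body. `tool_headers` is a hypothetical name, and treating an empty `tools` list as "no tools" is an assumption; the gateway's actual check may differ.

```python
def tool_headers(body: dict) -> dict[str, str]:
    """Return the extra routing header for tool-bearing requests (assumed logic)."""
    # Assumption: a non-empty `tools` list or a set `tool_choice` counts as tool use.
    has_tools = bool(body.get("tools")) or body.get("tool_choice") is not None
    return {"x-lunargate-requires-tools": "true"} if has_tools else {}
```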

Example pattern from the runnable demo:

- name: "tools-tier-6plus"
  match:
    path: "/v1/chat/completions"
    headers:
      x-lunargate-requires-tools: "true"
      x-lunargate-complexity: "6+"
  targets:
    - provider: openai
      model: "${LUNARGATE_HEAVY_MODEL}"
      weight: 100

Route ordering matters

Put the most specific routes first:

1. forced-route or forced-provider patterns
2. tool-aware routes
3. complexity-tier routes
4. generic default route

If you reverse that order, the default route can match before the more specific autorouting rules.
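
One way to catch ordering mistakes early is a small lint over the route list. The sketch below assumes first-match-wins evaluation and exact header equality (neither is confirmed by this page) and flags any route that can never fire because an earlier route on the same path requires only a subset of its headers. `shadowed_routes` and the route dictionaries are hypothetical.

```python
def shadowed_routes(routes: list[dict]) -> list[tuple[str, str]]:
    """Return (earlier, later) pairs where `earlier` always matches before `later`."""
    pairs = []
    for i, later in enumerate(routes):
        later_headers = later["match"].get("headers", {})
        for earlier in routes[:i]:
            if earlier["match"]["path"] != later["match"]["path"]:
                continue
            earlier_headers = earlier["match"].get("headers", {})
            # `earlier` is at least as general when every header it requires
            # is required with the same value by `later`.
            if earlier_headers.items() <= later_headers.items():
                pairs.append((earlier["name"], later["name"]))
    return pairs


PATH = "/v1/chat/completions"
tools = {"name": "tools-tier-6plus",
         "match": {"path": PATH,
                   "headers": {"x-lunargate-requires-tools": "true",
                               "x-lunargate-complexity": "6+"}}}
tier = {"name": "complexity-tier-6plus",
        "match": {"path": PATH,
                  "headers": {"x-lunargate-complexity": "6+"}}}
default = {"name": "default", "match": {"path": PATH}}

recommended = shadowed_routes([tools, tier, default])  # most specific first
reversed_order = shadowed_routes([default, tier, tools])  # default first
```

With the recommended order nothing is shadowed. With the reversed order, the default route shadows both specific routes and the complexity route shadows the tool-aware one.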

What `override_user_model` changes

With override_user_model: false, explicit client model/provider choices still matter, and the selector mainly enriches headers for routing.

With override_user_model: true, the gateway can replace the user model with the resolved route target more aggressively.

For most lunargate/auto setups, false is a good default because the special pseudo-model already signals that the gateway should decide.

What the client can observe

Useful response headers include:

X-LunarGate-Provider
X-LunarGate-Model
X-LunarGate-Route

Routing of the request, in turn, is driven by selector-emitted headers such as:

x-lunargate-complexity
x-lunargate-complexity-score
x-lunargate-skill
x-lunargate-requires-tools
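
On the client side, the response headers listed above can be pulled out of any HTTP response. The helper below is a minimal sketch that works on a plain header mapping; `resolved_target` is a hypothetical name. With the OpenAI Python SDK, one way to obtain such a mapping is `client.chat.completions.with_raw_response.create(...)`, whose result exposes `.headers`.

```python
from typing import Mapping, Optional


def resolved_target(headers: Mapping[str, str]) -> dict[str, Optional[str]]:
    """Extract the gateway's routing decision from response headers."""
    # Header names are case-insensitive; normalise keys before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    return {
        "provider": lowered.get("x-lunargate-provider"),
        "model": lowered.get("x-lunargate-model"),
        "route": lowered.get("x-lunargate-route"),
    }
```

Logging this dictionary next to each request makes it easy to confirm which tier actually served the traffic after a config change.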

Two practical patterns

Stable app, changing tiers

Keep application code fixed on lunargate/auto and update only gateway config when you want to:

move traffic to cheaper models
promote a better model into the heavy tier
split traffic across accounts/providers
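
For the traffic-splitting case, the `weighted` strategy presumably picks among a route's targets in proportion to their `weight` values. The sketch below illustrates that idea with the Python standard library. The account names, the 80/20 split, and `pick_target` are invented for illustration, and the gateway's actual selection algorithm is not described on this page.

```python
import random


def pick_target(targets: list[dict], rng: random.Random) -> dict:
    """Pick one target with probability proportional to its weight (assumed semantics)."""
    weights = [t["weight"] for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]


# Hypothetical targets: the same model served through two provider accounts.
targets = [
    {"provider": "openai-account-a", "model": "gpt-5.4-mini", "weight": 80},
    {"provider": "openai-account-b", "model": "gpt-5.4-mini", "weight": 20},
]

rng = random.Random(0)  # seeded so the simulation is repeatable
counts = {t["provider"]: 0 for t in targets}
for _ in range(10_000):
    counts[pick_target(targets, rng)["provider"]] += 1
```

Over many requests the counts approach the configured ratio, so shifting traffic between accounts only requires editing the weights in gateway config.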

Capability gating

Use x-lunargate-requires-tools: true routes to keep tool-bearing traffic away from models that do not support tool calling reliably.

Best companion material

Runnable example: Python lunargate/auto demo
Config reference: model_selection
Config reference: routing
