lunargate/auto and autorouting¶
lunargate/auto is a stable client-side model name that tells the gateway: "you decide which concrete provider/model should handle this request."
That lets application code stay fixed while the gateway moves traffic between tiers, providers, or capabilities.
The core idea¶
From the client side, you keep sending one model ID:
resp = client.chat.completions.create(
    model="lunargate/auto",
    messages=[{"role": "user", "content": "Plan a migration rollout."}],
)
Inside the gateway:
- model_selection scores the request
- the gateway emits routing headers like x-lunargate-complexity
- routing matches those headers and selects a target model
- the response comes back with resolved headers such as X-LunarGate-Provider and X-LunarGate-Model
What you need in config¶
1. Enable model_selection¶
model_selection:
  enabled: true
  override_user_model: false
  output_headers:
    complexity: "x-lunargate-complexity"
    score: "x-lunargate-complexity-score"
    skill: "x-lunargate-skill"
2. Route on the emitted headers¶
A simplified pattern looks like this:
routing:
  default_strategy: "weighted"
  routes:
    - name: "complexity-tier-0-1"
      match:
        path: "/v1/chat/completions"
        headers:
          x-lunargate-complexity: "0-1"
      targets:
        - provider: openai
          model: gpt-5.4-nano
          weight: 100
    - name: "complexity-tier-4-5"
      match:
        path: "/v1/chat/completions"
        headers:
          x-lunargate-complexity: "4-5"
      targets:
        - provider: openai
          model: gpt-5.4-mini
          weight: 100
    - name: "complexity-tier-6plus"
      match:
        path: "/v1/chat/completions"
        headers:
          x-lunargate-complexity: "6+"
      targets:
        - provider: openai
          model: gpt-5.4
          weight: 100
Tool-aware routing¶
When a request contains tools or tool_choice, the selector also injects:
x-lunargate-requires-tools: "true"
That lets you add tool-capability routes before your normal complexity routes.
Example pattern from the runnable demo:
- name: "tools-tier-6plus"
  match:
    path: "/v1/chat/completions"
    headers:
      x-lunargate-requires-tools: "true"
      x-lunargate-complexity: "6+"
  targets:
    - provider: openai
      model: "${LUNARGATE_HEAVY_MODEL}"
      weight: 100
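To make the tool-detection rule concrete, here is a minimal Python sketch of how a selector might derive the x-lunargate-requires-tools header from a chat-completions request body. The function name tool_routing_headers is hypothetical, and treating an empty tools list as "no tools" is an assumption; the gateway's real logic may differ.

```python
# Hypothetical helper: mirrors the rule described in this section,
# not the gateway's actual implementation.
def tool_routing_headers(body):
    """Return the extra routing header for a chat-completions request body."""
    # Assumption: an empty "tools" list does not count as tool usage.
    has_tools = bool(body.get("tools")) or body.get("tool_choice") is not None
    return {"x-lunargate-requires-tools": "true"} if has_tools else {}


plain = {
    "model": "lunargate/auto",
    "messages": [{"role": "user", "content": "Hi"}],
}
with_tools = {
    **plain,
    # Made-up tool definition, for illustration only.
    "tools": [{"type": "function", "function": {"name": "get_weather", "parameters": {}}}],
}

plain_headers = tool_routing_headers(plain)
tool_headers = tool_routing_headers(with_tools)
print(plain_headers)  # {}
print(tool_headers)   # {'x-lunargate-requires-tools': 'true'}
```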
Route ordering matters¶
Put the most specific routes first:
- forced-route or forced-provider patterns
- tool-aware routes
- complexity-tier routes
- generic default route
If you reverse that order, the default route can match before the more specific autorouting rules.
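The effect of ordering can be illustrated with a small Python model of first-match routing. This is a sketch under the assumption that routes are evaluated top to bottom and the first match wins; the Route type and pick_route helper are hypothetical and are not the gateway's code.

```python
# Illustrative model of ordered, first-match routing.
# Route and pick_route are hypothetical names, not gateway internals.
from dataclasses import dataclass, field


@dataclass
class Route:
    name: str
    headers: dict = field(default_factory=dict)  # header values the route requires


def pick_route(routes, request_headers):
    """Return the name of the first route whose required headers all match."""
    for route in routes:
        if all(request_headers.get(k) == v for k, v in route.headers.items()):
            return route.name
    return None


tools_heavy = Route("tools-tier-6plus", {
    "x-lunargate-requires-tools": "true",
    "x-lunargate-complexity": "6+",
})
heavy = Route("complexity-tier-6plus", {"x-lunargate-complexity": "6+"})
default = Route("default")  # no header conditions, so it matches every request

request = {"x-lunargate-requires-tools": "true", "x-lunargate-complexity": "6+"}

specific_first = pick_route([tools_heavy, heavy, default], request)
default_first = pick_route([default, heavy, tools_heavy], request)
print(specific_first)  # tools-tier-6plus
print(default_first)   # default
```

With the default route listed first, the same request never reaches the tool-aware rule, which is the shadowing problem described above.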
What override_user_model changes¶
- override_user_model: false
  - explicit client model/provider choices still matter
  - the selector mainly enriches headers for routing
- override_user_model: true
  - the gateway can replace the user model with the resolved route target more aggressively
For most lunargate/auto setups, false is a good default because the special pseudo-model already signals that the gateway should decide.
What the client can observe¶
Useful response headers include:
- X-LunarGate-Provider
- X-LunarGate-Model
- X-LunarGate-Route
And the request can be influenced by selector-emitted routing headers such as:
- x-lunargate-complexity
- x-lunargate-complexity-score
- x-lunargate-skill
- x-lunargate-requires-tools
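For logging or debugging, a client can pull the resolved route out of the response headers. The sketch below uses a hypothetical helper, resolved_route, that works on any mapping of header names to values; the header values in the example are made up.

```python
# Hypothetical helper for reading the gateway's resolved-route headers.
def resolved_route(headers):
    """Extract provider, model and route from response headers (case-insensitive)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        "provider": lowered.get("x-lunargate-provider"),
        "model": lowered.get("x-lunargate-model"),
        "route": lowered.get("x-lunargate-route"),
    }


# Made-up header values, for illustration only.
example = {
    "X-LunarGate-Provider": "openai",
    "X-LunarGate-Model": "gpt-5.4-mini",
    "x-lunargate-route": "complexity-tier-4-5",
}
info = resolved_route(example)
print(info["provider"], info["model"], info["route"])
```

If the client is the OpenAI Python SDK, one way to get at the headers is client.chat.completions.with_raw_response.create(...), whose result exposes .headers and a .parse() method that returns the usual completion object. Other clients expose headers differently.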
Two practical patterns¶
Stable app, changing tiers¶
Keep application code fixed on lunargate/auto and update only gateway config when you want to:
- move traffic to cheaper models
- promote a better model into the heavy tier
- split traffic across accounts/providers
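As a rough illustration of splitting traffic, the Python sketch below simulates a weighted choice between two targets. It assumes that weights are proportional shares, which this page does not confirm; the 80/20 split and the pick_target helper are hypothetical.

```python
# Simulation of a weighted split between two targets.
# Assumption: weights act as proportional shares of traffic.
import random
from collections import Counter


def pick_target(targets, rng):
    """Choose one target at random, in proportion to its weight."""
    weights = [t["weight"] for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]


# Made-up 80/20 split, for illustration only.
targets = [
    {"provider": "openai", "model": "gpt-5.4-mini", "weight": 80},
    {"provider": "openai", "model": "gpt-5.4", "weight": 20},
]

rng = random.Random(0)  # fixed seed so the simulation is repeatable
counts = Counter(pick_target(targets, rng)["model"] for _ in range(10_000))
print(counts)
```

Changing the weights in gateway config would shift these shares without any change to application code, which is the point of keeping the client on lunargate/auto.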
Capability gating¶
Use x-lunargate-requires-tools: true routes to keep tool-bearing traffic away from models that do not support tool calling reliably.
Best companion material¶
- Runnable example: Python lunargate/auto demo
- Config reference: model_selection
- Config reference: routing