routing¶
The routing section decides which provider/model handles a request.
Top-level fields¶
| Field | Type | Default | Notes |
|---|---|---|---|
| default_strategy | string | round-robin | supported values: weighted, round-robin, random |
| routes | list | required in practice | ordered route list evaluated from top to bottom |
Route shape¶
routing:
  default_strategy: "weighted"
  routes:
    - name: "default"
      match:
        path: "/v1/chat/completions"
        headers:
          x-lunargate-provider: "openai"
      targets:
        - provider: openai
          model: gpt-5.2
          weight: 100
      fallback:
        - provider: anthropic
          model: claude-sonnet-4-5
          weight: 100
If you want separate policies for chat and embeddings, use separate match.path rules:
routing:
  routes:
    - name: "chat-default"
      match:
        path: "/v1/chat/completions"
      targets:
        - provider: openai
          model: gpt-5.2
    - name: "embeddings-default"
      match:
        path: "/v1/embeddings"
      targets:
        - provider: ollama
          model: nomic-embed-text-v2-moe
Route fields¶
name¶
Human-readable identifier. It is returned in the X-LunarGate-Route header and is useful in logs and observability.
match.path¶
- "*" matches everything
- any other value is treated as a prefix match
For chat traffic, the usual value is "/v1/chat/completions".
For embeddings traffic, the usual value is "/v1/embeddings".
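As a rough sketch of the two matching modes, the fragment below pairs a prefix rule with a catch-all. The route names and the "/v1/chat" prefix are illustrative assumptions; the providers and models are reused from the examples earlier on this page.

```yaml
routing:
  routes:
    # Prefix match: under the prefix rule, "/v1/chat" should also
    # cover "/v1/chat/completions".
    - name: "chat-prefix"        # hypothetical route name
      match:
        path: "/v1/chat"
      targets:
        - provider: openai
          model: gpt-5.2
    # Catch-all: "*" matches every path, so it belongs last.
    - name: "catch-all"          # hypothetical route name
      match:
        path: "*"
      targets:
        - provider: anthropic
          model: claude-sonnet-4-5
```

Because any non-"*" value is a prefix, a short path such as "/v1" would likely capture embeddings traffic as well, so prefer the most specific prefix that fits.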
match.headers¶
Exact-match header map.
Common examples:
- x-lunargate-provider
- x-lunargate-model
- x-lunargate-complexity
- x-lunargate-requires-tools
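A header rule might look like the following sketch. The route name and the header value "high" are assumptions made for illustration; this page does not define which values x-lunargate-complexity accepts.

```yaml
routing:
  routes:
    - name: "complex-chat"       # hypothetical route name
      match:
        path: "/v1/chat/completions"
        headers:
          # Exact match: "high" is an assumed convention, not a value
          # defined by this page.
          x-lunargate-complexity: "high"
      targets:
        - provider: anthropic
          model: claude-sonnet-4-5
```

Since header matching is exact, a request that sends a different value (or omits the header) would presumably not match this route and would continue to the next one.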
targets¶
Each target needs:
| Field | Notes |
|---|---|
| provider | provider ID from the providers section |
| model | upstream model name for that provider |
| weight | used by weighted; values <= 0 behave like 1 |
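To illustrate how weight might be used, here is a minimal sketch of a two-target split. The route name and the 80/20 numbers are hypothetical, and the reading that weights act as relative shares is an assumption rather than something this page states.

```yaml
routing:
  default_strategy: "weighted"
  routes:
    - name: "chat-split"         # hypothetical route name
      match:
        path: "/v1/chat/completions"
      targets:
        # Assumed to receive roughly 80 of every 100 requests.
        - provider: openai
          model: gpt-5.2
          weight: 80
        # Assumed to receive the remaining share.
        - provider: anthropic
          model: claude-sonnet-4-5
          weight: 20
```

Avoid setting a weight to 0 to disable a target: values <= 0 behave like 1, so the target would still be eligible.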
fallback¶
Ordered list of backup targets used by the fallback executor if the primary path fails.
Supported balancing strategies¶
weighted¶
Chooses between eligible targets based on weight.
round-robin¶
Cycles through eligible targets evenly.
random¶
Selects a random eligible target.
If you set an unknown strategy, the engine falls back to weighted selection.
Route order matters¶
Routes are checked from top to bottom.
Recommended order:
- forced-provider or forced-route rules
- tool-aware rules
- complexity-tier rules
- generic default route
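The recommended order could be laid out as in the sketch below. Route names and the header values "true" and "high" are illustrative assumptions, and the fragment assumes that the first matching route wins.

```yaml
routing:
  routes:
    # 1. Forced-provider rule
    - name: "force-openai"       # hypothetical route name
      match:
        path: "/v1/chat/completions"
        headers:
          x-lunargate-provider: "openai"
      targets:
        - provider: openai
          model: gpt-5.2
    # 2. Tool-aware rule ("true" is an assumed header value)
    - name: "needs-tools"        # hypothetical route name
      match:
        path: "/v1/chat/completions"
        headers:
          x-lunargate-requires-tools: "true"
      targets:
        - provider: openai
          model: gpt-5.2
    # 3. Complexity-tier rule ("high" is an assumed header value)
    - name: "complex"            # hypothetical route name
      match:
        path: "/v1/chat/completions"
        headers:
          x-lunargate-complexity: "high"
      targets:
        - provider: anthropic
          model: claude-sonnet-4-5
    # 4. Generic default route, kept last so it cannot shadow the rules above
    - name: "default"
      match:
        path: "*"
      targets:
        - provider: openai
          model: gpt-5.2
```

If the catch-all route were listed first, it would match every request and the more specific rules would never be reached.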
Request-side overrides that interact with routing¶
The runtime also understands request headers such as:
- X-LunarGate-Provider
- X-LunarGate-Model
- X-LunarGate-Route
Those headers do not replace routing config, but they can narrow the set of eligible targets or force a specific named route.
Practical guidance¶
- Keep at least one generic fallback route at the bottom.
- Make sure every target.provider exists in providers.
- Use header-based routing for team, environment, capability, or complexity decisions.
- Use separate path matches when chat and embeddings should go to different upstream models or providers.
- Use the lunargate/auto technique when you want the gateway to decide tiers from one stable client model.