Python `lunargate/auto` demo¶
This example demonstrates one of the most LunarGate-specific patterns: the client always sends `model="lunargate/auto"`, and the gateway picks a tier based on the request.
Best for¶
Use this example when you want to understand autorouting before rolling it into a real application.
What it demonstrates¶
- a stable client-side model ID: `lunargate/auto`
- `model_selection` scoring and tier headers
- routing different requests to light, balanced, and heavy models
- tool-aware routes that can override normal complexity-tier routing
- OpenAI-compatible `chat.completions` as the client-facing API
- reading `X-LunarGate-*` headers to see what the gateway decided
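To make the tool-aware routing concrete, here is a minimal sketch of a tool-bearing request body. The `get_weather` tool name and its parameters are hypothetical illustrations; only the `tools` schema itself follows the standard OpenAI chat-completions shape.

```python
# A tool-bearing request body. The `get_weather` tool is hypothetical; the
# `tools` field follows the standard OpenAI chat-completions schema.
tool_request = {
    "model": "lunargate/auto",  # the stable client-side model ID
    "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool for illustration
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
}
```

A request shaped like this can match a tool-aware route even when its text alone would score as a light-tier request.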
Default tier mapping in the example¶
- light: `gpt-5.4-nano`
- balanced: `gpt-5.4-mini`
- heavy: `gpt-5.4`
You can change those model names in `.env` without touching application code.
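For illustration, an `.env` sketch for the tier mapping above might look like this. The variable names are assumptions; check `.env.example` for the real ones.

```
# Hypothetical variable names — see .env.example for the real ones.
LUNARGATE_MODEL_LIGHT=gpt-5.4-nano
LUNARGATE_MODEL_BALANCED=gpt-5.4-mini
LUNARGATE_MODEL_HEAVY=gpt-5.4
```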
Run it¶
Start the gateway separately:
Then run the demo:
This example intentionally stays on POST /v1/chat/completions. It does not require the Responses API path.
What you will see¶
The script sends multiple requests with different complexity and tool usage.
After each request it prints:
- the selected provider
- the selected model
- the calculated complexity tier
- either the response content or generated tool calls
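One request/printout cycle can be sketched roughly as below. This is not the demo's actual code: the gateway URL, the API key, the exact `X-LunarGate-*` header names, and the use of the `openai` SDK's `with_raw_response` accessor are all assumptions made for illustration.

```python
# Rough sketch of one request/printout cycle. The gateway URL, API key, and
# exact X-LunarGate-* header names are assumptions, not the demo's real values.
ROUTING_HEADERS = ("x-lunargate-provider", "x-lunargate-model", "x-lunargate-tier")

def routing_info(headers):
    """Collect whichever X-LunarGate-* routing headers are present."""
    return {name: headers.get(name) for name in ROUTING_HEADERS if headers.get(name)}

if __name__ == "__main__":
    from openai import OpenAI  # requires the `openai` package and a running gateway

    client = OpenAI(base_url="http://localhost:8080/v1", api_key="unused")
    raw = client.chat.completions.with_raw_response.create(
        model="lunargate/auto",
        messages=[{"role": "user", "content": "Summarize this sentence."}],
    )
    print(routing_info(dict(raw.headers)))   # provider / model / tier decisions
    message = raw.parse().choices[0].message
    print(message.tool_calls or message.content)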
Important design detail¶
This example sets `weight_tools = 0` in the scoring config.
That is a demo choice, not a universal recommendation. It keeps tool-bearing requests from automatically looking more complex just because they include tools, which makes it easier to show multiple tiers in one short demo run.
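As a rough illustration of where that setting lives, the scoring block might look like the fragment below. The exact key names and nesting are assumptions; `config-simple.yaml.example` has the real schema.

```yaml
# Hypothetical shape — check config-simple.yaml.example for the real schema.
model_selection:
  scoring:
    weight_tools: 0   # demo choice: tools alone should not bump the tier
```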
What to inspect¶
- `config-simple.yaml.example` for tier routes and `model_selection`
- `.env.example` for light/balanced/heavy model variables
- `main.py` for the stable `lunargate/auto` client model
- `config-observability.yaml.example` if you want the same demo with `data_sharing` and remote control enabled
Related docs¶
- Read `lunargate/auto` and autorouting for the conceptual explanation.
- Keep `model_selection` and `routing` open while studying this example.