# Configuration overview

YAML reference
LunarGate configuration is YAML-based, hot-reloadable, and environment-variable friendly. The gateway loads one config file, applies defaults, expands `${ENV_VAR}` placeholders across all string fields, and then uses the result to build providers, routing, retries, observability, and the rest of the runtime behavior.
The mental model:

- `providers` tells the gateway what upstreams exist
- `routing` decides which upstream handles a request
- `model_selection` enriches requests so routing can act on complexity or skill
- the rest of the sections shape operational behavior like retries, caching, logging, and Dashboard export
**Start with providers**

Define real upstreams, credentials, base URLs, and model discovery behavior.

**Need autorouting?**

Jump to the `lunargate/auto` technique and then come back to `model_selection`.
## How config loading works

- The gateway reads the config file you pass with `--config`.
- It also tries to load a `.env` file from the config directory and then from the current working directory.
- `${ENV_VAR}` expansion works across all string fields in the parsed config, not just provider API keys.
- `data_sharing.backend_url` is normalized to a versioned backend base URL and falls back to the gateway default when omitted.
- Hot reload reconciles provider translators, routing, retries, cache, rate limits, model selection, collector behavior, and remote control in-place.
- Server listener settings such as bind address, port, and HTTP timeouts still require a process restart.
- Most operational sections have sane defaults, so a minimal config can stay very short.
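To make the expansion behavior concrete, here is a sketch of a fragment that relies on it. The `base_url` and `LUNARGATE_BACKEND_URL` names are illustrative assumptions, not guaranteed keys; only `api_key` appears in the minimal config below.

```yaml
providers:
  openai:
    api_key: "${OPENAI_API_KEY}"    # resolved from a .env file or the process environment
    base_url: "${OPENAI_BASE_URL}"  # assumed field name; expansion applies to any string field

data_sharing:
  # If omitted, backend_url falls back to the gateway default and is
  # normalized to a versioned backend base URL.
  backend_url: "${LUNARGATE_BACKEND_URL}"
```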
## Top-level sections

| Section | What it controls | Detailed page |
|---|---|---|
| `server` | bind address, port, and HTTP timeouts | server |
| `providers` | upstream providers, credentials, base URLs, model discovery, and default models | providers |
| `routing` | route matching, targets, balancing strategy, and fallback chains | routing |
| `model_selection` | complexity scoring, output headers, and autorouting inputs | model_selection |
| `rate_limiting` | in-memory throttling for inbound requests | rate_limiting |
| `caching` | in-memory exact-match response cache | caching |
| `retry` | retry policy for upstream failures | retry |
| `logging` | log level and output format | logging |
| `security` | config shape for inbound API keys; still not the primary hardening layer | security |
| `data_sharing` | Dashboard observability export, gateway identity, geo tags, and remote control | data_sharing |
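Taken together, a fully sectioned config has roughly this shape. The empty mappings are placeholders only; see each detailed page for the real fields.

```yaml
server: {}           # bind address, port, HTTP timeouts (changes require restart)
providers: {}        # upstreams, credentials, base URLs, model discovery
routing: {}          # route matching, targets, balancing, fallback chains
model_selection: {}  # complexity scoring and autorouting inputs
rate_limiting: {}    # in-memory inbound throttling
caching: {}          # in-memory exact-match response cache
retry: {}            # retry policy for upstream failures
logging: {}          # log level and output format
security: {}         # inbound API key config shape
data_sharing: {}     # Dashboard export, identity, geo tags, remote control
```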
## Minimal config

This is enough to start a single-provider gateway:

```yaml
providers:
  openai:
    api_key: "${OPENAI_API_KEY}"

routing:
  routes:
    - name: "default"
      targets:
        - provider: openai
          model: gpt-5.2
```
## Defaults worth knowing

These defaults are applied by the config manager if you omit them:

- `server.host`: `0.0.0.0`
- `server.port`: `8080`
- `server.read_timeout`: `30s`
- `server.write_timeout`: `0s`
- `server.idle_timeout`: `60s`
- `routing.default_strategy`: `round-robin`
- `rate_limiting.enabled`: `false`
- `caching.enabled`: `false`
- `retry.enabled`: `true`
- `logging.level`: `info`
- `logging.format`: `console`
- `model_selection.enabled`: `false`
- `data_sharing.enabled`: `false`
- `data_sharing.backend_url`: gateway default LunarGate backend base URL
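Written out as YAML, those defaults correspond to a config like the following sketch. You never need to write this yourself; it is what the config manager fills in when the fields are omitted.

```yaml
server:
  host: "0.0.0.0"
  port: 8080
  read_timeout: 30s
  write_timeout: 0s
  idle_timeout: 60s
routing:
  default_strategy: round-robin
rate_limiting:
  enabled: false
caching:
  enabled: false
retry:
  enabled: true
logging:
  level: info
  format: console
model_selection:
  enabled: false
data_sharing:
  enabled: false
  # backend_url defaults to the gateway's LunarGate backend base URL
```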
## Recommended reading order

If you are just getting started:

1. Read `providers` to define the upstreams the gateway can actually call.
2. Then read `routing` to decide how requests are matched and where they land.
3. Add `data_sharing` only if you want observability in the LunarGate Dashboard on `app.lunargate.ai`.

If you want `lunargate/auto`:

1. Read `model_selection` to understand the scoring and emitted headers.
2. Then read the `lunargate/auto` technique page for the conceptual routing pattern.
3. Study the runnable `python-auto-tiers-poetry` example if you want to see the full pattern end to end.
## Authoring tip

Tip

Keep specific routes above generic ones, keep provider IDs consistent between `providers` and `routing.targets`, and remember that hot reload makes config changes easy to apply but does not protect you from bad routing logic.
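As an illustration of the ordering advice, here is a sketch with a specific route placed above a generic catch-all. The `match` block and its wildcard syntax are hypothetical; the routing page documents the real matching fields.

```yaml
routing:
  routes:
    - name: "embeddings"              # specific route first
      match:                          # hypothetical matcher; see the routing page
        model: "text-embedding-*"
      targets:
        - provider: openai
          model: text-embedding-3-small
    - name: "default"                 # generic catch-all last
      targets:
        - provider: openai
          model: gpt-5.2
```

Both routes reference the same `openai` provider ID defined under `providers`, which keeps the config consistent under hot reload.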