fix: add runtime model contextTokens caps

Peter Steinberger
2026-04-04 09:35:59 +09:00
parent 45675c1698
commit 58d2b9dd46
25 changed files with 350 additions and 52 deletions


@@ -18,6 +18,8 @@ For model selection rules, see [/concepts/models](/concepts/models).
- CLI helpers: `openclaw onboard`, `openclaw models list`, `openclaw models set <provider/model>`.
- Fallback runtime rules, cooldown probes, and session-override persistence are
documented in [/concepts/model-failover](/concepts/model-failover).
- `models.providers.*.models[].contextWindow` is native model metadata;
`models.providers.*.models[].contextTokens` is the effective runtime cap.
- Provider plugins can inject model catalogs via `registerProvider({ catalog })`;
OpenClaw merges that output into `models.providers` before writing
`models.json`.
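
The split between the two fields can be sketched in a single catalog entry (provider and model ids below are illustrative, not part of the shipped catalog):

```json5
{
  models: {
    providers: {
      "my-proxy": {
        models: [
          {
            id: "my-model",
            // native metadata: what the upstream model actually supports
            contextWindow: 1050000,
            // runtime cap: the effective context budget OpenClaw enforces
            contextTokens: 272000,
          },
        ],
      },
    },
  },
}
```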
@@ -187,6 +189,7 @@ OpenClaw ships with the piai catalog. These providers require **no**
- `params.serviceTier` is also forwarded on native Codex Responses requests (`chatgpt.com/backend-api`)
- Shares the same `/fast` toggle and `params.fastMode` config as direct `openai/*`; OpenClaw maps that to `service_tier=priority`
- `openai-codex/gpt-5.3-codex-spark` remains available when the Codex OAuth catalog exposes it; entitlement-dependent
- `openai-codex/gpt-5.4` keeps native `contextWindow = 1050000` and a default runtime `contextTokens = 272000`; override the runtime cap with `models.providers.openai-codex.models[].contextTokens`
- Policy note: OpenAI Codex OAuth is explicitly supported for external tools/workflows like OpenClaw.
```json5
@@ -195,6 +198,18 @@ OpenClaw ships with the piai catalog. These providers require **no**
}
```
```json5
{
models: {
providers: {
"openai-codex": {
models: [{ id: "gpt-5.4", contextTokens: 160000 }],
},
},
},
}
```
### Other subscription-style hosted options
- [Qwen / Model Studio](/providers/qwen_modelstudio): Alibaba Cloud Standard pay-as-you-go and Coding Plan subscription endpoints


@@ -2186,6 +2186,7 @@ OpenClaw uses the built-in model catalog. Add custom providers via `models.provi
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 128000,
contextTokens: 96000,
maxTokens: 32000,
},
],
@@ -2204,6 +2205,7 @@ OpenClaw uses the built-in model catalog. Add custom providers via `models.provi
- SecretRef-managed provider header values are refreshed from source markers (`secretref-env:ENV_VAR_NAME` for env refs, `secretref-managed` for file/exec refs).
- Empty or missing agent `apiKey`/`baseUrl` fall back to `models.providers` in config.
- For matching models, `contextWindow`/`maxTokens` take the higher of the explicit config value and the implicit catalog value.
- Matching model `contextTokens` preserves an explicit runtime cap when present; use it to limit effective context without changing native model metadata.
- Use `models.mode: "replace"` when you want config to fully rewrite `models.json`.
- Marker persistence is source-authoritative: markers are written from the active source config snapshot (pre-resolution), not from resolved runtime secret values.
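
As a sketch of the merge rules above (values are illustrative): an entry that sets only `contextTokens` pins the runtime cap while the catalog's native metadata survives the merge, whereas `models.mode: "replace"` discards the catalog entirely:

```json5
{
  models: {
    // omit mode (or use the default) to overlay config onto the built-in catalog;
    // set mode: "replace" to have config fully rewrite models.json instead
    providers: {
      "openai-codex": {
        models: [
          {
            id: "gpt-5.4",
            // explicit runtime cap is preserved as-is by the merge
            contextTokens: 160000,
            // contextWindow omitted: the higher catalog value wins
          },
        ],
      },
    },
  },
}
```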
@@ -2219,6 +2221,8 @@ OpenClaw uses the built-in model catalog. Add custom providers via `models.provi
- `models.providers.*.baseUrl`: upstream API base URL.
- `models.providers.*.headers`: extra static headers for proxy/tenant routing.
- `models.providers.*.models`: explicit provider model catalog entries.
- `models.providers.*.models.*.contextWindow`: native model context window metadata.
- `models.providers.*.models.*.contextTokens`: optional runtime context cap. Use this when you want a smaller effective context budget than the model's native `contextWindow`.
- `models.providers.*.models.*.compat.supportsDeveloperRole`: optional compatibility hint. For `api: "openai-completions"` with a non-empty non-native `baseUrl` (host not `api.openai.com`), OpenClaw forces this to `false` at runtime. Empty/omitted `baseUrl` keeps default OpenAI behavior.
- `models.bedrockDiscovery`: Bedrock auto-discovery settings root.
- `models.bedrockDiscovery.enabled`: turn discovery polling on/off.
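
A hypothetical custom provider entry tying these fields together (host, header, and model names are illustrative):

```json5
{
  models: {
    providers: {
      "my-proxy": {
        api: "openai-completions",
        baseUrl: "https://llm-proxy.internal.example/v1",
        headers: { "X-Tenant": "team-a" }, // static proxy/tenant routing header
        models: [
          {
            id: "my-model",
            contextWindow: 128000, // native model metadata
            contextTokens: 96000,  // smaller effective runtime budget
            // forced to false at runtime anyway for a non-native baseUrl
            compat: { supportsDeveloperRole: false },
          },
        ],
      },
    },
  },
}
```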


@@ -143,6 +143,41 @@ discovers it. Treat it as entitlement-dependent and experimental: Codex Spark is
separate from GPT-5.4 `/fast`, and availability depends on the signed-in Codex /
ChatGPT account.
### Codex context window cap
OpenClaw treats the Codex model metadata and the runtime context cap as separate
values.
For `openai-codex/gpt-5.4`:
- native `contextWindow`: `1050000`
- default runtime `contextTokens` cap: `272000`
This keeps the model metadata truthful while preserving the smaller default runtime
window, which in practice gives better latency and quality.
If you want a different effective cap, set `models.providers.<provider>.models[].contextTokens`:
```json5
{
models: {
providers: {
"openai-codex": {
models: [
{
id: "gpt-5.4",
contextTokens: 160000,
},
],
},
},
},
}
```
Use `contextWindow` only when you are declaring or overriding native model
metadata. Use `contextTokens` when you want to limit the runtime context budget.
### Transport default
OpenClaw uses `pi-ai` for model streaming. For both `openai/*` and