mirror of https://github.com/moltbot/moltbot.git, synced 2026-04-25 23:47:20 +00:00
fix: add runtime model contextTokens caps
@@ -18,6 +18,8 @@ For model selection rules, see [/concepts/models](/concepts/models).
- CLI helpers: `openclaw onboard`, `openclaw models list`, `openclaw models set <provider/model>`.
- Fallback runtime rules, cooldown probes, and session-override persistence are
  documented in [/concepts/model-failover](/concepts/model-failover).
- `models.providers.*.models[].contextWindow` is native model metadata;
  `models.providers.*.models[].contextTokens` is the effective runtime cap.
- Provider plugins can inject model catalogs via `registerProvider({ catalog })`;
  OpenClaw merges that output into `models.providers` before writing
  `models.json`.
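The `contextWindow` / `contextTokens` distinction can be shown in a single provider entry. A minimal sketch — the `acme` provider id, model id, and numbers are illustrative, not catalog values:

```json5
{
  models: {
    providers: {
      // Hypothetical provider entry, for illustration only.
      "acme": {
        models: [
          {
            id: "acme-large",
            contextWindow: 200000, // native model metadata
            contextTokens: 120000, // optional effective runtime cap
          },
        ],
      },
    },
  },
}
```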
@@ -187,6 +189,7 @@ OpenClaw ships with the pi-ai catalog. These providers require **no**
- `params.serviceTier` is also forwarded on native Codex Responses requests (`chatgpt.com/backend-api`)
- Shares the same `/fast` toggle and `params.fastMode` config as direct `openai/*`; OpenClaw maps that to `service_tier=priority`
- `openai-codex/gpt-5.3-codex-spark` remains available when the Codex OAuth catalog exposes it; entitlement-dependent
- `openai-codex/gpt-5.4` keeps native `contextWindow = 1050000` and a default runtime `contextTokens = 272000`; override the runtime cap with `models.providers.openai-codex.models[].contextTokens`
- Policy note: OpenAI Codex OAuth is explicitly supported for external tools/workflows like OpenClaw.

```json5
@@ -195,6 +198,18 @@ OpenClaw ships with the pi-ai catalog. These providers require **no**
}
```

```json5
{
  models: {
    providers: {
      "openai-codex": {
        models: [{ id: "gpt-5.4", contextTokens: 160000 }],
      },
    },
  },
}
```

### Other subscription-style hosted options

- [Qwen / Model Studio](/providers/qwen_modelstudio): Alibaba Cloud Standard pay-as-you-go and Coding Plan subscription endpoints
@@ -2186,6 +2186,7 @@ OpenClaw uses the built-in model catalog. Add custom providers via `models.provi
        input: ["text"],
        cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
        contextWindow: 128000,
        contextTokens: 96000,
        maxTokens: 32000,
      },
    ],
@@ -2204,6 +2205,7 @@ OpenClaw uses the built-in model catalog. Add custom providers via `models.provi
- SecretRef-managed provider header values are refreshed from source markers (`secretref-env:ENV_VAR_NAME` for env refs, `secretref-managed` for file/exec refs).
- Empty or missing agent `apiKey`/`baseUrl` values fall back to `models.providers` in config.
- Matching model `contextWindow`/`maxTokens` use the higher of the explicit config value and the implicit catalog value.
- Matching model `contextTokens` preserves an explicit runtime cap when present; use it to limit effective context without changing native model metadata.
- Use `models.mode: "replace"` when you want config to fully rewrite `models.json`.
- Marker persistence is source-authoritative: markers are written from the active source config snapshot (pre-resolution), not from resolved runtime secret values.
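As a minimal sketch of the `models.mode: "replace"` rule — the provider id, `baseUrl`, and model id below are illustrative, and the merge-by-default behavior when `mode` is omitted is an assumption inferred from the rules above:

```json5
{
  models: {
    // "replace" makes this config fully rewrite models.json;
    // omitting it would merge these entries into the existing catalog.
    mode: "replace",
    providers: {
      // Hypothetical provider for illustration only.
      "my-proxy": {
        baseUrl: "https://llm-proxy.internal/v1",
        models: [{ id: "proxy-model", contextTokens: 64000 }],
      },
    },
  },
}
```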
@@ -2219,6 +2221,8 @@ OpenClaw uses the built-in model catalog. Add custom providers via `models.provi
- `models.providers.*.baseUrl`: upstream API base URL.
- `models.providers.*.headers`: extra static headers for proxy/tenant routing.
- `models.providers.*.models`: explicit provider model catalog entries.
- `models.providers.*.models.*.contextWindow`: native model context window metadata.
- `models.providers.*.models.*.contextTokens`: optional runtime context cap. Use this when you want a smaller effective context budget than the model's native `contextWindow`.
- `models.providers.*.models.*.compat.supportsDeveloperRole`: optional compatibility hint. For `api: "openai-completions"` with a non-empty non-native `baseUrl` (host not `api.openai.com`), OpenClaw forces this to `false` at runtime. Empty/omitted `baseUrl` keeps default OpenAI behavior.
- `models.bedrockDiscovery`: Bedrock auto-discovery settings root.
- `models.bedrockDiscovery.enabled`: turn discovery polling on/off.
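The keys above can be combined in one `models` block. A hedged sketch — the provider id, hostname, and header name are illustrative, not real catalog entries:

```json5
{
  models: {
    providers: {
      // Hypothetical openai-completions provider behind a proxy.
      "openai-compat-proxy": {
        api: "openai-completions",
        baseUrl: "https://llm.example.com/v1", // non-native host
        headers: { "X-Tenant": "team-a" },     // static routing header
        models: [
          {
            id: "gpt-5.4",
            contextWindow: 1050000, // native metadata
            contextTokens: 272000,  // runtime cap
            // For a non-native baseUrl host, OpenClaw forces this
            // to false at runtime regardless of what is set here.
            compat: { supportsDeveloperRole: false },
          },
        ],
      },
    },
    bedrockDiscovery: {
      enabled: false, // discovery polling on/off
    },
  },
}
```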
@@ -143,6 +143,41 @@ discovers it. Treat it as entitlement-dependent and experimental: Codex Spark is
separate from GPT-5.4 `/fast`, and availability depends on the signed-in Codex /
ChatGPT account.

### Codex context window cap

OpenClaw treats the Codex model metadata and the runtime context cap as separate
values.

For `openai-codex/gpt-5.4`:

- native `contextWindow`: `1050000`
- default runtime `contextTokens` cap: `272000`

That keeps model metadata truthful while preserving the smaller default runtime
window that has better latency and quality characteristics in practice.

If you want a different effective cap, set `models.providers.<provider>.models[].contextTokens`:

```json5
{
  models: {
    providers: {
      "openai-codex": {
        models: [
          {
            id: "gpt-5.4",
            contextTokens: 160000,
          },
        ],
      },
    },
  },
}
```

Use `contextWindow` only when you are declaring or overriding native model
metadata. Use `contextTokens` when you want to limit the runtime context budget.

### Transport default

OpenClaw uses `pi-ai` for model streaming. For both `openai/*` and