fix: add runtime model contextTokens caps

Peter Steinberger
2026-04-04 09:35:59 +09:00
parent 45675c1698
commit 58d2b9dd46
25 changed files with 350 additions and 52 deletions


@@ -18,6 +18,8 @@ For model selection rules, see [/concepts/models](/concepts/models).
- CLI helpers: `openclaw onboard`, `openclaw models list`, `openclaw models set <provider/model>`.
- Fallback runtime rules, cooldown probes, and session-override persistence are
documented in [/concepts/model-failover](/concepts/model-failover).
- `models.providers.*.models[].contextWindow` is native model metadata;
`models.providers.*.models[].contextTokens` is the effective runtime cap.
- Provider plugins can inject model catalogs via `registerProvider({ catalog })`;
OpenClaw merges that output into `models.providers` before writing
`models.json`.
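
The split between the two fields can be sketched in a single catalog entry (provider and model ids below are illustrative, not part of the shipped catalog):

```json5
{
  models: {
    providers: {
      "my-proxy": {
        models: [
          {
            id: "my-model",
            // native metadata: what the upstream model actually supports
            contextWindow: 1050000,
            // runtime cap: the effective context budget OpenClaw enforces
            contextTokens: 272000,
          },
        ],
      },
    },
  },
}
```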
@@ -187,6 +189,7 @@ OpenClaw ships with the piai catalog. These providers require **no**
- `params.serviceTier` is also forwarded on native Codex Responses requests (`chatgpt.com/backend-api`)
- Shares the same `/fast` toggle and `params.fastMode` config as direct `openai/*`; OpenClaw maps that to `service_tier=priority`
- `openai-codex/gpt-5.3-codex-spark` remains available when the Codex OAuth catalog exposes it; entitlement-dependent
- `openai-codex/gpt-5.4` keeps native `contextWindow = 1050000` and a default runtime `contextTokens = 272000`; override the runtime cap with `models.providers.openai-codex.models[].contextTokens`
- Policy note: OpenAI Codex OAuth is explicitly supported for external tools/workflows like OpenClaw.
```json5
@@ -195,6 +198,18 @@ OpenClaw ships with the piai catalog. These providers require **no**
}
```
```json5
{
models: {
providers: {
"openai-codex": {
models: [{ id: "gpt-5.4", contextTokens: 160000 }],
},
},
},
}
```
### Other subscription-style hosted options
- [Qwen / Model Studio](/providers/qwen_modelstudio): Alibaba Cloud Standard pay-as-you-go and Coding Plan subscription endpoints


@@ -2186,6 +2186,7 @@ OpenClaw uses the built-in model catalog. Add custom providers via `models.provi
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 128000,
contextTokens: 96000,
maxTokens: 32000,
},
],
@@ -2204,6 +2205,7 @@ OpenClaw uses the built-in model catalog. Add custom providers via `models.provi
- SecretRef-managed provider header values are refreshed from source markers (`secretref-env:ENV_VAR_NAME` for env refs, `secretref-managed` for file/exec refs).
- Empty or missing agent `apiKey`/`baseUrl` fall back to `models.providers` in config.
- For matching models, `contextWindow`/`maxTokens` take the higher of the explicit config value and the implicit catalog value.
- Matching model `contextTokens` preserves an explicit runtime cap when present; use it to limit effective context without changing native model metadata.
- Use `models.mode: "replace"` when you want config to fully rewrite `models.json`.
- Marker persistence is source-authoritative: markers are written from the active source config snapshot (pre-resolution), not from resolved runtime secret values.
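
As a sketch of the merge rules above (values are illustrative): an entry that sets only `contextTokens` pins the runtime cap while the catalog's native metadata survives the merge, whereas `models.mode: "replace"` discards the catalog entirely:

```json5
{
  models: {
    // omit mode (or use the default) to overlay config onto the built-in catalog;
    // set mode: "replace" to have config fully rewrite models.json instead
    providers: {
      "openai-codex": {
        models: [
          {
            id: "gpt-5.4",
            // explicit runtime cap is preserved as-is by the merge
            contextTokens: 160000,
            // contextWindow omitted: the higher catalog value wins
          },
        ],
      },
    },
  },
}
```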
@@ -2219,6 +2221,8 @@ OpenClaw uses the built-in model catalog. Add custom providers via `models.provi
- `models.providers.*.baseUrl`: upstream API base URL.
- `models.providers.*.headers`: extra static headers for proxy/tenant routing.
- `models.providers.*.models`: explicit provider model catalog entries.
- `models.providers.*.models.*.contextWindow`: native model context window metadata.
- `models.providers.*.models.*.contextTokens`: optional runtime context cap. Use this when you want a smaller effective context budget than the model's native `contextWindow`.
- `models.providers.*.models.*.compat.supportsDeveloperRole`: optional compatibility hint. For `api: "openai-completions"` with a non-empty non-native `baseUrl` (host not `api.openai.com`), OpenClaw forces this to `false` at runtime. Empty/omitted `baseUrl` keeps default OpenAI behavior.
- `models.bedrockDiscovery`: Bedrock auto-discovery settings root.
- `models.bedrockDiscovery.enabled`: turn discovery polling on/off.
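
A hypothetical custom provider entry tying these fields together (host, header, and model names are illustrative):

```json5
{
  models: {
    providers: {
      "my-proxy": {
        api: "openai-completions",
        baseUrl: "https://llm-proxy.internal.example/v1",
        headers: { "X-Tenant": "team-a" }, // static proxy/tenant routing header
        models: [
          {
            id: "my-model",
            contextWindow: 128000, // native model metadata
            contextTokens: 96000,  // smaller effective runtime budget
            // forced to false at runtime anyway for a non-native baseUrl
            compat: { supportsDeveloperRole: false },
          },
        ],
      },
    },
  },
}
```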


@@ -143,6 +143,41 @@ discovers it. Treat it as entitlement-dependent and experimental: Codex Spark is
separate from GPT-5.4 `/fast`, and availability depends on the signed-in Codex /
ChatGPT account.
### Codex context window cap
OpenClaw treats the Codex model metadata and the runtime context cap as separate
values.
For `openai-codex/gpt-5.4`:
- native `contextWindow`: `1050000`
- default runtime `contextTokens` cap: `272000`
This keeps the model metadata truthful while preserving the smaller default runtime
window, which in practice gives better latency and quality.
If you want a different effective cap, set `models.providers.<provider>.models[].contextTokens`:
```json5
{
models: {
providers: {
"openai-codex": {
models: [
{
id: "gpt-5.4",
contextTokens: 160000,
},
],
},
},
},
}
```
Use `contextWindow` only when you are declaring or overriding native model
metadata. Use `contextTokens` when you want to limit the runtime context budget.
### Transport default
OpenClaw uses `pi-ai` for model streaming. For both `openai/*` and