fix: add runtime model contextTokens caps

Peter Steinberger
2026-04-04 09:35:59 +09:00
parent 45675c1698
commit 58d2b9dd46
25 changed files with 350 additions and 52 deletions

View File

@@ -22,6 +22,7 @@ Docs: https://docs.openclaw.ai
### Fixes
- Providers/OpenAI: preserve native `reasoning.effort: "none"` and strict tool schemas on direct OpenAI-family endpoints, keep OpenAI-compatible proxies on the older compat shim path, and enable OpenAI WebSocket warm-up by default for native Responses routes.
- Providers/OpenAI Codex: split native `contextWindow` from runtime `contextTokens` for `openai-codex/gpt-5.4`, keep the default effective cap at `272000`, and expose a per-model config override via `models.providers.*.models[].contextTokens`.
- Skills/uv install: block workspace `.env` from overriding `UV_PYTHON` and strip related interpreter override keys from uv skill-install subprocesses so repository-controlled env files cannot steer the selected Python runtime. (#59178) Thanks @pgondhi987.
- Telegram/reactions: preserve `reactionNotifications: "own"` across gateway restarts by persisting sent-message ownership state instead of treating cold cache as a permissive fallback. (#59207) Thanks @samzong.
- Gateway/startup: detect PID recycling in gateway lock files on Windows and macOS, and add startup progress so stale lock conflicts no longer block healthy restarts. (#59843) Thanks @TonyDerek-dot.

View File

@@ -18,6 +18,8 @@ For model selection rules, see [/concepts/models](/concepts/models).
- CLI helpers: `openclaw onboard`, `openclaw models list`, `openclaw models set <provider/model>`.
- Fallback runtime rules, cooldown probes, and session-override persistence are
documented in [/concepts/model-failover](/concepts/model-failover).
- `models.providers.*.models[].contextWindow` is native model metadata;
`models.providers.*.models[].contextTokens` is the effective runtime cap.
- Provider plugins can inject model catalogs via `registerProvider({ catalog })`;
OpenClaw merges that output into `models.providers` before writing
`models.json`.
@@ -187,6 +189,7 @@ OpenClaw ships with the piai catalog. These providers require **no**
- `params.serviceTier` is also forwarded on native Codex Responses requests (`chatgpt.com/backend-api`)
- Shares the same `/fast` toggle and `params.fastMode` config as direct `openai/*`; OpenClaw maps that to `service_tier=priority`
- `openai-codex/gpt-5.3-codex-spark` remains available when the Codex OAuth catalog exposes it; availability is entitlement-dependent
- `openai-codex/gpt-5.4` keeps native `contextWindow = 1050000` and a default runtime `contextTokens = 272000`; override the runtime cap with `models.providers.openai-codex.models[].contextTokens`
- Policy note: OpenAI Codex OAuth is explicitly supported for external tools/workflows like OpenClaw.
@@ -195,6 +198,18 @@ OpenClaw ships with the piai catalog. These providers require **no**
```json5
}
```
```json5
{
models: {
providers: {
"openai-codex": {
models: [{ id: "gpt-5.4", contextTokens: 160000 }],
},
},
},
}
```
### Other subscription-style hosted options
- [Qwen / Model Studio](/providers/qwen_modelstudio): Alibaba Cloud Standard pay-as-you-go and Coding Plan subscription endpoints

View File

@@ -2186,6 +2186,7 @@ OpenClaw uses the built-in model catalog. Add custom providers via `models.provi
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 128000,
contextTokens: 96000,
maxTokens: 32000,
},
],
@@ -2204,6 +2205,7 @@ OpenClaw uses the built-in model catalog. Add custom providers via `models.provi
- SecretRef-managed provider header values are refreshed from source markers (`secretref-env:ENV_VAR_NAME` for env refs, `secretref-managed` for file/exec refs).
- Empty or missing agent `apiKey`/`baseUrl` fall back to `models.providers` in config.
- Matching model `contextWindow`/`maxTokens` use the higher value between explicit config and implicit catalog values.
- Matching model `contextTokens` preserves an explicit runtime cap when present; use it to limit effective context without changing native model metadata.
- Use `models.mode: "replace"` when you want config to fully rewrite `models.json`.
- Marker persistence is source-authoritative: markers are written from the active source config snapshot (pre-resolution), not from resolved runtime secret values.
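The merge preference for matching models described above can be sketched as follows (hypothetical helper names; the shipped logic uses `resolvePreferredTokenLimit` inside `mergeProviderModels`): `contextWindow`/`maxTokens` take the higher of the explicit (config) and implicit (catalog) values, while an explicit `contextTokens` runtime cap is preserved as-is.

```typescript
// Illustrative sketch of the documented merge rules, not the actual implementation.
type ModelLimits = { contextWindow?: number; maxTokens?: number; contextTokens?: number };

function mergeLimits(explicit: ModelLimits, implicit: ModelLimits): ModelLimits {
  // Higher-value preference for window/output limits.
  const higher = (a?: number, b?: number): number | undefined =>
    a === undefined ? b : b === undefined ? a : Math.max(a, b);
  return {
    contextWindow: higher(explicit.contextWindow, implicit.contextWindow),
    maxTokens: higher(explicit.maxTokens, implicit.maxTokens),
    // Runtime cap: an explicit config value wins; otherwise fall back to the catalog.
    contextTokens: explicit.contextTokens ?? implicit.contextTokens,
  };
}
```

For example, merging an explicit `{ contextWindow: 128_000, contextTokens: 96_000 }` with an implicit `{ contextWindow: 200_000, maxTokens: 32_000 }` yields the higher `contextWindow` of `200_000` while keeping the explicit `96_000` runtime cap.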
@@ -2219,6 +2221,8 @@ OpenClaw uses the built-in model catalog. Add custom providers via `models.provi
- `models.providers.*.baseUrl`: upstream API base URL.
- `models.providers.*.headers`: extra static headers for proxy/tenant routing.
- `models.providers.*.models`: explicit provider model catalog entries.
- `models.providers.*.models.*.contextWindow`: native model context window metadata.
- `models.providers.*.models.*.contextTokens`: optional runtime context cap. Use this when you want a smaller effective context budget than the model's native `contextWindow`.
- `models.providers.*.models.*.compat.supportsDeveloperRole`: optional compatibility hint. For `api: "openai-completions"` with a non-empty non-native `baseUrl` (host not `api.openai.com`), OpenClaw forces this to `false` at runtime. Empty/omitted `baseUrl` keeps default OpenAI behavior.
- `models.bedrockDiscovery`: Bedrock auto-discovery settings root.
- `models.bedrockDiscovery.enabled`: turn discovery polling on/off.

View File

@@ -143,6 +143,41 @@ discovers it. Treat it as entitlement-dependent and experimental: Codex Spark is
separate from GPT-5.4 `/fast`, and availability depends on the signed-in Codex /
ChatGPT account.
### Codex context window cap
OpenClaw treats the Codex model metadata and the runtime context cap as separate
values.
For `openai-codex/gpt-5.4`:
- native `contextWindow`: `1050000`
- default runtime `contextTokens` cap: `272000`
This keeps the model metadata truthful while preserving the smaller default runtime
window, which in practice has better latency and quality characteristics.
If you want a different effective cap, set `models.providers.<provider>.models[].contextTokens`:
```json5
{
models: {
providers: {
"openai-codex": {
models: [
{
id: "gpt-5.4",
contextTokens: 160000,
},
],
},
},
},
}
```
Use `contextWindow` only when you are declaring or overriding native model
metadata. Use `contextTokens` when you want to limit the runtime context budget.
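Conceptually, the runtime budget resolution reduces to a `contextTokens`-first fallback. A minimal sketch (the default value here is assumed for illustration; the real resolver also consults `models.providers` config):

```typescript
// Prefer the runtime contextTokens cap, then the native contextWindow,
// then a default budget.
const DEFAULT_CONTEXT_TOKENS = 200_000; // assumed default for this sketch

type RuntimeModel = { contextWindow?: number; contextTokens?: number };

function effectiveContextTokens(model?: RuntimeModel): number {
  const effective = model?.contextTokens ?? model?.contextWindow;
  return Math.max(1, Math.floor(effective ?? DEFAULT_CONTEXT_TOKENS));
}

// gpt-5.4: native window 1_050_000, default runtime cap 272_000
effectiveContextTokens({ contextWindow: 1_050_000, contextTokens: 272_000 }); // 272_000
```

With no `contextTokens` set, the native `contextWindow` flows through unchanged.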
### Transport default
OpenClaw uses `pi-ai` for model streaming. For both `openai/*` and

View File

@@ -86,4 +86,69 @@ describe("openai codex provider", () => {
"Deprecated profile. Run `openclaw models auth login --provider openai-codex` or `openclaw configure`.",
);
});
it("resolves gpt-5.4 with native contextWindow plus default contextTokens cap", () => {
const provider = buildOpenAICodexProviderPlugin();
const model = provider.resolveDynamicModel?.({
provider: "openai-codex",
modelId: "gpt-5.4",
modelRegistry: {
find: vi.fn((providerId: string, modelId: string) => {
if (providerId === "openai-codex" && modelId === "gpt-5.3-codex") {
return {
id: "gpt-5.3-codex",
name: "gpt-5.3-codex",
provider: "openai-codex",
api: "openai-codex-responses",
baseUrl: "https://chatgpt.com/backend-api",
reasoning: true,
input: ["text", "image"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 272_000,
maxTokens: 128_000,
};
}
return null;
}),
},
});
expect(model).toMatchObject({
id: "gpt-5.4",
contextWindow: 1_050_000,
contextTokens: 272_000,
maxTokens: 128_000,
});
});
it("augments catalog with gpt-5.4 native contextWindow and runtime cap", () => {
const provider = buildOpenAICodexProviderPlugin();
const entries = provider.augmentModelCatalog?.({
provider: "openai-codex",
entries: [
{
id: "gpt-5.3-codex",
name: "gpt-5.3-codex",
provider: "openai-codex",
api: "openai-codex-responses",
baseUrl: "https://chatgpt.com/backend-api",
reasoning: true,
input: ["text", "image"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 272_000,
maxTokens: 128_000,
},
],
});
expect(entries).toContainEqual(
expect.objectContaining({
id: "gpt-5.4",
contextWindow: 1_050_000,
contextTokens: 272_000,
}),
);
});
});

View File

@@ -33,7 +33,8 @@ import { wrapOpenAICodexProviderStream } from "./stream-hooks.js";
const PROVIDER_ID = "openai-codex";
const OPENAI_CODEX_BASE_URL = "https://chatgpt.com/backend-api";
const OPENAI_CODEX_GPT_54_MODEL_ID = "gpt-5.4";
const OPENAI_CODEX_GPT_54_CONTEXT_TOKENS = 400_000;
const OPENAI_CODEX_GPT_54_NATIVE_CONTEXT_TOKENS = 1_050_000;
const OPENAI_CODEX_GPT_54_DEFAULT_CONTEXT_TOKENS = 272_000;
const OPENAI_CODEX_GPT_54_MAX_TOKENS = 128_000;
const OPENAI_CODEX_GPT_54_COST = {
input: 2.5,
@@ -100,7 +101,8 @@ function resolveCodexForwardCompatModel(
if (lower === OPENAI_CODEX_GPT_54_MODEL_ID) {
templateIds = OPENAI_CODEX_GPT_54_TEMPLATE_MODEL_IDS;
patch = {
contextWindow: OPENAI_CODEX_GPT_54_CONTEXT_TOKENS,
contextWindow: OPENAI_CODEX_GPT_54_NATIVE_CONTEXT_TOKENS,
contextTokens: OPENAI_CODEX_GPT_54_DEFAULT_CONTEXT_TOKENS,
maxTokens: OPENAI_CODEX_GPT_54_MAX_TOKENS,
cost: OPENAI_CODEX_GPT_54_COST,
};
@@ -140,6 +142,7 @@ function resolveCodexForwardCompatModel(
input: ["text", "image"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: patch?.contextWindow ?? DEFAULT_CONTEXT_TOKENS,
contextTokens: patch?.contextTokens,
maxTokens: patch?.maxTokens ?? DEFAULT_CONTEXT_TOKENS,
} as ProviderRuntimeModel)
);
@@ -217,6 +220,7 @@ function buildSyntheticCatalogEntry(
reasoning: boolean;
input: readonly ("text" | "image")[];
contextWindow: number;
contextTokens?: number;
},
) {
if (!template) {
@@ -229,6 +233,7 @@ function buildSyntheticCatalogEntry(
reasoning: entry.reasoning,
input: [...entry.input],
contextWindow: entry.contextWindow,
...(entry.contextTokens === undefined ? {} : { contextTokens: entry.contextTokens }),
};
}
@@ -312,7 +317,8 @@ export function buildOpenAICodexProviderPlugin(): ProviderPlugin {
id: OPENAI_CODEX_GPT_54_MODEL_ID,
reasoning: true,
input: ["text", "image"],
contextWindow: OPENAI_CODEX_GPT_54_CONTEXT_TOKENS,
contextWindow: OPENAI_CODEX_GPT_54_NATIVE_CONTEXT_TOKENS,
contextTokens: OPENAI_CODEX_GPT_54_DEFAULT_CONTEXT_TOKENS,
}),
buildSyntheticCatalogEntry(sparkTemplate, {
id: OPENAI_CODEX_GPT_53_SPARK_MODEL_ID,

View File

@@ -524,5 +524,7 @@ export function pruneHistoryForContextShare(params: {
}
export function resolveContextWindowTokens(model?: ExtensionContext["model"]): number {
return Math.max(1, Math.floor(model?.contextWindow ?? DEFAULT_CONTEXT_TOKENS));
const effective =
(model as { contextTokens?: number } | undefined)?.contextTokens ?? model?.contextWindow;
return Math.max(1, Math.floor(effective ?? DEFAULT_CONTEXT_TOKENS));
}

View File

@@ -85,6 +85,45 @@ describe("context-window-guard", () => {
expect(guard.shouldBlock).toBe(true);
});
it("prefers models.providers.*.models[].contextTokens over contextWindow", () => {
const cfg = {
models: {
providers: {
openrouter: {
baseUrl: "http://localhost",
apiKey: "x",
models: [
{
id: "tiny",
name: "tiny",
reasoning: false,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 1_050_000,
contextTokens: 12_000,
maxTokens: 256,
},
],
},
},
},
} satisfies OpenClawConfig;
const info = resolveContextWindowInfo({
cfg,
provider: "openrouter",
modelId: "tiny",
modelContextWindow: 64_000,
modelContextTokens: 48_000,
defaultTokens: 200_000,
});
expect(info).toEqual({
source: "modelsConfig",
tokens: 12_000,
});
});
it("normalizes provider aliases when reading models config context windows", () => {
const cfg = {
models: {

View File

@@ -23,19 +23,25 @@ export function resolveContextWindowInfo(params: {
cfg: OpenClawConfig | undefined;
provider: string;
modelId: string;
modelContextTokens?: number;
modelContextWindow?: number;
defaultTokens: number;
}): ContextWindowInfo {
const fromModelsConfig = (() => {
const providers = params.cfg?.models?.providers as
| Record<string, { models?: Array<{ id?: string; contextWindow?: number }> }>
| Record<
string,
{ models?: Array<{ id?: string; contextTokens?: number; contextWindow?: number }> }
>
| undefined;
const providerEntry = findNormalizedProviderValue(providers, params.provider);
const models = Array.isArray(providerEntry?.models) ? providerEntry.models : [];
const match = models.find((m) => m?.id === params.modelId);
return normalizePositiveInt(match?.contextWindow);
return normalizePositiveInt(match?.contextTokens) ?? normalizePositiveInt(match?.contextWindow);
})();
const fromModel = normalizePositiveInt(params.modelContextWindow);
const fromModel =
normalizePositiveInt(params.modelContextTokens) ??
normalizePositiveInt(params.modelContextWindow);
const baseInfo = fromModelsConfig
? { tokens: fromModelsConfig, source: "modelsConfig" as const }
: fromModel

View File

@@ -1,6 +1,6 @@
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";
type DiscoveredModel = { id: string; contextWindow: number };
type DiscoveredModel = { id: string; contextWindow?: number; contextTokens?: number };
type ContextModule = typeof import("./context.js");
function mockContextDeps(params: {
@@ -120,6 +120,21 @@ describe("lookupContextTokens", () => {
);
});
it("prefers config contextTokens over contextWindow on first lookup", async () => {
mockContextModuleDeps(() => ({
models: {
providers: {
"openai-codex": {
models: [{ id: "gpt-5.4", contextWindow: 1_050_000, contextTokens: 272_000 }],
},
},
},
}));
const { lookupContextTokens } = await importContextModule();
expect(lookupContextTokens("gpt-5.4", { allowAsyncLoad: false })).toBe(272_000);
});
it("rehydrates config-backed cache entries after module reload when runtime config survives", async () => {
const firstLoadConfigMock = vi.fn(() => ({
models: {

View File

@@ -38,6 +38,16 @@ describe("applyDiscoveredContextWindows", () => {
expect(cache.get("github-copilot/gemini-3.1-pro-preview")).toBe(128_000);
expect(cache.get("google-gemini-cli/gemini-3.1-pro-preview")).toBe(1_048_576);
});
it("prefers discovered contextTokens over contextWindow", () => {
const cache = new Map<string, number>();
applyDiscoveredContextWindows({
cache,
models: [{ id: "gpt-5.4", contextWindow: 1_050_000, contextTokens: 272_000 }],
});
expect(cache.get("gpt-5.4")).toBe(272_000);
});
});
describe("applyConfiguredContextWindows", () => {
@@ -107,6 +117,22 @@ describe("applyConfiguredContextWindows", () => {
expect(cache.get("custom/model")).toBe(150_000);
expect(cache.has("bad/model")).toBe(false);
});
it("prefers configured contextTokens over contextWindow", () => {
const cache = new Map<string, number>();
applyConfiguredContextWindows({
cache,
modelsConfig: {
providers: {
openrouter: {
models: [{ id: "custom/model", contextWindow: 1_050_000, contextTokens: 200_000 }],
},
},
},
});
expect(cache.get("custom/model")).toBe(200_000);
});
});
describe("createSessionManagerRuntimeRegistry", () => {
@@ -192,4 +218,23 @@ describe("resolveContextTokensForModel", () => {
expect(result).toBe(200_000);
});
it("prefers per-model contextTokens config over contextWindow", () => {
const result = resolveContextTokensForModel({
cfg: {
models: {
providers: {
"openai-codex": {
models: [{ id: "gpt-5.4", contextWindow: 1_050_000, contextTokens: 160_000 }],
},
},
},
},
provider: "openai-codex",
model: "gpt-5.4",
fallbackContextTokens: 272_000,
});
expect(result).toBe(160_000);
});
});

View File

@@ -13,12 +13,12 @@ import { normalizeProviderId } from "./model-selection.js";
export { resetContextWindowCacheForTest } from "./context-runtime-state.js";
type ModelEntry = { id: string; contextWindow?: number };
type ModelEntry = { id: string; contextWindow?: number; contextTokens?: number };
type ModelRegistryLike = {
getAvailable?: () => ModelEntry[];
getAll: () => ModelEntry[];
};
type ConfigModelEntry = { id?: string; contextWindow?: number };
type ConfigModelEntry = { id?: string; contextWindow?: number; contextTokens?: number };
type ProviderConfigEntry = { models?: ConfigModelEntry[] };
type ModelsConfig = { providers?: Record<string, ProviderConfigEntry | undefined> };
type AgentModelEntry = { params?: Record<string, unknown> };
@@ -40,20 +40,20 @@ export function applyDiscoveredContextWindows(params: {
if (!model?.id) {
continue;
}
const contextWindow =
typeof model.contextWindow === "number" ? Math.trunc(model.contextWindow) : undefined;
if (!contextWindow || contextWindow <= 0) {
const contextTokens =
typeof model.contextTokens === "number"
? Math.trunc(model.contextTokens)
: typeof model.contextWindow === "number"
? Math.trunc(model.contextWindow)
: undefined;
if (!contextTokens || contextTokens <= 0) {
continue;
}
const existing = params.cache.get(model.id);
// When the same bare model id appears under multiple providers with different
// limits, keep the smaller window. This cache feeds both display paths and
// runtime paths (flush thresholds, session context-token persistence), so
// overestimating the limit could delay compaction and cause context overflow.
// Callers that know the active provider should use resolveContextTokensForModel,
// which tries the provider-qualified key first and falls back here.
if (existing === undefined || contextWindow < existing) {
params.cache.set(model.id, contextWindow);
// Cache the most conservative effective limit. Provider/runtime callers that
// know the active provider should still prefer qualified lookups first.
if (existing === undefined || contextTokens < existing) {
params.cache.set(model.id, contextTokens);
}
}
}
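The conservative collision rule above can be illustrated in isolation (simplified sketch of the behavior, not the shipped function): when the same bare model id is discovered with different effective limits, the smaller one is kept so compaction never underestimates how close a session is to its limit.

```typescript
// Simplified sketch of the conservative caching rule described above.
function cacheEffectiveLimit(
  cache: Map<string, number>,
  models: Array<{ id: string; contextWindow?: number; contextTokens?: number }>,
): void {
  for (const model of models) {
    // Effective limit: runtime cap first, then native window.
    const limit = model.contextTokens ?? model.contextWindow;
    if (!limit || limit <= 0) continue;
    const existing = cache.get(model.id);
    // Keep the smaller limit when the same id appears under multiple providers.
    if (existing === undefined || limit < existing) cache.set(model.id, limit);
  }
}

const cache = new Map<string, number>();
cacheEffectiveLimit(cache, [
  { id: "gpt-5.4", contextWindow: 1_050_000, contextTokens: 272_000 },
  { id: "gpt-5.4", contextWindow: 200_000 }, // same id, smaller limit wins
]);
// cache.get("gpt-5.4") === 200_000
```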
@@ -72,12 +72,16 @@ export function applyConfiguredContextWindows(params: {
}
for (const model of provider.models) {
const modelId = typeof model?.id === "string" ? model.id : undefined;
const contextWindow =
typeof model?.contextWindow === "number" ? model.contextWindow : undefined;
if (!modelId || !contextWindow || contextWindow <= 0) {
const contextTokens =
typeof model?.contextTokens === "number"
? model.contextTokens
: typeof model?.contextWindow === "number"
? model.contextWindow
: undefined;
if (!modelId || !contextTokens || contextTokens <= 0) {
continue;
}
params.cache.set(modelId, contextWindow);
params.cache.set(modelId, contextTokens);
}
}
}
@@ -307,12 +311,12 @@ function resolveProviderModelRef(params: {
return { provider, model };
}
// Look up an explicit contextWindow override for a specific provider+model
// Look up an explicit runtime context cap for a specific provider+model
// directly from config, without going through the shared discovery cache.
// This avoids the cache keyspace collision where "provider/model" synthetic
// keys overlap with raw slash-containing model IDs (e.g. OpenRouter's
// "google/gemini-2.5-pro" stored as a raw catalog entry).
function resolveConfiguredProviderContextWindow(
function resolveConfiguredProviderContextTokens(
cfg: OpenClawConfig | undefined,
provider: string,
model: string,
@@ -324,8 +328,8 @@ function resolveConfiguredProviderContextWindow(
// Mirror the lookup order in pi-embedded-runner/model.ts: exact key first,
// then normalized fallback. This prevents alias collisions from picking the
// wrong contextWindow based on Object.entries iteration order.
function findContextWindow(matchProviderId: (id: string) => boolean): number | undefined {
// wrong configured cap based on Object.entries iteration order.
function findContextTokens(matchProviderId: (id: string) => boolean): number | undefined {
for (const [providerId, providerConfig] of Object.entries(providers!)) {
if (!matchProviderId(providerId)) {
continue;
@@ -334,13 +338,19 @@ function resolveConfiguredProviderContextWindow(
continue;
}
for (const m of providerConfig.models) {
const contextTokens =
typeof m?.contextTokens === "number"
? m.contextTokens
: typeof m?.contextWindow === "number"
? m.contextWindow
: undefined;
if (
typeof m?.id === "string" &&
m.id === model &&
typeof m?.contextWindow === "number" &&
m.contextWindow > 0
typeof contextTokens === "number" &&
contextTokens > 0
) {
return m.contextWindow;
return contextTokens;
}
}
}
@@ -348,14 +358,14 @@ function resolveConfiguredProviderContextWindow(
}
// 1. Exact match (case-insensitive, no alias expansion).
const exactResult = findContextWindow((id) => id.trim().toLowerCase() === provider.toLowerCase());
const exactResult = findContextTokens((id) => id.trim().toLowerCase() === provider.toLowerCase());
if (exactResult !== undefined) {
return exactResult;
}
// 2. Normalized fallback: covers alias keys such as "z.ai" → "zai".
const normalizedProvider = normalizeProviderId(provider);
return findContextWindow((id) => normalizeProviderId(id) === normalizedProvider);
return findContextTokens((id) => normalizeProviderId(id) === normalizedProvider);
}
function isAnthropic1MModel(provider: string, model: string): boolean {
@@ -399,7 +409,7 @@ export function resolveContextTokensForModel(params: {
// window and misreport context limits for the OpenRouter session.
// See status.ts log-usage fallback which calls with only { model } set.
if (explicitProvider) {
const configuredWindow = resolveConfiguredProviderContextWindow(
const configuredWindow = resolveConfiguredProviderContextTokens(
params.cfg,
explicitProvider,
ref.model,

View File

@@ -85,6 +85,11 @@ export function mergeProviderModels(
explicitValue: explicitModel.contextWindow,
implicitValue: implicitModel.contextWindow,
});
const contextTokens = resolvePreferredTokenLimit({
explicitPresent: "contextTokens" in explicitModel,
explicitValue: explicitModel.contextTokens,
implicitValue: implicitModel.contextTokens,
});
const maxTokens = resolvePreferredTokenLimit({
explicitPresent: "maxTokens" in explicitModel,
explicitValue: explicitModel.maxTokens,
@@ -96,6 +101,7 @@ export function mergeProviderModels(
input: implicitModel.input,
reasoning: "reasoning" in explicitModel ? explicitModel.reasoning : implicitModel.reasoning,
...(contextWindow === undefined ? {} : { contextWindow }),
...(contextTokens === undefined ? {} : { contextTokens }),
...(maxTokens === undefined ? {} : { maxTokens }),
};
});

View File

@@ -441,6 +441,7 @@ export async function compactEmbeddedPiSessionDirect(
cfg: params.config,
provider,
modelId,
modelContextTokens: runtimeModel.contextTokens,
modelContextWindow: runtimeModel.contextWindow,
defaultTokens: DEFAULT_CONTEXT_TOKENS,
});
@@ -1031,6 +1032,7 @@ export async function compactEmbeddedPiSession(
cfg: params.config,
provider: ceProvider,
modelId: ceModelId,
modelContextTokens: ceModel?.contextTokens,
modelContextWindow: ceModel?.contextWindow,
defaultTokens: DEFAULT_CONTEXT_TOKENS,
});

View File

@@ -1,6 +1,6 @@
import type { Api, Model } from "@mariozechner/pi-ai";
import type { ExtensionFactory, SessionManager } from "@mariozechner/pi-coding-agent";
import type { OpenClawConfig } from "../../config/config.js";
import type { ProviderRuntimeModel } from "../../plugins/types.js";
import { resolveContextWindowInfo } from "../context-window-guard.js";
import { DEFAULT_CONTEXT_TOKENS } from "../defaults.js";
import { setCompactionSafeguardRuntime } from "../pi-hooks/compaction-safeguard-runtime.js";
@@ -17,12 +17,13 @@ function resolveContextWindowTokens(params: {
cfg: OpenClawConfig | undefined;
provider: string;
modelId: string;
model: Model<Api> | undefined;
model: ProviderRuntimeModel | undefined;
}): number {
return resolveContextWindowInfo({
cfg: params.cfg,
provider: params.provider,
modelId: params.modelId,
modelContextTokens: params.model?.contextTokens,
modelContextWindow: params.model?.contextWindow,
defaultTokens: DEFAULT_CONTEXT_TOKENS,
}).tokens;
@@ -33,7 +34,7 @@ function buildContextPruningFactory(params: {
sessionManager: SessionManager;
provider: string;
modelId: string;
model: Model<Api> | undefined;
model: ProviderRuntimeModel | undefined;
}): ExtensionFactory | undefined {
const raw = params.cfg?.agents?.defaults?.contextPruning;
if (raw?.mode !== "cache-ttl") {
@@ -73,7 +74,7 @@ export function buildEmbeddedExtensionFactories(params: {
sessionManager: SessionManager;
provider: string;
modelId: string;
model: Model<Api> | undefined;
model: ProviderRuntimeModel | undefined;
}): ExtensionFactory[] {
const factories: ExtensionFactory[] = [];
if (resolveCompactionMode(params.cfg) === "safeguard") {
@@ -83,6 +84,7 @@ export function buildEmbeddedExtensionFactories(params: {
cfg: params.cfg,
provider: params.provider,
modelId: params.modelId,
modelContextTokens: params.model?.contextTokens,
modelContextWindow: params.model?.contextWindow,
defaultTokens: DEFAULT_CONTEXT_TOKENS,
});

View File

@@ -205,7 +205,8 @@ function buildDynamicModel(
api: "openai-codex-responses",
baseUrl: OPENAI_CODEX_BASE_URL,
cost: { input: 2.5, output: 15, cacheRead: 0.25, cacheWrite: 0 },
contextWindow: 272_000,
contextWindow: 1_050_000,
contextTokens: 272_000,
maxTokens: 128_000,
},
fallback,

View File

@@ -70,7 +70,8 @@ export function buildOpenAICodexForwardCompatExpectation(
: isGpt54
? { input: 2.5, output: 15, cacheRead: 0.25, cacheWrite: 0 }
: OPENAI_CODEX_TEMPLATE_MODEL.cost,
contextWindow: isGpt54 ? 272_000 : isSpark ? 128_000 : 272000,
contextWindow: isGpt54 ? 1_050_000 : isSpark ? 128_000 : 272_000,
...(isGpt54 ? { contextTokens: 272_000 } : {}),
maxTokens: 128000,
};
}

View File

@@ -12,6 +12,7 @@ import {
runProviderDynamicModel,
normalizeProviderResolvedModelWithPlugin,
} from "../../plugins/provider-runtime.js";
import type { ProviderRuntimeModel } from "../../plugins/types.js";
import { resolveOpenClawAgentDir } from "../agent-paths.js";
import { DEFAULT_CONTEXT_TOKENS } from "../defaults.js";
import { buildModelAliasLines } from "../model-alias-lines.js";
@@ -294,12 +295,12 @@ function resolveConfiguredProviderConfig(
function applyConfiguredProviderOverrides(params: {
provider: string;
discoveredModel: Model<Api>;
discoveredModel: ProviderRuntimeModel;
providerConfig?: InlineProviderConfig;
modelId: string;
cfg?: OpenClawConfig;
runtimeHooks?: ProviderRuntimeHooks;
}): Model<Api> {
}): ProviderRuntimeModel {
const { discoveredModel, providerConfig, modelId } = params;
if (!providerConfig) {
return {
@@ -368,6 +369,7 @@ function applyConfiguredProviderOverrides(params: {
input: normalizedInput,
cost: configuredModel?.cost ?? discoveredModel.cost,
contextWindow: configuredModel?.contextWindow ?? discoveredModel.contextWindow,
contextTokens: configuredModel?.contextTokens ?? discoveredModel.contextTokens,
maxTokens: configuredModel?.maxTokens ?? discoveredModel.maxTokens,
headers: requestConfig.headers,
compat: configuredModel?.compat ?? discoveredModel.compat,
@@ -595,6 +597,7 @@ function resolveConfiguredFallbackModel(params: {
configuredModel?.contextWindow ??
providerConfig?.models?.[0]?.contextWindow ??
DEFAULT_CONTEXT_TOKENS,
contextTokens: configuredModel?.contextTokens ?? providerConfig?.models?.[0]?.contextTokens,
maxTokens:
configuredModel?.maxTokens ??
providerConfig?.models?.[0]?.maxTokens ??

View File

@@ -105,6 +105,7 @@ export function resolveEffectiveRuntimeModel(params: {
cfg: params.cfg,
provider: params.provider,
modelId: params.modelId,
modelContextTokens: params.runtimeModel.contextTokens,
modelContextWindow: params.runtimeModel.contextWindow,
defaultTokens: DEFAULT_CONTEXT_TOKENS,
});

View File

@@ -20,6 +20,25 @@ describe("statusSummaryRuntime.resolveContextTokensForModel", () => {
expect(contextTokens).toBe(123_456);
});
it("prefers per-model contextTokens over contextWindow", () => {
const contextTokens = statusSummaryRuntime.resolveContextTokensForModel({
cfg: {
models: {
providers: {
"openai-codex": {
models: [{ id: "gpt-5.4", contextWindow: 1_050_000, contextTokens: 272_000 }],
},
},
},
} as never,
provider: "openai-codex",
model: "gpt-5.4",
fallbackContextTokens: 999,
});
expect(contextTokens).toBe(272_000);
});
});
describe("statusSummaryRuntime.resolveSessionModelRef", () => {

View File

@@ -99,7 +99,7 @@ function resolveConfiguredStatusModelRef(params: {
return { provider: params.defaultProvider, model: params.defaultModel };
}
function resolveConfiguredProviderContextWindow(
function resolveConfiguredProviderContextTokens(
cfg: OpenClawConfig | undefined,
provider: string,
model: string,
@@ -114,13 +114,19 @@ function resolveConfiguredProviderContextWindow(
continue;
}
for (const entry of providerConfig.models) {
const contextTokens =
typeof entry?.contextTokens === "number"
? entry.contextTokens
: typeof entry?.contextWindow === "number"
? entry.contextWindow
: undefined;
if (
typeof entry?.id === "string" &&
entry.id === model &&
typeof entry.contextWindow === "number" &&
entry.contextWindow > 0
typeof contextTokens === "number" &&
contextTokens > 0
) {
return entry.contextWindow;
return contextTokens;
}
}
}
@@ -180,13 +186,13 @@ function resolveContextTokensForModel(params: {
return params.contextTokensOverride;
}
if (params.provider && params.model) {
const configuredWindow = resolveConfiguredProviderContextWindow(
const configuredContextTokens = resolveConfiguredProviderContextTokens(
params.cfg,
params.provider,
params.model,
);
if (configuredWindow !== undefined) {
return configuredWindow;
if (configuredContextTokens !== undefined) {
return configuredContextTokens;
}
}
return params.fallbackContextTokens ?? DEFAULT_CONTEXT_TOKENS;

View File

@@ -2318,6 +2318,11 @@ export const GENERATED_BASE_CONFIG_SCHEMA = {
type: "number",
exclusiveMinimum: 0,
},
contextTokens: {
type: "integer",
exclusiveMinimum: 0,
maximum: 9007199254740991,
},
maxTokens: {
type: "number",
exclusiveMinimum: 0,
@@ -22333,7 +22338,7 @@ export const GENERATED_BASE_CONFIG_SCHEMA = {
},
"models.mode": {
label: "Model Catalog Mode",
help: 'Controls provider catalog behavior: "merge" keeps built-ins and overlays your custom providers, while "replace" uses only your configured providers. In "merge", matching provider IDs preserve non-empty agent models.json baseUrl values, while apiKey values are preserved only when the provider is not SecretRef-managed in current config/auth-profile context; SecretRef-managed providers refresh apiKey from current source markers, and matching model contextWindow/maxTokens use the higher value between explicit and implicit entries.',
help: 'Controls provider catalog behavior: "merge" keeps built-ins and overlays your custom providers, while "replace" uses only your configured providers. In "merge", matching provider IDs preserve non-empty agent models.json baseUrl values, while apiKey values are preserved only when the provider is not SecretRef-managed in current config/auth-profile context; SecretRef-managed providers refresh apiKey from current source markers, matching model contextWindow/maxTokens use the higher value between explicit and implicit entries, and explicit contextTokens runtime caps are preserved.',
tags: ["models"],
},
"models.providers": {

View File

@@ -60,6 +60,12 @@ export type ModelDefinitionConfig = {
cacheWrite: number;
};
contextWindow: number;
/**
* Optional effective runtime cap used for compaction/session budgeting.
* Keeps provider/native contextWindow metadata intact while letting configs
* prefer a smaller practical window.
*/
contextTokens?: number;
maxTokens: number;
headers?: Record<string, string>;
compat?: ModelCompatConfig;

View File

@@ -307,6 +307,7 @@ export const ModelDefinitionSchema = z
.strict()
.optional(),
contextWindow: z.number().positive().optional(),
contextTokens: z.number().int().positive().optional(),
maxTokens: z.number().positive().optional(),
headers: z.record(z.string(), z.string()).optional(),
compat: ModelCompatSchema,

View File

@@ -313,7 +313,9 @@ export type ProviderPluginCatalog = {
* Runtime hooks below operate on the final `pi-ai` model object after
* discovery/override merging, just before inference runs.
*/
export type ProviderRuntimeModel = Model<Api>;
export type ProviderRuntimeModel = Model<Api> & {
contextTokens?: number;
};
export type ProviderRuntimeProviderConfig = {
baseUrl?: string;