mirror of https://github.com/moltbot/moltbot.git (synced 2026-04-20 21:23:23 +00:00)
fix: add runtime model contextTokens caps
This commit is contained in:

@@ -22,6 +22,7 @@ Docs: https://docs.openclaw.ai

### Fixes

- Providers/OpenAI: preserve native `reasoning.effort: "none"` and strict tool schemas on direct OpenAI-family endpoints, keep OpenAI-compatible proxies on the older compat shim path, and enable OpenAI WebSocket warm-up by default for native Responses routes.
- Providers/OpenAI Codex: split native `contextWindow` from runtime `contextTokens` for `openai-codex/gpt-5.4`, keep the default effective cap at `272000`, and expose a per-model config override via `models.providers.*.models[].contextTokens`.
- Skills/uv install: block workspace `.env` from overriding `UV_PYTHON` and strip related interpreter override keys from uv skill-install subprocesses so repository-controlled env files cannot steer the selected Python runtime. (#59178) Thanks @pgondhi987.
- Telegram/reactions: preserve `reactionNotifications: "own"` across gateway restarts by persisting sent-message ownership state instead of treating cold cache as a permissive fallback. (#59207) Thanks @samzong.
- Gateway/startup: detect PID recycling in gateway lock files on Windows and macOS, and add startup progress so stale lock conflicts no longer block healthy restarts. (#59843) Thanks @TonyDerek-dot.
@@ -18,6 +18,8 @@ For model selection rules, see [/concepts/models](/concepts/models).

- CLI helpers: `openclaw onboard`, `openclaw models list`, `openclaw models set <provider/model>`.
- Fallback runtime rules, cooldown probes, and session-override persistence are
  documented in [/concepts/model-failover](/concepts/model-failover).
- `models.providers.*.models[].contextWindow` is native model metadata;
  `models.providers.*.models[].contextTokens` is the effective runtime cap.
- Provider plugins can inject model catalogs via `registerProvider({ catalog })`;
  OpenClaw merges that output into `models.providers` before writing
  `models.json`.
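The split between the two fields can be sketched as follows (an illustrative TypeScript snippet, not OpenClaw's actual implementation; the `DEFAULT_CONTEXT_TOKENS` value here is an assumed placeholder):

```typescript
// Native metadata vs. effective runtime cap (illustrative only).
interface CatalogModel {
  id: string;
  contextWindow?: number; // native model metadata
  contextTokens?: number; // optional effective runtime cap
}

const DEFAULT_CONTEXT_TOKENS = 200_000; // assumed fallback for this sketch

function effectiveContextTokens(model?: CatalogModel): number {
  // An explicit runtime cap wins; otherwise fall back to native metadata,
  // then to the global default.
  const effective = model?.contextTokens ?? model?.contextWindow ?? DEFAULT_CONTEXT_TOKENS;
  return Math.max(1, Math.floor(effective));
}
```

With `contextWindow: 1050000` and `contextTokens: 272000`, the effective budget under this rule would be `272000`.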

@@ -187,6 +189,7 @@ OpenClaw ships with the pi‑ai catalog. These providers require **no**

- `params.serviceTier` is also forwarded on native Codex Responses requests (`chatgpt.com/backend-api`)
- Shares the same `/fast` toggle and `params.fastMode` config as direct `openai/*`; OpenClaw maps that to `service_tier=priority`
- `openai-codex/gpt-5.3-codex-spark` remains available when the Codex OAuth catalog exposes it; availability is entitlement-dependent
- `openai-codex/gpt-5.4` keeps native `contextWindow = 1050000` and a default runtime `contextTokens = 272000`; override the runtime cap with `models.providers.openai-codex.models[].contextTokens`
- Policy note: OpenAI Codex OAuth is explicitly supported for external tools/workflows like OpenClaw.

```json5
@@ -195,6 +198,18 @@ OpenClaw ships with the pi‑ai catalog. These providers require **no**
}
```

```json5
{
  models: {
    providers: {
      "openai-codex": {
        models: [{ id: "gpt-5.4", contextTokens: 160000 }],
      },
    },
  },
}
```

### Other subscription-style hosted options

- [Qwen / Model Studio](/providers/qwen_modelstudio): Alibaba Cloud Standard pay-as-you-go and Coding Plan subscription endpoints

@@ -2186,6 +2186,7 @@ OpenClaw uses the built-in model catalog. Add custom providers via `models.provi
        input: ["text"],
        cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
        contextWindow: 128000,
+       contextTokens: 96000,
        maxTokens: 32000,
      },
    ],

@@ -2204,6 +2205,7 @@ OpenClaw uses the built-in model catalog. Add custom providers via `models.provi

- SecretRef-managed provider header values are refreshed from source markers (`secretref-env:ENV_VAR_NAME` for env refs, `secretref-managed` for file/exec refs).
- Empty or missing agent `apiKey`/`baseUrl` fall back to `models.providers` in config.
- Matching model `contextWindow`/`maxTokens` use the higher value between explicit config and implicit catalog values.
+- Matching model `contextTokens` preserves an explicit runtime cap when present; use it to limit effective context without changing native model metadata.
- Use `models.mode: "replace"` when you want config to fully rewrite `models.json`.
- Marker persistence is source-authoritative: markers are written from the active source config snapshot (pre-resolution), not from resolved runtime secret values.
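The merge rules in the bullets above can be sketched like this (a hypothetical helper written to match the described semantics, not the shipped code):

```typescript
interface TokenLimits {
  contextWindow?: number;
  contextTokens?: number;
  maxTokens?: number;
}

// contextWindow/maxTokens: prefer the higher of explicit config vs. implicit catalog.
// contextTokens: an explicit runtime cap is preserved as-is, even when smaller.
function mergeTokenLimits(explicit: TokenLimits, implicit: TokenLimits): TokenLimits {
  const higher = (a?: number, b?: number) =>
    a === undefined ? b : b === undefined ? a : Math.max(a, b);
  return {
    contextWindow: higher(explicit.contextWindow, implicit.contextWindow),
    maxTokens: higher(explicit.maxTokens, implicit.maxTokens),
    contextTokens: explicit.contextTokens ?? implicit.contextTokens,
  };
}
```

Note how an explicit `contextTokens: 48000` survives the merge even though the implicit catalog carries a larger value.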

@@ -2219,6 +2221,8 @@ OpenClaw uses the built-in model catalog. Add custom providers via `models.provi

- `models.providers.*.baseUrl`: upstream API base URL.
- `models.providers.*.headers`: extra static headers for proxy/tenant routing.
- `models.providers.*.models`: explicit provider model catalog entries.
+- `models.providers.*.models.*.contextWindow`: native model context window metadata.
+- `models.providers.*.models.*.contextTokens`: optional runtime context cap. Use this when you want a smaller effective context budget than the model's native `contextWindow`.
- `models.providers.*.models.*.compat.supportsDeveloperRole`: optional compatibility hint. For `api: "openai-completions"` with a non-empty non-native `baseUrl` (host not `api.openai.com`), OpenClaw forces this to `false` at runtime. Empty/omitted `baseUrl` keeps default OpenAI behavior.
- `models.bedrockDiscovery`: Bedrock auto-discovery settings root.
- `models.bedrockDiscovery.enabled`: turn discovery polling on/off.
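The `compat.supportsDeveloperRole` rule above can be sketched as a host check (hypothetical helper name; the fallback for unparseable URLs is an assumption, not documented behavior):

```typescript
// Returns true when OpenClaw would force supportsDeveloperRole to false:
// api: "openai-completions" with a non-empty baseUrl whose host is not api.openai.com.
function forcesDeveloperRoleOff(baseUrl?: string): boolean {
  if (!baseUrl) {
    return false; // empty/omitted baseUrl keeps default OpenAI behavior
  }
  try {
    return new URL(baseUrl).host !== "api.openai.com";
  } catch {
    return false; // assumption: unparseable values keep default behavior
  }
}
```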

@@ -143,6 +143,41 @@ discovers it. Treat it as entitlement-dependent and experimental: Codex Spark is
separate from GPT-5.4 `/fast`, and availability depends on the signed-in Codex /
ChatGPT account.

### Codex context window cap

OpenClaw treats the Codex model metadata and the runtime context cap as separate
values.

For `openai-codex/gpt-5.4`:

- native `contextWindow`: `1050000`
- default runtime `contextTokens` cap: `272000`

That keeps model metadata truthful while preserving the smaller default runtime
window that has better latency and quality characteristics in practice.

If you want a different effective cap, set `models.providers.<provider>.models[].contextTokens`:

```json5
{
  models: {
    providers: {
      "openai-codex": {
        models: [
          {
            id: "gpt-5.4",
            contextTokens: 160000,
          },
        ],
      },
    },
  },
}
```

Use `contextWindow` only when you are declaring or overriding native model
metadata. Use `contextTokens` when you want to limit the runtime context budget.

### Transport default

OpenClaw uses `pi-ai` for model streaming. For both `openai/*` and
@@ -86,4 +86,69 @@ describe("openai codex provider", () => {
      "Deprecated profile. Run `openclaw models auth login --provider openai-codex` or `openclaw configure`.",
    );
  });
  it("resolves gpt-5.4 with native contextWindow plus default contextTokens cap", () => {
    const provider = buildOpenAICodexProviderPlugin();

    const model = provider.resolveDynamicModel?.({
      provider: "openai-codex",
      modelId: "gpt-5.4",
      modelRegistry: {
        find: vi.fn((providerId: string, modelId: string) => {
          if (providerId === "openai-codex" && modelId === "gpt-5.3-codex") {
            return {
              id: "gpt-5.3-codex",
              name: "gpt-5.3-codex",
              provider: "openai-codex",
              api: "openai-codex-responses",
              baseUrl: "https://chatgpt.com/backend-api",
              reasoning: true,
              input: ["text", "image"],
              cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
              contextWindow: 272_000,
              maxTokens: 128_000,
            };
          }
          return null;
        }),
      },
    });

    expect(model).toMatchObject({
      id: "gpt-5.4",
      contextWindow: 1_050_000,
      contextTokens: 272_000,
      maxTokens: 128_000,
    });
  });

  it("augments catalog with gpt-5.4 native contextWindow and runtime cap", () => {
    const provider = buildOpenAICodexProviderPlugin();

    const entries = provider.augmentModelCatalog?.({
      provider: "openai-codex",
      entries: [
        {
          id: "gpt-5.3-codex",
          name: "gpt-5.3-codex",
          provider: "openai-codex",
          api: "openai-codex-responses",
          baseUrl: "https://chatgpt.com/backend-api",
          reasoning: true,
          input: ["text", "image"],
          cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
          contextWindow: 272_000,
          maxTokens: 128_000,
        },
      ],
    });

    expect(entries).toContainEqual(
      expect.objectContaining({
        id: "gpt-5.4",
        contextWindow: 1_050_000,
        contextTokens: 272_000,
      }),
    );
  });
});

@@ -33,7 +33,8 @@ import { wrapOpenAICodexProviderStream } from "./stream-hooks.js";
const PROVIDER_ID = "openai-codex";
const OPENAI_CODEX_BASE_URL = "https://chatgpt.com/backend-api";
const OPENAI_CODEX_GPT_54_MODEL_ID = "gpt-5.4";
-const OPENAI_CODEX_GPT_54_CONTEXT_TOKENS = 400_000;
+const OPENAI_CODEX_GPT_54_NATIVE_CONTEXT_TOKENS = 1_050_000;
+const OPENAI_CODEX_GPT_54_DEFAULT_CONTEXT_TOKENS = 272_000;
const OPENAI_CODEX_GPT_54_MAX_TOKENS = 128_000;
const OPENAI_CODEX_GPT_54_COST = {
  input: 2.5,

@@ -100,7 +101,8 @@ function resolveCodexForwardCompatModel(
  if (lower === OPENAI_CODEX_GPT_54_MODEL_ID) {
    templateIds = OPENAI_CODEX_GPT_54_TEMPLATE_MODEL_IDS;
    patch = {
-     contextWindow: OPENAI_CODEX_GPT_54_CONTEXT_TOKENS,
+     contextWindow: OPENAI_CODEX_GPT_54_NATIVE_CONTEXT_TOKENS,
+     contextTokens: OPENAI_CODEX_GPT_54_DEFAULT_CONTEXT_TOKENS,
      maxTokens: OPENAI_CODEX_GPT_54_MAX_TOKENS,
      cost: OPENAI_CODEX_GPT_54_COST,
    };

@@ -140,6 +142,7 @@ function resolveCodexForwardCompatModel(
      input: ["text", "image"],
      cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
      contextWindow: patch?.contextWindow ?? DEFAULT_CONTEXT_TOKENS,
+     contextTokens: patch?.contextTokens,
      maxTokens: patch?.maxTokens ?? DEFAULT_CONTEXT_TOKENS,
    } as ProviderRuntimeModel)
  );

@@ -217,6 +220,7 @@ function buildSyntheticCatalogEntry(
    reasoning: boolean;
    input: readonly ("text" | "image")[];
    contextWindow: number;
+   contextTokens?: number;
  },
) {
  if (!template) {

@@ -229,6 +233,7 @@ function buildSyntheticCatalogEntry(
    reasoning: entry.reasoning,
    input: [...entry.input],
    contextWindow: entry.contextWindow,
+   ...(entry.contextTokens === undefined ? {} : { contextTokens: entry.contextTokens }),
  };
}

@@ -312,7 +317,8 @@ export function buildOpenAICodexProviderPlugin(): ProviderPlugin {
      id: OPENAI_CODEX_GPT_54_MODEL_ID,
      reasoning: true,
      input: ["text", "image"],
-     contextWindow: OPENAI_CODEX_GPT_54_CONTEXT_TOKENS,
+     contextWindow: OPENAI_CODEX_GPT_54_NATIVE_CONTEXT_TOKENS,
+     contextTokens: OPENAI_CODEX_GPT_54_DEFAULT_CONTEXT_TOKENS,
    }),
    buildSyntheticCatalogEntry(sparkTemplate, {
      id: OPENAI_CODEX_GPT_53_SPARK_MODEL_ID,

@@ -524,5 +524,7 @@ export function pruneHistoryForContextShare(params: {
}

export function resolveContextWindowTokens(model?: ExtensionContext["model"]): number {
- return Math.max(1, Math.floor(model?.contextWindow ?? DEFAULT_CONTEXT_TOKENS));
+ const effective =
+   (model as { contextTokens?: number } | undefined)?.contextTokens ?? model?.contextWindow;
+ return Math.max(1, Math.floor(effective ?? DEFAULT_CONTEXT_TOKENS));
}

@@ -85,6 +85,45 @@ describe("context-window-guard", () => {
    expect(guard.shouldBlock).toBe(true);
  });
  it("prefers models.providers.*.models[].contextTokens over contextWindow", () => {
    const cfg = {
      models: {
        providers: {
          openrouter: {
            baseUrl: "http://localhost",
            apiKey: "x",
            models: [
              {
                id: "tiny",
                name: "tiny",
                reasoning: false,
                input: ["text"],
                cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
                contextWindow: 1_050_000,
                contextTokens: 12_000,
                maxTokens: 256,
              },
            ],
          },
        },
      },
    } satisfies OpenClawConfig;

    const info = resolveContextWindowInfo({
      cfg,
      provider: "openrouter",
      modelId: "tiny",
      modelContextWindow: 64_000,
      modelContextTokens: 48_000,
      defaultTokens: 200_000,
    });

    expect(info).toEqual({
      source: "modelsConfig",
      tokens: 12_000,
    });
  });
  it("normalizes provider aliases when reading models config context windows", () => {
    const cfg = {
      models: {

@@ -23,19 +23,25 @@ export function resolveContextWindowInfo(params: {
  cfg: OpenClawConfig | undefined;
  provider: string;
  modelId: string;
+ modelContextTokens?: number;
  modelContextWindow?: number;
  defaultTokens: number;
}): ContextWindowInfo {
  const fromModelsConfig = (() => {
    const providers = params.cfg?.models?.providers as
-     | Record<string, { models?: Array<{ id?: string; contextWindow?: number }> }>
+     | Record<
+         string,
+         { models?: Array<{ id?: string; contextTokens?: number; contextWindow?: number }> }
+       >
      | undefined;
    const providerEntry = findNormalizedProviderValue(providers, params.provider);
    const models = Array.isArray(providerEntry?.models) ? providerEntry.models : [];
    const match = models.find((m) => m?.id === params.modelId);
-   return normalizePositiveInt(match?.contextWindow);
+   return normalizePositiveInt(match?.contextTokens) ?? normalizePositiveInt(match?.contextWindow);
  })();
- const fromModel = normalizePositiveInt(params.modelContextWindow);
+ const fromModel =
+   normalizePositiveInt(params.modelContextTokens) ??
+   normalizePositiveInt(params.modelContextWindow);
  const baseInfo = fromModelsConfig
    ? { tokens: fromModelsConfig, source: "modelsConfig" as const }
    : fromModel
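The precedence this hunk implements can be condensed into a standalone sketch (simplified shapes and an assumed `normalizePositiveInt`-style helper; not the exact shipped code):

```typescript
type ContextWindowInfo = { tokens: number; source: "modelsConfig" | "model" | "default" };

// Precedence: per-model config (contextTokens first, then contextWindow),
// then the runtime model's own values, then the default.
function resolveInfoSketch(params: {
  configTokens?: number;
  configWindow?: number;
  modelTokens?: number;
  modelWindow?: number;
  defaultTokens: number;
}): ContextWindowInfo {
  const pos = (n?: number) =>
    typeof n === "number" && Number.isFinite(n) && n > 0 ? Math.floor(n) : undefined;
  const fromConfig = pos(params.configTokens) ?? pos(params.configWindow);
  const fromModel = pos(params.modelTokens) ?? pos(params.modelWindow);
  if (fromConfig !== undefined) return { tokens: fromConfig, source: "modelsConfig" };
  if (fromModel !== undefined) return { tokens: fromModel, source: "model" };
  return { tokens: params.defaultTokens, source: "default" };
}
```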

@@ -1,6 +1,6 @@
import { afterEach, beforeEach, describe, expect, it, vi } from "vitest";

-type DiscoveredModel = { id: string; contextWindow: number };
+type DiscoveredModel = { id: string; contextWindow?: number; contextTokens?: number };
type ContextModule = typeof import("./context.js");

function mockContextDeps(params: {

@@ -120,6 +120,21 @@ describe("lookupContextTokens", () => {
    );
  });
  it("prefers config contextTokens over contextWindow on first lookup", async () => {
    mockContextModuleDeps(() => ({
      models: {
        providers: {
          "openai-codex": {
            models: [{ id: "gpt-5.4", contextWindow: 1_050_000, contextTokens: 272_000 }],
          },
        },
      },
    }));

    const { lookupContextTokens } = await importContextModule();
    expect(lookupContextTokens("gpt-5.4", { allowAsyncLoad: false })).toBe(272_000);
  });

  it("rehydrates config-backed cache entries after module reload when runtime config survives", async () => {
    const firstLoadConfigMock = vi.fn(() => ({
      models: {

@@ -38,6 +38,16 @@ describe("applyDiscoveredContextWindows", () => {
    expect(cache.get("github-copilot/gemini-3.1-pro-preview")).toBe(128_000);
    expect(cache.get("google-gemini-cli/gemini-3.1-pro-preview")).toBe(1_048_576);
  });

  it("prefers discovered contextTokens over contextWindow", () => {
    const cache = new Map<string, number>();
    applyDiscoveredContextWindows({
      cache,
      models: [{ id: "gpt-5.4", contextWindow: 1_050_000, contextTokens: 272_000 }],
    });

    expect(cache.get("gpt-5.4")).toBe(272_000);
  });
});

describe("applyConfiguredContextWindows", () => {

@@ -107,6 +117,22 @@ describe("applyConfiguredContextWindows", () => {
    expect(cache.get("custom/model")).toBe(150_000);
    expect(cache.has("bad/model")).toBe(false);
  });

  it("prefers configured contextTokens over contextWindow", () => {
    const cache = new Map<string, number>();
    applyConfiguredContextWindows({
      cache,
      modelsConfig: {
        providers: {
          openrouter: {
            models: [{ id: "custom/model", contextWindow: 1_050_000, contextTokens: 200_000 }],
          },
        },
      },
    });

    expect(cache.get("custom/model")).toBe(200_000);
  });
});

describe("createSessionManagerRuntimeRegistry", () => {

@@ -192,4 +218,23 @@ describe("resolveContextTokensForModel", () => {

    expect(result).toBe(200_000);
  });

  it("prefers per-model contextTokens config over contextWindow", () => {
    const result = resolveContextTokensForModel({
      cfg: {
        models: {
          providers: {
            "openai-codex": {
              models: [{ id: "gpt-5.4", contextWindow: 1_050_000, contextTokens: 160_000 }],
            },
          },
        },
      },
      provider: "openai-codex",
      model: "gpt-5.4",
      fallbackContextTokens: 272_000,
    });

    expect(result).toBe(160_000);
  });
});

@@ -13,12 +13,12 @@ import { normalizeProviderId } from "./model-selection.js";

export { resetContextWindowCacheForTest } from "./context-runtime-state.js";

-type ModelEntry = { id: string; contextWindow?: number };
+type ModelEntry = { id: string; contextWindow?: number; contextTokens?: number };
type ModelRegistryLike = {
  getAvailable?: () => ModelEntry[];
  getAll: () => ModelEntry[];
};
-type ConfigModelEntry = { id?: string; contextWindow?: number };
+type ConfigModelEntry = { id?: string; contextWindow?: number; contextTokens?: number };
type ProviderConfigEntry = { models?: ConfigModelEntry[] };
type ModelsConfig = { providers?: Record<string, ProviderConfigEntry | undefined> };
type AgentModelEntry = { params?: Record<string, unknown> };

@@ -40,20 +40,20 @@ export function applyDiscoveredContextWindows(params: {
    if (!model?.id) {
      continue;
    }
-   const contextWindow =
-     typeof model.contextWindow === "number" ? Math.trunc(model.contextWindow) : undefined;
-   if (!contextWindow || contextWindow <= 0) {
+   const contextTokens =
+     typeof model.contextTokens === "number"
+       ? Math.trunc(model.contextTokens)
+       : typeof model.contextWindow === "number"
+         ? Math.trunc(model.contextWindow)
+         : undefined;
+   if (!contextTokens || contextTokens <= 0) {
      continue;
    }
    const existing = params.cache.get(model.id);
-   // When the same bare model id appears under multiple providers with different
-   // limits, keep the smaller window. This cache feeds both display paths and
-   // runtime paths (flush thresholds, session context-token persistence), so
-   // overestimating the limit could delay compaction and cause context overflow.
-   // Callers that know the active provider should use resolveContextTokensForModel,
-   // which tries the provider-qualified key first and falls back here.
-   if (existing === undefined || contextWindow < existing) {
-     params.cache.set(model.id, contextWindow);
+   // Cache the most conservative effective limit. Provider/runtime callers that
+   // know the active provider should still prefer qualified lookups first.
+   if (existing === undefined || contextTokens < existing) {
+     params.cache.set(model.id, contextTokens);
    }
  }
}
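The conservative caching rule in this hunk amounts to a "smaller value wins" update, which can be sketched on its own (illustrative helper, not the shipped function):

```typescript
// When the same bare model id is seen with different effective limits,
// keep the smaller one so the cache never overestimates a model's budget.
function cacheEffectiveLimit(cache: Map<string, number>, id: string, limit: number): void {
  const existing = cache.get(id);
  if (existing === undefined || limit < existing) {
    cache.set(id, limit);
  }
}
```

Underestimating is the safe direction here: an overestimated limit could delay compaction and overflow the real context window.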

@@ -72,12 +72,16 @@ export function applyConfiguredContextWindows(params: {
    }
    for (const model of provider.models) {
      const modelId = typeof model?.id === "string" ? model.id : undefined;
-     const contextWindow =
-       typeof model?.contextWindow === "number" ? model.contextWindow : undefined;
-     if (!modelId || !contextWindow || contextWindow <= 0) {
+     const contextTokens =
+       typeof model?.contextTokens === "number"
+         ? model.contextTokens
+         : typeof model?.contextWindow === "number"
+           ? model.contextWindow
+           : undefined;
+     if (!modelId || !contextTokens || contextTokens <= 0) {
        continue;
      }
-     params.cache.set(modelId, contextWindow);
+     params.cache.set(modelId, contextTokens);
    }
  }
}

@@ -307,12 +311,12 @@ function resolveProviderModelRef(params: {
  return { provider, model };
}

-// Look up an explicit contextWindow override for a specific provider+model
+// Look up an explicit runtime context cap for a specific provider+model
// directly from config, without going through the shared discovery cache.
// This avoids the cache keyspace collision where "provider/model" synthetic
// keys overlap with raw slash-containing model IDs (e.g. OpenRouter's
// "google/gemini-2.5-pro" stored as a raw catalog entry).
-function resolveConfiguredProviderContextWindow(
+function resolveConfiguredProviderContextTokens(
  cfg: OpenClawConfig | undefined,
  provider: string,
  model: string,

@@ -324,8 +328,8 @@ function resolveConfiguredProviderContextWindow(

  // Mirror the lookup order in pi-embedded-runner/model.ts: exact key first,
  // then normalized fallback. This prevents alias collisions from picking the
- // wrong contextWindow based on Object.entries iteration order.
- function findContextWindow(matchProviderId: (id: string) => boolean): number | undefined {
+ // wrong configured cap based on Object.entries iteration order.
+ function findContextTokens(matchProviderId: (id: string) => boolean): number | undefined {
    for (const [providerId, providerConfig] of Object.entries(providers!)) {
      if (!matchProviderId(providerId)) {
        continue;

@@ -334,13 +338,19 @@ function resolveConfiguredProviderContextWindow(
        continue;
      }
      for (const m of providerConfig.models) {
+       const contextTokens =
+         typeof m?.contextTokens === "number"
+           ? m.contextTokens
+           : typeof m?.contextWindow === "number"
+             ? m.contextWindow
+             : undefined;
        if (
          typeof m?.id === "string" &&
          m.id === model &&
-         typeof m?.contextWindow === "number" &&
-         m.contextWindow > 0
+         typeof contextTokens === "number" &&
+         contextTokens > 0
        ) {
-         return m.contextWindow;
+         return contextTokens;
        }
      }
    }

@@ -348,14 +358,14 @@ function resolveConfiguredProviderContextWindow(
  }

  // 1. Exact match (case-insensitive, no alias expansion).
- const exactResult = findContextWindow((id) => id.trim().toLowerCase() === provider.toLowerCase());
+ const exactResult = findContextTokens((id) => id.trim().toLowerCase() === provider.toLowerCase());
  if (exactResult !== undefined) {
    return exactResult;
  }

  // 2. Normalized fallback: covers alias keys such as "z.ai" → "zai".
  const normalizedProvider = normalizeProviderId(provider);
- return findContextWindow((id) => normalizeProviderId(id) === normalizedProvider);
+ return findContextTokens((id) => normalizeProviderId(id) === normalizedProvider);
}
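The two-pass lookup above can be sketched independently; `normalizeId` below is a stand-in for OpenClaw's `normalizeProviderId` (assumed to lowercase and strip punctuation — an assumption, not the documented behavior):

```typescript
// Hypothetical stand-in for normalizeProviderId.
function normalizeId(id: string): string {
  return id.trim().toLowerCase().replace(/[^a-z0-9-]/g, "");
}

// Pass 1: exact case-insensitive key match. Pass 2: normalized fallback,
// so alias keys such as "z.ai" still resolve for the canonical "zai".
function findProviderValue<T>(providers: Record<string, T>, provider: string): T | undefined {
  for (const [id, value] of Object.entries(providers)) {
    if (id.trim().toLowerCase() === provider.toLowerCase()) return value;
  }
  const normalized = normalizeId(provider);
  for (const [id, value] of Object.entries(providers)) {
    if (normalizeId(id) === normalized) return value;
  }
  return undefined;
}
```

Doing the exact pass first keeps the result independent of `Object.entries` iteration order when both an exact key and an alias key are present.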

function isAnthropic1MModel(provider: string, model: string): boolean {

@@ -399,7 +409,7 @@ export function resolveContextTokensForModel(params: {
  // window and misreport context limits for the OpenRouter session.
  // See status.ts log-usage fallback which calls with only { model } set.
  if (explicitProvider) {
-   const configuredWindow = resolveConfiguredProviderContextWindow(
+   const configuredWindow = resolveConfiguredProviderContextTokens(
      params.cfg,
      explicitProvider,
      ref.model,

@@ -85,6 +85,11 @@ export function mergeProviderModels(
    explicitValue: explicitModel.contextWindow,
    implicitValue: implicitModel.contextWindow,
  });
+ const contextTokens = resolvePreferredTokenLimit({
+   explicitPresent: "contextTokens" in explicitModel,
+   explicitValue: explicitModel.contextTokens,
+   implicitValue: implicitModel.contextTokens,
+ });
  const maxTokens = resolvePreferredTokenLimit({
    explicitPresent: "maxTokens" in explicitModel,
    explicitValue: explicitModel.maxTokens,

@@ -96,6 +101,7 @@ export function mergeProviderModels(
    input: implicitModel.input,
    reasoning: "reasoning" in explicitModel ? explicitModel.reasoning : implicitModel.reasoning,
    ...(contextWindow === undefined ? {} : { contextWindow }),
+   ...(contextTokens === undefined ? {} : { contextTokens }),
    ...(maxTokens === undefined ? {} : { maxTokens }),
  };
});

@@ -441,6 +441,7 @@ export async function compactEmbeddedPiSessionDirect(
    cfg: params.config,
    provider,
    modelId,
+   modelContextTokens: runtimeModel.contextTokens,
    modelContextWindow: runtimeModel.contextWindow,
    defaultTokens: DEFAULT_CONTEXT_TOKENS,
  });

@@ -1031,6 +1032,7 @@ export async function compactEmbeddedPiSession(
    cfg: params.config,
    provider: ceProvider,
    modelId: ceModelId,
+   modelContextTokens: ceModel?.contextTokens,
    modelContextWindow: ceModel?.contextWindow,
    defaultTokens: DEFAULT_CONTEXT_TOKENS,
  });

@@ -1,6 +1,6 @@
import type { Api, Model } from "@mariozechner/pi-ai";
import type { ExtensionFactory, SessionManager } from "@mariozechner/pi-coding-agent";
import type { OpenClawConfig } from "../../config/config.js";
+import type { ProviderRuntimeModel } from "../../plugins/types.js";
import { resolveContextWindowInfo } from "../context-window-guard.js";
import { DEFAULT_CONTEXT_TOKENS } from "../defaults.js";
import { setCompactionSafeguardRuntime } from "../pi-hooks/compaction-safeguard-runtime.js";

@@ -17,12 +17,13 @@ function resolveContextWindowTokens(params: {
  cfg: OpenClawConfig | undefined;
  provider: string;
  modelId: string;
- model: Model<Api> | undefined;
+ model: ProviderRuntimeModel | undefined;
}): number {
  return resolveContextWindowInfo({
    cfg: params.cfg,
    provider: params.provider,
    modelId: params.modelId,
+   modelContextTokens: params.model?.contextTokens,
    modelContextWindow: params.model?.contextWindow,
    defaultTokens: DEFAULT_CONTEXT_TOKENS,
  }).tokens;

@@ -33,7 +34,7 @@ function buildContextPruningFactory(params: {
  sessionManager: SessionManager;
  provider: string;
  modelId: string;
- model: Model<Api> | undefined;
+ model: ProviderRuntimeModel | undefined;
}): ExtensionFactory | undefined {
  const raw = params.cfg?.agents?.defaults?.contextPruning;
  if (raw?.mode !== "cache-ttl") {

@@ -73,7 +74,7 @@ export function buildEmbeddedExtensionFactories(params: {
  sessionManager: SessionManager;
  provider: string;
  modelId: string;
- model: Model<Api> | undefined;
+ model: ProviderRuntimeModel | undefined;
}): ExtensionFactory[] {
  const factories: ExtensionFactory[] = [];
  if (resolveCompactionMode(params.cfg) === "safeguard") {

@@ -83,6 +84,7 @@ export function buildEmbeddedExtensionFactories(params: {
    cfg: params.cfg,
    provider: params.provider,
    modelId: params.modelId,
+   modelContextTokens: params.model?.contextTokens,
    modelContextWindow: params.model?.contextWindow,
    defaultTokens: DEFAULT_CONTEXT_TOKENS,
  });

@@ -205,7 +205,8 @@ function buildDynamicModel(
    api: "openai-codex-responses",
    baseUrl: OPENAI_CODEX_BASE_URL,
    cost: { input: 2.5, output: 15, cacheRead: 0.25, cacheWrite: 0 },
-   contextWindow: 272_000,
+   contextWindow: 1_050_000,
+   contextTokens: 272_000,
    maxTokens: 128_000,
  },
  fallback,

@@ -70,7 +70,8 @@ export function buildOpenAICodexForwardCompatExpectation(
      : isGpt54
        ? { input: 2.5, output: 15, cacheRead: 0.25, cacheWrite: 0 }
        : OPENAI_CODEX_TEMPLATE_MODEL.cost,
-   contextWindow: isGpt54 ? 272_000 : isSpark ? 128_000 : 272000,
+   contextWindow: isGpt54 ? 1_050_000 : isSpark ? 128_000 : 272000,
+   ...(isGpt54 ? { contextTokens: 272_000 } : {}),
    maxTokens: 128000,
  };
}

@@ -12,6 +12,7 @@ import {
  runProviderDynamicModel,
  normalizeProviderResolvedModelWithPlugin,
} from "../../plugins/provider-runtime.js";
+import type { ProviderRuntimeModel } from "../../plugins/types.js";
import { resolveOpenClawAgentDir } from "../agent-paths.js";
import { DEFAULT_CONTEXT_TOKENS } from "../defaults.js";
import { buildModelAliasLines } from "../model-alias-lines.js";

@@ -294,12 +295,12 @@ function resolveConfiguredProviderConfig(

function applyConfiguredProviderOverrides(params: {
  provider: string;
- discoveredModel: Model<Api>;
+ discoveredModel: ProviderRuntimeModel;
  providerConfig?: InlineProviderConfig;
  modelId: string;
  cfg?: OpenClawConfig;
  runtimeHooks?: ProviderRuntimeHooks;
-}): Model<Api> {
+}): ProviderRuntimeModel {
  const { discoveredModel, providerConfig, modelId } = params;
  if (!providerConfig) {
    return {

@@ -368,6 +369,7 @@ function applyConfiguredProviderOverrides(params: {
    input: normalizedInput,
    cost: configuredModel?.cost ?? discoveredModel.cost,
    contextWindow: configuredModel?.contextWindow ?? discoveredModel.contextWindow,
+   contextTokens: configuredModel?.contextTokens ?? discoveredModel.contextTokens,
    maxTokens: configuredModel?.maxTokens ?? discoveredModel.maxTokens,
    headers: requestConfig.headers,
    compat: configuredModel?.compat ?? discoveredModel.compat,

@@ -595,6 +597,7 @@ function resolveConfiguredFallbackModel(params: {
      configuredModel?.contextWindow ??
      providerConfig?.models?.[0]?.contextWindow ??
      DEFAULT_CONTEXT_TOKENS,
+   contextTokens: configuredModel?.contextTokens ?? providerConfig?.models?.[0]?.contextTokens,
    maxTokens:
      configuredModel?.maxTokens ??
      providerConfig?.models?.[0]?.maxTokens ??

@@ -105,6 +105,7 @@ export function resolveEffectiveRuntimeModel(params: {
    cfg: params.cfg,
    provider: params.provider,
    modelId: params.modelId,
+   modelContextTokens: params.runtimeModel.contextTokens,
    modelContextWindow: params.runtimeModel.contextWindow,
    defaultTokens: DEFAULT_CONTEXT_TOKENS,
  });

@@ -20,6 +20,25 @@ describe("statusSummaryRuntime.resolveContextTokensForModel", () => {

    expect(contextTokens).toBe(123_456);
  });

  it("prefers per-model contextTokens over contextWindow", () => {
    const contextTokens = statusSummaryRuntime.resolveContextTokensForModel({
      cfg: {
        models: {
          providers: {
            "openai-codex": {
              models: [{ id: "gpt-5.4", contextWindow: 1_050_000, contextTokens: 272_000 }],
            },
          },
        },
      } as never,
      provider: "openai-codex",
      model: "gpt-5.4",
      fallbackContextTokens: 999,
    });

    expect(contextTokens).toBe(272_000);
  });
});

describe("statusSummaryRuntime.resolveSessionModelRef", () => {
@@ -99,7 +99,7 @@ function resolveConfiguredStatusModelRef(params: {
   return { provider: params.defaultProvider, model: params.defaultModel };
 }

-function resolveConfiguredProviderContextWindow(
+function resolveConfiguredProviderContextTokens(
   cfg: OpenClawConfig | undefined,
   provider: string,
   model: string,
@@ -114,13 +114,19 @@ function resolveConfiguredProviderContextWindow(
       continue;
     }
     for (const entry of providerConfig.models) {
+      const contextTokens =
+        typeof entry?.contextTokens === "number"
+          ? entry.contextTokens
+          : typeof entry?.contextWindow === "number"
+            ? entry.contextWindow
+            : undefined;
       if (
         typeof entry?.id === "string" &&
         entry.id === model &&
-        typeof entry.contextWindow === "number" &&
-        entry.contextWindow > 0
+        typeof contextTokens === "number" &&
+        contextTokens > 0
       ) {
-        return entry.contextWindow;
+        return contextTokens;
       }
     }
   }
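The per-entry cap selection in the hunk above can be sketched in isolation. This is a minimal standalone version for illustration; the helper name `effectiveCapForEntry` and the `ModelEntry` shape are assumptions, not the repo's actual exports:

```typescript
// Sketch of the new cap selection: an explicit contextTokens wins,
// otherwise contextWindow serves as the cap, and non-positive or
// non-matching entries yield undefined.
type ModelEntry = { id?: string; contextWindow?: number; contextTokens?: number };

function effectiveCapForEntry(entry: ModelEntry, model: string): number | undefined {
  const contextTokens =
    typeof entry?.contextTokens === "number"
      ? entry.contextTokens
      : typeof entry?.contextWindow === "number"
        ? entry.contextWindow
        : undefined;
  if (entry?.id === model && typeof contextTokens === "number" && contextTokens > 0) {
    return contextTokens;
  }
  return undefined;
}
```

With both fields set, `contextTokens` is preferred; with only `contextWindow`, behavior matches the pre-change code.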
@@ -180,13 +186,13 @@ function resolveContextTokensForModel(params: {
     return params.contextTokensOverride;
   }
   if (params.provider && params.model) {
-    const configuredWindow = resolveConfiguredProviderContextWindow(
+    const configuredContextTokens = resolveConfiguredProviderContextTokens(
       params.cfg,
       params.provider,
       params.model,
     );
-    if (configuredWindow !== undefined) {
-      return configuredWindow;
+    if (configuredContextTokens !== undefined) {
+      return configuredContextTokens;
     }
   }
   return params.fallbackContextTokens ?? DEFAULT_CONTEXT_TOKENS;
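The overall resolution order in `resolveContextTokensForModel` is: explicit override, then the configured per-model cap, then the caller's fallback, then the default. A simplified sketch of that precedence chain (parameter names and the flattened `configuredContextTokens` input are illustrative, not the real signature):

```typescript
// Simplified precedence chain for the effective runtime cap.
const DEFAULT_CONTEXT_TOKENS = 272_000; // assumed default for illustration

function resolveCap(params: {
  contextTokensOverride?: number;      // explicit per-session override
  configuredContextTokens?: number;    // stands in for the config lookup
  fallbackContextTokens?: number;      // caller-provided fallback
}): number {
  if (typeof params.contextTokensOverride === "number") {
    return params.contextTokensOverride;
  }
  if (params.configuredContextTokens !== undefined) {
    return params.configuredContextTokens;
  }
  return params.fallbackContextTokens ?? DEFAULT_CONTEXT_TOKENS;
}
```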
@@ -2318,6 +2318,11 @@ export const GENERATED_BASE_CONFIG_SCHEMA = {
             type: "number",
             exclusiveMinimum: 0,
           },
+          contextTokens: {
+            type: "integer",
+            exclusiveMinimum: 0,
+            maximum: 9007199254740991,
+          },
           maxTokens: {
             type: "number",
             exclusiveMinimum: 0,
@@ -22333,7 +22338,7 @@ export const GENERATED_BASE_CONFIG_SCHEMA = {
   },
   "models.mode": {
     label: "Model Catalog Mode",
-    help: 'Controls provider catalog behavior: "merge" keeps built-ins and overlays your custom providers, while "replace" uses only your configured providers. In "merge", matching provider IDs preserve non-empty agent models.json baseUrl values, while apiKey values are preserved only when the provider is not SecretRef-managed in current config/auth-profile context; SecretRef-managed providers refresh apiKey from current source markers, and matching model contextWindow/maxTokens use the higher value between explicit and implicit entries.',
+    help: 'Controls provider catalog behavior: "merge" keeps built-ins and overlays your custom providers, while "replace" uses only your configured providers. In "merge", matching provider IDs preserve non-empty agent models.json baseUrl values, while apiKey values are preserved only when the provider is not SecretRef-managed in current config/auth-profile context; SecretRef-managed providers refresh apiKey from current source markers, matching model contextWindow/maxTokens use the higher value between explicit and implicit entries, and explicit contextTokens runtime caps are preserved.',
     tags: ["models"],
   },
   "models.providers": {
@@ -60,6 +60,12 @@ export type ModelDefinitionConfig = {
     cacheWrite: number;
   };
   contextWindow: number;
+  /**
+   * Optional effective runtime cap used for compaction/session budgeting.
+   * Keeps provider/native contextWindow metadata intact while letting configs
+   * prefer a smaller practical window.
+   */
+  contextTokens?: number;
   maxTokens: number;
   headers?: Record<string, string>;
   compat?: ModelCompatConfig;
@@ -307,6 +307,7 @@ export const ModelDefinitionSchema = z
       .strict()
       .optional(),
     contextWindow: z.number().positive().optional(),
+    contextTokens: z.number().int().positive().optional(),
    maxTokens: z.number().positive().optional(),
    headers: z.record(z.string(), z.string()).optional(),
    compat: ModelCompatSchema,
@@ -313,7 +313,9 @@ export type ProviderPluginCatalog = {
  * Runtime hooks below operate on the final `pi-ai` model object after
  * discovery/override merging, just before inference runs.
  */
-export type ProviderRuntimeModel = Model<Api>;
+export type ProviderRuntimeModel = Model<Api> & {
+  contextTokens?: number;
+};

 export type ProviderRuntimeProviderConfig = {
   baseUrl?: string;
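Taken together, these changes let a per-model `contextTokens` cap ride alongside the native `contextWindow` metadata. A hypothetical config fragment exercising the new field (the provider and model IDs match this commit's test; the surrounding shape is illustrative, not a complete openclaw config):

```typescript
// Hypothetical fragment of models.providers config: contextWindow stays
// the native model metadata, while contextTokens is the smaller effective
// runtime cap used for compaction/session budgeting.
const config = {
  models: {
    providers: {
      "openai-codex": {
        models: [
          {
            id: "gpt-5.4",
            contextWindow: 1_050_000, // native metadata, left intact
            contextTokens: 272_000,   // optional effective runtime cap
          },
        ],
      },
    },
  },
};
```

Omitting `contextTokens` keeps the prior behavior, where `contextWindow` doubles as the cap.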