mirror of
https://github.com/moltbot/moltbot.git
synced 2026-04-21 05:32:53 +00:00
test: add live cache provider probes
@@ -26,7 +26,7 @@ openclaw models scan

`openclaw models status` shows the resolved default/fallbacks plus an auth overview.
When provider usage snapshots are available, the OAuth/token status section includes
provider usage windows and quota snapshots.
Add `--probe` to run live auth probes against each configured provider profile.
Probes are real requests (they may consume tokens and trigger rate limits).
Use `--agent <id>` to inspect a configured agent’s model/auth state. When omitted,
@@ -9,14 +9,18 @@ read_when:

# Prompt caching

Prompt caching means the model provider can reuse unchanged prompt prefixes (usually system/developer instructions and other stable context) across turns instead of re-processing them every time. The first matching request writes cache tokens (`cacheWrite`), and later matching requests read them back (`cacheRead`). OpenClaw normalizes provider usage into `cacheRead` and `cacheWrite` where the upstream API exposes those counters directly.

Why this matters: lower token cost, faster responses, and more predictable performance for long-running sessions. Without caching, repeated prompts pay the full prompt cost on every turn even when most of the input did not change.

This page covers all cache-related knobs that affect prompt reuse and token cost.
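To make the cost effect concrete, here is a minimal sketch of per-turn prompt cost with and without a warm cache. The 1.25x write and 0.1x read multipliers are assumptions for illustration only (roughly in line with published Anthropic cache pricing), not OpenClaw or provider guarantees; check your provider's pricing page.

```typescript
// Hypothetical illustration: per-turn prompt cost in "token cost units".
// The write/read multipliers below are assumptions, not provider guarantees.
function promptCostUnits(
  usage: { input: number; cacheRead: number; cacheWrite: number },
  writeMultiplier = 1.25,
  readMultiplier = 0.1,
): number {
  return usage.input + usage.cacheWrite * writeMultiplier + usage.cacheRead * readMultiplier;
}

// First turn: the whole 10k-token prefix is written to cache.
const firstTurn = promptCostUnits({ input: 200, cacheRead: 0, cacheWrite: 10_000 });
// Later turns: the prefix is read back instead of reprocessed.
const laterTurn = promptCostUnits({ input: 200, cacheRead: 10_000, cacheWrite: 0 });

console.log(firstTurn, laterTurn); // → 12700 1200 (uncached would be 10200 every turn)
```

Under these assumed multipliers, the warmup turn costs slightly more than an uncached turn, and every later matching turn costs a fraction of it.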
Provider references:

- Anthropic prompt caching: [https://platform.claude.com/docs/en/build-with-claude/prompt-caching](https://platform.claude.com/docs/en/build-with-claude/prompt-caching)
- OpenAI prompt caching: [https://developers.openai.com/api/docs/guides/prompt-caching](https://developers.openai.com/api/docs/guides/prompt-caching)
- OpenAI API headers and request IDs: [https://developers.openai.com/api/reference/overview](https://developers.openai.com/api/reference/overview)
- Anthropic request IDs and errors: [https://platform.claude.com/docs/en/api/errors](https://platform.claude.com/docs/en/api/errors)

## Primary knobs
@@ -100,6 +104,16 @@ Per-agent heartbeat is supported at `agents.list[].heartbeat`.

- `cacheRetention` is supported.
- With Anthropic API-key auth profiles, OpenClaw seeds `cacheRetention: "short"` for Anthropic model refs when unset.
- Anthropic native Messages responses expose both `cache_read_input_tokens` and `cache_creation_input_tokens`, so OpenClaw can show both `cacheRead` and `cacheWrite`.
- For native Anthropic requests, `cacheRetention: "short"` maps to the default 5-minute ephemeral cache, and `cacheRetention: "long"` upgrades to the 1-hour TTL only on direct `api.anthropic.com` hosts.
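The two Anthropic counters above normalize straightforwardly. A minimal sketch (not OpenClaw's actual mapper; the input shape follows the Anthropic Messages `usage` payload named in the bullets):

```typescript
// Normalize an Anthropic Messages `usage` payload into the cacheRead/cacheWrite
// counters that OpenClaw-style usage surfaces report.
type AnthropicUsage = {
  input_tokens: number;
  output_tokens: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
};

function normalizeAnthropicUsage(usage: AnthropicUsage) {
  return {
    input: usage.input_tokens,
    output: usage.output_tokens,
    cacheRead: usage.cache_read_input_tokens ?? 0, // tokens served from cache
    cacheWrite: usage.cache_creation_input_tokens ?? 0, // tokens written to cache
  };
}

// Example payload shaped like a warm-cache turn.
const normalized = normalizeAnthropicUsage({
  input_tokens: 42,
  output_tokens: 12,
  cache_read_input_tokens: 9_800,
  cache_creation_input_tokens: 0,
});
console.log(normalized.cacheRead, normalized.cacheWrite); // → 9800 0
```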
### OpenAI (direct API)

- Prompt caching is automatic on supported recent models. OpenClaw does not need to inject block-level cache markers.
- OpenClaw uses `prompt_cache_key` to keep cache routing stable across turns, and sends `prompt_cache_retention: "24h"` only when `cacheRetention: "long"` is selected on direct OpenAI hosts.
- OpenAI responses expose cached prompt tokens via `usage.prompt_tokens_details.cached_tokens` (or `input_tokens_details.cached_tokens` on Responses API events). OpenClaw maps that to `cacheRead`.
- OpenAI does not expose a separate cache-write token counter, so `cacheWrite` stays `0` on OpenAI paths even when the provider is warming a cache.
- OpenAI returns useful tracing and rate-limit headers such as `x-request-id`, `openai-processing-ms`, and `x-ratelimit-*`, but cache-hit accounting should come from the usage payload, not from headers.
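A sketch of the mapping described above (assumed payload shapes, not OpenClaw's real code): pull the cached-token counter out of an OpenAI usage object, trying both field locations named in the bullets.

```typescript
// Chat Completions reports cached tokens under `prompt_tokens_details`;
// Responses API events use `input_tokens_details`.
type OpenAiUsage = {
  prompt_tokens_details?: { cached_tokens?: number };
  input_tokens_details?: { cached_tokens?: number };
};

function cacheReadFromOpenAiUsage(usage: OpenAiUsage): number {
  return (
    usage.prompt_tokens_details?.cached_tokens ??
    usage.input_tokens_details?.cached_tokens ??
    0
  );
}
// There is no cache-write counterpart, so a normalizer would pin cacheWrite to 0.

console.log(cacheReadFromOpenAiUsage({ prompt_tokens_details: { cached_tokens: 2_048 } })); // → 2048
console.log(cacheReadFromOpenAiUsage({ input_tokens_details: { cached_tokens: 1_024 } })); // → 1024
console.log(cacheReadFromOpenAiUsage({})); // → 0
```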
### Amazon Bedrock

@@ -180,10 +194,15 @@ Defaults:

- Cache trace events are JSONL and include staged snapshots like `session:loaded`, `prompt:before`, `stream:context`, and `session:after`.
- Per-turn cache token impact is visible in normal usage surfaces via `cacheRead` and `cacheWrite` (for example `/usage full` and session usage summaries).
- For Anthropic, expect both `cacheRead` and `cacheWrite` when caching is active.
- For OpenAI, expect `cacheRead` on cache hits and `cacheWrite` to remain `0`; OpenAI does not publish a separate cache-write token field.
- If you need request tracing, log request IDs and rate-limit headers separately from cache metrics. OpenClaw's current cache-trace output is focused on prompt/session shape and normalized token usage rather than raw provider response headers.
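The per-turn counters can be folded into a single hit-rate figure. This mirrors the `computeCacheHitRate` helper added by this commit: cache reads divided by the total prompt tokens (fresh input plus cache reads plus cache writes), with `0` when nothing was read.

```typescript
// Hit rate = cacheRead / (input + cacheRead + cacheWrite); 0 if nothing was read.
function cacheHitRate(usage: { input?: number; cacheRead?: number; cacheWrite?: number }): number {
  const input = usage.input ?? 0;
  const cacheRead = usage.cacheRead ?? 0;
  const cacheWrite = usage.cacheWrite ?? 0;
  const total = input + cacheRead + cacheWrite;
  return total > 0 && cacheRead > 0 ? cacheRead / total : 0;
}

console.log(cacheHitRate({ input: 200, cacheRead: 9_800, cacheWrite: 0 })); // → 0.98
console.log(cacheHitRate({ input: 200, cacheRead: 0, cacheWrite: 9_800 })); // warmup turn → 0
```

A healthy steady-state session shows a rate near 1 on repeated turns; the warmup turn always reports 0 because its prefix tokens count as writes, not reads.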
## Quick troubleshooting

- High `cacheWrite` on most turns: check for volatile system-prompt inputs and verify the model/provider supports your cache settings.
- High `cacheWrite` on Anthropic: often means the cache breakpoint is landing on content that changes every request.
- Low OpenAI `cacheRead`: verify the stable prefix is at the front, the repeated prefix is at least 1024 tokens, and the same `prompt_cache_key` is reused for turns that should share a cache.
- No effect from `cacheRetention`: confirm the model key matches `agents.defaults.models["provider/model"]`.
- Bedrock Nova/Mistral requests with cache settings: expected; the runtime forces caching to `none` for those models.
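When hunting for the volatile prompt content behind high `cacheWrite`, comparing two serialized prompts byte-for-byte is often enough. A diagnostic sketch (hypothetical helper, not an OpenClaw API): find the index where two turns' prompts diverge; a divergence point near the front points at unstable system-prompt inputs such as timestamps or request IDs.

```typescript
// Return the index where two serialized prompts first differ, or -1 if identical.
function firstDivergence(a: string, b: string): number {
  const limit = Math.min(a.length, b.length);
  let index = 0;
  while (index < limit && a[index] === b[index]) {
    index += 1;
  }
  return index === limit && a.length === b.length ? -1 : index;
}

// Hypothetical prompts: identical except for an embedded timestamp.
const turn1 = "system: You are a helpful agent.\nnow=2026-04-20T10:00:00Z\n...";
const turn2 = "system: You are a helpful agent.\nnow=2026-04-21T09:30:00Z\n...";
console.log(firstDivergence(turn1, turn2)); // diverges inside the timestamp, near the front
```

A divergence index much smaller than the prompt length means the cacheable prefix ends there, and everything after it is rewritten to cache on every turn.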
src/agents/live-cache-test-support.ts (new file, 194 lines)
@@ -0,0 +1,194 @@
import { completeSimple, type Api, type AssistantMessage, type Model } from "@mariozechner/pi-ai";
import { loadConfig } from "../config/config.js";
import { isTruthyEnvValue } from "../infra/env.js";
import { resolveOpenClawAgentDir } from "./agent-paths.js";
import { collectProviderApiKeys } from "./live-auth-keys.js";
import { isLiveTestEnabled } from "./live-test-helpers.js";
import { getApiKeyForModel, requireApiKey } from "./model-auth.js";
import { normalizeProviderId, parseModelRef } from "./model-selection.js";
import { ensureOpenClawModelsJson } from "./models-config.js";
import { discoverAuthStorage, discoverModels } from "./pi-model-discovery.js";

export const LIVE_CACHE_TEST_ENABLED =
  isLiveTestEnabled() && isTruthyEnvValue(process.env.OPENCLAW_LIVE_CACHE_TEST);

const DEFAULT_HEARTBEAT_MS = 20_000;
const DEFAULT_TIMEOUT_MS = 90_000;

type LiveResolvedModel = {
  apiKey: string;
  model: Model<Api>;
};

function toInt(value: string | undefined, fallback: number): number {
  const trimmed = value?.trim();
  if (!trimmed) {
    return fallback;
  }
  const parsed = Number.parseInt(trimmed, 10);
  return Number.isFinite(parsed) ? parsed : fallback;
}

export function logLiveCache(message: string): void {
  process.stderr.write(`[live-cache] ${message}\n`);
}

export async function withLiveCacheHeartbeat<T>(
  operation: Promise<T>,
  context: string,
): Promise<T> {
  const heartbeatMs = Math.max(
    1_000,
    toInt(process.env.OPENCLAW_LIVE_HEARTBEAT_MS, DEFAULT_HEARTBEAT_MS),
  );
  const startedAt = Date.now();
  let heartbeatCount = 0;
  const timer = setInterval(() => {
    heartbeatCount += 1;
    logLiveCache(
      `${context}: still running (${Math.max(1, Math.round((Date.now() - startedAt) / 1_000))}s)`,
    );
  }, heartbeatMs);
  timer.unref?.();
  try {
    return await operation;
  } finally {
    clearInterval(timer);
    if (heartbeatCount > 0) {
      logLiveCache(
        `${context}: completed (${Math.max(1, Math.round((Date.now() - startedAt) / 1_000))}s)`,
      );
    }
  }
}

export async function completeSimpleWithLiveTimeout<TApi extends Api>(
  model: Model<TApi>,
  context: Parameters<typeof completeSimple<TApi>>[1],
  options: Parameters<typeof completeSimple<TApi>>[2],
  progressContext: string,
  timeoutMs = Math.max(
    1_000,
    toInt(process.env.OPENCLAW_LIVE_MODEL_TIMEOUT_MS, DEFAULT_TIMEOUT_MS),
  ),
): Promise<AssistantMessage> {
  const controller = new AbortController();
  const abortTimer = setTimeout(() => controller.abort(), timeoutMs);
  abortTimer.unref?.();
  let hardTimer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    hardTimer = setTimeout(() => {
      reject(new Error(`${progressContext} timed out after ${timeoutMs}ms`));
    }, timeoutMs);
    hardTimer.unref?.();
  });
  try {
    return await withLiveCacheHeartbeat(
      Promise.race([
        completeSimple(model, context, {
          ...options,
          signal: controller.signal,
        }),
        timeout,
      ]),
      progressContext,
    );
  } finally {
    clearTimeout(abortTimer);
    if (hardTimer) {
      clearTimeout(hardTimer);
    }
  }
}

export function buildStableCachePrefix(tag: string, sections = 160): string {
  const lines = [
    `Stable cache prefix for ${tag}.`,
    "Preserve this prefix byte-for-byte across retries.",
    "Return only the requested marker from the final user message.",
  ];
  for (let index = 0; index < sections; index += 1) {
    lines.push(
      `Section ${index + 1}: deterministic cache prose with repeated lexical material about routing, invariants, transcript stability, prefix locality, provider usage accounting, and session affinity.`,
    );
  }
  return lines.join("\n");
}

export function extractAssistantText(message: AssistantMessage): string {
  return message.content
    .filter((block) => block.type === "text")
    .map((block) => block.text.trim())
    .filter(Boolean)
    .join(" ");
}

export function computeCacheHitRate(usage: {
  input?: number;
  cacheRead?: number;
  cacheWrite?: number;
}): number {
  const input = usage.input ?? 0;
  const cacheRead = usage.cacheRead ?? 0;
  const cacheWrite = usage.cacheWrite ?? 0;
  const totalPrompt = input + cacheRead + cacheWrite;
  if (totalPrompt <= 0 || cacheRead <= 0) {
    return 0;
  }
  return cacheRead / totalPrompt;
}

export async function resolveLiveDirectModel(params: {
  provider: "anthropic" | "openai";
  api: "anthropic-messages" | "openai-responses";
  envVar: string;
  preferredModelIds: readonly string[];
}): Promise<LiveResolvedModel> {
  const cfg = loadConfig();
  await ensureOpenClawModelsJson(cfg);
  const agentDir = resolveOpenClawAgentDir();
  const authStorage = discoverAuthStorage(agentDir);
  const models = discoverModels(authStorage, agentDir).getAll();

  const rawModel = process.env[params.envVar]?.trim();
  const parsed = rawModel ? parseModelRef(rawModel, params.provider) : null;
  const candidates = models.filter(
    (model) => normalizeProviderId(model.provider) === params.provider && model.api === params.api,
  );

  let resolvedModel: Model<Api> | undefined;
  if (parsed) {
    resolvedModel = candidates.find(
      (model) =>
        normalizeProviderId(model.provider) === parsed.provider && model.id === parsed.model,
    );
  }
  if (!resolvedModel) {
    resolvedModel = params.preferredModelIds
      .map((id) => candidates.find((model) => model.id === id))
      .find(Boolean);
  }
  if (!resolvedModel) {
    throw new Error(
      rawModel
        ? `Model not found for ${params.provider}: ${rawModel}`
        : `No ${params.provider} ${params.api} model available in registry.`,
    );
  }

  const liveKeys = collectProviderApiKeys(params.provider);
  const apiKey =
    liveKeys[0] ??
    requireApiKey(
      await getApiKeyForModel({
        model: resolvedModel,
        cfg,
        agentDir,
      }),
      resolvedModel.provider,
    );
  return {
    model: resolvedModel,
    apiKey,
  };
}
src/agents/pi-embedded-runner.cache.live.test.ts (new file, 234 lines)
@@ -0,0 +1,234 @@
import type { AssistantMessage } from "@mariozechner/pi-ai";
import { beforeAll, describe, expect, it } from "vitest";
import {
  buildStableCachePrefix,
  completeSimpleWithLiveTimeout,
  computeCacheHitRate,
  extractAssistantText,
  LIVE_CACHE_TEST_ENABLED,
  logLiveCache,
  resolveLiveDirectModel,
} from "./live-cache-test-support.js";

const describeCacheLive = LIVE_CACHE_TEST_ENABLED ? describe : describe.skip;

const OPENAI_TIMEOUT_MS = 120_000;
const ANTHROPIC_TIMEOUT_MS = 120_000;
const OPENAI_SESSION_ID = "live-cache-openai-stable-session";
const ANTHROPIC_SESSION_ID = "live-cache-anthropic-stable-session";
const OPENAI_PREFIX = buildStableCachePrefix("openai");
const ANTHROPIC_PREFIX = buildStableCachePrefix("anthropic");

type CacheRun = {
  hitRate: number;
  suffix: string;
  text: string;
  usage: AssistantMessage["usage"];
};

async function runOpenAiCacheProbe(params: {
  apiKey: string;
  model: Awaited<ReturnType<typeof resolveLiveDirectModel>>["model"];
  sessionId: string;
  suffix: string;
}): Promise<CacheRun> {
  const response = await completeSimpleWithLiveTimeout(
    params.model,
    {
      systemPrompt: OPENAI_PREFIX,
      messages: [
        {
          role: "user",
          content: `Reply with exactly CACHE-OK ${params.suffix}.`,
          timestamp: Date.now(),
        },
      ],
    },
    {
      apiKey: params.apiKey,
      cacheRetention: "short",
      sessionId: params.sessionId,
      maxTokens: 32,
      temperature: 0,
      reasoning: "none",
    },
    `openai cache probe ${params.suffix}`,
    OPENAI_TIMEOUT_MS,
  );
  const text = extractAssistantText(response);
  expect(text.toLowerCase()).toContain(params.suffix.toLowerCase());
  return {
    suffix: params.suffix,
    text,
    usage: response.usage,
    hitRate: computeCacheHitRate(response.usage),
  };
}

async function runAnthropicCacheProbe(params: {
  apiKey: string;
  model: Awaited<ReturnType<typeof resolveLiveDirectModel>>["model"];
  sessionId: string;
  suffix: string;
  cacheRetention: "none" | "short" | "long";
}): Promise<CacheRun> {
  const response = await completeSimpleWithLiveTimeout(
    params.model,
    {
      systemPrompt: ANTHROPIC_PREFIX,
      messages: [
        {
          role: "user",
          content: `Reply with exactly CACHE-OK ${params.suffix}.`,
          timestamp: Date.now(),
        },
      ],
    },
    {
      apiKey: params.apiKey,
      cacheRetention: params.cacheRetention,
      sessionId: params.sessionId,
      maxTokens: 32,
      temperature: 0,
    },
    `anthropic cache probe ${params.suffix} (${params.cacheRetention})`,
    ANTHROPIC_TIMEOUT_MS,
  );
  const text = extractAssistantText(response);
  expect(text.toLowerCase()).toContain(params.suffix.toLowerCase());
  return {
    suffix: params.suffix,
    text,
    usage: response.usage,
    hitRate: computeCacheHitRate(response.usage),
  };
}

describeCacheLive("pi embedded runner prompt caching (live)", () => {
  describe("openai", () => {
    let fixture: Awaited<ReturnType<typeof resolveLiveDirectModel>>;

    beforeAll(async () => {
      fixture = await resolveLiveDirectModel({
        provider: "openai",
        api: "openai-responses",
        envVar: "OPENCLAW_LIVE_OPENAI_CACHE_MODEL",
        preferredModelIds: ["gpt-5.4-mini", "gpt-5.4", "gpt-5.2"],
      });
      logLiveCache(`openai model=${fixture.model.provider}/${fixture.model.id}`);
    }, 120_000);

    it(
      "hits a high cache-read rate on repeated stable prefixes",
      async () => {
        const warmup = await runOpenAiCacheProbe({
          ...fixture,
          sessionId: OPENAI_SESSION_ID,
          suffix: "warmup",
        });
        logLiveCache(
          `openai warmup cacheRead=${warmup.usage.cacheRead} input=${warmup.usage.input} rate=${warmup.hitRate.toFixed(3)}`,
        );

        const hitRuns = [
          await runOpenAiCacheProbe({
            ...fixture,
            sessionId: OPENAI_SESSION_ID,
            suffix: "hit-a",
          }),
          await runOpenAiCacheProbe({
            ...fixture,
            sessionId: OPENAI_SESSION_ID,
            suffix: "hit-b",
          }),
        ];

        const bestHit = hitRuns.reduce((best, candidate) =>
          (candidate.usage.cacheRead ?? 0) > (best.usage.cacheRead ?? 0) ? candidate : best,
        );
        logLiveCache(
          `openai best-hit suffix=${bestHit.suffix} cacheRead=${bestHit.usage.cacheRead} input=${bestHit.usage.input} rate=${bestHit.hitRate.toFixed(3)}`,
        );

        expect(bestHit.usage.cacheRead ?? 0).toBeGreaterThan(1_024);
        expect(bestHit.hitRate).toBeGreaterThanOrEqual(0.7);
      },
      6 * 60_000,
    );
  });

  describe("anthropic", () => {
    let fixture: Awaited<ReturnType<typeof resolveLiveDirectModel>>;

    beforeAll(async () => {
      fixture = await resolveLiveDirectModel({
        provider: "anthropic",
        api: "anthropic-messages",
        envVar: "OPENCLAW_LIVE_ANTHROPIC_CACHE_MODEL",
        preferredModelIds: ["claude-sonnet-4-6", "claude-sonnet-4-5", "claude-haiku-3-5"],
      });
      logLiveCache(`anthropic model=${fixture.model.provider}/${fixture.model.id}`);
    }, 120_000);

    it(
      "writes cache on warmup and reads it back on repeated stable prefixes",
      async () => {
        const warmup = await runAnthropicCacheProbe({
          ...fixture,
          sessionId: ANTHROPIC_SESSION_ID,
          suffix: "warmup",
          cacheRetention: "short",
        });
        logLiveCache(
          `anthropic warmup cacheWrite=${warmup.usage.cacheWrite} cacheRead=${warmup.usage.cacheRead} input=${warmup.usage.input} rate=${warmup.hitRate.toFixed(3)}`,
        );
        expect(warmup.usage.cacheWrite ?? 0).toBeGreaterThan(0);

        const hitRuns = [
          await runAnthropicCacheProbe({
            ...fixture,
            sessionId: ANTHROPIC_SESSION_ID,
            suffix: "hit-a",
            cacheRetention: "short",
          }),
          await runAnthropicCacheProbe({
            ...fixture,
            sessionId: ANTHROPIC_SESSION_ID,
            suffix: "hit-b",
            cacheRetention: "short",
          }),
        ];

        const bestHit = hitRuns.reduce((best, candidate) =>
          (candidate.usage.cacheRead ?? 0) > (best.usage.cacheRead ?? 0) ? candidate : best,
        );
        logLiveCache(
          `anthropic best-hit suffix=${bestHit.suffix} cacheWrite=${bestHit.usage.cacheWrite} cacheRead=${bestHit.usage.cacheRead} input=${bestHit.usage.input} rate=${bestHit.hitRate.toFixed(3)}`,
        );

        expect(bestHit.usage.cacheRead ?? 0).toBeGreaterThan(1_024);
        expect(bestHit.hitRate).toBeGreaterThanOrEqual(0.7);
      },
      6 * 60_000,
    );

    it(
      "does not report meaningful cache activity when retention is disabled",
      async () => {
        const disabled = await runAnthropicCacheProbe({
          ...fixture,
          sessionId: `${ANTHROPIC_SESSION_ID}-disabled`,
          suffix: "no-cache",
          cacheRetention: "none",
        });
        logLiveCache(
          `anthropic none cacheWrite=${disabled.usage.cacheWrite} cacheRead=${disabled.usage.cacheRead} input=${disabled.usage.input}`,
        );

        expect(disabled.usage.cacheRead ?? 0).toBeLessThanOrEqual(32);
        expect(disabled.usage.cacheWrite ?? 0).toBeLessThanOrEqual(32);
      },
      3 * 60_000,
    );
  });
});
src/agents/provider-headers.live.test.ts (new file, 93 lines)
@@ -0,0 +1,93 @@
import { beforeAll, describe, expect, it } from "vitest";
import {
  LIVE_CACHE_TEST_ENABLED,
  logLiveCache,
  resolveLiveDirectModel,
  withLiveCacheHeartbeat,
} from "./live-cache-test-support.js";

const describeLive = LIVE_CACHE_TEST_ENABLED ? describe : describe.skip;

describeLive("provider response headers (live)", () => {
  describe("openai", () => {
    let fixture: Awaited<ReturnType<typeof resolveLiveDirectModel>>;

    beforeAll(async () => {
      fixture = await resolveLiveDirectModel({
        provider: "openai",
        api: "openai-responses",
        envVar: "OPENCLAW_LIVE_OPENAI_CACHE_MODEL",
        preferredModelIds: ["gpt-5.4-mini", "gpt-5.4", "gpt-5.2"],
      });
    }, 120_000);

    it("returns request-id style headers from Responses", async () => {
      const response = await withLiveCacheHeartbeat(
        fetch("https://api.openai.com/v1/responses", {
          method: "POST",
          headers: {
            "content-type": "application/json",
            authorization: `Bearer ${fixture.apiKey}`,
          },
          body: JSON.stringify({
            model: fixture.model.id,
            input: "Reply with OK.",
            max_output_tokens: 32,
          }),
        }),
        "openai headers probe",
      );
      const bodyText = await response.text();
      expect(response.ok, bodyText).toBe(true);

      const requestId = response.headers.get("x-request-id");
      const processingMs = response.headers.get("openai-processing-ms");
      const rateLimitHeaders = [...response.headers.entries()]
        .filter(([key]) => key.startsWith("x-ratelimit-"))
        .map(([key, value]) => `${key}=${value}`);

      logLiveCache(
        `openai headers x-request-id=${requestId ?? "(missing)"} openai-processing-ms=${processingMs ?? "(missing)"} ${rateLimitHeaders.join(" ")}`.trim(),
      );
      expect(requestId).toBeTruthy();
    }, 120_000);
  });

  describe("anthropic", () => {
    let fixture: Awaited<ReturnType<typeof resolveLiveDirectModel>>;

    beforeAll(async () => {
      fixture = await resolveLiveDirectModel({
        provider: "anthropic",
        api: "anthropic-messages",
        envVar: "OPENCLAW_LIVE_ANTHROPIC_CACHE_MODEL",
        preferredModelIds: ["claude-sonnet-4-6", "claude-sonnet-4-5", "claude-haiku-3-5"],
      });
    }, 120_000);

    it("returns request-id from Messages", async () => {
      const response = await withLiveCacheHeartbeat(
        fetch("https://api.anthropic.com/v1/messages", {
          method: "POST",
          headers: {
            "content-type": "application/json",
            "x-api-key": fixture.apiKey,
            "anthropic-version": "2023-06-01",
          },
          body: JSON.stringify({
            model: fixture.model.id,
            max_tokens: 32,
            messages: [{ role: "user", content: "Reply with OK." }],
          }),
        }),
        "anthropic headers probe",
      );
      const bodyText = await response.text();
      expect(response.ok, bodyText).toBe(true);

      const requestId = response.headers.get("request-id");
      logLiveCache(`anthropic headers request-id=${requestId ?? "(missing)"}`);
      expect(requestId).toBeTruthy();
    }, 120_000);
  });
});