test: add live cache provider probes

This commit is contained in:
Peter Steinberger
2026-04-04 12:46:00 +09:00
parent efefa5560d
commit ca99ad0af8
5 changed files with 544 additions and 4 deletions

View File

@@ -26,7 +26,7 @@ openclaw models scan
`openclaw models status` shows the resolved default/fallbacks plus an auth overview.
When provider usage snapshots are available, the OAuth/token status section includes
provider usage windows and quota snapshots.
Add `--probe` to run live auth probes against each configured provider profile.
Probes are real requests (may consume tokens and trigger rate limits).
Use `--agent <id>` to inspect a configured agent's model/auth state. When omitted,

View File

@@ -9,14 +9,18 @@ read_when:
# Prompt caching
Prompt caching means the model provider can reuse unchanged prompt prefixes (usually system/developer instructions and other stable context) across turns instead of re-processing them every time. OpenClaw normalizes provider usage into `cacheRead` and `cacheWrite` where the upstream API exposes those counters directly.
Why this matters: lower token cost, faster responses, and more predictable performance for long-running sessions. Without caching, repeated prompts pay the full prompt cost on every turn even when most input did not change.
This page covers all cache-related knobs that affect prompt reuse and token cost.
Provider references:
- Anthropic prompt caching: [https://platform.claude.com/docs/en/build-with-claude/prompt-caching](https://platform.claude.com/docs/en/build-with-claude/prompt-caching)
- OpenAI prompt caching: [https://developers.openai.com/api/docs/guides/prompt-caching](https://developers.openai.com/api/docs/guides/prompt-caching)
- OpenAI API headers and request IDs: [https://developers.openai.com/api/reference/overview](https://developers.openai.com/api/reference/overview)
- Anthropic request IDs and errors: [https://platform.claude.com/docs/en/api/errors](https://platform.claude.com/docs/en/api/errors)
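As a rough illustration of the cost effect described above, here is a sketch of the prompt-cost arithmetic for a reused stable prefix. The multipliers follow Anthropic's published pricing for the 5-minute ephemeral cache (writes at 1.25x the base input rate, reads at 0.1x) at time of writing; verify against the current pricing page before relying on them, and note the helper itself is illustrative, not part of OpenClaw.

```typescript
// Rough cost sketch: a stable prefix of `prefixTokens` reused over `turns`.
// Multipliers assume Anthropic's 5-minute cache pricing (write = 1.25x base
// input, read = 0.1x); check current provider docs before relying on them.
function promptCostUnits(turns: number, prefixTokens: number, cached: boolean): number {
  if (!cached) {
    return turns * prefixTokens; // every turn re-pays the full input price
  }
  // First turn writes the cache; later turns read it back at a discount.
  return prefixTokens * 1.25 + (turns - 1) * prefixTokens * 0.1;
}

// 10 turns with a 10_000-token prefix:
// uncached: 100_000 input-token units; cached: about 21_500 units.
```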
## Primary knobs
@@ -100,6 +104,16 @@ Per-agent heartbeat is supported at `agents.list[].heartbeat`.
- `cacheRetention` is supported.
- With Anthropic API-key auth profiles, OpenClaw seeds `cacheRetention: "short"` for Anthropic model refs when unset.
- Anthropic native Messages responses expose both `cache_read_input_tokens` and `cache_creation_input_tokens`, so OpenClaw can show both `cacheRead` and `cacheWrite`.
- For native Anthropic requests, `cacheRetention: "short"` maps to the default 5-minute ephemeral cache, and `cacheRetention: "long"` upgrades to the 1-hour TTL only on direct `api.anthropic.com` hosts.
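The counter mapping in the bullets above can be sketched as follows. The field names come from the Anthropic Messages API usage object; the helper name and shape are illustrative, not OpenClaw internals.

```typescript
// Map Anthropic Messages `usage` counters onto the normalized fields
// this page describes. Sketch only; field names match the Messages API.
type AnthropicUsage = {
  input_tokens: number;
  output_tokens: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
};

function normalizeAnthropicUsage(usage: AnthropicUsage) {
  return {
    input: usage.input_tokens,
    output: usage.output_tokens,
    cacheRead: usage.cache_read_input_tokens ?? 0, // tokens served from cache
    cacheWrite: usage.cache_creation_input_tokens ?? 0, // tokens written to cache
  };
}
```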
### OpenAI (direct API)
- Prompt caching is automatic on supported recent models. OpenClaw does not need to inject block-level cache markers.
- OpenClaw uses `prompt_cache_key` to keep cache routing stable across turns and uses `prompt_cache_retention: "24h"` only when `cacheRetention: "long"` is selected on direct OpenAI hosts.
- OpenAI responses expose cached prompt tokens via `usage.prompt_tokens_details.cached_tokens` (or `input_tokens_details.cached_tokens` on Responses API events). OpenClaw maps that to `cacheRead`.
- OpenAI does not expose a separate cache-write token counter, so `cacheWrite` stays `0` on OpenAI paths even when the provider is warming a cache.
- OpenAI returns useful tracing and rate-limit headers such as `x-request-id`, `openai-processing-ms`, and `x-ratelimit-*`, but cache-hit accounting should come from the usage payload, not from headers.
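Reading the cached-token counter across the two OpenAI response shapes mentioned above can be sketched like this. Chat Completions reports it under `prompt_tokens_details`, the Responses API under `input_tokens_details`; `cacheWrite` has no OpenAI counterpart, so it stays `0`. The helper name is illustrative, not OpenClaw internals.

```typescript
// Sketch: extract OpenAI's cached-token counter from either usage shape.
type OpenAiUsage = {
  prompt_tokens_details?: { cached_tokens?: number }; // Chat Completions
  input_tokens_details?: { cached_tokens?: number }; // Responses API
};

function openAiCacheRead(usage: OpenAiUsage): number {
  return (
    usage.prompt_tokens_details?.cached_tokens ??
    usage.input_tokens_details?.cached_tokens ??
    0
  );
}
```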
### Amazon Bedrock
@@ -180,10 +194,15 @@ Defaults:
- Cache trace events are JSONL and include staged snapshots like `session:loaded`, `prompt:before`, `stream:context`, and `session:after`.
- Per-turn cache token impact is visible in normal usage surfaces via `cacheRead` and `cacheWrite` (for example `/usage full` and session usage summaries).
- For Anthropic, expect both `cacheRead` and `cacheWrite` when caching is active.
- For OpenAI, expect `cacheRead` on cache hits and `cacheWrite` to remain `0`; OpenAI does not publish a separate cache-write token field.
- If you need request tracing, log request IDs and rate-limit headers separately from cache metrics. OpenClaw's current cache-trace output is focused on prompt/session shape and normalized token usage rather than raw provider response headers.
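The per-turn cache-hit rate used by the live probes in this commit reduces to cache reads over total prompt tokens:

```typescript
// Cache-hit rate as computed by the live probes in this commit:
// cacheRead / (input + cacheRead + cacheWrite), clamped to 0 when
// there is no prompt volume or no cache read at all.
function cacheHitRate(usage: { input?: number; cacheRead?: number; cacheWrite?: number }): number {
  const input = usage.input ?? 0;
  const cacheRead = usage.cacheRead ?? 0;
  const cacheWrite = usage.cacheWrite ?? 0;
  const totalPrompt = input + cacheRead + cacheWrite;
  return totalPrompt > 0 && cacheRead > 0 ? cacheRead / totalPrompt : 0;
}

// e.g. { input: 200, cacheRead: 1800 } yields 0.9
```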
## Quick troubleshooting
- High `cacheWrite` on most turns: check for volatile system-prompt inputs and verify model/provider supports your cache settings.
- High `cacheWrite` on Anthropic: often means the cache breakpoint is landing on content that changes every request.
- Low OpenAI `cacheRead`: verify the stable prefix is at the front, the repeated prefix is at least 1024 tokens, and the same `prompt_cache_key` is reused for turns that should share a cache.
- No effect from `cacheRetention`: confirm model key matches `agents.defaults.models["provider/model"]`.
- Bedrock Nova/Mistral requests with cache settings: expected; the runtime forces cache settings to `none` for these models.

View File

@@ -0,0 +1,194 @@
import { completeSimple, type Api, type AssistantMessage, type Model } from "@mariozechner/pi-ai";
import { loadConfig } from "../config/config.js";
import { isTruthyEnvValue } from "../infra/env.js";
import { resolveOpenClawAgentDir } from "./agent-paths.js";
import { collectProviderApiKeys } from "./live-auth-keys.js";
import { isLiveTestEnabled } from "./live-test-helpers.js";
import { getApiKeyForModel, requireApiKey } from "./model-auth.js";
import { normalizeProviderId, parseModelRef } from "./model-selection.js";
import { ensureOpenClawModelsJson } from "./models-config.js";
import { discoverAuthStorage, discoverModels } from "./pi-model-discovery.js";
export const LIVE_CACHE_TEST_ENABLED =
isLiveTestEnabled() && isTruthyEnvValue(process.env.OPENCLAW_LIVE_CACHE_TEST);
const DEFAULT_HEARTBEAT_MS = 20_000;
const DEFAULT_TIMEOUT_MS = 90_000;
type LiveResolvedModel = {
apiKey: string;
model: Model<Api>;
};
function toInt(value: string | undefined, fallback: number): number {
const trimmed = value?.trim();
if (!trimmed) {
return fallback;
}
const parsed = Number.parseInt(trimmed, 10);
return Number.isFinite(parsed) ? parsed : fallback;
}
export function logLiveCache(message: string): void {
process.stderr.write(`[live-cache] ${message}\n`);
}
export async function withLiveCacheHeartbeat<T>(
operation: Promise<T>,
context: string,
): Promise<T> {
const heartbeatMs = Math.max(
1_000,
toInt(process.env.OPENCLAW_LIVE_HEARTBEAT_MS, DEFAULT_HEARTBEAT_MS),
);
const startedAt = Date.now();
let heartbeatCount = 0;
const timer = setInterval(() => {
heartbeatCount += 1;
logLiveCache(
`${context}: still running (${Math.max(1, Math.round((Date.now() - startedAt) / 1_000))}s)`,
);
}, heartbeatMs);
timer.unref?.();
try {
return await operation;
} finally {
clearInterval(timer);
if (heartbeatCount > 0) {
logLiveCache(
`${context}: completed (${Math.max(1, Math.round((Date.now() - startedAt) / 1_000))}s)`,
);
}
}
}
export async function completeSimpleWithLiveTimeout<TApi extends Api>(
model: Model<TApi>,
context: Parameters<typeof completeSimple<TApi>>[1],
options: Parameters<typeof completeSimple<TApi>>[2],
progressContext: string,
timeoutMs = Math.max(
1_000,
toInt(process.env.OPENCLAW_LIVE_MODEL_TIMEOUT_MS, DEFAULT_TIMEOUT_MS),
),
): Promise<AssistantMessage> {
const controller = new AbortController();
const abortTimer = setTimeout(() => controller.abort(), timeoutMs);
abortTimer.unref?.();
let hardTimer: ReturnType<typeof setTimeout> | undefined;
const timeout = new Promise<never>((_, reject) => {
hardTimer = setTimeout(() => {
reject(new Error(`${progressContext} timed out after ${timeoutMs}ms`));
}, timeoutMs);
hardTimer.unref?.();
});
try {
return await withLiveCacheHeartbeat(
Promise.race([
completeSimple(model, context, {
...options,
signal: controller.signal,
}),
timeout,
]),
progressContext,
);
} finally {
clearTimeout(abortTimer);
if (hardTimer) {
clearTimeout(hardTimer);
}
}
}
export function buildStableCachePrefix(tag: string, sections = 160): string {
const lines = [
`Stable cache prefix for ${tag}.`,
"Preserve this prefix byte-for-byte across retries.",
"Return only the requested marker from the final user message.",
];
for (let index = 0; index < sections; index += 1) {
lines.push(
`Section ${index + 1}: deterministic cache prose with repeated lexical material about routing, invariants, transcript stability, prefix locality, provider usage accounting, and session affinity.`,
);
}
return lines.join("\n");
}
export function extractAssistantText(message: AssistantMessage): string {
return message.content
.filter((block) => block.type === "text")
.map((block) => block.text.trim())
.filter(Boolean)
.join(" ");
}
export function computeCacheHitRate(usage: {
input?: number;
cacheRead?: number;
cacheWrite?: number;
}): number {
const input = usage.input ?? 0;
const cacheRead = usage.cacheRead ?? 0;
const cacheWrite = usage.cacheWrite ?? 0;
const totalPrompt = input + cacheRead + cacheWrite;
if (totalPrompt <= 0 || cacheRead <= 0) {
return 0;
}
return cacheRead / totalPrompt;
}
export async function resolveLiveDirectModel(params: {
provider: "anthropic" | "openai";
api: "anthropic-messages" | "openai-responses";
envVar: string;
preferredModelIds: readonly string[];
}): Promise<LiveResolvedModel> {
const cfg = loadConfig();
await ensureOpenClawModelsJson(cfg);
const agentDir = resolveOpenClawAgentDir();
const authStorage = discoverAuthStorage(agentDir);
const models = discoverModels(authStorage, agentDir).getAll();
const rawModel = process.env[params.envVar]?.trim();
const parsed = rawModel ? parseModelRef(rawModel, params.provider) : null;
const candidates = models.filter(
(model) => normalizeProviderId(model.provider) === params.provider && model.api === params.api,
);
let resolvedModel: Model<Api> | undefined;
if (parsed) {
resolvedModel = candidates.find(
(model) =>
normalizeProviderId(model.provider) === parsed.provider && model.id === parsed.model,
);
}
if (!resolvedModel) {
resolvedModel = params.preferredModelIds
.map((id) => candidates.find((model) => model.id === id))
.find(Boolean);
}
if (!resolvedModel) {
throw new Error(
rawModel
? `Model not found for ${params.provider}: ${rawModel}`
: `No ${params.provider} ${params.api} model available in registry.`,
);
}
const liveKeys = collectProviderApiKeys(params.provider);
const apiKey =
liveKeys[0] ??
requireApiKey(
await getApiKeyForModel({
model: resolvedModel,
cfg,
agentDir,
}),
resolvedModel.provider,
);
return {
model: resolvedModel,
apiKey,
};
}

View File

@@ -0,0 +1,234 @@
import type { AssistantMessage } from "@mariozechner/pi-ai";
import { beforeAll, describe, expect, it } from "vitest";
import {
buildStableCachePrefix,
completeSimpleWithLiveTimeout,
computeCacheHitRate,
extractAssistantText,
LIVE_CACHE_TEST_ENABLED,
logLiveCache,
resolveLiveDirectModel,
} from "./live-cache-test-support.js";
const describeCacheLive = LIVE_CACHE_TEST_ENABLED ? describe : describe.skip;
const OPENAI_TIMEOUT_MS = 120_000;
const ANTHROPIC_TIMEOUT_MS = 120_000;
const OPENAI_SESSION_ID = "live-cache-openai-stable-session";
const ANTHROPIC_SESSION_ID = "live-cache-anthropic-stable-session";
const OPENAI_PREFIX = buildStableCachePrefix("openai");
const ANTHROPIC_PREFIX = buildStableCachePrefix("anthropic");
type CacheRun = {
hitRate: number;
suffix: string;
text: string;
usage: AssistantMessage["usage"];
};
async function runOpenAiCacheProbe(params: {
apiKey: string;
model: Awaited<ReturnType<typeof resolveLiveDirectModel>>["model"];
sessionId: string;
suffix: string;
}): Promise<CacheRun> {
const response = await completeSimpleWithLiveTimeout(
params.model,
{
systemPrompt: OPENAI_PREFIX,
messages: [
{
role: "user",
content: `Reply with exactly CACHE-OK ${params.suffix}.`,
timestamp: Date.now(),
},
],
},
{
apiKey: params.apiKey,
cacheRetention: "short",
sessionId: params.sessionId,
maxTokens: 32,
temperature: 0,
reasoning: "none",
},
`openai cache probe ${params.suffix}`,
OPENAI_TIMEOUT_MS,
);
const text = extractAssistantText(response);
expect(text.toLowerCase()).toContain(params.suffix.toLowerCase());
return {
suffix: params.suffix,
text,
usage: response.usage,
hitRate: computeCacheHitRate(response.usage),
};
}
async function runAnthropicCacheProbe(params: {
apiKey: string;
model: Awaited<ReturnType<typeof resolveLiveDirectModel>>["model"];
sessionId: string;
suffix: string;
cacheRetention: "none" | "short" | "long";
}): Promise<CacheRun> {
const response = await completeSimpleWithLiveTimeout(
params.model,
{
systemPrompt: ANTHROPIC_PREFIX,
messages: [
{
role: "user",
content: `Reply with exactly CACHE-OK ${params.suffix}.`,
timestamp: Date.now(),
},
],
},
{
apiKey: params.apiKey,
cacheRetention: params.cacheRetention,
sessionId: params.sessionId,
maxTokens: 32,
temperature: 0,
},
`anthropic cache probe ${params.suffix} (${params.cacheRetention})`,
ANTHROPIC_TIMEOUT_MS,
);
const text = extractAssistantText(response);
expect(text.toLowerCase()).toContain(params.suffix.toLowerCase());
return {
suffix: params.suffix,
text,
usage: response.usage,
hitRate: computeCacheHitRate(response.usage),
};
}
describeCacheLive("pi embedded runner prompt caching (live)", () => {
describe("openai", () => {
let fixture: Awaited<ReturnType<typeof resolveLiveDirectModel>>;
beforeAll(async () => {
fixture = await resolveLiveDirectModel({
provider: "openai",
api: "openai-responses",
envVar: "OPENCLAW_LIVE_OPENAI_CACHE_MODEL",
preferredModelIds: ["gpt-5.4-mini", "gpt-5.4", "gpt-5.2"],
});
logLiveCache(`openai model=${fixture.model.provider}/${fixture.model.id}`);
}, 120_000);
it(
"hits a high cache-read rate on repeated stable prefixes",
async () => {
const warmup = await runOpenAiCacheProbe({
...fixture,
sessionId: OPENAI_SESSION_ID,
suffix: "warmup",
});
logLiveCache(
`openai warmup cacheRead=${warmup.usage.cacheRead} input=${warmup.usage.input} rate=${warmup.hitRate.toFixed(3)}`,
);
const hitRuns = [
await runOpenAiCacheProbe({
...fixture,
sessionId: OPENAI_SESSION_ID,
suffix: "hit-a",
}),
await runOpenAiCacheProbe({
...fixture,
sessionId: OPENAI_SESSION_ID,
suffix: "hit-b",
}),
];
const bestHit = hitRuns.reduce((best, candidate) =>
(candidate.usage.cacheRead ?? 0) > (best.usage.cacheRead ?? 0) ? candidate : best,
);
logLiveCache(
`openai best-hit suffix=${bestHit.suffix} cacheRead=${bestHit.usage.cacheRead} input=${bestHit.usage.input} rate=${bestHit.hitRate.toFixed(3)}`,
);
expect(bestHit.usage.cacheRead ?? 0).toBeGreaterThan(1_024);
expect(bestHit.hitRate).toBeGreaterThanOrEqual(0.7);
},
6 * 60_000,
);
});
describe("anthropic", () => {
let fixture: Awaited<ReturnType<typeof resolveLiveDirectModel>>;
beforeAll(async () => {
fixture = await resolveLiveDirectModel({
provider: "anthropic",
api: "anthropic-messages",
envVar: "OPENCLAW_LIVE_ANTHROPIC_CACHE_MODEL",
preferredModelIds: ["claude-sonnet-4-6", "claude-sonnet-4-5", "claude-haiku-3-5"],
});
logLiveCache(`anthropic model=${fixture.model.provider}/${fixture.model.id}`);
}, 120_000);
it(
"writes cache on warmup and reads it back on repeated stable prefixes",
async () => {
const warmup = await runAnthropicCacheProbe({
...fixture,
sessionId: ANTHROPIC_SESSION_ID,
suffix: "warmup",
cacheRetention: "short",
});
logLiveCache(
`anthropic warmup cacheWrite=${warmup.usage.cacheWrite} cacheRead=${warmup.usage.cacheRead} input=${warmup.usage.input} rate=${warmup.hitRate.toFixed(3)}`,
);
expect(warmup.usage.cacheWrite ?? 0).toBeGreaterThan(0);
const hitRuns = [
await runAnthropicCacheProbe({
...fixture,
sessionId: ANTHROPIC_SESSION_ID,
suffix: "hit-a",
cacheRetention: "short",
}),
await runAnthropicCacheProbe({
...fixture,
sessionId: ANTHROPIC_SESSION_ID,
suffix: "hit-b",
cacheRetention: "short",
}),
];
const bestHit = hitRuns.reduce((best, candidate) =>
(candidate.usage.cacheRead ?? 0) > (best.usage.cacheRead ?? 0) ? candidate : best,
);
logLiveCache(
`anthropic best-hit suffix=${bestHit.suffix} cacheWrite=${bestHit.usage.cacheWrite} cacheRead=${bestHit.usage.cacheRead} input=${bestHit.usage.input} rate=${bestHit.hitRate.toFixed(3)}`,
);
expect(bestHit.usage.cacheRead ?? 0).toBeGreaterThan(1_024);
expect(bestHit.hitRate).toBeGreaterThanOrEqual(0.7);
},
6 * 60_000,
);
it(
"does not report meaningful cache activity when retention is disabled",
async () => {
const disabled = await runAnthropicCacheProbe({
...fixture,
sessionId: `${ANTHROPIC_SESSION_ID}-disabled`,
suffix: "no-cache",
cacheRetention: "none",
});
logLiveCache(
`anthropic none cacheWrite=${disabled.usage.cacheWrite} cacheRead=${disabled.usage.cacheRead} input=${disabled.usage.input}`,
);
expect(disabled.usage.cacheRead ?? 0).toBeLessThanOrEqual(32);
expect(disabled.usage.cacheWrite ?? 0).toBeLessThanOrEqual(32);
},
3 * 60_000,
);
});
});

View File

@@ -0,0 +1,93 @@
import { beforeAll, describe, expect, it } from "vitest";
import {
LIVE_CACHE_TEST_ENABLED,
logLiveCache,
resolveLiveDirectModel,
withLiveCacheHeartbeat,
} from "./live-cache-test-support.js";
const describeLive = LIVE_CACHE_TEST_ENABLED ? describe : describe.skip;
describeLive("provider response headers (live)", () => {
describe("openai", () => {
let fixture: Awaited<ReturnType<typeof resolveLiveDirectModel>>;
beforeAll(async () => {
fixture = await resolveLiveDirectModel({
provider: "openai",
api: "openai-responses",
envVar: "OPENCLAW_LIVE_OPENAI_CACHE_MODEL",
preferredModelIds: ["gpt-5.4-mini", "gpt-5.4", "gpt-5.2"],
});
}, 120_000);
it("returns request-id style headers from Responses", async () => {
const response = await withLiveCacheHeartbeat(
fetch("https://api.openai.com/v1/responses", {
method: "POST",
headers: {
"content-type": "application/json",
authorization: `Bearer ${fixture.apiKey}`,
},
body: JSON.stringify({
model: fixture.model.id,
input: "Reply with OK.",
max_output_tokens: 32,
}),
}),
"openai headers probe",
);
const bodyText = await response.text();
expect(response.ok, bodyText).toBe(true);
const requestId = response.headers.get("x-request-id");
const processingMs = response.headers.get("openai-processing-ms");
const rateLimitHeaders = [...response.headers.entries()]
.filter(([key]) => key.startsWith("x-ratelimit-"))
.map(([key, value]) => `${key}=${value}`);
logLiveCache(
`openai headers x-request-id=${requestId ?? "(missing)"} openai-processing-ms=${processingMs ?? "(missing)"} ${rateLimitHeaders.join(" ")}`.trim(),
);
expect(requestId).toBeTruthy();
}, 120_000);
});
describe("anthropic", () => {
let fixture: Awaited<ReturnType<typeof resolveLiveDirectModel>>;
beforeAll(async () => {
fixture = await resolveLiveDirectModel({
provider: "anthropic",
api: "anthropic-messages",
envVar: "OPENCLAW_LIVE_ANTHROPIC_CACHE_MODEL",
preferredModelIds: ["claude-sonnet-4-6", "claude-sonnet-4-5", "claude-haiku-3-5"],
});
}, 120_000);
it("returns request-id from Messages", async () => {
const response = await withLiveCacheHeartbeat(
fetch("https://api.anthropic.com/v1/messages", {
method: "POST",
headers: {
"content-type": "application/json",
"x-api-key": fixture.apiKey,
"anthropic-version": "2023-06-01",
},
body: JSON.stringify({
model: fixture.model.id,
max_tokens: 32,
messages: [{ role: "user", content: "Reply with OK." }],
}),
}),
"anthropic headers probe",
);
const bodyText = await response.text();
expect(response.ok, bodyText).toBe(true);
const requestId = response.headers.get("request-id");
logLiveCache(`anthropic headers request-id=${requestId ?? "(missing)"}`);
expect(requestId).toBeTruthy();
}, 120_000);
});
});