docs: refresh status cache fallback refs

Peter Steinberger
2026-04-04 12:39:02 +01:00
parent ac254f50e8
commit c0d509e794
6 changed files with 21 additions and 0 deletions

View File

@@ -22,6 +22,7 @@ Notes:
- `--deep` runs live probes (WhatsApp Web + Telegram + Discord + Slack + Signal).
- `--usage` prints normalized provider usage windows as `X% left`.
- MiniMax's raw `usage_percent` / `usagePercent` fields report remaining quota, so OpenClaw inverts them before display; count-based fields win when present. For `model_remains` responses, OpenClaw prefers the chat-model entry, derives the window label from timestamps when needed, and includes the model name in the plan label.
- When the current session snapshot is missing cache counters, `/status` can backfill `cacheRead` / `cacheWrite` from the most recent transcript usage log. Existing nonzero live cache values still win over transcript fallback values.
- Output includes per-agent session stores when multiple agents are configured.
- Overview includes Gateway + node host service install/runtime status when available.
- Overview includes update channel + git SHA (for source checkouts).
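
The `X% left` normalization described above can be sketched roughly as follows. This is a hypothetical sketch, not OpenClaw's actual code: the interface, field names, and function names are invented for illustration.

```typescript
// Hypothetical provider window shape; OpenClaw's real normalizer may differ.
interface ProviderWindow {
  used?: number;             // raw count of consumed units
  limit?: number;            // raw window limit
  usedPercent?: number;      // consumed quota, 0-100
  remainingPercent?: number; // remaining quota (MiniMax-style usage_percent)
}

// Normalize every provider shape to consumed percent internally.
function toUsedPercent(w: ProviderWindow): number | undefined {
  if (w.used !== undefined && w.limit) {
    return (w.used / w.limit) * 100; // count-based fields win when present
  }
  if (w.usedPercent !== undefined) return w.usedPercent;
  if (w.remainingPercent !== undefined) {
    return 100 - w.remainingPercent; // invert remaining-only percent fields
  }
  return undefined;
}

// Render in the normalized `X% left` form.
function formatWindow(w: ProviderWindow): string {
  const used = toUsedPercent(w);
  return used === undefined ? "unknown" : `${Math.round(100 - used)}% left`;
}
```

The double inversion for remaining-only fields is deliberate: everything is folded into one internal "consumed" representation first, so the display step stays a single `100 - used` regardless of what the upstream API reported.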

View File

@@ -14,6 +14,9 @@ title: "Usage Tracking"
- No estimated costs; only the provider-reported windows.
- Human-readable status output is normalized to `X% left`, even when an
upstream API reports consumed quota, remaining quota, or only raw counts.
- Session-level `/status` cache counters can fall back to the latest transcript
usage entry when the live session snapshot is missing `cacheRead` /
`cacheWrite`. Existing nonzero live cache values still win.

## Where it shows up

View File

@@ -18,6 +18,9 @@ OpenClaw features that can generate provider usage or paid API calls.
- `/status` shows the current session model, context usage, and last response tokens.
- If the model uses **API-key auth**, `/status` also shows **estimated cost** for the last reply.
- If live session metadata is missing cache counters, `/status` can recover
`cacheRead` / `cacheWrite` from the latest transcript usage entry. Existing
nonzero live cache values still take precedence.

**Per-message cost footer**

View File

@@ -11,6 +11,11 @@ read_when:
Prompt caching means the model provider can reuse unchanged prompt prefixes (usually system/developer instructions and other stable context) across turns instead of re-processing them every time. OpenClaw normalizes provider usage into `cacheRead` and `cacheWrite` where the upstream API exposes those counters directly.

Status surfaces can also recover cache counters from the most recent transcript
usage log when the live session snapshot is missing them, so `/status` can keep
showing a cache line after partial session metadata loss. Existing nonzero live
cache values still take precedence over transcript fallback values.

Why this matters: lower token cost, faster responses, and more predictable performance for long-running sessions. Without caching, repeated prompts pay the full prompt cost on every turn even when most input did not change.

This page covers all cache-related knobs that affect prompt reuse and token cost.
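The cost claim above can be made concrete with a toy calculation. All numbers here are invented for illustration; they are not OpenClaw's or any provider's actual prices.

```typescript
// Illustrative arithmetic only: assumed prices, not real provider rates.
const turns = 20;
const stablePrefixTokens = 10_000;      // system prompt + stable context
const pricePerInputToken = 3e-6;        // assumed $3 per 1M input tokens
const pricePerCachedReadToken = 3e-7;   // assumed cached reads cost 10% of input

// Without caching: every turn re-pays the full prefix.
const uncached = turns * stablePrefixTokens * pricePerInputToken;

// With caching: pay full price once to write the cache, then cheap reads.
const cached =
  stablePrefixTokens * pricePerInputToken +
  (turns - 1) * stablePrefixTokens * pricePerCachedReadToken;

console.log(`uncached: $${uncached.toFixed(3)}, cached: $${cached.toFixed(3)}`);
```

Under these assumed rates the stable prefix costs several times less over the session, which is the effect the `cacheRead` / `cacheWrite` counters make visible.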
@@ -195,6 +200,10 @@ agents:
OpenClaw exposes dedicated cache-trace diagnostics for embedded agent runs.

For normal user-facing diagnostics, `/status` and other usage summaries can use
the latest transcript usage entry as a fallback source for `cacheRead` /
`cacheWrite` when the live session entry does not have those counters.
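
The fallback precedence described here can be sketched as follows. The names below are hypothetical, not OpenClaw's actual API; the rule they implement is the documented one: nonzero live values win, and only missing or zero counters are backfilled from the transcript.

```typescript
// Hypothetical types for the /status cache-counter fallback sketch.
interface CacheUsage {
  cacheRead?: number;
  cacheWrite?: number;
}

// Backfill missing (or zero) live cache counters from the most recent
// transcript usage entry; existing nonzero live values always win.
function resolveCacheCounters(
  live: CacheUsage,
  transcriptFallback?: CacheUsage,
): CacheUsage {
  const pick = (liveValue?: number, fallback?: number): number | undefined =>
    liveValue ? liveValue : fallback; // 0 and undefined both fall back
  return {
    cacheRead: pick(live.cacheRead, transcriptFallback?.cacheRead),
    cacheWrite: pick(live.cacheWrite, transcriptFallback?.cacheWrite),
  };
}
```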

## Live regression tests
OpenClaw keeps one combined live cache regression gate for repeated prefixes, tool turns, image turns, MCP-style tool transcripts, and an Anthropic no-cache control.

View File

@@ -68,6 +68,10 @@ field names do not change `/status`, `/usage`, or session summaries.
Gemini CLI JSON usage is normalized too: reply text comes from `response`, and
`stats.cached` maps to `cacheRead`, with `stats.input_tokens - stats.cached`
used when the CLI omits an explicit `stats.input` field.

When the current session snapshot is missing cache counters, `/status` can also
recover `cacheRead` / `cacheWrite` from the most recent transcript usage log.
Existing nonzero live cache values still take precedence over transcript
fallback values.
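
The Gemini CLI mapping above can be sketched like this. The interface and function are illustrative assumptions, not OpenClaw's real normalizer; only the field relationships come from the text.

```typescript
// Hypothetical shape of the Gemini CLI `stats` block.
interface GeminiStats {
  input_tokens: number;
  cached: number;
  input?: number; // sometimes omitted by the CLI
}

// Map Gemini CLI stats onto the normalized counters described above.
function normalizeGeminiUsage(stats: GeminiStats) {
  return {
    cacheRead: stats.cached,
    // Fall back to input_tokens - cached when stats.input is omitted.
    input: stats.input ?? stats.input_tokens - stats.cached,
  };
}
```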

## Cost estimation (when shown)

View File

@@ -183,6 +183,7 @@ of treating `/tools` as a static catalog.

## Usage surfaces (what shows where)
- **Provider usage/quota** (example: “Claude 80% left”) shows up in `/status` for the current model provider when usage tracking is enabled. OpenClaw normalizes provider windows to `% left`; for MiniMax, remaining-only percent fields are inverted before display, and `model_remains` responses prefer the chat-model entry plus a model-tagged plan label.
- **Cache usage line** in `/status` can fall back to the latest transcript usage entry when the live session snapshot is missing `cacheRead` / `cacheWrite`. Existing nonzero live cache values still win.
- **Per-response tokens/cost** is controlled by `/usage off|tokens|full` (appended to normal replies).
- `/model status` is about **models/auth/endpoints**, not usage.