mirror of
https://github.com/moltbot/moltbot.git
synced 2026-03-08 06:54:24 +00:00
docs: detail per-agent prompt caching configuration
This commit is contained in:
@@ -15,13 +15,13 @@ Session pruning trims **old tool results** from the in-memory context right befo
|
||||
- When `mode: "cache-ttl"` is enabled and the last Anthropic call for the session is older than `ttl`.
|
||||
- Only affects the messages sent to the model for that request.
|
||||
- Only active for Anthropic API calls (and OpenRouter Anthropic models).
|
||||
- For best results, match `ttl` to your model `cacheControlTtl`.
|
||||
- For best results, match `ttl` to your model `cacheRetention` policy (`short` = 5m, `long` = 1h).
|
||||
- After a prune, the TTL window resets so subsequent requests keep cache until `ttl` expires again.
|
||||
|
||||
## Smart defaults (Anthropic)
|
||||
|
||||
- **OAuth or setup-token** profiles: enable `cache-ttl` pruning and set heartbeat to `1h`.
|
||||
- **API key** profiles: enable `cache-ttl` pruning, set heartbeat to `30m`, and default `cacheControlTtl` to `1h` on Anthropic models.
|
||||
- **API key** profiles: enable `cache-ttl` pruning, set heartbeat to `30m`, and default `cacheRetention: "short"` on Anthropic models.
|
||||
- If you set any of these values explicitly, OpenClaw does **not** override them.
|
||||
|
||||
## What this improves (cost + cache behavior)
|
||||
@@ -91,9 +91,7 @@ Default (off):
|
||||
|
||||
```json5
|
||||
{
|
||||
agent: {
|
||||
contextPruning: { mode: "off" },
|
||||
},
|
||||
agents: { defaults: { contextPruning: { mode: "off" } } },
|
||||
}
|
||||
```
|
||||
|
||||
@@ -101,9 +99,7 @@ Enable TTL-aware pruning:
|
||||
|
||||
```json5
|
||||
{
|
||||
agent: {
|
||||
contextPruning: { mode: "cache-ttl", ttl: "5m" },
|
||||
},
|
||||
agents: { defaults: { contextPruning: { mode: "cache-ttl", ttl: "5m" } } },
|
||||
}
|
||||
```
|
||||
|
||||
@@ -111,10 +107,12 @@ Restrict pruning to specific tools:
|
||||
|
||||
```json5
|
||||
{
|
||||
agent: {
|
||||
contextPruning: {
|
||||
mode: "cache-ttl",
|
||||
tools: { allow: ["exec", "read"], deny: ["*image*"] },
|
||||
agents: {
|
||||
defaults: {
|
||||
contextPruning: {
|
||||
mode: "cache-ttl",
|
||||
tools: { allow: ["exec", "read"], deny: ["*image*"] },
|
||||
},
|
||||
},
|
||||
},
|
||||
}
|
||||
|
||||
@@ -725,7 +725,8 @@ Time format in system prompt. Default: `auto` (OS preference).
|
||||
- Used by the `image` tool path as its vision-model config.
|
||||
- Also used as fallback routing when the selected/default model cannot accept image input.
|
||||
- `model.primary`: format `provider/model` (e.g. `anthropic/claude-opus-4-6`). If you omit the provider, OpenClaw assumes `anthropic` (deprecated).
|
||||
- `models`: the configured model catalog and allowlist for `/model`. Each entry can include `alias` (shortcut) and `params` (provider-specific: `temperature`, `maxTokens`).
|
||||
- `models`: the configured model catalog and allowlist for `/model`. Each entry can include `alias` (shortcut) and `params` (provider-specific, for example `temperature`, `maxTokens`, `cacheRetention`, `context1m`).
|
||||
- `params` merge precedence (config): `agents.defaults.models["provider/model"].params` is the base, then `agents.list[].params` (matching agent id) overrides by key.
|
||||
- Config writers that mutate these fields (for example `/models set`, `/models set-image`, and fallback add/remove commands) save canonical object form and preserve existing fallback lists when possible.
|
||||
- `maxConcurrent`: max parallel agent runs across sessions (each session still serialized). Default: 1.
|
||||
|
||||
@@ -1050,6 +1051,7 @@ scripts/sandbox-browser-setup.sh # optional browser image
|
||||
workspace: "~/.openclaw/workspace",
|
||||
agentDir: "~/.openclaw/agents/main/agent",
|
||||
model: "anthropic/claude-opus-4-6", // or { primary, fallbacks }
|
||||
params: { cacheRetention: "none" }, // overrides matching defaults.models params by key
|
||||
identity: {
|
||||
name: "Samantha",
|
||||
theme: "helpful sloth",
|
||||
@@ -1074,6 +1076,7 @@ scripts/sandbox-browser-setup.sh # optional browser image
|
||||
- `id`: stable agent id (required).
|
||||
- `default`: when multiple are set, first wins (warning logged). If none set, first list entry is default.
|
||||
- `model`: string form overrides `primary` only; object form `{ primary, fallbacks }` overrides both (`[]` disables global fallbacks). Cron jobs that only override `primary` still inherit default fallbacks unless you set `fallbacks: []`.
|
||||
- `params`: per-agent stream params merged over the selected model entry in `agents.defaults.models`. Use this for agent-specific overrides like `cacheRetention`, `temperature`, or `maxTokens` without duplicating the whole model catalog.
|
||||
- `identity.avatar`: workspace-relative path, `http(s)` URL, or `data:` URI.
|
||||
- `identity` derives defaults: `ackReaction` from `emoji`, `mentionPatterns` from `name`/`emoji`.
|
||||
- `subagents.allowAgents`: allowlist of agent ids for `sessions_spawn` (`["*"]` = any; default: same agent only).
|
||||
|
||||
@@ -67,6 +67,42 @@ Use the `cacheRetention` parameter in your model config:
|
||||
|
||||
When using Anthropic API Key authentication, OpenClaw automatically applies `cacheRetention: "short"` (5-minute cache) for all Anthropic models. You can override this by explicitly setting `cacheRetention` in your config.
|
||||
|
||||
### Per-agent cacheRetention overrides
|
||||
|
||||
Use model-level params as your baseline, then override specific agents via `agents.list[].params`.
|
||||
|
||||
```json5
|
||||
{
|
||||
agents: {
|
||||
defaults: {
|
||||
model: { primary: "anthropic/claude-opus-4-6" },
|
||||
models: {
|
||||
"anthropic/claude-opus-4-6": {
|
||||
params: { cacheRetention: "long" }, // baseline for most agents
|
||||
},
|
||||
},
|
||||
},
|
||||
list: [
|
||||
{ id: "research", default: true },
|
||||
{ id: "alerts", params: { cacheRetention: "none" } }, // override for this agent only
|
||||
],
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
Config merge order for cache-related params:
|
||||
|
||||
1. `agents.defaults.models["provider/model"].params`
|
||||
2. `agents.list[].params` (matching `id`, overrides by key)
|
||||
|
||||
This lets one agent keep a long-lived cache while another agent on the same model disables caching to avoid write costs on bursty/low-reuse traffic.
|
||||
|
||||
### Bedrock Claude notes
|
||||
|
||||
- Anthropic Claude models on Bedrock (`amazon-bedrock/*anthropic.claude*`) accept `cacheRetention` pass-through when configured.
|
||||
- Non-Anthropic Bedrock models are forced to `cacheRetention: "none"` at runtime.
|
||||
- Anthropic API-key smart defaults also seed `cacheRetention: "short"` for Claude-on-Bedrock model refs when no explicit value is set.
|
||||
|
||||
### Legacy parameter
|
||||
|
||||
The older `cacheControlTtl` parameter is still supported for backwards compatibility:
|
||||
|
||||
@@ -88,6 +88,9 @@ Heartbeat can keep the cache **warm** across idle gaps. If your model cache TTL
|
||||
is `1h`, setting the heartbeat interval just under that (e.g., `55m`) can avoid
|
||||
re-caching the full prompt, reducing cache write costs.
|
||||
|
||||
In multi-agent setups, you can keep one shared model config and tune cache behavior
|
||||
per agent with `agents.list[].params.cacheRetention`.
|
||||
|
||||
For Anthropic API pricing, cache reads are significantly cheaper than input
|
||||
tokens, while cache writes are billed at a higher multiplier. See Anthropic’s
|
||||
prompt caching pricing for the latest rates and TTL multipliers:
|
||||
@@ -108,6 +111,30 @@ agents:
|
||||
every: "55m"
|
||||
```
|
||||
|
||||
### Example: mixed traffic with per-agent cache strategy
|
||||
|
||||
```yaml
|
||||
agents:
|
||||
defaults:
|
||||
model:
|
||||
primary: "anthropic/claude-opus-4-6"
|
||||
models:
|
||||
"anthropic/claude-opus-4-6":
|
||||
params:
|
||||
cacheRetention: "long" # default baseline for most agents
|
||||
list:
|
||||
- id: "research"
|
||||
default: true
|
||||
heartbeat:
|
||||
every: "55m" # keep long cache warm for deep sessions
|
||||
- id: "alerts"
|
||||
params:
|
||||
cacheRetention: "none" # avoid cache writes for bursty notifications
|
||||
```
|
||||
|
||||
`agents.list[].params` merges on top of the selected model's `params`, so you can
|
||||
override only `cacheRetention` and inherit other model defaults unchanged.
|
||||
|
||||
### Example: enable Anthropic 1M context beta header
|
||||
|
||||
Anthropic's 1M context window is currently beta-gated. OpenClaw can inject the
|
||||
|
||||
Reference in New Issue
Block a user