docs: simplify sessions/memory concept pages and fix QMD experimental label

This commit is contained in:
Vincent Koc
2026-03-30 07:32:00 +09:00
parent 57069f2b2f
commit 143b4c54ba
7 changed files with 336 additions and 1074 deletions


@@ -1,69 +1,50 @@
---
summary: "How OpenClaw summarizes long conversations to stay within model limits"
read_when:
- You want to understand auto-compaction and /compact
- You are debugging long sessions hitting context limits
- You want to tune compaction behavior or use a custom context engine
title: "Compaction"
---
# Compaction
Every model has a context window -- the maximum number of tokens it can process.
When a conversation approaches that limit, OpenClaw **compacts** older messages
into a summary so the chat can continue.
## How it works
Compaction is a three-step process:
1. **Summarize** older conversation turns into a compact summary.
2. **Persist** the summary as a `compaction` entry in the session transcript
(JSONL).
3. **Keep** recent messages after the compaction point intact.
The full conversation history stays on disk. Compaction only changes what the
model sees on the next turn.
## Auto-compaction
Auto-compaction is on by default. It runs when the session nears the context
limit, or when the model returns a context-overflow error (in which case
OpenClaw compacts and retries).
When auto-compaction runs you will see:
- `Auto-compaction complete` in verbose mode
- `/status` showing `Compactions: <count>`
### Pre-compaction memory flush
<Info>
Before compacting, OpenClaw automatically reminds the agent to save important
notes to [memory](/concepts/memory) files. This prevents context loss.
</Info>
## Manual compaction
Type `/compact` in any chat to force a compaction. Add instructions to guide
the summary:
```
/compact Focus on the API design decisions
```
## Using a different model
By default, compaction uses your agent's primary model. You can use a more
capable model for better summaries:
```json5
{
@@ -77,106 +58,29 @@ small or local and you want a more capable summarizer:
}
```
### Reserve tokens and floor
- `reserveTokens` -- headroom reserved for prompts and the next model output
(Pi runtime default: `16384`).
- `reserveTokensFloor` -- minimum reserve enforced by OpenClaw (default:
`20000`). Set to `0` to disable.
- `keepRecentTokens` -- how many tokens of recent conversation to preserve
during compaction (default: `20000`).
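Taken together, a hypothetical sketch of these settings (the exact placement under `agents.defaults.compaction` is an assumption, mirroring the memory-flush example on this page):

```json5
{
  agents: {
    defaults: {
      compaction: {
        reserveTokens: 16384,      // headroom for prompts and the next output
        reserveTokensFloor: 20000, // minimum reserve enforced by OpenClaw
        keepRecentTokens: 20000,   // recent conversation preserved verbatim
      },
    },
  },
}
```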
### Identifier preservation
Compaction summaries preserve opaque identifiers by default
(`identifierPolicy: "strict"`). Override with:
- `"off"` -- no special identifier handling.
- `"custom"` -- provide your own instructions via `identifierInstructions`.
### Memory flush
```json5
{
agents: {
defaults: {
compaction: {
memoryFlush: {
enabled: true, // default
softThresholdTokens: 4000,
systemPrompt: "Session nearing compaction. Store durable memories now.",
prompt: "Write any lasting notes to memory/YYYY-MM-DD.md; reply with NO_REPLY if nothing to store.",
},
},
},
},
}
```
The flush triggers when context usage crosses
`contextWindow - reserveTokensFloor - softThresholdTokens`. It runs silently
(the user sees nothing) and is skipped when the workspace is read-only.
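As a worked example of that trigger formula (the 200k-token context window is illustrative; the other two numbers are the defaults above):

```python
# Illustrative numbers: a 200k-token context window with default settings.
context_window = 200_000
reserve_tokens_floor = 20_000   # default
soft_threshold_tokens = 4_000   # default

# The flush fires once estimated context usage crosses this line:
flush_threshold = context_window - reserve_tokens_floor - soft_threshold_tokens
print(flush_threshold)  # 176000
```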
## Compaction vs pruning
| | Compaction | Pruning |
| ---------------- | ----------------------------- | -------------------------------- |
| **What it does** | Summarizes older conversation | Trims old tool results |
| **Saved?** | Yes (in session transcript) | No (in-memory only, per request) |
| **Scope** | Entire conversation | Tool results only |
See [Session Pruning](/concepts/session-pruning) for pruning details.
## OpenAI server-side compaction
OpenClaw also supports OpenAI Responses server-side compaction for compatible
direct OpenAI models. This is separate from local compaction and can run
alongside it:
- **Local compaction** -- OpenClaw summarizes and persists into session JSONL.
- **Server-side compaction** -- OpenAI compacts context on the provider side when
`store` + `context_management` are enabled.
See [OpenAI provider](/providers/openai) for model params and overrides.
## Custom context engines
Compaction behavior is owned by the active
[context engine](/concepts/context-engine). The built-in engine uses the
summarization described above. Plugin engines (selected via
`plugins.slots.contextEngine`) can implement any strategy -- DAG summaries,
vector retrieval, incremental condensation, etc.
When a plugin engine sets `ownsCompaction: true`, OpenClaw delegates all
compaction decisions to the engine and does not run built-in auto-compaction.
When `ownsCompaction` is `false` or unset, the built-in auto-compaction still
runs, but the engine's `compact()` method handles `/compact` and overflow
recovery. If you are building a non-owning engine, implement `compact()` by
calling `delegateCompactionToRuntime(...)` from `openclaw/plugin-sdk/core`.
[Session pruning](/concepts/session-pruning) is a lighter-weight complement that
trims tool output without summarizing.
## Troubleshooting
**Compacting too often?** The model's context window may be small, or tool
outputs may be large. Try enabling
[session pruning](/concepts/session-pruning).
**Context feels stale after compaction?** Use `/compact Focus on <topic>` to
guide the summary, or enable the [memory flush](/concepts/memory) so notes
survive.
**Need a clean slate?** `/new` starts a fresh session without compacting.
For advanced configuration (reserve tokens, identifier preservation, custom
context engines, OpenAI server-side compaction), see the
[Session Management Deep Dive](/reference/session-management-compaction).


@@ -1,171 +1,92 @@
---
title: "Memory Search"
summary: "How memory search finds relevant notes using embeddings and hybrid retrieval"
read_when:
- You want to understand how memory_search works
- You want to choose an embedding provider
- You want to tune search quality
---
# Memory Search
`memory_search` finds relevant notes from your memory files, even when the
wording differs from the original text. It works by indexing memory into small
chunks and searching them using embeddings, keywords, or both.
## Quick start
If you have an OpenAI, Gemini, Voyage, or Mistral API key configured, memory
search works automatically. To set a provider explicitly:
```json5
{
agents: {
defaults: {
memorySearch: {
provider: "openai", // or "gemini", "local", "ollama", etc.
},
},
},
}
```
For local embeddings with no API key, use `provider: "local"` (requires
node-llama-cpp).
## Supported providers
| Provider | ID | Needs API key | Notes |
| -------- | --------- | ------------- | ----------------------------- |
| OpenAI | `openai` | Yes | Auto-detected, fast |
| Gemini | `gemini` | Yes | Supports image/audio indexing |
| Voyage | `voyage` | Yes | Auto-detected |
| Mistral | `mistral` | Yes | Auto-detected |
| Ollama | `ollama` | No | Local, must set explicitly |
| Local | `local` | No | GGUF model, ~0.6 GB download |
## How search works
OpenClaw runs two retrieval paths in parallel and merges the results:
```
Query -> Embedding -> Vector Search ─┐
├─> Merge -> Top Results
Query -> Tokenize -> BM25 Search ──┘
```
- **Vector search** finds notes with similar meaning ("gateway host" matches
"the machine running OpenClaw").
- **BM25 keyword search** finds exact matches (IDs, error strings, config
keys).
If only one path is available (no embeddings or no FTS), the other runs alone.
## Improving search quality
Two optional features help when you have a large note history:
### Temporal decay
Old notes gradually lose ranking weight so recent information surfaces first.
With the default half-life of 30 days, a note from last month scores at 50% of
its original weight. Evergreen files like `MEMORY.md` are never decayed.
<Tip>
Enable temporal decay if your agent has months of daily notes and stale
information keeps outranking recent context.
</Tip>
### MMR (diversity)
Reduces redundant results. If five notes all mention the same router config, MMR
ensures the top results cover different topics instead of repeating.
<Tip>
Enable MMR if `memory_search` keeps returning near-duplicate snippets from
different daily notes.
</Tip>
## Hybrid search (BM25 + vector)
When both FTS5 and embeddings are available, OpenClaw combines two retrieval
signals:
- **Vector similarity** -- semantic matching. Good at paraphrases ("Mac Studio
gateway host" vs "the machine running the gateway").
- **BM25 keyword relevance** -- exact token matching. Good at IDs, code symbols,
error strings, and config keys.
### How scores are merged
1. Retrieve a candidate pool from each side (top
`maxResults x candidateMultiplier`).
2. Convert BM25 rank to a 0-1 score: `textScore = 1 / (1 + max(0, bm25Rank))`.
3. Union candidates by chunk ID and compute:
`finalScore = vectorWeight x vectorScore + textWeight x textScore`.
Weights are normalized to 1.0, so they behave as percentages. If either path is
unavailable, the other runs alone with no hard failure.
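The merge described above can be sketched in a few lines of Python (function name and input shapes are illustrative, not OpenClaw's actual code):

```python
def merge_scores(vector_hits, bm25_hits, vector_weight=0.7, text_weight=0.3):
    """Union candidates by chunk id and blend the two retrieval signals.

    vector_hits: {chunk_id: vector similarity in [0, 1]}
    bm25_hits:   {chunk_id: bm25 rank, where 0 = best match}
    """
    # Normalize weights to sum to 1.0 so they behave as percentages.
    total = vector_weight + text_weight
    vw, tw = vector_weight / total, text_weight / total
    merged = {}
    for cid in set(vector_hits) | set(bm25_hits):
        vector_score = vector_hits.get(cid, 0.0)
        rank = bm25_hits.get(cid)
        # textScore = 1 / (1 + max(0, bm25Rank))
        text_score = 0.0 if rank is None else 1.0 / (1.0 + max(0, rank))
        merged[cid] = vw * vector_score + tw * text_score
    # Highest blended score first.
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)
```

A chunk found only by one path simply scores zero on the other, which is why neither path failing is a hard error.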
### CJK support
FTS5 uses configurable trigram tokenization with a short-substring fallback so
Chinese, Japanese, and Korean text is searchable. CJK-heavy text is weighted
correctly during chunk-size estimation, and surrogate-pair characters are
preserved during fine splits.
## Post-processing
After merging scores, two optional stages refine the result list:
### Temporal decay (recency boost)
Daily notes accumulate over months. Without decay, a well-worded note from six
months ago can outrank yesterday's update on the same topic.
Temporal decay applies an exponential multiplier based on age:
```
decayedScore = score x e^(-lambda x ageInDays)
```
With the default half-life of 30 days:
| Age | Score retained |
| -------- | -------------- |
| Today | 100% |
| 7 days | ~84% |
| 30 days | 50% |
| 90 days | 12.5% |
| 180 days | ~1.6% |
**Evergreen files are never decayed** -- `MEMORY.md` and non-dated files in
`memory/` (like `memory/projects.md`) always rank at full score. Dated daily
files use the date from the filename.
**When to enable:** Your agent has months of daily notes and stale information
outranks recent context.
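The decay curve above follows directly from the half-life (a sketch, not OpenClaw's implementation):

```python
import math

def decayed_score(score: float, age_in_days: float, half_life_days: float = 30.0) -> float:
    # Pick lambda so the score halves every half_life_days:
    # decayedScore = score * e^(-lambda * ageInDays)
    lam = math.log(2) / half_life_days
    return score * math.exp(-lam * age_in_days)

print(round(decayed_score(1.0, 30), 3))  # 0.5
print(round(decayed_score(1.0, 90), 3))  # 0.125
```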
### MMR re-ranking (diversity)
When search returns results, multiple chunks may contain similar or overlapping
content. MMR (Maximal Marginal Relevance) re-ranks results to balance relevance
with diversity.
How it works:
1. Start with the highest-scoring result.
2. Iteratively select the next result that maximizes:
`lambda x relevance - (1 - lambda) x max_similarity_to_already_selected`.
3. Similarity is measured using Jaccard text similarity on tokenized content.
The `lambda` parameter controls the trade-off:
- `1.0` -- pure relevance (no diversity penalty).
- `0.0` -- maximum diversity (ignores relevance).
- Default: `0.7` (balanced, slight relevance bias).
**When to enable:** `memory_search` returns redundant or near-duplicate
snippets, especially with daily notes that repeat similar information.
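A minimal sketch of that selection loop, using Jaccard similarity over whitespace tokens (names and data shapes are illustrative):

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity on lowercase token sets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def mmr_rerank(results, lam=0.7, k=5):
    """results: list of (text, relevance_score), highest score first."""
    remaining = list(results)
    selected = []
    while remaining and len(selected) < k:
        # Score = lam * relevance - (1 - lam) * max similarity to picks so far.
        best = max(
            remaining,
            key=lambda r: lam * r[1]
            - (1 - lam) * max((jaccard(r[0], s[0]) for s in selected), default=0.0),
        )
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `lam` near 1.0 this degenerates to plain relevance ordering; lowering it pushes near-duplicate snippets down the list.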
## Configuration
Both post-processing features and hybrid search weights are configured under
`memorySearch.query.hybrid`:
### Enable both
```json5
{
@@ -174,18 +95,8 @@ Both post-processing features and hybrid search weights are configured under
memorySearch: {
query: {
hybrid: {
mmr: { enabled: true },
temporalDecay: { enabled: true },
},
},
},
@@ -194,55 +105,32 @@ Both post-processing features and hybrid search weights are configured under
}
```
## Multimodal memory
With Gemini Embedding 2, you can index images and audio files alongside
Markdown. Search queries remain text, but they match against visual and audio
content. See the [Memory configuration reference](/reference/memory-config) for
setup.
## Session memory search
You can optionally index session transcripts so `memory_search` can recall
earlier conversations. This is opt-in via
`memorySearch.experimental.sessionMemory`. See the
[configuration reference](/reference/memory-config) for details.
## Troubleshooting
**No results?** Run `openclaw memory status` to check the index. If empty, run
`openclaw memory index --force`.
**Only keyword matches?** Your embedding provider may not be configured. Check
`openclaw memory status --deep`.
**CJK text not found?** Rebuild the FTS index with
`openclaw memory index --force`.
## Further reading
- [Memory](/concepts/memory) -- file layout, backends, tools
- [Memory configuration reference](/reference/memory-config) -- all config knobs
including QMD, batch indexing, embedding cache, sqlite-vec, and multimodal


@@ -1,228 +1,97 @@
---
title: "Memory"
summary: "How OpenClaw remembers things across sessions"
read_when:
- You want to understand how memory works
- You want to know what memory files to write
---
# Memory
OpenClaw remembers things by writing **plain Markdown files** in your agent's
workspace. The model only "remembers" what gets saved to disk -- there is no
hidden state.
## How it works
Your agent has two places to store memories:
- **`MEMORY.md`** -- long-term memory. Durable facts, preferences, and
decisions. Loaded at the start of every DM session.
- **`memory/YYYY-MM-DD.md`** -- daily notes. Running context and observations.
Today and yesterday's notes are loaded automatically.
These files live in the agent workspace (default `~/.openclaw/workspace`).
## When to write memory
- **Decisions, preferences, and durable facts** go to `MEMORY.md`.
- **Day-to-day notes and running context** go to `memory/YYYY-MM-DD.md`.
<Tip>
If you want your agent to remember something, just ask it: "Remember that I
prefer TypeScript." It will write it to the appropriate file.
</Tip>
## Memory tools
The agent has two tools for working with memory:
- **`memory_search`** -- finds relevant notes using semantic search, even when
the wording differs from the original.
- **`memory_get`** -- reads a specific memory file or line range.
Both tools are provided by the active memory plugin (default: `memory-core`).
## Memory search
When an embedding provider is configured, `memory_search` uses **hybrid
search** -- combining vector similarity (semantic meaning) with keyword matching
(exact terms like IDs and code symbols). This works out of the box once you have
an API key for any supported provider.
<Info>
OpenClaw auto-detects your embedding provider from available API keys. If you
have an OpenAI, Gemini, Voyage, or Mistral key configured, memory search is
enabled automatically.
</Info>
For details on how search works, tuning options, and provider setup, see
[Memory Search](/concepts/memory-search).
## Memory backends
OpenClaw has two backends for indexing and searching memory:
**Builtin (default)** -- uses a per-agent SQLite database. Works out of the box
with no extra dependencies. Supports keyword search, vector similarity, and
hybrid search with CJK support.
**QMD** -- a local-first search sidecar that adds reranking, query expansion,
and the ability to index directories outside the workspace (like project docs or
session transcripts). Set `memory.backend = "qmd"` to switch.
## Additional memory paths
Index Markdown files outside the default workspace layout:
```json5
{
agents: {
defaults: {
memorySearch: {
extraPaths: ["../team-docs", "/srv/shared-notes/overview.md"],
},
},
},
}
```
Paths can be absolute or workspace-relative. Directories are scanned
recursively for `.md` files. Symlinks are ignored.
## Multimodal memory (Gemini)
When using `gemini-embedding-2-preview`, OpenClaw can index image and audio
files from `memorySearch.extraPaths`:
```json5
{
agents: {
defaults: {
memorySearch: {
provider: "gemini",
model: "gemini-embedding-2-preview",
extraPaths: ["assets/reference", "voice-notes"],
multimodal: {
enabled: true,
modalities: ["image", "audio"],
},
},
},
},
}
```
Search queries remain text, but Gemini can compare them against indexed
image/audio embeddings. `memory_get` still reads Markdown only.
See the [Memory configuration reference](/reference/memory-config) for backend
setup and all config knobs.
## Automatic memory flush
Before [compaction](/concepts/compaction) summarizes your conversation, OpenClaw
runs a silent turn that reminds the agent to save important context to memory
files. This is on by default -- you do not need to configure anything.
<Tip>
The memory flush prevents context loss during compaction. If your agent has
important facts in the conversation that are not yet written to a file, they
will be saved automatically before the summary happens.
</Tip>
## CLI
```bash
openclaw memory status # Check index status and provider
openclaw memory search "query" # Search from the command line
openclaw memory index --force # Rebuild the index
```
Add `--agent <id>` to target a specific agent, `--deep` for extended
diagnostics, or `--json` for machine-readable output.
See [CLI: memory](/cli/memory) for the full command reference.
## Further reading
- [Memory Search](/concepts/memory-search) -- search pipeline, providers, and
tuning
- [Memory configuration reference](/reference/memory-config) -- all config knobs
for providers, QMD, hybrid search, batch indexing, and multimodal
- [Compaction](/concepts/compaction) -- how compaction interacts with memory
flush
- [Session Management Deep Dive](/reference/session-management-compaction) --
internal session and compaction lifecycle


@@ -1,113 +1,54 @@
---
title: "Session Pruning"
summary: "Trimming old tool results to keep context lean and caching efficient"
read_when:
- You want to reduce context growth from tool outputs
- You want to understand Anthropic prompt cache optimization
---
# Session Pruning
Session pruning trims **old tool results** from the context before each LLM
call. It reduces context bloat from accumulated tool outputs (exec results, file
reads, search results) without touching your conversation messages.
<Info>
Pruning is in-memory only -- it does not modify the on-disk session transcript.
Your full history is always preserved.
</Info>
## Why it matters
Long sessions accumulate tool output that inflates the context window. This
increases cost and can force [compaction](/concepts/compaction) sooner than
necessary.
Pruning is especially valuable for **Anthropic prompt caching**. After the cache
TTL expires, the next request re-caches the full prompt. Pruning reduces the
cache-write size, directly lowering cost.
## How it works
1. Wait for the cache TTL to expire (default 5 minutes).
2. Find old tool results (user and assistant messages are never touched).
3. **Soft-trim** oversized results -- keep the head and tail, insert `...`.
4. **Hard-clear** the rest -- replace with a placeholder.
5. Reset the TTL so follow-up requests reuse the fresh cache.
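The steps above map onto `contextPruning` settings; a minimal sketch using the documented defaults (every value shown is the default, so this block only makes the behavior explicit):

```json5
{
  agents: {
    defaults: {
      contextPruning: {
        mode: "cache-ttl",
        ttl: "5m", // prune only after this idle period
        keepLastAssistants: 3, // protect recent tool results
        softTrim: { maxChars: 4000, headChars: 1500, tailChars: 1500 },
        hardClear: { enabled: true },
      },
    },
  },
}
```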
## Smart defaults
OpenClaw auto-configures pruning for Anthropic profiles:
OpenClaw auto-enables pruning for Anthropic profiles:
| Profile type | Pruning | Heartbeat | Cache retention |
| -------------------- | ------------------- | --------- | ------------------ |
| OAuth or setup-token | `cache-ttl` enabled | `1h` | (provider default) |
| API key | `cache-ttl` enabled | `30m` | `short` (5 min) |
| Profile type | Pruning enabled | Heartbeat |
| -------------------- | --------------- | --------- |
| OAuth or setup-token | Yes | 1 hour |
| API key | Yes | 30 min |
If you set any of these values explicitly, OpenClaw does not override them.
If you set explicit values, OpenClaw does not override them.
Match `ttl` to your model `cacheRetention` policy for best results (`short` =
5 min, `long` = 1 hour).
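For example, if your model is configured for `long` cache retention (1 hour), raise the pruning `ttl` to match so pruning does not fire while the cache is still warm:

```json5
{
  agents: {
    defaults: {
      contextPruning: {
        mode: "cache-ttl",
        ttl: "1h", // matches the 1-hour "long" cache retention window
      },
    },
  },
}
```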
## Enable or disable
## Pruning vs compaction
| | Pruning | Compaction |
| -------------- | --------------------------------- | ------------------------------- |
| **What** | Trims tool result messages | Summarizes conversation history |
| **Persisted?** | No (in-memory, per request) | Yes (in JSONL transcript) |
| **Scope** | Tool results only | Entire conversation |
| **Trigger** | Every LLM call (when TTL expired) | Context window threshold |
Built-in tools already truncate their own output. Pruning is an additional layer
that prevents long-running chats from accumulating too much tool output over
time. See [Compaction](/concepts/compaction) for the summarization approach.
## Configuration
### Defaults (when enabled)
| Setting | Default | Description |
| ----------------------- | ----------------------------------- | ------------------------------------------------ |
| `ttl` | `5m` | Prune only after this idle period |
| `keepLastAssistants` | `3` | Protect tool results near recent assistant turns |
| `softTrimRatio` | `0.3` | Context ratio for soft-trim eligibility |
| `hardClearRatio` | `0.5` | Context ratio for hard-clear eligibility |
| `minPrunableToolChars` | `50000` | Minimum tool result size to consider |
| `softTrim.maxChars` | `4000` | Max chars after soft-trim |
| `softTrim.headChars` | `1500` | Head portion to keep |
| `softTrim.tailChars` | `1500` | Tail portion to keep |
| `hardClear.enabled` | `true` | Enable hard-clear stage |
| `hardClear.placeholder` | `[Old tool result content cleared]` | Replacement text |
### Examples
Disable pruning (default state):
```json5
{
agents: {
defaults: {
contextPruning: { mode: "off" },
},
},
}
```
Enable TTL-aware pruning:
Pruning is off by default for non-Anthropic providers. To enable:
```json5
{
@@ -119,40 +60,21 @@ Enable TTL-aware pruning:
}
```
Restrict pruning to specific tools:
To disable: set `mode: "off"`.
```json5
{
agents: {
defaults: {
contextPruning: {
mode: "cache-ttl",
tools: {
allow: ["exec", "read"],
deny: ["*image*"],
},
},
},
},
}
```
## Pruning vs compaction
Tool selection supports `*` wildcards. Deny wins over allow, matching is
case-insensitive, and an empty allow list means all tools are allowed.
| | Pruning | Compaction |
| ---------- | ------------------ | ----------------------- |
| **What** | Trims tool results | Summarizes conversation |
| **Saved?** | No (per-request) | Yes (in transcript) |
| **Scope** | Tool results only | Entire conversation |
## Context window estimation
They complement each other -- pruning keeps tool output lean between
compaction cycles.
Pruning estimates the context window in characters (roughly 4 chars per token).
The base window is resolved in this order:
1. `models.providers.*.models[].contextWindow` override.
2. Model definition `contextWindow` from the model registry.
3. Default `200000` tokens.
If `agents.defaults.contextTokens` is set, it caps the resolved window.
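Both knobs in the resolution order above are plain config settings; a sketch (the provider and model names here are placeholders, and model-entry fields other than `contextWindow` are assumptions):

```json5
{
  models: {
    providers: {
      anthropic: {
        models: [
          { id: "claude-example", contextWindow: 200000 }, // per-model override
        ],
      },
    },
  },
  agents: {
    defaults: {
      contextTokens: 120000, // caps the resolved window
    },
  },
}
```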
## Related
## Further reading
- [Compaction](/concepts/compaction) -- summarization-based context reduction
- [Session Management](/concepts/session) -- session lifecycle and routing
- [Gateway Configuration](/gateway/configuration) -- full config reference
- [Gateway Configuration](/gateway/configuration) -- all pruning config knobs
(`contextPruning.*`)


@@ -1,220 +1,84 @@
---
summary: "Agent tools for listing sessions, reading history, cross-session messaging, and spawning sub-agents"
summary: "Agent tools for listing sessions, reading history, and cross-session messaging"
read_when:
- You want to understand agent session tools
- You are configuring cross-session access or sub-agent spawning
- You want to understand what session tools the agent has
- You want to configure cross-session access or sub-agent spawning
title: "Session Tools"
---
# Session Tools
OpenClaw gives agents a small set of tools to interact with sessions: list them,
read their history, send messages across sessions, and spawn isolated sub-agent
runs.
OpenClaw gives agents tools to work across sessions -- listing conversations,
reading history, sending messages to other sessions, and spawning sub-agents.
## Overview
## Available tools
| Tool | Purpose |
| ------------------ | ----------------------------------- |
| `sessions_list` | List sessions with optional filters |
| `sessions_history` | Fetch transcript for one session |
| `sessions_send` | Send a message into another session |
| `sessions_spawn` | Spawn an isolated sub-agent session |
| Tool | What it does |
| ------------------ | ------------------------------------------------------- |
| `sessions_list` | List sessions with optional filters (kind, recency) |
| `sessions_history` | Read the transcript of a specific session |
| `sessions_send` | Send a message to another session and optionally wait |
| `sessions_spawn` | Spawn an isolated sub-agent session for background work |
## Session keys
## Listing and reading sessions
Session tools use **session keys** to identify conversations:
`sessions_list` returns sessions with their key, kind, channel, model, token
counts, and timestamps. Filter by kind (`main`, `group`, `cron`, `hook`,
`node`) or recency (`activeMinutes`).
- `"main"` -- the agent's main direct-chat session.
- `agent:<agentId>:<channel>:group:<id>` -- group chat (pass the full key).
- `cron:<job.id>` -- cron job session.
- `hook:<uuid>` -- webhook session.
- `node-<nodeId>` -- node session.
`sessions_history` fetches the conversation transcript for a specific session.
By default, tool results are excluded -- pass `includeTools: true` to see them.
`global` and `unknown` are reserved and never listed. If
`session.scope = "global"`, it is aliased to `main` for all tools.
Both tools accept either a **session key** (like `"main"`) or a **session ID**
from a previous list call.
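Illustrative argument objects for the two tools (the values are examples, not required settings):

```json5
// sessions_list: recent group sessions only
{ kinds: ["group"], activeMinutes: 60 }

// sessions_history: one session, with tool results included
{ sessionKey: "main", includeTools: true }
```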
## sessions_list
## Sending cross-session messages
Lists sessions as an array of rows.
`sessions_send` delivers a message to another session and optionally waits for
the response:
**Parameters:**
- **Fire-and-forget:** set `timeoutSeconds: 0` to enqueue and return
immediately.
- **Wait for reply:** set a timeout and get the response inline.
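Illustrative argument objects for both patterns (the cron job ID and message text are made up):

```json5
// Fire-and-forget: enqueue and return immediately
{ sessionKey: "cron:nightly-report", message: "Status?", timeoutSeconds: 0 }

// Wait up to 30 seconds for the reply
{ sessionKey: "main", message: "Summarize today", timeoutSeconds: 30 }
```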
| Parameter | Type | Default | Description |
| --------------- | ---------- | -------------- | -------------------------------------------------------- |
| `kinds` | `string[]` | all | Filter: `main`, `group`, `cron`, `hook`, `node`, `other` |
| `limit` | `number` | server default | Max rows returned |
| `activeMinutes` | `number` | -- | Only sessions updated within N minutes |
| `messageLimit` | `number` | `0` | Include last N messages per session (0 = none) |
After the target responds, OpenClaw can run a **reply-back loop** where the
agents alternate messages (up to 5 turns). The target agent can reply
`REPLY_SKIP` to stop early.
When `messageLimit > 0`, OpenClaw fetches chat history per session and includes
the last N messages. Tool results are filtered out in list output -- use
`sessions_history` for tool messages.
## Spawning sub-agents
**Row fields:** `key`, `kind`, `channel`, `displayName`, `updatedAt`,
`sessionId`, `model`, `contextTokens`, `totalTokens`, `thinkingLevel`,
`verboseLevel`, `sendPolicy`, `lastChannel`, `lastTo`, `deliveryContext`,
`transcriptPath`, and optionally `messages`.
`sessions_spawn` creates an isolated session for a background task. It is always
non-blocking -- it returns immediately with a `runId` and `childSessionKey`.
## sessions_history
Key options:
Fetches the transcript for one session.
- `runtime: "subagent"` (default) or `"acp"` for external harness agents.
- `model` and `thinking` overrides for the child session.
- `thread: true` to bind the spawn to a chat thread (Discord, Slack, etc.).
- `sandbox: "require"` to enforce sandboxing on the child.
**Parameters:**
Sub-agents get the full tool set minus session tools (no recursive spawning).
After completion, an announce step posts the result to the requester's channel.
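A schematic `sessions_spawn` argument object (the task text, label, and model name are made up; the keys are the documented parameters):

```json5
{
  task: "Audit the changelog for breaking changes",
  runtime: "subagent",
  label: "changelog-audit",
  model: "anthropic/claude-example", // optional override, placeholder name
  runTimeoutSeconds: 600, // abort after 10 minutes
  cleanup: "keep",
}
```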
| Parameter | Type | Default | Description |
| -------------- | --------- | -------------- | ----------------------------------------------- |
| `sessionKey` | `string` | required | Session key or `sessionId` from `sessions_list` |
| `limit` | `number` | server default | Max messages |
| `includeTools` | `boolean` | `false` | Include `toolResult` messages |
For ACP-specific behavior, see [ACP Agents](/tools/acp-agents).
When given a `sessionId`, OpenClaw resolves it to the corresponding session key.
## Visibility
### Gateway APIs
Session tools are scoped to limit what the agent can see:
Control UI and gateway clients can use lower-level APIs directly:
| Level | Scope |
| ------- | ---------------------------------------- |
| `self` | Only the current session |
| `tree` | Current session + spawned sub-agents |
| `agent` | All sessions for this agent |
| `all` | All sessions (cross-agent if configured) |
- **HTTP:** `GET /sessions/{sessionKey}/history` with query params `limit`,
`cursor`, `includeTools=1`, `follow=1` (upgrades to SSE stream).
- **WebSocket:** `sessions.subscribe` for all lifecycle events,
`sessions.messages.subscribe { key }` for one session's transcript,
`sessions.messages.unsubscribe { key }` to remove.
Default is `tree`. Sandboxed sessions are clamped to `tree` regardless of
config.
## sessions_send
## Further reading
Sends a message into another session.
**Parameters:**
| Parameter | Type | Default | Description |
| ---------------- | -------- | -------- | ---------------------------------- |
| `sessionKey` | `string` | required | Target session key or `sessionId` |
| `message` | `string` | required | Message content |
| `timeoutSeconds` | `number` | > 0 | Wait timeout (0 = fire-and-forget) |
**Behavior:**
- `timeoutSeconds = 0` -- enqueue and return `{ runId, status: "accepted" }`.
- `timeoutSeconds > 0` -- wait for completion, then return the reply.
- Timeout: `{ runId, status: "timeout" }`. The run continues; check
`sessions_history` later.
### Reply-back loop
After the target session responds, OpenClaw runs an alternating reply loop
between requester and target agents:
- Reply `REPLY_SKIP` to stop the ping-pong.
- Max turns: `session.agentToAgent.maxPingPongTurns` (0--5, default 5).
After the loop, an **announce step** posts the result to the target's chat
channel. Reply `ANNOUNCE_SKIP` to stay silent. The announce includes the
original request, round-1 reply, and latest ping-pong reply.
Inter-session messages are tagged with
`message.provenance.kind = "inter_session"` so transcript readers can
distinguish routed agent instructions from external user input.
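In a JSONL transcript, a routed message therefore carries provenance along these lines (the surrounding fields are abbreviated and the exact entry shape is an assumption):

```json5
{
  message: {
    provenance: { kind: "inter_session" },
    // ...rest of the transcript entry
  },
}
```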
## sessions_spawn
Spawns an isolated delegated session for background work.
**Parameters:**
| Parameter | Type | Default | Description |
| ------------------- | --------- | ---------- | -------------------------------------------- |
| `task` | `string` | required | Task description |
| `runtime` | `string` | `subagent` | `subagent` or `acp` |
| `label` | `string` | -- | Label for logs/UI |
| `agentId` | `string` | -- | Target agent or ACP harness ID |
| `model` | `string` | -- | Override sub-agent model |
| `thinking` | `string` | -- | Override thinking level |
| `runTimeoutSeconds` | `number` | `0` | Abort after N seconds (0 = no limit) |
| `thread` | `boolean` | `false` | Request thread-bound routing |
| `mode` | `string` | `run` | `run` or `session` (session requires thread) |
| `cleanup` | `string` | `keep` | `delete` or `keep` |
| `sandbox` | `string` | `inherit` | `inherit` or `require` |
| `attachments` | `array` | -- | Inline files (subagent only) |
**Behavior:**
- Always non-blocking: returns `{ status: "accepted", runId, childSessionKey }`.
- Creates a new `agent:<agentId>:subagent:<uuid>` session with
`deliver: false`.
- Sub-agents get the full tool set minus session tools (configurable via
`tools.subagents.tools`).
- Sub-agents cannot call `sessions_spawn` (no recursive spawning).
- After completion, an announce step posts the result to the requester's
channel. Reply `ANNOUNCE_SKIP` to stay silent.
- Sub-agent sessions are auto-archived after
`agents.defaults.subagents.archiveAfterMinutes` (default: 60).
### Allowlists
- **Subagent:** `agents.list[].subagents.allowAgents` controls which agent IDs
are allowed (`["*"]` for any). Default: only the requester.
- **ACP:** `acp.allowedAgents` controls allowed ACP harness IDs (separate from
subagent policy).
- If the requester is sandboxed, targets that would run unsandboxed are
rejected.
### Attachments
Each entry: `{ name, content, encoding?: "utf8" | "base64", mimeType? }`.
Files are materialized into `<workspace>/.openclaw/attachments/<uuid>/` and a
receipt with sha256 is returned. ACP runtime rejects attachments.
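A sketch of an `attachments` array following that entry shape (file names and contents are examples; the base64 payload is truncated):

```json5
attachments: [
  { name: "notes.txt", content: "Remember to rotate keys", encoding: "utf8" },
  {
    name: "logo.png",
    content: "iVBORw0KGgo...", // truncated example payload
    encoding: "base64",
    mimeType: "image/png",
  },
]
```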
For ACP-specific behavior (harness targeting, permission modes), see
[ACP Agents](/tools/acp-agents).
## Visibility and access control
Session tools can be scoped to limit cross-session access.
### Visibility levels
| Level | What the agent can see |
| ---------------- | ------------------------------------------------------- |
| `self` | Only the current session |
| `tree` (default) | Current session + spawned sub-agent sessions |
| `agent` | Any session belonging to the current agent |
| `all` | Any session (cross-agent requires `tools.agentToAgent`) |
Configure at `tools.sessions.visibility`.
### Sandbox clamping
Sandboxed sessions have an additional clamp via
`agents.defaults.sandbox.sessionToolsVisibility` (default: `spawned`). When
this is set, visibility is clamped to `tree` even if
`tools.sessions.visibility = "all"`.
```json5
{
tools: {
sessions: {
visibility: "tree",
},
},
agents: {
defaults: {
sandbox: {
sessionToolsVisibility: "spawned",
},
},
},
}
```
## Send policy
Policy-based blocking by channel or chat type prevents agents from sending to
restricted sessions. See [Session Management](/concepts/session) for send policy
configuration.
## Related
- [Session Management](/concepts/session) -- session routing, lifecycle, and
maintenance
- [ACP Agents](/tools/acp-agents) -- ACP-specific spawning and permissions
- [Session Management](/concepts/session) -- routing, lifecycle, maintenance
- [ACP Agents](/tools/acp-agents) -- external harness spawning
- [Multi-agent](/concepts/multi-agent) -- multi-agent architecture
- [Gateway Configuration](/gateway/configuration) -- session tool config knobs


@@ -1,298 +1,113 @@
---
summary: "How OpenClaw manages sessions -- routing, isolation, lifecycle, and maintenance"
summary: "How OpenClaw manages conversation sessions"
read_when:
- You want to understand session keys and routing
- You want to configure DM isolation or multi-user setups
- You want to tune session lifecycle or maintenance
- You want to understand session routing and isolation
- You want to configure DM scope for multi-user setups
title: "Session Management"
---
# Session Management
OpenClaw manages conversations through **sessions**. Each session has a key
(which conversation bucket it belongs to), an ID (which transcript file
continues it), and metadata tracked in a session store.
OpenClaw organizes conversations into **sessions**. Each message is routed to a
session based on where it came from -- DMs, group chats, cron jobs, etc.
## How sessions are routed
## How messages are routed
Every inbound message is mapped to a **session key** that determines which
conversation it joins:
| Source | Behavior |
| --------------- | ------------------------- |
| Direct messages | Shared session by default |
| Group chats | Isolated per group |
| Rooms/channels | Isolated per room |
| Cron jobs | Fresh session per run |
| Webhooks | Isolated per hook |
| Source | Session key pattern | Behavior |
| --------------- | ---------------------------------------- | ------------------------------------- |
| Direct messages | `agent:<agentId>:<mainKey>` | Shared by default (`dmScope: "main"`) |
| Group chats | `agent:<agentId>:<channel>:group:<id>` | Isolated per group |
| Rooms/channels | `agent:<agentId>:<channel>:channel:<id>` | Isolated per room |
| Cron jobs | `cron:<job.id>` | Fresh session per run |
| Webhooks | `hook:<uuid>` | Isolated per hook unless overridden |
| Node runs | `node-<nodeId>` | Isolated per node unless overridden |
## DM isolation
Telegram forum topics append `:topic:<threadId>` for per-topic isolation.
## DM scope and isolation
By default, all direct messages share one session (`dmScope: "main"`) for
continuity across devices and channels. This works well for single-user setups,
but can leak context when multiple people message your agent.
### Secure DM mode
By default, all DMs share one session for continuity. This is fine for
single-user setups.
<Warning>
If your agent receives DMs from multiple people, you should enable DM isolation.
Without it, all users share the same conversation context.
If multiple people can message your agent, enable DM isolation. Without it, all
users share the same conversation context -- Alice's private messages would be
visible to Bob.
</Warning>
**The problem:** Alice messages about a private topic. Bob asks "What were we
talking about?" Because both share a session, the model may answer Bob using
Alice's context.
**The fix:**
```json5
{
session: {
dmScope: "per-channel-peer",
dmScope: "per-channel-peer", // isolate by channel + sender
},
}
```
### DM scope options
Other options:
| Value | Key pattern | Best for |
| -------------------------- | -------------------------------------------------- | ------------------------------------ |
| `main` (default) | `agent:<id>:main` | Single-user, cross-device continuity |
| `per-peer` | `agent:<id>:direct:<peerId>` | Multi-user, cross-channel identity |
| `per-channel-peer` | `agent:<id>:<channel>:direct:<peerId>` | Multi-user inboxes (recommended) |
| `per-account-channel-peer` | `agent:<id>:<channel>:<accountId>:direct:<peerId>` | Multi-account inboxes |
- `main` (default) -- all DMs share one session.
- `per-peer` -- isolate by sender (across channels).
- `per-channel-peer` -- isolate by channel + sender (recommended).
- `per-account-channel-peer` -- isolate by account + channel + sender.
### Cross-channel identity linking
<Tip>
If the same person contacts you from multiple channels, use
`session.identityLinks` to link their identities so they share one session.
</Tip>
When using `per-peer` or `per-channel-peer`, the same person messaging from
different channels gets separate sessions. Use `session.identityLinks` to
collapse them:
```json5
{
session: {
identityLinks: {
alice: ["telegram:123456789", "discord:987654321012345678"],
},
},
}
```
The canonical key replaces `<peerId>` so Alice shares one session across
channels.
**When to enable DM isolation:**
- Pairing approvals for more than one sender
- DM allowlist with multiple entries
- `dmPolicy: "open"`
- Multiple phone numbers or accounts can message the agent
Verify settings with `openclaw security audit` (see [security](/cli/security)).
Local CLI onboarding writes `per-channel-peer` by default when unset.
Verify your setup with `openclaw security audit`.
## Session lifecycle
### Resets
Sessions are reused until they expire:
Sessions are reused until they expire. Expiry is evaluated on the next inbound
message:
- **Daily reset** (default) -- new session at 4:00 AM local time on the gateway
host.
- **Idle reset** (optional) -- new session after a period of inactivity. Set
`session.reset.idleMinutes`.
- **Manual reset** -- type `/new` or `/reset` in chat. `/new <model>` also
switches the model.
- **Daily reset** (default) -- 4:00 AM local time on the gateway host. A
session is stale once its last update is before the most recent reset time.
- **Idle reset** (optional) -- `idleMinutes` adds a sliding idle window.
- **Combined** -- when both are configured, whichever expires first forces a new
session.
Override per session type or channel:
```json5
{
session: {
reset: {
mode: "daily",
atHour: 4,
idleMinutes: 120,
},
resetByType: {
thread: { mode: "daily", atHour: 4 },
direct: { mode: "idle", idleMinutes: 240 },
group: { mode: "idle", idleMinutes: 120 },
},
resetByChannel: {
discord: { mode: "idle", idleMinutes: 10080 },
},
},
}
```
### Manual resets
- `/new` or `/reset` starts a fresh session. The remainder of the message is
passed through.
- `/new <model>` accepts a model alias, `provider/model`, or provider name
(fuzzy match) to set the session model.
- If sent alone, OpenClaw runs a short greeting turn to confirm the reset.
- Custom triggers: add to `resetTriggers` array.
- Delete specific keys from the store or remove the JSONL transcript; the next
message recreates them.
- Isolated cron jobs always mint a fresh `sessionId` per run.
When both daily and idle resets are configured, whichever expires first wins.
## Where state lives
All session state is **owned by the gateway**. UI clients (macOS app, WebChat,
TUI) query the gateway for session lists and token counts.
All session state is owned by the **gateway**. UI clients query the gateway for
session data.
In remote mode, the session store lives on the remote gateway host, not your
local machine.
### Storage
| Artifact | Path | Purpose |
| ------------- | --------------------------------------------------------- | --------------------------------- |
| Session store | `~/.openclaw/agents/<agentId>/sessions/sessions.json` | Key-value map of session metadata |
| Transcripts | `~/.openclaw/agents/<agentId>/sessions/<sessionId>.jsonl` | Append-only conversation history |
The store maps `sessionKey -> { sessionId, updatedAt, ... }`. Deleting entries
is safe; they are recreated on demand. Group entries may include `displayName`,
`channel`, `subject`, `room`, and `space` for UI labeling.
Telegram topic sessions use `.../<sessionId>-topic-<threadId>.jsonl`.
- **Store:** `~/.openclaw/agents/<agentId>/sessions/sessions.json`
- **Transcripts:** `~/.openclaw/agents/<agentId>/sessions/<sessionId>.jsonl`
## Session maintenance
OpenClaw keeps the session store and transcripts bounded over time.
### Defaults
| Setting | Default | Description |
| ----------------------- | ------------------- | --------------------------------------------------------------- |
| `mode` | `warn` | `warn` reports what would be evicted; `enforce` applies cleanup |
| `pruneAfter` | `30d` | Stale-entry age cutoff |
| `maxEntries` | `500` | Cap entries in sessions.json |
| `rotateBytes` | `10mb` | Rotate sessions.json when oversized |
| `resetArchiveRetention` | `30d` | Retention for reset archives |
| `maxDiskBytes` | unset | Optional sessions-directory budget |
| `highWaterBytes` | 80% of maxDiskBytes | Target after cleanup |
### Enforcement order (`mode: "enforce"`)
1. Prune stale entries older than `pruneAfter`.
2. Cap entry count to `maxEntries` (oldest first).
3. Archive transcript files for removed entries.
4. Purge old reset/deleted archives by retention policy.
5. Rotate `sessions.json` when it exceeds `rotateBytes`.
6. If `maxDiskBytes` is set, enforce disk budget toward `highWaterBytes`.
### Configuration examples
Conservative enforce policy:
OpenClaw automatically bounds session storage over time. By default, it runs
in `warn` mode (reports what would be cleaned). Set `session.maintenance.mode`
to `"enforce"` for automatic cleanup:
```json5
{
session: {
maintenance: {
mode: "enforce",
pruneAfter: "45d",
maxEntries: 800,
rotateBytes: "20mb",
resetArchiveRetention: "14d",
pruneAfter: "30d",
maxEntries: 500,
},
},
}
```
Hard disk budget:
```json5
{
session: {
maintenance: {
mode: "enforce",
maxDiskBytes: "1gb",
highWaterBytes: "800mb",
},
},
}
```
Preview or force from CLI:
```bash
openclaw sessions cleanup --dry-run
openclaw sessions cleanup --enforce
```
### Performance note
Large session stores can increase write-path latency. To keep things fast:
- Use `mode: "enforce"` in production.
- Set both time and count limits (`pruneAfter` + `maxEntries`).
- Set `maxDiskBytes` + `highWaterBytes` for hard upper bounds.
- Run `openclaw sessions cleanup --dry-run --json` after config changes to
preview impact.
## Send policy
Block delivery for specific session types without listing individual IDs:
```json5
{
session: {
sendPolicy: {
rules: [
{ action: "deny", match: { channel: "discord", chatType: "group" } },
{ action: "deny", match: { keyPrefix: "cron:" } },
{ action: "deny", match: { rawKeyPrefix: "agent:main:discord:" } },
],
default: "allow",
},
},
}
```
Runtime override (owner only):
- `/send on` -- allow for this session.
- `/send off` -- deny for this session.
- `/send inherit` -- clear override and use config rules.
Preview with `openclaw sessions cleanup --dry-run`.
## Inspecting sessions
| Method | What it shows |
| ------------------------------------ | ---------------------------------------------------------- |
| `openclaw status` | Store path, recent sessions |
| `openclaw sessions --json` | All entries (filter with `--active <minutes>`) |
| `/status` in chat | Reachability, context usage, toggles, credential freshness |
| `/context list` or `/context detail` | System prompt contents, biggest context contributors |
| `/stop` in chat | Abort current run, clear queued followups, stop sub-agents |
- `openclaw status` -- session store path and recent activity.
- `openclaw sessions --json` -- all sessions (filter with `--active <minutes>`).
- `/status` in chat -- context usage, model, and toggles.
- `/context list` -- what is in the system prompt.
JSONL transcripts can be opened directly to review full turns.
## Further reading
## Session origin metadata
Each session entry records where it came from (best-effort) in `origin`:
- `label` -- human label (from conversation label + group subject/channel).
- `provider` -- normalized channel ID (including extensions).
- `from` / `to` -- raw routing IDs from the inbound envelope.
- `accountId` -- provider account ID (multi-account).
- `threadId` -- thread/topic ID when supported.
Extensions populate these by sending `ConversationLabel`, `GroupSubject`,
`GroupChannel`, `GroupSpace`, and `SenderName` in the inbound context.
## Tips
- Keep the primary key dedicated to 1:1 traffic; let groups keep their own
keys.
- When automating cleanup, delete individual keys instead of the whole store to
preserve context elsewhere.
- Related: [Session Pruning](/concepts/session-pruning),
[Compaction](/concepts/compaction),
[Session Tools](/concepts/session-tool),
[Session Management Deep Dive](/reference/session-management-compaction).
- [Session Pruning](/concepts/session-pruning) -- trimming tool results
- [Compaction](/concepts/compaction) -- summarizing long conversations
- [Session Tools](/concepts/session-tool) -- agent tools for cross-session work
- [Session Management Deep Dive](/reference/session-management-compaction) --
store schema, transcripts, send policy, origin metadata, and advanced config


@@ -51,7 +51,7 @@ local policy).
When using a custom OpenAI-compatible endpoint,
set `memorySearch.remote.apiKey` (and optional `memorySearch.remote.headers`).
## QMD backend (experimental)
## QMD backend
Set `memory.backend = "qmd"` to swap the built-in SQLite indexer for
[QMD](https://github.com/tobi/qmd): a local-first search sidecar that combines