docs: simplify sessions/memory concept pages and fix QMD experimental label

This commit is contained in:
Vincent Koc
2026-03-30 07:32:00 +09:00
parent 57069f2b2f
commit 143b4c54ba
7 changed files with 336 additions and 1074 deletions


@@ -1,69 +1,50 @@
---
summary: "How OpenClaw summarizes long conversations to stay within model limits"
read_when:
- You want to understand auto-compaction and /compact
- You are debugging long sessions hitting context limits
- You want to tune compaction behavior or use a custom context engine
title: "Compaction"
---
# Compaction
Every model has a context window -- the maximum number of tokens it can process.
When a conversation approaches that limit, OpenClaw **compacts** older messages
into a summary so the chat can continue.
## How it works
Compaction is a three-step process:
1. **Summarize** older conversation turns into a compact summary.
2. **Persist** the summary as a `compaction` entry in the session transcript
(JSONL).
3. **Keep** recent messages after the compaction point intact.
The full conversation history stays on disk. Compaction only changes what the
model sees on the next turn.
## Auto-compaction
Auto-compaction is on by default. It runs when the session nears the context
limit, or when the model returns a context-overflow error (in which case
OpenClaw compacts and retries).
When auto-compaction runs you will see:
- `Auto-compaction complete` in verbose mode
- `/status` showing `Compactions: <count>`
### Pre-compaction memory flush
<Info>
Before compacting, OpenClaw automatically reminds the agent to save important
notes to [memory](/concepts/memory) files. This prevents context loss.
</Info>
## Manual compaction
Type `/compact` in any chat to force a compaction. Add instructions to guide
the summary:
```
/compact Focus on the API design decisions
```
## Using a different model
By default, compaction uses your agent's primary model. You can use a more
capable model for better summaries:
```json5
{
@@ -77,106 +58,29 @@ small or local and you want a more capable summarizer:
}
```
### Reserve tokens and floor
- `reserveTokens` -- headroom reserved for prompts and the next model output
(Pi runtime default: `16384`).
- `reserveTokensFloor` -- minimum reserve enforced by OpenClaw (default:
`20000`). Set to `0` to disable.
- `keepRecentTokens` -- how many tokens of recent conversation to preserve
during compaction (default: `20000`).
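Taken together, a hypothetical sketch of these settings (the exact placement under `agents.defaults.compaction` is an assumption, mirroring the memory-flush example on this page):

```json5
{
  agents: {
    defaults: {
      compaction: {
        reserveTokens: 16384,      // headroom for prompts and the next output
        reserveTokensFloor: 20000, // minimum reserve enforced by OpenClaw
        keepRecentTokens: 20000,   // recent conversation preserved verbatim
      },
    },
  },
}
```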
### Identifier preservation
Compaction summaries preserve opaque identifiers by default
(`identifierPolicy: "strict"`). Override with:
- `"off"` -- no special identifier handling.
- `"custom"` -- provide your own instructions via `identifierInstructions`.
### Memory flush
```json5
{
agents: {
defaults: {
compaction: {
memoryFlush: {
enabled: true, // default
softThresholdTokens: 4000,
systemPrompt: "Session nearing compaction. Store durable memories now.",
prompt: "Write any lasting notes to memory/YYYY-MM-DD.md; reply with NO_REPLY if nothing to store.",
},
},
},
},
}
```
The flush triggers when context usage crosses
`contextWindow - reserveTokensFloor - softThresholdTokens`. It runs silently
(the user sees nothing) and is skipped when the workspace is read-only.
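As a worked example of that trigger formula (the 200k-token context window is illustrative; the other two numbers are the defaults above):

```python
# Illustrative numbers: a 200k-token context window with default settings.
context_window = 200_000
reserve_tokens_floor = 20_000   # default
soft_threshold_tokens = 4_000   # default

# The flush fires once estimated context usage crosses this line:
flush_threshold = context_window - reserve_tokens_floor - soft_threshold_tokens
print(flush_threshold)  # 176000
```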
## Compaction vs pruning
| | Compaction | Pruning |
| ---------------- | ----------------------------- | -------------------------------- |
| **What it does** | Summarizes older conversation | Trims old tool results |
| **Saved?** | Yes (in session transcript) | No (in-memory only, per request) |
| **Scope** | Entire conversation | Tool results only |
See [Session Pruning](/concepts/session-pruning) for pruning details.
## OpenAI server-side compaction
OpenClaw also supports OpenAI Responses server-side compaction for compatible
direct OpenAI models. This is separate from local compaction and can run
alongside it:
- **Local compaction** -- OpenClaw summarizes and persists into session JSONL.
- **Server-side compaction** -- OpenAI compacts context on the provider side when
`store` + `context_management` are enabled.
See [OpenAI provider](/providers/openai) for model params and overrides.
## Custom context engines
Compaction behavior is owned by the active
[context engine](/concepts/context-engine). The built-in engine uses the
summarization described above. Plugin engines (selected via
`plugins.slots.contextEngine`) can implement any strategy -- DAG summaries,
vector retrieval, incremental condensation, etc.
When a plugin engine sets `ownsCompaction: true`, OpenClaw delegates all
compaction decisions to the engine and does not run built-in auto-compaction.
When `ownsCompaction` is `false` or unset, the built-in auto-compaction still
runs, but the engine's `compact()` method handles `/compact` and overflow
recovery. If you are building a non-owning engine, implement `compact()` by
calling `delegateCompactionToRuntime(...)` from `openclaw/plugin-sdk/core`.
[Session pruning](/concepts/session-pruning) is a lighter-weight complement that
trims tool output without summarizing.
## Troubleshooting
**Compacting too often?** The model's context window may be small, or tool
outputs may be large. Try enabling
[session pruning](/concepts/session-pruning).
**Context feels stale after compaction?** Use `/compact Focus on <topic>` to
guide the summary, or enable the [memory flush](/concepts/memory) so notes
survive.
**Need a clean slate?** `/new` starts a fresh session without compacting.
For advanced configuration (reserve tokens, identifier preservation, custom
context engines, OpenAI server-side compaction), see the
[Session Management Deep Dive](/reference/session-management-compaction).


@@ -1,171 +1,92 @@
---
title: "Memory Search"
summary: "How memory search finds relevant notes using embeddings and hybrid retrieval"
read_when:
- You want to understand how memory_search works
- You want to choose an embedding provider
- You want to tune search quality
---
# Memory Search
`memory_search` finds relevant notes from your memory files, even when the
wording differs from the original text. It works by indexing memory into small
chunks and searching them using embeddings, keywords, or both.
## Quick start
If you have an OpenAI, Gemini, Voyage, or Mistral API key configured, memory
search works automatically. To set a provider explicitly:
```json5
{
agents: {
defaults: {
memorySearch: {
provider: "openai", // or "gemini", "local", "ollama", etc.
},
},
},
}
```
For local embeddings with no API key, use `provider: "local"` (requires
node-llama-cpp).
## Supported providers
| Provider | ID | Needs API key | Notes |
| -------- | --------- | ------------- | ----------------------------- |
| OpenAI | `openai` | Yes | Auto-detected, fast |
| Gemini | `gemini` | Yes | Supports image/audio indexing |
| Voyage | `voyage` | Yes | Auto-detected |
| Mistral | `mistral` | Yes | Auto-detected |
| Ollama | `ollama` | No | Local, must set explicitly |
| Local | `local` | No | GGUF model, ~0.6 GB download |
## How search works
OpenClaw runs two retrieval paths in parallel and merges the results:
```
Query -> Embedding -> Vector Search ─┐
├─> Merge -> Top Results
Query -> Tokenize -> BM25 Search ──┘
```
- **Vector search** finds notes with similar meaning ("gateway host" matches
"the machine running OpenClaw").
- **BM25 keyword search** finds exact matches (IDs, error strings, config
keys).
If only one path is available (no embeddings or no FTS), the other runs alone.
## Improving search quality
Two optional features help when you have a large note history:
### Temporal decay
Old notes gradually lose ranking weight so recent information surfaces first.
With the default half-life of 30 days, a note from last month scores at 50% of
its original weight. Evergreen files like `MEMORY.md` are never decayed.
<Tip>
Enable temporal decay if your agent has months of daily notes and stale
information keeps outranking recent context.
</Tip>
### MMR (diversity)
Reduces redundant results. If five notes all mention the same router config, MMR
ensures the top results cover different topics instead of repeating.
<Tip>
Enable MMR if `memory_search` keeps returning near-duplicate snippets from
different daily notes.
</Tip>
## Hybrid search (BM25 + vector)
When both FTS5 and embeddings are available, OpenClaw combines two retrieval
signals:
- **Vector similarity** -- semantic matching. Good at paraphrases ("Mac Studio
gateway host" vs "the machine running the gateway").
- **BM25 keyword relevance** -- exact token matching. Good at IDs, code symbols,
error strings, and config keys.
### How scores are merged
1. Retrieve a candidate pool from each side (top
`maxResults x candidateMultiplier`).
2. Convert BM25 rank to a 0-1 score: `textScore = 1 / (1 + max(0, bm25Rank))`.
3. Union candidates by chunk ID and compute:
`finalScore = vectorWeight x vectorScore + textWeight x textScore`.
Weights are normalized to 1.0, so they behave as percentages. If either path is
unavailable, the other runs alone with no hard failure.
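The merge described above can be sketched in a few lines of Python (function name and input shapes are illustrative, not OpenClaw's actual code):

```python
def merge_scores(vector_hits, bm25_hits, vector_weight=0.7, text_weight=0.3):
    """Union candidates by chunk id and blend the two retrieval signals.

    vector_hits: {chunk_id: vector similarity in [0, 1]}
    bm25_hits:   {chunk_id: bm25 rank, where 0 = best match}
    """
    # Normalize weights to sum to 1.0 so they behave as percentages.
    total = vector_weight + text_weight
    vw, tw = vector_weight / total, text_weight / total
    merged = {}
    for cid in set(vector_hits) | set(bm25_hits):
        vector_score = vector_hits.get(cid, 0.0)
        rank = bm25_hits.get(cid)
        # textScore = 1 / (1 + max(0, bm25Rank))
        text_score = 0.0 if rank is None else 1.0 / (1.0 + max(0, rank))
        merged[cid] = vw * vector_score + tw * text_score
    # Highest blended score first.
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)
```

A chunk found only by one path simply scores zero on the other, which is why neither path failing is a hard error.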
### CJK support
FTS5 uses configurable trigram tokenization with a short-substring fallback so
Chinese, Japanese, and Korean text is searchable. CJK-heavy text is weighted
correctly during chunk-size estimation, and surrogate-pair characters are
preserved during fine splits.
## Post-processing
After merging scores, two optional stages refine the result list:
### Temporal decay (recency boost)
Daily notes accumulate over months. Without decay, a well-worded note from six
months ago can outrank yesterday's update on the same topic.
Temporal decay applies an exponential multiplier based on age:
```
decayedScore = score x e^(-lambda x ageInDays)
```
With the default half-life of 30 days:
| Age | Score retained |
| -------- | -------------- |
| Today | 100% |
| 7 days | ~84% |
| 30 days | 50% |
| 90 days | 12.5% |
| 180 days | ~1.6% |
**Evergreen files are never decayed** -- `MEMORY.md` and non-dated files in
`memory/` (like `memory/projects.md`) always rank at full score. Dated daily
files use the date from the filename.
**When to enable:** Your agent has months of daily notes and stale information
outranks recent context.
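The decay curve above follows directly from the half-life (a sketch, not OpenClaw's implementation):

```python
import math

def decayed_score(score: float, age_in_days: float, half_life_days: float = 30.0) -> float:
    # Pick lambda so the score halves every half_life_days:
    # decayedScore = score * e^(-lambda * ageInDays)
    lam = math.log(2) / half_life_days
    return score * math.exp(-lam * age_in_days)

print(round(decayed_score(1.0, 30), 3))  # 0.5
print(round(decayed_score(1.0, 90), 3))  # 0.125
```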
### MMR re-ranking (diversity)
When search returns results, multiple chunks may contain similar or overlapping
content. MMR (Maximal Marginal Relevance) re-ranks results to balance relevance
with diversity.
How it works:
1. Start with the highest-scoring result.
2. Iteratively select the next result that maximizes:
`lambda x relevance - (1 - lambda) x max_similarity_to_already_selected`.
3. Similarity is measured using Jaccard text similarity on tokenized content.
The `lambda` parameter controls the trade-off:
- `1.0` -- pure relevance (no diversity penalty).
- `0.0` -- maximum diversity (ignores relevance).
- Default: `0.7` (balanced, slight relevance bias).
**When to enable:** `memory_search` returns redundant or near-duplicate
snippets, especially with daily notes that repeat similar information.
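A minimal sketch of that selection loop, using Jaccard similarity over whitespace tokens (names and data shapes are illustrative):

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity on lowercase token sets."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def mmr_rerank(results, lam=0.7, k=5):
    """results: list of (text, relevance_score), highest score first."""
    remaining = list(results)
    selected = []
    while remaining and len(selected) < k:
        # Score = lam * relevance - (1 - lam) * max similarity to picks so far.
        best = max(
            remaining,
            key=lambda r: lam * r[1]
            - (1 - lam) * max((jaccard(r[0], s[0]) for s in selected), default=0.0),
        )
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `lam` near 1.0 this degenerates to plain relevance ordering; lowering it pushes near-duplicate snippets down the list.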
## Configuration
Both post-processing features and hybrid search weights are configured under
`memorySearch.query.hybrid`:
### Enable both
```json5
{
@@ -174,18 +95,8 @@ Both post-processing features and hybrid search weights are configured under
memorySearch: {
query: {
hybrid: {
mmr: { enabled: true },
temporalDecay: { enabled: true },
},
},
},
@@ -194,55 +105,32 @@ Both post-processing features and hybrid search weights are configured under
}
```
## Multimodal memory
With Gemini Embedding 2, you can index images and audio files alongside
Markdown. Search queries remain text, but they match against visual and audio
content. See the [Memory configuration reference](/reference/memory-config) for
setup.
## Session memory search
You can optionally index session transcripts so `memory_search` can recall
earlier conversations. This is opt-in via
`memorySearch.experimental.sessionMemory`. See the
[configuration reference](/reference/memory-config) for details.
## Troubleshooting
**No results?** Run `openclaw memory status` to check the index. If empty, run
`openclaw memory index --force`.
**Only keyword matches?** Your embedding provider may not be configured. Check
`openclaw memory status --deep`.
**CJK text not found?** Rebuild the FTS index with
`openclaw memory index --force`.
## Further reading
- [Memory](/concepts/memory) -- file layout, backends, tools
- [Memory configuration reference](/reference/memory-config) -- all config knobs
including QMD, batch indexing, embedding cache, sqlite-vec, and multimodal


@@ -1,228 +1,97 @@
---
title: "Memory"
summary: "How OpenClaw remembers things across sessions"
read_when:
- You want to understand how memory works
- You want to know what memory files to write
---
# Memory
OpenClaw remembers things by writing **plain Markdown files** in your agent's
workspace. The model only "remembers" what gets saved to disk -- there is no
hidden state.
## How it works
Your agent has two places to store memories:
- **`MEMORY.md`** -- long-term memory. Durable facts, preferences, and
decisions. Loaded at the start of every DM session.
- **`memory/YYYY-MM-DD.md`** -- daily notes. Running context and observations.
Today and yesterday's notes are loaded automatically.
These files live in the agent workspace (default `~/.openclaw/workspace`).
## When to write memory
- **Decisions, preferences, and durable facts** go to `MEMORY.md`.
- **Day-to-day notes and running context** go to `memory/YYYY-MM-DD.md`.
<Tip>
If you want your agent to remember something, just ask it: "Remember that I
prefer TypeScript." It will write it to the appropriate file.
</Tip>
## Memory tools
The agent has two tools for working with memory:
- **`memory_search`** -- finds relevant notes using semantic search, even when
the wording differs from the original.
- **`memory_get`** -- reads a specific memory file or line range.
Both tools are provided by the active memory plugin (default: `memory-core`).
## Memory search
When an embedding provider is configured, `memory_search` uses **hybrid
search** -- combining vector similarity (semantic meaning) with keyword matching
(exact terms like IDs and code symbols). This works out of the box once you have
an API key for any supported provider.
<Info>
OpenClaw auto-detects your embedding provider from available API keys. If you
have an OpenAI, Gemini, Voyage, or Mistral key configured, memory search is
enabled automatically.
</Info>
For details on how search works, tuning options, and provider setup, see
[Memory Search](/concepts/memory-search).
## Memory backends
OpenClaw has two backends for indexing and searching memory:
**Builtin (default)** -- uses a per-agent SQLite database. Works out of the box
with no extra dependencies. Supports keyword search, vector similarity, and
hybrid search with CJK support.
**QMD** -- a local-first search sidecar that adds reranking, query expansion,
and the ability to index directories outside the workspace (like project docs or
session transcripts). Set `memory.backend = "qmd"` to switch.
## Additional memory paths
Index Markdown files outside the default workspace layout:
```json5
{
agents: {
defaults: {
memorySearch: {
extraPaths: ["../team-docs", "/srv/shared-notes/overview.md"],
},
},
},
}
```
Paths can be absolute or workspace-relative. Directories are scanned
recursively for `.md` files. Symlinks are ignored.
## Multimodal memory (Gemini)
When using `gemini-embedding-2-preview`, OpenClaw can index image and audio
files from `memorySearch.extraPaths`:
```json5
{
agents: {
defaults: {
memorySearch: {
provider: "gemini",
model: "gemini-embedding-2-preview",
extraPaths: ["assets/reference", "voice-notes"],
multimodal: {
enabled: true,
modalities: ["image", "audio"],
},
},
},
},
}
```
Search queries remain text, but Gemini can compare them against indexed
image/audio embeddings. `memory_get` still reads Markdown only.
See the [Memory configuration reference](/reference/memory-config) for backend
setup and all config knobs.
## Automatic memory flush
Before [compaction](/concepts/compaction) summarizes your conversation, OpenClaw
runs a silent turn that reminds the agent to save important context to memory
files. This is on by default -- you do not need to configure anything.
<Tip>
The memory flush prevents context loss during compaction. If your agent has
important facts in the conversation that are not yet written to a file, they
will be saved automatically before the summary happens.
</Tip>
## CLI
```bash
openclaw memory status # Check index status and provider
openclaw memory search "query" # Search from the command line
openclaw memory index --force # Rebuild the index
```
Add `--agent <id>` to target a specific agent, `--deep` for extended
diagnostics, or `--json` for machine-readable output.
See [CLI: memory](/cli/memory) for the full command reference.
## Further reading
- [Memory Search](/concepts/memory-search) -- search pipeline, providers, and
tuning
- [Memory configuration reference](/reference/memory-config) -- all config knobs
for providers, QMD, hybrid search, batch indexing, and multimodal
- [Compaction](/concepts/compaction) -- how compaction interacts with memory
flush
- [Session Management Deep Dive](/reference/session-management-compaction) --
internal session and compaction lifecycle


@@ -1,113 +1,54 @@
---
title: "Session Pruning"
summary: "Trimming old tool results to keep context lean and caching efficient"
read_when:
- You want to reduce context growth from tool outputs
- You want to understand Anthropic prompt cache optimization
---
# Session Pruning
Session pruning trims **old tool results** from the context before each LLM
call. It reduces context bloat from accumulated tool outputs (exec results, file
reads, search results) without touching your conversation messages.
<Info>
Pruning is in-memory only -- it does not modify the on-disk session transcript.
Your full history is always preserved.
</Info>
## Why it matters
Long sessions accumulate tool output that inflates the context window. This
increases cost and can force [compaction](/concepts/compaction) sooner than
necessary.
Pruning is especially valuable for **Anthropic prompt caching**. After the cache
TTL expires, the next request re-caches the full prompt. Pruning reduces the
cache-write size, directly lowering cost.
## How it works
1. Wait for the cache TTL to expire (default 5 minutes).
2. Find old tool results (user and assistant messages are never touched).
3. **Soft-trim** oversized results -- keep the head and tail, insert `...`.
4. **Hard-clear** the rest -- replace with a placeholder.
5. Reset the TTL so follow-up requests reuse the fresh cache.
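The steps above map onto `contextPruning` settings; a minimal sketch using the documented defaults (every value shown is the default, so this block only makes the behavior explicit):

```json5
{
  agents: {
    defaults: {
      contextPruning: {
        mode: "cache-ttl",
        ttl: "5m", // prune only after this idle period
        keepLastAssistants: 3, // protect recent tool results
        softTrim: { maxChars: 4000, headChars: 1500, tailChars: 1500 },
        hardClear: { enabled: true },
      },
    },
  },
}
```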
## Smart defaults
OpenClaw auto-configures pruning for Anthropic profiles:
OpenClaw auto-enables pruning for Anthropic profiles:
| Profile type | Pruning | Heartbeat | Cache retention |
| -------------------- | ------------------- | --------- | ------------------ |
| OAuth or setup-token | `cache-ttl` enabled | `1h` | (provider default) |
| API key | `cache-ttl` enabled | `30m` | `short` (5 min) |
| Profile type | Pruning enabled | Heartbeat |
| -------------------- | --------------- | --------- |
| OAuth or setup-token | Yes | 1 hour |
| API key | Yes | 30 min |
If you set any of these values explicitly, OpenClaw does not override them.
If you set explicit values, OpenClaw does not override them.
Match `ttl` to your model `cacheRetention` policy for best results (`short` =
5 min, `long` = 1 hour).
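For example, if your model is configured for `long` cache retention (1 hour), raise the pruning `ttl` to match so pruning does not fire while the cache is still warm:

```json5
{
  agents: {
    defaults: {
      contextPruning: {
        mode: "cache-ttl",
        ttl: "1h", // matches the 1-hour "long" cache retention window
      },
    },
  },
}
```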
## Enable or disable
## Pruning vs compaction
| | Pruning | Compaction |
| -------------- | --------------------------------- | ------------------------------- |
| **What** | Trims tool result messages | Summarizes conversation history |
| **Persisted?** | No (in-memory, per request) | Yes (in JSONL transcript) |
| **Scope** | Tool results only | Entire conversation |
| **Trigger** | Every LLM call (when TTL expired) | Context window threshold |
Built-in tools already truncate their own output. Pruning is an additional layer
that prevents long-running chats from accumulating too much tool output over
time. See [Compaction](/concepts/compaction) for the summarization approach.
## Configuration
### Defaults (when enabled)
| Setting | Default | Description |
| ----------------------- | ----------------------------------- | ------------------------------------------------ |
| `ttl` | `5m` | Prune only after this idle period |
| `keepLastAssistants` | `3` | Protect tool results near recent assistant turns |
| `softTrimRatio` | `0.3` | Context ratio for soft-trim eligibility |
| `hardClearRatio` | `0.5` | Context ratio for hard-clear eligibility |
| `minPrunableToolChars` | `50000` | Minimum tool result size to consider |
| `softTrim.maxChars` | `4000` | Max chars after soft-trim |
| `softTrim.headChars` | `1500` | Head portion to keep |
| `softTrim.tailChars` | `1500` | Tail portion to keep |
| `hardClear.enabled` | `true` | Enable hard-clear stage |
| `hardClear.placeholder` | `[Old tool result content cleared]` | Replacement text |
### Examples
Disable pruning (default state):
```json5
{
agents: {
defaults: {
contextPruning: { mode: "off" },
},
},
}
```
Enable TTL-aware pruning:
Pruning is off by default for non-Anthropic providers. To enable:
```json5
{
@@ -119,40 +60,21 @@ Enable TTL-aware pruning:
}
```
Restrict pruning to specific tools:
To disable: set `mode: "off"`.
```json5
{
agents: {
defaults: {
contextPruning: {
mode: "cache-ttl",
tools: {
allow: ["exec", "read"],
deny: ["*image*"],
},
},
},
},
}
```
## Pruning vs compaction
Tool selection supports `*` wildcards. Deny wins over allow, matching is
case-insensitive, and an empty allow list means all tools are allowed.
| | Pruning | Compaction |
| ---------- | ------------------ | ----------------------- |
| **What** | Trims tool results | Summarizes conversation |
| **Saved?** | No (per-request) | Yes (in transcript) |
| **Scope** | Tool results only | Entire conversation |
## Context window estimation
They complement each other -- pruning keeps tool output lean between
compaction cycles.
Pruning estimates the context window in characters (roughly 4 chars per token).
The base window is resolved in this order:
1. `models.providers.*.models[].contextWindow` override.
2. Model definition `contextWindow` from the model registry.
3. Default `200000` tokens.
If `agents.defaults.contextTokens` is set, it caps the resolved window.
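Both knobs in the resolution order above are plain config settings; a sketch (the provider and model names here are placeholders, and model-entry fields other than `contextWindow` are assumptions):

```json5
{
  models: {
    providers: {
      anthropic: {
        models: [
          { id: "claude-example", contextWindow: 200000 }, // per-model override
        ],
      },
    },
  },
  agents: {
    defaults: {
      contextTokens: 120000, // caps the resolved window
    },
  },
}
```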
## Related
## Further reading
- [Compaction](/concepts/compaction) -- summarization-based context reduction
- [Session Management](/concepts/session) -- session lifecycle and routing
- [Gateway Configuration](/gateway/configuration) -- full config reference
- [Gateway Configuration](/gateway/configuration) -- all pruning config knobs
(`contextPruning.*`)


@@ -1,220 +1,84 @@
---
summary: "Agent tools for listing sessions, reading history, cross-session messaging, and spawning sub-agents"
summary: "Agent tools for listing sessions, reading history, and cross-session messaging"
read_when:
- You want to understand agent session tools
- You are configuring cross-session access or sub-agent spawning
- You want to understand what session tools the agent has
- You want to configure cross-session access or sub-agent spawning
title: "Session Tools"
---
# Session Tools
OpenClaw gives agents a small set of tools to interact with sessions: list them,
read their history, send messages across sessions, and spawn isolated sub-agent
runs.
OpenClaw gives agents tools to work across sessions -- listing conversations,
reading history, sending messages to other sessions, and spawning sub-agents.
## Overview
## Available tools
| Tool | Purpose |
| ------------------ | ----------------------------------- |
| `sessions_list` | List sessions with optional filters |
| `sessions_history` | Fetch transcript for one session |
| `sessions_send` | Send a message into another session |
| `sessions_spawn` | Spawn an isolated sub-agent session |
| Tool | What it does |
| ------------------ | ------------------------------------------------------- |
| `sessions_list` | List sessions with optional filters (kind, recency) |
| `sessions_history` | Read the transcript of a specific session |
| `sessions_send` | Send a message to another session and optionally wait |
| `sessions_spawn` | Spawn an isolated sub-agent session for background work |
## Session keys
## Listing and reading sessions
Session tools use **session keys** to identify conversations:
`sessions_list` returns sessions with their key, kind, channel, model, token
counts, and timestamps. Filter by kind (`main`, `group`, `cron`, `hook`,
`node`) or recency (`activeMinutes`).
- `"main"` -- the agent's main direct-chat session.
- `agent:<agentId>:<channel>:group:<id>` -- group chat (pass the full key).
- `cron:<job.id>` -- cron job session.
- `hook:<uuid>` -- webhook session.
- `node-<nodeId>` -- node session.
`sessions_history` fetches the conversation transcript for a specific session.
By default, tool results are excluded -- pass `includeTools: true` to see them.
`global` and `unknown` are reserved and never listed. If
`session.scope = "global"`, it is aliased to `main` for all tools.
Both tools accept either a **session key** (like `"main"`) or a **session ID**
from a previous list call.
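Illustrative argument objects for the two tools (the values are examples, not required settings):

```json5
// sessions_list: recent group sessions only
{ kinds: ["group"], activeMinutes: 60 }

// sessions_history: one session, with tool results included
{ sessionKey: "main", includeTools: true }
```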
## sessions_list
## Sending cross-session messages
Lists sessions as an array of rows.
`sessions_send` delivers a message to another session and optionally waits for
the response:
**Parameters:**
- **Fire-and-forget:** set `timeoutSeconds: 0` to enqueue and return
immediately.
- **Wait for reply:** set a timeout and get the response inline.
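Illustrative argument objects for both patterns (the cron job ID and message text are made up):

```json5
// Fire-and-forget: enqueue and return immediately
{ sessionKey: "cron:nightly-report", message: "Status?", timeoutSeconds: 0 }

// Wait up to 30 seconds for the reply
{ sessionKey: "main", message: "Summarize today", timeoutSeconds: 30 }
```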
| Parameter | Type | Default | Description |
| --------------- | ---------- | -------------- | -------------------------------------------------------- |
| `kinds` | `string[]` | all | Filter: `main`, `group`, `cron`, `hook`, `node`, `other` |
| `limit` | `number` | server default | Max rows returned |
| `activeMinutes` | `number` | -- | Only sessions updated within N minutes |
| `messageLimit` | `number` | `0` | Include last N messages per session (0 = none) |
After the target responds, OpenClaw can run a **reply-back loop** where the
agents alternate messages (up to 5 turns). The target agent can reply
`REPLY_SKIP` to stop early.
When `messageLimit > 0`, OpenClaw fetches chat history per session and includes
the last N messages. Tool results are filtered out in list output -- use
`sessions_history` for tool messages.
## Spawning sub-agents
**Row fields:** `key`, `kind`, `channel`, `displayName`, `updatedAt`,
`sessionId`, `model`, `contextTokens`, `totalTokens`, `thinkingLevel`,
`verboseLevel`, `sendPolicy`, `lastChannel`, `lastTo`, `deliveryContext`,
`transcriptPath`, and optionally `messages`.
`sessions_spawn` creates an isolated session for a background task. It is always
non-blocking -- it returns immediately with a `runId` and `childSessionKey`.
## sessions_history
Key options:
Fetches the transcript for one session.
- `runtime: "subagent"` (default) or `"acp"` for external harness agents.
- `model` and `thinking` overrides for the child session.
- `thread: true` to bind the spawn to a chat thread (Discord, Slack, etc.).
- `sandbox: "require"` to enforce sandboxing on the child.
**Parameters:**
Sub-agents get the full tool set minus session tools (no recursive spawning).
After completion, an announce step posts the result to the requester's channel.
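A schematic `sessions_spawn` argument object (the task text, label, and model name are made up; the keys are the documented parameters):

```json5
{
  task: "Audit the changelog for breaking changes",
  runtime: "subagent",
  label: "changelog-audit",
  model: "anthropic/claude-example", // optional override, placeholder name
  runTimeoutSeconds: 600, // abort after 10 minutes
  cleanup: "keep",
}
```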
| Parameter | Type | Default | Description |
| -------------- | --------- | -------------- | ----------------------------------------------- |
| `sessionKey` | `string` | required | Session key or `sessionId` from `sessions_list` |
| `limit` | `number` | server default | Max messages |
| `includeTools` | `boolean` | `false` | Include `toolResult` messages |
For ACP-specific behavior, see [ACP Agents](/tools/acp-agents).
When given a `sessionId`, OpenClaw resolves it to the corresponding session key.
## Visibility
### Gateway APIs
Session tools are scoped to limit what the agent can see:
Control UI and gateway clients can use lower-level APIs directly:
| Level | Scope |
| ------- | ---------------------------------------- |
| `self` | Only the current session |
| `tree` | Current session + spawned sub-agents |
| `agent` | All sessions for this agent |
| `all` | All sessions (cross-agent if configured) |
- **HTTP:** `GET /sessions/{sessionKey}/history` with query params `limit`,
`cursor`, `includeTools=1`, `follow=1` (upgrades to SSE stream).
- **WebSocket:** `sessions.subscribe` for all lifecycle events,
`sessions.messages.subscribe { key }` for one session's transcript,
`sessions.messages.unsubscribe { key }` to remove.
Default is `tree`. Sandboxed sessions are clamped to `tree` regardless of
config.
## sessions_send
## Further reading
Sends a message into another session.
**Parameters:**
| Parameter | Type | Default | Description |
| ---------------- | -------- | -------- | ---------------------------------- |
| `sessionKey` | `string` | required | Target session key or `sessionId` |
| `message` | `string` | required | Message content |
| `timeoutSeconds` | `number` | > 0 | Wait timeout (0 = fire-and-forget) |
**Behavior:**
- `timeoutSeconds = 0` -- enqueue and return `{ runId, status: "accepted" }`.
- `timeoutSeconds > 0` -- wait for completion, then return the reply.
- Timeout: `{ runId, status: "timeout" }`. The run continues; check
`sessions_history` later.
### Reply-back loop
After the target session responds, OpenClaw runs an alternating reply loop
between requester and target agents:
- Reply `REPLY_SKIP` to stop the ping-pong.
- Max turns: `session.agentToAgent.maxPingPongTurns` (0--5, default 5).
After the loop, an **announce step** posts the result to the target's chat
channel. Reply `ANNOUNCE_SKIP` to stay silent. The announce includes the
original request, round-1 reply, and latest ping-pong reply.
Inter-session messages are tagged with
`message.provenance.kind = "inter_session"` so transcript readers can
distinguish routed agent instructions from external user input.
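In a JSONL transcript, a routed message therefore carries provenance along these lines (the surrounding fields are abbreviated and the exact entry shape is an assumption):

```json5
{
  message: {
    provenance: { kind: "inter_session" },
    // ...rest of the transcript entry
  },
}
```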
## sessions_spawn
Spawns an isolated delegated session for background work.
**Parameters:**
| Parameter | Type | Default | Description |
| ------------------- | --------- | ---------- | -------------------------------------------- |
| `task` | `string` | required | Task description |
| `runtime` | `string` | `subagent` | `subagent` or `acp` |
| `label` | `string` | -- | Label for logs/UI |
| `agentId` | `string` | -- | Target agent or ACP harness ID |
| `model` | `string` | -- | Override sub-agent model |
| `thinking` | `string` | -- | Override thinking level |
| `runTimeoutSeconds` | `number` | `0` | Abort after N seconds (0 = no limit) |
| `thread` | `boolean` | `false` | Request thread-bound routing |
| `mode` | `string` | `run` | `run` or `session` (session requires thread) |
| `cleanup` | `string` | `keep` | `delete` or `keep` |
| `sandbox` | `string` | `inherit` | `inherit` or `require` |
| `attachments` | `array` | -- | Inline files (subagent only) |
**Behavior:**
- Always non-blocking: returns `{ status: "accepted", runId, childSessionKey }`.
- Creates a new `agent:<agentId>:subagent:<uuid>` session with
`deliver: false`.
- Sub-agents get the full tool set minus session tools (configurable via
`tools.subagents.tools`).
- Sub-agents cannot call `sessions_spawn` (no recursive spawning).
- After completion, an announce step posts the result to the requester's
channel. Reply `ANNOUNCE_SKIP` to stay silent.
- Sub-agent sessions are auto-archived after
`agents.defaults.subagents.archiveAfterMinutes` (default: 60).
### Allowlists
- **Subagent:** `agents.list[].subagents.allowAgents` controls which agent IDs
are allowed (`["*"]` for any). Default: only the requester.
- **ACP:** `acp.allowedAgents` controls allowed ACP harness IDs (separate from
subagent policy).
- If the requester is sandboxed, targets that would run unsandboxed are
rejected.
### Attachments
Each entry: `{ name, content, encoding?: "utf8" | "base64", mimeType? }`.
Files are materialized into `<workspace>/.openclaw/attachments/<uuid>/` and a
receipt with sha256 is returned. ACP runtime rejects attachments.
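A sketch of an `attachments` array following that entry shape (file names and contents are examples; the base64 payload is truncated):

```json5
attachments: [
  { name: "notes.txt", content: "Remember to rotate keys", encoding: "utf8" },
  {
    name: "logo.png",
    content: "iVBORw0KGgo...", // truncated example payload
    encoding: "base64",
    mimeType: "image/png",
  },
]
```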
For ACP-specific behavior (harness targeting, permission modes), see
[ACP Agents](/tools/acp-agents).
## Visibility and access control
Session tools can be scoped to limit cross-session access.
### Visibility levels
| Level | What the agent can see |
| ---------------- | ------------------------------------------------------- |
| `self` | Only the current session |
| `tree` (default) | Current session + spawned sub-agent sessions |
| `agent` | Any session belonging to the current agent |
| `all` | Any session (cross-agent requires `tools.agentToAgent`) |
Configure at `tools.sessions.visibility`.
### Sandbox clamping
Sandboxed sessions have an additional clamp via
`agents.defaults.sandbox.sessionToolsVisibility` (default: `spawned`). When
this is set, visibility is clamped to `tree` even if
`tools.sessions.visibility = "all"`.
```json5
{
tools: {
sessions: {
visibility: "tree",
},
},
agents: {
defaults: {
sandbox: {
sessionToolsVisibility: "spawned",
},
},
},
}
```
## Send policy
Policy-based blocking by channel or chat type prevents agents from sending to
restricted sessions. See [Session Management](/concepts/session) for send policy
configuration.
## Related
- [Session Management](/concepts/session) -- session routing, lifecycle, and
maintenance
- [ACP Agents](/tools/acp-agents) -- ACP-specific spawning and permissions
- [Session Management](/concepts/session) -- routing, lifecycle, maintenance
- [ACP Agents](/tools/acp-agents) -- external harness spawning
- [Multi-agent](/concepts/multi-agent) -- multi-agent architecture
- [Gateway Configuration](/gateway/configuration) -- session tool config knobs


@@ -1,298 +1,113 @@
---
summary: "How OpenClaw manages sessions -- routing, isolation, lifecycle, and maintenance"
summary: "How OpenClaw manages conversation sessions"
read_when:
- You want to understand session keys and routing
- You want to configure DM isolation or multi-user setups
- You want to tune session lifecycle or maintenance
- You want to understand session routing and isolation
- You want to configure DM scope for multi-user setups
title: "Session Management"
---
# Session Management
OpenClaw manages conversations through **sessions**. Each session has a key
(which conversation bucket it belongs to), an ID (which transcript file
continues it), and metadata tracked in a session store.
OpenClaw organizes conversations into **sessions**. Each message is routed to a
session based on where it came from -- DMs, group chats, cron jobs, etc.
## How sessions are routed
## How messages are routed
Every inbound message is mapped to a **session key** that determines which
conversation it joins:
| Source | Behavior |
| --------------- | ------------------------- |
| Direct messages | Shared session by default |
| Group chats | Isolated per group |
| Rooms/channels | Isolated per room |
| Cron jobs | Fresh session per run |
| Webhooks | Isolated per hook |
| Source | Session key pattern | Behavior |
| --------------- | ---------------------------------------- | ------------------------------------- |
| Direct messages | `agent:<agentId>:<mainKey>` | Shared by default (`dmScope: "main"`) |
| Group chats | `agent:<agentId>:<channel>:group:<id>` | Isolated per group |
| Rooms/channels | `agent:<agentId>:<channel>:channel:<id>` | Isolated per room |
| Cron jobs | `cron:<job.id>` | Fresh session per run |
| Webhooks | `hook:<uuid>` | Isolated per hook unless overridden |
| Node runs | `node-<nodeId>` | Isolated per node unless overridden |
## DM isolation
Telegram forum topics append `:topic:<threadId>` for per-topic isolation.
## DM scope and isolation
By default, all direct messages share one session (`dmScope: "main"`) for
continuity across devices and channels. This works well for single-user setups,
but can leak context when multiple people message your agent.
### Secure DM mode
By default, all DMs share one session for continuity. This is fine for
single-user setups.
<Warning>
If your agent receives DMs from multiple people, you should enable DM isolation.
Without it, all users share the same conversation context.
If multiple people can message your agent, enable DM isolation. Without it, all
users share the same conversation context -- Alice's private messages would be
visible to Bob.
</Warning>
**The problem:** Alice messages about a private topic. Bob asks "What were we
talking about?" Because both share a session, the model may answer Bob using
Alice's context.
**The fix:**
```json5
{
session: {
dmScope: "per-channel-peer",
dmScope: "per-channel-peer", // isolate by channel + sender
},
}
```
### DM scope options
Other options:
| Value | Key pattern | Best for |
| -------------------------- | -------------------------------------------------- | ------------------------------------ |
| `main` (default) | `agent:<id>:main` | Single-user, cross-device continuity |
| `per-peer` | `agent:<id>:direct:<peerId>` | Multi-user, cross-channel identity |
| `per-channel-peer` | `agent:<id>:<channel>:direct:<peerId>` | Multi-user inboxes (recommended) |
| `per-account-channel-peer` | `agent:<id>:<channel>:<accountId>:direct:<peerId>` | Multi-account inboxes |
- `main` (default) -- all DMs share one session.
- `per-peer` -- isolate by sender (across channels).
- `per-channel-peer` -- isolate by channel + sender (recommended).
- `per-account-channel-peer` -- isolate by account + channel + sender.
### Cross-channel identity linking
<Tip>
If the same person contacts you from multiple channels, use
`session.identityLinks` to link their identities so they share one session.
</Tip>
When using `per-peer` or `per-channel-peer`, the same person messaging from
different channels gets separate sessions. Use `session.identityLinks` to
collapse them:
```json5
{
session: {
identityLinks: {
alice: ["telegram:123456789", "discord:987654321012345678"],
},
},
}
```
The canonical key replaces `<peerId>` so Alice shares one session across
channels.
**When to enable DM isolation:**
- Pairing approvals for more than one sender
- DM allowlist with multiple entries
- `dmPolicy: "open"`
- Multiple phone numbers or accounts can message the agent
Verify settings with `openclaw security audit` (see [security](/cli/security)).
Local CLI onboarding writes `per-channel-peer` by default when unset.
Verify your setup with `openclaw security audit`.
## Session lifecycle
### Resets
Sessions are reused until they expire:
Sessions are reused until they expire. Expiry is evaluated on the next inbound
message:
- **Daily reset** (default) -- new session at 4:00 AM local time on the gateway
host.
- **Idle reset** (optional) -- new session after a period of inactivity. Set
`session.reset.idleMinutes`.
- **Manual reset** -- type `/new` or `/reset` in chat. `/new <model>` also
switches the model.
- **Daily reset** (default) -- 4:00 AM local time on the gateway host. A
session is stale once its last update is before the most recent reset time.
- **Idle reset** (optional) -- `idleMinutes` adds a sliding idle window.
- **Combined** -- when both are configured, whichever expires first forces a new
session.
Override per session type or channel:
```json5
{
session: {
reset: {
mode: "daily",
atHour: 4,
idleMinutes: 120,
},
resetByType: {
thread: { mode: "daily", atHour: 4 },
direct: { mode: "idle", idleMinutes: 240 },
group: { mode: "idle", idleMinutes: 120 },
},
resetByChannel: {
discord: { mode: "idle", idleMinutes: 10080 },
},
},
}
```
### Manual resets
- `/new` or `/reset` starts a fresh session. The remainder of the message is
passed through.
- `/new <model>` accepts a model alias, `provider/model`, or provider name
(fuzzy match) to set the session model.
- If sent alone, OpenClaw runs a short greeting turn to confirm the reset.
- Custom triggers: add to `resetTriggers` array.
- Delete specific keys from the store or remove the JSONL transcript; the next
message recreates them.
- Isolated cron jobs always mint a fresh `sessionId` per run.
When both daily and idle resets are configured, whichever expires first wins.
## Where state lives
All session state is **owned by the gateway**. UI clients (macOS app, WebChat,
TUI) query the gateway for session lists and token counts.
All session state is owned by the **gateway**. UI clients query the gateway for
session data.
In remote mode, the session store lives on the remote gateway host, not your
local machine.
### Storage
| Artifact | Path | Purpose |
| ------------- | --------------------------------------------------------- | --------------------------------- |
| Session store | `~/.openclaw/agents/<agentId>/sessions/sessions.json` | Key-value map of session metadata |
| Transcripts | `~/.openclaw/agents/<agentId>/sessions/<sessionId>.jsonl` | Append-only conversation history |
The store maps `sessionKey -> { sessionId, updatedAt, ... }`. Deleting entries
is safe; they are recreated on demand. Group entries may include `displayName`,
`channel`, `subject`, `room`, and `space` for UI labeling.
Telegram topic sessions use `.../<sessionId>-topic-<threadId>.jsonl`.
- **Store:** `~/.openclaw/agents/<agentId>/sessions/sessions.json`
- **Transcripts:** `~/.openclaw/agents/<agentId>/sessions/<sessionId>.jsonl`
## Session maintenance
OpenClaw keeps the session store and transcripts bounded over time.
### Defaults
| Setting | Default | Description |
| ----------------------- | ------------------- | --------------------------------------------------------------- |
| `mode` | `warn` | `warn` reports what would be evicted; `enforce` applies cleanup |
| `pruneAfter` | `30d` | Stale-entry age cutoff |
| `maxEntries` | `500` | Cap entries in sessions.json |
| `rotateBytes` | `10mb` | Rotate sessions.json when oversized |
| `resetArchiveRetention` | `30d` | Retention for reset archives |
| `maxDiskBytes` | unset | Optional sessions-directory budget |
| `highWaterBytes` | 80% of maxDiskBytes | Target after cleanup |
### Enforcement order (`mode: "enforce"`)
1. Prune stale entries older than `pruneAfter`.
2. Cap entry count to `maxEntries` (oldest first).
3. Archive transcript files for removed entries.
4. Purge old reset/deleted archives by retention policy.
5. Rotate `sessions.json` when it exceeds `rotateBytes`.
6. If `maxDiskBytes` is set, enforce disk budget toward `highWaterBytes`.
### Configuration examples
Conservative enforce policy:
OpenClaw automatically bounds session storage over time. By default, it runs
in `warn` mode (reports what would be cleaned). Set `session.maintenance.mode`
to `"enforce"` for automatic cleanup:
```json5
{
session: {
maintenance: {
mode: "enforce",
pruneAfter: "45d",
maxEntries: 800,
rotateBytes: "20mb",
resetArchiveRetention: "14d",
pruneAfter: "30d",
maxEntries: 500,
},
},
}
```
Hard disk budget:
```json5
{
session: {
maintenance: {
mode: "enforce",
maxDiskBytes: "1gb",
highWaterBytes: "800mb",
},
},
}
```
Preview or force from CLI:
```bash
openclaw sessions cleanup --dry-run
openclaw sessions cleanup --enforce
```
### Performance note
Large session stores can increase write-path latency. To keep things fast:
- Use `mode: "enforce"` in production.
- Set both time and count limits (`pruneAfter` + `maxEntries`).
- Set `maxDiskBytes` + `highWaterBytes` for hard upper bounds.
- Run `openclaw sessions cleanup --dry-run --json` after config changes to
preview impact.
## Send policy
Block delivery for specific session types without listing individual IDs:
```json5
{
session: {
sendPolicy: {
rules: [
{ action: "deny", match: { channel: "discord", chatType: "group" } },
{ action: "deny", match: { keyPrefix: "cron:" } },
{ action: "deny", match: { rawKeyPrefix: "agent:main:discord:" } },
],
default: "allow",
},
},
}
```
Runtime override (owner only):
- `/send on` -- allow for this session.
- `/send off` -- deny for this session.
- `/send inherit` -- clear override and use config rules.
Preview with `openclaw sessions cleanup --dry-run`.
## Inspecting sessions
| Method | What it shows |
| ------------------------------------ | ---------------------------------------------------------- |
| `openclaw status` | Store path, recent sessions |
| `openclaw sessions --json` | All entries (filter with `--active <minutes>`) |
| `/status` in chat | Reachability, context usage, toggles, credential freshness |
| `/context list` or `/context detail` | System prompt contents, biggest context contributors |
| `/stop` in chat | Abort current run, clear queued followups, stop sub-agents |
- `openclaw status` -- session store path and recent activity.
- `openclaw sessions --json` -- all sessions (filter with `--active <minutes>`).
- `/status` in chat -- context usage, model, and toggles.
- `/context list` -- what is in the system prompt.
JSONL transcripts can be opened directly to review full turns.
## Further reading
## Session origin metadata
Each session entry records where it came from (best-effort) in `origin`:
- `label` -- human label (from conversation label + group subject/channel).
- `provider` -- normalized channel ID (including extensions).
- `from` / `to` -- raw routing IDs from the inbound envelope.
- `accountId` -- provider account ID (multi-account).
- `threadId` -- thread/topic ID when supported.
Extensions populate these by sending `ConversationLabel`, `GroupSubject`,
`GroupChannel`, `GroupSpace`, and `SenderName` in the inbound context.
## Tips
- Keep the primary key dedicated to 1:1 traffic; let groups keep their own
keys.
- When automating cleanup, delete individual keys instead of the whole store to
preserve context elsewhere.
- Related: [Session Pruning](/concepts/session-pruning),
[Compaction](/concepts/compaction),
[Session Tools](/concepts/session-tool),
[Session Management Deep Dive](/reference/session-management-compaction).
- [Session Pruning](/concepts/session-pruning) -- trimming tool results
- [Compaction](/concepts/compaction) -- summarizing long conversations
- [Session Tools](/concepts/session-tool) -- agent tools for cross-session work
- [Session Management Deep Dive](/reference/session-management-compaction) --
store schema, transcripts, send policy, origin metadata, and advanced config


@@ -51,7 +51,7 @@ local policy).
When using a custom OpenAI-compatible endpoint,
set `memorySearch.remote.apiKey` (and optional `memorySearch.remote.headers`).
## QMD backend (experimental)
## QMD backend
Set `memory.backend = "qmd"` to swap the built-in SQLite indexer for
[QMD](https://github.com/tobi/qmd): a local-first search sidecar that combines