docs: update session sqlite guidance

2026-05-10 12:32:27 +00:00 · 2026-05-08 11:54:19 +01:00
parent bca4b69e2c
commit a577c30863
3 changed files with 113 additions and 53 deletions
--- a/docs/concepts/active-memory.md
+++ b/docs/concepts/active-memory.md
@@ -612,14 +612,14 @@ or compact user-fact context for the main model.

 ## Transcript persistence

-Active memory blocking memory sub-agent runs create a real `session.jsonl`
-transcript during the blocking memory sub-agent call.
+Active memory blocking memory sub-agent runs create SQLite transcript rows
+during the blocking memory sub-agent call.

 By default, that transcript is temporary:

- it is written to a temp directory
+- it uses a temporary transcript scope
 - it is used only for the blocking memory sub-agent run
- it is deleted immediately after the run finishes
+- its rows are removed after the run finishes

 If you want to keep those blocking memory sub-agent transcripts on disk for debugging or
 inspection, turn persistence on explicitly:
@@ -642,8 +642,8 @@ inspection, turn persistence on explicitly:
 ```

 When enabled, active memory records the blocking sub-agent transcript in the
-agent SQLite database and registers a plugin-owned transcript locator under the
-state directory, not in the main user conversation transcript path.
+agent SQLite database and registers plugin-owned transcript locator metadata,
+not a JSONL runtime sidecar and not the main user conversation transcript path.

 The default locator namespace is conceptually:

--- a/docs/reference/session-management-compaction.md
+++ b/docs/reference/session-management-compaction.md
@@ -71,14 +71,21 @@ cached by file path plus `mtimeMs`/`size` and shared across concurrent readers.

 Per agent, on the Gateway host:

- Store: `~/.openclaw/state/openclaw.sqlite` by default. `openclaw doctor --fix`
-  imports legacy `~/.openclaw/agents/<agentId>/sessions/sessions.json` indexes
-  into SQLite and removes the JSON index after import; Gateway startup leaves
-  legacy indexes alone.
- Transcripts: `~/.openclaw/state/openclaw.sqlite` (`transcript_events` and
-  `transcript_files`). Legacy/export paths may still use
-  `~/.openclaw/agents/<agentId>/sessions/<sessionId>.jsonl` names as stable
-  handles.
+- Global store: `~/.openclaw/state/openclaw.sqlite` by default. It stores
+  shared registry, migration, plugin, task, backup, and transcript-locator
+  metadata.
+- Agent store: `~/.openclaw/agents/<agentId>/agent/openclaw-agent.sqlite`. It
+  stores canonical session rows, transcript events, snapshots, VFS entries,
+  artifacts, and agent-local cache rows.
+- Legacy imports: `openclaw doctor --fix` imports
+  `~/.openclaw/agents/<agentId>/sessions/sessions.json` indexes and JSONL
+  transcripts into the agent SQLite database, then removes imported legacy
+  sources after durable verification. Gateway startup leaves legacy indexes
+  alone.
+- Transcripts: runtime transcript events live in the per-agent database
+  (`transcript_events` and `transcript_event_identities`). The global
+  `transcript_files` table maps legacy/export/debug path-shaped locators to
+  `{ agentId, sessionId }`; JSONL files are not runtime sidecars.
  - Telegram topic handles: `.../<sessionId>-topic-<threadId>.jsonl`

 OpenClaw resolves these via `src/config/sessions/*`.
@@ -199,7 +206,7 @@ The store is safe to edit, but the Gateway is the authority: it may rewrite or r

 Transcripts are managed by OpenClaw's SQLite-backed `SessionManager`.

-The event stream is stored in `transcript_events`:
+The event stream is stored in the per-agent `transcript_events` table:

 - First event: session header (`type: "session"`, includes `id`, `cwd`,
  `timestamp`, optional `parentSession`)
--- a/skills/session-logs/SKILL.md
+++ b/skills/session-logs/SKILL.md
@@ -1,12 +1,12 @@
 ---
 name: session-logs
-description: Search and analyze your own session logs (older/parent conversations) using jq.
+description: Search and analyze your own SQLite-backed session logs (older/parent conversations).
 metadata:
  {
    "openclaw":
      {
        "emoji": "📜",
-        "requires": { "bins": ["jq", "rg"] },
+        "requires": { "bins": ["jq", "rg", "sqlite3"] },
        "install":
          [
            {
@@ -23,6 +23,13 @@ metadata:
              "bins": ["rg"],
              "label": "Install ripgrep (brew)",
            },
+            {
+              "id": "brew-sqlite",
+              "kind": "brew",
+              "formula": "sqlite",
+              "bins": ["sqlite3"],
+              "label": "Install sqlite3 (brew)",
+            },
          ],
      },
  }
@@ -30,7 +37,9 @@ metadata:

 # session-logs

-Search your complete conversation history stored in session JSONL files. Use this when a user references older/parent conversations or asks what was said before.
+Search your complete conversation history stored in per-agent SQLite databases.
+Use this when a user references older/parent conversations or asks what was said
+before.

 ## Trigger

@@ -38,16 +47,22 @@ Use this skill when the user asks about prior chats, parent conversations, or hi

 ## Location

-Session logs live under the active state directory:
-`$OPENCLAW_STATE_DIR/agents/<agentId>/sessions/` (default: `~/.openclaw/agents/<agentId>/sessions/`).
+Session logs live under the active state directory in the per-agent database:
+`$OPENCLAW_STATE_DIR/agents/<agentId>/agent/openclaw-agent.sqlite` (default:
+`~/.openclaw/agents/<agentId>/agent/openclaw-agent.sqlite`).
 Use the `agent=<id>` value from the system prompt Runtime line.

- **`sessions.json`** - Index mapping session keys to session IDs
- **`<session-id>.jsonl`** - Full conversation transcript per session
+- **`session_entries`** - Session-key rows with JSON metadata
+- **`transcript_events`** - Full conversation transcript event stream per session
+- **`transcript_event_identities`** - Queryable event ids, parent ids, event types, and idempotency keys
+
+Legacy JSON/JSONL files under `agents/<agentId>/sessions/` are doctor migration
+inputs or explicit debug/export artifacts only.

 ## Structure

-Each `.jsonl` file contains messages with:
+Each `transcript_events.event_json` value uses the same JSON shape exported to
+JSONL:

 - `type`: "session" (metadata) or "message"
 - `timestamp`: ISO timestamp
@@ -61,91 +76,129 @@ Each `.jsonl` file contains messages with:

 ```bash
 AGENT_ID="<agentId>"
-SESSION_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/sessions"
-for f in "$SESSION_DIR"/*.jsonl; do
-  date=$(head -1 "$f" | jq -r '.timestamp' | cut -dT -f1)
-  size=$(ls -lh "$f" | awk '{print $5}')
-  echo "$date $size $(basename $f)"
-done | sort -r
+DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
+sqlite3 -readonly -json "$DB" '
+  SELECT
+    session_key,
+    json_extract(entry_json, "$.sessionId") AS session_id,
+    updated_at
+  FROM session_entries
+  ORDER BY updated_at DESC
+  LIMIT 100;
+' | jq -r '.[] | "\(.updated_at) \(.session_id) \(.session_key)"'
 ```

 ### Find sessions from a specific day

 ```bash
 AGENT_ID="<agentId>"
-SESSION_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/sessions"
-for f in "$SESSION_DIR"/*.jsonl; do
-  head -1 "$f" | jq -r '.timestamp' | grep -q "2026-01-06" && echo "$f"
-done
+DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
+sqlite3 -readonly -json "$DB" '
+  SELECT session_id, min(created_at) AS first_event_at, max(created_at) AS last_event_at
+  FROM transcript_events
+  GROUP BY session_id
+  HAVING date(first_event_at / 1000, "unixepoch") = "2026-01-06"
+  ORDER BY first_event_at DESC;
+'
 ```

 ### Extract user messages from a session

 ```bash
-jq -r 'select(.message.role == "user") | .message.content[]? | select(.type == "text") | .text' <session>.jsonl
+AGENT_ID="<agentId>"
+SESSION_ID="<sessionId>"
+DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
+sqlite3 -readonly -noheader "$DB" \
+  "SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" |
+  jq -r 'select(.message.role == "user") | .message.content[]? | select(.type == "text") | .text'
 ```

 ### Search for keyword in assistant responses

 ```bash
-jq -r 'select(.message.role == "assistant") | .message.content[]? | select(.type == "text") | .text' <session>.jsonl | rg -i "keyword"
+AGENT_ID="<agentId>"
+SESSION_ID="<sessionId>"
+DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
+sqlite3 -readonly -noheader "$DB" \
+  "SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" |
+  jq -r 'select(.message.role == "assistant") | .message.content[]? | select(.type == "text") | .text' |
+  rg -i "keyword"
 ```

 ### Get total cost for a session

 ```bash
-jq -s '[.[] | .message.usage.cost.total // 0] | add' <session>.jsonl
+AGENT_ID="<agentId>"
+SESSION_ID="<sessionId>"
+DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
+sqlite3 -readonly -noheader "$DB" \
+  "SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" |
+  jq -s '[.[] | .message.usage.cost.total // 0] | add'
 ```

 ### Daily cost summary

 ```bash
 AGENT_ID="<agentId>"
-SESSION_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/sessions"
-for f in "$SESSION_DIR"/*.jsonl; do
-  date=$(head -1 "$f" | jq -r '.timestamp' | cut -dT -f1)
-  cost=$(jq -s '[.[] | .message.usage.cost.total // 0] | add' "$f")
-  echo "$date $cost"
-done | awk '{a[$1]+=$2} END {for(d in a) print d, "$"a[d]}' | sort -r
+DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
+sqlite3 -readonly -noheader "$DB" 'SELECT event_json FROM transcript_events ORDER BY created_at;' |
+  jq -r '[.timestamp[0:10], (.message.usage.cost.total // 0)] | @tsv' |
+  awk '{a[$1]+=$2} END {for(d in a) print d, "$"a[d]}' | sort -r
 ```

 ### Count messages and tokens in a session

 ```bash
-jq -s '{
+AGENT_ID="<agentId>"
+SESSION_ID="<sessionId>"
+DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
+sqlite3 -readonly -noheader "$DB" \
+  "SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" |
+  jq -s '{
  messages: length,
  user: [.[] | select(.message.role == "user")] | length,
  assistant: [.[] | select(.message.role == "assistant")] | length,
  first: .[0].timestamp,
  last: .[-1].timestamp
-}' <session>.jsonl
+}'
 ```

 ### Tool usage breakdown

 ```bash
-jq -r '.message.content[]? | select(.type == "toolCall") | .name' <session>.jsonl | sort | uniq -c | sort -rn
+AGENT_ID="<agentId>"
+SESSION_ID="<sessionId>"
+DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
+sqlite3 -readonly -noheader "$DB" \
+  "SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" |
+  jq -r '.message.content[]? | select(.type == "toolCall") | .name' |
+  sort | uniq -c | sort -rn
 ```

 ### Search across ALL sessions for a phrase

 ```bash
 AGENT_ID="<agentId>"
-SESSION_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/sessions"
-rg -l "phrase" "$SESSION_DIR"/*.jsonl
+DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
+sqlite3 -readonly -noheader "$DB" 'SELECT session_id || char(9) || event_json FROM transcript_events ORDER BY created_at;' |
+  rg -i "phrase"
 ```

 ## Tips

- Sessions are append-only JSONL (one JSON object per line)
- Large sessions can be several MB - use `head`/`tail` for sampling
- The `sessions.json` index maps chat providers (discord, whatsapp, etc.) to session IDs
- Deleted sessions have `.deleted.<timestamp>` suffix
+- Sessions are append-only SQLite rows; export/debug JSONL is one JSON object per line
+- Large sessions can be several MB; always filter by `session_id` when you know it
+- `session_entries` maps chat providers (Discord, WhatsApp, etc.) to session IDs
+- Deleted legacy debug/export files can have `.deleted.<timestamp>` suffix

 ## Fast text-only hint (low noise)

 ```bash
 AGENT_ID="<agentId>"
-SESSION_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/sessions"
-jq -r 'select(.type=="message") | .message.content[]? | select(.type=="text") | .text' "$SESSION_DIR"/<id>.jsonl | rg 'keyword'
+SESSION_ID="<sessionId>"
+DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
+sqlite3 -readonly -noheader "$DB" \
+  "SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" |
+  jq -r 'select(.type=="message") | .message.content[]? | select(.type=="text") | .text' |
+  rg 'keyword'
 ```