diff --git a/docs/concepts/active-memory.md b/docs/concepts/active-memory.md index 06f1caa8684..01755983bdd 100644 --- a/docs/concepts/active-memory.md +++ b/docs/concepts/active-memory.md @@ -612,14 +612,14 @@ or compact user-fact context for the main model. ## Transcript persistence -Active memory blocking memory sub-agent runs create a real `session.jsonl` -transcript during the blocking memory sub-agent call. +Active memory blocking memory sub-agent runs create SQLite transcript rows +during the blocking memory sub-agent call. By default, that transcript is temporary: -- it is written to a temp directory +- it uses a temporary transcript scope - it is used only for the blocking memory sub-agent run -- it is deleted immediately after the run finishes +- its rows are removed after the run finishes If you want to keep those blocking memory sub-agent transcripts on disk for debugging or inspection, turn persistence on explicitly: @@ -642,8 +642,8 @@ inspection, turn persistence on explicitly: ``` When enabled, active memory records the blocking sub-agent transcript in the -agent SQLite database and registers a plugin-owned transcript locator under the -state directory, not in the main user conversation transcript path. +agent SQLite database and registers plugin-owned transcript locator metadata, +not a JSONL runtime sidecar and not the main user conversation transcript path. The default locator namespace is conceptually: diff --git a/docs/reference/session-management-compaction.md b/docs/reference/session-management-compaction.md index 4204e6e58e0..cce16cb6a9b 100644 --- a/docs/reference/session-management-compaction.md +++ b/docs/reference/session-management-compaction.md @@ -71,14 +71,21 @@ cached by file path plus `mtimeMs`/`size` and shared across concurrent readers. Per agent, on the Gateway host: -- Store: `~/.openclaw/state/openclaw.sqlite` by default. `openclaw doctor --fix` - imports legacy `~/.openclaw/agents//sessions/sessions.json` indexes - into SQLite and removes the JSON index after import; Gateway startup leaves - legacy indexes alone. -- Transcripts: `~/.openclaw/state/openclaw.sqlite` (`transcript_events` and - `transcript_files`). Legacy/export paths may still use - `~/.openclaw/agents//sessions/.jsonl` names as stable - handles. +- Global store: `~/.openclaw/state/openclaw.sqlite` by default. It stores + shared registry, migration, plugin, task, backup, and transcript-locator + metadata. +- Agent store: `~/.openclaw/agents//agent/openclaw-agent.sqlite`. It + stores canonical session rows, transcript events, snapshots, VFS entries, + artifacts, and agent-local cache rows. +- Legacy imports: `openclaw doctor --fix` imports + `~/.openclaw/agents//sessions/sessions.json` indexes and JSONL + transcripts into the agent SQLite database, then removes imported legacy + sources after durable verification. Gateway startup leaves legacy indexes + alone. +- Transcripts: runtime transcript events live in the per-agent database + (`transcript_events` and `transcript_event_identities`). The global + `transcript_files` table maps legacy/export/debug path-shaped locators to + `{ agentId, sessionId }`; JSONL files are not runtime sidecars. - Telegram topic handles: `.../-topic-.jsonl` OpenClaw resolves these via `src/config/sessions/*`. @@ -199,7 +206,7 @@ The store is safe to edit, but the Gateway is the authority: it may rewrite or r Transcripts are managed by OpenClaw's SQLite-backed `SessionManager`. -The event stream is stored in `transcript_events`: +The event stream is stored in the per-agent `transcript_events` table: - First event: session header (`type: "session"`, includes `id`, `cwd`, `timestamp`, optional `parentSession`) diff --git a/skills/session-logs/SKILL.md b/skills/session-logs/SKILL.md index 51d62a4a812..3f9ed2f1a16 100644 --- a/skills/session-logs/SKILL.md +++ b/skills/session-logs/SKILL.md @@ -1,12 +1,12 @@ --- name: session-logs -description: Search and analyze your own session logs (older/parent conversations) using jq. +description: Search and analyze your own SQLite-backed session logs (older/parent conversations). metadata: { "openclaw": { "emoji": "📜", - "requires": { "bins": ["jq", "rg"] }, + "requires": { "bins": ["jq", "rg", "sqlite3"] }, "install": [ { @@ -23,6 +23,13 @@ metadata: "bins": ["rg"], "label": "Install ripgrep (brew)", }, + { + "id": "brew-sqlite", + "kind": "brew", + "formula": "sqlite", + "bins": ["sqlite3"], + "label": "Install sqlite3 (brew)", + }, ], }, } @@ -30,7 +37,9 @@ metadata: # session-logs -Search your complete conversation history stored in session JSONL files. Use this when a user references older/parent conversations or asks what was said before. +Search your complete conversation history stored in per-agent SQLite databases. +Use this when a user references older/parent conversations or asks what was said +before. ## Trigger @@ -38,16 +47,22 @@ Use this skill when the user asks about prior chats, parent conversations, or hi ## Location -Session logs live under the active state directory: -`$OPENCLAW_STATE_DIR/agents//sessions/` (default: `~/.openclaw/agents//sessions/`). +Session logs live under the active state directory in the per-agent database: +`$OPENCLAW_STATE_DIR/agents//agent/openclaw-agent.sqlite` (default: +`~/.openclaw/agents//agent/openclaw-agent.sqlite`). Use the `agent=` value from the system prompt Runtime line. -- **`sessions.json`** - Index mapping session keys to session IDs -- **`.jsonl`** - Full conversation transcript per session +- **`session_entries`** - Session-key rows with JSON metadata +- **`transcript_events`** - Full conversation transcript event stream per session +- **`transcript_event_identities`** - Queryable event ids, parent ids, event types, and idempotency keys + +Legacy JSON/JSONL files under `agents//sessions/` are doctor migration +inputs or explicit debug/export artifacts only. ## Structure -Each `.jsonl` file contains messages with: +Each `transcript_events.event_json` value uses the same JSON shape exported to +JSONL: - `type`: "session" (metadata) or "message" - `timestamp`: ISO timestamp @@ -61,91 +76,129 @@ Each `.jsonl` file contains messages with: ```bash AGENT_ID="" -SESSION_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/sessions" -for f in "$SESSION_DIR"/*.jsonl; do - date=$(head -1 "$f" | jq -r '.timestamp' | cut -dT -f1) - size=$(ls -lh "$f" | awk '{print $5}') - echo "$date $size $(basename $f)" -done | sort -r +DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite" +sqlite3 -readonly -json "$DB" ' + SELECT + session_key, + json_extract(entry_json, "$.sessionId") AS session_id, + updated_at + FROM session_entries + ORDER BY updated_at DESC + LIMIT 100; +' | jq -r '.[] | "\(.updated_at) \(.session_id) \(.session_key)"' ``` ### Find sessions from a specific day ```bash AGENT_ID="" -SESSION_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/sessions" -for f in "$SESSION_DIR"/*.jsonl; do - head -1 "$f" | jq -r '.timestamp' | grep -q "2026-01-06" && echo "$f" -done +DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite" +sqlite3 -readonly -json "$DB" ' + SELECT session_id, min(created_at) AS first_event_at, max(created_at) AS last_event_at + FROM transcript_events + GROUP BY session_id + HAVING date(first_event_at / 1000, "unixepoch") = "2026-01-06" + ORDER BY first_event_at DESC; +' ``` ### Extract user messages from a session ```bash -jq -r 'select(.message.role == "user") | .message.content[]? | select(.type == "text") | .text' .jsonl +AGENT_ID="" +SESSION_ID="" +DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite" +sqlite3 -readonly -noheader "$DB" \ + "SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" | + jq -r 'select(.message.role == "user") | .message.content[]? | select(.type == "text") | .text' ``` ### Search for keyword in assistant responses ```bash -jq -r 'select(.message.role == "assistant") | .message.content[]? | select(.type == "text") | .text' .jsonl | rg -i "keyword" +AGENT_ID="" +SESSION_ID="" +DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite" +sqlite3 -readonly -noheader "$DB" \ + "SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" | + jq -r 'select(.message.role == "assistant") | .message.content[]? | select(.type == "text") | .text' | + rg -i "keyword" ``` ### Get total cost for a session ```bash -jq -s '[.[] | .message.usage.cost.total // 0] | add' .jsonl +AGENT_ID="" +SESSION_ID="" +DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite" +sqlite3 -readonly -noheader "$DB" \ + "SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" | + jq -s '[.[] | .message.usage.cost.total // 0] | add' ``` ### Daily cost summary ```bash AGENT_ID="" -SESSION_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/sessions" -for f in "$SESSION_DIR"/*.jsonl; do - date=$(head -1 "$f" | jq -r '.timestamp' | cut -dT -f1) - cost=$(jq -s '[.[] | .message.usage.cost.total // 0] | add' "$f") - echo "$date $cost" -done | awk '{a[$1]+=$2} END {for(d in a) print d, "$"a[d]}' | sort -r +DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite" +sqlite3 -readonly -noheader "$DB" 'SELECT event_json FROM transcript_events ORDER BY created_at;' | + jq -r '[.timestamp[0:10], (.message.usage.cost.total // 0)] | @tsv' | + awk '{a[$1]+=$2} END {for(d in a) print d, "$"a[d]}' | sort -r ``` ### Count messages and tokens in a session ```bash -jq -s '{ +AGENT_ID="" +SESSION_ID="" +DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite" +sqlite3 -readonly -noheader "$DB" \ + "SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" | + jq -s '{ messages: length, user: [.[] | select(.message.role == "user")] | length, assistant: [.[] | select(.message.role == "assistant")] | length, first: .[0].timestamp, last: .[-1].timestamp -}' .jsonl +}' ``` ### Tool usage breakdown ```bash -jq -r '.message.content[]? | select(.type == "toolCall") | .name' .jsonl | sort | uniq -c | sort -rn +AGENT_ID="" +SESSION_ID="" +DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite" +sqlite3 -readonly -noheader "$DB" \ + "SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" | + jq -r '.message.content[]? | select(.type == "toolCall") | .name' | + sort | uniq -c | sort -rn ``` ### Search across ALL sessions for a phrase ```bash AGENT_ID="" -SESSION_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/sessions" -rg -l "phrase" "$SESSION_DIR"/*.jsonl +DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite" +sqlite3 -readonly -noheader "$DB" 'SELECT session_id || char(9) || event_json FROM transcript_events ORDER BY created_at;' | + rg -i "phrase" ``` ## Tips -- Sessions are append-only JSONL (one JSON object per line) -- Large sessions can be several MB - use `head`/`tail` for sampling -- The `sessions.json` index maps chat providers (discord, whatsapp, etc.) to session IDs -- Deleted sessions have `.deleted.` suffix +- Sessions are append-only SQLite rows; export/debug JSONL is one JSON object per line +- Large sessions can be several MB; always filter by `session_id` when you know it +- `session_entries` maps chat providers (Discord, WhatsApp, etc.) to session IDs +- Deleted legacy debug/export files can have `.deleted.` suffix ## Fast text-only hint (low noise) ```bash AGENT_ID="" -SESSION_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/sessions" -jq -r 'select(.type=="message") | .message.content[]? | select(.type=="text") | .text' "$SESSION_DIR"/.jsonl | rg 'keyword' +SESSION_ID="" +DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite" +sqlite3 -readonly -noheader "$DB" \ + "SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" | + jq -r 'select(.type=="message") | .message.content[]? | select(.type=="text") | .text' | + rg 'keyword' ```