docs: update session sqlite guidance

This commit is contained in:
Peter Steinberger
2026-05-08 11:54:19 +01:00
parent bca4b69e2c
commit a577c30863
3 changed files with 113 additions and 53 deletions

View File

@@ -612,14 +612,14 @@ or compact user-fact context for the main model.
## Transcript persistence
Active memory blocking memory sub-agent runs create a real `session.jsonl`
transcript during the blocking memory sub-agent call.
Active memory blocking memory sub-agent runs create SQLite transcript rows
during the blocking memory sub-agent call.
By default, that transcript is temporary:
- it is written to a temp directory
- it uses a temporary transcript scope
- it is used only for the blocking memory sub-agent run
- it is deleted immediately after the run finishes
- its rows are removed after the run finishes
If you want to keep those blocking memory sub-agent transcripts on disk for debugging or
inspection, turn persistence on explicitly:
@@ -642,8 +642,8 @@ inspection, turn persistence on explicitly:
```
When enabled, active memory records the blocking sub-agent transcript in the
agent SQLite database and registers a plugin-owned transcript locator under the
state directory, not in the main user conversation transcript path.
agent SQLite database and registers plugin-owned transcript locator metadata,
not a JSONL runtime sidecar and not the main user conversation transcript path.
The default locator namespace is conceptually:

View File

@@ -71,14 +71,21 @@ cached by file path plus `mtimeMs`/`size` and shared across concurrent readers.
Per agent, on the Gateway host:
- Store: `~/.openclaw/state/openclaw.sqlite` by default. `openclaw doctor --fix`
imports legacy `~/.openclaw/agents/<agentId>/sessions/sessions.json` indexes
into SQLite and removes the JSON index after import; Gateway startup leaves
legacy indexes alone.
- Transcripts: `~/.openclaw/state/openclaw.sqlite` (`transcript_events` and
`transcript_files`). Legacy/export paths may still use
`~/.openclaw/agents/<agentId>/sessions/<sessionId>.jsonl` names as stable
handles.
- Global store: `~/.openclaw/state/openclaw.sqlite` by default. It stores
shared registry, migration, plugin, task, backup, and transcript-locator
metadata.
- Agent store: `~/.openclaw/agents/<agentId>/agent/openclaw-agent.sqlite`. It
stores canonical session rows, transcript events, snapshots, VFS entries,
artifacts, and agent-local cache rows.
- Legacy imports: `openclaw doctor --fix` imports
`~/.openclaw/agents/<agentId>/sessions/sessions.json` indexes and JSONL
transcripts into the agent SQLite database, then removes imported legacy
sources after durable verification. Gateway startup leaves legacy indexes
alone.
- Transcripts: runtime transcript events live in the per-agent database
(`transcript_events` and `transcript_event_identities`). The global
`transcript_files` table maps legacy/export/debug path-shaped locators to
`{ agentId, sessionId }`; JSONL files are not runtime sidecars.
- Telegram topic handles: `.../<sessionId>-topic-<threadId>.jsonl`
OpenClaw resolves these via `src/config/sessions/*`.
@@ -199,7 +206,7 @@ The store is safe to edit, but the Gateway is the authority: it may rewrite or r
Transcripts are managed by OpenClaw's SQLite-backed `SessionManager`.
The event stream is stored in `transcript_events`:
The event stream is stored in the per-agent `transcript_events` table:
- First event: session header (`type: "session"`, includes `id`, `cwd`,
`timestamp`, optional `parentSession`)

View File

@@ -1,12 +1,12 @@
---
name: session-logs
description: Search and analyze your own session logs (older/parent conversations) using jq.
description: Search and analyze your own SQLite-backed session logs (older/parent conversations).
metadata:
{
"openclaw":
{
"emoji": "📜",
"requires": { "bins": ["jq", "rg"] },
"requires": { "bins": ["jq", "rg", "sqlite3"] },
"install":
[
{
@@ -23,6 +23,13 @@ metadata:
"bins": ["rg"],
"label": "Install ripgrep (brew)",
},
{
"id": "brew-sqlite",
"kind": "brew",
"formula": "sqlite",
"bins": ["sqlite3"],
"label": "Install sqlite3 (brew)",
},
],
},
}
@@ -30,7 +37,9 @@ metadata:
# session-logs
Search your complete conversation history stored in session JSONL files. Use this when a user references older/parent conversations or asks what was said before.
Search your complete conversation history stored in per-agent SQLite databases.
Use this when a user references older/parent conversations or asks what was said
before.
## Trigger
@@ -38,16 +47,22 @@ Use this skill when the user asks about prior chats, parent conversations, or hi
## Location
Session logs live under the active state directory:
`$OPENCLAW_STATE_DIR/agents/<agentId>/sessions/` (default: `~/.openclaw/agents/<agentId>/sessions/`).
Session logs live under the active state directory in the per-agent database:
`$OPENCLAW_STATE_DIR/agents/<agentId>/agent/openclaw-agent.sqlite` (default:
`~/.openclaw/agents/<agentId>/agent/openclaw-agent.sqlite`).
Use the `agent=<id>` value from the system prompt Runtime line.
- **`sessions.json`** - Index mapping session keys to session IDs
- **`<session-id>.jsonl`** - Full conversation transcript per session
- **`session_entries`** - Session-key rows with JSON metadata
- **`transcript_events`** - Full conversation transcript event stream per session
- **`transcript_event_identities`** - Queryable event ids, parent ids, event types, and idempotency keys
Legacy JSON/JSONL files under `agents/<agentId>/sessions/` are doctor migration
inputs or explicit debug/export artifacts only.
## Structure
Each `.jsonl` file contains messages with:
Each `transcript_events.event_json` value uses the same JSON shape exported to
JSONL:
- `type`: "session" (metadata) or "message"
- `timestamp`: ISO timestamp
@@ -61,91 +76,129 @@ Each `.jsonl` file contains messages with:
```bash
AGENT_ID="<agentId>"
SESSION_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/sessions"
for f in "$SESSION_DIR"/*.jsonl; do
date=$(head -1 "$f" | jq -r '.timestamp' | cut -dT -f1)
size=$(ls -lh "$f" | awk '{print $5}')
echo "$date $size $(basename $f)"
done | sort -r
DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
sqlite3 -readonly -json "$DB" '
SELECT
session_key,
json_extract(entry_json, "$.sessionId") AS session_id,
updated_at
FROM session_entries
ORDER BY updated_at DESC
LIMIT 100;
' | jq -r '.[] | "\(.updated_at) \(.session_id) \(.session_key)"'
```
### Find sessions from a specific day
```bash
AGENT_ID="<agentId>"
SESSION_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/sessions"
for f in "$SESSION_DIR"/*.jsonl; do
head -1 "$f" | jq -r '.timestamp' | grep -q "2026-01-06" && echo "$f"
done
DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
sqlite3 -readonly -json "$DB" '
SELECT session_id, min(created_at) AS first_event_at, max(created_at) AS last_event_at
FROM transcript_events
GROUP BY session_id
HAVING date(first_event_at / 1000, "unixepoch") = "2026-01-06"
ORDER BY first_event_at DESC;
'
```
### Extract user messages from a session
```bash
jq -r 'select(.message.role == "user") | .message.content[]? | select(.type == "text") | .text' <session>.jsonl
AGENT_ID="<agentId>"
SESSION_ID="<sessionId>"
DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
sqlite3 -readonly -noheader "$DB" \
"SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" |
jq -r 'select(.message.role == "user") | .message.content[]? | select(.type == "text") | .text'
```
### Search for keyword in assistant responses
```bash
jq -r 'select(.message.role == "assistant") | .message.content[]? | select(.type == "text") | .text' <session>.jsonl | rg -i "keyword"
AGENT_ID="<agentId>"
SESSION_ID="<sessionId>"
DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
sqlite3 -readonly -noheader "$DB" \
"SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" |
jq -r 'select(.message.role == "assistant") | .message.content[]? | select(.type == "text") | .text' |
rg -i "keyword"
```
### Get total cost for a session
```bash
jq -s '[.[] | .message.usage.cost.total // 0] | add' <session>.jsonl
AGENT_ID="<agentId>"
SESSION_ID="<sessionId>"
DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
sqlite3 -readonly -noheader "$DB" \
"SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" |
jq -s '[.[] | .message.usage.cost.total // 0] | add'
```
### Daily cost summary
```bash
AGENT_ID="<agentId>"
SESSION_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/sessions"
for f in "$SESSION_DIR"/*.jsonl; do
date=$(head -1 "$f" | jq -r '.timestamp' | cut -dT -f1)
cost=$(jq -s '[.[] | .message.usage.cost.total // 0] | add' "$f")
echo "$date $cost"
done | awk '{a[$1]+=$2} END {for(d in a) print d, "$"a[d]}' | sort -r
DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
sqlite3 -readonly -noheader "$DB" 'SELECT event_json FROM transcript_events ORDER BY created_at;' |
jq -r '[.timestamp[0:10], (.message.usage.cost.total // 0)] | @tsv' |
awk '{a[$1]+=$2} END {for(d in a) print d, "$"a[d]}' | sort -r
```
### Count messages and tokens in a session
```bash
jq -s '{
AGENT_ID="<agentId>"
SESSION_ID="<sessionId>"
DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
sqlite3 -readonly -noheader "$DB" \
"SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" |
jq -s '{
messages: length,
user: [.[] | select(.message.role == "user")] | length,
assistant: [.[] | select(.message.role == "assistant")] | length,
first: .[0].timestamp,
last: .[-1].timestamp
}' <session>.jsonl
}'
```
### Tool usage breakdown
```bash
jq -r '.message.content[]? | select(.type == "toolCall") | .name' <session>.jsonl | sort | uniq -c | sort -rn
AGENT_ID="<agentId>"
SESSION_ID="<sessionId>"
DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
sqlite3 -readonly -noheader "$DB" \
"SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" |
jq -r '.message.content[]? | select(.type == "toolCall") | .name' |
sort | uniq -c | sort -rn
```
### Search across ALL sessions for a phrase
```bash
AGENT_ID="<agentId>"
SESSION_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/sessions"
rg -l "phrase" "$SESSION_DIR"/*.jsonl
DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
sqlite3 -readonly -noheader "$DB" 'SELECT session_id || char(9) || event_json FROM transcript_events ORDER BY created_at;' |
rg -i "phrase"
```
## Tips
- Sessions are append-only JSONL (one JSON object per line)
- Large sessions can be several MB - use `head`/`tail` for sampling
- The `sessions.json` index maps chat providers (discord, whatsapp, etc.) to session IDs
- Deleted sessions have `.deleted.<timestamp>` suffix
- Sessions are append-only SQLite rows; export/debug JSONL is one JSON object per line
- Large sessions can be several MB; always filter by `session_id` when you know it
- `session_entries` maps chat providers (Discord, WhatsApp, etc.) to session IDs
- Deleted legacy debug/export files can have `.deleted.<timestamp>` suffix
## Fast text-only hint (low noise)
```bash
AGENT_ID="<agentId>"
SESSION_DIR="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/sessions"
jq -r 'select(.type=="message") | .message.content[]? | select(.type=="text") | .text' "$SESSION_DIR"/<id>.jsonl | rg 'keyword'
SESSION_ID="<sessionId>"
DB="${OPENCLAW_STATE_DIR:-$HOME/.openclaw}/agents/$AGENT_ID/agent/openclaw-agent.sqlite"
sqlite3 -readonly -noheader "$DB" \
"SELECT event_json FROM transcript_events WHERE session_id = '$SESSION_ID' ORDER BY seq;" |
jq -r 'select(.type=="message") | .message.content[]? | select(.type=="text") | .text' |
rg 'keyword'
```