mirror of https://github.com/moltbot/moltbot.git, synced 2026-05-18 20:19:47 +00:00
refactor: move session state migration to doctor
@@ -61,27 +61,22 @@ This plan has started landing in slices:
 canonical SQLite stores avoid that path. The cron timer no longer runs a
 dedicated session reaper; cron run sessions are maintained through the same
 explicit session cleanup path as other rows.
-- Transcript events have a SQLite store primitive with JSONL import/export.
-Transcript append paths dual-write when the caller already has agent and
-session scope, including gateway-injected assistant messages. Scoped appends
-also import the current JSONL stream into SQLite when the SQLite transcript is
-empty, so headers and legacy rows are not skipped before the new event is
-mirrored. Scoped latest/tail assistant transcript reads can now use the
-SQLite mirror first, and delivery-mirror idempotency/latest-match checks use
-the same scoped mirror before falling back to JSONL for legacy or file-only
-callers. `/export-session` and `before_reset` hook payload construction can
-also read scoped SQLite transcript events when the compatibility JSONL is
-missing, and silent session-rotation replay can use the scoped SQLite
-transcript tail before falling back to JSONL. Shared async Gateway transcript
-readers also have a scoped SQLite fallback for chat history, TUI history,
-restart and subagent recovery, managed outgoing media indexing, token
-estimation, title/preview/usage helpers, and bounded session inspection
-surfaces. JSONL remains the compatibility file while the transcript moves to
-OpenClaw-owned semantics. The remaining transcript tail rewrites for
-recovery/yield cleanup are now isolated behind OpenClaw-owned helpers instead
-of being duplicated inline, and live runs no longer need PI's private
-first-run persistence normalization because OpenClaw's file-backed manager
-persists the header and initial user message synchronously.
+- Transcript events are SQLite-primary. OpenClaw-owned append paths require
+agent/session scope and write `transcript_events` directly; `*.jsonl` is no
+longer a runtime mirror for those paths. JSONL is now an explicit
+import/export/debug shape only. The OpenClaw transcript session manager,
+Gateway-injected assistant messages, CLI transcript persistence, Codex
+app-server mirroring, compaction successor transcripts, manual compaction
+boundary rewrites, and reset/header creation all persist through SQLite.
+Scoped latest/tail assistant reads, delivery-mirror idempotency/latest-match
+checks, `/export-session`, `before_reset` hook payloads, silent rotation
+replay, chat/TUI history, restart/subagent recovery, managed media indexing,
+token estimation, title/preview/usage helpers, runtime transcript repair,
+bootstrap completion checks, and bounded inspection all use the scoped SQLite
+transcript. Legacy JSONL import is doctor/import/debug only: `openclaw doctor
+--fix` builds the transcript database from old files and removes the JSONL
+sources after successful import. Runtime paths do not import, prune, or repair
+JSONL files.
 - `AgentFilesystem` and `SqliteVirtualAgentFs` exist for scratch storage, with
 `disk`, `vfs-scratch`, and `vfs-only` filesystem modes at the runtime
 boundary. VFS contents can be listed and exported for support bundles. When
@@ -195,12 +190,12 @@ This plan has started landing in slices:
 OpenAI completion conversion subpaths route through narrow OpenClaw facades.
 TUI imports route through `src/agents/pi-tui-contract.ts`, with
 `src/tui/pi-tui-contract.ts` left as a local compatibility re-export.
-- Transcript JSONL header, entry, tree, parser, legacy migration, context
+- Transcript header, entry, tree, parser, legacy migration, context
 builder, and session-manager structural types are now defined by OpenClaw's
 transcript contract. The parser, migration, and context builder runtime
 helpers have one OpenClaw-owned implementation under `src/agents/transcript`
 instead of duplicated facade/file-state logic. OpenClaw also owns a
-synchronous file-backed transcript session manager that implements the live
+synchronous SQLite-backed transcript session manager that implements the live
 `SessionManager` shape over `TranscriptFileState`, including header creation,
 append persistence, tree, label, branch, session name, branch-summary,
 in-memory, create/open, list/listAll, and fork APIs. Live embedded runs,
@@ -356,9 +351,10 @@ Migration order:
 stores.
 5. Import old `sessions.json` only from `openclaw doctor --fix`, then remove the
 JSON index after SQLite has the rows. Done for session indexes.
-6. Leave `*.jsonl` transcripts on disk while PI owns transcript semantics.
-7. After session manager ownership moves behind OpenClaw APIs, store transcript
-events in SQLite and export JSONL for compatibility.
+6. Import old `*.jsonl` transcripts only from `openclaw doctor --fix`, then
+remove the JSONL source after SQLite has the events. Done for canonical
+transcript files.
+7. Keep JSONL export as explicit debug/support output only.
 
 Keep `openclaw.json` and `auth-profiles.json` file-backed until operator
 repair, secret audit, and backup flows can handle the SQLite layout naturally.
@@ -588,7 +584,7 @@ Phase 5: transcript ownership
 
 - Move transcript mutation behind OpenClaw APIs.
 - Store transcript events in SQLite.
-- Export JSONL for compatibility and debugging.
+- Import legacy JSONL through doctor only; export JSONL for debugging/support.
 - Remove direct PI `SessionManager` usage from non-adapter code.
 
 Phase 6: internalize or replace PI pieces
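Review note on migration step 6 above: the ordering "import, then remove the JSONL source after SQLite has the events" can be sketched as follows. This is only an illustration under stated assumptions. `LegacyFiles`, `EventStore`, and `importLegacyTranscript` are hypothetical names, not identifiers from the OpenClaw codebase, and the choice to leave the file untouched when the store already holds events is an assumption rather than documented doctor behavior.

```typescript
// Hypothetical interfaces; the real doctor code and SQLite layer are not shown in this diff.
interface LegacyFiles {
  read(path: string): string;
  remove(path: string): void;
}

interface EventStore {
  count(sessionId: string): number;
  // Assumed to be atomic (for example, one SQLite transaction).
  insertAll(sessionId: string, events: unknown[]): void;
}

function importLegacyTranscript(
  files: LegacyFiles,
  store: EventStore,
  sessionId: string,
  path: string,
): "imported" | "skipped" {
  // Assumption: if the store already has events, leave the legacy file alone.
  if (store.count(sessionId) > 0) return "skipped";

  // Parse every line first. A malformed line throws here, before anything
  // is written or deleted, so the source file survives a failed import.
  const events = files
    .read(path)
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as unknown);

  store.insertAll(sessionId, events);

  // Remove the source only after the store confirms it holds the events.
  if (store.count(sessionId) !== events.length) {
    throw new Error(`import verification failed for ${path}`);
  }
  files.remove(path);
  return "imported";
}
```

The point of the ordering is that deletion is the last step, so any failure before it leaves the legacy file available for a retry.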
@@ -1,7 +1,7 @@
 ---
 summary: "Deep dive: session store + transcripts, lifecycle, and (auto)compaction internals"
 read_when:
-- You need to debug session ids, transcript JSONL, SQLite session rows, or legacy sessions.json fields
+- You need to debug session ids, SQLite session rows/events, or doctor migration of legacy sessions.json/JSONL files
 - You are changing auto-compaction behavior or adding "pre-compaction" housekeeping
 - You want to implement memory flushes or silent system turns
 title: "Session management deep dive"
@@ -11,7 +11,7 @@ OpenClaw manages sessions end-to-end across these areas:
 
 - **Session routing** (how inbound messages map to a `sessionKey`)
 - **Session store** and what it tracks
-- **Transcript persistence** (`*.jsonl`) and its structure
+- **Transcript persistence** (SQLite event streams, doctor-only JSONL import, explicit debug export) and its structure
 - **Transcript hygiene** (provider-specific fixups before runs)
 - **Context limits** (context window vs tracked tokens)
 - **Compaction** (manual and auto-compaction) and where to hook pre-compaction work
@@ -47,17 +47,15 @@ OpenClaw persists sessions in two layers:
 - Tracks session metadata (current session id, last activity, toggles, token counters, etc.)
 
 2. **Transcript (`<sessionId>.jsonl`)**
-- Append-only transcript with tree structure (entries have `id` + `parentId`)
+- SQLite-backed transcript event stream with tree structure (entries have `id` + `parentId`)
 - Stores the actual conversation + tool calls + compaction summaries
 - Used to rebuild the model context for future turns
-- Mirrored into SQLite for scoped Gateway appends; scoped latest/tail
-assistant-text lookups, session exports, and `before_reset` hook payloads
-prefer that mirror and fall back to JSONL. Silent session rotations also
-replay recent user/assistant turns from the scoped SQLite mirror when
-available. Shared async Gateway transcript readers fall back to the scoped
-SQLite mirror for chat history, TUI history, recovery, managed media
-indexing, token estimation, title/preview/usage helpers, and bounded
-session inspection when the compatibility JSONL is missing.
+- Stored in SQLite for OpenClaw-owned runtime paths; JSONL is legacy
+import/export/debug compatibility, not a runtime sidecar
+- Scoped latest/tail assistant-text lookups, session exports, `before_reset`
+hook payloads, silent session rotations, chat history, TUI history,
+recovery, managed media indexing, token estimation, title/preview/usage
+helpers, and bounded session inspection read the scoped SQLite transcript.
 - Large pre-compaction debug checkpoints are skipped once the active
 transcript exceeds the checkpoint size cap, avoiding a second giant
 `.checkpoint.*.jsonl` copy.
@@ -78,8 +76,11 @@ Per agent, on the Gateway host:
 imports legacy `~/.openclaw/agents/<agentId>/sessions/sessions.json` indexes
 into SQLite and removes the JSON index after import; Gateway startup leaves
 legacy indexes alone.
-- Transcripts: `~/.openclaw/agents/<agentId>/sessions/<sessionId>.jsonl`
-- Telegram topic sessions: `.../<sessionId>-topic-<threadId>.jsonl`
+- Transcripts: `~/.openclaw/state/openclaw.sqlite` (`transcript_events` and
+`transcript_files`). Legacy/export paths may still use
+`~/.openclaw/agents/<agentId>/sessions/<sessionId>.jsonl` names as stable
+handles.
+- Telegram topic handles: `.../<sessionId>-topic-<threadId>.jsonl`
 
 OpenClaw resolves these via `src/config/sessions/*`.
 
@@ -105,10 +106,9 @@ configured age, count, or disk budget.
 
 OpenClaw no longer creates automatic `sessions.json.bak.*` rotation backups during Gateway writes. The legacy `session.maintenance.rotateBytes` key is ignored and `openclaw doctor --fix` removes it from older configs.
 
-Transcript mutations use a session write lock on the transcript file. Lock acquisition waits up to
-`session.writeLock.acquireTimeoutMs` before surfacing a busy-session error; the default is `60000`
-ms. Raise this only when legitimate prep, cleanup, compaction, or transcript mirror work contends
-longer on slow machines. Stale-lock detection and maximum hold warnings remain separate policies.
+Transcript mutations are serialized through SQLite transactions plus the
+per-session append queue. The legacy `session.writeLock.acquireTimeoutMs`
+setting remains for older import/debug paths that still touch JSONL files.
 
 Enforcement order for disk budget cleanup (`mode: "enforce"`):
 
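Review note on the hunk above: the "per-session append queue" is named here but not shown in this diff. One common way to get that kind of serialization is a promise chain keyed by session id, sketched below. `SessionAppendQueue` and `enqueue` are hypothetical names, and the real implementation may differ. The sketch covers only in-process ordering, not the SQLite transaction part.

```typescript
// Hypothetical sketch: appends for one session run in call order,
// while different sessions do not block each other.
class SessionAppendQueue {
  private tails = new Map<string, Promise<void>>();

  enqueue<T>(sessionId: string, task: () => Promise<T> | T): Promise<T> {
    const prev = this.tails.get(sessionId) ?? Promise.resolve();
    // Start this task only after the previous one for the session settles.
    const run = prev.then(task);
    // The stored tail never rejects, so one failed append does not
    // poison later appends for the same session.
    const tail: Promise<void> = run.then(
      () => undefined,
      () => undefined,
    );
    this.tails.set(sessionId, tail);
    // Drop the map entry once the queue drains so idle sessions hold no state.
    void tail.then(() => {
      if (this.tails.get(sessionId) === tail) this.tails.delete(sessionId);
    });
    return run;
  }
}
```

A caller would wrap each transcript write, for example `queue.enqueue(sessionId, () => writeEvent(event))`, where `writeEvent` stands in for whatever performs the transaction.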
@@ -209,15 +209,18 @@ The store is safe to edit, but the Gateway is the authority: it may rewrite or r
 
 ---
 
-## Transcript structure (`*.jsonl`)
+## Transcript structure
 
-Transcripts are managed by `@mariozechner/pi-coding-agent`'s `SessionManager`.
+Transcripts are managed by OpenClaw's SQLite-backed `SessionManager`.
 
-The file is JSONL:
+The event stream is stored in `transcript_events`:
 
-- First line: session header (`type: "session"`, includes `id`, `cwd`, `timestamp`, optional `parentSession`)
+- First event: session header (`type: "session"`, includes `id`, `cwd`,
+`timestamp`, optional `parentSession`)
 - Then: session entries with `id` + `parentId` (tree)
 
+JSONL import/export uses the same event shape, one JSON object per line.
+
 Notable entry types:
 
 - `message`: user/assistant/toolResult messages
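Review note on the transcript structure hunk above: the event shape (a `type: "session"` header followed by entries linked by `id` + `parentId`, exported as one JSON object per line) can be illustrated with a small sketch. Only the field names `type`, `id`, `parentId`, `cwd`, `timestamp`, and `parentSession` come from the documented format. The interfaces, the `payload` field, the helper names, and the assumption that a root entry has `parentId: null` are illustrative, not OpenClaw's actual types.

```typescript
// Hypothetical shapes modeled on the fields named in the doc.
interface SessionHeader {
  type: "session";
  id: string;
  cwd: string;
  timestamp: string;
  parentSession?: string;
}

interface SessionEntry {
  type: string; // for example "message"
  id: string;
  parentId: string | null; // assumption: null marks a root entry
  payload?: unknown;
}

type TranscriptEvent = SessionHeader | SessionEntry;

// JSONL export: one JSON object per line.
function exportJsonl(events: TranscriptEvent[]): string {
  return events.map((event) => JSON.stringify(event)).join("\n") + "\n";
}

// JSONL import: skip blank lines, parse the rest in order.
function importJsonl(text: string): TranscriptEvent[] {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as TranscriptEvent);
}

function isEntry(event: TranscriptEvent): event is SessionEntry {
  return event.type !== "session";
}

// Follow parentId links from a leaf back to its root; return root-first order.
function branchTo(events: TranscriptEvent[], leafId: string): SessionEntry[] {
  const byId = new Map<string, SessionEntry>();
  for (const event of events) {
    if (isEntry(event)) byId.set(event.id, event);
  }
  const path: SessionEntry[] = [];
  const seen = new Set<string>(); // guards against accidental cycles
  let cursor: string | null = leafId;
  while (cursor !== null && !seen.has(cursor)) {
    seen.add(cursor);
    const entry: SessionEntry | undefined = byId.get(cursor);
    if (!entry) break;
    path.push(entry);
    cursor = entry.parentId;
  }
  return path.reverse();
}

const sampleEvents: TranscriptEvent[] = [
  { type: "session", id: "s1", cwd: "/tmp", timestamp: "2026-01-01T00:00:00Z" },
  { type: "message", id: "a", parentId: null },
  { type: "message", id: "b", parentId: "a" },
  { type: "message", id: "c", parentId: "a" }, // sibling branch of "b"
];
```

With this sample, `branchTo(sampleEvents, "b")` yields the entries `a` then `b`, which is the kind of walk a context builder would need when rebuilding one branch of the tree.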