| summary | title | read_when |
|---|---|---|
| Migration plan for making SQLite the primary durable state and cache layer while keeping config file-backed | Database-first state refactor | |

Database-First State Refactor
Decision
Use a two-level SQLite layout:
- Global database: ~/.openclaw/state/openclaw.sqlite
- Agent database: one SQLite database per agent for agent-owned workspace, transcript, VFS, artifact, and large per-agent runtime state
- Configuration stays file-backed: openclaw.json and explicit credential or auth-profile files remain outside the database until there is a separate secrets/export design
The global database is the control-plane database. It owns agent discovery, shared gateway state, pairing, device/node state, task and flow ledgers, plugin state, scheduler runtime state, backup metadata, and migration state.
The agent database is the data-plane database. It owns the agent's session metadata, transcript event stream, VFS workspace or scratch namespace, tool artifacts, run artifacts, and searchable/indexable agent-local cache data.
This gives one durable global view without forcing large agent workspaces, transcripts, and binary scratch data into the shared gateway write lane.
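As a rough illustration of the two-level layout, the sketch below resolves both database paths from one state root. The function names and the agent-id check are hypothetical. It also assumes that the per-agent path agents/<agentId>/agent/openclaw-agent.sqlite sits under the same ~/.openclaw root as the global database.

```typescript
import { homedir } from "node:os";
import { join } from "node:path";

// Hypothetical helper names; only the two path shapes come from this plan.
export function resolveStateRoot(home: string = homedir()): string {
  return join(home, ".openclaw");
}

// Control-plane database: one per installation.
export function resolveGlobalDatabasePath(stateRoot: string): string {
  return join(stateRoot, "state", "openclaw.sqlite");
}

// Data-plane database: one per agent.
export function resolveAgentDatabasePath(stateRoot: string, agentId: string): string {
  // Assumption: agent ids are simple path-safe tokens. Reject anything that
  // could escape the agents/ directory.
  if (!/^[A-Za-z0-9._-]+$/.test(agentId) || agentId === "." || agentId === "..") {
    throw new Error(`invalid agent id: ${agentId}`);
  }
  return join(stateRoot, "agents", agentId, "agent", "openclaw-agent.sqlite");
}
```

Keeping the two resolvers separate mirrors the control-plane and data-plane split: a caller that only needs agent discovery never has to touch a per-agent path.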
Hard Contract
This migration has one canonical runtime shape:
- Session rows persist session metadata only. They must not persist transcriptLocator, transcript file paths, sibling JSONL paths, lock paths, pruning metadata, or file-era compatibility pointers.
- Transcript identity is always SQLite identity: {agentId, sessionId} plus optional topic metadata where the protocol needs it.
- sqlite-transcript://... is not a runtime or protocol identity. New code must not derive, persist, pass, parse, or migrate transcript locators. Remaining uses are cleanup debt in tests or explicit export/debug materialization only.
- Legacy sessions.json, transcript JSONL, .jsonl.lock, pruning, truncation, and old session-path logic belong only to the doctor migration/import path.
- Runtime startup, hot reply paths, compaction, reset, recovery, diagnostics, TTS, memory hooks, subagents, plugin command routing, protocol boundaries, and hooks must pass {agentId, sessionId} through the runtime.
- Tests should seed and assert SQLite transcript rows through {agentId, sessionId}. Tests that only prove JSONL path forwarding, caller-supplied locator preservation, or transcript-file compatibility should be deleted unless they cover doctor import, export/debug materialization, or protocol shape.
- runEmbeddedPiAgent(...), prepared worker runs, and the inner embedded attempt must not accept transcript locators. They open the SQLite transcript manager by {agentId, sessionId} and pass that manager to the internalized PI-compatible agent session, so stale callers cannot make the runner write JSON/JSONL transcripts.
- Runner diagnostics must store runtime/cache/payload trace records in SQLite. Runtime diagnostics must not expose JSONL file override knobs; export/debug commands can materialize files explicitly from database rows.
Implementation work should keep deleting code until these statements are true without exceptions outside doctor/import/export/debug boundaries.
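One way to picture the contract is a boundary guard that accepts only an {agentId, sessionId} scope and refuses anything locator-shaped. This is a minimal sketch, not the project's actual API: TranscriptScope, parseTranscriptScope, and the list of rejected field names are hypothetical and chosen only to mirror the terms used above.

```typescript
// Hypothetical scope type: the only transcript identity the runtime passes around.
export interface TranscriptScope {
  readonly agentId: string;
  readonly sessionId: string;
}

// File-era fields that a stale caller might still send (illustrative list).
const FORBIDDEN_KEYS = ["transcriptLocator", "transcriptPath", "storePath"] as const;

export function parseTranscriptScope(input: unknown): TranscriptScope {
  // A bare string such as "sqlite-transcript://..." or "/tmp/x.jsonl" is never a scope.
  if (typeof input !== "object" || input === null) {
    throw new TypeError("transcript scope must be an object, not a locator or path");
  }
  const record = input as Record<string, unknown>;
  for (const key of FORBIDDEN_KEYS) {
    if (key in record) {
      throw new TypeError(`file-era field is not accepted: ${key}`);
    }
  }
  const { agentId, sessionId } = record;
  if (typeof agentId !== "string" || agentId.length === 0) {
    throw new TypeError("agentId is required");
  }
  if (typeof sessionId !== "string" || sessionId.length === 0) {
    throw new TypeError("sessionId is required");
  }
  // Return a fresh object so no extra caller-supplied fields leak through.
  return { agentId, sessionId };
}
```

Failing loudly at the boundary, rather than silently ignoring a stale locator, makes leftover file-era callers visible during the cleanup.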
Code-Read Assumptions
No follow-up product decisions are blocking this plan. The implementation should proceed with these assumptions:
- Use node:sqlite directly and require the Node 24+ runtime for this storage path.
- Keep exactly one normal configuration file. Do not move config, credentials, provider auth profiles, plugin manifests, or Git workspaces into SQLite in this refactor.
- Runtime compatibility files are not required. Legacy JSON and JSONL files are migration inputs only. The branch-local SQLite sidecars never shipped and are deleted instead of imported.
- openclaw doctor --fix owns the legacy file-to-database migration step. Runtime startup and openclaw migrate should not carry legacy OpenClaw database-upgrade paths.
- Runtime must not migrate, normalize, or bridge transcript locators. Active transcript identity is {agentId, sessionId} in SQLite. File paths are legacy doctor inputs only, and sqlite-transcript://... must disappear from runtime, protocol, hook, and plugin surfaces instead of being treated as a boundary handle.
- Codex app-server bindings use the OpenClaw sessionId as the canonical SQLite key. sessionKey is metadata for routing/display and must not replace the durable session id or resurrect transcript-file identity.
- Backup output should remain one archive file. Database contents should enter that archive as compact SQLite snapshots, not raw live WAL sidecars.
- Transcript search is useful but not required for the first database-first cut. Design the schema so FTS can be added later.
- Worker execution should stay experimental behind settings while the database boundary settles.
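The plan does not say how the compact snapshots are produced. One possible approach, shown here only as an assumption, is SQLite's VACUUM INTO statement, which writes a defragmented single-file copy without -wal or -shm sidecars. The helper name and the staging-directory prefix are hypothetical.

```typescript
import { mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";
import { DatabaseSync } from "node:sqlite";

// Hypothetical helper: copies an open database into a fresh staging directory
// and returns the snapshot path, ready to be added to the backup archive.
export function snapshotToStagingDir(db: DatabaseSync, fileName: string): string {
  const stagingDir = mkdtempSync(join(tmpdir(), "openclaw-backup-"));
  const snapshotPath = join(stagingDir, fileName);
  // VACUUM INTO requires that the target file does not already exist;
  // the freshly created directory guarantees that.
  db.prepare("VACUUM INTO ?").run(snapshotPath);
  return snapshotPath;
}
```

Because the snapshot is a complete database file on its own, the archive step can treat it like any other file and never needs to copy live WAL state.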
Code-Read Findings
The current branch is already past the proof-of-concept stage. The shared
database exists, Node node:sqlite is wired through a small runtime helper, and
former stores now write to state/openclaw.sqlite or the owning
openclaw-agent.sqlite database.
The remaining work is not choosing SQLite; it is deleting compatibility-shaped interfaces that still look like the old file world:
- Some compatibility call surfaces still carry storePath for explicit migration/export/debug metadata, but hot session reads and writes now resolve the SQLite row from { agentId, sessionKey } instead of treating the path as the runtime identity.
- Session writes no longer pass through the old in-process store-writer.ts queue. SQLite patch writes use conflict detection and bounded retry instead.
- Legacy path discovery still has valid migration uses, but runtime code should stop treating sessions.json and transcript JSONL files as possible write targets.
- Agent-owned tables live in per-agent SQLite databases. The global DB keeps registry/control-plane rows; transcript identity is {agentId, sessionId} in the per-agent transcript rows. Runtime code must not persist transcript file paths or migrate transcript locators.
- Doctor already imports several legacy files. The cleanup is to make that a single explicit migration implementation that doctor calls, with a durable migration report.
No additional product questions are blocking implementation.
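The "conflict detection and bounded retry" pattern mentioned in the findings can be sketched independently of SQLite. The version below is an assumption-laden illustration: RowStore, MemoryRowStore, and patchRow are hypothetical names, and the in-memory map stands in for a row whose version column is checked in the UPDATE's WHERE clause.

```typescript
interface VersionedRow<T> {
  readonly version: number;
  readonly value: T;
}

interface RowStore<T> {
  read(key: string): VersionedRow<T> | undefined;
  // Writes only if the stored version still equals expectedVersion
  // (0 means "row absent"). Returns false when another writer got there first.
  compareAndSwap(key: string, expectedVersion: number, next: T): boolean;
}

// In-memory stand-in for a SQLite table with a version column.
class MemoryRowStore<T> implements RowStore<T> {
  private readonly rows = new Map<string, VersionedRow<T>>();

  read(key: string): VersionedRow<T> | undefined {
    return this.rows.get(key);
  }

  compareAndSwap(key: string, expectedVersion: number, next: T): boolean {
    const currentVersion = this.rows.get(key)?.version ?? 0;
    if (currentVersion !== expectedVersion) return false;
    this.rows.set(key, { version: currentVersion + 1, value: next });
    return true;
  }
}

// Re-reads and re-applies the patch on conflict, up to maxAttempts times.
function patchRow<T>(
  store: RowStore<T>,
  key: string,
  patch: (current: T | undefined) => T,
  maxAttempts = 3,
): T {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const current = store.read(key);
    const next = patch(current?.value);
    if (store.compareAndSwap(key, current?.version ?? 0, next)) return next;
  }
  throw new Error(`patch conflict on ${key} after ${maxAttempts} attempts`);
}
```

The patch function is re-run against the freshly read row on every attempt, so a retried write never clobbers a concurrent update. The attempt limit keeps a hot row from stalling the caller indefinitely.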
Current Code Shape
The branch already has a real shared SQLite base:
- src/state/openclaw-state-db.ts opens openclaw.sqlite, sets WAL, synchronous=NORMAL, busy_timeout=30000, foreign_keys=ON, and applies the generated schema module derived from src/state/openclaw-state-schema.sql.
- Kysely table types and runtime schema modules are generated from disposable SQLite databases created from the committed .sql files; runtime code no longer keeps copy-pasted schema strings for global, per-agent, or proxy capture databases.
- Runtime stores derive selected and inserted row types from those generated Kysely DB interfaces instead of shadowing SQLite row shapes by hand. Raw SQL remains limited to schema application, pragmas, and migration-only DDL.
- The SQLite schemas are collapsed to user_version = 1 because this database layout has not shipped yet. Runtime openers create the current schema only; file-to-database import remains in doctor code, and branch-local database upgrade helpers have been deleted.
- Relational ownership is enforced where the ownership boundary is canonical: source migration rows cascade from migration_runs, task delivery state cascades from task_runs, and transcript identity rows cascade from transcript events.
- Current shared tables include kv, agents, agent_databases, plugin_state_entries, plugin_blob_entries, capture_sessions, capture_events, capture_blobs, sandbox_registry_entries, cron_run_logs, cron_jobs, commitments, delivery_queue_entries, current_conversation_bindings, tui_last_sessions, task_runs, task_delivery_state, flow_runs, subagent_runs, migration_runs, and backup_runs.
- src/state/openclaw-agent-db.ts opens agents/<agentId>/agent/openclaw-agent.sqlite, registers the database in the global DB, and owns agent-local session, transcript, VFS, artifact, and cache tables. Shared runtime discovery now reads the generated-typed agent_databases registry instead of reimplementing that query at each call site.
- Subagent run recovery state now lives in typed shared subagent_runs rows with indexed child, requester, and controller session keys. The old subagents/runs.json file is doctor migration input only.
- Current conversation bindings now live in typed shared current_conversation_bindings rows keyed by normalized conversation id and indexed by target session. The old bindings/current-conversations.json file is doctor migration input only.
- TUI last-session restore pointers now live in typed shared tui_last_sessions rows keyed by the hashed TUI connection/session scope. The old TUI JSON file is doctor migration input only.
- Default TTS prefs now live in shared plugin-state SQLite rows keyed under the speech-core plugin. The old settings/tts.json file is doctor migration input only; runtime no longer reads or writes TTS prefs JSON files, and the legacy path resolver lives in the doctor migration module.
- Subagent run recovery and OpenRouter model capability cache runtime modules now keep SQLite snapshot readers/writers separate from doctor-only legacy JSON import helpers.
- src/agents/filesystem/virtual-agent-fs.sqlite.ts implements a SQLite VFS over the agent database vfs_entries table.
- src/agents/runtime-worker.entry.ts creates per-run SQLite VFS, tool artifact, run artifact, and scoped cache stores for workers.
- Workspace bootstrap completion markers now live in shared SQLite KV keyed by resolved workspace path instead of .openclaw/workspace-state.json; runtime no longer reads or rewrites the legacy workspace marker.
- Exec approvals now live in shared SQLite KV (exec.approvals/current). Doctor imports legacy ~/.openclaw/exec-approvals.json; runtime writes no longer create or rewrite that file.
- Device identity, device auth, and bootstrap runtime modules now keep their SQLite snapshot readers/writers separate from doctor-only legacy JSON import helpers.
- Device identity creation fails closed when legacy identity/device.json exists, when the SQLite identity row is invalid, or when the SQLite identity store cannot be opened. Doctor imports and removes that file first, so runtime startup cannot silently rotate pairing identity before migration.
- Web push, APNs, Voice Wake, and Voice Wake routing runtime modules now keep their SQLite snapshot readers/writers separate from doctor-only legacy JSON import helpers.
- Pairing state, plugin binding approvals, and cron job state now follow the same split: runtime modules expose SQLite-backed operations and neutral snapshot helpers, while doctor imports/removes the old JSON files through src/commands/doctor/legacy/* modules.
- Core pairing and cron runtime modules no longer export legacy JSON path builders. Doctor-owned legacy modules construct pending.json, paired.json, bootstrap.json, and cron/jobs.json source paths for import tests and migration only.
- src/commands/doctor-sqlite-state.ts already imports several legacy JSON state files, including node host config, into SQLite from doctor.
- src/commands/doctor/state-migrations.ts imports legacy sessions.json and *.jsonl transcripts into SQLite and removes successful sources.
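The opener settings listed above can be sketched with node:sqlite roughly as follows. This is an illustration under stated assumptions, not the contents of src/state/openclaw-state-db.ts: the function name is hypothetical, and the two-table schema (including its column names) is invented here to show a cascade from task_runs rather than taken from the generated schema.

```typescript
import { DatabaseSync } from "node:sqlite";

const SCHEMA_VERSION = 1;

// Illustrative schema only; the real one is generated from committed .sql files.
const SCHEMA_SQL = `
CREATE TABLE IF NOT EXISTS task_runs (
  id TEXT PRIMARY KEY,
  status TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS task_delivery_state (
  task_run_id TEXT PRIMARY KEY REFERENCES task_runs(id) ON DELETE CASCADE,
  delivered_at INTEGER
);
`;

export function openStateDatabase(path: string): DatabaseSync {
  const db = new DatabaseSync(path);
  // Pragmas named in the plan. In-memory databases ignore the WAL request.
  db.exec("PRAGMA journal_mode = WAL");
  db.exec("PRAGMA synchronous = NORMAL");
  db.exec("PRAGMA busy_timeout = 30000");
  db.exec("PRAGMA foreign_keys = ON");

  const row = db.prepare("PRAGMA user_version").get() as
    | { user_version?: unknown }
    | undefined;
  const version = Number(row?.user_version ?? 0);
  if (version === 0) {
    // Fresh database: create the current schema only, with no upgrade steps.
    db.exec(SCHEMA_SQL);
    db.exec(`PRAGMA user_version = ${SCHEMA_VERSION}`);
  } else if (version !== SCHEMA_VERSION) {
    db.close();
    throw new Error(`unsupported schema version ${version}`);
  }
  return db;
}
```

With foreign keys enabled, deleting a task_runs row also removes its task_delivery_state row, which is the ownership cascade the list describes.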
The remaining cleanup is mostly consolidation and deletion:
- Plugin state now uses the shared state/openclaw.sqlite database. The old branch-local plugin-state/state.sqlite sidecar importer is removed because that SQLite layout never shipped.
- Task and Task Flow runtime tables now live in the shared state/openclaw.sqlite database instead of tasks/runs.sqlite and tasks/flows/registry.sqlite; the old sidecar importers are removed for the same unshipped-layout reason.
- src/config/sessions/store.ts no longer needs storePath for inbound metadata, route updates, or updated-at reads. Command persistence, CLI session cleanup, subagent depth, auth overrides, and transcript session identity use agent/session row APIs. Writes are applied as SQLite row patches with optimistic conflict retry.
- Session target resolution now exposes per-agent database targets, not legacy sessions.json paths. Shared gateway, ACP metadata, doctor route repair, and openclaw sessions enumerate agent_databases plus configured agents.
- Gateway session routing now uses resolveGatewaySessionDatabaseTarget; the returned target carries databasePath and candidate SQLite row keys instead of a legacy session-store file path.
- Channel session runtime types now expose {agentId, sessionKey} for updated-at reads, inbound metadata, and last-route updates. The old saveSessionStore(storePath, store) compatibility type is gone.
- Plugin runtime, extension API, root library, and config/sessions barrel surfaces no longer export resolveStorePath; plugin code uses SQLite-backed session row helpers. The old resolveLegacySessionStorePath helper is gone; legacy sessions.json path construction is now local to migration and test fixtures.
- src/config/sessions/store-backend.sqlite.ts now stores canonical session entries in the per-agent database and has row-level read/upsert/delete patch support. Runtime upsert/patch/delete no longer scans for case variants or prunes legacy alias keys; doctor owns canonicalization. The standalone JSON import helper is gone, and migration merges upsert newer rows instead of replacing the whole session table.
- Transcript events, VFS rows, and tool artifact rows now write to the per-agent database. The unshipped global transcript-file mapping table is gone; doctor records legacy source paths in durable migration rows instead.
- Runtime transcript lookup no longer scans JSONL byte offsets or probes legacy transcript files. Gateway chat/media/history paths read transcript rows from SQLite; JSONL is now a legacy doctor input or in-memory export encoding, not a runtime state file.
- Runtime transcript store APIs resolve SQLite scope, not filesystem paths. The old resolve...ForPath helper and unused transcriptPath write options are gone from runtime callers.
- Runtime session resolution now uses {agentId, sessionId} and must not derive sqlite-transcript://<agent>/<session> strings for external boundaries. Legacy absolute JSONL paths are doctor migration inputs only.
- Native hook relay direct-bridge records now live in shared SQLite KV rows keyed by relay id. Runtime no longer writes a /tmp JSON registry for those short-lived bridge records.
- runEmbeddedPiAgent(...) no longer has a transcript-locator parameter. Prepared worker descriptors also omit transcript locators. Runtime session state and queued follow-up runs carry {agentId, sessionId} instead of derived transcript handles.
- Embedded compaction now takes SQLite scope from agentId and sessionId. Compaction hooks, context-engine calls, CLI delegation, and protocol replies must not receive derived sqlite-transcript://... handles. Export/debug code can materialize files from rows explicitly, but it does not feed those names back into runtime identity.
- Context-engine delegation no longer parses a transcript locator to recover agent identity. The prepared runtime context carries the resolved agentId into the built-in compaction bridge.
- Transcript rewrite and live tool-result truncation now read and persist transcript state by {agentId, sessionId} and do not derive temporary locators for transcript-update event payloads.
- The transcript-state helper surface no longer has locator-based readTranscriptState, replaceTranscriptStateEvents, or persistTranscriptStateMutation variants. Runtime callers must use the {agentId, sessionId} APIs. Doctor import reads legacy files by explicit file path and writes SQLite rows; it does not migrate locator strings.
- The runtime session-manager contract no longer exposes open(locator), forkFrom(locator), or setTranscriptLocator(...). Persisted session managers open and fork by {agentId, sessionId} only.
- Gateway transcript reader APIs are scope-first. They take {agentId, sessionId} and do not accept a positional transcript locator that could accidentally become runtime identity. Locator parsing is explicit and limited to boundary adapters.
- Transcript update events are also scope-first. emitSessionTranscriptUpdate no longer accepts a bare locator string, and listeners route by {agentId, sessionId} without parsing a handle.
- Gateway session-message broadcast resolves session keys from agent/session scope, not from a transcript locator. The old transcript-locator-to-session key resolver/cache is gone.
- Gateway session-history SSE filters live updates by agent/session scope. It no longer canonicalizes transcript locator candidates, realpaths, or file-shaped transcript identities to decide whether a stream should receive an update.
- Session lifecycle hooks no longer derive or expose transcript locators on session_end. Hook consumers get sessionId, sessionKey, next-session ids, and agent context; transcript files are not part of the lifecycle contract.
- Reset hooks no longer derive or expose transcript locators either. The before_reset payload carries recovered SQLite messages plus the reset reason, while session identity stays in hook context.
- Agent harness reset no longer accepts a transcript locator. Reset dispatch is scoped by sessionId/sessionKey plus reason.
- Agent extension session types no longer expose transcriptLocator; extensions should use session context and runtime APIs rather than reaching for a file-shaped transcript identity.
- Plugin compaction hooks no longer expose transcript locators. Hook context already carries session identity, and transcript reads must go through SQLite scope-aware APIs instead of file-shaped handles.
- before_agent_finalize hooks no longer expose transcriptPath, including native hook relay payloads. Finalization hooks use session context only.
- Gateway reset responses no longer synthesize a transcript locator on the returned entry. The reset creates SQLite transcript rows, returns the clean session entry, and leaves transcript access to scope-aware readers.
- Embedded run and compaction results no longer surface transcript locators for session accounting. Automatic compaction updates only the active sessionId, compaction counters, and token metadata.
- Embedded attempt results no longer return transcriptLocatorUsed, and context-engine compact() results no longer return transcript locators. Runtime retry loops only accept a successor sessionId.
- Delivery-mirror transcript append results no longer return transcript locators. Callers get the appended messageId; transcript update signals use SQLite scope.
- Parent-session fork helpers return only the forked sessionId. Subagent preparation passes the child agent/session scope to engines.
- CLI runner params and history reseeding no longer accept transcript locators. CLI history reads resolve the SQLite transcript scope from {agentId, sessionId} and session key context.
- CLI and embedded-runner test fixtures now seed and read SQLite transcript rows by session id instead of pretending active sessions are *.jsonl files or passing a sqlite-transcript://... string through runtime params.
- Session tool-result guard events emit from known session scope even when an in-memory manager has no derived locator. Its tests no longer fake active /tmp/*.jsonl transcript files.
- BTW and compaction-checkpoint helpers now read and fork transcript rows by SQLite scope. Checkpoint metadata now stores session ids and leaf/entry ids only; derived locators are no longer written into checkpoint payloads.
- Gateway transcript-key lookup uses SQLite transcript scope at protocol boundaries and no longer realpaths or stats transcript filenames.
- Automatic compaction transcript rotation writes successor transcript rows directly through the SQLite transcript store. Session rows keep only the successor session identity, not a durable JSONL path or persisted locator.
- Embedded context-engine compaction uses SQLite-named transcript rotation helpers. The rotation tests no longer construct JSONL successor paths or model active sessions as files.
- Managed outgoing image retention keys its transcript-message cache from SQLite transcript stats instead of filesystem stat calls.
- Runtime session locks and the standalone legacy .jsonl.lock doctor lane have been removed.
- The Microsoft Teams runtime barrel no longer re-exports the old plugin SDK file-lock helper; its durable state paths are SQLite-backed.
- Session age/count pruning and explicit session cleanup have been removed. Doctor owns legacy import; stale sessions are reset or deleted explicitly.
- Doctor no longer treats agents/<agent>/sessions/ as required runtime state. It only scans that directory when it already exists, as legacy import or orphan-cleanup input.
- Gateway sessions.resolve, session patch/reset/compact paths, subagent spawning, fast abort, ACP metadata, heartbeat-isolated sessions, and TUI patching no longer migrate or prune legacy session keys as a side effect of normal runtime work.
- CLI command session resolution now returns the owning agentId instead of a storePath, and it no longer copies legacy main-session rows during normal --to or --session-id resolution. Legacy main-row canonicalization belongs to doctor only.
- Runtime subagent depth resolution no longer reads sessions.json or JSON5 session stores. It reads SQLite session_entries by agent id, and legacy depth/session metadata can only enter through the doctor import path.
- Auth profile session overrides persist through direct {agentId, sessionKey} row upserts instead of lazy-loading a file-shaped session-store runtime.
- Auto-reply verbose gating and session update helpers now read/upsert SQLite session rows by session identity and no longer require a legacy store path before touching persisted row state.
- Command-run session metadata helpers now use entry-oriented names and module paths; the old session-store command helper surface has been removed.
- Bootstrap header seeding and manual compaction boundary hardening now mutate SQLite transcript rows directly. Runtime callers pass session identity, not writable .jsonl paths.
- Silent session-rotation replay copies recent user/assistant turns by {agentId, sessionId} from SQLite transcript rows. It no longer accepts source or target transcript locators.
- Fresh runtime session rows no longer store transcript locators. Callers use {agentId, sessionId} directly; export/debug commands can choose output file names when they materialize rows.
- Starting a new persisted transcript session now always opens SQLite rows by scope. The session manager no longer reuses a previous file-era transcript path or locator as the identity for the new session.
- Plugin runtime no longer exposes api.runtime.agent.session.resolveTranscriptLocatorPath; plugin code uses SQLite row helpers and scope values.
- The public session-store-runtime SDK surface no longer exports database close/reset test helpers; plugin tests import those through the testing SDK surface instead.
- Active-memory blocking subagent runs use SQLite transcript rows instead of creating temporary or persisted session.jsonl files under plugin state. The old transcriptDir option is removed.
- One-off slug generation and Crestodian planner runs use SQLite transcript rows instead of creating temporary session.jsonl files.
- llm-task helper runs and hidden commitment extraction also use SQLite transcript rows, so these model-only helper sessions no longer create temporary JSON/JSONL transcript files.
- TranscriptSessionManager create, list, fork, branch, and continue paths use SQLite transcript rows only. Doctor/import/debug code handles explicit legacy source files outside the runtime session manager.
- Parent transcript fork decisions and fork creation no longer accept storePath or sessionsDir; they use {agentId, sessionId} SQLite transcript scope instead of retained filesystem path metadata.
- Memory-host no longer exports no-op session-directory transcript classification helpers; transcript filtering now derives from SQLite row metadata during entry construction.
- Memory-host and QMD session-export tests use SQLite transcript scopes. Old agents/<agentId>/sessions/*.jsonl paths stay covered only where a test is intentionally proving doctor/import/export compatibility.
- QA-lab raw session inspection now uses sessions.list through the gateway instead of reading agents/qa/sessions/sessions.json; MSteams feedback appends directly to SQLite transcripts without fabricating a JSONL path.
- Shared inbound channel turns now carry {agentId, sessionKey} rather than a legacy storePath. LINE, WhatsApp, Slack, Discord, Telegram, Matrix, Signal, iMessage, BlueBubbles, Feishu, Google Chat, IRC, Nextcloud Talk, Zalo, Zalo Personal, QA Channel, Microsoft Teams, Mattermost, Synology Chat, Tlon, Twitch, and QQBot recording paths now read updated-at metadata and record inbound session rows through SQLite identity.
- Transcript locator persistence is removed from active session rows. resolveSessionTranscriptTarget returns agentId, sessionId, and optional topic metadata; doctor is the only code that imports legacy transcript file names.
- Embedded PI runs reject incoming transcript handles. They use the SQLite {agentId, sessionId} identity before worker launch and again before the attempt touches transcript state. A stale /tmp/*.jsonl input cannot select a runtime write target.
- Cache trace, Anthropic payload, and diagnostics timeline records now write to SQLite diagnostic KV rows only. The old diagnostics.cacheTrace.filePath, OPENCLAW_CACHE_TRACE_FILE, OPENCLAW_ANTHROPIC_PAYLOAD_LOG_FILE, and OPENCLAW_DIAGNOSTICS_TIMELINE_PATH JSONL override paths are removed.
- Cron persistence now reconciles SQLite cron_jobs rows instead of deleting/reinserting the whole job table on each save. Plugin target writebacks update matching cron rows directly and keep runtime cron state in the same state-database transaction.
- Cron runtime callers now use a stable SQLite cron store key. Legacy cron.store paths are doctor import inputs only; production gateway, task maintenance, status, run-log, and Telegram target writeback paths use resolveCronStoreKey and no longer path-normalize the key. Cron status now reports storeKey rather than the old file-shaped storePath field.
- ACP spawn no longer resolves or persists transcript JSONL file paths. Spawn and thread-bind setup persist the SQLite session row directly and keep the session id as the retained transcript identity.
- ACP session metadata APIs now read/list/upsert SQLite rows by agentId and no longer expose storePath as part of the ACP session entry contract.
- Session usage accounting and gateway usage aggregation now resolve transcripts by {agentId, sessionId} only. The cost/usage cache and discovered-session summaries no longer synthesize or return transcript locator strings.
- Gateway chat append, abort-partial persistence, /sessions.send, and webchat media transcript writes append directly through SQLite transcript scope. The gateway transcript-injection helper no longer accepts a transcriptLocator parameter.
- SQLite transcript discovery now lists transcript scopes and stats only: {agentId, sessionId, updatedAt, eventCount}. The dead listSqliteSessionTranscriptLocators compatibility helper and per-row locator field are gone.
- Transcript repair runtime now exposes only repairTranscriptSessionStateIfNeeded({agentId, sessionId}). The old locator-based repair helper is deleted; doctor/debug code reads explicit source file paths and never migrates locator strings.
- ACP replay ledger runtime now stores per-session replay rows in the shared SQLite state database instead of acp/event-ledger.json; doctor imports and removes the legacy file.
- Gateway transcript reader helpers now live in src/gateway/session-transcript-readers.ts instead of the old session-utils.fs module name. The fallback retry history check is named for SQLite transcript content instead of the old file-helper surface.
- Gateway injected-chat and compaction helpers now pass SQLite transcript scope through internal helper APIs instead of naming values transcript paths or source files.
- Bootstrap continuation detection now checks SQLite transcript rows through hasCompletedBootstrapTranscriptTurn; it no longer exposes a file-shaped helper name.
- Embedded-runner tests now use SQLite transcript identity, and opening a new transcript manager always requires an explicit sessionId.
- Memory indexing helpers now use SQLite transcript terminology end to end: host exports list/build session transcript entries, targeted sync queues sessionTranscripts, and QMD/builtin indexers no longer expose file-shaped helper names.
- The generic plugin SDK persistent-dedupe helper no longer exposes file-shaped options. Callers provide SQLite scope keys and durable dedupe rows live in shared plugin state.
- Microsoft Teams SSO and delegated OAuth tokens moved from locked JSON files to SQLite plugin state. Doctor imports msteams-sso-tokens.json and msteams-delegated.json, rebuilds canonical SSO token keys from payloads, and removes the source files.
- Matrix sync cache state moved from bot-storage.json to SQLite plugin state. Doctor imports legacy raw or wrapped sync payloads and removes the source file.
- Matrix legacy crypto migration status moved from legacy-crypto-migration.json to SQLite plugin state. Doctor imports the old status file; Matrix SDK IndexedDB snapshots moved from crypto-idb-snapshot.json to SQLite plugin blobs. Recovery keys remain file-backed because they are Matrix user secret material rather than OpenClaw runtime cache rows.
- Memory Wiki activity logs now use SQLite plugin state instead of .openclaw-wiki/log.jsonl. The Memory Wiki migration provider imports old JSONL logs; wiki markdown and user vault content stay file-backed as workspace content.
- Memory Wiki no longer creates .openclaw-wiki/state.json or the unused .openclaw-wiki/locks directory. The migration provider removes those retired plugin metadata files if an older vault still has them.
- Crestodian audit entries now use core SQLite plugin state instead of audit/crestodian.jsonl. Doctor imports the legacy JSONL audit log and removes it after successful import.
- Config write/observe audit entries now use core SQLite plugin state instead of logs/config-audit.jsonl. Doctor imports the legacy JSONL audit log and removes it after successful import.
- Crestodian rescue pending approvals now use core SQLite plugin state instead of crestodian/rescue-pending/*.json. Doctor imports legacy pending approval files and removes them after successful import.
- Phone Control temporary arm state now uses SQLite plugin state instead of plugins/phone-control/armed.json. Doctor imports the legacy armed-state file into the phone-control/arm-state namespace and removes the file.
- Doctor no longer repairs JSONL transcripts in place or creates backup JSONL files. It imports the active branch into SQLite and removes the legacy source.
- Session-memory hook transcript lookup uses {agentId, sessionId} scope-only SQLite reads. Its helper no longer accepts or derives transcript locators, legacy file reads, or file-rewrite options.
- Codex app-server conversation bindings now key SQLite plugin state by OpenClaw session key or explicit {agentId, sessionId} scope. They must not preserve transcript-path fallback bindings.
- Codex app-server mirrored-history reads use the SQLite transcript scope only; they must not recover identity from transcript file paths.
- Role-ordering and compaction reset paths no longer unlink old transcript files; reset only rotates the SQLite session row and transcript identity.
- Gateway reset and checkpoint responses return clean session rows plus session ids. They no longer synthesize SQLite transcript locators for clients.
- Memory-core dreaming no longer prunes session rows by probing for missing JSONL files. Subagent cleanup goes through the session runtime API instead of filesystem existence checks. Its transcript-ingestion tests seed SQLite rows directly instead of creating agents/<id>/sessions fixtures or locator placeholders.
- Gateway doctor memory status reads short-term recall and phase-signal counts from SQLite plugin-state rows instead of memory/.dreams/*.json; CLI and doctor output now label that storage as a SQLite store, not a path.
- Sandbox container/browser registries now use the shared sandbox_registry_entries SQLite table. Doctor imports legacy monolithic and sharded JSON registry files and removes successful sources.
- Commitments now use a typed shared commitments table instead of a whole-store JSON blob. Doctor imports legacy commitments.json and removes it after a successful import.
- Cron job definitions, schedule state, and run history no longer have runtime JSON writers or readers. Runtime uses cron_jobs rows with inline runtime state plus cron_run_logs; doctor imports legacy jobs.json, jobs-state.json, and runs/*.jsonl files and removes the imported sources. Plugin target writebacks update matching cron_jobs rows instead of loading and replacing the whole cron store.
- Discord model-picker preferences, command-deploy hashes, and thread bindings now use shared SQLite plugin state. Their legacy JSON import plans live in the Discord plugin setup/doctor migration surface, not in core migration code.
- BlueBubbles catchup cursors and inbound dedupe markers now use shared SQLite plugin state. Their legacy JSON import plans live in the BlueBubbles plugin setup/doctor migration surface, not in core migration code.
- Telegram update offsets, sticker cache rows, sent-message cache rows, topic-name cache rows, and thread bindings now use shared SQLite plugin state. Their legacy JSON import plans live in the Telegram plugin setup/doctor migration surface, not in core migration code.
- iMessage catchup cursors, reply short-id mappings, and sent-echo dedupe rows now use shared SQLite plugin state. The old `imessage/catchup/*.json`, `imessage/reply-cache.jsonl`, and `imessage/sent-echoes.jsonl` files are doctor inputs only.
- Feishu message dedupe rows now use shared SQLite plugin state instead of `feishu/dedup/*.json` files. Its legacy JSON import plan lives in the Feishu plugin setup/doctor migration surface, not in core migration code.
- Microsoft Teams conversations, polls, pending upload buffers, and feedback learnings now use shared SQLite plugin state/blob tables. The pending upload path uses `plugin_blob_entries` so media buffers are stored as SQLite BLOBs instead of base64 JSON. The runtime helper names now use SQLite/state naming rather than `*-fs` file-store naming, and the old `storePath` shim is gone from these stores. Its legacy JSON import plan lives in the Microsoft Teams plugin setup/doctor migration surface.
- Zalo hosted outbound media now uses shared SQLite `plugin_blob_entries` instead of `openclaw-zalo-outbound-media` JSON/bin temp sidecars.
- Diffs viewer HTML and metadata now use shared SQLite `plugin_blob_entries` instead of `meta.json`/`viewer.html` temp files. Rendered PNG/PDF outputs stay temp materializations because channel delivery still needs a file path.
- File Transfer audit decisions now use shared SQLite `plugin_state_entries` instead of the unbounded `audit/file-transfer.jsonl` runtime log. Doctor imports the legacy JSONL audit file into plugin state and removes the source after a clean import.
- ACPX process leases and gateway instance identity now use shared SQLite plugin state. Doctor imports the legacy `gateway-instance-id` file into plugin state and removes the source.
- Gateway media attachments now use the shared `media_blobs` SQLite table as the canonical byte store. Local paths returned to channel and sandbox compatibility surfaces are temp materializations of the database row, not the durable media store.
- Cache-trace diagnostics, Anthropic payload diagnostics, raw model stream diagnostics, and diagnostics timeline events now write SQLite diagnostic rows instead of `logs/*.jsonl` files. Runtime path override flags and env vars have been removed; export/debug commands can materialize files explicitly from database rows.
- Gateway singleton locks now use shared SQLite KV instead of temp-dir lock files. Done.
- Gateway restart sentinel state now uses shared SQLite KV instead of `restart-sentinel.json`; runtime code clears the SQLite row directly and no longer carries file cleanup plumbing.
- Gateway restart intent and supervisor handoff state now use shared SQLite KV instead of `gateway-restart-intent.json` and `gateway-supervisor-restart-handoff.json` sidecars.
- Gateway singleton coordination now uses SQLite KV rows under `gateway_locks` instead of writing `gateway.<hash>.lock` files. The lock still records pid, config path, process start time, and stale-owner metadata, but SQLite owns the atomic acquire/release boundary.
- Main-session restart recovery now discovers candidate agents through the SQLite `agent_databases` registry instead of scanning `agents/*/sessions` directories.
- Gemini session-corruption recovery now deletes only the SQLite session row; it no longer needs a legacy `storePath` gate or tries to unlink a derived transcript JSONL path.
- Path override handling now treats literal `undefined`/`null` environment values as unset, preventing accidental repo-root `undefined/state/*.sqlite` databases during tests or shell handoffs.
- Config health fingerprints now use shared SQLite KV instead of `logs/config-health.json`, keeping the normal config file as the only non-credential configuration document.
- Voice Wake trigger and routing settings now use shared SQLite KV instead of `settings/voicewake.json` and `settings/voicewake-routing.json`; doctor imports the legacy JSON files and removes them after a successful migration.
- Plugin conversation binding approvals now use shared SQLite KV instead of `plugin-binding-approvals.json`; the legacy file is a doctor migration input.
- Generic current-conversation bindings now store typed `current_conversation_bindings` rows instead of rewriting `bindings/current-conversations.json`; doctor imports the legacy JSON file and removes it after a successful migration.
- Memory Wiki imported-source sync ledgers now store one SQLite plugin-state row per vault/source key instead of rewriting `.openclaw-wiki/source-sync.json`; the migration provider imports and removes the legacy JSON ledger.
- Memory Wiki ChatGPT import-run records now store one SQLite plugin-state row per vault/run id instead of writing `.openclaw-wiki/import-runs/*.json`. Rollback snapshots remain explicit vault files until import-run snapshot archival is moved into blob storage.
- Memory Wiki compiled digests now store SQLite plugin blob rows instead of writing `.openclaw-wiki/cache/agent-digest.json` and `.openclaw-wiki/cache/claims.jsonl`. The migration provider imports old cache files and removes the cache directory when it becomes empty.
- ClawHub skill install tracking now stores one SQLite plugin-state row per workspace/skill instead of writing or reading `.clawhub/lock.json` and `.clawhub/origin.json` sidecars at runtime. Doctor/migrate imports the legacy sidecars from configured agent workspaces and removes them after a clean import.
- The installed plugin index now reads and writes shared SQLite KV `installed_plugin_index/current` instead of `plugins/installs.json`; the legacy JSON file is only a doctor migration input and is removed after import.
- The legacy `plugins/installs.json` path helper now lives in doctor legacy code. Runtime plugin-index modules expose only SQLite-backed persistence options, not a JSON file path.
- Matrix sync cache, storage metadata, thread bindings, inbound dedupe markers, startup verification cooldown state, and SDK IndexedDB crypto snapshots now use shared SQLite plugin state/blob tables. Their legacy JSON import plan lives in the Matrix plugin setup/doctor migration surface. Matrix recovery keys remain explicit Matrix client files until a separate credential/secret export design exists.
- Nostr bus cursors and profile publish state now use shared SQLite plugin state. Their legacy JSON import plan lives in the Nostr plugin setup/doctor migration surface.
- Active Memory session toggles now use shared SQLite plugin state instead of `session-toggles.json`; toggling memory back on deletes the row instead of rewriting a JSON object.
- Skill Workshop proposals and review counters now use shared SQLite plugin state instead of per-workspace `skill-workshop/<workspace>.json` stores. Each proposal is a separate row under `skill-workshop/proposals`, and the review counter is a separate row under `skill-workshop/reviews`.
- Skill Workshop reviewer subagent runs now use the runtime session transcript resolver instead of creating `skill-workshop/<sessionId>.json` sidecar session paths.
- ACPX process leases now use shared SQLite plugin state under `acpx/process-leases` instead of a whole-file `process-leases.json` registry. Each lease is stored as its own row, preserving startup stale-process reaping without a runtime JSON rewrite path.
- Subagent run registry persistence uses typed shared `subagent_runs` rows. The old `subagents/runs.json` path is now only a doctor migration input, and runtime helper names no longer describe the state layer as disk-backed.
- Backup stages the state directory before archiving, copies non-database files, snapshots `*.sqlite` databases with `VACUUM INTO`, omits live WAL/SHM sidecars, records snapshot metadata in the archive manifest, and records completed backup runs in SQLite with the archive manifest.
- Plain setup and onboarding workspace preparation no longer create `agents/<agentId>/sessions/` directories. They create config/workspace only; SQLite session rows and transcript rows are created on demand in the per-agent database.
- Security permission repair now targets the global and per-agent SQLite databases plus WAL/SHM sidecars instead of `sessions.json` and transcript JSONL files.
- `openclaw reset --scope config+creds+sessions` removes per-agent `openclaw-agent.sqlite` databases plus WAL/SHM sidecars, not only legacy `sessions/` directories.
- Gateway aggregate session helpers now use entry-oriented names: `loadCombinedSessionEntriesForGateway` returns `{ databasePath, entries }`. The old combined-store naming has been removed from runtime callers.
- Docker MCP channel seeding now writes the main session row and transcript events into the per-agent SQLite database instead of creating `sessions.json` and a JSONL transcript.
- The bundled session-memory hook now resolves previous-session context from SQLite by `{agentId, sessionId}` and only treats retained transcript paths as legacy metadata. It no longer scans or synthesizes `workspace/sessions` directories.
- `migration_runs` records legacy-state migration executions with status, timestamps, and JSON reports.
- `migration_sources` records each imported legacy file source with hash, size, record count, target table, run id, status, and source-removal state.
- `backup_runs` records backup archive paths, status, and JSON manifests.
- `check:database-first-legacy-stores` fails new runtime source that pairs legacy store names with write-style filesystem APIs. Tests and migration, doctor, import, and explicit export code remain allowed. The guard now also covers runtime `cache/*.json` stores, generic `thread-bindings.json` sidecars, cron state/run-log JSON, config health JSON, restart and lock sidecars, Voice Wake settings, plugin binding approvals, installed plugin index JSON, File Transfer audit JSONL, and Memory Wiki activity logs.
Target Schema Shape
Keep schemas explicit. Use typed tables for hot paths and kv only for low-risk
configuration-shaped state.
Global database:
kv(scope, key, value_json, updated_at)
agents(agent_id, config_fingerprint, created_at, updated_at, agent_db_path)
agent_databases(agent_id, path, schema_version, last_seen_at, size_bytes)
task_runs(...)
task_delivery_state(...)
flow_runs(...)
subagent_runs(run_id, child_session_key, requester_session_key, controller_session_key, created_at, ended_at, cleanup_handled, payload_json)
current_conversation_bindings(binding_key, binding_id, channel, account_id, conversation_id, target_session_key, status, bound_at, expires_at, record_json)
tui_last_sessions(scope_key, session_key, updated_at)
plugin_state_entries(plugin_id, namespace, entry_key, value_json, created_at, expires_at)
plugin_blob_entries(plugin_id, namespace, entry_key, metadata_json, blob, created_at, expires_at)
media_blobs(subdir, id, content_type, size_bytes, blob, created_at, updated_at)
sandbox_registry_entries(registry_kind, container_name, entry_json, updated_at)
cron_run_logs(...)
commitments(id, agent_id, session_key, channel, status, due_earliest_ms, due_latest_ms, updated_at_ms, record_json)
migration_runs(id, started_at, finished_at, status, report_json)
migration_sources(source_key, migration_kind, source_path, target_table, source_sha256, source_size_bytes, source_record_count, last_run_id, status, imported_at, removed_source, report_json)
backup_runs(id, created_at, archive_path, status, manifest_json)
Agent database:
kv(scope, key, value_json, updated_at)
session_entries(session_key, entry_json, updated_at)
transcript_events(session_id, seq, event_json, created_at)
transcript_event_identities(session_id, event_id, seq, event_type, has_parent, parent_id, message_idempotency_key, created_at)
transcript_snapshots(session_id, snapshot_id, reason, event_count, created_at, metadata_json)
vfs_entries(namespace, path, kind, content_blob, metadata_json, updated_at)
tool_artifacts(run_id, artifact_id, kind, metadata_json, blob, created_at)
run_artifacts(run_id, path, kind, metadata_json, blob, created_at)
cache_entries(scope, key, value_json, blob, expires_at, updated_at)
Future search can add FTS tables without changing the canonical event tables:
transcript_events_fts(session_id, seq, text)
vfs_entries_fts(namespace, path, text)
Large values should use blob columns, not JSON string encoding. Keep
value_json for small structured data that must remain inspectable with plain
SQLite tooling.
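
A minimal sketch of the JSON-versus-blob split, using Python's standard `sqlite3` module for illustration. The column names come from the listing above; the column types, primary keys, helper names (`put_state`, `put_blob`), and the `msteams` plugin id and entry keys are assumptions for this example, not the actual schema or API.

```python
import json
import sqlite3
import time

# Column names follow the schema listing; types and primary keys are assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS plugin_state_entries (
    plugin_id  TEXT NOT NULL,
    namespace  TEXT NOT NULL,
    entry_key  TEXT NOT NULL,
    value_json TEXT NOT NULL,
    created_at INTEGER NOT NULL,
    expires_at INTEGER,
    PRIMARY KEY (plugin_id, namespace, entry_key)
);
CREATE TABLE IF NOT EXISTS plugin_blob_entries (
    plugin_id     TEXT NOT NULL,
    namespace     TEXT NOT NULL,
    entry_key     TEXT NOT NULL,
    metadata_json TEXT NOT NULL,
    blob          BLOB NOT NULL,
    created_at    INTEGER NOT NULL,
    expires_at    INTEGER,
    PRIMARY KEY (plugin_id, namespace, entry_key)
);
"""


def put_state(db, plugin_id, namespace, key, value):
    """Store small structured data as inspectable JSON text (hypothetical helper)."""
    db.execute(
        "INSERT OR REPLACE INTO plugin_state_entries VALUES (?, ?, ?, ?, ?, NULL)",
        (plugin_id, namespace, key, json.dumps(value), int(time.time() * 1000)),
    )


def put_blob(db, plugin_id, namespace, key, metadata, data):
    """Store large bytes in a BLOB column, not base64 inside JSON (hypothetical helper)."""
    db.execute(
        "INSERT OR REPLACE INTO plugin_blob_entries VALUES (?, ?, ?, ?, ?, ?, NULL)",
        (plugin_id, namespace, key, json.dumps(metadata), data, int(time.time() * 1000)),
    )


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
put_state(db, "phone-control", "arm-state", "current", {"armed": True})
payload = b"\x89PNG" + bytes(1024)  # stand-in for a media buffer
put_blob(db, "msteams", "pending-uploads", "upload-1", {"contentType": "image/png"}, payload)
db.commit()
```

With this split, `SELECT value_json FROM plugin_state_entries` stays readable in the `sqlite3` shell, while the media bytes are stored at their raw size.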
Doctor Migration Shape
Doctor should call one explicit migration step that is reportable and safe to rerun:
openclaw doctor --fix
openclaw doctor --fix invokes the state migration implementation after
ordinary config preflight and creates a verified backup before import. Runtime
startup and openclaw migrate must not import legacy OpenClaw state files.
Migration properties:
- One migration pass discovers all legacy file sources and produces a plan before mutating anything.
- Doctor creates a verified pre-migration backup archive before importing legacy files.
- Imports are idempotent and keyed by source path, mtime, size, hash, and target table.
- Successful source files are removed or archived after the target database has committed.
- Failed imports leave the source untouched and record a warning in `migration_runs`.
- Runtime code reads SQLite only after the migration exists.
- No downgrade/export-to-runtime-files path is required.
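
The idempotency and removal ordering in the properties above can be sketched as follows. This is an illustration in Python, not the doctor implementation: `import_legacy_json`, the simplified table definitions, the `voicewake` scope, and the file name are assumptions, and recording warnings in `migration_runs` is omitted.

```python
import hashlib
import json
import os
import sqlite3
import tempfile

# Simplified subsets of the kv and migration_sources tables; types are assumed.
SCHEMA = """
CREATE TABLE kv (
    scope TEXT NOT NULL, key TEXT NOT NULL,
    value_json TEXT NOT NULL, updated_at INTEGER,
    PRIMARY KEY (scope, key)
);
CREATE TABLE migration_sources (
    source_key TEXT PRIMARY KEY, source_path TEXT NOT NULL,
    target_table TEXT NOT NULL, source_sha256 TEXT NOT NULL,
    source_size_bytes INTEGER NOT NULL, source_record_count INTEGER NOT NULL,
    status TEXT NOT NULL
);
"""


def import_legacy_json(db, source_path, scope, target_table="kv"):
    """Import one legacy JSON object file into kv rows; safe to rerun."""
    if not os.path.exists(source_path):
        return "missing"
    stat = os.stat(source_path)
    with open(source_path, "rb") as fh:
        raw = fh.read()
    digest = hashlib.sha256(raw).hexdigest()
    # Idempotency key: source path, mtime, size, hash, and target table.
    source_key = "|".join(
        [source_path, str(stat.st_mtime_ns), str(stat.st_size), digest, target_table]
    )
    already = db.execute(
        "SELECT 1 FROM migration_sources WHERE source_key = ? AND status = 'imported'",
        (source_key,),
    ).fetchone()
    if already is None:
        try:
            records = json.loads(raw)
        except ValueError:
            return "failed"  # the source stays on disk for a later retry
        with db:  # rows and ledger entry commit together or not at all
            for key, value in records.items():
                db.execute(
                    "INSERT OR REPLACE INTO kv (scope, key, value_json) VALUES (?, ?, ?)",
                    (scope, key, json.dumps(value)),
                )
            db.execute(
                "INSERT OR REPLACE INTO migration_sources VALUES (?, ?, ?, ?, ?, ?, 'imported')",
                (source_key, source_path, target_table, digest, stat.st_size, len(records)),
            )
    os.remove(source_path)  # only after the target database has committed
    return "imported" if already is None else "already-imported"


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
legacy = os.path.join(tempfile.mkdtemp(), "voicewake.json")
with open(legacy, "w") as fh:
    json.dump({"trigger": "hey", "enabled": True}, fh)
first = import_legacy_json(db, legacy, scope="voicewake")
second = import_legacy_json(db, legacy, scope="voicewake")
```

The `already-imported` branch covers a crash between commit and removal: the rerun finds the ledger row, skips the insert, and only removes the leftover source.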
Migration Inventory
Move these into the global database:
- Task registry runtime writes now use the shared database; the unshipped `tasks/runs.sqlite` sidecar importer is deleted.
- Task Flow runtime writes now use the shared database; the unshipped `tasks/flows/registry.sqlite` sidecar importer is deleted.
- Plugin state runtime writes now use the shared database; the unshipped `plugin-state/state.sqlite` sidecar importer is deleted.
- Builtin memory search no longer defaults to `memory/<agentId>.sqlite`; its index tables live in the owning agent database, and the explicit `memorySearch.store.path` sidecar opt-in has been retired to doctor config migration.
- Sandbox container/browser registries from monolithic and sharded JSON. Runtime writes now use the shared database; legacy JSON import remains.
- Cron job definitions, schedule state, and run history now use shared SQLite; doctor imports/removes legacy `jobs.json`, `jobs-state.json`, and `cron/runs/*.jsonl` files.
- Device identity/auth/bootstrap, pairing, push, update check, commitments, OpenRouter model cache, installed plugin index, and app-server bindings
- Device-pair notification subscribers and delivered-request markers now use the shared SQLite plugin-state table instead of `device-pair-notify.json`.
- Voice-call call records now use the shared SQLite plugin-state table under the `voice-call/calls` namespace instead of `calls.jsonl`; the plugin CLI tails and summarizes SQLite-backed call history.
- QQBot gateway sessions, known-user records, and ref-index quote cache now use SQLite plugin state under `qqbot` namespaces (`sessions`, `known-users`, `ref-index`) instead of `session-*.json`, `known-users.json`, and `ref-index.jsonl`; the QQBot doctor/setup migration imports and removes the legacy files.
- Discord model-picker preferences, command-deploy hashes, and thread bindings now use SQLite plugin state under `discord` namespaces (`model-picker-preferences`, `command-deploy-hashes`, `thread-bindings`) instead of `model-picker-preferences.json`, `command-deploy-cache.json`, and `thread-bindings.json`; the Discord doctor/setup migration imports and removes the legacy files.
- BlueBubbles catchup cursors and inbound dedupe markers now use SQLite plugin state under `bluebubbles` namespaces (`catchup-cursors`, `inbound-dedupe`) instead of `bluebubbles/catchup/*.json` and `bluebubbles/inbound-dedupe/*.json`; the BlueBubbles doctor/setup migration imports and removes the legacy files.
- Telegram update offsets, sticker cache entries, reply-chain message cache entries, sent-message cache entries, topic-name cache entries, and thread bindings now use SQLite plugin state under `telegram` namespaces (`update-offsets`, `sticker-cache`, `message-cache`, `sent-messages`, `topic-names`, `thread-bindings`) instead of `update-offset-*.json`, `sticker-cache.json`, `*.telegram-messages.json`, `*.telegram-sent-messages.json`, `*.telegram-topic-names.json`, and `thread-bindings-*.json`; the Telegram doctor/setup migration imports and removes the legacy files.
- iMessage catchup cursors, reply short-id mappings, and sent-echo dedupe rows now use SQLite plugin state under `imessage` namespaces (`catchup-cursors`, `reply-cache`, `sent-echoes`) instead of `imessage/catchup/*.json`, `imessage/reply-cache.jsonl`, and `imessage/sent-echoes.jsonl`; the iMessage doctor/setup migration imports and removes the legacy files.
- Microsoft Teams conversations, polls, delegated tokens, pending uploads, and feedback learnings now use SQLite plugin state/blob namespaces (`conversations`, `polls`, `delegated-tokens`, `pending-uploads`, `feedback-learnings`) instead of `msteams-conversations.json`, `msteams-polls.json`, `msteams-delegated.json`, `msteams-pending-uploads.json`, and `*.learnings.json`; the Microsoft Teams doctor/setup migration imports and removes the legacy files.
- Matrix sync cache, storage metadata, thread bindings, inbound dedupe markers, startup verification cooldown state, and SDK IndexedDB crypto snapshots now use SQLite plugin state/blob namespaces under `matrix` (`sync-store`, `storage-meta`, `thread-bindings`, `inbound-dedupe`, `startup-verification`, `idb-snapshots`) instead of `bot-storage.json`, `storage-meta.json`, `thread-bindings.json`, `inbound-dedupe.json`, `startup-verification.json`, and `crypto-idb-snapshot.json`; the Matrix doctor/setup migration imports and removes those legacy files from account-scoped Matrix storage roots.
- Nostr bus cursors and profile publish state now use SQLite plugin state under `nostr` namespaces (`bus-state`, `profile-state`) instead of `bus-state-*.json` and `profile-state-*.json`; the Nostr doctor/setup migration imports and removes the legacy files.
- Active Memory session toggles now use SQLite plugin state under `active-memory/session-toggles` instead of `session-toggles.json`.
- Skill Workshop proposal queues and review counters now use SQLite plugin state under `skill-workshop/proposals` and `skill-workshop/reviews` instead of per-workspace `skill-workshop/<workspace>.json` files.
- Outbound delivery and session delivery queues now share the global SQLite `delivery_queue_entries` table under separate queue names (`outbound-delivery`, `session-delivery`) instead of durable `delivery-queue/*.json`, `delivery-queue/failed/*.json`, and `session-delivery-queue/*.json` files. The doctor legacy-state step imports pending and failed rows, removes stale delivered markers, and deletes the old JSON files after import.
- ACPX process leases now use SQLite plugin state under `acpx/process-leases` instead of `process-leases.json`.
- Backup and migration run metadata
Move these into agent databases:
- Agent session entries. Done for runtime writes.
- Agent transcript events. Done for runtime writes.
- Compaction checkpoints and transcript snapshots. Done for runtime writes: checkpoint transcript copies are SQLite transcript rows and checkpoint metadata is recorded in `transcript_snapshots`. Gateway checkpoint helpers now name these values as transcript snapshots rather than source files.
- Agent VFS scratch/workspace namespaces. Done for runtime VFS writes.
- Tool artifacts. Done for runtime writes.
- Run artifacts. Done for worker runtime writes through the per-agent `run_artifacts` table.
- Agent-local runtime caches. Done for worker runtime scoped cache writes through the per-agent `cache_entries` table. Gateway-wide model caches stay in the global database unless they become agent-specific.
- ACP parent stream logs. Done for runtime writes.
- ACP replay ledger sessions. Done for runtime writes via `acp_replay_sessions` and `acp_replay_events`; legacy `acp/event-ledger.json` remains only as doctor input.
- Trajectory sidecars when they are not explicit export files. Done for runtime writes: trajectory capture writes agent-database `trajectory_runtime_events` rows and mirrors run-scoped artifacts into SQLite. Legacy sidecars remain readable only as export/migration compatibility input. Runtime trajectory capture exposes SQLite scope; JSONL path helpers are isolated to legacy export/debug support and are not re-exported from the runtime module. Embedded-runner trajectory metadata records `{agentId, sessionId, sessionKey}` identity instead of persisting a transcript locator.
Keep these file-backed for now:
- `openclaw.json`
- `auth-profiles.json`
- provider or CLI credential files
- plugin/package manifests
- user workspaces and Git repositories when disk mode is selected
- logs intended for operator tailing, unless a specific log surface is moved
Migration Plan
Phase 0: Freeze The Boundary
Make the durable-state boundary explicit before moving more rows:
- Add a `migration_runs` table to the global database. Done for legacy-state migration execution reports.
- Add a single doctor-owned state migration service for file-to-database import. Done: `openclaw doctor --fix` uses the legacy-state migration implementation.
- Make `plan` read-only and make `apply` create a backup, import, verify, and then delete or quarantine old files. Done: doctor creates a verified pre-migration backup, passes the backup path into `migration_runs`, and reuses the importer/removal paths.
- Add static bans so new runtime code cannot write legacy state files while migration code and tests can still seed/read them. Done for the currently migrated legacy stores.
Phase 1: Finish The Global Control Plane
Keep shared coordination state in state/openclaw.sqlite:
- Agents and agent database registry
- Task and Task Flow ledgers
- Plugin state
- Sandbox container/browser registry
- Cron/scheduler run history
- Pairing, device, push, update-check, TUI, OpenRouter/model caches, and other small gateway-scoped runtime state
- Backup and migration metadata
- Gateway media attachment bytes. Done for runtime writes; direct file paths are temp materializations for compatibility with channel senders and sandbox staging. Doctor imports legacy media files into `media_blobs` and removes the source files after successful row writes.
- Debug proxy capture sessions, events, and payload blobs. Done for the default capture path: explicit `OPENCLAW_DEBUG_PROXY_DB_PATH` remains a one-off diagnostics escape hatch, but normal captures live in the shared state DB and use the same WAL/busy-timeout settings.
This phase also deletes duplicate sidecar openers, permission helpers, WAL setup, filesystem pruning, and compatibility writers from those subsystems.
Phase 2: Introduce Per-Agent Databases
Create one database per agent and register it from the global DB:
~/.openclaw/state/openclaw.sqlite
~/.openclaw/agents/<agentId>/agent/openclaw-agent.sqlite
The global agents or agent_databases row stores the path, schema version,
last-seen timestamp, and basic size/integrity metadata. Runtime code asks the
registry for the agent DB instead of deriving file paths directly.
The agent DB owns:
- `session_entries`
- `transcript_events`
- transcript snapshots and compaction checkpoints. Done for runtime writes.
- `vfs_entries`
- `tool_artifacts` and run artifacts
- agent-local runtime/cache rows. Done for worker scoped caches.
- ACP parent stream events
- trajectory runtime events when they are not explicit export artifacts
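
A sketch of registry-driven lookup, assuming the `agent_databases` columns from the target schema and the path layout shown in this phase. The function names (`open_global_database`, `open_agent_database`), the column types, and the `main` agent id are hypothetical. The path is derived once, when the agent is first registered; later calls read it back from the registry.

```python
import os
import sqlite3
import tempfile
import time


def open_global_database(state_root):
    """Open the global database and ensure the registry table exists."""
    path = os.path.join(state_root, "state", "openclaw.sqlite")
    os.makedirs(os.path.dirname(path), exist_ok=True)
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS agent_databases ("
        "agent_id TEXT PRIMARY KEY, path TEXT NOT NULL, "
        "schema_version INTEGER NOT NULL, last_seen_at INTEGER NOT NULL, "
        "size_bytes INTEGER NOT NULL)"
    )
    return db


def open_agent_database(global_db, state_root, agent_id):
    """Resolve the agent DB path through the registry, creating it on demand."""
    row = global_db.execute(
        "SELECT path FROM agent_databases WHERE agent_id = ?", (agent_id,)
    ).fetchone()
    if row is not None:
        path = row[0]  # registry is the source of truth once registered
    else:
        path = os.path.join(state_root, "agents", agent_id, "agent", "openclaw-agent.sqlite")
        os.makedirs(os.path.dirname(path), exist_ok=True)
    agent_db = sqlite3.connect(path)
    agent_db.execute("PRAGMA user_version = 1")
    size = os.path.getsize(path) if os.path.exists(path) else 0
    with global_db:  # refresh last-seen timestamp and size metadata
        global_db.execute(
            "INSERT OR REPLACE INTO agent_databases VALUES (?, ?, 1, ?, ?)",
            (agent_id, path, int(time.time() * 1000), size),
        )
    return agent_db, path


root = tempfile.mkdtemp()
global_db = open_global_database(root)
agent_db, first_path = open_agent_database(global_db, root, "main")
_, second_path = open_agent_database(global_db, root, "main")
```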
Phase 3: Replace Session Store APIs
Delete the file-shaped session store surface:
- Replace `loadSessionStore(storePath)` runtime usage with agent/session row APIs.
- Replace `saveSessionStore` and `updateSessionStore` whole-object rewrites with row operations:
  - `getSessionEntry(agentId, sessionKey)`
  - `upsertSessionEntry(agentId, sessionKey, patch | entry)`
  - `deleteSessionEntry(agentId, sessionKey)`
  - `listSessionEntries(agentId, filters)`
  - SQL cleanup for missing transcript references
- Delete `store-writer.ts` and queue tests. Done.
- Keep `sessions.json` parsing only in the migration service and doctor tests.
- Runtime lifecycle fallback reads the SQLite transcript header, not the old JSONL first line.
This is the pass that removes most remaining session-management garbage: file-lock parameters, pruning/truncation vocabulary, store path identity, and tests that prove JSON persistence.
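
To show the difference between row operations and whole-store rewrites, here is a sketch of the four row operations over the `session_entries` table. It is written in Python for illustration, so the snake_case names only mirror the APIs listed above and are not the project's own signatures. The agent id is implied by which per-agent database is passed in; the `agent:main:main` key format and the shallow-merge patch semantics are assumptions, and conflict retry across connections is not shown.

```python
import json
import sqlite3
import time


def open_agent_db(path=":memory:"):
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS session_entries ("
        "session_key TEXT PRIMARY KEY, entry_json TEXT NOT NULL, "
        "updated_at INTEGER NOT NULL)"
    )
    return db


def get_session_entry(db, session_key):
    row = db.execute(
        "SELECT entry_json FROM session_entries WHERE session_key = ?", (session_key,)
    ).fetchone()
    return json.loads(row[0]) if row else None


def upsert_session_entry(db, session_key, patch):
    """Shallow-merge a patch into one row; other rows are never touched."""
    with db:
        merged = {**(get_session_entry(db, session_key) or {}), **patch}
        db.execute(
            "INSERT OR REPLACE INTO session_entries VALUES (?, ?, ?)",
            (session_key, json.dumps(merged), int(time.time() * 1000)),
        )
    return merged


def delete_session_entry(db, session_key):
    with db:
        cur = db.execute("DELETE FROM session_entries WHERE session_key = ?", (session_key,))
    return cur.rowcount > 0


def list_session_entries(db, updated_since=0):
    rows = db.execute(
        "SELECT session_key, entry_json FROM session_entries "
        "WHERE updated_at >= ? ORDER BY session_key",
        (updated_since,),
    ).fetchall()
    return [(key, json.loads(entry)) for key, entry in rows]


db = open_agent_db()
upsert_session_entry(db, "agent:main:main", {"sessionId": "s1"})
upsert_session_entry(db, "agent:main:main", {"model": "example-model"})
entry = get_session_entry(db, "agent:main:main")
listed = list_session_entries(db)
deleted = delete_session_entry(db, "agent:main:main")
```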
Phase 4: Move Transcripts, ACP Streams, Trajectories, And VFS
Make every agent data stream database-native:
- Transcript append writes go through one SQLite transaction that ensures the session header, checks message idempotency, selects the parent tail, inserts into `transcript_events`, and records queryable identity metadata in `transcript_event_identities`.
- ACP parent stream logs become rows, not `.acp-stream.jsonl` files. Done.
- ACP spawn setup no longer persists transcript JSONL paths. Done.
- Runtime trajectory capture writes event rows/artifacts directly. The explicit support/export command can still produce JSONL bundles as an export format. Done.
- Disk workspaces stay on disk when configured as disk mode.
- VFS scratch and experimental VFS-only workspace mode use the agent DB.
The migration imports old JSONL files once, records counts/hashes in
migration_runs, and removes imported files after integrity checks.
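
The single-transaction transcript append described in this phase can be sketched as follows. The table columns come from the target schema; the column types, the `append_event` helper, the header event shape, and the event ids are assumptions made for this illustration.

```python
import json
import sqlite3
import time

SCHEMA = """
CREATE TABLE transcript_events (
    session_id TEXT NOT NULL, seq INTEGER NOT NULL,
    event_json TEXT NOT NULL, created_at INTEGER NOT NULL,
    PRIMARY KEY (session_id, seq)
);
CREATE TABLE transcript_event_identities (
    session_id TEXT NOT NULL, event_id TEXT NOT NULL, seq INTEGER NOT NULL,
    event_type TEXT NOT NULL, has_parent INTEGER NOT NULL, parent_id TEXT,
    message_idempotency_key TEXT, created_at INTEGER NOT NULL,
    PRIMARY KEY (session_id, event_id)
);
"""


def _insert(db, session_id, seq, event_id, event_type, parent_id, key, payload):
    now = int(time.time() * 1000)
    db.execute(
        "INSERT INTO transcript_events VALUES (?, ?, ?, ?)",
        (session_id, seq, json.dumps(payload), now),
    )
    db.execute(
        "INSERT INTO transcript_event_identities VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (session_id, event_id, seq, event_type, int(parent_id is not None), parent_id, key, now),
    )


def _tail(db, session_id):
    return db.execute(
        "SELECT event_id, seq FROM transcript_event_identities "
        "WHERE session_id = ? ORDER BY seq DESC LIMIT 1",
        (session_id,),
    ).fetchone()


def append_event(db, session_id, event_id, event_type, payload, idempotency_key=None):
    """Append one event atomically; returns the seq of the stored event."""
    db.execute("BEGIN IMMEDIATE")  # take the write lock before reading the tail
    try:
        if _tail(db, session_id) is None:  # ensure the session header exists
            _insert(db, session_id, 0, f"{session_id}:header", "session",
                    None, None, {"sessionId": session_id})
        dup = None
        if idempotency_key is not None:  # a retried message maps to the same row
            dup = db.execute(
                "SELECT seq FROM transcript_event_identities "
                "WHERE session_id = ? AND message_idempotency_key = ?",
                (session_id, idempotency_key),
            ).fetchone()
        if dup is not None:
            seq = dup[0]
        else:
            parent_id, tail_seq = _tail(db, session_id)  # parent is the current tail
            seq = tail_seq + 1
            _insert(db, session_id, seq, event_id, event_type,
                    parent_id, idempotency_key, payload)
        db.execute("COMMIT")
        return seq
    except Exception:
        db.execute("ROLLBACK")
        raise


db = sqlite3.connect(":memory:", isolation_level=None)  # transactions are explicit
db.executescript(SCHEMA)
first = append_event(db, "s1", "e1", "message", {"text": "hi"}, "k1")
retry = append_event(db, "s1", "e1-retry", "message", {"text": "hi"}, "k1")
second = append_event(db, "s1", "e2", "message", {"text": "again"})
```

`BEGIN IMMEDIATE` is used so that the tail lookup and the insert happen under one write lock; two writers cannot both pick the same parent tail.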
Phase 5: Backup, Restore, Vacuum, And Verify
Backups remain one archive file:
- Checkpoint every global and agent database.
- Snapshot each DB with SQLite backup semantics or `VACUUM INTO`.
- Archive compact DB snapshots, config, credentials/auth profile files, and requested workspace exports.
- Omit raw live `*.sqlite-wal` and `*.sqlite-shm` files.
- Verify by opening every DB snapshot and running `PRAGMA integrity_check`.
- Restore copies snapshots back to their target paths. This branch resets the unshipped SQLite layout to `user_version = 1`; future shipped schema changes can add explicit migrations when they are needed.
Phase 6: Worker Runtime
Keep worker mode experimental while the database split lands:
- Workers receive agent id, run id, filesystem mode, and DB registry identity.
- Each worker opens its own SQLite connection.
- Parent keeps channel delivery, approvals, config, and cancellation authority.
- Start with one worker per active run; add pooling only after lifecycle and DB connection ownership are stable.
Phase 7: Delete The Old World
After the migration path and row APIs land:
- Remove runtime `sessions.json`, transcript JSONL, sandbox registry JSON, task sidecar SQLite, and plugin-state sidecar SQLite writes.
- Remove JSON/session pruning and truncation code.
- Remove file locks and lock-shaped tests.
- Remove runtime compatibility exports that only exist to keep old session files current.
- Keep explicit support exports as user-requested archive formats only.
Backup And Restore
Backups should be one archive file, but database capture should be SQLite-native:
- Stop long-running write activity or enter a short backup barrier.
- For every global and agent database, run a checkpoint.
- Snapshot each database using SQLite backup semantics or `VACUUM INTO` into a temporary backup directory.
- Archive the compacted database snapshots, config file, credentials directory, selected workspaces, and a manifest.
- Verify the archive by opening every included SQLite snapshot and running `PRAGMA integrity_check`.
Do not rely on raw live *.sqlite, *.sqlite-wal, and *.sqlite-shm copies as
the primary backup format. The archive manifest should record database role,
agent id, schema version, source path, snapshot path, byte size, and integrity
status.
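
A minimal sketch of the checkpoint, snapshot, and verify steps for one database, using `VACUUM INTO` (available in SQLite 3.27 and later). The `snapshot_database` helper and the manifest key names are assumptions; only the recorded fields follow the list above. Archiving and the backup barrier are out of scope here.

```python
import os
import sqlite3
import tempfile


def snapshot_database(source_path, snapshot_path, role, agent_id=None):
    """Checkpoint, snapshot with VACUUM INTO, verify, and return a manifest entry."""
    src = sqlite3.connect(source_path, isolation_level=None)
    try:
        src.execute("PRAGMA wal_checkpoint(TRUNCATE)")  # fold WAL into the main file
        src.execute("VACUUM INTO ?", (snapshot_path,))  # target must not exist yet
    finally:
        src.close()
    snap = sqlite3.connect(snapshot_path)
    try:
        integrity = snap.execute("PRAGMA integrity_check").fetchone()[0]
        schema_version = snap.execute("PRAGMA user_version").fetchone()[0]
    finally:
        snap.close()
    return {
        "role": role,
        "agentId": agent_id,
        "schemaVersion": schema_version,
        "sourcePath": source_path,
        "snapshotPath": snapshot_path,
        "sizeBytes": os.path.getsize(snapshot_path),
        "integrity": integrity,
    }


# Seed a small WAL-mode database to stand in for the global database.
workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "openclaw.sqlite")
seed = sqlite3.connect(source)
seed.execute("PRAGMA journal_mode = WAL")
seed.execute("PRAGMA user_version = 1")
seed.execute("CREATE TABLE kv (scope TEXT, key TEXT, value_json TEXT, updated_at INTEGER)")
seed.execute("INSERT INTO kv VALUES ('gateway', 'example', '{}', 0)")
seed.commit()
seed.close()

entry = snapshot_database(source, os.path.join(workdir, "openclaw.snapshot.sqlite"), role="global")
```

The snapshot is a single compacted file with no WAL or SHM sidecars, so it can be archived and later reopened on its own for verification.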
Restore should rebuild the global database and agent database files from the archive snapshots. Because the SQLite layout has not shipped yet, this refactor keeps only the version-1 schema plus doctor file-to-database import.
Runtime Refactor Plan
- Add database registry APIs.
  - Resolve global DB and per-agent DB paths.
  - Keep the unshipped schemas at `user_version = 1`; do not add schema migration runner code until a shipped schema needs it.
  - Add close/checkpoint/integrity helpers used by tests, backup, and doctor.
- Collapse sidecar SQLite stores.
  - Move plugin state tables into the global database. Done for runtime writes; the unshipped legacy sidecar importer is deleted.
  - Move task registry tables into the global database. Done for runtime writes; the unshipped legacy sidecar importer is deleted.
  - Move Task Flow tables into the global database. Done for runtime writes; the unshipped legacy sidecar importer is deleted.
  - Move builtin memory-search tables into each agent database. Done; explicit custom `memorySearch.store.path` is now removed by doctor config migration.
  - Delete duplicate database openers, WAL setup, permission helpers, and close paths from those subsystems.
- Move agent-owned tables into per-agent databases.
  - Create agent DB on demand through the global database registry. Done.
  - Move runtime session entries, transcript events, VFS rows, and tool artifacts to agent DBs. Done.
  - Do not migrate branch-local shared-DB session entries, transcript events, VFS rows, or tool artifacts; that layout never shipped. Keep only legacy file-to-database import in doctor.
- Replace session store APIs.
  - Remove `storePath` as the runtime identity. Done for the shared inbound channel turn/session-record pipeline and mostly done for hot paths: session metadata, route updates, command persistence, CLI session cleanup, Feishu reasoning previews, transcript-state persistence, subagent depth, auth profile session overrides, parent-fork logic, and QA-lab inspection now resolve the database from canonical agent/session keys. Gateway/TUI/UI/macOS session-list responses now expose `databasePath` instead of legacy `path`; macOS debug surfaces show the per-agent database as read-only state instead of writing `session.store` config. `/status` and chat-driven trajectory export no longer propagate legacy store paths; transcript usage fallback reads SQLite by agent/session identity. Remaining `storePath` call surfaces are migration/debug metadata and older RPC response fields that still carry SQLite keys for compatibility. Gateway combined-session loading no longer has a special runtime branch for non-templated `session.store` values; it aggregates per-agent SQLite rows. The legacy session-lock doctor lane and its `.jsonl.lock` cleanup helper were removed; SQLite is the session concurrency boundary now. Hot runtime call sites use row-oriented helper names such as `resolveSessionRowEntry`; the old `resolveSessionStoreEntry` compatibility alias has been removed from runtime and plugin SDK exports.
  - Use `{ agentId, sessionKey }` row operations. Done: `getSessionEntry`, `upsertSessionEntry`, `deleteSessionEntry`, `patchSessionEntry`, and `listSessionEntries` are SQLite-first APIs that do not require a session store path. Status summary, local agent status, health, and the `openclaw sessions` listing command now read per-agent rows directly and display per-agent SQLite database paths instead of `sessions.json` paths.
  - Replace whole-store delete/insert with `upsertSessionEntry`, `deleteSessionEntry`, `listSessionEntries`, and SQL cleanup queries. Done for runtime: hot paths now use row APIs and conflict-retried row patches; remaining whole-store import/replace helpers are limited to migration import code and SQLite backend tests.
  - Delete `store-writer.ts` and writer-queue tests. Done.
  - Delete runtime legacy-key pruning and alias-delete parameters from session row upserts/patches. Done.
- Delete runtime JSON registry behavior.
  - Make sandbox registry reads and writes SQLite-only. Done.
  - Import monolithic and sharded JSON only from the migration step. Done.
  - Remove sharded registry locks and JSON writes. Done.
  - Keep one typed registry table instead of storing registry rows as generic `kv` if the shape remains hot-path operational state. Done.
- Delete file-lock-shaped session mutation.
  - Done for runtime lock creation and runtime lock APIs.
  - The standalone legacy `.jsonl.lock` doctor cleanup lane is removed.
  - `session.writeLock` is doctor-migrated legacy config, not a typed runtime setting.
  - Generic plugin SDK dedupe persistence no longer uses file locks or JSON files; it writes shared SQLite plugin-state rows. Done.
  - QMD embed coordination uses a SQLite state lease instead of `qmd/embed.lock`. Done.
- Make workers database-aware.
  - Workers open their own SQLite connections.
  - Parent owns delivery, channel callbacks, and config.
  - Worker receives agent id, run id, filesystem mode, and DB registry identity, not live handles.
  - `vfs-only` stays experimental and uses the agent database as its storage root.
  - Keep one worker per active run first. Pooling can wait until DB connection lifetime and cancellation behavior are boring.
- Backup integration.
  - Teach backup to snapshot global and agent databases via SQLite backup or `VACUUM INTO`. Done for discovered `*.sqlite` files under the state asset.
  - Add backup verification for SQLite integrity and schema version. Done for backup creation and archive verification integrity checks.
  - Record backup run metadata in SQLite. Done via the shared `backup_runs` table with archive path, status, and manifest JSON.
  - Include VFS/workspace export only when requested; do not export session internals as JSON.
- Delete obsolete tests and code.
  - Remove tests that assert runtime creation of `sessions.json` or transcript JSONL files. Done for core session store, chat, gateway transcript events, preview, lifecycle, command session-entry updates, auto-reply reset/trace, memory-core dreaming fixtures, approval target routing, session transcript repair, security permission repair, trajectory export, and session export. Active-memory transcript tests now assert SQLite scopes and no temporary or persisted JSONL file creation. The old heartbeat transcript-pruning regression was removed because runtime no longer truncates JSONL transcripts. Agent session-list tool tests no longer model legacy `sessions.json` paths as the gateway response shape; app/UI/macOS tests use `databasePath`. `/status` transcript-usage tests now seed SQLite transcript rows directly instead of writing JSONL files. Context-engine trajectory capture tests now read `trajectory_runtime_events` rows from an isolated agent database instead of reading `session.trajectory.jsonl`. Docker MCP channel seed scripts now seed SQLite rows directly. Direct `sessions.json` writes are limited to doctor fixtures. Memory-core host events and session-corpus scratch rows now live in shared SQLite plugin-state; `events.jsonl` and `session-corpus/*.txt` are legacy doctor migration inputs only. The runtime SQLite session backend test suite no longer fabricates a `sessions.json`; legacy source fixtures now live in the doctor tests that import them.
  - Keep tests that seed legacy files only for migration.
  - Replace JSON-file proof with SQL row proof.
  - Add static bans for runtime writes to legacy session/cache JSON paths. Done for the repo guard.
- Make the migration report auditable.
  - Record migration runs in SQLite with started/finished timestamps, source paths, source hashes, counts, warnings, and backup path. Done: legacy-state migration executions now persist a `migration_runs` report with source path/table inventory, source file SHA-256, sizes, record counts, warnings, and backup path. Done: legacy-state migration executions also persist `migration_sources` rows for source-level audit and future skip/backfill decisions.
  - Make apply idempotent. Re-running after a partial import should either skip an already imported source or merge by stable key. Done: session indexes, transcripts, delivery queues, plugin state, task ledgers, and agent-owned global SQLite rows import through stable keys or upsert/replace semantics, so reruns merge without duplicating durable rows.
  - Failed imports must keep the original source file in place. Done: failed transcript imports now leave the original JSONL source at its detected path, and `migration_sources` records the source as `warning` with `removed_source=0` for the next doctor run.
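
The audit and idempotency rules for migration can be sketched as follows. This is a minimal illustration using Python's standard `sqlite3` module, not the doctor implementation. Only the table names `migration_runs` and `migration_sources`, the `warning` status, and the `removed_source` flag come from the plan; every column layout, the `transcript_events` table, and the helper names `open_db`, `import_transcript`, and `run_migration` are assumptions made for the example. Source removal after a successful import is left out, so the sketch always records `removed_source=0`.

```python
import hashlib
import json
import sqlite3
import time
from pathlib import Path

# Hypothetical schema: column names are assumptions, not the real layout.
SCHEMA = """
CREATE TABLE IF NOT EXISTS migration_runs (
  run_id INTEGER PRIMARY KEY,
  started_at REAL NOT NULL,
  finished_at REAL NOT NULL,
  report_json TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS migration_sources (
  source_path TEXT NOT NULL,
  sha256 TEXT NOT NULL,
  status TEXT NOT NULL,
  record_count INTEGER NOT NULL,
  removed_source INTEGER NOT NULL,
  PRIMARY KEY (source_path, sha256)
);
CREATE TABLE IF NOT EXISTS transcript_events (
  session_id TEXT NOT NULL,
  seq INTEGER NOT NULL,
  event_json TEXT NOT NULL,
  PRIMARY KEY (session_id, seq)
);
"""


def open_db(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def import_transcript(conn, source, session_id):
    """Import one legacy JSONL file; return 'imported', 'skipped', or 'warning'."""
    source = Path(source)
    key = (str(source), hashlib.sha256(source.read_bytes()).hexdigest())
    already = conn.execute(
        "SELECT 1 FROM migration_sources"
        " WHERE source_path = ? AND sha256 = ? AND status = 'imported'",
        key,
    ).fetchone()
    if already:
        return "skipped"  # same path and same content hash: nothing to redo
    try:
        lines = source.read_text(encoding="utf-8").splitlines()
        events = [json.loads(line) for line in lines if line.strip()]
        status = "imported"
    except ValueError:  # covers JSON and UTF-8 decode errors
        events, status = [], "warning"  # the source file stays in place
    with conn:  # one transaction: event rows and the audit row commit together
        conn.executemany(
            "INSERT OR REPLACE INTO transcript_events VALUES (?, ?, ?)",
            [(session_id, seq, json.dumps(e)) for seq, e in enumerate(events)],
        )
        conn.execute(
            "INSERT OR REPLACE INTO migration_sources VALUES (?, ?, ?, ?, 0)",
            (*key, status, len(events)),
        )
    return status


def run_migration(conn, sources):
    """Process (path, session_id) pairs and persist one migration_runs row."""
    started = time.time()
    counts = {"imported": 0, "skipped": 0, "warning": 0}
    for path, session_id in sources:
        counts[import_transcript(conn, path, session_id)] += 1
    with conn:
        cur = conn.execute(
            "INSERT INTO migration_runs (started_at, finished_at, report_json)"
            " VALUES (?, ?, ?)",
            (started, time.time(), json.dumps(counts)),
        )
    return cur.lastrowid, counts
```

Because event rows are keyed by `(session_id, seq)` and written with replace semantics, a rerun after a partial import merges instead of duplicating, and a source that failed to parse is retried on the next run because only `imported` sources are skipped.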
Performance Rules
- One connection per thread/process is fine; do not share handles across workers.
- Use WAL, `foreign_keys=ON`, a 30s busy timeout, and short `BEGIN IMMEDIATE` write transactions.
- Keep write transaction helpers synchronous unless/until an async transaction API adds explicit mutex/backpressure semantics.
- Keep parent delivery writes small and transactional.
- Avoid whole-store rewrites; use row-level upsert/delete.
- Add indexes for list-by-agent, list-by-session, updated-at, run id, and expiration paths before moving hot code.
- Store large artifacts as BLOBs or chunked BLOB rows, not base64 JSON.
- Keep `kv` entries small and scoped.
- Add SQL cleanup for TTL/expiration instead of filesystem pruning. Done for database-owned runtime stores: media, plugin state, plugin blobs, persistent dedupe, and agent cache all expire through SQLite rows. Remaining filesystem cleanup is limited to temporary materializations or explicit removal commands.
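
The connection, transaction, and expiration rules above can be sketched together. This is a minimal illustration in Python's standard `sqlite3` module rather than the project's runtime code. The `agent_cache` table, its columns and index, and the helper names `open_connection`, `write_transaction`, `put_cache`, and `sweep_expired` are all assumptions made for the example.

```python
import sqlite3
import time
from contextlib import contextmanager


def open_connection(path):
    """Open one connection for the calling thread/process; do not share it."""
    # isolation_level=None selects autocommit mode, so the explicit
    # BEGIN IMMEDIATE / COMMIT below are the only transaction control.
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute("PRAGMA foreign_keys=ON")
    conn.execute("PRAGMA busy_timeout=30000")  # 30s, expressed in milliseconds
    # Hypothetical cache table with an index on the expiration path.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS agent_cache ("
        " scope TEXT NOT NULL, key TEXT NOT NULL,"
        " value BLOB NOT NULL, expires_at REAL NOT NULL,"
        " PRIMARY KEY (scope, key))"
    )
    conn.execute(
        "CREATE INDEX IF NOT EXISTS agent_cache_expires"
        " ON agent_cache (expires_at)"
    )
    return conn


@contextmanager
def write_transaction(conn):
    """Short synchronous write transaction that takes the write lock up front."""
    conn.execute("BEGIN IMMEDIATE")
    try:
        yield conn
    except BaseException:
        conn.execute("ROLLBACK")
        raise
    else:
        conn.execute("COMMIT")


def put_cache(conn, scope, key, value, ttl_seconds, now=None):
    """Row-level upsert: replaces one row, never rewrites the whole store."""
    now = time.time() if now is None else now
    with write_transaction(conn):
        conn.execute(
            "INSERT INTO agent_cache (scope, key, value, expires_at)"
            " VALUES (?, ?, ?, ?)"
            " ON CONFLICT (scope, key) DO UPDATE SET"
            " value = excluded.value, expires_at = excluded.expires_at",
            (scope, key, value, now + ttl_seconds),
        )


def sweep_expired(conn, now=None):
    """SQL cleanup for TTL rows; returns the number of rows deleted."""
    now = time.time() if now is None else now
    with write_transaction(conn):
        cur = conn.execute("DELETE FROM agent_cache WHERE expires_at <= ?", (now,))
        return cur.rowcount
```

`BEGIN IMMEDIATE` acquires the write lock when the transaction starts, so contention surfaces at that point and is absorbed by the busy timeout instead of failing midway through a read-then-write transaction. Values are bound as `bytes` and stored as BLOBs, in line with the rule against base64 JSON.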
Static Bans
Add a repo check that fails new runtime writes to legacy state paths:
- `sessions.json`
- `*.trajectory.jsonl` except explicit export/debug paths
- `.acp-stream.jsonl`
- `acp/event-ledger.json`
- `cache/*.json` runtime cache files
- `cron/runs/*.jsonl`
- `jobs-state.json`
- `device-pair-notify.json`
- `devices/pending.json`
- `devices/paired.json`
- `devices/bootstrap.json`
- `nodes/pending.json`
- `nodes/paired.json`
- `identity/device.json`
- `identity/device-auth.json`
- `push/web-push-subscriptions.json`
- `push/vapid-keys.json`
- `push/apns-registrations.json`
- `session-toggles.json`
- Memory-core `.dreams/events.jsonl`
- Memory-core `.dreams/session-corpus/`
- Memory-core `.dreams/daily-ingestion.json`
- Memory-core `.dreams/session-ingestion.json`
- Memory-core `.dreams/short-term-recall.json`
- Memory-core `.dreams/phase-signals.json`
- Memory-core `.dreams/short-term-promotion.lock`
- Skill Workshop `skill-workshop/<workspace>.json`
- Skill Workshop `skill-workshop/skill-workshop-review-*.json`
- Nostr `bus-state-*.json`
- Nostr `profile-state-*.json`
- `calls.jsonl`
- `known-users.json`
- `ref-index.jsonl`
- QQBot `session-*.json`
- BlueBubbles `bluebubbles/catchup/*.json`
- BlueBubbles `bluebubbles/inbound-dedupe/*.json`
- Telegram `update-offset-*.json`
- Telegram `sticker-cache.json`
- Telegram `*.telegram-messages.json`
- Telegram `*.telegram-sent-messages.json`
- Telegram `*.telegram-topic-names.json`
- Telegram `thread-bindings-*.json`
- iMessage `catchup/*.json`
- iMessage `reply-cache.jsonl`
- iMessage `sent-echoes.jsonl`
- Microsoft Teams `msteams-conversations.json`
- Microsoft Teams `msteams-polls.json`
- Microsoft Teams `msteams-delegated.json`
- Microsoft Teams `msteams-pending-uploads.json`
- Microsoft Teams `*.learnings.json`
- Matrix `thread-bindings.json`
- Matrix `inbound-dedupe.json`
- Matrix `startup-verification.json`
- Matrix `storage-meta.json`
- Matrix `crypto-idb-snapshot.json`
- sandbox registry shard JSON files
- native hook relay `/tmp` bridge JSON files
- `plugin-state/state.sqlite`
- `tasks/runs.sqlite`
- `tasks/flows/registry.sqlite`
- `bindings/current-conversations.json`
- `restart-sentinel.json`
- `gateway.<hash>.lock`
- `qmd/embed.lock`
- `settings/voicewake.json`
- `settings/voicewake-routing.json`
- `plugin-binding-approvals.json`
- `plugins/installs.json`
- `audit/file-transfer.jsonl`
- `audit/crestodian.jsonl`
- `crestodian/rescue-pending/*.json`
- `plugins/phone-control/armed.json`
- Memory Wiki `.openclaw-wiki/log.jsonl`
- Memory Wiki `.openclaw-wiki/state.json`
- Memory Wiki `.openclaw-wiki/locks/`
- Memory Wiki `.openclaw-wiki/source-sync.json`
- Memory Wiki `.openclaw-wiki/import-runs/*.json`
- Memory Wiki `.openclaw-wiki/cache/agent-digest.json`
- Memory Wiki `.openclaw-wiki/cache/claims.jsonl`
- ClawHub `.clawhub/lock.json`
- ClawHub `.clawhub/origin.json`
- Browser profile decoration `.openclaw-profile-decorated`
The ban should allow tests to create legacy fixtures and allow migration code to read/import/remove legacy file sources. Unshipped SQLite sidecars stay banned and do not get doctor import allowances.
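
One way such a repo check could work is sketched below in Python. It is an illustration of the idea, not the actual repo guard. It flags any quoted string literal that ends in a banned file name, which is a conservative stand-in for detecting writes. The pattern subset, the allowlisted directory names (`tests`, `doctor`, `migrations`), the `.test.ts`/`.spec.ts` suffix rule, and the function names are all assumptions.

```python
import re
from pathlib import PurePosixPath

# A small subset of the ban list, written as regex fragments.
BANNED = [
    r"sessions\.json",
    r"\.trajectory\.jsonl",
    r"qmd/embed\.lock",
    r"plugin-state/state\.sqlite",
]

QUOTE = r"""["'`]"""          # opening or closing quote of a string literal
NOT_QUOTE = r"""[^"'`\n]*"""  # any literal prefix such as a directory
BANNED_RE = re.compile(QUOTE + NOT_QUOTE + "(?:" + "|".join(BANNED) + ")" + QUOTE)

# Hypothetical conventions for code that may still touch legacy paths.
ALLOWED_DIRS = {"test", "tests", "doctor", "migrations"}
ALLOWED_SUFFIXES = (".test.ts", ".spec.ts")


def is_allowed(path):
    """Tests may create legacy fixtures; migration code may read and remove them."""
    p = PurePosixPath(path)
    return bool(ALLOWED_DIRS & set(p.parts)) or p.name.endswith(ALLOWED_SUFFIXES)


def find_violations(files):
    """Return (path, line number, literal) for each banned literal in runtime code.

    `files` maps a repo-relative POSIX path to that file's source text.
    """
    violations = []
    for path, text in sorted(files.items()):
        if is_allowed(path):
            continue
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = BANNED_RE.search(line)
            if match:
                violations.append((path, lineno, match.group(0)))
    return violations
```

A CI step could read the tracked source files into the `files` mapping and fail the build when `find_violations` returns anything. Matching on literals produces false positives for read-only references, so a real guard would likely need a narrower rule or an explicit per-file exemption list.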
Done Criteria
- Runtime data and cache writes go to the global or agent SQLite database.
- Runtime no longer writes session indexes, transcript JSONL, sandbox registry JSON, task sidecar SQLite, or plugin-state sidecar SQLite. The unshipped task and plugin-state sidecar SQLite importers are deleted.
- Legacy file import is doctor-only.
- Backup produces one archive with compact SQLite snapshots and integrity proof.
- Agent workers can run with disk, VFS scratch, or experimental VFS-only storage.
- Config and explicit credential files remain the only expected persistent non-database control files.
- Repo checks prevent reintroducing legacy runtime file stores.
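
The backup criterion (compact snapshots plus integrity proof) can be illustrated with a short sketch in Python's standard `sqlite3` module. It is not the backup command's implementation. It assumes the schema version is stored in `PRAGMA user_version`, which the plan does not state, and the function names `snapshot_database` and `verify_snapshot` are hypothetical. `VACUUM INTO` requires SQLite 3.27 or newer.

```python
import sqlite3
from pathlib import Path


def snapshot_database(source_path, snapshot_path):
    """Write a compacted copy of a live database with VACUUM INTO."""
    snapshot_path = Path(snapshot_path)
    if snapshot_path.exists():
        # VACUUM INTO refuses to overwrite a non-empty file; fail early and clearly.
        raise FileExistsError(str(snapshot_path))
    # Autocommit mode: VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(str(source_path), isolation_level=None)
    try:
        conn.execute("VACUUM INTO ?", (str(snapshot_path),))
    finally:
        conn.close()


def verify_snapshot(snapshot_path, expected_schema_version):
    """Return True if the snapshot passes integrity_check and the version matches."""
    conn = sqlite3.connect(str(snapshot_path))
    try:
        integrity = [row[0] for row in conn.execute("PRAGMA integrity_check")]
        # Assumption: the schema version lives in user_version.
        version = conn.execute("PRAGMA user_version").fetchone()[0]
    finally:
        conn.close()
    return integrity == ["ok"] and version == expected_schema_version
```

Running `snapshot_database` once for the global database and once per agent database, then `verify_snapshot` on each output before archiving, would give the archive a per-file integrity result that can be stored alongside the backup run metadata.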