mirror of
https://github.com/moltbot/moltbot.git
synced 2026-05-11 12:58:34 +00:00
1366 lines
76 KiB
Markdown
1366 lines
76 KiB
Markdown
---
|
|
summary: "Migration plan for making SQLite the primary durable state and cache layer while keeping config file-backed"
|
|
title: "Database-first state refactor"
|
|
read_when:
|
|
- Moving OpenClaw runtime data, cache, transcripts, task state, or scratch files into SQLite
|
|
- Designing doctor migrations from legacy JSON or JSONL files
|
|
- Changing backup, restore, VFS, or worker storage behavior
|
|
- Removing session locks, pruning, truncation, or JSON compatibility paths
|
|
---
|
|
|
|
# Database-First State Refactor
|
|
|
|
## Decision
|
|
|
|
Use a two-level SQLite layout:
|
|
|
|
- Global database: `~/.openclaw/state/openclaw.sqlite`
|
|
- Agent database: one SQLite database per agent for agent-owned workspace,
|
|
transcript, VFS, artifact, and large per-agent runtime state
|
|
- Configuration stays file-backed: `openclaw.json` and explicit credential or
|
|
auth-profile files remain outside the database until there is a separate
|
|
secrets/export design
|
|
|
|
The global database is the control-plane database. It owns agent discovery,
|
|
shared gateway state, pairing, device/node state, task and flow ledgers, plugin
|
|
state, scheduler runtime state, backup metadata, and migration state.
|
|
|
|
The agent database is the data-plane database. It owns the agent's session
|
|
metadata, transcript event stream, VFS workspace or scratch namespace, tool
|
|
artifacts, run artifacts, and searchable/indexable agent-local cache data.
|
|
|
|
This gives one durable global view without forcing large agent workspaces,
|
|
transcripts, and binary scratch data into the shared gateway write lane.
|
|
|
|
## Hard Contract
|
|
|
|
This migration has one canonical runtime shape:
|
|
|
|
- Session rows persist session metadata only. They must not persist
|
|
`transcriptLocator`, transcript file paths, sibling JSONL paths, lock paths,
|
|
pruning metadata, or file-era compatibility pointers.
|
|
- Transcript identity is always SQLite identity: `{agentId, sessionId}` plus
|
|
optional topic metadata where the protocol needs it.
|
|
- `sqlite-transcript://...` is not a runtime or protocol identity. New code must
|
|
not derive, persist, pass, parse, or migrate transcript locators. Remaining
|
|
uses are cleanup debt in tests or explicit export/debug materialization only.
|
|
- Legacy `sessions.json`, transcript JSONL, `.jsonl.lock`, pruning, truncation,
|
|
and old session-path logic belong only to the doctor migration/import path.
|
|
- Runtime startup, hot reply paths, compaction, reset, recovery, diagnostics,
|
|
TTS, memory hooks, subagents, plugin command routing, protocol boundaries, and
|
|
hooks must pass `{agentId, sessionId}` through the runtime.
|
|
- Tests should seed and assert SQLite transcript rows through
|
|
`{agentId, sessionId}`. Tests that only prove JSONL path forwarding, caller-supplied
|
|
locator preservation, or transcript-file compatibility should be deleted
|
|
unless they cover doctor import, export/debug materialization, or protocol
|
|
shape.
|
|
- `runEmbeddedPiAgent(...)`, prepared worker runs, and the inner embedded
|
|
attempt must not accept transcript locators. They open the SQLite transcript
|
|
manager by `{agentId, sessionId}` and pass that manager to the internalized
|
|
PI-compatible agent session, so stale callers cannot make the runner write
|
|
JSON/JSONL transcripts.
|
|
- Runner diagnostics must store runtime/cache/payload trace records in SQLite.
|
|
Runtime diagnostics must not expose JSONL file override knobs; export/debug
|
|
commands can materialize files explicitly from database rows.
|
|
|
|
Implementation work should keep deleting code until these statements are true
|
|
without exceptions outside doctor/import/export/debug boundaries.
|
|
|
|
## Code-Read Assumptions
|
|
|
|
No follow-up product decisions are blocking this plan. The implementation should
|
|
proceed with these assumptions:
|
|
|
|
- Use `node:sqlite` directly and require the Node 24+ runtime for this storage
|
|
path.
|
|
- Keep exactly one normal configuration file. Do not move config, credentials,
|
|
provider auth profiles, plugin manifests, or Git workspaces into SQLite in
|
|
this refactor.
|
|
- Runtime compatibility files are not required. Legacy JSON and JSONL files are
|
|
migration inputs only. The branch-local SQLite sidecars never shipped and are
|
|
deleted instead of imported.
|
|
- `openclaw doctor --fix` owns the legacy file-to-database migration step.
|
|
Runtime startup and `openclaw migrate` should not carry legacy OpenClaw
|
|
database-upgrade paths.
|
|
- Runtime must not migrate, normalize, or bridge transcript locators. Active
|
|
transcript identity is `{agentId, sessionId}` in SQLite. File paths are
|
|
legacy doctor inputs only, and `sqlite-transcript://...` must disappear from
|
|
runtime, protocol, hook, and plugin surfaces instead of being treated as a
|
|
boundary handle.
|
|
- Codex app-server bindings use the OpenClaw `sessionId` as the canonical
|
|
SQLite key. `sessionKey` is metadata for routing/display and must not replace
|
|
the durable session id or resurrect transcript-file identity.
|
|
- Backup output should remain one archive file. Database contents should enter
|
|
that archive as compact SQLite snapshots, not raw live WAL sidecars.
|
|
- Transcript search is useful but not required for the first database-first
|
|
cut. Design the schema so FTS can be added later.
|
|
- Worker execution should stay experimental behind settings while the database
|
|
boundary settles.
|
|
|
|
## Code-Read Findings
|
|
|
|
The current branch is already past the proof-of-concept stage. The shared
|
|
database exists, Node `node:sqlite` is wired through a small runtime helper, and
|
|
former stores now write to `state/openclaw.sqlite` or the owning
|
|
`openclaw-agent.sqlite` database.
|
|
|
|
The remaining work is not choosing SQLite; it is deleting compatibility-shaped
|
|
interfaces that still look like the old file world:
|
|
|
|
- Some compatibility call surfaces still carry `storePath` for explicit
|
|
migration/export/debug metadata, but hot session reads and writes now resolve
|
|
the SQLite row from `{ agentId, sessionKey }` instead of treating the path as
|
|
the runtime identity.
|
|
- Session writes no longer pass through the old in-process `store-writer.ts`
|
|
queue. SQLite patch writes use conflict detection and bounded retry instead.
|
|
- Legacy path discovery still has valid migration uses, but runtime code should
|
|
stop treating `sessions.json`, transcript JSONL files, and sandbox registry
|
|
JSON as possible write targets.
|
|
- Agent-owned tables live in per-agent SQLite databases. The global DB keeps
|
|
registry/control-plane rows; transcript identity is `{agentId, sessionId}` in
|
|
the per-agent transcript rows. Runtime code must not persist transcript file
|
|
paths or migrate transcript locators.
|
|
- Doctor already imports several legacy files. The cleanup is to make that a
|
|
single explicit migration implementation that doctor calls, with a durable
|
|
migration report.
|
|
|
|
No additional product questions are blocking implementation.
|
|
|
|
## Current Code Shape
|
|
|
|
The branch already has a real shared SQLite base:
|
|
|
|
- `src/state/openclaw-state-db.ts` opens `openclaw.sqlite`, sets WAL,
|
|
`synchronous=NORMAL`, `busy_timeout=30000`, `foreign_keys=ON`, and applies
|
|
the generated schema module derived from
|
|
`src/state/openclaw-state-schema.sql`.
|
|
- Kysely table types and runtime schema modules are generated from disposable
|
|
SQLite databases created from the committed `.sql` files; runtime code no
|
|
longer keeps copy-pasted schema strings for global, per-agent, or proxy
|
|
capture databases.
|
|
- Runtime stores derive selected and inserted row types from those generated
|
|
Kysely `DB` interfaces instead of shadowing SQLite row shapes by hand. Raw SQL
|
|
remains limited to schema application, pragmas, and migration-only DDL.
|
|
- The SQLite schemas are collapsed to `user_version = 1` because this database
|
|
layout has not shipped yet. Runtime openers create the current schema only;
|
|
file-to-database import remains in doctor code, and branch-local
|
|
database upgrade helpers have been deleted.
|
|
- Relational ownership is enforced where the ownership boundary is canonical:
|
|
source migration rows cascade from `migration_runs`, task delivery state
|
|
cascades from `task_runs`, and transcript identity rows cascade from
|
|
transcript events.
|
|
- Current shared tables include `kv`, `agents`, `agent_databases`,
|
|
`plugin_state_entries`, `plugin_blob_entries`, `capture_sessions`,
|
|
`capture_events`, `capture_blobs`, `sandbox_registry_entries`,
|
|
`cron_run_logs`, `cron_jobs`, `commitments`, `delivery_queue_entries`,
|
|
`current_conversation_bindings`, `tui_last_sessions`, `task_runs`,
|
|
`task_delivery_state`, `flow_runs`, `subagent_runs`, `migration_runs`, and
|
|
`backup_runs`.
|
|
- `src/state/openclaw-agent-db.ts` opens
|
|
`agents/<agentId>/agent/openclaw-agent.sqlite`, registers the database in the
|
|
global DB, and owns agent-local session, transcript, VFS, artifact, and cache
|
|
tables. Shared runtime discovery now reads the generated-typed
|
|
`agent_databases` registry instead of reimplementing that query at each call
|
|
site.
|
|
- Subagent run recovery state now lives in typed shared `subagent_runs` rows
|
|
with indexed child, requester, and controller session keys. The old
|
|
`subagents/runs.json` file is doctor migration input only.
|
|
- Current conversation bindings now live in typed shared
|
|
`current_conversation_bindings` rows keyed by normalized conversation id and
|
|
indexed by target session. The old `bindings/current-conversations.json` file
|
|
is doctor migration input only.
|
|
- TUI last-session restore pointers now live in typed shared
|
|
`tui_last_sessions` rows keyed by the hashed TUI connection/session scope.
|
|
The old TUI JSON file is doctor migration input only.
|
|
- Default TTS prefs now live in shared plugin-state SQLite rows keyed under the
|
|
`speech-core` plugin. The old `settings/tts.json` file is doctor migration
|
|
input only; runtime no longer reads or writes TTS prefs JSON files, and the
|
|
legacy path resolver lives in the doctor migration module.
|
|
- Subagent run recovery and OpenRouter model capability cache runtime modules
|
|
now keep SQLite snapshot readers/writers separate from doctor-only legacy JSON
|
|
import helpers.
|
|
- `src/agents/filesystem/virtual-agent-fs.sqlite.ts` implements a SQLite VFS
|
|
over the agent database `vfs_entries` table.
|
|
- `src/agents/runtime-worker.entry.ts` creates per-run SQLite VFS, tool artifact,
|
|
run artifact, and scoped cache stores for workers.
|
|
- Workspace bootstrap completion markers now live in shared SQLite KV keyed by
|
|
resolved workspace path instead of `.openclaw/workspace-state.json`; runtime
|
|
no longer reads or rewrites the legacy workspace marker.
|
|
- Exec approvals now live in shared SQLite KV (`exec.approvals/current`).
|
|
Doctor imports legacy `~/.openclaw/exec-approvals.json`; runtime writes no
|
|
longer create or rewrite that file.
|
|
- Device identity, device auth, and bootstrap runtime modules now keep their
|
|
SQLite snapshot readers/writers separate from doctor-only legacy JSON import
|
|
helpers.
|
|
- Device identity creation fails closed when legacy `identity/device.json`
|
|
exists but the SQLite identity row is missing. Doctor imports and removes that
|
|
file first, so runtime startup cannot silently rotate pairing identity before
|
|
migration.
|
|
- Web push, APNs, Voice Wake, and Voice Wake routing runtime modules now keep
|
|
their SQLite snapshot readers/writers separate from doctor-only legacy JSON
|
|
import helpers.
|
|
- Pairing state, plugin binding approvals, and cron job state now follow the
|
|
same split: runtime modules expose SQLite-backed operations and neutral
|
|
snapshot helpers, while doctor imports/removes the old JSON files through
|
|
`src/commands/doctor/legacy/*` modules.
|
|
- Core pairing and cron runtime modules no longer export legacy JSON path
|
|
builders. Doctor-owned legacy modules construct `pending.json`, `paired.json`,
|
|
`bootstrap.json`, and `cron/jobs.json` source paths for import tests and
|
|
migration only.
|
|
- `src/commands/doctor-sqlite-state.ts` already imports several legacy JSON
|
|
state files, including node host config, into SQLite from doctor.
|
|
- `src/commands/doctor/state-migrations.ts` imports legacy `sessions.json` and
|
|
`*.jsonl` transcripts into SQLite and removes successful sources.
|
|
|
|
The remaining cleanup is mostly consolidation and deletion:
|
|
|
|
- Plugin state now uses the shared `state/openclaw.sqlite` database. The old
|
|
branch-local `plugin-state/state.sqlite` sidecar importer is removed because
|
|
that SQLite layout never shipped.
|
|
- Task and Task Flow runtime tables now live in the shared
|
|
`state/openclaw.sqlite` database instead of `tasks/runs.sqlite` and
|
|
`tasks/flows/registry.sqlite`; the old sidecar importers are removed for the
|
|
same unshipped-layout reason.
|
|
- `src/config/sessions/store.ts` no longer needs `storePath` for inbound
|
|
metadata, route updates, or updated-at reads. Command persistence, CLI
|
|
session cleanup, subagent depth, auth overrides, and transcript session
|
|
identity use agent/session row APIs. Writes are applied as SQLite row patches
|
|
with optimistic conflict retry.
|
|
- Session target resolution now exposes per-agent database targets, not legacy
|
|
`sessions.json` paths. Shared gateway, ACP metadata, doctor route repair, and
|
|
`openclaw sessions` enumerate `agent_databases` plus configured agents.
|
|
- Gateway session routing now uses `resolveGatewaySessionDatabaseTarget`; the
|
|
returned target carries `databasePath` and candidate SQLite row keys instead
|
|
of a legacy session-store file path.
|
|
- Channel session runtime types now expose `{agentId, sessionKey}` for
|
|
updated-at reads, inbound metadata, and last-route updates. The old
|
|
`saveSessionStore(storePath, store)` compatibility type is gone.
|
|
- Plugin runtime, extension API, root library, and `config/sessions` barrel
|
|
surfaces no longer export `resolveStorePath`; plugin code uses SQLite-backed
|
|
session row helpers. The old `resolveLegacySessionStorePath` helper is gone;
|
|
legacy `sessions.json` path construction is now local to migration and test
|
|
fixtures.
|
|
- `src/config/sessions/store-backend.sqlite.ts` now stores canonical session
|
|
entries in the per-agent database and has row-level read/upsert/delete patch
|
|
support. Runtime upsert/patch/delete no longer scans for case variants or
|
|
prunes legacy alias keys; doctor owns canonicalization. The
|
|
standalone JSON import helper is gone, and migration merges upsert newer rows
|
|
instead of replacing the whole session table.
|
|
- Transcript events, VFS rows, and tool artifact rows now write to the per-agent
|
|
database. The unshipped global transcript-file mapping table is gone; doctor
|
|
records legacy source paths in durable migration rows instead.
|
|
- Runtime transcript lookup no longer scans JSONL byte offsets or probes legacy
|
|
transcript files. Gateway chat/media/history paths read transcript rows from
|
|
SQLite; JSONL is now a legacy doctor input or in-memory export
|
|
encoding, not a runtime state file.
|
|
- Runtime transcript store APIs resolve SQLite scope, not filesystem paths. The
|
|
old `resolve...ForPath` helper and unused `transcriptPath` write options are
|
|
gone from runtime callers.
|
|
- Runtime session resolution now uses `{agentId, sessionId}` and must not derive
|
|
`sqlite-transcript://<agent>/<session>` strings for external boundaries.
|
|
Legacy absolute JSONL paths are doctor migration inputs only.
|
|
- `runEmbeddedPiAgent(...)` no longer has a transcript-locator parameter.
|
|
Prepared worker descriptors also omit transcript locators. Runtime session
|
|
state and queued follow-up runs carry `{agentId, sessionId}` instead of
|
|
derived transcript handles.
|
|
- Embedded compaction now takes SQLite scope from `agentId` and `sessionId`.
|
|
Compaction hooks, context-engine calls, CLI delegation, and protocol replies
|
|
must not receive derived `sqlite-transcript://...` handles. Export/debug code
|
|
can materialize files from rows explicitly, but it does not feed those names
|
|
back into runtime identity.
|
|
- Context-engine delegation no longer parses a transcript locator to recover
|
|
agent identity. The prepared runtime context carries the resolved `agentId`
|
|
into the built-in compaction bridge.
|
|
- Transcript rewrite and live tool-result truncation now read and persist
|
|
transcript state by `{agentId, sessionId}` and do not derive temporary
|
|
locators for transcript-update event payloads.
|
|
- The transcript-state helper surface no longer has locator-based
|
|
`readTranscriptState`, `replaceTranscriptStateEvents`, or
|
|
`persistTranscriptStateMutation` variants. Runtime callers must use the
|
|
`{agentId, sessionId}` APIs. Doctor import reads legacy files by explicit file
|
|
path and writes SQLite rows; it does not migrate locator strings.
|
|
- The runtime session-manager contract no longer exposes `open(locator)`,
|
|
`forkFrom(locator)`, or `setTranscriptLocator(...)`. Persisted session
|
|
managers open and fork by `{agentId, sessionId}` only.
|
|
- Gateway transcript reader APIs are scope-first. They take
|
|
`{agentId, sessionId}` and do not accept a positional transcript locator that
|
|
could accidentally become runtime identity. Locator parsing is explicit and
|
|
limited to boundary adapters.
|
|
- Transcript update events are also scope-first. `emitSessionTranscriptUpdate`
|
|
no longer accepts a bare locator string, and listeners route by
|
|
`{agentId, sessionId}` without parsing a handle.
|
|
- Gateway session-message broadcast resolves session keys from agent/session
|
|
scope, not from a transcript locator. The old transcript-locator-to-session
|
|
key resolver/cache is gone.
|
|
- Gateway session-history SSE filters live updates by agent/session scope. It no
|
|
longer canonicalizes transcript locator candidates, realpaths, or file-shaped
|
|
transcript identities to decide whether a stream should receive an update.
|
|
- Session lifecycle hooks no longer derive or expose transcript locators on
|
|
`session_end`. Hook consumers get `sessionId`, `sessionKey`, next-session
|
|
ids, and agent context; transcript files are not part of the lifecycle
|
|
contract.
|
|
- Reset hooks no longer derive or expose transcript locators either. The
|
|
`before_reset` payload carries recovered SQLite messages plus the reset
|
|
reason, while session identity stays in hook context.
|
|
- Agent harness reset no longer accepts a transcript locator. Reset dispatch is
|
|
scoped by `sessionId`/`sessionKey` plus reason.
|
|
- Agent extension session types no longer expose `transcriptLocator`; extensions
|
|
should use session context and runtime APIs rather than reaching for a
|
|
file-shaped transcript identity.
|
|
- Plugin compaction hooks no longer expose transcript locators. Hook context
|
|
already carries session identity, and transcript reads must go through SQLite
|
|
scope-aware APIs instead of file-shaped handles.
|
|
- `before_agent_finalize` hooks no longer expose `transcriptPath`, including
|
|
native hook relay payloads. Finalization hooks use session context only.
|
|
- Gateway reset responses no longer synthesize a transcript locator on the
|
|
returned entry. The reset creates SQLite transcript rows, returns the clean
|
|
session entry, and leaves transcript access to scope-aware readers.
|
|
- Embedded run and compaction results no longer surface transcript locators for
|
|
session accounting. Automatic compaction updates only the active `sessionId`,
|
|
compaction counters, and token metadata.
|
|
- Embedded attempt results no longer return `transcriptLocatorUsed`, and
|
|
context-engine `compact()` results no longer return transcript locators.
|
|
Runtime retry loops only accept a successor `sessionId`.
|
|
- Delivery-mirror transcript append results no longer return transcript
|
|
locators. Callers get the appended `messageId`; transcript update signals use
|
|
SQLite scope.
|
|
- Parent-session fork helpers return only the forked `sessionId`. Subagent
|
|
preparation passes the child agent/session scope to engines.
|
|
- CLI runner params and history reseeding no longer accept transcript locators.
|
|
CLI history reads resolve the SQLite transcript scope from `{agentId,
|
|
sessionId}` and session key context.
|
|
- CLI and embedded-runner test fixtures now seed and read SQLite transcript rows
|
|
by session id instead of pretending active sessions are `*.jsonl` files or
|
|
passing a `sqlite-transcript://...` string through runtime params.
|
|
- Session tool-result guard events emit from known session scope even when an
|
|
in-memory manager has no derived locator. Its tests no longer fake active
|
|
`/tmp/*.jsonl` transcript files.
|
|
- BTW and compaction-checkpoint helpers now read and fork transcript rows by
|
|
SQLite scope. Checkpoint metadata now stores session ids and leaf/entry ids
|
|
only; derived locators are no longer written into checkpoint payloads.
|
|
- Gateway transcript-key lookup uses SQLite transcript scope at protocol
|
|
boundaries and no longer realpaths or stats transcript filenames.
|
|
- Automatic compaction transcript rotation writes successor transcript rows
|
|
directly through the SQLite transcript store. Session rows keep only the
|
|
successor session identity, not a durable JSONL path or persisted locator.
|
|
- Embedded context-engine compaction uses SQLite-named transcript rotation
|
|
helpers. The rotation tests no longer construct JSONL successor paths or
|
|
model active sessions as files.
|
|
- Managed outgoing image retention keys its transcript-message cache from
|
|
SQLite transcript stats instead of filesystem stat calls.
|
|
- Runtime session locks and the standalone legacy `.jsonl.lock` doctor
|
|
lane have been removed.
|
|
- The Microsoft Teams runtime barrel no longer re-exports the old plugin SDK
|
|
file-lock helper; its durable state paths are SQLite-backed.
|
|
- Session age/count pruning and explicit session cleanup have been removed.
|
|
Doctor owns legacy import; stale sessions are reset or deleted explicitly.
|
|
- Doctor no longer treats `agents/<agent>/sessions/` as required runtime
|
|
state. It only scans that directory when it already exists, as legacy import
|
|
or orphan-cleanup input.
|
|
- Gateway `sessions.resolve`, session patch/reset/compact paths, subagent
|
|
spawning, fast abort, ACP metadata, heartbeat-isolated sessions, and TUI
|
|
patching no longer migrate or prune legacy session keys as a side effect of
|
|
normal runtime work.
|
|
- CLI command session resolution now returns the owning `agentId` instead of a
|
|
`storePath`, and it no longer copies legacy main-session rows during normal
|
|
`--to` or `--session-id` resolution. Legacy main-row canonicalization belongs
|
|
to doctor only.
|
|
- Runtime subagent depth resolution no longer reads `sessions.json` or JSON5
|
|
session stores. It reads SQLite `session_entries` by agent id, and legacy
|
|
depth/session metadata can only enter through the doctor import path.
|
|
- Auth profile session overrides persist through direct `{agentId, sessionKey}`
|
|
row upserts instead of lazy-loading a file-shaped session-store runtime.
|
|
- Auto-reply verbose gating and session update helpers now read/upsert SQLite
|
|
session rows by session identity and no longer require a legacy store path
|
|
before touching persisted row state.
|
|
- Command-run session metadata helpers now use entry-oriented names and module
|
|
paths; the old `session-store` command helper surface has been removed.
|
|
- Bootstrap header seeding and manual compaction boundary hardening now mutate
|
|
SQLite transcript rows directly. Runtime callers pass session identity, not
|
|
writable `.jsonl` paths.
|
|
- Silent session-rotation replay copies recent user/assistant turns by
|
|
`{agentId, sessionId}` from SQLite transcript rows. It no longer accepts
|
|
source or target transcript locators.
|
|
- Fresh runtime session rows no longer store transcript locators. Callers use
|
|
`{agentId, sessionId}` directly; export/debug commands can choose output file
|
|
names when they materialize rows.
|
|
- Starting a new persisted transcript session now always opens SQLite rows by
|
|
scope. The session manager no longer reuses a previous file-era transcript
|
|
path or locator as the identity for the new session.
|
|
- Plugin runtime no longer exposes `api.runtime.agent.session.resolveTranscriptLocatorPath`;
|
|
plugin code uses SQLite row helpers and scope values.
|
|
- The public `session-store-runtime` SDK surface no longer exports database
|
|
close/reset test helpers; plugin tests import those through the testing SDK
|
|
surface instead.
|
|
- Active-memory blocking subagent runs use SQLite transcript rows instead of
|
|
creating temporary or persisted `session.jsonl` files under plugin state. The
|
|
old `transcriptDir` option is now a compatibility no-op.
|
|
- One-off slug generation and Crestodian planner runs use SQLite transcript rows
|
|
instead of creating temporary `session.jsonl` files.
|
|
- `llm-task` helper runs and hidden commitment extraction also use SQLite
|
|
transcript rows, so these model-only helper sessions no longer create
|
|
temporary JSON/JSONL transcript files.
|
|
- `TranscriptSessionManager` default create, list, fork, and branch paths use
|
|
SQLite transcript rows unless a caller explicitly supplies a legacy transcript
|
|
directory for doctor/import/debug handling.
|
|
- Parent transcript fork decisions and fork creation no longer accept
|
|
`storePath` or `sessionsDir`; they use `{agentId, sessionId}` SQLite
|
|
transcript scope instead of retained filesystem path metadata.
|
|
- Memory-host no longer exports no-op session-directory transcript
|
|
classification helpers; transcript filtering now derives from SQLite row
|
|
metadata during entry construction.
|
|
- Memory-host and QMD session-export tests use SQLite transcript scopes. Old
|
|
`agents/<agentId>/sessions/*.jsonl` paths stay covered only where a test is
|
|
intentionally proving doctor/import/export compatibility.
|
|
- QA-lab raw session inspection now uses `sessions.list` through the gateway
|
|
instead of reading `agents/qa/sessions/sessions.json`; MSteams feedback
|
|
appends directly to SQLite transcripts without fabricating a JSONL path.
|
|
- Shared inbound channel turns now carry `{agentId, sessionKey}` rather than a
|
|
legacy `storePath`. LINE, WhatsApp, Slack, Discord, Telegram, Matrix, Signal,
|
|
iMessage, BlueBubbles, Feishu, Google Chat, IRC, Nextcloud Talk, Zalo,
|
|
Zalo Personal, QA Channel, Microsoft Teams, Mattermost, Synology Chat, Tlon,
|
|
Twitch, and QQBot recording paths now read updated-at metadata and record
|
|
inbound session rows through SQLite identity.
|
|
- Transcript locator persistence is removed from active session rows.
|
|
`resolveSessionTranscriptTarget` returns `agentId`, `sessionId`, and optional
|
|
topic metadata; doctor is the only code that imports legacy transcript file
|
|
names.
|
|
- Embedded PI runs reject incoming transcript handles. They use the SQLite
|
|
`{agentId, sessionId}` identity before worker launch and again before the
|
|
attempt touches transcript state. A stale `/tmp/*.jsonl` input cannot select a
|
|
runtime write target.
|
|
- Cache trace, Anthropic payload, and diagnostics timeline records now write to
|
|
SQLite diagnostic KV rows only. The old `diagnostics.cacheTrace.filePath`,
|
|
`OPENCLAW_CACHE_TRACE_FILE`, `OPENCLAW_ANTHROPIC_PAYLOAD_LOG_FILE`, and
|
|
`OPENCLAW_DIAGNOSTICS_TIMELINE_PATH` JSONL override paths are removed.
|
|
- Cron persistence now reconciles SQLite `cron_jobs` rows instead of
|
|
deleting/reinserting the whole job table on each save. Plugin target
|
|
writebacks update matching cron rows directly and keep runtime cron state in
|
|
the same state-database transaction.
|
|
- Cron runtime callers now use a stable SQLite cron store key. Legacy
|
|
`cron.store` paths are doctor import inputs only; production gateway, task
|
|
maintenance, status, run-log, and Telegram target writeback paths use
|
|
`resolveCronStoreKey` and no longer path-normalize the key. Cron status now
|
|
reports `storeKey` rather than the old file-shaped `storePath` field.
|
|
- ACP spawn no longer resolves or persists transcript JSONL file paths. Spawn
|
|
and thread-bind setup persist the SQLite session row directly and keep the
|
|
session id as the retained transcript identity.
|
|
- ACP session metadata APIs now read/list/upsert SQLite rows by `agentId` and
|
|
no longer expose `storePath` as part of the ACP session entry contract.
|
|
- Session usage accounting and gateway usage aggregation now resolve transcripts
|
|
by `{agentId, sessionId}` only. The cost/usage cache and discovered-session
|
|
summaries no longer synthesize or return transcript locator strings.
|
|
- Gateway chat append, abort-partial persistence, `/sessions.send`, and
|
|
webchat media transcript writes append directly through SQLite transcript
|
|
scope. The gateway transcript-injection helper no longer accepts a
|
|
`transcriptLocator` parameter.
|
|
- SQLite transcript discovery now lists transcript scopes and stats only:
|
|
`{agentId, sessionId, updatedAt, eventCount}`. The dead
|
|
`listSqliteSessionTranscriptLocators` compatibility helper and per-row
|
|
`locator` field are gone.
|
|
- Transcript repair runtime now exposes only
|
|
`repairTranscriptSessionStateIfNeeded({agentId, sessionId})`. The old
|
|
locator-based repair helper is deleted; doctor/debug code reads explicit
|
|
source file paths and never migrates locator strings.
|
|
- ACP replay ledger runtime now stores per-session replay rows in the shared
|
|
SQLite state database instead of `acp/event-ledger.json`; doctor imports and
|
|
removes the legacy file.
|
|
- Gateway transcript reader helpers now live in
|
|
`src/gateway/session-transcript-readers.ts` instead of the old
|
|
`session-utils.fs` module name. The fallback retry history check is named for
|
|
SQLite transcript content instead of the old file-helper surface.
|
|
- Gateway injected-chat and compaction helpers now pass SQLite transcript scope
|
|
through internal helper APIs instead of naming values transcript paths or
|
|
source files.
|
|
- Bootstrap continuation detection now checks SQLite transcript rows through
|
|
`hasCompletedBootstrapTranscriptTurn`; it no longer exposes a file-shaped
|
|
helper name.
|
|
- Embedded-runner tests now use SQLite transcript identity, and opening a new
|
|
transcript manager always requires an explicit `sessionId`.
|
|
- Memory indexing helpers now use SQLite transcript terminology end to end:
|
|
host exports list/build session transcript entries, targeted sync queues
|
|
`sessionTranscripts`, and QMD/builtin indexers no longer expose file-shaped
|
|
helper names.
|
|
- The generic plugin SDK persistent-dedupe helper no longer exposes file-shaped
|
|
options. Callers provide SQLite scope keys and durable dedupe rows live in
|
|
shared plugin state.
|
|
- Microsoft Teams SSO and delegated OAuth tokens moved from locked JSON files
|
|
to SQLite plugin state. Doctor imports `msteams-sso-tokens.json` and
|
|
`msteams-delegated.json`, rebuilds canonical SSO token keys from payloads,
|
|
and removes the source files.
|
|
- Matrix sync cache state moved from `bot-storage.json` to SQLite plugin
|
|
state. Doctor imports legacy raw or wrapped sync payloads and removes the
|
|
source file.
|
|
- Matrix legacy crypto migration status moved from
|
|
`legacy-crypto-migration.json` to SQLite plugin state. Doctor imports the
|
|
old status file; Matrix SDK IndexedDB snapshots moved from
|
|
`crypto-idb-snapshot.json` to SQLite plugin blobs. Recovery keys remain
|
|
file-backed because they are Matrix user secret material rather than
|
|
OpenClaw runtime cache rows.
|
|
- Memory Wiki activity logs now use SQLite plugin state instead of
|
|
`.openclaw-wiki/log.jsonl`. The Memory Wiki migration provider imports old
|
|
JSONL logs; wiki markdown and user vault content stay file-backed as
|
|
workspace content.
|
|
- Memory Wiki no longer creates `.openclaw-wiki/state.json` or the unused
|
|
`.openclaw-wiki/locks` directory. The migration provider removes those retired
|
|
plugin metadata files if an older vault still has them.
|
|
- Crestodian audit entries now use core SQLite plugin state instead of
|
|
`audit/crestodian.jsonl`. Doctor imports the legacy JSONL audit log and
|
|
removes it after successful import.
|
|
- Config write/observe audit entries now use core SQLite plugin state instead
|
|
of `logs/config-audit.jsonl`. Doctor imports the legacy JSONL audit log and
|
|
removes it after successful import.
|
|
- Crestodian rescue pending approvals now use core SQLite plugin state instead
|
|
of `crestodian/rescue-pending/*.json`. Doctor imports legacy pending approval
|
|
files and removes them after successful import.
|
|
- Phone Control temporary arm state now uses SQLite plugin state instead of
|
|
`plugins/phone-control/armed.json`. Doctor imports the legacy armed-state
|
|
file into the `phone-control/arm-state` namespace and removes the file.
|
|
- Doctor no longer repairs JSONL transcripts in place or creates backup JSONL
|
|
files. It imports the active branch into SQLite and removes the legacy source.
|
|
- Session-memory hook transcript lookup uses `{agentId, sessionId}` scope-only
|
|
SQLite reads. Its helper no longer accepts or derives transcript locators,
|
|
legacy file reads, or file-rewrite options.
|
|
- Codex app-server conversation bindings now key SQLite plugin state by
|
|
OpenClaw session key or explicit `{agentId, sessionId}` scope. They must not
|
|
preserve transcript-path fallback bindings.
|
|
- Codex app-server mirrored-history reads use the SQLite transcript scope only;
|
|
they must not recover identity from transcript file paths.
|
|
- Role-ordering and compaction reset paths no longer unlink old transcript
|
|
files; reset only rotates the SQLite session row and transcript identity.
|
|
- Gateway reset and checkpoint responses return clean session rows plus session
|
|
ids. They no longer synthesize SQLite transcript locators for clients.
|
|
- Memory-core dreaming no longer prunes session rows by probing for missing
|
|
JSONL files. Subagent cleanup goes through the session runtime API instead of
|
|
filesystem existence checks. Its transcript-ingestion tests seed SQLite rows
|
|
directly instead of creating `agents/<id>/sessions` fixtures or locator
|
|
placeholders.
|
|
- Gateway doctor memory status reads short-term recall and phase-signal counts
|
|
from SQLite plugin-state rows instead of `memory/.dreams/*.json`; CLI and
|
|
doctor output now label that storage as a SQLite store, not a path.
|
|
- Sandbox container/browser registries now use the shared
|
|
`sandbox_registry_entries` SQLite table. Doctor imports legacy monolithic and
|
|
sharded JSON registry files and removes successful sources.
|
|
- Commitments now use a typed shared `commitments` table instead of a
|
|
whole-store JSON blob. Doctor imports legacy `commitments.json` and removes
|
|
it after a successful import.
|
|
- Cron job definitions, schedule state, and run history no longer have runtime
|
|
JSON writers or readers. Runtime uses `cron_jobs` rows with inline runtime
|
|
state plus `cron_run_logs`; doctor imports legacy `jobs.json`,
|
|
`jobs-state.json`, and `runs/*.jsonl` files and removes the imported sources.
|
|
Plugin target writebacks update matching `cron_jobs` rows instead of loading
|
|
and replacing the whole cron store.
|
|
- Discord model-picker preferences, command-deploy hashes, and thread bindings
|
|
now use shared SQLite plugin state. Their legacy JSON import plans live in the
|
|
Discord plugin setup/doctor migration surface, not in core migration code.
|
|
- BlueBubbles catchup cursors and inbound dedupe markers now use shared SQLite
|
|
plugin state. Their legacy JSON import plans live in the BlueBubbles plugin
|
|
setup/doctor migration surface, not in core migration code.
|
|
- Telegram update offsets, sticker cache rows, sent-message cache rows,
|
|
topic-name cache rows, and thread bindings now use shared SQLite plugin
|
|
state. Their legacy JSON import plans live in the Telegram plugin
|
|
setup/doctor migration surface, not in core migration code.
|
|
- iMessage reply short-id mappings and sent-echo dedupe rows now use shared
|
|
SQLite plugin state. The old `imessage/reply-cache.jsonl` and
|
|
`imessage/sent-echoes.jsonl` files are doctor inputs only.
|
|
- Feishu message dedupe rows now use shared SQLite plugin state instead of
|
|
`feishu/dedup/*.json` files. Its legacy JSON import plan lives in the Feishu
|
|
plugin setup/doctor migration surface, not in core migration code.
|
|
- Microsoft Teams conversations, polls, pending upload buffers, and feedback
|
|
learnings now use shared SQLite plugin state/blob tables. The pending upload
|
|
path uses `plugin_blob_entries` so media buffers are stored as SQLite BLOBs
|
|
instead of base64 JSON. The runtime helper names now use SQLite/state naming
|
|
rather than `*-fs` file-store naming, and the old `storePath` shim is gone
|
|
from these stores. Its legacy JSON import plan lives in the Microsoft Teams
|
|
plugin setup/doctor migration surface.
|
|
- Zalo hosted outbound media now uses shared SQLite `plugin_blob_entries`
|
|
instead of `openclaw-zalo-outbound-media` JSON/bin temp sidecars.
|
|
- Diffs viewer HTML and metadata now use shared SQLite `plugin_blob_entries`
|
|
instead of `meta.json`/`viewer.html` temp files. Rendered PNG/PDF outputs stay
|
|
temp materializations because channel delivery still needs a file path.
|
|
- File Transfer audit decisions now use shared SQLite `plugin_state_entries`
|
|
instead of the unbounded `audit/file-transfer.jsonl` runtime log. Doctor
|
|
imports the legacy JSONL audit file into plugin state and removes the source
|
|
after a clean import.
|
|
- ACPX process leases and gateway instance identity now use shared SQLite plugin
|
|
state. Doctor imports the legacy `gateway-instance-id` file into plugin state
|
|
and removes the source.
|
|
- Gateway media attachments now use the shared `media_blobs` SQLite table as
|
|
the canonical byte store. Local paths returned to channel and sandbox
|
|
compatibility surfaces are temp materializations of the database row, not the
|
|
durable media store.
|
|
- Cache-trace diagnostics, Anthropic payload diagnostics, raw model stream
|
|
diagnostics, and diagnostics timeline events now write SQLite diagnostic rows
|
|
instead of `logs/*.jsonl` files.
|
|
Runtime path override flags and env vars have been removed; export/debug
|
|
commands can materialize files explicitly from database rows.
|
|
- Gateway singleton locks now use shared SQLite KV instead of temp-dir lock
|
|
files. Done.
|
|
- Gateway restart sentinel state now uses shared SQLite KV instead of
|
|
`restart-sentinel.json`; runtime code clears the SQLite row directly and no
|
|
longer carries file cleanup plumbing.
|
|
- Gateway restart intent and supervisor handoff state now use shared SQLite KV
|
|
instead of `gateway-restart-intent.json` and
|
|
`gateway-supervisor-restart-handoff.json` sidecars.
|
|
- Gateway singleton coordination now uses SQLite KV rows under
|
|
`gateway_locks` instead of writing `gateway.<hash>.lock` files. The lock
|
|
still records pid, config path, process start time, and stale-owner metadata,
|
|
but SQLite owns the atomic acquire/release boundary.
|
|
- Main-session restart recovery now discovers candidate agents through the
|
|
SQLite `agent_databases` registry instead of scanning `agents/*/sessions`
|
|
directories.
|
|
- Gemini session-corruption recovery now deletes only the SQLite session row;
|
|
it no longer needs a legacy `storePath` gate or tries to unlink a derived
|
|
transcript JSONL path.
|
|
- Path override handling now treats literal `undefined`/`null` environment
|
|
values as unset, preventing accidental repo-root `undefined/state/*.sqlite`
|
|
databases during tests or shell handoffs.
|
|
- Config health fingerprints now use shared SQLite KV instead of
|
|
`logs/config-health.json`, keeping the normal config file as the only
|
|
non-credential configuration document.
|
|
- Voice Wake trigger and routing settings now use shared SQLite KV instead of
|
|
`settings/voicewake.json` and `settings/voicewake-routing.json`; doctor imports
|
|
the legacy JSON files and removes them after a successful migration.
|
|
- Plugin conversation binding approvals now use shared SQLite KV instead of
|
|
`plugin-binding-approvals.json`; the legacy file is a doctor migration input.
|
|
- Generic current-conversation bindings now store typed
|
|
`current_conversation_bindings` rows instead of rewriting
|
|
`bindings/current-conversations.json`; doctor imports the legacy JSON file and
|
|
removes it after a successful migration.
|
|
- Memory Wiki imported-source sync ledgers now store one SQLite plugin-state row
|
|
per vault/source key instead of rewriting `.openclaw-wiki/source-sync.json`;
|
|
the migration provider imports and removes the legacy JSON ledger.
|
|
- Memory Wiki ChatGPT import-run records now store one SQLite plugin-state row
|
|
per vault/run id instead of writing `.openclaw-wiki/import-runs/*.json`.
|
|
Rollback snapshots remain explicit vault files until import-run snapshot
|
|
archival is moved into blob storage.
|
|
- Memory Wiki compiled digests now store SQLite plugin blob rows instead of
|
|
writing `.openclaw-wiki/cache/agent-digest.json` and
|
|
`.openclaw-wiki/cache/claims.jsonl`. The migration provider imports old cache
|
|
files and removes the cache directory when it becomes empty.
|
|
- ClawHub skill install tracking now stores one SQLite plugin-state row per
|
|
workspace/skill instead of writing or reading `.clawhub/lock.json` and
|
|
`.clawhub/origin.json` sidecars at runtime. Doctor/migrate imports the legacy
|
|
sidecars from configured agent workspaces and removes them after a clean
|
|
import.
|
|
- The installed plugin index now reads and writes shared SQLite KV
|
|
`installed_plugin_index/current` instead of `plugins/installs.json`; the
|
|
legacy JSON file is only a doctor migration input and is removed after import.
|
|
- The legacy `plugins/installs.json` path helper now lives in doctor legacy
|
|
code. Runtime plugin-index modules expose only SQLite-backed persistence
|
|
options, not a JSON file path.
|
|
- Matrix sync cache, storage metadata, thread bindings, inbound dedupe markers,
|
|
startup verification cooldown state, and SDK IndexedDB crypto snapshots now
|
|
use shared SQLite plugin state/blob tables. Their legacy JSON import plan
|
|
lives in the Matrix plugin setup/doctor migration surface. Matrix recovery
|
|
keys remain explicit Matrix client files until a separate credential/secret
|
|
export design exists.
|
|
- Nostr bus cursors and profile publish state now use shared SQLite plugin
|
|
state. Their legacy JSON import plan lives in the Nostr plugin setup/doctor
|
|
migration surface.
|
|
- Active Memory session toggles now use shared SQLite plugin state instead of
|
|
`session-toggles.json`; toggling memory back on deletes the row instead of
|
|
rewriting a JSON object.
|
|
- Skill Workshop proposals and review counters now use shared SQLite plugin
|
|
state instead of per-workspace `skill-workshop/<workspace>.json` stores. Each
|
|
proposal is a separate row under `skill-workshop/proposals`, and the review
|
|
counter is a separate row under `skill-workshop/reviews`.
|
|
- Skill Workshop reviewer subagent runs now use the runtime session transcript
|
|
resolver instead of creating `skill-workshop/<sessionId>.json` sidecar session
|
|
paths.
|
|
- ACPX process leases now use shared SQLite plugin state under
|
|
`acpx/process-leases` instead of a whole-file `process-leases.json` registry.
|
|
Each lease is stored as its own row, preserving startup stale-process reaping
|
|
without a runtime JSON rewrite path.
|
|
- Subagent run registry persistence uses typed shared `subagent_runs` rows. The
|
|
old `subagents/runs.json` path is now only a doctor migration input, and
|
|
runtime helper names no longer describe the state layer as disk-backed.
|
|
- Backup stages the state directory before archiving, copies non-database files,
|
|
snapshots `*.sqlite` databases with `VACUUM INTO`, omits live WAL/SHM
|
|
sidecars, records snapshot metadata in the archive manifest, and records
|
|
completed backup runs in SQLite with the archive manifest.
|
|
- Plain setup and onboarding workspace preparation no longer create
|
|
`agents/<agentId>/sessions/` directories. They create config/workspace only;
|
|
SQLite session rows and transcript rows are created on demand in the
|
|
per-agent database.
|
|
- Security permission repair now targets the global and per-agent SQLite
|
|
databases plus WAL/SHM sidecars instead of `sessions.json` and transcript
|
|
JSONL files.
|
|
- `openclaw reset --scope config+creds+sessions` removes per-agent
|
|
`openclaw-agent.sqlite` databases plus WAL/SHM sidecars, not only legacy
|
|
`sessions/` directories.
|
|
- Gateway aggregate session helpers now use entry-oriented names:
|
|
`loadCombinedSessionEntriesForGateway` returns `{ databasePath, entries }`.
|
|
The old combined-store naming has been removed from runtime callers.
|
|
- Docker MCP channel seeding now writes the main session row and transcript
|
|
events into the per-agent SQLite database instead of creating
|
|
`sessions.json` and a JSONL transcript.
|
|
- The bundled session-memory hook now resolves previous-session context from
|
|
SQLite by `{agentId, sessionId}` and only treats retained transcript paths as
|
|
legacy metadata. It no longer scans or synthesizes `workspace/sessions`
|
|
directories.
|
|
- `migration_runs` records legacy-state migration executions with status,
|
|
timestamps, and JSON reports.
|
|
- `migration_sources` records each imported legacy file source with hash, size,
|
|
record count, target table, run id, status, and source-removal state.
|
|
- `backup_runs` records backup archive paths, status, and JSON manifests.
|
|
- `check:database-first-legacy-stores` fails new runtime source that pairs
|
|
legacy store names with write-style filesystem APIs. Tests and migration,
|
|
doctor, import, and explicit export code remain allowed. The guard now also
|
|
covers runtime `cache/*.json` stores, generic `thread-bindings.json`
|
|
sidecars, cron state/run-log JSON, config health JSON, restart and lock
|
|
sidecars, Voice Wake settings, plugin binding approvals, installed plugin
|
|
index JSON, File Transfer audit JSONL, and Memory Wiki activity logs.
|
|
|
|
## Target Schema Shape
|
|
|
|
Keep schemas explicit. Use typed tables for hot paths and `kv` only for low-risk
|
|
configuration-shaped state.
|
|
|
|
Global database:
|
|
|
|
```text
|
|
kv(scope, key, value_json, updated_at)
|
|
agents(agent_id, config_fingerprint, created_at, updated_at, agent_db_path)
|
|
agent_databases(agent_id, path, schema_version, last_seen_at, size_bytes)
|
|
task_runs(...)
|
|
task_delivery_state(...)
|
|
flow_runs(...)
|
|
subagent_runs(run_id, child_session_key, requester_session_key, controller_session_key, created_at, ended_at, cleanup_handled, payload_json)
|
|
current_conversation_bindings(binding_key, binding_id, channel, account_id, conversation_id, target_session_key, status, bound_at, expires_at, record_json)
|
|
tui_last_sessions(scope_key, session_key, updated_at)
|
|
plugin_state_entries(plugin_id, namespace, entry_key, value_json, created_at, expires_at)
|
|
plugin_blob_entries(plugin_id, namespace, entry_key, metadata_json, blob, created_at, expires_at)
|
|
media_blobs(subdir, id, content_type, size_bytes, blob, created_at, updated_at)
|
|
sandbox_registry_entries(registry_kind, container_name, entry_json, updated_at)
|
|
cron_run_logs(...)
|
|
commitments(id, agent_id, session_key, channel, status, due_earliest_ms, due_latest_ms, updated_at_ms, record_json)
|
|
migration_runs(id, started_at, finished_at, status, report_json)
|
|
migration_sources(source_key, migration_kind, source_path, target_table, source_sha256, source_size_bytes, source_record_count, last_run_id, status, imported_at, removed_source, report_json)
|
|
backup_runs(id, created_at, archive_path, status, manifest_json)
|
|
```
|
|
|
|
Agent database:
|
|
|
|
```text
|
|
kv(scope, key, value_json, updated_at)
|
|
session_entries(session_key, entry_json, updated_at)
|
|
transcript_events(session_id, seq, event_json, created_at)
|
|
transcript_event_identities(session_id, event_id, seq, event_type, has_parent, parent_id, message_idempotency_key, created_at)
|
|
transcript_snapshots(session_id, snapshot_id, reason, event_count, created_at, metadata_json)
|
|
vfs_entries(namespace, path, kind, content_blob, metadata_json, updated_at)
|
|
tool_artifacts(run_id, artifact_id, kind, metadata_json, blob, created_at)
|
|
run_artifacts(run_id, path, kind, metadata_json, blob, created_at)
|
|
cache_entries(scope, key, value_json, blob, expires_at, updated_at)
|
|
```
|
|
|
|
Future search can add FTS tables without changing the canonical event tables:
|
|
|
|
```text
|
|
transcript_events_fts(session_id, seq, text)
|
|
vfs_entries_fts(namespace, path, text)
|
|
```
|
|
|
|
Large values should use `blob` columns, not JSON string encoding. Keep
|
|
`value_json` for small structured data that must remain inspectable with plain
|
|
SQLite tooling.
|
|
|
|
## Doctor Migration Shape
|
|
|
|
Doctor should call one explicit migration step that is reportable and safe to
|
|
rerun:
|
|
|
|
```bash
|
|
openclaw doctor --fix
|
|
```
|
|
|
|
`openclaw doctor --fix` invokes the state migration implementation after
|
|
ordinary config preflight and creates a verified backup before import. Runtime
|
|
startup and `openclaw migrate` must not import legacy OpenClaw state files.
|
|
|
|
Migration properties:
|
|
|
|
- One migration pass discovers all legacy file sources and produces a plan
|
|
before mutating anything.
|
|
- Doctor creates a verified pre-migration backup archive before importing
|
|
legacy files.
|
|
- Imports are idempotent and keyed by source path, mtime, size, hash, and target
|
|
table.
|
|
- Successful source files are removed or archived after the target database has
|
|
committed.
|
|
- Failed imports leave the source untouched and record a warning in
|
|
`migration_runs`.
|
|
- Runtime code reads SQLite only after the migration exists.
|
|
- No downgrade/export-to-runtime-files path is required.
|
|
|
|
## Migration Inventory
|
|
|
|
Move these into the global database:
|
|
|
|
- Task registry runtime writes now use the shared database; the unshipped
|
|
`tasks/runs.sqlite` sidecar importer is deleted.
|
|
- Task Flow runtime writes now use the shared database; the unshipped
|
|
`tasks/flows/registry.sqlite` sidecar importer is deleted.
|
|
- Plugin state runtime writes now use the shared database; the unshipped
|
|
`plugin-state/state.sqlite` sidecar importer is deleted.
|
|
- Builtin memory search no longer defaults to `memory/<agentId>.sqlite`; its
|
|
index tables live in the owning agent database, and the explicit
|
|
`memorySearch.store.path` sidecar opt-in has been retired to doctor config
|
|
migration.
|
|
- Sandbox container/browser registries from monolithic and sharded JSON. Runtime
|
|
writes now use the shared database; legacy JSON import remains.
|
|
- Cron job definitions, schedule state, and run history now use shared SQLite;
|
|
doctor imports/removes legacy `jobs.json`, `jobs-state.json`, and
|
|
`cron/runs/*.jsonl` files
|
|
- Device identity/auth/bootstrap, pairing, push, update check, commitments,
|
|
OpenRouter model cache, installed plugin index, and app-server bindings
|
|
- Device-pair notification subscribers and delivered-request markers now use the
|
|
shared SQLite plugin-state table instead of `device-pair-notify.json`.
|
|
- Voice-call call records now use the shared SQLite plugin-state table under the
|
|
`voice-call` / `calls` namespace instead of `calls.jsonl`; the plugin CLI
|
|
tails and summarizes SQLite-backed call history.
|
|
- QQBot gateway sessions, known-user records, and ref-index quote cache now use
|
|
SQLite plugin state under `qqbot` namespaces (`sessions`, `known-users`,
|
|
`ref-index`) instead of `session-*.json`, `known-users.json`, and
|
|
`ref-index.jsonl`; the QQBot doctor/setup migration imports and removes the
|
|
legacy files.
|
|
- Discord model-picker preferences, command-deploy hashes, and thread bindings
|
|
now use SQLite plugin state under `discord` namespaces
|
|
(`model-picker-preferences`, `command-deploy-hashes`, `thread-bindings`)
|
|
instead of `model-picker-preferences.json`, `command-deploy-cache.json`, and
|
|
`thread-bindings.json`; the Discord doctor/setup migration imports and
|
|
removes the legacy files.
|
|
- BlueBubbles catchup cursors and inbound dedupe markers now use SQLite plugin
|
|
state under `bluebubbles` namespaces (`catchup-cursors`, `inbound-dedupe`)
|
|
instead of `bluebubbles/catchup/*.json` and
|
|
`bluebubbles/inbound-dedupe/*.json`; the BlueBubbles doctor/setup migration
|
|
imports and removes the legacy files.
|
|
- Telegram update offsets, sticker cache entries, reply-chain message cache
|
|
entries, sent-message cache entries, topic-name cache entries, and thread
|
|
bindings now use SQLite plugin state under `telegram` namespaces
|
|
(`update-offsets`, `sticker-cache`, `message-cache`, `sent-messages`,
|
|
`topic-names`, `thread-bindings`) instead of `update-offset-*.json`,
|
|
`sticker-cache.json`, `*.telegram-messages.json`,
|
|
`*.telegram-sent-messages.json`, `*.telegram-topic-names.json`, and
|
|
`thread-bindings-*.json`; the Telegram doctor/setup migration imports and
|
|
removes the legacy files.
|
|
- iMessage reply short-id mappings and sent-echo dedupe rows now use SQLite
|
|
plugin state under `imessage` namespaces (`reply-cache`, `sent-echoes`)
|
|
instead of `imessage/reply-cache.jsonl` and
|
|
`imessage/sent-echoes.jsonl`; the iMessage doctor/setup migration imports
|
|
and removes the legacy files.
|
|
- Microsoft Teams conversations, polls, delegated tokens, pending uploads, and
|
|
feedback learnings now use SQLite plugin state/blob namespaces
|
|
(`conversations`, `polls`, `delegated-tokens`, `pending-uploads`,
|
|
`feedback-learnings`) instead of `msteams-conversations.json`,
|
|
`msteams-polls.json`, `msteams-delegated.json`,
|
|
`msteams-pending-uploads.json`, and `*.learnings.json`; the Microsoft Teams
|
|
doctor/setup migration imports and removes the legacy files.
|
|
- Matrix sync cache, storage metadata, thread bindings, inbound dedupe markers,
|
|
startup verification cooldown state, and SDK IndexedDB crypto snapshots now
|
|
use SQLite plugin state/blob namespaces under `matrix` (`sync-store`,
|
|
`storage-meta`, `thread-bindings`, `inbound-dedupe`, `startup-verification`,
|
|
`idb-snapshots`) instead of `bot-storage.json`, `storage-meta.json`,
|
|
`thread-bindings.json`, `inbound-dedupe.json`, `startup-verification.json`,
|
|
and `crypto-idb-snapshot.json`; the Matrix doctor/setup migration imports and
|
|
removes those legacy files from account-scoped Matrix storage roots.
|
|
- Nostr bus cursors and profile publish state now use SQLite plugin state under
|
|
`nostr` namespaces (`bus-state`, `profile-state`) instead of
|
|
`bus-state-*.json` and `profile-state-*.json`; the Nostr doctor/setup
|
|
migration imports and removes the legacy files.
|
|
- Active Memory session toggles now use SQLite plugin state under
|
|
`active-memory/session-toggles` instead of `session-toggles.json`.
|
|
- Skill Workshop proposal queues and review counters now use SQLite plugin state
|
|
under `skill-workshop/proposals` and `skill-workshop/reviews` instead of
|
|
per-workspace `skill-workshop/<workspace>.json` files.
|
|
- Outbound delivery and session delivery queues now share the global SQLite
|
|
`delivery_queue_entries` table under separate queue names
|
|
(`outbound-delivery`, `session-delivery`) instead of durable
|
|
`delivery-queue/*.json`, `delivery-queue/failed/*.json`, and
|
|
`session-delivery-queue/*.json` files. The doctor legacy-state step imports
|
|
pending and failed rows, removes stale delivered markers, and deletes the old
|
|
JSON files after import.
|
|
- ACPX process leases now use SQLite plugin state under `acpx/process-leases`
|
|
instead of `process-leases.json`.
|
|
- Backup and migration run metadata
|
|
|
|
Move these into agent databases:
|
|
|
|
- Agent session entries. Done for runtime writes.
|
|
- Agent transcript events. Done for runtime writes.
|
|
- Compaction checkpoints and transcript snapshots. Done for runtime writes:
|
|
checkpoint transcript copies are SQLite transcript rows and checkpoint
|
|
metadata is recorded in `transcript_snapshots`. Gateway checkpoint helpers
|
|
now name these values as transcript snapshots rather than source files.
|
|
- Agent VFS scratch/workspace namespaces. Done for runtime VFS writes.
|
|
- Tool artifacts. Done for runtime writes.
|
|
- Run artifacts. Done for worker runtime writes through the per-agent
|
|
`run_artifacts` table.
|
|
- Agent-local runtime caches. Done for worker runtime scoped cache writes through
|
|
the per-agent `cache_entries` table. Gateway-wide model caches stay in the
|
|
global database unless they become agent-specific.
|
|
- ACP parent stream logs. Done for runtime writes.
|
|
- ACP replay ledger sessions. Done for runtime writes via
|
|
`acp_replay_sessions` and `acp_replay_events`; legacy `acp/event-ledger.json`
|
|
remains only as doctor input.
|
|
- Trajectory sidecars when they are not explicit export files. Done for runtime
|
|
writes: trajectory capture writes agent-database `trajectory_runtime_events`
|
|
rows and mirrors run-scoped artifacts into SQLite. Legacy sidecars remain
|
|
readable only as export/migration compatibility input. Runtime trajectory
|
|
capture exposes SQLite scope; JSONL path helpers are isolated to legacy
|
|
export/debug support and are not re-exported from the runtime module.
|
|
Embedded-runner trajectory metadata records `{agentId, sessionId, sessionKey}`
|
|
identity instead of persisting a transcript locator.
|
|
|
|
Keep these file-backed for now:
|
|
|
|
- `openclaw.json`
|
|
- `auth-profiles.json`
|
|
- provider or CLI credential files
|
|
- plugin/package manifests
|
|
- user workspaces and Git repositories when disk mode is selected
|
|
- logs intended for operator tailing, unless a specific log surface is moved
|
|
|
|
## Migration Plan
|
|
|
|
### Phase 0: Freeze The Boundary
|
|
|
|
Make the durable-state boundary explicit before moving more rows:
|
|
|
|
- Add a `migration_runs` table to the global database.
|
|
Done for legacy-state migration execution reports.
|
|
- Add a single doctor-owned state migration service for file-to-database import.
|
|
Done: `openclaw doctor --fix` uses the legacy-state migration implementation.
|
|
- Make `plan` read-only and make `apply` create a backup, import, verify, and
|
|
then delete or quarantine old files.
|
|
Done: doctor creates a verified pre-migration backup, passes the backup path
|
|
into `migration_runs`, and reuses the importer/removal paths.
|
|
- Add static bans so new runtime code cannot write legacy state files while
|
|
migration code and tests can still seed/read them.
|
|
Done for the currently migrated legacy stores.
|
|
|
|
### Phase 1: Finish The Global Control Plane
|
|
|
|
Keep shared coordination state in `state/openclaw.sqlite`:
|
|
|
|
- Agents and agent database registry
|
|
- Task and Task Flow ledgers
|
|
- Plugin state
|
|
- Sandbox container/browser registry
|
|
- Cron/scheduler run history
|
|
- Pairing, device, push, update-check, TUI, OpenRouter/model caches, and other
|
|
small gateway-scoped runtime state
|
|
- Backup and migration metadata
|
|
- Gateway media attachment bytes. Done for runtime writes; direct file paths
|
|
are temp materializations for compatibility with channel senders and sandbox
|
|
staging. Doctor imports legacy media files into `media_blobs` and removes the
|
|
source files after successful row writes.
|
|
- Debug proxy capture sessions, events, and payload blobs. Done for the default
|
|
capture path: explicit `OPENCLAW_DEBUG_PROXY_DB_PATH` remains a one-off
|
|
diagnostics escape hatch, but normal captures live in the shared state DB and
|
|
use the same WAL/busy-timeout settings.
|
|
|
|
This phase also deletes duplicate sidecar openers, permission helpers, WAL
|
|
setup, filesystem pruning, and compatibility writers from those subsystems.
|
|
|
|
### Phase 2: Introduce Per-Agent Databases
|
|
|
|
Create one database per agent and register it from the global DB:
|
|
|
|
```text
|
|
~/.openclaw/state/openclaw.sqlite
|
|
~/.openclaw/agents/<agentId>/agent/openclaw-agent.sqlite
|
|
```
|
|
|
|
The global `agents` or `agent_databases` row stores the path, schema version,
|
|
last-seen timestamp, and basic size/integrity metadata. Runtime code asks the
|
|
registry for the agent DB instead of deriving file paths directly.
|
|
|
|
The agent DB owns:
|
|
|
|
- `session_entries`
|
|
- `transcript_events`
|
|
- transcript snapshots and compaction checkpoints. Done for runtime writes.
|
|
- `vfs_entries`
|
|
- `tool_artifacts` and run artifacts
|
|
- agent-local runtime/cache rows. Done for worker scoped caches.
|
|
- ACP parent stream events
|
|
- trajectory runtime events when they are not explicit export artifacts
|
|
|
|
### Phase 3: Replace Session Store APIs
|
|
|
|
Delete the file-shaped session store surface:
|
|
|
|
- Replace `loadSessionStore(storePath)` runtime usage with agent/session row
|
|
APIs.
|
|
- Replace `saveSessionStore` and `updateSessionStore` whole-object rewrites
|
|
with row operations:
|
|
- `getSessionEntry(agentId, sessionKey)`
|
|
- `upsertSessionEntry(agentId, sessionKey, patch | entry)`
|
|
- `deleteSessionEntry(agentId, sessionKey)`
|
|
- `listSessionEntries(agentId, filters)`
|
|
- SQL cleanup for missing transcript references
|
|
- Delete `store-writer.ts` and queue tests. Done.
|
|
- Keep `sessions.json` parsing only in the migration service and doctor tests.
|
|
- Runtime lifecycle fallback reads the SQLite transcript header, not the old
|
|
JSONL first line.
|
|
|
|
This is the pass that removes most remaining session-management garbage:
|
|
file-lock parameters, pruning/truncation vocabulary, store path identity, and
|
|
tests that prove JSON persistence.
|
|
|
|
### Phase 4: Move Transcripts, ACP Streams, Trajectories, And VFS
|
|
|
|
Make every agent data stream database-native:
|
|
|
|
- Transcript append writes go through one SQLite transaction that ensures the
|
|
session header, checks message idempotency, selects the parent tail, inserts
|
|
into `transcript_events`, and records queryable identity metadata in
|
|
`transcript_event_identities`.
|
|
- ACP parent stream logs become rows, not `.acp-stream.jsonl` files. Done.
|
|
- ACP spawn setup no longer persists transcript JSONL paths. Done.
|
|
- Runtime trajectory capture writes event rows/artifacts directly. The explicit
|
|
support/export command can still produce JSONL bundles as an export format.
|
|
Done.
|
|
- Disk workspaces stay on disk when configured as disk mode.
|
|
- VFS scratch and experimental VFS-only workspace mode use the agent DB.
|
|
|
|
The migration imports old JSONL files once, records counts/hashes in
|
|
`migration_runs`, and removes imported files after integrity checks.
|
|
|
|
### Phase 5: Backup, Restore, Vacuum, And Verify
|
|
|
|
Backups remain one archive file:
|
|
|
|
- Checkpoint every global and agent database.
|
|
- Snapshot each DB with SQLite backup semantics or `VACUUM INTO`.
|
|
- Archive compact DB snapshots, config, credentials/auth profile files, and
|
|
requested workspace exports.
|
|
- Omit raw live `*.sqlite-wal` and `*.sqlite-shm` files.
|
|
- Verify by opening every DB snapshot and running `PRAGMA integrity_check`.
|
|
- Restore copies snapshots back to their target paths. This branch resets the
|
|
unshipped SQLite layout to `user_version = 1`; future shipped schema changes
|
|
can add explicit migrations when they are needed.
|
|
|
|
### Phase 6: Worker Runtime
|
|
|
|
Keep worker mode experimental while the database split lands:
|
|
|
|
- Workers receive agent id, run id, filesystem mode, and DB registry identity.
|
|
- Each worker opens its own SQLite connection.
|
|
- Parent keeps channel delivery, approvals, config, and cancellation authority.
|
|
- Start with one worker per active run; add pooling only after lifecycle and DB
|
|
connection ownership are stable.
|
|
|
|
### Phase 7: Delete The Old World
|
|
|
|
After the migration path and row APIs land:
|
|
|
|
- Remove runtime `sessions.json`, transcript JSONL, sandbox registry JSON,
|
|
task sidecar SQLite, and plugin-state sidecar SQLite writes.
|
|
- Remove JSON/session pruning and truncation code.
|
|
- Remove file locks and lock-shaped tests.
|
|
- Remove runtime compatibility exports that only exist to keep old session
|
|
files current.
|
|
- Keep explicit support exports as user-requested archive formats only.
|
|
|
|
## Backup And Restore
|
|
|
|
Backups should be one archive file, but database capture should be
|
|
SQLite-native:
|
|
|
|
1. Stop long-running write activity or enter a short backup barrier.
|
|
2. For every global and agent database, run a checkpoint.
|
|
3. Snapshot each database using SQLite backup semantics or `VACUUM INTO` into a
|
|
temporary backup directory.
|
|
4. Archive the compacted database snapshots, config file, credentials directory,
|
|
selected workspaces, and a manifest.
|
|
5. Verify the archive by opening every included SQLite snapshot and running
|
|
`PRAGMA integrity_check`.
|
|
|
|
Do not rely on raw live `*.sqlite`, `*.sqlite-wal`, and `*.sqlite-shm` copies as
|
|
the primary backup format. The archive manifest should record database role,
|
|
agent id, schema version, source path, snapshot path, byte size, and integrity
|
|
status.
|
|
|
|
Restore should rebuild the global database and agent database files from the
|
|
archive snapshots. Because the SQLite layout has not shipped yet, this refactor
|
|
keeps only the version-1 schema plus doctor file-to-database import.
|
|
|
|
## Runtime Refactor Plan
|
|
|
|
1. Add database registry APIs.
|
|
- Resolve global DB and per-agent DB paths.
|
|
- Keep the unshipped schemas at `user_version = 1`; do not add schema
|
|
migration runner code until a shipped schema needs it.
|
|
- Add close/checkpoint/integrity helpers used by tests, backup, and doctor.
|
|
|
|
2. Collapse sidecar SQLite stores.
|
|
- Move plugin state tables into the global database. Done for runtime
|
|
writes; the unshipped legacy sidecar importer is deleted.
|
|
- Move task registry tables into the global database. Done for runtime
|
|
writes; the unshipped legacy sidecar importer is deleted.
|
|
- Move Task Flow tables into the global database. Done for runtime writes;
|
|
the unshipped legacy sidecar importer is deleted.
|
|
- Move builtin memory-search tables into each agent database. Done; explicit
|
|
custom `memorySearch.store.path` is now removed by doctor config migration.
|
|
- Delete duplicate database openers, WAL setup, permission helpers, and
|
|
close paths from those subsystems.
|
|
|
|
3. Move agent-owned tables into per-agent databases.
|
|
- Create agent DB on demand through the global database registry. Done.
|
|
- Move runtime session entries, transcript events, VFS rows, and tool
|
|
artifacts to agent DBs. Done.
|
|
- Do not migrate branch-local shared-DB session entries, transcript events,
|
|
VFS rows, or tool artifacts; that layout never shipped. Keep only legacy
|
|
file-to-database import in doctor.
|
|
|
|
4. Replace session store APIs.
|
|
- Remove `storePath` as the runtime identity. Done for the shared inbound
|
|
channel turn/session-record pipeline and mostly done for hot paths:
|
|
session metadata, route updates, command persistence, CLI session cleanup,
|
|
Feishu reasoning previews, transcript-state persistence, subagent
|
|
depth, auth profile session overrides, parent-fork logic, and QA-lab
|
|
inspection now resolve the database from canonical agent/session keys.
|
|
Gateway/TUI/UI/macOS session-list responses now expose `databasePath`
|
|
instead of legacy `path`; macOS debug surfaces show the per-agent database
|
|
as read-only state instead of writing `session.store` config.
|
|
`/status` and chat-driven trajectory export no longer propagate legacy
|
|
store paths; transcript usage fallback reads SQLite by agent/session
|
|
identity. Remaining `storePath` call surfaces are migration/debug metadata
|
|
and older RPC response fields that still carry
|
|
SQLite keys for compatibility.
|
|
Gateway combined-session loading no longer has a special runtime branch for
|
|
non-templated `session.store` values; it aggregates per-agent SQLite rows.
|
|
The legacy session-lock doctor lane and its `.jsonl.lock` cleanup helper
|
|
were removed; SQLite is the session concurrency boundary now.
|
|
Hot runtime call sites use row-oriented helper names such as
|
|
`resolveSessionRowEntry`; the old `resolveSessionStoreEntry` compatibility
|
|
alias has been removed from runtime and plugin SDK exports.
|
|
|
|
- Use `{ agentId, sessionKey }` row operations.
|
|
Done: `getSessionEntry`, `upsertSessionEntry`, `deleteSessionEntry`,
|
|
`patchSessionEntry`, and `listSessionEntries` are SQLite-first APIs that do
|
|
not require a session store path. Status summary, local agent status, health,
|
|
and the `openclaw sessions` listing command now read per-agent rows directly
|
|
and display per-agent SQLite database paths instead of `sessions.json` paths.
|
|
- Replace whole-store delete/insert with `upsertSessionEntry`,
|
|
`deleteSessionEntry`, `listSessionEntries`, and SQL cleanup queries.
|
|
Done for runtime: hot paths now use row APIs and conflict-retried row patches;
|
|
remaining whole-store import/replace helpers are limited to migration import
|
|
code and SQLite backend tests.
|
|
- Delete `store-writer.ts` and writer-queue tests. Done.
|
|
- Delete runtime legacy-key pruning and alias-delete parameters from session
|
|
row upserts/patches. Done.
|
|
|
|
5. Delete runtime JSON registry behavior.
|
|
- Make sandbox registry reads and writes SQLite-only. Done.
|
|
- Import monolithic and sharded JSON only from the migration step. Done.
|
|
- Remove sharded registry locks and JSON writes. Done.
|
|
- Keep one typed registry table instead of storing registry rows as generic
|
|
`kv` if the shape remains hot-path operational state. Done.
|
|
|
|
6. Delete file-lock-shaped session mutation.
|
|
- Done for runtime lock creation and runtime lock APIs.
|
|
- The standalone legacy `.jsonl.lock` doctor cleanup lane is removed.
|
|
- `session.writeLock` is doctor-migrated legacy config, not a typed runtime
|
|
setting.
|
|
- Generic plugin SDK dedupe persistence no longer uses file locks or JSON
|
|
files; it writes shared SQLite plugin-state rows. Done.
|
|
- QMD embed coordination uses a SQLite state lease instead of
|
|
`qmd/embed.lock`. Done.
|
|
|
|
7. Make workers database-aware.
|
|
- Workers open their own SQLite connections.
|
|
- Parent owns delivery, channel callbacks, and config.
|
|
- Worker receives agent id, run id, filesystem mode, and DB registry
|
|
identity, not live handles.
|
|
- `vfs-only` stays experimental and uses the agent database as its storage
|
|
root.
|
|
- Keep one worker per active run first. Pooling can wait until DB connection
|
|
lifetime and cancellation behavior are boring.
|
|
|
|
8. Backup integration.
|
|
- Teach backup to snapshot global and agent databases via SQLite backup or
|
|
`VACUUM INTO`. Done for discovered `*.sqlite` files under the state asset.
|
|
- Add backup verification for SQLite integrity and schema version. Done for
|
|
backup creation and archive verification integrity checks.
|
|
- Record backup run metadata in SQLite. Done via the shared `backup_runs`
|
|
table with archive path, status, and manifest JSON.
|
|
- Include VFS/workspace export only when requested; do not export session
|
|
internals as JSON.
|
|
|
|
9. Delete obsolete tests and code.
|
|
|
|
- Remove tests that assert runtime creation of `sessions.json` or transcript
|
|
JSONL files. Done for core session store, chat, gateway transcript events,
|
|
preview, lifecycle, command session-entry updates, auto-reply reset/trace, and
|
|
memory-core dreaming fixtures, approval target routing, session transcript
|
|
repair, security permission repair, trajectory export, and session export.
|
|
Active-memory transcript tests now assert SQLite scopes and no temporary or
|
|
persisted JSONL file creation.
|
|
The old heartbeat transcript-pruning regression was removed because
|
|
runtime no longer truncates JSONL transcripts.
|
|
Agent session-list tool tests no longer model legacy `sessions.json` paths
|
|
as the gateway response shape; app/UI/macOS tests use `databasePath`.
|
|
`/status` transcript-usage tests now seed SQLite transcript rows directly
|
|
instead of writing JSONL files.
|
|
Context-engine trajectory capture tests now read `trajectory_runtime_events`
|
|
rows from an isolated agent database instead of reading
|
|
`session.trajectory.jsonl`.
|
|
Docker MCP channel seed scripts now seed SQLite rows directly. Direct
|
|
`sessions.json` writes are limited to doctor fixtures.
|
|
Memory-core host events and session-corpus scratch rows now live in shared
|
|
SQLite plugin-state; `events.jsonl` and `session-corpus/*.txt` are legacy
|
|
doctor migration inputs only.
|
|
The runtime SQLite session backend test suite no longer fabricates a
|
|
`sessions.json`; legacy source fixtures now live in the doctor
|
|
tests that import them.
|
|
- Keep tests that seed legacy files only for migration.
|
|
- Replace JSON-file proof with SQL row proof.
|
|
|
|
- Add static bans for runtime writes to legacy session/cache JSON paths.
|
|
Done for the repo guard.
|
|
|
|
10. Make the migration report auditable.
|
|
- Record migration runs in SQLite with started/finished timestamps, source
|
|
paths, source hashes, counts, warnings, and backup path.
|
|
Done: legacy-state migration executions now persist a `migration_runs`
|
|
report with source path/table inventory, source file SHA-256, sizes,
|
|
record counts, warnings, and backup path.
|
|
Done: legacy-state migration executions also persist `migration_sources`
|
|
rows for source-level audit and future skip/backfill decisions.
|
|
- Make apply idempotent. Re-running after a partial import should either
|
|
skip an already imported source or merge by stable key.
|
|
Done: session indexes, transcripts, delivery queues, plugin state, task
|
|
ledgers, and agent-owned global SQLite rows import through stable keys or
|
|
upsert/replace semantics, so reruns merge without duplicating durable
|
|
rows.
|
|
- Failed imports must keep the original source file in place.
|
|
Done: failed transcript imports now leave the original JSONL source at
|
|
its detected path, and `migration_sources` records the source as
|
|
`warning` with `removed_source=0` for the next doctor run.
|
|
|
|
## Performance Rules
|
|
|
|
- One connection per thread/process is fine; do not share handles across
|
|
workers.
|
|
- Use WAL, `foreign_keys=ON`, a 30s busy timeout, and short `BEGIN IMMEDIATE`
|
|
write transactions.
|
|
- Keep write transaction helpers synchronous unless/until an async transaction
|
|
API adds explicit mutex/backpressure semantics.
|
|
- Keep parent delivery writes small and transactional.
|
|
- Avoid whole-store rewrites; use row-level upsert/delete.
|
|
- Add indexes for list-by-agent, list-by-session, updated-at, run id, and
|
|
expiration paths before moving hot code.
|
|
- Store large artifacts as BLOBs or chunked BLOB rows, not base64 JSON.
|
|
- Keep `kv` entries small and scoped.
|
|
- Add SQL cleanup for TTL/expiration instead of filesystem pruning.
|
|
Done for database-owned runtime stores: media, plugin state, plugin blobs,
|
|
persistent dedupe, and agent cache all expire through SQLite rows. Remaining
|
|
filesystem cleanup is limited to temporary materializations or explicit
|
|
removal commands.
|
|
|
|
## Static Bans
|
|
|
|
Add a repo check that fails new runtime writes to legacy state paths:
|
|
|
|
- `sessions.json`
|
|
- `*.trajectory.jsonl` except explicit export/debug paths
|
|
- `.acp-stream.jsonl`
|
|
- `acp/event-ledger.json`
|
|
- `cache/*.json` runtime cache files
|
|
- `cron/runs/*.jsonl`
|
|
- `jobs-state.json`
|
|
- `device-pair-notify.json`
|
|
- `devices/pending.json`
|
|
- `devices/paired.json`
|
|
- `devices/bootstrap.json`
|
|
- `nodes/pending.json`
|
|
- `nodes/paired.json`
|
|
- `identity/device.json`
|
|
- `identity/device-auth.json`
|
|
- `push/web-push-subscriptions.json`
|
|
- `push/vapid-keys.json`
|
|
- `push/apns-registrations.json`
|
|
- `session-toggles.json`
|
|
- Memory-core `.dreams/events.jsonl`
|
|
- Memory-core `.dreams/session-corpus/`
|
|
- Memory-core `.dreams/daily-ingestion.json`
|
|
- Memory-core `.dreams/session-ingestion.json`
|
|
- Memory-core `.dreams/short-term-recall.json`
|
|
- Memory-core `.dreams/phase-signals.json`
|
|
- Memory-core `.dreams/short-term-promotion.lock`
|
|
- Skill Workshop `skill-workshop/<workspace>.json`
|
|
- Skill Workshop `skill-workshop/skill-workshop-review-*.json`
|
|
- Nostr `bus-state-*.json`
|
|
- Nostr `profile-state-*.json`
|
|
- `calls.jsonl`
|
|
- `known-users.json`
|
|
- `ref-index.jsonl`
|
|
- QQBot `session-*.json`
|
|
- BlueBubbles `bluebubbles/catchup/*.json`
|
|
- BlueBubbles `bluebubbles/inbound-dedupe/*.json`
|
|
- Telegram `update-offset-*.json`
|
|
- Telegram `sticker-cache.json`
|
|
- Telegram `*.telegram-messages.json`
|
|
- Telegram `*.telegram-sent-messages.json`
|
|
- Telegram `*.telegram-topic-names.json`
|
|
- Telegram `thread-bindings-*.json`
|
|
- iMessage `reply-cache.jsonl`
|
|
- iMessage `sent-echoes.jsonl`
|
|
- Microsoft Teams `msteams-conversations.json`
|
|
- Microsoft Teams `msteams-polls.json`
|
|
- Microsoft Teams `msteams-delegated.json`
|
|
- Microsoft Teams `msteams-pending-uploads.json`
|
|
- Microsoft Teams `*.learnings.json`
|
|
- Matrix `thread-bindings.json`
|
|
- Matrix `inbound-dedupe.json`
|
|
- Matrix `startup-verification.json`
|
|
- Matrix `storage-meta.json`
|
|
- Matrix `crypto-idb-snapshot.json`
|
|
- sandbox registry shard JSON files
|
|
- `plugin-state/state.sqlite`
|
|
- `tasks/runs.sqlite`
|
|
- `tasks/flows/registry.sqlite`
|
|
- `bindings/current-conversations.json`
|
|
- `restart-sentinel.json`
|
|
- `gateway.<hash>.lock`
|
|
- `qmd/embed.lock`
|
|
- `settings/voicewake.json`
|
|
- `settings/voicewake-routing.json`
|
|
- `plugin-binding-approvals.json`
|
|
- `plugins/installs.json`
|
|
- `audit/file-transfer.jsonl`
|
|
- `audit/crestodian.jsonl`
|
|
- `crestodian/rescue-pending/*.json`
|
|
- `plugins/phone-control/armed.json`
|
|
- Memory Wiki `.openclaw-wiki/log.jsonl`
|
|
- Memory Wiki `.openclaw-wiki/state.json`
|
|
- Memory Wiki `.openclaw-wiki/locks/`
|
|
- Memory Wiki `.openclaw-wiki/source-sync.json`
|
|
- Memory Wiki `.openclaw-wiki/import-runs/*.json`
|
|
- Memory Wiki `.openclaw-wiki/cache/agent-digest.json`
|
|
- Memory Wiki `.openclaw-wiki/cache/claims.jsonl`
|
|
- ClawHub `.clawhub/lock.json`
|
|
- ClawHub `.clawhub/origin.json`
|
|
- Browser profile decoration `.openclaw-profile-decorated`
|
|
|
|
The ban should allow tests to create legacy fixtures and allow migration code to
|
|
read/import/remove legacy file sources. Unshipped SQLite sidecars stay banned
|
|
and do not get doctor import allowances.
|
|
|
|
## Done Criteria
|
|
|
|
- Runtime data and cache writes go to the global or agent SQLite database.
|
|
- Runtime no longer writes session indexes, transcript JSONL, sandbox registry
|
|
JSON, task sidecar SQLite, or plugin-state sidecar SQLite. The unshipped task
|
|
and plugin-state sidecar SQLite importers are deleted.
|
|
- Legacy file import is doctor-only.
|
|
- Backup produces one archive with compact SQLite snapshots and integrity proof.
|
|
- Agent workers can run with disk, VFS scratch, or experimental VFS-only
|
|
storage.
|
|
- Config and explicit credential files remain the only expected persistent
|
|
non-database control files.
|
|
- Repo checks prevent reintroducing legacy runtime file stores.
|