* fix: prune stale session entries, cap entry count, and rotate sessions.json
The sessions.json file grows unbounded over time. Every heartbeat tick (default: 30m)
triggers multiple full rewrites, and session keys from groups, threads, and DMs
accumulate indefinitely with large embedded objects (skillsSnapshot,
systemPromptReport). At >50MB the synchronous JSON parse blocks the event loop,
causing Telegram webhook timeouts and effectively taking the bot down.
Three mitigations, all running inside saveSessionStoreUnlocked() on every write:
1. Prune stale entries: remove entries with updatedAt older than 30 days
(configurable via session.maintenance.pruneDays in openclaw.json)
2. Cap entry count: keep only the 500 most recently updated entries
(configurable via session.maintenance.maxEntries). Entries without updatedAt
are evicted first.
3. File rotation: if the existing sessions.json exceeds 10MB before a write,
rename it to sessions.json.bak.{timestamp} and keep only the 3 most recent
backups (configurable via session.maintenance.rotateBytes).
All three thresholds are configurable under session.maintenance in openclaw.json
with Zod validation. No env vars.
Existing tests updated to use Date.now() instead of epoch-relative timestamps
(1, 2, 3) that would be incorrectly pruned as stale.
27 new tests covering pruning, capping, rotation, and integration scenarios.
* feat: auto-prune expired cron run sessions (#12289)
Add TTL-based reaper for isolated cron run sessions that accumulate
indefinitely in sessions.json.
New config option:
cron.sessionRetention: string | false (default: '24h')
The reaper runs piggy-backed on the cron timer tick, self-throttled
to sweep at most every 5 minutes. It removes session entries matching
the pattern cron:<jobId>:run:<uuid> whose updatedAt + retention < now.
Design follows the Kubernetes ttlSecondsAfterFinished pattern:
- Sessions are persisted normally (observability/debugging)
- A periodic reaper prunes expired entries
- Configurable retention with sensible default
- Set to false to disable pruning entirely
Files changed:
- src/config/types.cron.ts: Add sessionRetention to CronConfig
- src/config/zod-schema.ts: Add Zod validation for sessionRetention
- src/cron/session-reaper.ts: New reaper module (sweepCronRunSessions)
- src/cron/session-reaper.test.ts: 12 tests covering all paths
- src/cron/service/state.ts: Add cronConfig/sessionStorePath to deps
- src/cron/service/timer.ts: Wire reaper into onTimer tick
- src/gateway/server-cron.ts: Pass config and session store path to deps
Closes#12289
* fix: sweep cron session stores per agent
* docs: add changelog for session maintenance (#13083) (thanks @skyfallsin, @Glucksberg)
* fix: add warn-only session maintenance mode
* fix: warn-only maintenance defaults to active session
* fix: deliver maintenance warnings to active session
* docs: add session maintenance examples
* fix: accept duration and size maintenance thresholds
* refactor: share cron run session key check
* fix: format issues and replace defaultRuntime.warn with console.warn
---------
Co-authored-by: Pradeep Elankumaran <pradeepe@gmail.com>
Co-authored-by: Glucksberg <markuscontasul@gmail.com>
Co-authored-by: max <40643627+quotentiroler@users.noreply.github.com>
Co-authored-by: quotentiroler <max.nussbaumer@maxhealth.tech>
* fix(tools): correct Grok response parsing for xAI Responses API
The xAI Responses API returns content in output[0].content[0].text,
not in output_text field. Updated GrokSearchResponse type and
runGrokSearch to extract content from the correct path.
Fixes the 'No response' issue when using Grok web search.
* fix(tools): harden Grok web_search parsing (#13049) (thanks @ereid7)
---------
Co-authored-by: erai <erai@erais-Mac-mini.local>
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
* fix: suggest /reset in context overflow error message
When the context window overflows, the error message now suggests
using /reset to clear session history, giving users an actionable
recovery path instead of a dead-end error.
Closes#12940
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: suggest /reset in context overflow error message (#12973) (thanks @RamiNoodle733)
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Rami Abdelrazzaq <RamiNoodle733@users.noreply.github.com>
* fix: cap Discord gateway reconnect attempts to prevent infinite loop
The Discord GatewayPlugin was configured with maxAttempts: Infinity,
which causes an unbounded reconnection loop when the Discord gateway
enters a persistent failure state (e.g. code 1005 with stalled HELLO).
In production, this manifested as 2,483+ reconnection attempts in a
single log file, starving the Node.js event loop and preventing cron,
heartbeat, and other subsystems from functioning.
Cap maxAttempts at 50, which provides ~25 minutes of retry time
(with 30s HELLO timeout between attempts) before cleanly exiting
via the existing "Max reconnect attempts" error handler.
Closes#11836
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Changelog: note Discord gateway reconnect cap (#12230) (thanks @Yida-Dev)
---------
Co-authored-by: Yida-Dev <reyifeijun@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Shadow <shadow@clawd.bot>
Problem:
When users execute `/think off`, they still receive `reasoning_content`
from models configured with `reasoning: true` (e.g., GLM-4.7, GLM-4.6,
Kimi K2.5, MiniMax-M2.1).
Expected: `/think off` should completely disable reasoning content.
Actual: Reasoning content is still returned.
Root Cause:
The directive handlers delete `sessionEntry.thinkingLevel` when user
executes `/think off`. This causes the thinking level to become undefined,
and the system falls back to `resolveThinkingDefault()`, which checks the
model catalog and returns "low" for reasoning-capable models, ignoring the
user's explicit intent.
Why We Must Persist "off" (Design Rationale):
1. **Model-dependent defaults**: Unlike other directives where "off" means
use a global default, `thinkingLevel` has model-dependent defaults:
- Reasoning-capable models (GLM-4.7, etc.) → default "low"
- Other models → default "off"
2. **Existing pattern**: The codebase already follows this pattern for
`elevatedLevel`, which persists "off" explicitly to override defaults
that may be "on". The comment explains:
"Persist 'off' explicitly so `/elevated off` actually overrides defaults."
3. **User intent**: When a user explicitly executes `/think off`, they want
to disable thinking regardless of the model's capabilities. Deleting the
field breaks this intent by falling back to the model's default.
Solution:
Persist "off" value instead of deleting the field in all internal directive handlers:
- `src/auto-reply/reply/directive-handling.impl.ts`: Directive-only messages
- `src/auto-reply/reply/directive-handling.persist.ts`: Inline directives
- `src/commands/agent.ts`: CLI command-line flags
Gateway API Backward Compatibility:
The original implementation incorrectly mapped `null` to "off" in
`sessions-patch.ts` for consistency with internal handlers. This was a
breaking change because:
- Previously, `null` cleared the override (deleted the field)
- API clients lost the ability to "clear to default" via `null`
- This contradicts standard JSON semantics where `null` means "no value"
Restored original null semantics in `src/gateway/sessions-patch.ts`:
- `null` → delete field, fall back to model default (clear override)
- `"off"` → persist explicit override
- Other values → normalize and persist
This ensures backward compatibility for API clients while fixing the `/think off`
issue in internal handlers.
Signed-off-by: Liu Yuan <namei.unix@gmail.com>
* Config: reload dotenv before env substitution on runtime loads
* Test: isolate config env var regression from host state env
* fix: keep dotenv vars resolvable on runtime config reloads (#12748) (thanks @rodrigouroz)
---------
Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
- Add description to docs.json for llms.txt blockquote summary
- Add title frontmatter to 10 docs files for llms.txt link text
- ci(docker): skip builds for docs-only changes