mirror of
https://github.com/moltbot/moltbot.git
synced 2026-03-20 16:15:15 +00:00
fix(cron): add retry policy for one-shot jobs on transient errors (#24355) (openclaw#24435) thanks @hugenshen
Verified: - pnpm install --frozen-lockfile - pnpm check - pnpm test -- --run src/cron/service.issue-regressions.test.ts src/config/config-misc.test.ts Co-authored-by: hugenshen <16300669+hugenshen@users.noreply.github.com> Co-authored-by: Tak Hoffman <781889+Takhoffman@users.noreply.github.com>
This commit is contained in:
@@ -353,6 +353,38 @@ Notes:
|
||||
- Isolated cron run sessions in `sessions.json` are pruned by `cron.sessionRetention` (default `24h`; set `false` to disable).
|
||||
- Override store path: `cron.store` in config.
|
||||
|
||||
## Retry policy
|
||||
|
||||
When a job fails, OpenClaw classifies errors as **transient** (retryable) or **permanent** (disable immediately).
|
||||
|
||||
### Transient errors (retried)
|
||||
|
||||
- Rate limit (429, too many requests, resource exhausted)
|
||||
- Network errors (timeout, ECONNRESET, fetch failed, socket)
|
||||
- Server errors (5xx)
|
||||
- Cloudflare-related errors
|
||||
|
||||
### Permanent errors (no retry)
|
||||
|
||||
- Auth failures (invalid API key, unauthorized)
|
||||
- Config or validation errors
|
||||
- Other non-transient errors
|
||||
|
||||
### Default behavior (no config)
|
||||
|
||||
**One-shot jobs (`schedule.kind: "at"`):**
|
||||
|
||||
- On transient error: retry up to 3 times with exponential backoff (30s → 1m → 5m).
|
||||
- On permanent error: disable immediately.
|
||||
- On success or skip: disable (or delete if `deleteAfterRun: true`).
|
||||
|
||||
**Recurring jobs (`cron` / `every`):**
|
||||
|
||||
- On any error: apply exponential backoff (30s → 1m → 5m → 15m → 60m) before the next scheduled run.
|
||||
- Job stays enabled; backoff resets after the next successful run.
|
||||
|
||||
Configure `cron.retry` to override these defaults (see [Configuration](/automation/cron-jobs#configuration)).
|
||||
|
||||
## Configuration
|
||||
|
||||
```json5
|
||||
@@ -361,6 +393,12 @@ Notes:
|
||||
enabled: true, // default true
|
||||
store: "~/.openclaw/cron/jobs.json",
|
||||
maxConcurrentRuns: 1, // default 1
|
||||
// Optional: override retry policy for one-shot jobs
|
||||
retry: {
|
||||
maxAttempts: 3,
|
||||
backoffMs: [60000, 120000, 300000],
|
||||
retryOn: ["rate_limit", "network", "server_error"],
|
||||
},
|
||||
webhook: "https://example.invalid/legacy", // deprecated fallback for stored notify:true jobs
|
||||
webhookToken: "replace-with-dedicated-webhook-token", // optional bearer token for webhook mode
|
||||
sessionRetention: "24h", // duration string or false
|
||||
@@ -617,7 +655,7 @@ openclaw system event --mode now --text "Next heartbeat: check battery."
|
||||
- OpenClaw applies exponential retry backoff for recurring jobs after consecutive errors:
|
||||
30s, 1m, 5m, 15m, then 60m between retries.
|
||||
- Backoff resets automatically after the next successful run.
|
||||
- One-shot (`at`) jobs disable after a terminal run (`ok`, `error`, or `skipped`) and do not retry.
|
||||
- One-shot (`at`) jobs retry transient errors (rate limit, network, server_error) up to 3 times with backoff; permanent errors disable immediately. See [Retry policy](/automation/cron-jobs#retry-policy).
|
||||
|
||||
### Telegram delivers to the wrong place
|
||||
|
||||
|
||||
Reference in New Issue
Block a user