From cf5702233c80c4f354b15b178e6b5976a3791acc Mon Sep 17 00:00:00 2001
From: Peter Steinberger
Date: Mon, 2 Mar 2026 18:15:43 +0000
Subject: [PATCH] docs(security)!: document messaging-only onboarding default and hook/model risk

---
 SECURITY.md                             | 2 ++
 docs/gateway/configuration-reference.md | 2 ++
 docs/gateway/configuration.md           | 5 +++++
 docs/gateway/security/index.md          | 5 +++++
 docs/reference/wizard.md                | 1 +
 docs/start/onboarding.md                | 2 ++
 docs/start/wizard-cli-reference.md      | 1 +
 docs/start/wizard.md                    | 2 ++
 8 files changed, 20 insertions(+)

diff --git a/SECURITY.md b/SECURITY.md
index 1dc51369f9a..8562a232ddb 100644
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -149,6 +149,8 @@ OpenClaw's security model is "personal assistant" (one trusted operator, potenti
 - The model/agent is **not** a trusted principal. Assume prompt/content injection can manipulate behavior.
 - Security boundaries come from host/config trust, auth, tool policy, sandboxing, and exec approvals.
 - Prompt injection by itself is not a vulnerability report unless it crosses one of those boundaries.
+- Hook/webhook-driven payloads should be treated as untrusted content; keep unsafe bypass flags disabled unless doing tightly scoped debugging (`hooks.gmail.allowUnsafeExternalContent`, `hooks.mappings[].allowUnsafeExternalContent`).
+- Weak model tiers are generally easier to prompt-inject. For tool-enabled or hook-driven agents, prefer strong modern model tiers and strict tool policy (for example `tools.profile: "messaging"` or stricter), plus sandboxing where possible.
 
 ## Gateway and Node trust concept
 
diff --git a/docs/gateway/configuration-reference.md b/docs/gateway/configuration-reference.md
index bdf6fbdb639..5f5750dfb5a 100644
--- a/docs/gateway/configuration-reference.md
+++ b/docs/gateway/configuration-reference.md
@@ -1587,6 +1587,8 @@ Defaults for Talk mode (macOS/iOS/Android).
 `tools.profile` sets a base allowlist before `tools.allow`/`tools.deny`:
 
+Local onboarding defaults new local configs to `tools.profile: "messaging"` when unset (existing explicit profiles are preserved).
+
 | Profile | Includes |
 | ----------- | ----------------------------------------------------------------------------------------- |
 | `minimal` | `session_status` only |
diff --git a/docs/gateway/configuration.md b/docs/gateway/configuration.md
index 16e1deb253d..d3bfe3ad60a 100644
--- a/docs/gateway/configuration.md
+++ b/docs/gateway/configuration.md
@@ -291,6 +291,11 @@ When validation fails:
  }
  ```
 
+ Security note:
+ - Treat all hook/webhook payload content as untrusted input.
+ - Keep unsafe-content bypass flags disabled (`hooks.gmail.allowUnsafeExternalContent`, `hooks.mappings[].allowUnsafeExternalContent`) unless doing tightly scoped debugging.
+ - For hook-driven agents, prefer strong modern model tiers and strict tool policy (for example messaging-only plus sandboxing where possible).
+
  See [full reference](/gateway/configuration-reference#hooks) for all mapping options and Gmail integration.
diff --git a/docs/gateway/security/index.md b/docs/gateway/security/index.md
index 46876959278..470cb7df08f 100644
--- a/docs/gateway/security/index.md
+++ b/docs/gateway/security/index.md
@@ -538,6 +538,11 @@ Guidance:
 - Only enable temporarily for tightly scoped debugging.
 - If enabled, isolate that agent (sandbox + minimal tools + dedicated session namespace).
 
+Hooks risk note:
+
+- Hook payloads are untrusted content, even when delivery comes from systems you control (mail/docs/web content can carry prompt injection).
+- Weak model tiers increase this risk. For hook-driven automation, prefer strong modern model tiers and keep tool policy tight (`tools.profile: "messaging"` or stricter), plus sandboxing where possible.
+
 ### Prompt injection does not require public DMs
 
 Even if **only you** can message the bot, prompt injection can still happen via
diff --git a/docs/reference/wizard.md b/docs/reference/wizard.md
index 4f85e7e866d..1c459cbaa24 100644
--- a/docs/reference/wizard.md
+++ b/docs/reference/wizard.md
@@ -245,6 +245,7 @@ Typical fields in `~/.openclaw/openclaw.json`:
 
 - `agents.defaults.workspace`
 - `agents.defaults.model` / `models.providers` (if Minimax chosen)
+- `tools.profile` (local onboarding defaults to `"messaging"` when unset; existing explicit values are preserved)
 - `gateway.*` (mode, bind, auth, tailscale)
 - `session.dmScope` (behavior details: [CLI Onboarding Reference](/start/wizard-cli-reference#outputs-and-internals))
 - `channels.telegram.botToken`, `channels.discord.token`, `channels.signal.*`, `channels.imessage.*`
diff --git a/docs/start/onboarding.md b/docs/start/onboarding.md
index dfa058af545..3a5c86c360e 100644
--- a/docs/start/onboarding.md
+++ b/docs/start/onboarding.md
@@ -34,6 +34,8 @@ Security trust model:
 
 - By default, OpenClaw is a personal agent: one trusted operator boundary.
 - Shared/multi-user setups require lock-down (split trust boundaries, keep tool access minimal, and follow [Security](/gateway/security)).
+- Local onboarding now defaults new configs to `tools.profile: "messaging"` so broad runtime/filesystem tools are opt-in.
+- If hooks/webhooks or other untrusted content feeds are enabled, use a strong modern model tier and keep strict tool policy/sandboxing.
diff --git a/docs/start/wizard-cli-reference.md b/docs/start/wizard-cli-reference.md
index 5019956a05c..7f70f78f28b 100644
--- a/docs/start/wizard-cli-reference.md
+++ b/docs/start/wizard-cli-reference.md
@@ -236,6 +236,7 @@ Typical fields in `~/.openclaw/openclaw.json`:
 
 - `agents.defaults.workspace`
 - `agents.defaults.model` / `models.providers` (if Minimax chosen)
+- `tools.profile` (local onboarding defaults to `"messaging"` when unset; existing explicit values are preserved)
 - `gateway.*` (mode, bind, auth, tailscale)
 - `session.dmScope` (local onboarding defaults this to `per-channel-peer` when unset; existing explicit values are preserved)
 - `channels.telegram.botToken`, `channels.discord.token`, `channels.signal.*`, `channels.imessage.*`
diff --git a/docs/start/wizard.md b/docs/start/wizard.md
index ecf059c3b89..d1701e326cd 100644
--- a/docs/start/wizard.md
+++ b/docs/start/wizard.md
@@ -50,6 +50,7 @@ The wizard starts with **QuickStart** (defaults) vs **Advanced** (full control).
  - Workspace default (or existing workspace)
  - Gateway port **18789**
  - Gateway auth **Token** (auto‑generated, even on loopback)
+ - Tool policy default for new local setups: `tools.profile: "messaging"` (existing explicit profile is preserved)
  - DM isolation default: local onboarding writes `session.dmScope: "per-channel-peer"` when unset. Details: [CLI Onboarding Reference](/start/wizard-cli-reference#outputs-and-internals)
  - Tailscale exposure **Off**
  - Telegram + WhatsApp DMs default to **allowlist** (you'll be prompted for your phone number)
@@ -65,6 +66,7 @@ The wizard starts with **QuickStart** (defaults) vs **Advanced** (full control).
 1. **Model/Auth** — Anthropic API key (recommended), OpenAI, or Custom Provider (OpenAI-compatible, Anthropic-compatible, or Unknown auto-detect). Pick a default model.
+ Security note: if this agent will run tools or process webhook/hooks content, prefer a strong modern model tier and keep tool policy strict. Weaker model tiers are easier to prompt-inject.
  For non-interactive runs, `--secret-input-mode ref` stores env-backed refs in auth profiles instead of plaintext API key values.
  In non-interactive `ref` mode, the provider env var must be set; passing inline key flags without that env var fails fast.
  In interactive runs, choosing secret reference mode lets you point at either an environment variable or a configured provider ref (`file` or `exec`), with a fast preflight validation before saving.