From cf5702233c80c4f354b15b178e6b5976a3791acc Mon Sep 17 00:00:00 2001
From: Peter Steinberger <steipete@gmail.com>
Date: Mon, 2 Mar 2026 18:15:43 +0000
Subject: [PATCH] docs(security)!: document messaging-only onboarding default
 and hook/model risk

---
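The guidance documented in this patch could be sketched as a config fragment along the following lines. This is an illustration only. The key names are taken from the docs changed below, but the nesting, the elided mapping fields, and the assumption that the config file accepts JSON5-style comments are not verified against the actual schema.

```json5
{
  // Messaging-only base allowlist (the value local onboarding writes when unset).
  tools: { profile: "messaging" },
  hooks: {
    // Keep the unsafe-content bypass off outside tightly scoped debugging.
    gmail: { allowUnsafeExternalContent: false },
    mappings: [
      // Assumption: a real mapping has other fields, omitted here.
      { allowUnsafeExternalContent: false },
    ],
  },
}
```
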
 SECURITY.md                             | 2 ++
 docs/gateway/configuration-reference.md | 2 ++
 docs/gateway/configuration.md           | 5 +++++
 docs/gateway/security/index.md          | 5 +++++
 docs/reference/wizard.md                | 1 +
 docs/start/onboarding.md                | 2 ++
 docs/start/wizard-cli-reference.md      | 1 +
 docs/start/wizard.md                    | 2 ++
 8 files changed, 20 insertions(+)
diff --git a/SECURITY.md b/SECURITY.md
index 1dc51369f9a..8562a232ddb 100644
--- a/SECURITY.md
+++ b/SECURITY.md
@@ -149,6 +149,8 @@ OpenClaw's security model is "personal assistant" (one trusted operator, potenti
 - The model/agent is **not** a trusted principal. Assume prompt/content injection can manipulate behavior.
 - Security boundaries come from host/config trust, auth, tool policy, sandboxing, and exec approvals.
 - Prompt injection by itself is not a vulnerability report unless it crosses one of those boundaries.
+- Treat hook/webhook-driven payloads as untrusted content, and keep the unsafe bypass flags (`hooks.gmail.allowUnsafeExternalContent`, `hooks.mappings[].allowUnsafeExternalContent`) disabled except during tightly scoped debugging.
+- Weak model tiers are generally easier to prompt-inject. For tool-enabled or hook-driven agents, prefer strong modern model tiers and strict tool policy (for example `tools.profile: "messaging"` or stricter), plus sandboxing where possible.
 
 ## Gateway and Node trust concept
 
diff --git a/docs/gateway/configuration-reference.md b/docs/gateway/configuration-reference.md
index bdf6fbdb639..5f5750dfb5a 100644
--- a/docs/gateway/configuration-reference.md
+++ b/docs/gateway/configuration-reference.md
@@ -1587,6 +1587,8 @@ Defaults for Talk mode (macOS/iOS/Android).
 
 `tools.profile` sets a base allowlist before `tools.allow`/`tools.deny`:
 
+Local onboarding defaults new local configs to `tools.profile: "messaging"` when unset (existing explicit profiles are preserved).
+
 | Profile     | Includes                                                                                  |
 | ----------- | ----------------------------------------------------------------------------------------- |
 | `minimal`   | `session_status` only                                                                     |
diff --git a/docs/gateway/configuration.md b/docs/gateway/configuration.md
index 16e1deb253d..d3bfe3ad60a 100644
--- a/docs/gateway/configuration.md
+++ b/docs/gateway/configuration.md
@@ -291,6 +291,11 @@ When validation fails:
     }
     ```
 
+    Security note:
+    - Treat all hook/webhook payload content as untrusted input.
+    - Keep unsafe-content bypass flags disabled (`hooks.gmail.allowUnsafeExternalContent`, `hooks.mappings[].allowUnsafeExternalContent`) unless doing tightly scoped debugging.
+    - For hook-driven agents, prefer strong modern model tiers and strict tool policy (for example `tools.profile: "messaging"` or stricter), plus sandboxing where possible.
+
     See [full reference](/gateway/configuration-reference#hooks) for all mapping options and Gmail integration.
 
   </Accordion>
diff --git a/docs/gateway/security/index.md b/docs/gateway/security/index.md
index 46876959278..470cb7df08f 100644
--- a/docs/gateway/security/index.md
+++ b/docs/gateway/security/index.md
@@ -538,6 +538,11 @@ Guidance:
 - Only enable temporarily for tightly scoped debugging.
 - If enabled, isolate that agent (sandbox + minimal tools + dedicated session namespace).
 
+Hooks risk note:
+
+- Hook payloads are untrusted content, even when delivery comes from systems you control (mail/docs/web content can carry prompt injection).
+- Weak model tiers increase this risk. For hook-driven automation, prefer strong modern model tiers and keep tool policy tight (`tools.profile: "messaging"` or stricter), plus sandboxing where possible.
+
 ### Prompt injection does not require public DMs
 
 Even if **only you** can message the bot, prompt injection can still happen via
diff --git a/docs/reference/wizard.md b/docs/reference/wizard.md
index 4f85e7e866d..1c459cbaa24 100644
--- a/docs/reference/wizard.md
+++ b/docs/reference/wizard.md
@@ -245,6 +245,7 @@ Typical fields in `~/.openclaw/openclaw.json`:
 
 - `agents.defaults.workspace`
 - `agents.defaults.model` / `models.providers` (if Minimax chosen)
+- `tools.profile` (local onboarding defaults to `"messaging"` when unset; existing explicit values are preserved)
 - `gateway.*` (mode, bind, auth, tailscale)
 - `session.dmScope` (behavior details: [CLI Onboarding Reference](/start/wizard-cli-reference#outputs-and-internals))
 - `channels.telegram.botToken`, `channels.discord.token`, `channels.signal.*`, `channels.imessage.*`
diff --git a/docs/start/onboarding.md b/docs/start/onboarding.md
index dfa058af545..3a5c86c360e 100644
--- a/docs/start/onboarding.md
+++ b/docs/start/onboarding.md
@@ -34,6 +34,8 @@ Security trust model:
 
 - By default, OpenClaw is a personal agent: one trusted operator boundary.
 - Shared/multi-user setups require lock-down (split trust boundaries, keep tool access minimal, and follow [Security](/gateway/security)).
+- Local onboarding defaults new configs to `tools.profile: "messaging"`, so broad runtime/filesystem tools are opt-in.
+- If hooks/webhooks or other untrusted content feeds are enabled, use a strong modern model tier, keep tool policy strict, and sandbox where possible.
 
 </Step>
 <Step title="Local vs Remote">
diff --git a/docs/start/wizard-cli-reference.md b/docs/start/wizard-cli-reference.md
index 5019956a05c..7f70f78f28b 100644
--- a/docs/start/wizard-cli-reference.md
+++ b/docs/start/wizard-cli-reference.md
@@ -236,6 +236,7 @@ Typical fields in `~/.openclaw/openclaw.json`:
 
 - `agents.defaults.workspace`
 - `agents.defaults.model` / `models.providers` (if Minimax chosen)
+- `tools.profile` (local onboarding defaults to `"messaging"` when unset; existing explicit values are preserved)
 - `gateway.*` (mode, bind, auth, tailscale)
 - `session.dmScope` (local onboarding defaults this to `per-channel-peer` when unset; existing explicit values are preserved)
 - `channels.telegram.botToken`, `channels.discord.token`, `channels.signal.*`, `channels.imessage.*`
diff --git a/docs/start/wizard.md b/docs/start/wizard.md
index ecf059c3b89..d1701e326cd 100644
--- a/docs/start/wizard.md
+++ b/docs/start/wizard.md
@@ -50,6 +50,7 @@ The wizard starts with **QuickStart** (defaults) vs **Advanced** (full control).
     - Workspace default (or existing workspace)
     - Gateway port **18789**
     - Gateway auth **Token** (auto‑generated, even on loopback)
+    - Tool policy default for new local setups: `tools.profile: "messaging"` (an existing explicit profile is preserved)
     - DM isolation default: local onboarding writes `session.dmScope: "per-channel-peer"` when unset. Details: [CLI Onboarding Reference](/start/wizard-cli-reference#outputs-and-internals)
     - Tailscale exposure **Off**
     - Telegram + WhatsApp DMs default to **allowlist** (you'll be prompted for your phone number)
@@ -65,6 +66,7 @@ The wizard starts with **QuickStart** (defaults) vs **Advanced** (full control).
 
 1. **Model/Auth** — Anthropic API key (recommended), OpenAI, or Custom Provider
    (OpenAI-compatible, Anthropic-compatible, or Unknown auto-detect). Pick a default model.
+   Security note: if this agent will run tools or process hook/webhook content, prefer a strong modern model tier and keep tool policy strict. Weaker model tiers are easier to prompt-inject.
    For non-interactive runs, `--secret-input-mode ref` stores env-backed refs in auth profiles instead of plaintext API key values.
    In non-interactive `ref` mode, the provider env var must be set; passing inline key flags without that env var fails fast.
    In interactive runs, choosing secret reference mode lets you point at either an environment variable or a configured provider ref (`file` or `exec`), with a fast preflight validation before saving.