docs: add dedicated pdf tool docs page

2026-03-08 06:54:24 +00:00 · 2026-03-03 04:07:00 +00:00
parent 6cdfd2eaaa
commit d9d604c6ad
4 changed files with 159 additions and 15 deletions
--- a/docs/docs.json
+++ b/docs/docs.json
@@ -994,6 +994,7 @@
                  "brave-search",
                  "perplexity",
                  "tools/diffs",
+                  "tools/pdf",
                  "tools/elevated",
                  "tools/exec",
                  "tools/exec-approvals",
--- a/docs/start/hubs.md
+++ b/docs/start/hubs.md
@@ -109,6 +109,7 @@ Use these hubs to discover every page, including deep dives and reference docs t
 - [OpenProse](/prose)
 - [CLI reference](/cli)
 - [Exec tool](/tools/exec)
+- [PDF tool](/tools/pdf)
 - [Elevated mode](/tools/elevated)
 - [Cron jobs](/automation/cron-jobs)
 - [Cron vs Heartbeat](/automation/cron-vs-heartbeat)
--- a/docs/tools/index.md
+++ b/docs/tools/index.md
@@ -401,21 +401,7 @@ Notes:

 Analyze one or more PDF documents.

-Core parameters:
-
-- `pdf` (single path or URL)
-- `pdfs` (multiple paths or URLs, up to 10)
-- `prompt` (optional, defaults to "Analyze this PDF document.")
-- `pages` (optional page range like `1-5` or `1,3,7-9`)
-- `model` (optional model override)
-- `maxBytesMb` (optional size cap)
-
-Notes:
-
-- Native PDF provider mode is supported for Anthropic and Google models.
-- Non-native models use PDF extraction fallback, text first, then rasterized page images when needed.
-- `pages` filtering is only supported in extraction fallback mode. Native providers return a clear error when `pages` is set.
-- Defaults are configurable via `agents.defaults.pdfModel`, `agents.defaults.pdfMaxBytesMb`, and `agents.defaults.pdfMaxPages`.
+For full behavior, limits, config, and examples, see [PDF tool](/tools/pdf).

 ### `message`

--- a/docs/tools/pdf.md
+++ b/docs/tools/pdf.md
@@ -0,0 +1,156 @@
+---
+title: "PDF Tool"
+summary: "Analyze one or more PDF documents with native provider support and extraction fallback"
+read_when:
+  - You want to analyze PDFs from agents
+  - You need exact pdf tool parameters and limits
+  - You are debugging native PDF mode vs extraction fallback
+---
+
+# PDF tool
+
+`pdf` analyzes one or more PDF documents and returns text.
+
+Quick behavior:
+
+- Native provider mode for Anthropic and Google model providers.
+- Extraction fallback mode for other providers (extract text first, then page images when needed).
+- Supports single (`pdf`) or multi (`pdfs`) input, max 10 PDFs per call.
+
+## Availability
+
+The tool is only registered when OpenClaw can resolve a PDF-capable model config for the agent:
+
+1. `agents.defaults.pdfModel`
+2. fallback to `agents.defaults.imageModel`
+3. fallback to best-effort provider defaults based on available auth
+
+If no usable model can be resolved, the `pdf` tool is not exposed.
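+
+The resolution order above can be sketched roughly as follows. This is an illustrative sketch, not OpenClaw's implementation: `resolve_pdf_model` and the `auth_fallback` argument are hypothetical names, and accepting either a plain `provider/model` string or an object with a `primary` field is an assumption based on the config example on this page.
+
+```python
+from typing import Optional
+
+
+def resolve_pdf_model(defaults: dict, auth_fallback: Optional[str] = None) -> Optional[str]:
+    """Return the first usable model ref, or None when the tool should stay hidden."""
+    # Steps 1 and 2: prefer pdfModel, then fall back to imageModel.
+    for key in ("pdfModel", "imageModel"):
+        value = defaults.get(key)
+        # Assumption: an entry is either a plain ref or an object with `primary`.
+        if isinstance(value, dict):
+            value = value.get("primary")
+        if value:
+            return value
+    # Step 3: best-effort provider default derived from available auth
+    # (passed in here as a hypothetical, already-computed value).
+    return auth_fallback
+```
+
+A `None` result corresponds to the case where the `pdf` tool is not exposed.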
+
+## Input reference
+
+- `pdf` (`string`): one PDF path or URL
+- `pdfs` (`string[]`): multiple PDF paths or URLs, up to 10 total
+- `prompt` (`string`): analysis prompt, default `Analyze this PDF document.`
+- `pages` (`string`): page filter like `1-5` or `1,3,7-9`
+- `model` (`string`): optional model override (`provider/model`)
+- `maxBytesMb` (`number`): per-PDF size cap in MB
+
+Input notes:
+
+- `pdf` and `pdfs` are merged and deduplicated before loading.
+- If no PDF input is provided, the tool errors.
+- `pages` is parsed as 1-based page numbers, deduped, sorted, and clamped to the configured max pages.
+- `maxBytesMb` defaults to `agents.defaults.pdfMaxBytesMb`, or `10` when that is unset.
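+
+As a rough illustration of the input notes above, the sketch below merges `pdf` and `pdfs` and parses a `pages` string. The function names are hypothetical, and reading "clamped" as "keep at most the first `max_pages` entries" is an assumption; the real tool may clamp differently.
+
+```python
+def merge_inputs(pdf=None, pdfs=None):
+    """Merge `pdf` and `pdfs` into one ordered list without duplicates."""
+    merged = []
+    for ref in ([pdf] if pdf else []) + list(pdfs or []):
+        if ref not in merged:
+            merged.append(ref)
+    if not merged:
+        raise ValueError("pdf required: provide a path or URL to a PDF document")
+    return merged
+
+
+def parse_pages(spec: str, max_pages: int = 20) -> list[int]:
+    """Parse a spec such as "1-3,7" into sorted, unique, 1-based page numbers."""
+    pages = set()
+    for part in spec.split(","):
+        part = part.strip()
+        if not part:
+            continue
+        if "-" in part:
+            start_text, end_text = part.split("-", 1)
+            start, end = int(start_text), int(end_text)
+        else:
+            start = end = int(part)
+        if start < 1 or end < start:
+            raise ValueError(f"invalid page range: {part!r}")
+        pages.update(range(start, end + 1))
+    # Assumption: clamping keeps only the first `max_pages` selected pages.
+    return sorted(pages)[:max_pages]
+```
+
+With these definitions, `parse_pages("1,3,7-9")` yields `[1, 3, 7, 8, 9]`.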
+
+## Supported PDF references
+
+- local file path (including `~` expansion)
+- `file://` URL
+- `http://` and `https://` URL
+
+Reference notes:
+
+- Other URI schemes (for example `ftp://`) are rejected with `unsupported_pdf_reference`.
+- In sandbox mode, remote `http(s)` URLs are rejected.
+- With workspace-only file policy enabled, local file paths outside allowed roots are rejected.
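+
+The scheme rules above could be checked along these lines. This is a minimal sketch: `classify_pdf_reference` and its return labels are hypothetical, and raising `ValueError` stands in for the structured `details.error` the tool actually returns. The workspace-only root check is omitted.
+
+```python
+from urllib.parse import urlparse
+
+
+def classify_pdf_reference(ref: str, sandboxed: bool = False) -> str:
+    """Classify a PDF reference as "remote", "file_url", or "local_path"."""
+    scheme = urlparse(ref).scheme.lower()
+    if scheme in ("http", "https"):
+        if sandboxed:
+            # Remote http(s) URLs are rejected in sandbox mode.
+            raise ValueError("remote URLs are rejected in sandbox mode")
+        return "remote"
+    if scheme == "file":
+        return "file_url"
+    if "://" in ref:
+        # Any other URI scheme, for example ftp://, is unsupported.
+        raise ValueError("unsupported_pdf_reference")
+    # Everything else is treated as a local path (including `~` paths).
+    return "local_path"
+```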
+
+## Execution modes
+
+### Native provider mode
+
+Native mode is used for the `anthropic` and `google` providers.
+The tool sends raw PDF bytes directly to provider APIs.
+
+Native mode limits:
+
+- `pages` is not supported. If set, the tool returns an error.
+
+### Extraction fallback mode
+
+Fallback mode is used for non-native providers.
+
+Flow:
+
+1. Extract text from selected pages (up to `agents.defaults.pdfMaxPages`, default `20`).
+2. If the extracted text is shorter than `200` characters, render the selected pages to PNG images and include them.
+3. Send extracted content plus prompt to the selected model.
+
+Fallback details:
+
+- Page image extraction uses a pixel budget of `4,000,000`.
+- If the target model does not support image input and there is no extractable text, the tool errors.
+- Extraction fallback requires `pdfjs-dist` (and `@napi-rs/canvas` for image rendering).
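+
+The fallback decision described above might look roughly like this. It is a sketch under stated assumptions, not the actual implementation: `build_fallback_content` and the `render_page` callback are hypothetical, text extraction and PNG rendering (done with `pdfjs-dist` and `@napi-rs/canvas` in the real tool) are stubbed out, and measuring the threshold on whitespace-stripped text is a guess.
+
+```python
+MIN_TEXT_CHARS = 200  # threshold from step 2 of the flow
+
+
+def build_fallback_content(page_texts, model_supports_images, render_page):
+    """Decide what to send to the model: extracted text, page images, or both."""
+    text = "\n".join(page_texts).strip()
+    images = []
+    # Too little text: add rendered page images if the model accepts them.
+    if len(text) < MIN_TEXT_CHARS and model_supports_images:
+        images = [render_page(index) for index in range(len(page_texts))]
+    # Neither text nor images available: nothing useful to send.
+    if not text and not images:
+        raise ValueError("no extractable text and model does not support image input")
+    return {"text": text, "images": images}
+```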
+
+## Config
+
+```json5
+{
+  agents: {
+    defaults: {
+      pdfModel: {
+        primary: "anthropic/claude-opus-4-6",
+        fallbacks: ["openai/gpt-5-mini"],
+      },
+      pdfMaxBytesMb: 10,
+      pdfMaxPages: 20,
+    },
+  },
+}
+```
+
+See [Configuration Reference](/gateway/configuration-reference) for full field details.
+
+## Output details
+
+The tool returns text in `content[0].text` and structured metadata in `details`.
+
+Common `details` fields:
+
+- `model`: resolved model ref (`provider/model`)
+- `native`: `true` for native provider mode, `false` for fallback
+- `attempts`: fallback attempts that failed before success
+
+Path fields:
+
+- single PDF input: `details.pdf`
+- multiple PDF inputs: `details.pdfs[]` with `pdf` entries
+- sandbox path rewrite metadata (when applicable): `rewrittenFrom`
+
+## Error behavior
+
+- Missing PDF input: throws `pdf required: provide a path or URL to a PDF document`
+- Too many PDFs: returns a structured error with `details.error = "too_many_pdfs"`
+- Unsupported reference scheme: returns a structured error with `details.error = "unsupported_pdf_reference"`
+- Native mode with `pages`: throws a `pages is not supported with native PDF providers` error
+
+## Examples
+
+Single PDF:
+
+```json
+{
+  "pdf": "/tmp/report.pdf",
+  "prompt": "Summarize this report in 5 bullets"
+}
+```
+
+Multiple PDFs:
+
+```json
+{
+  "pdfs": ["/tmp/q1.pdf", "/tmp/q2.pdf"],
+  "prompt": "Compare risks and timeline changes across both documents"
+}
+```
+
+Page-filtered fallback model:
+
+```json
+{
+  "pdf": "https://example.com/report.pdf",
+  "pages": "1-3,7",
+  "model": "openai/gpt-5-mini",
+  "prompt": "Extract only customer-impacting incidents"
+}
+```