mirror of
https://github.com/moltbot/moltbot.git
synced 2026-03-08 06:54:24 +00:00
docs: add dedicated pdf tool docs page
This commit is contained in:
@@ -994,6 +994,7 @@
|
||||
"brave-search",
|
||||
"perplexity",
|
||||
"tools/diffs",
|
||||
"tools/pdf",
|
||||
"tools/elevated",
|
||||
"tools/exec",
|
||||
"tools/exec-approvals",
|
||||
|
||||
@@ -109,6 +109,7 @@ Use these hubs to discover every page, including deep dives and reference docs t
|
||||
- [OpenProse](/prose)
|
||||
- [CLI reference](/cli)
|
||||
- [Exec tool](/tools/exec)
|
||||
- [PDF tool](/tools/pdf)
|
||||
- [Elevated mode](/tools/elevated)
|
||||
- [Cron jobs](/automation/cron-jobs)
|
||||
- [Cron vs Heartbeat](/automation/cron-vs-heartbeat)
|
||||
|
||||
@@ -401,21 +401,7 @@ Notes:
|
||||
|
||||
Analyze one or more PDF documents.
|
||||
|
||||
Core parameters:
|
||||
|
||||
- `pdf` (single path or URL)
|
||||
- `pdfs` (multiple paths or URLs, up to 10)
|
||||
- `prompt` (optional, defaults to "Analyze this PDF document.")
|
||||
- `pages` (optional page range like `1-5` or `1,3,7-9`)
|
||||
- `model` (optional model override)
|
||||
- `maxBytesMb` (optional size cap)
|
||||
|
||||
Notes:
|
||||
|
||||
- Native PDF provider mode is supported for Anthropic and Google models.
|
||||
- Non-native models use PDF extraction fallback, text first, then rasterized page images when needed.
|
||||
- `pages` filtering is only supported in extraction fallback mode. Native providers return a clear error when `pages` is set.
|
||||
- Defaults are configurable via `agents.defaults.pdfModel`, `agents.defaults.pdfMaxBytesMb`, and `agents.defaults.pdfMaxPages`.
|
||||
For full behavior, limits, config, and examples, see [PDF tool](/tools/pdf).
|
||||
|
||||
### `message`
|
||||
|
||||
|
||||
156
docs/tools/pdf.md
Normal file
156
docs/tools/pdf.md
Normal file
@@ -0,0 +1,156 @@
|
||||
---
|
||||
title: "PDF Tool"
|
||||
summary: "Analyze one or more PDF documents with native provider support and extraction fallback"
|
||||
read_when:
|
||||
- You want to analyze PDFs from agents
|
||||
- You need exact pdf tool parameters and limits
|
||||
- You are debugging native PDF mode vs extraction fallback
|
||||
---
|
||||
|
||||
# PDF tool
|
||||
|
||||
`pdf` analyzes one or more PDF documents and returns text.
|
||||
|
||||
Quick behavior:
|
||||
|
||||
- Native provider mode for Anthropic and Google model providers.
|
||||
- Extraction fallback mode for other providers (extract text first, then page images when needed).
|
||||
- Supports single (`pdf`) or multi (`pdfs`) input, max 10 PDFs per call.
|
||||
|
||||
## Availability
|
||||
|
||||
The tool is only registered when OpenClaw can resolve a PDF-capable model config for the agent:
|
||||
|
||||
1. `agents.defaults.pdfModel`
|
||||
2. fallback to `agents.defaults.imageModel`
|
||||
3. fallback to best effort provider defaults based on available auth
|
||||
|
||||
If no usable model can be resolved, the `pdf` tool is not exposed.
|
||||
|
||||
## Input reference
|
||||
|
||||
- `pdf` (`string`): one PDF path or URL
|
||||
- `pdfs` (`string[]`): multiple PDF paths or URLs, up to 10 total
|
||||
- `prompt` (`string`): analysis prompt, default `Analyze this PDF document.`
|
||||
- `pages` (`string`): page filter like `1-5` or `1,3,7-9`
|
||||
- `model` (`string`): optional model override (`provider/model`)
|
||||
- `maxBytesMb` (`number`): per-PDF size cap in MB
|
||||
|
||||
Input notes:
|
||||
|
||||
- `pdf` and `pdfs` are merged and deduplicated before loading.
|
||||
- If no PDF input is provided, the tool errors.
|
||||
- `pages` is parsed as 1-based page numbers, deduped, sorted, and clamped to the configured max pages.
|
||||
- `maxBytesMb` defaults to `agents.defaults.pdfMaxBytesMb` or `10`.
|
||||
|
||||
## Supported PDF references
|
||||
|
||||
- local file path (including `~` expansion)
|
||||
- `file://` URL
|
||||
- `http://` and `https://` URL
|
||||
|
||||
Reference notes:
|
||||
|
||||
- Other URI schemes (for example `ftp://`) are rejected with `unsupported_pdf_reference`.
|
||||
- In sandbox mode, remote `http(s)` URLs are rejected.
|
||||
- With workspace-only file policy enabled, local file paths outside allowed roots are rejected.
|
||||
|
||||
## Execution modes
|
||||
|
||||
### Native provider mode
|
||||
|
||||
Native mode is used for provider `anthropic` and `google`.
|
||||
The tool sends raw PDF bytes directly to provider APIs.
|
||||
|
||||
Native mode limits:
|
||||
|
||||
- `pages` is not supported. If set, the tool returns an error.
|
||||
|
||||
### Extraction fallback mode
|
||||
|
||||
Fallback mode is used for non-native providers.
|
||||
|
||||
Flow:
|
||||
|
||||
1. Extract text from selected pages (up to `agents.defaults.pdfMaxPages`, default `20`).
|
||||
2. If extracted text length is below `200` chars, render selected pages to PNG images and include them.
|
||||
3. Send extracted content plus prompt to the selected model.
|
||||
|
||||
Fallback details:
|
||||
|
||||
- Page image extraction uses a pixel budget of `4,000,000`.
|
||||
- If the target model does not support image input and there is no extractable text, the tool errors.
|
||||
- Extraction fallback requires `pdfjs-dist` (and `@napi-rs/canvas` for image rendering).
|
||||
|
||||
## Config
|
||||
|
||||
```json5
|
||||
{
|
||||
agents: {
|
||||
defaults: {
|
||||
pdfModel: {
|
||||
primary: "anthropic/claude-opus-4-6",
|
||||
fallbacks: ["openai/gpt-5-mini"],
|
||||
},
|
||||
pdfMaxBytesMb: 10,
|
||||
pdfMaxPages: 20,
|
||||
},
|
||||
},
|
||||
}
|
||||
```
|
||||
|
||||
See [Configuration Reference](/gateway/configuration-reference) for full field details.
|
||||
|
||||
## Output details
|
||||
|
||||
The tool returns text in `content[0].text` and structured metadata in `details`.
|
||||
|
||||
Common `details` fields:
|
||||
|
||||
- `model`: resolved model ref (`provider/model`)
|
||||
- `native`: `true` for native provider mode, `false` for fallback
|
||||
- `attempts`: fallback attempts that failed before success
|
||||
|
||||
Path fields:
|
||||
|
||||
- single PDF input: `details.pdf`
|
||||
- multiple PDF inputs: `details.pdfs[]` with `pdf` entries
|
||||
- sandbox path rewrite metadata (when applicable): `rewrittenFrom`
|
||||
|
||||
## Error behavior
|
||||
|
||||
- Missing PDF input: throws `pdf required: provide a path or URL to a PDF document`
|
||||
- Too many PDFs: returns structured error in `details.error = "too_many_pdfs"`
|
||||
- Unsupported reference scheme: returns `details.error = "unsupported_pdf_reference"`
|
||||
- Native mode with `pages`: throws clear `pages is not supported with native PDF providers` error
|
||||
|
||||
## Examples
|
||||
|
||||
Single PDF:
|
||||
|
||||
```json
|
||||
{
|
||||
"pdf": "/tmp/report.pdf",
|
||||
"prompt": "Summarize this report in 5 bullets"
|
||||
}
|
||||
```
|
||||
|
||||
Multiple PDFs:
|
||||
|
||||
```json
|
||||
{
|
||||
"pdfs": ["/tmp/q1.pdf", "/tmp/q2.pdf"],
|
||||
"prompt": "Compare risks and timeline changes across both documents"
|
||||
}
|
||||
```
|
||||
|
||||
Page-filtered fallback model:
|
||||
|
||||
```json
|
||||
{
|
||||
"pdf": "https://example.com/report.pdf",
|
||||
"pages": "1-3,7",
|
||||
"model": "openai/gpt-5-mini",
|
||||
"prompt": "Extract only customer-impacting incidents"
|
||||
}
|
||||
```
|
||||
Reference in New Issue
Block a user