mirror of https://github.com/moltbot/moltbot.git
synced 2026-04-17 19:55:44 +00:00

docs(providers): improve ollama, google, bedrock, minimax, venice with Mintlify components
@@ -8,16 +8,130 @@ title: "Amazon Bedrock"

# Amazon Bedrock

OpenClaw can use **Amazon Bedrock** models via pi-ai's **Bedrock Converse**
streaming provider. Bedrock auth uses the **AWS SDK default credential chain**,
not an API key.

## What pi-ai supports

| Property | Value |
| -------- | ----------------------------------------------------------- |
| Provider | `amazon-bedrock` |
| API | `bedrock-converse-stream` |
| Auth | AWS credentials (env vars, shared config, or instance role) |
| Region | `AWS_REGION` or `AWS_DEFAULT_REGION` (default: `us-east-1`) |
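
The region fallback described above can be expressed as ordinary shell parameter expansion (illustrative only: OpenClaw and the AWS SDK resolve this internally):

```shell
# Resolve the Bedrock region: AWS_REGION first, then AWS_DEFAULT_REGION,
# then the documented us-east-1 fallback.
region="${AWS_REGION:-${AWS_DEFAULT_REGION:-us-east-1}}"
echo "$region"
```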

## Getting started

Choose your preferred auth method and follow the setup steps.

<Tabs>
<Tab title="Access keys / env vars">
**Best for:** developer machines, CI, or hosts where you manage AWS credentials directly.

<Steps>
<Step title="Set AWS credentials on the gateway host">
```bash
export AWS_ACCESS_KEY_ID="AKIA..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_REGION="us-east-1"
# Optional:
export AWS_SESSION_TOKEN="..."
export AWS_PROFILE="your-profile"
# Optional (Bedrock API key/bearer token):
export AWS_BEARER_TOKEN_BEDROCK="..."
```
</Step>
<Step title="Add a Bedrock provider and model to your config">
No `apiKey` is required. Configure the provider with `auth: "aws-sdk"`:

```json5
{
  models: {
    providers: {
      "amazon-bedrock": {
        baseUrl: "https://bedrock-runtime.us-east-1.amazonaws.com",
        api: "bedrock-converse-stream",
        auth: "aws-sdk",
        models: [
          {
            id: "us.anthropic.claude-opus-4-6-v1:0",
            name: "Claude Opus 4.6 (Bedrock)",
            reasoning: true,
            input: ["text", "image"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 200000,
            maxTokens: 8192,
          },
        ],
      },
    },
  },
  agents: {
    defaults: {
      model: { primary: "amazon-bedrock/us.anthropic.claude-opus-4-6-v1:0" },
    },
  },
}
```
</Step>
<Step title="Verify models are available">
```bash
openclaw models list
```
</Step>
</Steps>

<Tip>
With env-marker auth (`AWS_ACCESS_KEY_ID`, `AWS_PROFILE`, or `AWS_BEARER_TOKEN_BEDROCK`), OpenClaw auto-enables the implicit Bedrock provider for model discovery without extra config.
</Tip>

</Tab>

<Tab title="EC2 instance roles (IMDS)">
**Best for:** EC2 instances with an IAM role attached, using the instance metadata service for authentication.

<Steps>
<Step title="Enable discovery explicitly">
When using IMDS, OpenClaw cannot detect AWS auth from env markers alone, so you must opt in:

```bash
openclaw config set plugins.entries.amazon-bedrock.config.discovery.enabled true
openclaw config set plugins.entries.amazon-bedrock.config.discovery.region us-east-1
```
</Step>
<Step title="Optionally add an env marker for auto mode">
If you also want the env-marker auto-detection path to work (for example, for `openclaw status` surfaces):

```bash
export AWS_PROFILE=default
export AWS_REGION=us-east-1
```

You do **not** need a fake API key.
</Step>
<Step title="Verify models are discovered">
```bash
openclaw models list
```
</Step>
</Steps>

<Warning>
The IAM role attached to your EC2 instance must have the following permissions:

- `bedrock:InvokeModel`
- `bedrock:InvokeModelWithResponseStream`
- `bedrock:ListFoundationModels` (for automatic discovery)
- `bedrock:ListInferenceProfiles` (for inference profile discovery)

Or attach the managed policy `AmazonBedrockFullAccess`.
</Warning>
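
As a sketch, the permission list above maps to an inline IAM policy like the following (the `Resource: "*"` scoping is an assumption for illustration; tighten it for your account):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:InvokeModelWithResponseStream",
        "bedrock:ListFoundationModels",
        "bedrock:ListInferenceProfiles"
      ],
      "Resource": "*"
    }
  ]
}
```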

<Note>
You only need `AWS_PROFILE=default` if you specifically want an env marker for auto mode or status surfaces. The actual Bedrock runtime auth path uses the AWS SDK default chain, so IMDS instance-role auth works even without env markers.
</Note>

</Tab>
</Tabs>

## Automatic model discovery

@@ -38,127 +152,52 @@ How the implicit provider is enabled:
shared config, SSO, and IMDS instance-role auth can work even when discovery
needed `enabled: true` to opt in.

<Note>
For explicit `models.providers["amazon-bedrock"]` entries, OpenClaw can still resolve Bedrock env-marker auth early from AWS env markers such as `AWS_BEARER_TOKEN_BEDROCK` without forcing full runtime auth loading. The actual model-call auth path still uses the AWS SDK default chain.
</Note>

<AccordionGroup>
<Accordion title="Discovery config options">
Config options live under `plugins.entries.amazon-bedrock.config.discovery`:

```json5
{
  plugins: {
    entries: {
      "amazon-bedrock": {
        config: {
          discovery: {
            enabled: true,
            region: "us-east-1",
            providerFilter: ["anthropic", "amazon"],
            refreshInterval: 3600,
            defaultContextWindow: 32000,
            defaultMaxTokens: 4096,
          },
        },
      },
    },
  },
}
```

| Option | Default | Description |
| ------ | ------- | ----------- |
| `enabled` | auto | In auto mode, OpenClaw only enables the implicit Bedrock provider when it sees a supported AWS env marker. Set `true` to force discovery. |
| `region` | `AWS_REGION` / `AWS_DEFAULT_REGION` / `us-east-1` | AWS region used for discovery API calls. |
| `providerFilter` | (all) | Matches Bedrock provider names (for example `anthropic`, `amazon`). |
| `refreshInterval` | `3600` | Cache duration in seconds. Set to `0` to disable caching. |
| `defaultContextWindow` | `32000` | Context window used for discovered models (override if you know your model limits). |
| `defaultMaxTokens` | `4096` | Max output tokens used for discovered models (override if you know your model limits). |
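
The `refreshInterval` semantics can be illustrated with a small staleness check (a sketch of the documented caching rule, not OpenClaw's actual implementation):

```shell
# Illustrative cache check: the model list is reused until it is older than
# refreshInterval seconds, and refreshInterval 0 disables caching entirely.
is_stale() {
  local refresh_interval=$1 age=$2
  if [ "$refresh_interval" -eq 0 ] || [ "$age" -ge "$refresh_interval" ]; then
    echo stale
  else
    echo fresh
  fi
}

is_stale 3600 120    # fresh: fetched 2 minutes ago, 1-hour interval
is_stale 3600 4000   # stale: past the interval, refetch
is_stale 0 1         # stale: 0 disables caching, always refetch
```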

</Accordion>
</AccordionGroup>

## Quick setup (AWS path)

This walkthrough creates an IAM role, attaches Bedrock permissions, associates
the instance profile, and enables OpenClaw discovery on the EC2 host.

```bash
# 1. Create IAM role and instance profile
aws iam create-role --role-name EC2-Bedrock-Access \
@@ -197,106 +236,127 @@ source ~/.bashrc
openclaw models list
```

## Advanced configuration

<AccordionGroup>
<Accordion title="Inference profiles">
OpenClaw discovers **regional and global inference profiles** alongside
foundation models. When a profile maps to a known foundation model, the
profile inherits that model's capabilities (context window, max tokens,
reasoning, vision) and the correct Bedrock request region is injected
automatically. This means cross-region Claude profiles work without manual
provider overrides.

Inference profile IDs look like `us.anthropic.claude-opus-4-6-v1:0` (regional)
or `anthropic.claude-opus-4-6-v1:0` (global). If the backing model is already
in the discovery results, the profile inherits its full capability set;
otherwise safe defaults apply.

No extra configuration is needed. As long as discovery is enabled and the IAM
principal has `bedrock:ListInferenceProfiles`, profiles appear alongside
foundation models in `openclaw models list`.
</Accordion>

<Accordion title="Guardrails">
You can apply [Amazon Bedrock Guardrails](https://docs.aws.amazon.com/bedrock/latest/userguide/guardrails.html)
to all Bedrock model invocations by adding a `guardrail` object to the
`amazon-bedrock` plugin config. Guardrails let you enforce content filtering,
topic denial, word filters, sensitive information filters, and contextual
grounding checks.

```json5
{
  plugins: {
    entries: {
      "amazon-bedrock": {
        config: {
          guardrail: {
            guardrailIdentifier: "abc123", // guardrail ID or full ARN
            guardrailVersion: "1", // version number or "DRAFT"
            streamProcessingMode: "sync", // optional: "sync" or "async"
            trace: "enabled", // optional: "enabled", "disabled", or "enabled_full"
          },
        },
      },
    },
  },
}
```

| Option | Required | Description |
| ------ | -------- | ----------- |
| `guardrailIdentifier` | Yes | Guardrail ID (e.g. `abc123`) or full ARN (e.g. `arn:aws:bedrock:us-east-1:123456789012:guardrail/abc123`). |
| `guardrailVersion` | Yes | Published version number, or `"DRAFT"` for the working draft. |
| `streamProcessingMode` | No | `"sync"` or `"async"` for guardrail evaluation during streaming. If omitted, Bedrock uses its default. |
| `trace` | No | `"enabled"` or `"enabled_full"` for debugging; omit or set `"disabled"` for production. |

<Warning>
The IAM principal used by the gateway must have the `bedrock:ApplyGuardrail` permission in addition to the standard invoke permissions.
</Warning>

</Accordion>

<Accordion title="Embeddings for memory search">
Bedrock can also serve as the embedding provider for
[memory search](/concepts/memory-search). This is configured separately from the
inference provider -- set `agents.defaults.memorySearch.provider` to `"bedrock"`:

```json5
{
  agents: {
    defaults: {
      memorySearch: {
        provider: "bedrock",
        model: "amazon.titan-embed-text-v2:0", // default
      },
    },
  },
}
```

Bedrock embeddings use the same AWS SDK credential chain as inference (instance
roles, SSO, access keys, shared config, and web identity). No API key is
needed. When `provider` is `"auto"`, Bedrock is auto-detected if that
credential chain resolves successfully.

Supported embedding models include Amazon Titan Embed (v1, v2), Amazon Nova
Embed, Cohere Embed (v3, v4), and TwelveLabs Marengo. See
[Memory configuration reference -- Bedrock](/reference/memory-config#bedrock-embedding-config)
for the full model list and dimension options.

</Accordion>

<Accordion title="Notes and caveats">
- Bedrock requires **model access** enabled in your AWS account/region.
- Automatic discovery needs the `bedrock:ListFoundationModels` and
  `bedrock:ListInferenceProfiles` permissions.
- If you rely on auto mode, set one of the supported AWS auth env markers on the
  gateway host. If you prefer IMDS/shared-config auth without env markers, set
  `plugins.entries.amazon-bedrock.config.discovery.enabled: true`.
- OpenClaw surfaces the credential source in this order: `AWS_BEARER_TOKEN_BEDROCK`,
  then `AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY`, then `AWS_PROFILE`, then the
  default AWS SDK chain.
- Reasoning support depends on the model; check the Bedrock model card for
  current capabilities.
- If you prefer a managed key flow, you can also place an OpenAI-compatible
  proxy in front of Bedrock and configure it as an OpenAI provider instead.
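
The credential-source order above can be sketched as a shell check (illustrative only: OpenClaw performs this detection internally, and the final fallback is the AWS SDK default chain):

```shell
# Report which credential source the documented precedence order would surface.
credential_source() {
  if [ -n "${AWS_BEARER_TOKEN_BEDROCK:-}" ]; then
    echo "AWS_BEARER_TOKEN_BEDROCK"
  elif [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "access keys"
  elif [ -n "${AWS_PROFILE:-}" ]; then
    echo "AWS_PROFILE"
  else
    echo "default AWS SDK chain"
  fi
}

credential_source
```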
</Accordion>
</AccordionGroup>

## Related

<CardGroup cols={2}>
  <Card title="Model selection" href="/concepts/model-providers" icon="layers">
    Choosing providers, model refs, and failover behavior.
  </Card>
  <Card title="Memory search" href="/concepts/memory-search" icon="magnifying-glass">
    Bedrock embeddings for memory search configuration.
  </Card>
  <Card title="Memory config reference" href="/reference/memory-config#bedrock-embedding-config" icon="database">
    Full Bedrock embedding model list and dimension options.
  </Card>
  <Card title="Troubleshooting" href="/help/troubleshooting" icon="wrench">
    General troubleshooting and FAQ.
  </Card>
</CardGroup>

@@ -17,74 +17,114 @@ Gemini Grounding.
- API: Google Gemini API
- Alternative provider: `google-gemini-cli` (OAuth)

## Getting started

Choose your preferred auth method and follow the setup steps.

<Tabs>
<Tab title="API key">
**Best for:** standard Gemini API access through Google AI Studio.

<Steps>
<Step title="Run onboarding">
```bash
openclaw onboard --auth-choice gemini-api-key
```

Or pass the key directly:

```bash
openclaw onboard --non-interactive \
  --mode local \
  --auth-choice gemini-api-key \
  --gemini-api-key "$GEMINI_API_KEY"
```
</Step>
<Step title="Set a default model">
```json5
{
  agents: {
    defaults: {
      model: { primary: "google/gemini-3.1-pro-preview" },
    },
  },
}
```
</Step>
<Step title="Verify the model is available">
```bash
openclaw models list --provider google
```
</Step>
</Steps>

<Tip>
The environment variables `GEMINI_API_KEY` and `GOOGLE_API_KEY` are both accepted. Use whichever you already have configured.
</Tip>
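
For scripts on the gateway host, a small sketch that picks whichever key variable is set (the precedence shown here is illustrative, not something OpenClaw guarantees):

```shell
# Prefer GEMINI_API_KEY when present, otherwise fall back to GOOGLE_API_KEY.
key="${GEMINI_API_KEY:-${GOOGLE_API_KEY:-}}"
if [ -z "$key" ]; then
  echo "no Gemini API key configured" >&2
else
  echo "key configured"
fi
```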

</Tab>

<Tab title="Gemini CLI (OAuth)">
**Best for:** reusing an existing Gemini CLI login via PKCE OAuth instead of a separate API key.

<Warning>
The `google-gemini-cli` provider is an unofficial integration. Some users
report account restrictions when using OAuth this way. Use at your own risk.
</Warning>

<Steps>
<Step title="Install the Gemini CLI">
The local `gemini` command must be available on `PATH`.

```bash
# Homebrew
brew install gemini-cli

# or npm
npm install -g @google/gemini-cli
```

OpenClaw supports both Homebrew installs and global npm installs, including
common Windows/npm layouts.
</Step>
<Step title="Log in via OAuth">
```bash
openclaw models auth login --provider google-gemini-cli --set-default
```
</Step>
<Step title="Verify the model is available">
```bash
openclaw models list --provider google-gemini-cli
```
</Step>
</Steps>

- Default model: `google-gemini-cli/gemini-3-flash-preview`
- Alias: `gemini-cli`

**Environment variables:**

- `OPENCLAW_GEMINI_OAUTH_CLIENT_ID`
- `OPENCLAW_GEMINI_OAUTH_CLIENT_SECRET`

(Or the `GEMINI_CLI_*` variants.)

<Note>
If Gemini CLI OAuth requests fail after login, set `GOOGLE_CLOUD_PROJECT` or
`GOOGLE_CLOUD_PROJECT_ID` on the gateway host and retry.
</Note>

<Note>
If login fails before the browser flow starts, make sure the local `gemini`
command is installed and on `PATH`.
</Note>

The OAuth-only `google-gemini-cli` provider is a separate text-inference
surface. Image generation, media understanding, and Gemini Grounding stay on
the `google` provider id.

</Tab>
</Tabs>

## Capabilities

@@ -100,37 +140,12 @@ Gemini CLI JSON usage notes:
| Thinking/reasoning | Yes (Gemini 3.1+) |
| Gemma 4 models | Yes |

<Tip>
Gemma 4 models (for example `gemma-4-26b-a4b-it`) support thinking mode. OpenClaw
rewrites `thinkingBudget` to a supported Google `thinkingLevel` for Gemma 4.
Setting thinking to `off` preserves thinking disabled instead of mapping to
`MINIMAL`.
</Tip>

## Image generation

@@ -142,10 +157,6 @@ The bundled `google` image-generation provider defaults to
- Edit mode: enabled, up to 5 input images
- Geometry controls: `size`, `aspectRatio`, and `resolution`

To use Google as the default image provider:

```json5
@@ -160,8 +171,9 @@ To use Google as the default image provider:
}
```

<Note>
See [Image Generation](/tools/image-generation) for shared tool parameters, provider selection, and failover behavior.
</Note>

## Video generation

@@ -187,8 +199,9 @@ To use Google as the default video provider:
}
```

<Note>
See [Video Generation](/tools/video-generation) for shared tool parameters, provider selection, and failover behavior.
</Note>

## Music generation

@@ -216,11 +229,74 @@ To use Google as the default music provider:
}
```

<Note>
See [Music Generation](/tools/music-generation) for shared tool parameters, provider selection, and failover behavior.
</Note>
|
||||
|
||||
## Environment note
|
||||
## Advanced configuration
|
||||
|
||||
If the Gateway runs as a daemon (launchd/systemd), make sure `GEMINI_API_KEY`
|
||||
is available to that process (for example, in `~/.openclaw/.env` or via
|
||||
`env.shellEnv`).
|
||||
<AccordionGroup>
<Accordion title="Direct Gemini cache reuse">
For direct Gemini API runs (`api: "google-generative-ai"`), OpenClaw
passes a configured `cachedContent` handle through to Gemini requests.

- Configure per-model or global params with either
  `cachedContent` or legacy `cached_content`
- If both are present, `cachedContent` wins
- Example value: `cachedContents/prebuilt-context`
- Gemini cache-hit usage is normalized into OpenClaw `cacheRead` from
  upstream `cachedContentTokenCount`

```json5
{
  agents: {
    defaults: {
      models: {
        "google/gemini-2.5-pro": {
          params: {
            cachedContent: "cachedContents/prebuilt-context",
          },
        },
      },
    },
  },
}
```

</Accordion>

<Accordion title="Gemini CLI JSON usage notes">
When using the `google-gemini-cli` OAuth provider, OpenClaw normalizes
the CLI JSON output as follows:

- Reply text comes from the CLI JSON `response` field.
- Usage falls back to `stats` when the CLI leaves `usage` empty.
- `stats.cached` is normalized into OpenClaw `cacheRead`.
- If `stats.input` is missing, OpenClaw derives input tokens from
  `stats.input_tokens - stats.cached`.
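The fallback rules above can be sketched as a small normalizer. This is a hypothetical illustration of the described behavior; the `CliStats` interface and `normalizeCliUsage` name are assumptions, not OpenClaw exports.

```typescript
// Hypothetical sketch of the CLI-usage fallback described above.
// Field names mirror the Gemini CLI `stats` JSON.
interface CliStats {
  input?: number;        // direct input-token count, when present
  input_tokens?: number; // total input tokens, including cached
  cached?: number;       // cache-hit tokens
  output?: number;
}

function normalizeCliUsage(stats: CliStats) {
  const cacheRead = stats.cached ?? 0;
  // Prefer a direct `stats.input`; otherwise derive it from total minus cached.
  const input = stats.input ?? Math.max(0, (stats.input_tokens ?? 0) - cacheRead);
  return { input, cacheRead, output: stats.output ?? 0 };
}
```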
</Accordion>

<Accordion title="Environment and daemon setup">
If the Gateway runs as a daemon (launchd/systemd), make sure `GEMINI_API_KEY`
is available to that process (for example, in `~/.openclaw/.env` or via
`env.shellEnv`).
</Accordion>
</AccordionGroup>

## Related

<CardGroup cols={2}>
<Card title="Model selection" href="/concepts/model-providers" icon="layers">
Choosing providers, model refs, and failover behavior.
</Card>
<Card title="Image generation" href="/tools/image-generation" icon="image">
Shared image tool parameters and provider selection.
</Card>
<Card title="Video generation" href="/tools/video-generation" icon="video">
Shared video tool parameters and provider selection.
</Card>
<Card title="Music generation" href="/tools/music-generation" icon="music">
Shared music tool parameters and provider selection.
</Card>
</CardGroup>

@@ -12,31 +12,212 @@ OpenClaw's MiniMax provider defaults to **MiniMax M2.7**.

MiniMax also provides:

- bundled speech synthesis via T2A v2
- bundled image understanding via `MiniMax-VL-01`
- bundled music generation via `music-2.5+`
- bundled `web_search` through the MiniMax Coding Plan search API
- Bundled speech synthesis via T2A v2
- Bundled image understanding via `MiniMax-VL-01`
- Bundled music generation via `music-2.5+`
- Bundled `web_search` through the MiniMax Coding Plan search API

Provider split:

- `minimax`: API-key text provider, plus bundled image generation, image understanding, speech, and web search
- `minimax-portal`: OAuth text provider, plus bundled image generation and image understanding
| Provider ID | Auth | Capabilities |
| ---------------- | ------- | --------------------------------------------------------------- |
| `minimax` | API key | Text, image generation, image understanding, speech, web search |
| `minimax-portal` | OAuth | Text, image generation, image understanding |

## Model lineup

- `MiniMax-M2.7`: default hosted reasoning model.
- `MiniMax-M2.7-highspeed`: faster M2.7 reasoning tier.
- `image-01`: image generation model (generate and image-to-image editing).
| Model | Type | Description |
| ------------------------ | ---------------- | ---------------------------------------- |
| `MiniMax-M2.7` | Chat (reasoning) | Default hosted reasoning model |
| `MiniMax-M2.7-highspeed` | Chat (reasoning) | Faster M2.7 reasoning tier |
| `MiniMax-VL-01` | Vision | Image understanding model |
| `image-01` | Image generation | Text-to-image and image-to-image editing |
| `music-2.5+` | Music generation | Default music model |
| `music-2.5` | Music generation | Previous music generation tier |
| `music-2.0` | Music generation | Legacy music generation tier |
| `MiniMax-Hailuo-2.3` | Video generation | Text-to-video and image reference flows |

## Image generation
## Getting started

Choose your preferred auth method and follow the setup steps.

<Tabs>
<Tab title="OAuth (Coding Plan)">
**Best for:** quick setup with MiniMax Coding Plan via OAuth, no API key required.

<Tabs>
<Tab title="International">
<Steps>
<Step title="Run onboarding">
```bash
openclaw onboard --auth-choice minimax-global-oauth
```

This authenticates against `api.minimax.io`.
</Step>
<Step title="Verify the model is available">
```bash
openclaw models list --provider minimax-portal
```
</Step>
</Steps>
</Tab>
<Tab title="China">
<Steps>
<Step title="Run onboarding">
```bash
openclaw onboard --auth-choice minimax-cn-oauth
```

This authenticates against `api.minimaxi.com`.
</Step>
<Step title="Verify the model is available">
```bash
openclaw models list --provider minimax-portal
```
</Step>
</Steps>
</Tab>
</Tabs>

<Note>
OAuth setups use the `minimax-portal` provider id. Model refs follow the form `minimax-portal/MiniMax-M2.7`.
</Note>

<Tip>
Referral link for MiniMax Coding Plan (10% off): [MiniMax Coding Plan](https://platform.minimax.io/subscribe/coding-plan?code=DbXJTRClnb&source=link)
</Tip>

</Tab>

<Tab title="API key">
**Best for:** hosted MiniMax with Anthropic-compatible API.

<Tabs>
<Tab title="International">
<Steps>
<Step title="Run onboarding">
```bash
openclaw onboard --auth-choice minimax-global-api
```

This configures `api.minimax.io` as the base URL.
</Step>
<Step title="Verify the model is available">
```bash
openclaw models list --provider minimax
```
</Step>
</Steps>
</Tab>
<Tab title="China">
<Steps>
<Step title="Run onboarding">
```bash
openclaw onboard --auth-choice minimax-cn-api
```

This configures `api.minimaxi.com` as the base URL.
</Step>
<Step title="Verify the model is available">
```bash
openclaw models list --provider minimax
```
</Step>
</Steps>
</Tab>
</Tabs>

### Config example

```json5
{
  env: { MINIMAX_API_KEY: "sk-..." },
  agents: { defaults: { model: { primary: "minimax/MiniMax-M2.7" } } },
  models: {
    mode: "merge",
    providers: {
      minimax: {
        baseUrl: "https://api.minimax.io/anthropic",
        apiKey: "${MINIMAX_API_KEY}",
        api: "anthropic-messages",
        models: [
          {
            id: "MiniMax-M2.7",
            name: "MiniMax M2.7",
            reasoning: true,
            input: ["text", "image"],
            cost: { input: 0.3, output: 1.2, cacheRead: 0.06, cacheWrite: 0.375 },
            contextWindow: 204800,
            maxTokens: 131072,
          },
          {
            id: "MiniMax-M2.7-highspeed",
            name: "MiniMax M2.7 Highspeed",
            reasoning: true,
            input: ["text", "image"],
            cost: { input: 0.6, output: 2.4, cacheRead: 0.06, cacheWrite: 0.375 },
            contextWindow: 204800,
            maxTokens: 131072,
          },
        ],
      },
    },
  },
}
```

<Warning>
On the Anthropic-compatible streaming path, OpenClaw disables MiniMax thinking by default unless you explicitly set `thinking` yourself. MiniMax's streaming endpoint emits `reasoning_content` in OpenAI-style delta chunks instead of native Anthropic thinking blocks, which can leak internal reasoning into visible output if left enabled implicitly.
</Warning>

<Note>
API-key setups use the `minimax` provider id. Model refs follow the form `minimax/MiniMax-M2.7`.
</Note>

</Tab>
</Tabs>

## Configure via `openclaw configure`

Use the interactive config wizard to set MiniMax without editing JSON:

<Steps>
<Step title="Launch the wizard">
```bash
openclaw configure
```
</Step>
<Step title="Select Model/auth">
Choose **Model/auth** from the menu.
</Step>
<Step title="Choose a MiniMax auth option">
Pick one of the available MiniMax options:

| Auth choice | Description |
| --- | --- |
| `minimax-global-oauth` | International OAuth (Coding Plan) |
| `minimax-cn-oauth` | China OAuth (Coding Plan) |
| `minimax-global-api` | International API key |
| `minimax-cn-api` | China API key |

</Step>
<Step title="Pick your default model">
Select your default model when prompted.
</Step>
</Steps>

## Capabilities

### Image generation

The MiniMax plugin registers the `image-01` model for the `image_generate` tool. It supports:

- **Text-to-image generation** with aspect ratio control.
- **Image-to-image editing** (subject reference) with aspect ratio control.
- Up to **9 output images** per request.
- Up to **1 reference image** per edit request.
- Supported aspect ratios: `1:1`, `16:9`, `4:3`, `3:2`, `2:3`, `3:4`, `9:16`, `21:9`.
- **Text-to-image generation** with aspect ratio control
- **Image-to-image editing** (subject reference) with aspect ratio control
- Up to **9 output images** per request
- Up to **1 reference image** per edit request
- Supported aspect ratios: `1:1`, `16:9`, `4:3`, `3:2`, `2:3`, `3:4`, `9:16`, `21:9`

To use MiniMax for image generation, set it as the image generation provider:

@@ -64,10 +245,11 @@ The built-in bundled MiniMax text catalog itself stays text-only metadata until
that explicit provider config exists. Image understanding is exposed separately
through the plugin-owned `MiniMax-VL-01` media provider.

See [Image Generation](/tools/image-generation) for the shared tool
parameters, provider selection, and failover behavior.
<Note>
See [Image Generation](/tools/image-generation) for shared tool parameters, provider selection, and failover behavior.
</Note>

## Music generation
### Music generation

The bundled `minimax` plugin also registers music generation through the shared
`music_generate` tool.
@@ -92,10 +274,11 @@ To use MiniMax as the default music provider:
}
```

See [Music Generation](/tools/music-generation) for the shared tool
parameters, provider selection, and failover behavior.
<Note>
See [Music Generation](/tools/music-generation) for shared tool parameters, provider selection, and failover behavior.
</Note>

## Video generation
### Video generation

The bundled `minimax` plugin also registers video generation through the shared
`video_generate` tool.
@@ -118,21 +301,24 @@ To use MiniMax as the default video provider:
}
```

See [Video Generation](/tools/video-generation) for the shared tool
parameters, provider selection, and failover behavior.
<Note>
See [Video Generation](/tools/video-generation) for shared tool parameters, provider selection, and failover behavior.
</Note>

## Image understanding
### Image understanding

The MiniMax plugin registers image understanding separately from the text
catalog:

- `minimax`: default image model `MiniMax-VL-01`
- `minimax-portal`: default image model `MiniMax-VL-01`
| Provider ID | Default image model |
| ---------------- | ------------------- |
| `minimax` | `MiniMax-VL-01` |
| `minimax-portal` | `MiniMax-VL-01` |

That is why automatic media routing can use MiniMax image understanding even
when the bundled text-provider catalog still shows text-only M2.7 chat refs.

## Web search
### Web search

The MiniMax plugin also registers `web_search` through the MiniMax Coding Plan
search API.
@@ -146,136 +332,66 @@ search API.
- Search stays on provider id `minimax`; OAuth CN/global setup can still steer region indirectly through `models.providers.minimax-portal.baseUrl`

Config lives under `plugins.entries.minimax.config.webSearch.*`.
See [MiniMax Search](/tools/minimax-search).

## Choose a setup

### MiniMax OAuth (Coding Plan) - recommended

**Best for:** quick setup with MiniMax Coding Plan via OAuth, no API key required.

Authenticate with the explicit regional OAuth choice:

```bash
openclaw onboard --auth-choice minimax-global-oauth
# or
openclaw onboard --auth-choice minimax-cn-oauth
```

Choice mapping:

- `minimax-global-oauth`: International users (`api.minimax.io`)
- `minimax-cn-oauth`: Users in China (`api.minimaxi.com`)

See the MiniMax plugin package README in the OpenClaw repo for details.

### MiniMax M2.7 (API key)

**Best for:** hosted MiniMax with Anthropic-compatible API.

Configure via CLI:

- Interactive onboarding:

```bash
openclaw onboard --auth-choice minimax-global-api
# or
openclaw onboard --auth-choice minimax-cn-api
```

- `minimax-global-api`: International users (`api.minimax.io`)
- `minimax-cn-api`: Users in China (`api.minimaxi.com`)

```json5
{
  env: { MINIMAX_API_KEY: "sk-..." },
  agents: { defaults: { model: { primary: "minimax/MiniMax-M2.7" } } },
  models: {
    mode: "merge",
    providers: {
      minimax: {
        baseUrl: "https://api.minimax.io/anthropic",
        apiKey: "${MINIMAX_API_KEY}",
        api: "anthropic-messages",
        models: [
          {
            id: "MiniMax-M2.7",
            name: "MiniMax M2.7",
            reasoning: true,
            input: ["text", "image"],
            cost: { input: 0.3, output: 1.2, cacheRead: 0.06, cacheWrite: 0.375 },
            contextWindow: 204800,
            maxTokens: 131072,
          },
          {
            id: "MiniMax-M2.7-highspeed",
            name: "MiniMax M2.7 Highspeed",
            reasoning: true,
            input: ["text", "image"],
            cost: { input: 0.6, output: 2.4, cacheRead: 0.06, cacheWrite: 0.375 },
            contextWindow: 204800,
            maxTokens: 131072,
          },
        ],
      },
    },
  },
}
```

On the Anthropic-compatible streaming path, OpenClaw now disables MiniMax
thinking by default unless you explicitly set `thinking` yourself. MiniMax's
streaming endpoint emits `reasoning_content` in OpenAI-style delta chunks
instead of native Anthropic thinking blocks, which can leak internal reasoning
into visible output if left enabled implicitly.

### MiniMax M2.7 as fallback (example)

**Best for:** keep your strongest latest-generation model as primary, fail over to MiniMax M2.7.
Example below uses Opus as a concrete primary; swap to your preferred latest-gen primary model.

```json5
{
  env: { MINIMAX_API_KEY: "sk-..." },
  agents: {
    defaults: {
      models: {
        "anthropic/claude-opus-4-6": { alias: "primary" },
        "minimax/MiniMax-M2.7": { alias: "minimax" },
      },
      model: {
        primary: "anthropic/claude-opus-4-6",
        fallbacks: ["minimax/MiniMax-M2.7"],
      },
    },
  },
}
```
<Note>
See [MiniMax Search](/tools/minimax-search) for full web search configuration and usage.
</Note>

## Advanced configuration

<AccordionGroup>
<Accordion title="Configuration options">
| Option | Description |
| --- | --- |
| `models.providers.minimax.baseUrl` | Prefer `https://api.minimax.io/anthropic` (Anthropic-compatible); `https://api.minimax.io/v1` is optional for OpenAI-compatible payloads |
| `models.providers.minimax.api` | Prefer `anthropic-messages`; `openai-completions` is optional for OpenAI-compatible payloads |
| `models.providers.minimax.apiKey` | MiniMax API key (`MINIMAX_API_KEY`) |
| `models.providers.minimax.models` | Define `id`, `name`, `reasoning`, `contextWindow`, `maxTokens`, `cost` |
| `agents.defaults.models` | Alias models you want in the allowlist |
| `models.mode` | Keep `merge` if you want to add MiniMax alongside built-ins |
</Accordion>

<Accordion title="Thinking defaults">
On `api: "anthropic-messages"`, OpenClaw injects `thinking: { type: "disabled" }` unless thinking is already explicitly set in params/config.

This prevents MiniMax's streaming endpoint from emitting `reasoning_content` in OpenAI-style delta chunks, which would leak internal reasoning into visible output.
</Accordion>

<Accordion title="Fast mode">
`/fast on` or `params.fastMode: true` rewrites `MiniMax-M2.7` to `MiniMax-M2.7-highspeed` on the Anthropic-compatible stream path.
</Accordion>

<Accordion title="Fallback example">
**Best for:** keep your strongest latest-generation model as primary, fail over to MiniMax M2.7. Example below uses Opus as a concrete primary; swap to your preferred latest-gen primary model.

```json5
{
  env: { MINIMAX_API_KEY: "sk-..." },
  agents: {
    defaults: {
      models: {
        "anthropic/claude-opus-4-6": { alias: "primary" },
        "minimax/MiniMax-M2.7": { alias: "minimax" },
      },
      model: {
        primary: "anthropic/claude-opus-4-6",
        fallbacks: ["minimax/MiniMax-M2.7"],
      },
    },
  },
}
```
</Accordion>
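A minimal sketch of that rewrite, for illustration only; the `applyFastMode` function name and signature are assumptions, not OpenClaw internals:

```typescript
// Hypothetical sketch: when fast mode is on, swap the standard M2.7
// model id for its highspeed tier; other ids pass through unchanged.
function applyFastMode(modelId: string, fastMode: boolean): string {
  if (fastMode && modelId === "MiniMax-M2.7") {
    return "MiniMax-M2.7-highspeed";
  }
  return modelId;
}
```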

## Configure via `openclaw configure`

Use the interactive config wizard to set MiniMax without editing JSON:

1. Run `openclaw configure`.
2. Select **Model/auth**.
3. Choose a **MiniMax** auth option.
4. Pick your default model when prompted.

Current MiniMax auth choices in the wizard/CLI:

- `minimax-global-oauth`
- `minimax-cn-oauth`
- `minimax-global-api`
- `minimax-cn-api`

## Configuration options

- `models.providers.minimax.baseUrl`: prefer `https://api.minimax.io/anthropic` (Anthropic-compatible); `https://api.minimax.io/v1` is optional for OpenAI-compatible payloads.
- `models.providers.minimax.api`: prefer `anthropic-messages`; `openai-completions` is optional for OpenAI-compatible payloads.
- `models.providers.minimax.apiKey`: MiniMax API key (`MINIMAX_API_KEY`).
- `models.providers.minimax.models`: define `id`, `name`, `reasoning`, `contextWindow`, `maxTokens`, `cost`.
- `agents.defaults.models`: alias models you want in the allowlist.
- `models.mode`: keep `merge` if you want to add MiniMax alongside built-ins.
<Accordion title="Coding Plan usage details">
- Coding Plan usage API: `https://api.minimaxi.com/v1/api/openplatform/coding_plan/remains` (requires a coding plan key).
- OpenClaw normalizes MiniMax coding-plan usage to the same `% left` display used by other providers. MiniMax's raw `usage_percent` / `usagePercent` fields are remaining quota, not consumed quota, so OpenClaw inverts them. Count-based fields win when present.
- When the API returns `model_remains`, OpenClaw prefers the chat-model entry, derives the window label from `start_time` / `end_time` when needed, and includes the selected model name in the plan label so coding-plan windows are easier to distinguish.
- Usage snapshots treat `minimax`, `minimax-cn`, and `minimax-portal` as the same MiniMax quota surface, and prefer stored MiniMax OAuth before falling back to Coding Plan key env vars.
</Accordion>
</AccordionGroup>

## Notes

@@ -284,56 +400,67 @@ Current MiniMax auth choices in the wizard/CLI:
- OAuth setup: `minimax-portal/<model>`
- Default chat model: `MiniMax-M2.7`
- Alternate chat model: `MiniMax-M2.7-highspeed`
- On `api: "anthropic-messages"`, OpenClaw injects
  `thinking: { type: "disabled" }` unless thinking is already explicitly set in
  params/config.
- `/fast on` or `params.fastMode: true` rewrites `MiniMax-M2.7` to
  `MiniMax-M2.7-highspeed` on the Anthropic-compatible stream path.
- Onboarding and direct API-key setup write explicit model definitions with
  `input: ["text", "image"]` for both M2.7 variants
- The bundled provider catalog currently exposes the chat refs as text-only
  metadata until explicit MiniMax provider config exists
- Coding Plan usage API: `https://api.minimaxi.com/v1/api/openplatform/coding_plan/remains` (requires a coding plan key).
- OpenClaw normalizes MiniMax coding-plan usage to the same `% left` display
  used by other providers. MiniMax's raw `usage_percent` / `usagePercent`
  fields are remaining quota, not consumed quota, so OpenClaw inverts them.
  Count-based fields win when present. When the API returns `model_remains`,
  OpenClaw prefers the chat-model entry, derives the window label from
  `start_time` / `end_time` when needed, and includes the selected model name
  in the plan label so coding-plan windows are easier to distinguish.
- Usage snapshots treat `minimax`, `minimax-cn`, and `minimax-portal` as the
  same MiniMax quota surface, and prefer stored MiniMax OAuth before falling
  back to Coding Plan key env vars.
- Update pricing values in `models.json` if you need exact cost tracking.
- Referral link for MiniMax Coding Plan (10% off): [https://platform.minimax.io/subscribe/coding-plan?code=DbXJTRClnb&source=link](https://platform.minimax.io/subscribe/coding-plan?code=DbXJTRClnb&source=link)
- See [/concepts/model-providers](/concepts/model-providers) for provider rules.
- Use `openclaw models list` to confirm the current provider id, then switch with
  `openclaw models set minimax/MiniMax-M2.7` or
  `openclaw models set minimax-portal/MiniMax-M2.7`.
- Onboarding and direct API-key setup write explicit model definitions with `input: ["text", "image"]` for both M2.7 variants
- The bundled provider catalog currently exposes the chat refs as text-only metadata until explicit MiniMax provider config exists
- Update pricing values in `models.json` if you need exact cost tracking
- Use `openclaw models list` to confirm the current provider id, then switch with `openclaw models set minimax/MiniMax-M2.7` or `openclaw models set minimax-portal/MiniMax-M2.7`

<Tip>
Referral link for MiniMax Coding Plan (10% off): [MiniMax Coding Plan](https://platform.minimax.io/subscribe/coding-plan?code=DbXJTRClnb&source=link)
</Tip>

<Note>
See [Model providers](/concepts/model-providers) for provider rules.
</Note>

## Troubleshooting

### "Unknown model: minimax/MiniMax-M2.7"

This usually means the **MiniMax provider isn’t configured** (no matching
provider entry and no MiniMax auth profile/env key found). A fix for this
detection is in **2026.1.12**. Fix by:

- Upgrading to **2026.1.12** (or run from source `main`), then restarting the gateway.
- Running `openclaw configure` and selecting a **MiniMax** auth option, or
- Adding the matching `models.providers.minimax` or
  `models.providers.minimax-portal` block manually, or
- Setting `MINIMAX_API_KEY`, `MINIMAX_OAUTH_TOKEN`, or a MiniMax auth profile
  so the matching provider can be injected.

Make sure the model id is **case-sensitive**:

- API-key path: `minimax/MiniMax-M2.7` or `minimax/MiniMax-M2.7-highspeed`
- OAuth path: `minimax-portal/MiniMax-M2.7` or
  `minimax-portal/MiniMax-M2.7-highspeed`

Then recheck with:

```bash
openclaw models list
```

<AccordionGroup>
<Accordion title='"Unknown model: minimax/MiniMax-M2.7"'>
This usually means the **MiniMax provider is not configured** (no matching provider entry and no MiniMax auth profile/env key found). A fix for this detection is in **2026.1.12**. Fix by:

- Upgrading to **2026.1.12** (or run from source `main`), then restarting the gateway.
- Running `openclaw configure` and selecting a **MiniMax** auth option, or
- Adding the matching `models.providers.minimax` or `models.providers.minimax-portal` block manually, or
- Setting `MINIMAX_API_KEY`, `MINIMAX_OAUTH_TOKEN`, or a MiniMax auth profile so the matching provider can be injected.

Make sure the model id is **case-sensitive**:

- API-key path: `minimax/MiniMax-M2.7` or `minimax/MiniMax-M2.7-highspeed`
- OAuth path: `minimax-portal/MiniMax-M2.7` or `minimax-portal/MiniMax-M2.7-highspeed`

Then recheck with:

```bash
openclaw models list
```
</Accordion>
</AccordionGroup>

<Note>
More help: [Troubleshooting](/help/troubleshooting) and [FAQ](/help/faq).
</Note>

## Related

<CardGroup cols={2}>
<Card title="Model selection" href="/concepts/model-providers" icon="layers">
Choosing providers, model refs, and failover behavior.
</Card>
<Card title="Image generation" href="/tools/image-generation" icon="image">
Shared image tool parameters and provider selection.
</Card>
<Card title="Music generation" href="/tools/music-generation" icon="music">
Shared music tool parameters and provider selection.
</Card>
<Card title="Video generation" href="/tools/video-generation" icon="video">
Shared video tool parameters and provider selection.
</Card>
<Card title="MiniMax Search" href="/tools/minimax-search" icon="magnifying-glass">
Web search configuration via MiniMax Coding Plan.
</Card>
<Card title="Troubleshooting" href="/help/troubleshooting" icon="wrench">
General troubleshooting and FAQ.
</Card>
</CardGroup>

@@ -14,122 +14,154 @@ Ollama is a local LLM runtime that makes it easy to run open-source models on yo
|
||||
**Remote Ollama users**: Do not use the `/v1` OpenAI-compatible URL (`http://host:11434/v1`) with OpenClaw. This breaks tool calling and models may output raw tool JSON as plain text. Use the native Ollama API URL instead: `baseUrl: "http://host:11434"` (no `/v1`).
|
||||
</Warning>
|
||||
|
||||
## Quick start
|
||||
## Getting started
|
||||
|
||||
### Onboarding (recommended)
|
||||
Choose your preferred setup method and mode.
|
||||
|
||||
The fastest way to set up Ollama is through onboarding:
|
||||
<Tabs>
|
||||
<Tab title="Onboarding (recommended)">
|
||||
**Best for:** fastest path to a working Ollama setup with automatic model discovery.
|
||||
|
||||
```bash
|
||||
openclaw onboard
|
||||
```
|
||||
<Steps>
|
||||
<Step title="Run onboarding">
|
||||
```bash
|
||||
openclaw onboard
|
||||
```
|
||||
|
||||
Select **Ollama** from the provider list. Onboarding will:
|
||||
Select **Ollama** from the provider list.
|
||||
</Step>
|
||||
<Step title="Choose your mode">
|
||||
- **Cloud + Local** — cloud-hosted models and local models together
|
||||
- **Local** — local models only
|
||||
|
||||
1. Ask for the Ollama base URL where your instance can be reached (default `http://127.0.0.1:11434`).
|
||||
2. Let you choose **Cloud + Local** (cloud models and local models) or **Local** (local models only).
|
||||
3. Open a browser sign-in flow if you choose **Cloud + Local** and are not signed in to ollama.com.
|
||||
4. Discover available models and suggest defaults.
|
||||
5. Auto-pull the selected model if it is not available locally.
|
||||
If you choose **Cloud + Local** and are not signed in to ollama.com, onboarding opens a browser sign-in flow.
|
||||
</Step>
|
||||
<Step title="Select a model">
|
||||
Onboarding discovers available models and suggests defaults. It auto-pulls the selected model if it is not available locally.
|
||||
</Step>
|
||||
<Step title="Verify the model is available">
|
||||
```bash
|
||||
openclaw models list --provider ollama
|
||||
```
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
Non-interactive mode is also supported:
|
||||
### Non-interactive mode
|
||||
|
||||
```bash
|
||||
openclaw onboard --non-interactive \
|
||||
--auth-choice ollama \
|
||||
--accept-risk
|
||||
```
|
||||
```bash
|
||||
openclaw onboard --non-interactive \
|
||||
--auth-choice ollama \
|
||||
--accept-risk
|
||||
```
|
||||
|
||||
Optionally specify a custom base URL or model:
|
||||
Optionally specify a custom base URL or model:
|
||||
|
||||
```bash
|
||||
openclaw onboard --non-interactive \
|
||||
--auth-choice ollama \
|
||||
--custom-base-url "http://ollama-host:11434" \
|
||||
--custom-model-id "qwen3.5:27b" \
|
||||
--accept-risk
|
||||
```
|
||||
```bash
|
||||
openclaw onboard --non-interactive \
|
||||
--auth-choice ollama \
|
||||
--custom-base-url "http://ollama-host:11434" \
|
||||
--custom-model-id "qwen3.5:27b" \
|
||||
--accept-risk
|
||||
```
|
||||
|
||||
### Manual setup
|
||||
</Tab>
|
||||
|
||||
1. Install Ollama: [https://ollama.com/download](https://ollama.com/download)
|
||||
<Tab title="Manual setup">
|
||||
**Best for:** full control over installation, model pulls, and config.
|
||||
|
||||
2. Pull a local model if you want local inference:
|
||||
<Steps>
|
||||
<Step title="Install Ollama">
|
||||
Download from [ollama.com/download](https://ollama.com/download).
|
||||
</Step>
|
||||
<Step title="Pull a local model">
|
||||
```bash
|
||||
ollama pull gemma4
|
||||
# or
|
||||
ollama pull gpt-oss:20b
|
||||
# or
|
||||
ollama pull llama3.3
|
||||
```
|
||||
</Step>
|
||||
<Step title="Sign in for cloud models (optional)">
|
||||
If you want cloud models too:
|
||||
|
||||
```bash
|
||||
ollama pull gemma4
|
||||
# or
|
||||
ollama pull gpt-oss:20b
|
||||
# or
|
||||
ollama pull llama3.3
|
||||
```
|
||||
```bash
|
||||
ollama signin
|
||||
```
|
||||
</Step>
|
||||
<Step title="Enable Ollama for OpenClaw">
|
||||
Set any value for the API key (Ollama does not require a real key):
|
||||
|
||||
3. If you want cloud models too, sign in:
|
||||
```bash
|
||||
# Set environment variable
|
||||
export OLLAMA_API_KEY="ollama-local"
|
||||
|
||||
```bash
|
||||
ollama signin
|
||||
```
|
||||
# Or configure in your config file
|
||||
openclaw config set models.providers.ollama.apiKey "ollama-local"
|
||||
```
|
||||
</Step>
|
||||
<Step title="Inspect and set your model">
|
||||
```bash
|
||||
openclaw models list
|
||||
openclaw models set ollama/gemma4
|
||||
```
|
||||
|
||||
4. Run onboarding and choose `Ollama`:
|
||||
Or set the default in config:
|
||||
|
||||
```bash
|
||||
openclaw onboard
|
||||
```
|
||||
```json5
|
||||
{
|
||||
agents: {
|
||||
defaults: {
|
||||
model: { primary: "ollama/gemma4" },
|
||||
},
|
||||
},
|
||||
}
|
||||
```
|
||||
</Step>
|
||||
</Steps>
|
||||
|
||||
- `Local`: local models only
|
||||
- `Cloud + Local`: local models plus cloud models
|
||||
- Cloud models such as `kimi-k2.5:cloud`, `minimax-m2.7:cloud`, and `glm-5.1:cloud` do **not** require a local `ollama pull`
|
||||
</Tab>
|
||||
</Tabs>
|
||||
|
||||
OpenClaw currently suggests:
|
||||
## Cloud models
|
||||
|
||||
- local default: `gemma4`
|
||||
- cloud defaults: `kimi-k2.5:cloud`, `minimax-m2.7:cloud`, `glm-5.1:cloud`
|
||||
<Tabs>
<Tab title="Cloud + Local">
Cloud models let you run cloud-hosted models alongside your local models. Examples include `kimi-k2.5:cloud`, `minimax-m2.7:cloud`, and `glm-5.1:cloud` -- these do **not** require a local `ollama pull`.

Select **Cloud + Local** mode during setup. The wizard checks whether you are signed in and opens a browser sign-in flow when needed. If authentication cannot be verified, the wizard falls back to local model defaults.

You can also sign in directly at [ollama.com/signin](https://ollama.com/signin).

OpenClaw currently suggests these cloud defaults: `kimi-k2.5:cloud`, `minimax-m2.7:cloud`, `glm-5.1:cloud`.
</Tab>
<Tab title="Local only">
In local-only mode, OpenClaw discovers models from the local Ollama instance. No cloud sign-in is needed.

OpenClaw currently suggests `gemma4` as the local default.
</Tab>
</Tabs>

## Model discovery (implicit provider)

When you set `OLLAMA_API_KEY` (or an auth profile) and **do not** define `models.providers.ollama`, OpenClaw discovers models from the local Ollama instance at `http://127.0.0.1:11434`.
| Behavior             | Detail                                                                                                                                                               |
| -------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Catalog query        | Queries `/api/tags`                                                                                                                                                  |
| Capability detection | Uses best-effort `/api/show` lookups to read `contextWindow` and detect capabilities (including vision)                                                              |
| Vision models        | Models with a `vision` capability reported by `/api/show` are marked as image-capable (`input: ["text", "image"]`), so OpenClaw auto-injects images into the prompt   |
| Reasoning detection  | Marks `reasoning` with a model-name heuristic (`r1`, `reasoning`, `think`)                                                                                           |
| Token limits         | Sets `maxTokens` to the default Ollama max-token cap used by OpenClaw                                                                                                |
| Costs                | Sets all costs to `0`                                                                                                                                                |
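The reasoning-name heuristic above can be sketched as a plain substring check (an illustrative sketch, not OpenClaw's actual code):

```shell
# Sketch of the model-name reasoning heuristic described above:
# a name containing "r1", "reasoning", or "think" is treated as reasoning-capable.
# Illustrative only -- not OpenClaw's actual implementation.
is_reasoning() {
  case "$1" in
    *r1*|*reasoning*|*think*) echo "reasoning" ;;
    *) echo "general" ;;
  esac
}

is_reasoning "deepseek-r1:32b"   # reasoning
is_reasoning "llama3.3"          # general
```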

This avoids manual model entries while keeping the catalog aligned with the local Ollama instance.

To see what models are available:

```bash
ollama list
openclaw models list
```
@@ -142,74 +174,79 @@ ollama pull mistral

The new model will be automatically discovered and available to use.

<Note>
If you set `models.providers.ollama` explicitly, auto-discovery is skipped and you must define models manually. See the explicit config section below.
</Note>
## Configuration

<Tabs>
<Tab title="Basic (implicit discovery)">
The simplest way to enable Ollama is via environment variable:

```bash
export OLLAMA_API_KEY="ollama-local"
```

<Tip>
If `OLLAMA_API_KEY` is set, you can omit `apiKey` in the provider entry and OpenClaw will fill it for availability checks.
</Tip>
</Tab>

<Tab title="Explicit (manual models)">
Use explicit config when Ollama runs on another host/port, you want to force specific context windows or model lists, or you want fully manual model definitions.

```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://ollama-host:11434",
        apiKey: "ollama-local",
        api: "ollama",
        models: [
          {
            id: "gpt-oss:20b",
            name: "GPT-OSS 20B",
            reasoning: false,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 8192,
            maxTokens: 8192 * 10
          }
        ]
      }
    }
  }
}
```

If `OLLAMA_API_KEY` is set, you can omit `apiKey` in the provider entry and OpenClaw will fill it for availability checks.
</Tab>

<Tab title="Custom base URL">
If Ollama is running on a different host or port (explicit config disables auto-discovery, so define models manually):

```json5
{
  models: {
    providers: {
      ollama: {
        apiKey: "ollama-local",
        baseUrl: "http://ollama-host:11434", // No /v1 - use native Ollama API URL
        api: "ollama", // Set explicitly to guarantee native tool-calling behavior
      },
    },
  },
}
```

<Warning>
Do not add `/v1` to the URL. The `/v1` path uses OpenAI-compatible mode, where tool calling is not reliable. Use the base Ollama URL without a path suffix.
</Warning>
</Tab>
</Tabs>
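As a quick guard against the `/v1` mistake called out above, a configured base URL can be checked before use (a hedged sketch; the helper and URLs are illustrative, not part of OpenClaw):

```shell
# Flag baseUrl values that end in /v1 (OpenAI-compatible mode, unreliable tool calling).
# Illustrative helper -- not part of OpenClaw.
check_base_url() {
  case "${1%/}" in            # strip one trailing slash so ".../v1/" also matches
    */v1) echo "openai-compat: tool calling unreliable" ;;
    *) echo "native: ok" ;;
  esac
}

check_base_url "http://ollama-host:11434"      # native: ok
check_base_url "http://ollama-host:11434/v1"   # openai-compat: tool calling unreliable
```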

### Model selection

@@ -228,26 +265,17 @@ Once configured, all your Ollama models are available:
}
```
## Ollama Web Search

OpenClaw supports **Ollama Web Search** as a bundled `web_search` provider.

| Property    | Detail                                                                                                            |
| ----------- | ----------------------------------------------------------------------------------------------------------------- |
| Host        | Uses your configured Ollama host (`models.providers.ollama.baseUrl` when set, otherwise `http://127.0.0.1:11434`) |
| Auth        | Key-free                                                                                                          |
| Requirement | Ollama must be running and signed in with `ollama signin`                                                         |

Choose **Ollama Web Search** during `openclaw onboard` or `openclaw configure --section web`, or set:
```json5
{
@@ -261,120 +289,169 @@ Choose **Ollama Web Search** during `openclaw onboard` or
}
```
<Note>
For the full setup and behavior details, see [Ollama Web Search](/tools/ollama-search).
</Note>

## Advanced configuration

<AccordionGroup>
<Accordion title="Legacy OpenAI-compatible mode">
<Warning>
**Tool calling is not reliable in OpenAI-compatible mode.** Use this mode only if you need OpenAI format for a proxy and do not depend on native tool calling behavior.
</Warning>

If you need to use the OpenAI-compatible endpoint instead (for example, behind a proxy that only supports OpenAI format), set `api: "openai-completions"` explicitly:
```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://ollama-host:11434/v1",
        api: "openai-completions",
        injectNumCtxForOpenAICompat: true, // default: true
        apiKey: "ollama-local",
        models: [...]
      }
    }
  }
}
```
This mode may not support streaming and tool calling simultaneously. You may need to disable streaming with `params: { streaming: false }` in model config.

When `api: "openai-completions"` is used with Ollama, OpenClaw injects `options.num_ctx` by default so Ollama does not silently fall back to a 4096 context window. If your proxy/upstream rejects unknown `options` fields, disable this behavior:
```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://ollama-host:11434/v1",
        api: "openai-completions",
        injectNumCtxForOpenAICompat: false,
        apiKey: "ollama-local",
        models: [...]
      }
    }
  }
}
```
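If streaming plus tool calling misbehaves in this mode, the `params: { streaming: false }` override mentioned above can be applied per model; a minimal sketch (the model id is illustrative):

```json5
{
  models: {
    providers: {
      ollama: {
        baseUrl: "http://ollama-host:11434/v1",
        api: "openai-completions",
        models: [
          {
            id: "gpt-oss:20b", // illustrative model id
            params: { streaming: false },
          },
        ],
      },
    },
  },
}
```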
</Accordion>

<Accordion title="Context windows">
For auto-discovered models, OpenClaw uses the context window reported by Ollama when available, otherwise it falls back to the default Ollama context window used by OpenClaw.

You can override `contextWindow` and `maxTokens` in explicit provider config:
```json5
{
  models: {
    providers: {
      ollama: {
        models: [
          {
            id: "llama3.3",
            contextWindow: 131072,
            maxTokens: 65536,
          }
        ]
      }
    }
  }
}
```
</Accordion>

<Accordion title="Reasoning models">
OpenClaw treats models with names such as `deepseek-r1`, `reasoning`, or `think` as reasoning-capable by default.

```bash
ollama pull deepseek-r1:32b
```

No additional configuration is needed -- OpenClaw marks them automatically.
</Accordion>

<Accordion title="Model costs">
Ollama is free and runs locally, so all model costs are set to $0. This applies to both auto-discovered and manually defined models.
</Accordion>

<Accordion title="Streaming configuration">
OpenClaw's Ollama integration uses the **native Ollama API** (`/api/chat`) by default, which fully supports streaming and tool calling simultaneously. No special configuration is needed.

<Tip>
If you need to use the OpenAI-compatible endpoint, see the "Legacy OpenAI-compatible mode" section above. Streaming and tool calling may not work simultaneously in that mode.
</Tip>
</Accordion>
</AccordionGroup>

## Troubleshooting

<AccordionGroup>
<Accordion title="Ollama not detected">
Make sure Ollama is running and that you set `OLLAMA_API_KEY` (or an auth profile), and that you did **not** define an explicit `models.providers.ollama` entry:

```bash
ollama serve
```

Verify that the API is accessible:

```bash
curl http://localhost:11434/api/tags
```
</Accordion>

<Accordion title="No models available">
If your model is not listed, either pull the model locally or define it explicitly in `models.providers.ollama`.

```bash
ollama list # See what's installed
ollama pull gemma4
ollama pull gpt-oss:20b
ollama pull llama3.3 # Or another model
```
</Accordion>

<Accordion title="Connection refused">
Check that Ollama is running on the correct port:

```bash
# Check if Ollama is running
ps aux | grep ollama

# Or restart Ollama
ollama serve
```
</Accordion>
</AccordionGroup>

<Note>
More help: [Troubleshooting](/help/troubleshooting) and [FAQ](/help/faq).
</Note>

## Related

<CardGroup cols={2}>
<Card title="Model providers" href="/concepts/model-providers" icon="layers">
Overview of all providers, model refs, and failover behavior.
</Card>
<Card title="Model selection" href="/concepts/models" icon="brain">
How to choose and configure models.
</Card>
<Card title="Ollama Web Search" href="/tools/ollama-search" icon="magnifying-glass">
Full setup and behavior details for Ollama-powered web search.
</Card>
<Card title="Configuration" href="/gateway/configuration" icon="gear">
Full config reference.
</Card>
</CardGroup>

@@ -6,11 +6,9 @@ read_when:
title: "Venice AI"
---

# Venice AI

Venice AI provides **privacy-focused AI inference** with support for uncensored models and access to major proprietary models through their anonymized proxy. All inference is private by default — no training on your data, no logging.

## Why Venice in OpenClaw
@@ -19,7 +17,7 @@ Venice AI provides privacy-focused AI inference with support for uncensored mode
- **Anonymized access** to proprietary models (Opus/GPT/Gemini) when quality matters.
- OpenAI-compatible `/v1` endpoints.

## Privacy modes

Venice offers two privacy levels — understanding this is key to choosing your model:
@@ -28,61 +26,67 @@ Venice offers two privacy levels — understanding this is key to choosing your
| **Private**    | Fully private. Prompts/responses are **never stored or logged**. Ephemeral.                                                       | Llama, Qwen, DeepSeek, Kimi, MiniMax, Venice Uncensored, etc. |
| **Anonymized** | Proxied through Venice with metadata stripped. The underlying provider (OpenAI, Anthropic, Google, xAI) sees anonymized requests. | Claude, GPT, Gemini, Grok                                     |

<Warning>
Anonymized models are **not** fully private. Venice strips metadata before forwarding, but the underlying provider (OpenAI, Anthropic, Google, xAI) still processes the request. Choose **Private** models when full privacy is required.
</Warning>

## Features

- **Privacy-focused**: Choose between "private" (fully private) and "anonymized" (proxied) modes
- **Uncensored models**: Access to models without content restrictions
- **Major model access**: Use Claude, GPT, Gemini, and Grok via Venice's anonymized proxy
- **OpenAI-compatible API**: Standard `/v1` endpoints for easy integration
- **Streaming**: Supported on all models
- **Function calling**: Supported on select models (check model capabilities)
- **Vision**: Supported on models with vision capability
- **No hard rate limits**: Fair-use throttling may apply for extreme usage

## Getting started

<Steps>
<Step title="Get your API key">
1. Sign up at [venice.ai](https://venice.ai)
2. Go to **Settings > API Keys > Create new key**
3. Copy your API key (format: `vapi_xxxxxxxxxxxx`)
</Step>
<Step title="Configure OpenClaw">
Choose your preferred setup method:

<Tabs>
<Tab title="Interactive (recommended)">
```bash
openclaw onboard --auth-choice venice-api-key
```

This will:
1. Prompt for your API key (or use existing `VENICE_API_KEY`)
2. Show all available Venice models
3. Let you pick your default model
4. Configure the provider automatically
</Tab>
<Tab title="Environment variable">
```bash
export VENICE_API_KEY="vapi_xxxxxxxxxxxx"
```
</Tab>
<Tab title="Non-interactive">
```bash
openclaw onboard --non-interactive \
  --auth-choice venice-api-key \
  --venice-api-key "vapi_xxxxxxxxxxxx"
```
</Tab>
</Tabs>
</Step>
<Step title="Verify setup">
```bash
openclaw agent --model venice/kimi-k2-5 --message "Hello, are you working?"
```
</Step>
</Steps>

## Model selection

After setup, OpenClaw shows all available Venice models. Pick based on your needs:
@@ -104,13 +108,10 @@ List all available models:
openclaw models list | grep venice
```

You can also run `openclaw configure`, select **Model/auth**, and choose **Venice AI**.

<Tip>
Use the table below to pick the right model for your use case.

| Use Case | Recommended Model | Why |
| -------------------------- | -------------------------------- | -------------------------------------------- |
@@ -122,73 +123,77 @@ openclaw models list | grep venice
| **Complex private tasks** | `deepseek-v3.2` | Strong reasoning, but no Venice tool support |
| **Uncensored** | `venice-uncensored` | No content restrictions |
</Tip>

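Once you have picked a model, it can be pinned as the default; a sketch reusing the `agents.defaults.model` shape from OpenClaw's config (assuming the `venice/` prefix for this provider, with the documented default model as the example):

```json5
{
  agents: {
    defaults: {
      model: { primary: "venice/kimi-k2-5" },
    },
  },
}
```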
## Available models (41 total)

<AccordionGroup>
<Accordion title="Private models (26) — fully private, no logging">
| Model ID                               | Name                                | Context | Features                   |
| -------------------------------------- | ----------------------------------- | ------- | -------------------------- |
| `kimi-k2-5`                            | Kimi K2.5                           | 256k    | Default, reasoning, vision |
| `kimi-k2-thinking`                     | Kimi K2 Thinking                    | 256k    | Reasoning                  |
| `llama-3.3-70b`                        | Llama 3.3 70B                       | 128k    | General                    |
| `llama-3.2-3b`                         | Llama 3.2 3B                        | 128k    | General                    |
| `hermes-3-llama-3.1-405b`              | Hermes 3 Llama 3.1 405B             | 128k    | General, tools disabled    |
| `qwen3-235b-a22b-thinking-2507`        | Qwen3 235B Thinking                 | 128k    | Reasoning                  |
| `qwen3-235b-a22b-instruct-2507`        | Qwen3 235B Instruct                 | 128k    | General                    |
| `qwen3-coder-480b-a35b-instruct`       | Qwen3 Coder 480B                    | 256k    | Coding                     |
| `qwen3-coder-480b-a35b-instruct-turbo` | Qwen3 Coder 480B Turbo              | 256k    | Coding                     |
| `qwen3-5-35b-a3b`                      | Qwen3.5 35B A3B                     | 256k    | Reasoning, vision          |
| `qwen3-next-80b`                       | Qwen3 Next 80B                      | 256k    | General                    |
| `qwen3-vl-235b-a22b`                   | Qwen3 VL 235B (Vision)              | 256k    | Vision                     |
| `qwen3-4b`                             | Venice Small (Qwen3 4B)             | 32k     | Fast, reasoning            |
| `deepseek-v3.2`                        | DeepSeek V3.2                       | 160k    | Reasoning, tools disabled  |
| `venice-uncensored`                    | Venice Uncensored (Dolphin-Mistral) | 32k     | Uncensored, tools disabled |
| `mistral-31-24b`                       | Venice Medium (Mistral)             | 128k    | Vision                     |
| `google-gemma-3-27b-it`                | Google Gemma 3 27B Instruct         | 198k    | Vision                     |
| `openai-gpt-oss-120b`                  | OpenAI GPT OSS 120B                 | 128k    | General                    |
| `nvidia-nemotron-3-nano-30b-a3b`       | NVIDIA Nemotron 3 Nano 30B          | 128k    | General                    |
| `olafangensan-glm-4.7-flash-heretic`   | GLM 4.7 Flash Heretic               | 128k    | Reasoning                  |
| `zai-org-glm-4.6`                      | GLM 4.6                             | 198k    | General                    |
| `zai-org-glm-4.7`                      | GLM 4.7                             | 198k    | Reasoning                  |
| `zai-org-glm-4.7-flash`                | GLM 4.7 Flash                       | 128k    | Reasoning                  |
| `zai-org-glm-5`                        | GLM 5                               | 198k    | Reasoning                  |
| `minimax-m21`                          | MiniMax M2.1                        | 198k    | Reasoning                  |
| `minimax-m25`                          | MiniMax M2.5                        | 198k    | Reasoning                  |
</Accordion>

### Anonymized Models (15) - Via Venice Proxy
|
||||
<Accordion title="Anonymized models (15) — via Venice proxy">
|
||||
| Model ID | Name | Context | Features |
|
||||
| ------------------------------- | ------------------------------ | ------- | ------------------------- |
|
||||
| `claude-opus-4-6` | Claude Opus 4.6 (via Venice) | 1M | Reasoning, vision |
|
||||
| `claude-opus-4-5` | Claude Opus 4.5 (via Venice) | 198k | Reasoning, vision |
|
||||
| `claude-sonnet-4-6` | Claude Sonnet 4.6 (via Venice) | 1M | Reasoning, vision |
|
||||
| `claude-sonnet-4-5` | Claude Sonnet 4.5 (via Venice) | 198k | Reasoning, vision |
|
||||
| `openai-gpt-54` | GPT-5.4 (via Venice) | 1M | Reasoning, vision |
|
||||
| `openai-gpt-53-codex` | GPT-5.3 Codex (via Venice) | 400k | Reasoning, vision, coding |
|
||||
| `openai-gpt-52` | GPT-5.2 (via Venice) | 256k | Reasoning |
|
||||
| `openai-gpt-52-codex` | GPT-5.2 Codex (via Venice) | 256k | Reasoning, vision, coding |
|
||||
| `openai-gpt-4o-2024-11-20` | GPT-4o (via Venice) | 128k | Vision |
|
||||
| `openai-gpt-4o-mini-2024-07-18` | GPT-4o Mini (via Venice) | 128k | Vision |
|
||||
| `gemini-3-1-pro-preview` | Gemini 3.1 Pro (via Venice) | 1M | Reasoning, vision |
|
||||
| `gemini-3-pro-preview` | Gemini 3 Pro (via Venice) | 198k | Reasoning, vision |
|
||||
| `gemini-3-flash-preview` | Gemini 3 Flash (via Venice) | 256k | Reasoning, vision |
|
||||
| `grok-41-fast` | Grok 4.1 Fast (via Venice) | 1M | Reasoning, vision |
|
||||
| `grok-code-fast-1` | Grok Code Fast 1 (via Venice) | 256k | Reasoning, coding |
|
||||
</Accordion>
|
||||
</AccordionGroup>
## Model discovery

OpenClaw automatically discovers models from the Venice API when `VENICE_API_KEY` is set. If the API is unreachable, it falls back to a static catalog.

The `/models` endpoint is public (no auth needed for listing), but inference requires a valid API key.
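You can inspect the live catalog directly; a minimal sketch using `curl` and `jq`, where the `.data[].id` path assumes the usual OpenAI-style list response shape:

```shell
# List model IDs from the public Venice catalog (no API key required)
curl -s https://api.venice.ai/api/v1/models | jq -r '.data[].id'
```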
## Streaming and tool support

| Feature              | Support                                              |
| -------------------- | ---------------------------------------------------- |
| **Streaming**        | All models                                           |
| **Function calling** | Most models (check `supportsFunctionCalling` in API) |
| **Vision/Images**    | Models marked with "Vision" feature                  |
| **JSON mode**        | Supported via `response_format`                      |
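As a concrete sketch of JSON mode, here is a hedged example against the OpenAI-compatible chat completions endpoint; the `kimi-k2-5` model ID and the response shape are assumptions based on the config example on this page:

```shell
# Request JSON-only output via response_format, then extract
# the message text from the OpenAI-style response with jq.
curl -s https://api.venice.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $VENICE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
        "model": "kimi-k2-5",
        "messages": [{"role": "user", "content": "Reply with a JSON object {\"ok\": true}."}],
        "response_format": {"type": "json_object"}
      }' \
  | jq -r '.choices[0].message.content'
```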
## Pricing

Venice uses a credit-based system. Check [venice.ai/pricing](https://venice.ai/pricing) for current rates.

- **Private models**: Generally lower cost
- **Anonymized models**: Similar to direct API pricing + small Venice fee
### Venice (anonymized) vs direct API

| Aspect       | Venice (Anonymized)           | Direct API          |
| ------------ | ----------------------------- | ------------------- |
| **Features** | Most features supported       | Full features       |
| **Billing**  | Venice credits                | Provider billing    |
## Usage examples

```bash
# Use the default private model
openclaw agent --model venice/kimi-k2-5 --message "..."
openclaw agent --model venice/qwen3-coder-480b-a35b-instruct --message "Refactor ..."
```
## Troubleshooting

<AccordionGroup>
<Accordion title="API key not recognized">
```bash
echo $VENICE_API_KEY
openclaw models list | grep venice
```

Ensure the key starts with `vapi_`.
</Accordion>

<Accordion title="Model not available">
The Venice model catalog updates dynamically. Run `openclaw models list` to see currently available models. Some models may be temporarily offline.
</Accordion>

<Accordion title="Connection issues">
Venice API is at `https://api.venice.ai/api/v1`. Ensure your network allows HTTPS connections.
</Accordion>
</AccordionGroup>

<Note>
More help: [Troubleshooting](/help/troubleshooting) and [FAQ](/help/faq).
</Note>
## Advanced configuration

<AccordionGroup>
<Accordion title="Config file example">
```json5
{
  env: { VENICE_API_KEY: "vapi_..." },
  agents: { defaults: { model: { primary: "venice/kimi-k2-5" } } },
  models: {
    mode: "merge",
    providers: {
      venice: {
        baseUrl: "https://api.venice.ai/api/v1",
        apiKey: "${VENICE_API_KEY}",
        api: "openai-completions",
        models: [
          {
            id: "kimi-k2-5",
            name: "Kimi K2.5",
            reasoning: true,
            input: ["text", "image"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 256000,
            maxTokens: 65536,
          },
        ],
      },
    },
  },
}
```
</Accordion>
</AccordionGroup>
## Related
<CardGroup cols={2}>
<Card title="Model selection" href="/concepts/model-providers" icon="layers">
Choosing providers, model refs, and failover behavior.
</Card>
<Card title="Venice AI" href="https://venice.ai" icon="globe">
Venice AI homepage and account signup.
</Card>
<Card title="API documentation" href="https://docs.venice.ai" icon="book">
Venice API reference and developer docs.
</Card>
<Card title="Pricing" href="https://venice.ai/pricing" icon="credit-card">
Current Venice credit rates and plans.
</Card>
</CardGroup>