docs(providers): improve moonshot, qwen, comfy, huggingface, inferrs with Mintlify components

This commit is contained in:
Vincent Koc
2026-04-12 11:10:32 +01:00
parent faae37d38c
commit 2b68af784f
5 changed files with 1029 additions and 589 deletions


@@ -9,13 +9,15 @@ read_when:
# ComfyUI
OpenClaw ships a bundled `comfy` plugin for workflow-driven ComfyUI runs. The plugin is entirely workflow-driven, so OpenClaw does not try to map generic `size`, `aspectRatio`, `resolution`, `durationSeconds`, or TTS-style controls onto your graph.
| Property | Detail |
| --------------- | -------------------------------------------------------------------------------- |
| Provider | `comfy` |
| Models | `comfy/workflow` |
| Shared surfaces | `image_generate`, `video_generate`, `music_generate` |
| Auth | None for local ComfyUI; `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` for Comfy Cloud |
| API | ComfyUI `/prompt` / `/history` / `/view` and Comfy Cloud `/api/*` |
## What it supports
@@ -26,14 +28,140 @@ OpenClaw ships a bundled `comfy` plugin for workflow-driven ComfyUI runs.
- Music or audio generation through the shared `music_generate` tool
- Output download from a configured node or all matching output nodes
## Getting started
Choose between running ComfyUI on your own machine or using Comfy Cloud.
<Tabs>
<Tab title="Local">
**Best for:** running your own ComfyUI instance on your machine or LAN.
<Steps>
<Step title="Start ComfyUI locally">
Make sure your local ComfyUI instance is running (defaults to `http://127.0.0.1:8188`).
</Step>
<Step title="Prepare your workflow JSON">
Export or create a ComfyUI workflow JSON file. Note the node IDs for the prompt input node and the output node you want OpenClaw to read from.
</Step>
<Step title="Configure the provider">
Set `mode: "local"` and point at your workflow file. Here is a minimal image example:
```json5
{
  models: {
    providers: {
      comfy: {
        mode: "local",
        baseUrl: "http://127.0.0.1:8188",
        image: {
          workflowPath: "./workflows/flux-api.json",
          promptNodeId: "6",
          outputNodeId: "9",
        },
      },
    },
  },
}
```
</Step>
<Step title="Set the default model">
Point OpenClaw at the `comfy/workflow` model for the capability you configured:
```json5
{
  agents: {
    defaults: {
      imageGenerationModel: {
        primary: "comfy/workflow",
      },
    },
  },
}
```
</Step>
<Step title="Verify">
```bash
openclaw models list --provider comfy
```
</Step>
</Steps>
</Tab>
<Tab title="Comfy Cloud">
**Best for:** running workflows on Comfy Cloud without managing local GPU resources.
<Steps>
<Step title="Get an API key">
Sign up at [comfy.org](https://comfy.org) and generate an API key from your account dashboard.
</Step>
<Step title="Set the API key">
Provide your key through one of these methods:
```bash
# Environment variable (preferred)
export COMFY_API_KEY="your-key"
# Alternative environment variable
export COMFY_CLOUD_API_KEY="your-key"
# Or inline in config
openclaw config set models.providers.comfy.apiKey "your-key"
```
</Step>
<Step title="Prepare your workflow JSON">
Export or create a ComfyUI workflow JSON file. Note the node IDs for the prompt input node and the output node.
</Step>
<Step title="Configure the provider">
Set `mode: "cloud"` and point at your workflow file:
```json5
{
  models: {
    providers: {
      comfy: {
        mode: "cloud",
        image: {
          workflowPath: "./workflows/flux-api.json",
          promptNodeId: "6",
          outputNodeId: "9",
        },
      },
    },
  },
}
```
<Tip>
Cloud mode defaults `baseUrl` to `https://cloud.comfy.org`. You only need to set `baseUrl` if you use a custom cloud endpoint.
</Tip>
</Step>
<Step title="Set the default model">
```json5
{
  agents: {
    defaults: {
      imageGenerationModel: {
        primary: "comfy/workflow",
      },
    },
  },
}
```
</Step>
<Step title="Verify">
```bash
openclaw models list --provider comfy
```
</Step>
</Steps>
</Tab>
</Tabs>
## Configuration
Comfy supports shared top-level connection settings plus per-capability workflow sections (`image`, `video`, `music`):
```json5
{
@@ -63,139 +191,164 @@ sections:
}
```
### Shared keys
| Key | Type | Description |
| --------------------- | ---------------------- | ------------------------------------------------------------------------------------- |
| `mode` | `"local"` or `"cloud"` | Connection mode. |
| `baseUrl` | string | Defaults to `http://127.0.0.1:8188` for local or `https://cloud.comfy.org` for cloud. |
| `apiKey` | string | Optional inline key, alternative to `COMFY_API_KEY` / `COMFY_CLOUD_API_KEY` env vars. |
| `allowPrivateNetwork` | boolean | Allow a private/LAN `baseUrl` in cloud mode. |
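As a quick illustration, a cloud setup that sets these shared keys explicitly might look like this sketch (the key value is a placeholder):

```json5
{
  models: {
    providers: {
      comfy: {
        mode: "cloud",
        baseUrl: "https://cloud.comfy.org", // default for cloud mode
        apiKey: "your-key", // placeholder; COMFY_API_KEY env var also works
      },
    },
  },
}
```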
### Per-capability keys
These keys apply inside the `image`, `video`, or `music` sections:
| Key | Required | Default | Description |
| ---------------------------- | -------- | -------- | ---------------------------------------------------------------------------- |
| `workflow` or `workflowPath` | Yes | -- | Path to the ComfyUI workflow JSON file. |
| `promptNodeId` | Yes | -- | Node ID that receives the text prompt. |
| `promptInputName` | No | `"text"` | Input name on the prompt node. |
| `outputNodeId` | No | -- | Node ID to read output from. If omitted, all matching output nodes are used. |
| `pollIntervalMs` | No | -- | Polling interval in milliseconds for job completion. |
| `timeoutMs` | No | -- | Timeout in milliseconds for the workflow run. |
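Putting the table together, a fully specified `image` section might look like this sketch (node IDs, path, and timing values are illustrative):

```json5
{
  models: {
    providers: {
      comfy: {
        image: {
          workflowPath: "./workflows/flux-api.json",
          promptNodeId: "6",
          promptInputName: "text", // the default, shown for completeness
          outputNodeId: "9",
          pollIntervalMs: 2000, // check job status every 2 seconds
          timeoutMs: 300000, // fail the run after 5 minutes
        },
      },
    },
  },
}
```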
The `image` and `video` sections also support:
| Key | Required | Default | Description |
| --------------------- | ------------------------------------ | --------- | --------------------------------------------------- |
| `inputImageNodeId` | Yes (when passing a reference image) | -- | Node ID that receives the uploaded reference image. |
| `inputImageInputName` | No | `"image"` | Input name on the image node. |
## Workflow details
<AccordionGroup>
<Accordion title="Image workflows">
Set the default image model to `comfy/workflow`:
```json5
{
  agents: {
    defaults: {
      imageGenerationModel: {
        primary: "comfy/workflow",
      },
    },
  },
}
```

**Reference-image editing example:**

To enable image editing with an uploaded reference image, add `inputImageNodeId` to your image config:

```json5
{
  models: {
    providers: {
      comfy: {
        image: {
          workflowPath: "./workflows/edit-api.json",
          promptNodeId: "6",
          inputImageNodeId: "7",
          inputImageInputName: "image",
          outputNodeId: "9",
        },
      },
    },
  },
}
```
</Accordion>
<Accordion title="Video workflows">
Set the default video model to `comfy/workflow`:
```json5
{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "comfy/workflow",
      },
    },
  },
}
```
Comfy video workflows support text-to-video and image-to-video through the configured graph.
<Note>
OpenClaw does not pass input videos into Comfy workflows. Only text prompts and single reference images are supported as inputs.
</Note>
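The `video` section accepts the same per-capability keys as `image`; a minimal sketch with a placeholder workflow path and node IDs:

```json5
{
  models: {
    providers: {
      comfy: {
        video: {
          workflowPath: "./workflows/text-to-video.json", // placeholder
          promptNodeId: "6",
          outputNodeId: "9",
        },
      },
    },
  },
}
```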
</Accordion>
<Accordion title="Music workflows">
The bundled plugin registers a music-generation provider for workflow-defined audio or music outputs, surfaced through the shared `music_generate` tool:
```text
/tool music_generate prompt="Warm ambient synth loop with soft tape texture"
```
Use the `music` config section to point at your audio workflow JSON and output node.
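For example, a `music` section might look like this sketch (the workflow path and node IDs are placeholders):

```json5
{
  models: {
    providers: {
      comfy: {
        music: {
          workflowPath: "./workflows/audio-gen.json", // placeholder
          promptNodeId: "14",
          outputNodeId: "59",
        },
      },
    },
  },
}
```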
</Accordion>
<Accordion title="Backward compatibility">
Existing top-level image config (without the nested `image` section) still works:
```json5
{
  models: {
    providers: {
      comfy: {
        workflowPath: "./workflows/flux-api.json",
        promptNodeId: "6",
        outputNodeId: "9",
      },
    },
  },
}
```
OpenClaw treats that legacy shape as the image workflow config. You do not need to migrate immediately, but the nested `image` / `video` / `music` sections are recommended for new setups.
<Tip>
If you only use image generation, the legacy flat config and the new nested `image` section are functionally equivalent.
</Tip>
</Accordion>
<Accordion title="Live tests">
Opt-in live coverage exists for the bundled plugin:
```bash
OPENCLAW_LIVE_TEST=1 COMFY_LIVE_TEST=1 pnpm test:live -- extensions/comfy/comfy.live.test.ts
```
The live test skips individual image, video, or music cases unless the matching Comfy workflow section is configured.
</Accordion>
</AccordionGroup>
## Related
<CardGroup cols={2}>
<Card title="Image Generation" href="/tools/image-generation" icon="image">
Image generation tool configuration and usage.
</Card>
<Card title="Video Generation" href="/tools/video-generation" icon="video">
Video generation tool configuration and usage.
</Card>
<Card title="Music Generation" href="/tools/music-generation" icon="music">
Music and audio generation tool setup.
</Card>
<Card title="Provider Directory" href="/providers/index" icon="layers">
Overview of all providers and model refs.
</Card>
<Card title="Configuration Reference" href="/gateway/configuration-reference#agent-defaults" icon="gear">
Full config reference including agent defaults.
</Card>
</CardGroup>


@@ -15,29 +15,49 @@ title: "Hugging Face (Inference)"
- API: OpenAI-compatible (`https://router.huggingface.co/v1`)
- Billing: Single HF token; [pricing](https://huggingface.co/docs/inference-providers/pricing) follows provider rates with a free tier.
## Getting started
<Steps>
<Step title="Create a fine-grained token">
Go to [Hugging Face → Settings → Tokens](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained) and create a new fine-grained token.
<Warning>
The token must have the **Make calls to Inference Providers** permission enabled, or API requests will be rejected.
</Warning>
</Step>
<Step title="Run onboarding">
Choose **Hugging Face** in the provider dropdown, then enter your API key when prompted:
```bash
openclaw onboard --auth-choice huggingface-api-key
```
</Step>
<Step title="Select a default model">
In the **Default Hugging Face model** dropdown, pick the model you want. The list is loaded from the Inference API when you have a valid token; otherwise a built-in list is shown. Your choice is saved as the default model.
You can also set or change the default model later in config:
```json5
{
  agents: {
    defaults: {
      model: { primary: "huggingface/deepseek-ai/DeepSeek-R1" },
    },
  },
}
```
</Step>
<Step title="Verify the model is available">
```bash
openclaw models list --provider huggingface
```
</Step>
</Steps>
### Non-interactive setup
```bash
openclaw onboard --non-interactive \
@@ -48,56 +68,10 @@ openclaw onboard --non-interactive \
This will set `huggingface/deepseek-ai/DeepSeek-R1` as the default model.
## Model IDs
Model refs use the form `huggingface/<org>/<model>` (Hub-style IDs). The list below is from **GET** `https://router.huggingface.co/v1/models`; your catalog may include more.
**Example IDs (from the inference endpoint):**
| Model | Ref (prefix with `huggingface/`) |
| ---------------------- | ----------------------------------- |
| DeepSeek R1 | `deepseek-ai/DeepSeek-R1` |
@@ -111,83 +85,153 @@ Model refs use the form `huggingface/<org>/<model>` (Hub-style IDs). The list be
| GLM 4.7 | `zai-org/GLM-4.7` |
| Kimi K2.5 | `moonshotai/Kimi-K2.5` |
<Tip>
You can append `:fastest` or `:cheapest` to any model id. Set your default order in [Inference Provider settings](https://hf.co/settings/inference-providers); see [Inference Providers](https://huggingface.co/docs/inference-providers) and **GET** `https://router.huggingface.co/v1/models` for the full list.
</Tip>
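For example, to route the default model through the lowest-cost provider:

```json5
{
  agents: {
    defaults: {
      model: { primary: "huggingface/deepseek-ai/DeepSeek-R1:cheapest" },
    },
  },
}
```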
## Advanced details
<AccordionGroup>
<Accordion title="Model discovery and onboarding dropdown">
OpenClaw discovers models by calling the **Inference endpoint directly**:
```bash
GET https://router.huggingface.co/v1/models
```
(Optional: send `Authorization: Bearer $HUGGINGFACE_HUB_TOKEN` or `$HF_TOKEN` for the full list; some endpoints return a subset without auth.) The response is OpenAI-style `{ "object": "list", "data": [ { "id": "Qwen/Qwen3-8B", "owned_by": "Qwen", ... }, ... ] }`.
When you configure a Hugging Face API key (via onboarding, `HUGGINGFACE_HUB_TOKEN`, or `HF_TOKEN`), OpenClaw uses this GET to discover available chat-completion models. During **interactive setup**, after you enter your token you see a **Default Hugging Face model** dropdown populated from that list (or the built-in catalog if the request fails). At runtime (e.g. Gateway startup), when a key is present, OpenClaw again calls **GET** `https://router.huggingface.co/v1/models` to refresh the catalog. The list is merged with a built-in catalog (for metadata like context window and cost). If the request fails or no key is set, only the built-in catalog is used.
</Accordion>
<Accordion title="Model names, aliases, and policy suffixes">
- **Name from API:** The model display name is **hydrated from GET /v1/models** when the API returns `name`, `title`, or `display_name`; otherwise it is derived from the model id (e.g. `deepseek-ai/DeepSeek-R1` becomes "DeepSeek R1").
- **Override display name:** You can set a custom label per model in config so it appears the way you want in the CLI and UI:
```json5
{
  agents: {
    defaults: {
      models: {
        "huggingface/deepseek-ai/DeepSeek-R1": { alias: "DeepSeek R1 (fast)" },
        "huggingface/deepseek-ai/DeepSeek-R1:cheapest": { alias: "DeepSeek R1 (cheap)" },
      },
    },
  },
}
```
- **Policy suffixes:** OpenClaw's bundled Hugging Face docs and helpers currently treat these two suffixes as the built-in policy variants:
- **`:fastest`** — highest throughput.
- **`:cheapest`** — lowest cost per output token.
You can add these as separate entries in `models.providers.huggingface.models` or set `model.primary` with the suffix. You can also set your default provider order in [Inference Provider settings](https://hf.co/settings/inference-providers) (no suffix = use that order).
- **Config merge:** Existing entries in `models.providers.huggingface.models` (e.g. in `models.json`) are kept when config is merged. So any custom `name`, `alias`, or model options you set there are preserved.
</Accordion>
<Accordion title="Environment and daemon setup">
If the Gateway runs as a daemon (launchd/systemd), make sure `HUGGINGFACE_HUB_TOKEN` or `HF_TOKEN` is available to that process (for example, in `~/.openclaw/.env` or via `env.shellEnv`).
<Note>
OpenClaw accepts both `HUGGINGFACE_HUB_TOKEN` and `HF_TOKEN` as env var aliases. Either one works; if both are set, `HUGGINGFACE_HUB_TOKEN` takes precedence.
</Note>
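For example, an entry in `~/.openclaw/.env` (the token value is a placeholder):

```
# ~/.openclaw/.env — read by the Gateway daemon
HUGGINGFACE_HUB_TOKEN=hf_xxxxxxxxxxxxxxxx
```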
</Accordion>
<Accordion title="Config: DeepSeek R1 with Qwen fallback">
```json5
{
  agents: {
    defaults: {
      model: {
        primary: "huggingface/deepseek-ai/DeepSeek-R1",
        fallbacks: ["huggingface/Qwen/Qwen3-8B"],
      },
      models: {
        "huggingface/deepseek-ai/DeepSeek-R1": { alias: "DeepSeek R1" },
        "huggingface/Qwen/Qwen3-8B": { alias: "Qwen3 8B" },
      },
    },
  },
}
```
</Accordion>
<Accordion title="Config: Qwen with cheapest and fastest variants">
```json5
{
  agents: {
    defaults: {
      model: { primary: "huggingface/Qwen/Qwen3-8B" },
      models: {
        "huggingface/Qwen/Qwen3-8B": { alias: "Qwen3 8B" },
        "huggingface/Qwen/Qwen3-8B:cheapest": { alias: "Qwen3 8B (cheapest)" },
        "huggingface/Qwen/Qwen3-8B:fastest": { alias: "Qwen3 8B (fastest)" },
      },
    },
  },
}
```
</Accordion>
<Accordion title="Config: DeepSeek + Llama + GPT-OSS with aliases">
```json5
{
  agents: {
    defaults: {
      model: {
        primary: "huggingface/deepseek-ai/DeepSeek-V3.2",
        fallbacks: [
          "huggingface/meta-llama/Llama-3.3-70B-Instruct",
          "huggingface/openai/gpt-oss-120b",
        ],
      },
      models: {
        "huggingface/deepseek-ai/DeepSeek-V3.2": { alias: "DeepSeek V3.2" },
        "huggingface/meta-llama/Llama-3.3-70B-Instruct": { alias: "Llama 3.3 70B" },
        "huggingface/openai/gpt-oss-120b": { alias: "GPT-OSS 120B" },
      },
    },
  },
}
```
</Accordion>
<Accordion title="Config: Multiple Qwen and DeepSeek with policy suffixes">
```json5
{
  agents: {
    defaults: {
      model: { primary: "huggingface/Qwen/Qwen2.5-7B-Instruct:cheapest" },
      models: {
        "huggingface/Qwen/Qwen2.5-7B-Instruct": { alias: "Qwen2.5 7B" },
        "huggingface/Qwen/Qwen2.5-7B-Instruct:cheapest": { alias: "Qwen2.5 7B (cheap)" },
        "huggingface/deepseek-ai/DeepSeek-R1:fastest": { alias: "DeepSeek R1 (fast)" },
        "huggingface/meta-llama/Llama-3.1-8B-Instruct": { alias: "Llama 3.1 8B" },
      },
    },
  },
}
```
</Accordion>
</AccordionGroup>
## Related
<CardGroup cols={2}>
<Card title="Model providers" href="/concepts/model-providers" icon="layers">
Overview of all providers, model refs, and failover behavior.
</Card>
<Card title="Model selection" href="/concepts/models" icon="brain">
How to choose and configure models.
</Card>
<Card title="Inference Providers docs" href="https://huggingface.co/docs/inference-providers" icon="book">
Official Hugging Face Inference Providers documentation.
</Card>
<Card title="Configuration" href="/gateway/configuration" icon="gear">
Full config reference.
</Card>
</CardGroup>


@@ -16,27 +16,27 @@ OpenAI-compatible `/v1` API. OpenClaw works with `inferrs` through the generic
`inferrs` is currently best treated as a custom self-hosted OpenAI-compatible
backend, not a dedicated OpenClaw provider plugin.
## Getting started
<Steps>
<Step title="Start inferrs with a model">
```bash
inferrs serve google/gemma-4-E2B-it \
--host 127.0.0.1 \
--port 8080 \
--device metal
```
</Step>
<Step title="Verify the server is reachable">
```bash
curl http://127.0.0.1:8080/health
curl http://127.0.0.1:8080/v1/models
```
</Step>
<Step title="Add an OpenClaw provider entry">
Add an explicit provider entry and point your default model at it. See the full config example below.
</Step>
</Steps>
## Full config example
@@ -81,93 +81,130 @@ This example uses Gemma 4 on a local `inferrs` server.
}
```
## Advanced
<AccordionGroup>
<Accordion title="Why requiresStringContent matters">
Some `inferrs` Chat Completions routes accept only string
`messages[].content`, not structured content-part arrays.
If OpenClaw runs fail with an error like:
<Warning>
If OpenClaw runs fail with an error like:
```text
messages[1].content: invalid type: sequence, expected a string
```
set `compat.requiresStringContent: true` in your model entry.
</Warning>
```json5
compat: {
  requiresStringContent: true,
}
```
OpenClaw will flatten pure text content parts into plain strings before sending
the request.
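To make the flattening concrete, here is a small sketch (not OpenClaw's actual code) of what collapsing text content parts into a plain string looks like:

```python
def flatten_content(content):
    """Collapse OpenAI-style content-part arrays into a plain string.

    Illustrative sketch of the requiresStringContent behavior, not
    OpenClaw's actual implementation.
    """
    if isinstance(content, str):
        return content  # already a plain string
    parts = []
    for part in content:
        if part.get("type") != "text":
            # Non-text parts (e.g. images) cannot be flattened.
            raise ValueError("only text parts can be flattened")
        parts.append(part["text"])
    return "".join(parts)

# A structured message becomes a plain-string message:
msg = {"role": "user", "content": [{"type": "text", "text": "What is 2 + 2?"}]}
msg["content"] = flatten_content(msg["content"])
```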
</Accordion>
<Accordion title="Gemma and tool-schema caveat">
Some current `inferrs` + Gemma combinations accept small direct
`/v1/chat/completions` requests but still fail on full OpenClaw agent-runtime
turns.
If that happens, try this first:
```json5
compat: {
  requiresStringContent: true,
  supportsTools: false,
}
```
That disables OpenClaw's tool schema surface for the model and can reduce prompt
pressure on stricter local backends.
If tiny direct requests still work but normal OpenClaw agent turns continue to
crash inside `inferrs`, the remaining issue is usually upstream model/server
behavior rather than OpenClaw's transport layer.
</Accordion>
<Accordion title="Manual smoke test">
Once configured, test both layers:
```bash
curl http://127.0.0.1:8080/v1/chat/completions \
-H 'content-type: application/json' \
-d '{"model":"google/gemma-4-E2B-it","messages":[{"role":"user","content":"What is 2 + 2?"}],"stream":false}'
```
```bash
openclaw infer model run \
--model inferrs/google/gemma-4-E2B-it \
--prompt "What is 2 + 2? Reply with one short sentence." \
--json
```
If the first command works but the second fails, check the troubleshooting section below.
</Accordion>
<Accordion title="Proxy-style behavior">
`inferrs` is treated as a proxy-style OpenAI-compatible `/v1` backend, not a
native OpenAI endpoint.
- Native OpenAI-only request shaping does not apply here
- No `service_tier`, no Responses `store`, no prompt-cache hints, and no
OpenAI reasoning-compat payload shaping
- Hidden OpenClaw attribution headers (`originator`, `version`, `User-Agent`)
are not injected on custom `inferrs` base URLs
</Accordion>
</AccordionGroup>
## Troubleshooting
<AccordionGroup>
<Accordion title="curl /v1/models fails">
`inferrs` is not running, not reachable, or not bound to the expected
host/port. Make sure the server is started and listening on the address you
configured.
</Accordion>
<Accordion title="messages[].content expected a string">
Set `compat.requiresStringContent: true` in the model entry. See the
"Why requiresStringContent matters" accordion above for details.
</Accordion>
<Accordion title="Direct /v1/chat/completions calls pass but openclaw infer model run fails">
Try setting `compat.supportsTools: false` to disable the tool schema surface.
See the Gemma tool-schema caveat above.
</Accordion>
<Accordion title="inferrs still crashes on larger agent turns">
If OpenClaw no longer gets schema errors but `inferrs` still crashes on larger
agent turns, treat it as an upstream `inferrs` or model limitation. Reduce
prompt pressure or switch to a different local backend or model.
</Accordion>
</AccordionGroup>
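Both compat flags above live on the model entry. A minimal sketch, assuming a custom `inferrs` provider entry shaped like the provider config examples elsewhere in these docs (the model id is hypothetical):

```json5
{
  models: {
    providers: {
      inferrs: {
        models: [
          {
            id: "my-local-model", // hypothetical id; use your server's model id
            compat: {
              requiresStringContent: true, // send message content as plain strings
              supportsTools: false, // drop the tool schema surface entirely
            },
          },
        ],
      },
    },
  },
}
```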
<Tip>
For general help, see [Troubleshooting](/help/troubleshooting) and [FAQ](/help/faq).
</Tip>
## See also
<CardGroup cols={2}>
<Card title="Local models" href="/gateway/local-models" icon="server">
Running OpenClaw against local model servers.
</Card>
<Card title="Gateway troubleshooting" href="/gateway/troubleshooting#local-openai-compatible-backend-passes-direct-probes-but-agent-runs-fail" icon="wrench">
Debugging local OpenAI-compatible backends that pass probes but fail agent runs.
</Card>
<Card title="Model providers" href="/concepts/model-providers" icon="layers">
Overview of all providers, model refs, and failover behavior.
</Card>
</CardGroup>

View File

@@ -13,138 +13,215 @@
Moonshot provides the Kimi API with OpenAI-compatible endpoints. Configure the
provider and set the default model to `moonshot/kimi-k2.5`, or use
Kimi Coding with `kimi/kimi-code`.
<Warning>
Moonshot and Kimi Coding are **separate providers**. Keys are not interchangeable, endpoints differ, and model refs differ (`moonshot/...` vs `kimi/...`).
</Warning>
## Built-in model catalog
[//]: # "moonshot-kimi-k2-ids:start"
| Model ref | Name | Reasoning | Input | Context | Max output |
| --------------------------------- | ---------------------- | --------- | ----------- | ------- | ---------- |
| `moonshot/kimi-k2.5` | Kimi K2.5 | No | text, image | 262,144 | 262,144 |
| `moonshot/kimi-k2-thinking` | Kimi K2 Thinking | Yes | text | 262,144 | 262,144 |
| `moonshot/kimi-k2-thinking-turbo` | Kimi K2 Thinking Turbo | Yes | text | 262,144 | 262,144 |
| `moonshot/kimi-k2-turbo` | Kimi K2 Turbo | No | text | 256,000 | 16,384 |
[//]: # "moonshot-kimi-k2-ids:end"
## Getting started
Choose your provider and follow the setup steps.
<Tabs>
<Tab title="Moonshot API">
**Best for:** Kimi K2 models via the Moonshot Open Platform.
<Steps>
<Step title="Choose your endpoint region">
| Auth choice | Endpoint | Region |
| ---------------------- | ------------------------------ | ------------- |
| `moonshot-api-key` | `https://api.moonshot.ai/v1` | International |
| `moonshot-api-key-cn` | `https://api.moonshot.cn/v1` | China |
</Step>
<Step title="Run onboarding">
```bash
openclaw onboard --auth-choice moonshot-api-key
```
Or for the China endpoint:
```bash
openclaw onboard --auth-choice moonshot-api-key-cn
```
</Step>
<Step title="Set a default model">
```json5
{
agents: {
defaults: {
model: { primary: "moonshot/kimi-k2.5" },
},
},
}
```
</Step>
<Step title="Verify models are available">
```bash
openclaw models list --provider moonshot
```
</Step>
</Steps>
### Config example
```json5
{
  env: { MOONSHOT_API_KEY: "sk-..." },
  agents: {
    defaults: {
      model: { primary: "moonshot/kimi-k2.5" },
      models: {
        // moonshot-kimi-k2-aliases:start
        "moonshot/kimi-k2.5": { alias: "Kimi K2.5" },
        "moonshot/kimi-k2-thinking": { alias: "Kimi K2 Thinking" },
        "moonshot/kimi-k2-thinking-turbo": { alias: "Kimi K2 Thinking Turbo" },
        "moonshot/kimi-k2-turbo": { alias: "Kimi K2 Turbo" },
        // moonshot-kimi-k2-aliases:end
      },
    },
  },
  models: {
    mode: "merge",
    providers: {
      moonshot: {
        baseUrl: "https://api.moonshot.ai/v1",
        apiKey: "${MOONSHOT_API_KEY}",
        api: "openai-completions",
        models: [
          // moonshot-kimi-k2-models:start
          {
            id: "kimi-k2.5",
            name: "Kimi K2.5",
            reasoning: false,
            input: ["text", "image"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 262144,
            maxTokens: 262144,
          },
          {
            id: "kimi-k2-thinking",
            name: "Kimi K2 Thinking",
            reasoning: true,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 262144,
            maxTokens: 262144,
          },
          {
            id: "kimi-k2-thinking-turbo",
            name: "Kimi K2 Thinking Turbo",
            reasoning: true,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 262144,
            maxTokens: 262144,
          },
          {
            id: "kimi-k2-turbo",
            name: "Kimi K2 Turbo",
            reasoning: false,
            input: ["text"],
            cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
            contextWindow: 256000,
            maxTokens: 16384,
          },
          // moonshot-kimi-k2-models:end
        ],
      },
    },
  },
}
```
</Tab>

<Tab title="Kimi Coding">
**Best for:** code-focused tasks via the Kimi Coding endpoint.

<Note>
Kimi Coding uses a different API key and provider prefix (`kimi/...`) than Moonshot (`moonshot/...`). Legacy model ref `kimi/k2p5` remains accepted as a compatibility id.
</Note>

<Steps>
<Step title="Run onboarding">
```bash
openclaw onboard --auth-choice kimi-code-api-key
```
</Step>
<Step title="Set a default model">
```json5
{
  agents: {
    defaults: {
      model: { primary: "kimi/kimi-code" },
    },
  },
}
```
</Step>
<Step title="Verify the model is available">
```bash
openclaw models list --provider kimi
```
</Step>
</Steps>

### Config example

```json5
{
  env: { KIMI_API_KEY: "sk-..." },
  agents: {
    defaults: {
      model: { primary: "kimi/kimi-code" },
      models: {
        "kimi/kimi-code": { alias: "Kimi" },
      },
    },
  },
}
```
</Tab>
</Tabs>
## Kimi web search
OpenClaw also ships **Kimi** as a `web_search` provider, backed by Moonshot web
search.
<Steps>
<Step title="Run interactive web search setup">
```bash
openclaw configure --section web
```
Choose **Kimi** in the web-search section to store
`plugins.entries.moonshot.config.webSearch.*`.
</Step>
<Step title="Configure the web search region and model">
Interactive setup prompts for:
| Setting | Options |
| ------------------- | -------------------------------------------------------------------- |
| API region | `https://api.moonshot.ai/v1` (international) or `https://api.moonshot.cn/v1` (China) |
| Web search model | Defaults to `kimi-k2.5` |
</Step>
</Steps>
Config lives under `plugins.entries.moonshot.config.webSearch`:
@@ -173,52 +250,82 @@ Config lives under `plugins.entries.moonshot.config.webSearch`:
}
```
## Advanced
<AccordionGroup>
<Accordion title="Native thinking mode">
Moonshot Kimi supports binary native thinking:
- `thinking: { type: "enabled" }`
- `thinking: { type: "disabled" }`
Configure it per model via `agents.defaults.models.<provider/model>.params`:
```json5
{
  agents: {
    defaults: {
      models: {
        "moonshot/kimi-k2.5": {
          params: {
            thinking: { type: "disabled" },
          },
        },
      },
    },
  },
}
```
OpenClaw also maps runtime `/think` levels for Moonshot:
| `/think` level | Moonshot behavior |
| -------------------- | -------------------------- |
| `/think off` | `thinking.type=disabled` |
| Any non-off level | `thinking.type=enabled` |
<Warning>
When Moonshot thinking is enabled, `tool_choice` must be `auto` or `none`. OpenClaw normalizes incompatible `tool_choice` values to `auto` for compatibility.
</Warning>
</Accordion>
<Accordion title="Streaming usage compatibility">
Native Moonshot endpoints (`https://api.moonshot.ai/v1` and
`https://api.moonshot.cn/v1`) advertise streaming usage compatibility on the
shared `openai-completions` transport. OpenClaw keys that off endpoint
capabilities, so compatible custom provider ids targeting the same native
Moonshot hosts inherit the same streaming-usage behavior.
</Accordion>
<Accordion title="Endpoint and model ref reference">
| Provider | Model ref prefix | Endpoint | Auth env var |
| ---------- | ---------------- | ----------------------------- | ------------------- |
| Moonshot | `moonshot/` | `https://api.moonshot.ai/v1` | `MOONSHOT_API_KEY` |
| Moonshot CN | `moonshot/` | `https://api.moonshot.cn/v1` | `MOONSHOT_API_KEY` |
| Kimi Coding | `kimi/` | Kimi Coding endpoint | `KIMI_API_KEY` |
| Web search | N/A | Same as Moonshot API region | `KIMI_API_KEY` or `MOONSHOT_API_KEY` |
- Kimi web search uses `KIMI_API_KEY` or `MOONSHOT_API_KEY`, and defaults to `https://api.moonshot.ai/v1` with model `kimi-k2.5`.
- Override pricing and context metadata in `models.providers` if needed.
- If Moonshot publishes different context limits for a model, adjust `contextWindow` accordingly.
</Accordion>
</AccordionGroup>
## Related
<CardGroup cols={2}>
<Card title="Model selection" href="/concepts/model-providers" icon="layers">
Choosing providers, model refs, and failover behavior.
</Card>
<Card title="Web search" href="/tools/web-search" icon="magnifying-glass">
Configuring web search providers including Kimi.
</Card>
<Card title="Configuration reference" href="/gateway/configuration-reference" icon="gear">
Full config schema for providers, models, and plugins.
</Card>
<Card title="Moonshot Open Platform" href="https://platform.moonshot.ai" icon="globe">
Moonshot API key management and documentation.
</Card>
</CardGroup>

View File

@@ -17,8 +17,6 @@
background.
</Warning>
## Recommended: Qwen Cloud
OpenClaw now treats Qwen as a first-class bundled provider with canonical id
`qwen`. The bundled provider targets the Qwen Cloud / Alibaba DashScope and
Coding Plan endpoints and keeps legacy `modelstudio` ids working as a
@@ -29,38 +27,108 @@
compatibility alias.
- Also accepted for compatibility: `MODELSTUDIO_API_KEY`, `DASHSCOPE_API_KEY`
- API style: OpenAI-compatible
<Tip>
If you want `qwen3.6-plus`, prefer the **Standard (pay-as-you-go)** endpoint.
Coding Plan support can lag behind the public catalog.
</Tip>
## Getting started

Choose your plan type and follow the setup steps.

<Tabs>
<Tab title="Coding Plan (subscription)">
**Best for:** subscription-based access through the Qwen Coding Plan.
<Steps>
<Step title="Get your API key">
Create or copy an API key from [home.qwencloud.com/api-keys](https://home.qwencloud.com/api-keys).
</Step>
<Step title="Run onboarding">
For the **Global** endpoint:
```bash
openclaw onboard --auth-choice qwen-api-key
```
For the **China** endpoint:
```bash
openclaw onboard --auth-choice qwen-api-key-cn
```
</Step>
<Step title="Set a default model">
```json5
{
agents: {
defaults: {
model: { primary: "qwen/qwen3.5-plus" },
},
},
}
```
</Step>
<Step title="Verify the model is available">
```bash
openclaw models list --provider qwen
```
</Step>
</Steps>
<Note>
Legacy `modelstudio-*` auth-choice ids and `modelstudio/...` model refs still
work as compatibility aliases, but new setup flows should prefer the canonical
`qwen-*` auth-choice ids and `qwen/...` model refs.
</Note>
</Tab>
<Tab title="Standard (pay-as-you-go)">
**Best for:** pay-as-you-go access through the Standard Model Studio endpoint, including models like `qwen3.6-plus` that may not be available on the Coding Plan.
<Steps>
<Step title="Get your API key">
Create or copy an API key from [home.qwencloud.com/api-keys](https://home.qwencloud.com/api-keys).
</Step>
<Step title="Run onboarding">
For the **Global** endpoint:
```bash
openclaw onboard --auth-choice qwen-standard-api-key
```
For the **China** endpoint:
```bash
openclaw onboard --auth-choice qwen-standard-api-key-cn
```
</Step>
<Step title="Set a default model">
```json5
{
agents: {
defaults: {
model: { primary: "qwen/qwen3.5-plus" },
},
},
}
```
</Step>
<Step title="Verify the model is available">
```bash
openclaw models list --provider qwen
```
</Step>
</Steps>
<Note>
Legacy `modelstudio-*` auth-choice ids and `modelstudio/...` model refs still
work as compatibility aliases, but new setup flows should prefer the canonical
`qwen-*` auth-choice ids and `qwen/...` model refs.
</Note>
</Tab>
</Tabs>
## Plan types and endpoints
@@ -75,16 +143,10 @@
The provider auto-selects the endpoint based on your auth choice. Canonical
choices use the `qwen-*` family; `modelstudio-*` remains compatibility-only.
You can override with a custom `baseUrl` in config.
<Tip>
**Manage keys:** [home.qwencloud.com/api-keys](https://home.qwencloud.com/api-keys) |
**Docs:** [docs.qwencloud.com](https://docs.qwencloud.com/developer-guides/getting-started/introduction)
</Tip>
## Built-in catalog
@@ -104,71 +166,20 @@
the Standard endpoint.
| `qwen/glm-4.7` | text | 202,752 | GLM |
| `qwen/kimi-k2.5` | text, image | 262,144 | Moonshot AI via Alibaba |
<Note>
Availability can still vary by endpoint and billing plan even when a model is
present in the bundled catalog.
</Note>
## Multimodal add-ons
The `qwen` extension also exposes multimodal capabilities on the **Standard**
DashScope endpoints (not the Coding Plan endpoints):
- **Video understanding** via `qwen-vl-max-latest`
- **Wan video generation** via `wan2.6-t2v` (default), `wan2.6-i2v`, `wan2.6-r2v`, `wan2.6-r2v-flash`, `wan2.7-r2v`
To use Qwen as the default video provider:
```json5
{
@@ -180,22 +191,110 @@
}
```
<Note>
See [Video Generation](/tools/video-generation) for shared tool parameters, provider selection, and failover behavior.
</Note>
## Advanced
<AccordionGroup>
<Accordion title="Qwen 3.6 Plus availability">
`qwen3.6-plus` is available on the Standard (pay-as-you-go) Model Studio
endpoints:
- China: `dashscope.aliyuncs.com/compatible-mode/v1`
- Global: `dashscope-intl.aliyuncs.com/compatible-mode/v1`
If the Coding Plan endpoints return an "unsupported model" error for
`qwen3.6-plus`, switch to Standard (pay-as-you-go) instead of the Coding Plan
endpoint/key pair.
</Accordion>
<Accordion title="Capability plan">
The `qwen` extension is being positioned as the vendor home for the full Qwen
Cloud surface, not just coding/text models.
- **Text/chat models:** bundled now
- **Tool calling, structured output, thinking:** inherited from the OpenAI-compatible transport
- **Image generation:** planned at the provider-plugin layer
- **Image/video understanding:** bundled now on the Standard endpoint
- **Speech/audio:** planned at the provider-plugin layer
- **Memory embeddings/reranking:** planned through the embedding adapter surface
- **Video generation:** bundled now through the shared video-generation capability
</Accordion>
<Accordion title="Video generation details">
For video generation, OpenClaw maps the configured Qwen region to the matching
DashScope AIGC host before submitting the job:
- Global/Intl: `https://dashscope-intl.aliyuncs.com`
- China: `https://dashscope.aliyuncs.com`
That means a normal `models.providers.qwen.baseUrl` pointing at either the
Coding Plan or Standard Qwen hosts still keeps video generation on the correct
regional DashScope video endpoint.
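For example, pointing the provider at the intl Standard host keeps video jobs on the intl AIGC endpoint (a sketch showing only `baseUrl`; combine with your other provider settings):

```json5
{
  models: {
    providers: {
      qwen: {
        // Standard (pay-as-you-go) intl endpoint; video generation is
        // re-routed to https://dashscope-intl.aliyuncs.com automatically.
        baseUrl: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
      },
    },
  },
}
```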
Current bundled Qwen video-generation limits:
- Up to **1** output video per request
- Up to **1** input image
- Up to **4** input videos
- Up to **10 seconds** duration
- Supports `size`, `aspectRatio`, `resolution`, `audio`, and `watermark`
- Reference image/video mode currently requires **remote http(s) URLs**. Local
file paths are rejected up front because the DashScope video endpoint does not
accept uploaded local buffers for those references.
</Accordion>
<Accordion title="Streaming usage compatibility">
Native Model Studio endpoints advertise streaming usage compatibility on the
shared `openai-completions` transport. OpenClaw keys that off endpoint
capabilities, so DashScope-compatible custom provider ids targeting the
same native hosts inherit the same streaming-usage behavior instead of
requiring the built-in `qwen` provider id specifically.
Native-streaming usage compatibility applies to both the Coding Plan hosts and
the Standard DashScope-compatible hosts:
- `https://coding.dashscope.aliyuncs.com/v1`
- `https://coding-intl.dashscope.aliyuncs.com/v1`
- `https://dashscope.aliyuncs.com/compatible-mode/v1`
- `https://dashscope-intl.aliyuncs.com/compatible-mode/v1`
</Accordion>
<Accordion title="Multimodal endpoint regions">
Multimodal surfaces (video understanding and Wan video generation) use the
**Standard** DashScope endpoints, not the Coding Plan endpoints:
- Global/Intl Standard base URL: `https://dashscope-intl.aliyuncs.com/compatible-mode/v1`
- China Standard base URL: `https://dashscope.aliyuncs.com/compatible-mode/v1`
</Accordion>
<Accordion title="Environment and daemon setup">
If the Gateway runs as a daemon (launchd/systemd), make sure `QWEN_API_KEY` is
available to that process (for example, in `~/.openclaw/.env` or via
`env.shellEnv`).
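One way to persist the key for a daemonized Gateway, using the `~/.openclaw/.env` path mentioned above (the key value is a placeholder):

```shell
# Make QWEN_API_KEY visible to a daemonized Gateway via ~/.openclaw/.env.
mkdir -p "$HOME/.openclaw"
printf 'QWEN_API_KEY=%s\n' 'sk-your-key' >> "$HOME/.openclaw/.env"
```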
</Accordion>
</AccordionGroup>
## Related
<CardGroup cols={2}>
<Card title="Model selection" href="/concepts/model-providers" icon="layers">
Choosing providers, model refs, and failover behavior.
</Card>
<Card title="Video generation" href="/tools/video-generation" icon="video">
Shared video tool parameters and provider selection.
</Card>
<Card title="Alibaba (ModelStudio)" href="/providers/alibaba" icon="cloud">
Legacy ModelStudio provider and migration notes.
</Card>
<Card title="Troubleshooting" href="/help/troubleshooting" icon="wrench">
General troubleshooting and FAQ.
</Card>
</CardGroup>