mirror of https://github.com/moltbot/moltbot.git
synced 2026-04-18 12:14:32 +00:00
docs(providers): improve moonshot, qwen, comfy, huggingface, inferrs with Mintlify components

# ComfyUI

OpenClaw ships a bundled `comfy` plugin for workflow-driven ComfyUI runs. The plugin is entirely workflow-driven, so OpenClaw does not try to map generic `size`, `aspectRatio`, `resolution`, `durationSeconds`, or TTS-style controls onto your graph.

| Property        | Detail                                                                           |
| --------------- | -------------------------------------------------------------------------------- |
| Provider        | `comfy`                                                                          |
| Models          | `comfy/workflow`                                                                 |
| Shared surfaces | `image_generate`, `video_generate`, `music_generate`                             |
| Auth            | None for local ComfyUI; `COMFY_API_KEY` or `COMFY_CLOUD_API_KEY` for Comfy Cloud |
| API             | ComfyUI `/prompt` / `/history` / `/view` and Comfy Cloud `/api/*`                |

## What it supports

- Music or audio generation through the shared `music_generate` tool
- Output download from a configured node or all matching output nodes

## Getting started

Choose between running ComfyUI on your own machine or using Comfy Cloud.

<Tabs>
<Tab title="Local">
**Best for:** running your own ComfyUI instance on your machine or LAN.

<Steps>
<Step title="Start ComfyUI locally">
Make sure your local ComfyUI instance is running (it defaults to `http://127.0.0.1:8188`).
</Step>
<Step title="Prepare your workflow JSON">
Export or create a ComfyUI workflow JSON file. Note the node IDs for the prompt input node and the output node you want OpenClaw to read from.
</Step>
<Step title="Configure the provider">
Set `mode: "local"` and point at your workflow file. Here is a minimal image example:

```json5
{
  models: {
    providers: {
      comfy: {
        mode: "local",
        baseUrl: "http://127.0.0.1:8188",
        image: {
          workflowPath: "./workflows/flux-api.json",
          promptNodeId: "6",
          outputNodeId: "9",
        },
      },
    },
  },
}
```
</Step>
<Step title="Set the default model">
Point OpenClaw at the `comfy/workflow` model for the capability you configured:

```json5
{
  agents: {
    defaults: {
      imageGenerationModel: {
        primary: "comfy/workflow",
      },
    },
  },
}
```
</Step>
<Step title="Verify">
```bash
openclaw models list --provider comfy
```
</Step>
</Steps>

</Tab>

<Tab title="Comfy Cloud">
**Best for:** running workflows on Comfy Cloud without managing local GPU resources.

<Steps>
<Step title="Get an API key">
Sign up at [comfy.org](https://comfy.org) and generate an API key from your account dashboard.
</Step>
<Step title="Set the API key">
Provide your key through one of these methods:

```bash
# Environment variable (preferred)
export COMFY_API_KEY="your-key"

# Alternative environment variable
export COMFY_CLOUD_API_KEY="your-key"

# Or inline in config
openclaw config set models.providers.comfy.apiKey "your-key"
```
</Step>
<Step title="Prepare your workflow JSON">
Export or create a ComfyUI workflow JSON file. Note the node IDs for the prompt input node and the output node.
</Step>
<Step title="Configure the provider">
Set `mode: "cloud"` and point at your workflow file:

```json5
{
  models: {
    providers: {
      comfy: {
        mode: "cloud",
        image: {
          workflowPath: "./workflows/flux-api.json",
          promptNodeId: "6",
          outputNodeId: "9",
        },
      },
    },
  },
}
```

<Tip>
Cloud mode defaults `baseUrl` to `https://cloud.comfy.org`. You only need to set `baseUrl` if you use a custom cloud endpoint.
</Tip>
</Step>
<Step title="Set the default model">
```json5
{
  agents: {
    defaults: {
      imageGenerationModel: {
        primary: "comfy/workflow",
      },
    },
  },
}
```
</Step>
<Step title="Verify">
```bash
openclaw models list --provider comfy
```
</Step>
</Steps>

</Tab>
</Tabs>

## Configuration

Comfy supports shared top-level connection settings plus per-capability workflow sections (`image`, `video`, `music`):

```json5
{
  // …
}
```

### Shared keys

| Key                   | Type                   | Description                                                                           |
| --------------------- | ---------------------- | ------------------------------------------------------------------------------------- |
| `mode`                | `"local"` or `"cloud"` | Connection mode.                                                                       |
| `baseUrl`             | string                 | Defaults to `http://127.0.0.1:8188` for local or `https://cloud.comfy.org` for cloud.  |
| `apiKey`              | string                 | Optional inline key, alternative to `COMFY_API_KEY` / `COMFY_CLOUD_API_KEY` env vars.  |
| `allowPrivateNetwork` | boolean                | Allow a private/LAN `baseUrl` in cloud mode.                                           |

### Per-capability keys

These keys apply inside the `image`, `video`, or `music` sections:

| Key                          | Required | Default  | Description                                                                  |
| ---------------------------- | -------- | -------- | ---------------------------------------------------------------------------- |
| `workflow` or `workflowPath` | Yes      | --       | Path to the ComfyUI workflow JSON file.                                       |
| `promptNodeId`               | Yes      | --       | Node ID that receives the text prompt.                                        |
| `promptInputName`            | No       | `"text"` | Input name on the prompt node.                                                |
| `outputNodeId`               | No       | --       | Node ID to read output from. If omitted, all matching output nodes are used.  |
| `pollIntervalMs`             | No       | --       | Polling interval in milliseconds for job completion.                          |
| `timeoutMs`                  | No       | --       | Timeout in milliseconds for the workflow run.                                 |
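
For orientation, `promptNodeId` and `outputNodeId` refer to the top-level keys of the exported workflow JSON (ComfyUI's API format uses node IDs as object keys). A heavily trimmed sketch; the IDs, class types, and wiring below are placeholders, not values OpenClaw requires:

```json5
{
  // promptNodeId: "6", promptInputName: "text"
  "6": {
    class_type: "CLIPTextEncode",
    inputs: { text: "a red fox in the snow", clip: ["4", 1] },
  },
  // outputNodeId: "9"
  "9": {
    class_type: "SaveImage",
    inputs: { images: ["8", 0] },
  },
}
```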

The `image` and `video` sections also support:

| Key                   | Required                             | Default   | Description                                         |
| --------------------- | ------------------------------------ | --------- | --------------------------------------------------- |
| `inputImageNodeId`    | Yes (when passing a reference image) | --        | Node ID that receives the uploaded reference image. |
| `inputImageInputName` | No                                   | `"image"` | Input name on the image node.                       |

## Workflow details

<AccordionGroup>
<Accordion title="Image workflows">
Set the default image model to `comfy/workflow`:

```json5
{
  agents: {
    defaults: {
      imageGenerationModel: {
        primary: "comfy/workflow",
      },
    },
  },
}
```
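
With that default set, an agent can invoke the shared `image_generate` tool; the invocation pattern mirrors the music example on this page, and the prompt is illustrative:

```text
/tool image_generate prompt="A watercolor lighthouse at dusk"
```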

**Reference-image editing example:**

To enable image editing with an uploaded reference image, add `inputImageNodeId` to your image config:

```json5
{
  models: {
    providers: {
      comfy: {
        image: {
          workflowPath: "./workflows/edit-api.json",
          promptNodeId: "6",
          inputImageNodeId: "7",
          inputImageInputName: "image",
          outputNodeId: "9",
        },
      },
    },
  },
}
```

</Accordion>

<Accordion title="Video workflows">
Set the default video model to `comfy/workflow`:

```json5
{
  agents: {
    defaults: {
      videoGenerationModel: {
        primary: "comfy/workflow",
      },
    },
  },
}
```

Comfy video workflows support text-to-video and image-to-video through the configured graph.

<Note>
OpenClaw does not pass input videos into Comfy workflows. Only text prompts and single reference images are supported as inputs.
</Note>
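
As with the music tool elsewhere on this page, a video run can be triggered through the shared `video_generate` surface; the prompt is illustrative:

```text
/tool video_generate prompt="A slow pan across a foggy harbor at dawn"
```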

</Accordion>

<Accordion title="Music workflows">
The bundled plugin registers a music-generation provider for workflow-defined audio or music outputs, surfaced through the shared `music_generate` tool:

```text
/tool music_generate prompt="Warm ambient synth loop with soft tape texture"
```

Use the `music` config section to point at your audio workflow JSON and output node.
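
A minimal sketch of a `music` section, mirroring the image config shape; the workflow path and node IDs are placeholders for your own graph:

```json5
{
  models: {
    providers: {
      comfy: {
        music: {
          workflowPath: "./workflows/audio-api.json",  // placeholder path
          promptNodeId: "6",
          outputNodeId: "9",
        },
      },
    },
  },
}
```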

</Accordion>

<Accordion title="Backward compatibility">
Existing top-level image config (without the nested `image` section) still works:

```json5
{
  models: {
    providers: {
      comfy: {
        workflowPath: "./workflows/flux-api.json",
        promptNodeId: "6",
        outputNodeId: "9",
      },
    },
  },
}
```

OpenClaw treats that legacy shape as the image workflow config. You do not need to migrate immediately, but the nested `image` / `video` / `music` sections are recommended for new setups.

<Tip>
If you only use image generation, the legacy flat config and the new nested `image` section are functionally equivalent.
</Tip>

</Accordion>

<Accordion title="Live tests">
Opt-in live coverage exists for the bundled plugin:

```bash
OPENCLAW_LIVE_TEST=1 COMFY_LIVE_TEST=1 pnpm test:live -- extensions/comfy/comfy.live.test.ts
```

The live test skips individual image, video, or music cases unless the matching Comfy workflow section is configured.

</Accordion>
</AccordionGroup>

## Related

<CardGroup cols={2}>
<Card title="Image Generation" href="/tools/image-generation" icon="image">
Image generation tool configuration and usage.
</Card>
<Card title="Video Generation" href="/tools/video-generation" icon="video">
Video generation tool configuration and usage.
</Card>
<Card title="Music Generation" href="/tools/music-generation" icon="music">
Music and audio generation tool setup.
</Card>
<Card title="Provider Directory" href="/providers/index" icon="layers">
Overview of all providers and model refs.
</Card>
<Card title="Configuration Reference" href="/gateway/configuration-reference#agent-defaults" icon="gear">
Full config reference including agent defaults.
</Card>
</CardGroup>

# Hugging Face (Inference)

- API: OpenAI-compatible (`https://router.huggingface.co/v1`)
- Billing: Single HF token; [pricing](https://huggingface.co/docs/inference-providers/pricing) follows provider rates with a free tier.

## Getting started

<Steps>
<Step title="Create a fine-grained token">
Go to [Hugging Face Settings Tokens](https://huggingface.co/settings/tokens/new?ownUserPermissions=inference.serverless.write&tokenType=fineGrained) and create a new fine-grained token.

<Warning>
The token must have the **Make calls to Inference Providers** permission enabled or API requests will be rejected.
</Warning>
</Step>
<Step title="Run onboarding">
Choose **Hugging Face** in the provider dropdown, then enter your API key when prompted:

```bash
openclaw onboard --auth-choice huggingface-api-key
```
</Step>
<Step title="Select a default model">
In the **Default Hugging Face model** dropdown, pick the model you want. The list is loaded from the Inference API when you have a valid token; otherwise a built-in list is shown. Your choice is saved as the default model.

You can also set or change the default model later in config:

```json5
{
  agents: {
    defaults: {
      model: { primary: "huggingface/deepseek-ai/DeepSeek-R1" },
    },
  },
}
```

</Step>
<Step title="Verify the model is available">
```bash
openclaw models list --provider huggingface
```
</Step>
</Steps>

### Non-interactive setup

```bash
openclaw onboard --non-interactive \
  …
```

This will set `huggingface/deepseek-ai/DeepSeek-R1` as the default model.

## Model IDs

Model refs use the form `huggingface/<org>/<model>` (Hub-style IDs). The list below is from **GET** `https://router.huggingface.co/v1/models`; your catalog may include more.

**Example IDs (from the inference endpoint):**

| Model                  | Ref (prefix with `huggingface/`)    |
| ---------------------- | ----------------------------------- |
| DeepSeek R1            | `deepseek-ai/DeepSeek-R1`           |
| GLM 4.7                | `zai-org/GLM-4.7`                   |
| Kimi K2.5              | `moonshotai/Kimi-K2.5`              |

<Tip>
You can append `:fastest` or `:cheapest` to any model id. Set your default order in [Inference Provider settings](https://hf.co/settings/inference-providers); see [Inference Providers](https://huggingface.co/docs/inference-providers) and **GET** `https://router.huggingface.co/v1/models` for the full list.
</Tip>

## Advanced details

<AccordionGroup>
<Accordion title="Model discovery and onboarding dropdown">
OpenClaw discovers models by calling the **Inference endpoint directly**:

```bash
GET https://router.huggingface.co/v1/models
```

(Optional: send `Authorization: Bearer $HUGGINGFACE_HUB_TOKEN` or `$HF_TOKEN` for the full list; some endpoints return a subset without auth.) The response is OpenAI-style `{ "object": "list", "data": [ { "id": "Qwen/Qwen3-8B", "owned_by": "Qwen", ... }, ... ] }`.

When you configure a Hugging Face API key (via onboarding, `HUGGINGFACE_HUB_TOKEN`, or `HF_TOKEN`), OpenClaw uses this GET to discover available chat-completion models. During **interactive setup**, after you enter your token you see a **Default Hugging Face model** dropdown populated from that list (or the built-in catalog if the request fails). At runtime (e.g. Gateway startup), when a key is present, OpenClaw again calls **GET** `https://router.huggingface.co/v1/models` to refresh the catalog. The list is merged with a built-in catalog (for metadata like context window and cost). If the request fails or no key is set, only the built-in catalog is used.

</Accordion>

<Accordion title="Model names, aliases, and policy suffixes">
- **Name from API:** The model display name is **hydrated from GET /v1/models** when the API returns `name`, `title`, or `display_name`; otherwise it is derived from the model id (e.g. `deepseek-ai/DeepSeek-R1` becomes "DeepSeek R1").
- **Override display name:** You can set a custom label per model in config so it appears the way you want in the CLI and UI:

  ```json5
  {
    agents: {
      defaults: {
        models: {
          "huggingface/deepseek-ai/DeepSeek-R1": { alias: "DeepSeek R1 (fast)" },
          "huggingface/deepseek-ai/DeepSeek-R1:cheapest": { alias: "DeepSeek R1 (cheap)" },
        },
      },
    },
  }
  ```

- **Policy suffixes:** OpenClaw's bundled Hugging Face docs and helpers currently treat these two suffixes as the built-in policy variants:
  - **`:fastest`** — highest throughput.
  - **`:cheapest`** — lowest cost per output token.

  You can add these as separate entries in `models.providers.huggingface.models` or set `model.primary` with the suffix. You can also set your default provider order in [Inference Provider settings](https://hf.co/settings/inference-providers) (no suffix = use that order).

- **Config merge:** Existing entries in `models.providers.huggingface.models` (e.g. in `models.json`) are kept when config is merged, so any custom `name`, `alias`, or model options you set there are preserved.
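
  For instance, a provider-level entry whose custom fields survive the merge might look like this; the exact shape of these entries is sketched here, not prescribed, and the model id is illustrative:

  ```json5
  {
    models: {
      providers: {
        huggingface: {
          models: {
            // Custom name/alias set here are preserved across config merges.
            "deepseek-ai/DeepSeek-R1": { name: "DeepSeek R1", alias: "R1" },
          },
        },
      },
    },
  }
  ```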

</Accordion>
<Accordion title="Environment and daemon setup">
If the Gateway runs as a daemon (launchd/systemd), make sure `HUGGINGFACE_HUB_TOKEN` or `HF_TOKEN` is available to that process (for example, in `~/.openclaw/.env` or via `env.shellEnv`).

<Note>
OpenClaw accepts both `HUGGINGFACE_HUB_TOKEN` and `HF_TOKEN` as env var aliases. Either one works; if both are set, `HUGGINGFACE_HUB_TOKEN` takes precedence.
</Note>

</Accordion>

<Accordion title="Config: DeepSeek R1 with Qwen fallback">
```json5
{
  agents: {
    defaults: {
      model: {
        primary: "huggingface/deepseek-ai/DeepSeek-R1",
        fallbacks: ["huggingface/Qwen/Qwen3-8B"],
      },
      models: {
        "huggingface/deepseek-ai/DeepSeek-R1": { alias: "DeepSeek R1" },
        "huggingface/Qwen/Qwen3-8B": { alias: "Qwen3 8B" },
      },
    },
  },
}
```
</Accordion>

<Accordion title="Config: Qwen with cheapest and fastest variants">
```json5
{
  agents: {
    defaults: {
      model: { primary: "huggingface/Qwen/Qwen3-8B" },
      models: {
        "huggingface/Qwen/Qwen3-8B": { alias: "Qwen3 8B" },
        "huggingface/Qwen/Qwen3-8B:cheapest": { alias: "Qwen3 8B (cheapest)" },
        "huggingface/Qwen/Qwen3-8B:fastest": { alias: "Qwen3 8B (fastest)" },
      },
    },
  },
}
```
</Accordion>

<Accordion title="Config: DeepSeek + Llama + GPT-OSS with aliases">
```json5
{
  agents: {
    defaults: {
      model: {
        primary: "huggingface/deepseek-ai/DeepSeek-V3.2",
        fallbacks: [
          "huggingface/meta-llama/Llama-3.3-70B-Instruct",
          "huggingface/openai/gpt-oss-120b",
        ],
      },
      models: {
        "huggingface/deepseek-ai/DeepSeek-V3.2": { alias: "DeepSeek V3.2" },
        "huggingface/meta-llama/Llama-3.3-70B-Instruct": { alias: "Llama 3.3 70B" },
        "huggingface/openai/gpt-oss-120b": { alias: "GPT-OSS 120B" },
      },
    },
  },
}
```
</Accordion>

<Accordion title="Config: Multiple Qwen and DeepSeek with policy suffixes">
```json5
{
  agents: {
    defaults: {
      model: { primary: "huggingface/Qwen/Qwen2.5-7B-Instruct:cheapest" },
      models: {
        "huggingface/Qwen/Qwen2.5-7B-Instruct": { alias: "Qwen2.5 7B" },
        "huggingface/Qwen/Qwen2.5-7B-Instruct:cheapest": { alias: "Qwen2.5 7B (cheap)" },
        "huggingface/deepseek-ai/DeepSeek-R1:fastest": { alias: "DeepSeek R1 (fast)" },
        "huggingface/meta-llama/Llama-3.1-8B-Instruct": { alias: "Llama 3.1 8B" },
      },
    },
  },
}
```
</Accordion>
</AccordionGroup>

## Related

<CardGroup cols={2}>
<Card title="Model providers" href="/concepts/model-providers" icon="layers">
Overview of all providers, model refs, and failover behavior.
</Card>
<Card title="Model selection" href="/concepts/models" icon="brain">
How to choose and configure models.
</Card>
<Card title="Inference Providers docs" href="https://huggingface.co/docs/inference-providers" icon="book">
Official Hugging Face Inference Providers documentation.
</Card>
<Card title="Configuration" href="/gateway/configuration" icon="gear">
Full config reference.
</Card>
</CardGroup>

# inferrs

`inferrs` exposes an OpenAI-compatible `/v1` API. It is currently best treated as a custom self-hosted OpenAI-compatible backend, not a dedicated OpenClaw provider plugin.

## Getting started

<Steps>
<Step title="Start inferrs with a model">
```bash
inferrs serve google/gemma-4-E2B-it \
  --host 127.0.0.1 \
  --port 8080 \
  --device metal
```
</Step>
<Step title="Verify the server is reachable">
```bash
curl http://127.0.0.1:8080/health
curl http://127.0.0.1:8080/v1/models
```
</Step>
<Step title="Add an OpenClaw provider entry">
Add an explicit provider entry and point your default model at it. See the full config example below.
</Step>
</Steps>

## Full config example
|
||||
|
||||
@@ -81,93 +81,130 @@ This example uses Gemma 4 on a local `inferrs` server.
}
```

## Why `requiresStringContent` matters
## Advanced

Some `inferrs` Chat Completions routes accept only string
`messages[].content`, not structured content-part arrays.
<AccordionGroup>
<Accordion title="Why requiresStringContent matters">
Some `inferrs` Chat Completions routes accept only string
`messages[].content`, not structured content-part arrays.

If OpenClaw runs fail with an error like:
<Warning>
If OpenClaw runs fail with an error like:

```text
messages[1].content: invalid type: sequence, expected a string
```
```text
messages[1].content: invalid type: sequence, expected a string
```

set:
set `compat.requiresStringContent: true` in your model entry.
</Warning>

```json5
compat: {
requiresStringContent: true
}
```
```json5
compat: {
requiresStringContent: true
}
```

OpenClaw will flatten pure text content parts into plain strings before sending
the request.
OpenClaw will flatten pure text content parts into plain strings before sending
the request.
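
As a placement sketch, the `compat` block sits on the individual model entry inside the provider config. The provider id, base URL, and model id below are assumptions carried over from the quick start above, not a verified schema:

```json5
{
  models: {
    providers: {
      inferrs: {
        // assumed local base URL from the quick start
        baseUrl: "http://127.0.0.1:8080/v1",
        api: "openai-completions",
        models: [
          {
            id: "google/gemma-4-E2B-it",
            // flatten text content parts into plain strings
            compat: { requiresStringContent: true },
          },
        ],
      },
    },
  },
}
```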

## Gemma and tool-schema caveat
</Accordion>

Some current `inferrs` + Gemma combinations accept small direct
`/v1/chat/completions` requests but still fail on full OpenClaw agent-runtime
turns.
<Accordion title="Gemma and tool-schema caveat">
Some current `inferrs` + Gemma combinations accept small direct
`/v1/chat/completions` requests but still fail on full OpenClaw agent-runtime
turns.

If that happens, try this first:
If that happens, try this first:

```json5
compat: {
requiresStringContent: true,
supportsTools: false
}
```
```json5
compat: {
requiresStringContent: true,
supportsTools: false
}
```

That disables OpenClaw's tool schema surface for the model and can reduce prompt
pressure on stricter local backends.
That disables OpenClaw's tool schema surface for the model and can reduce prompt
pressure on stricter local backends.

If tiny direct requests still work but normal OpenClaw agent turns continue to
crash inside `inferrs`, the remaining issue is usually upstream model/server
behavior rather than OpenClaw's transport layer.
If tiny direct requests still work but normal OpenClaw agent turns continue to
crash inside `inferrs`, the remaining issue is usually upstream model/server
behavior rather than OpenClaw's transport layer.

## Manual smoke test
</Accordion>

Once configured, test both layers:
<Accordion title="Manual smoke test">
Once configured, test both layers:

```bash
curl http://127.0.0.1:8080/v1/chat/completions \
-H 'content-type: application/json' \
-d '{"model":"google/gemma-4-E2B-it","messages":[{"role":"user","content":"What is 2 + 2?"}],"stream":false}'
```bash
curl http://127.0.0.1:8080/v1/chat/completions \
-H 'content-type: application/json' \
-d '{"model":"google/gemma-4-E2B-it","messages":[{"role":"user","content":"What is 2 + 2?"}],"stream":false}'
```

openclaw infer model run \
--model inferrs/google/gemma-4-E2B-it \
--prompt "What is 2 + 2? Reply with one short sentence." \
--json
```
```bash
openclaw infer model run \
--model inferrs/google/gemma-4-E2B-it \
--prompt "What is 2 + 2? Reply with one short sentence." \
--json
```

If the first command works but the second fails, use the troubleshooting notes
below.
If the first command works but the second fails, check the troubleshooting section below.

</Accordion>

<Accordion title="Proxy-style behavior">
`inferrs` is treated as a proxy-style OpenAI-compatible `/v1` backend, not a
native OpenAI endpoint.

- Native OpenAI-only request shaping does not apply here
- No `service_tier`, no Responses `store`, no prompt-cache hints, and no
OpenAI reasoning-compat payload shaping
- Hidden OpenClaw attribution headers (`originator`, `version`, `User-Agent`)
are not injected on custom `inferrs` base URLs

</Accordion>
</AccordionGroup>

## Troubleshooting

- `curl /v1/models` fails: `inferrs` is not running, not reachable, or not
bound to the expected host/port.
- `messages[].content ... expected a string`: set
`compat.requiresStringContent: true`.
- Direct tiny `/v1/chat/completions` calls pass, but `openclaw infer model run`
fails: try `compat.supportsTools: false`.
- OpenClaw no longer gets schema errors, but `inferrs` still crashes on larger
agent turns: treat it as an upstream `inferrs` or model limitation and reduce
prompt pressure or switch local backend/model.
<AccordionGroup>
<Accordion title="curl /v1/models fails">
`inferrs` is not running, not reachable, or not bound to the expected
host/port. Make sure the server is started and listening on the address you
configured.
</Accordion>

## Proxy-style behavior
<Accordion title="messages[].content expected a string">
Set `compat.requiresStringContent: true` in the model entry. See the
`requiresStringContent` section above for details.
</Accordion>

`inferrs` is treated as a proxy-style OpenAI-compatible `/v1` backend, not a
native OpenAI endpoint.
<Accordion title="Direct /v1/chat/completions calls pass but openclaw infer model run fails">
Try setting `compat.supportsTools: false` to disable the tool schema surface.
See the Gemma tool-schema caveat above.
</Accordion>

- native OpenAI-only request shaping does not apply here
- no `service_tier`, no Responses `store`, no prompt-cache hints, and no
OpenAI reasoning-compat payload shaping
- hidden OpenClaw attribution headers (`originator`, `version`, `User-Agent`)
are not injected on custom `inferrs` base URLs
<Accordion title="inferrs still crashes on larger agent turns">
If OpenClaw no longer gets schema errors but `inferrs` still crashes on larger
agent turns, treat it as an upstream `inferrs` or model limitation. Reduce
prompt pressure or switch to a different local backend or model.
</Accordion>
</AccordionGroup>

<Tip>
For general help, see [Troubleshooting](/help/troubleshooting) and [FAQ](/help/faq).
</Tip>

## See also

- [Local models](/gateway/local-models)
- [Gateway troubleshooting](/gateway/troubleshooting#local-openai-compatible-backend-passes-direct-probes-but-agent-runs-fail)
- [Model providers](/concepts/model-providers)
<CardGroup cols={2}>
<Card title="Local models" href="/gateway/local-models" icon="server">
Running OpenClaw against local model servers.
</Card>
<Card title="Gateway troubleshooting" href="/gateway/troubleshooting#local-openai-compatible-backend-passes-direct-probes-but-agent-runs-fail" icon="wrench">
Debugging local OpenAI-compatible backends that pass probes but fail agent runs.
</Card>
<Card title="Model providers" href="/concepts/model-providers" icon="layers">
Overview of all providers, model refs, and failover behavior.
</Card>
</CardGroup>

@@ -13,138 +13,215 @@ Moonshot provides the Kimi API with OpenAI-compatible endpoints. Configure the
provider and set the default model to `moonshot/kimi-k2.5`, or use
Kimi Coding with `kimi/kimi-code`.

Current Kimi K2 model IDs:
<Warning>
Moonshot and Kimi Coding are **separate providers**. Keys are not interchangeable, endpoints differ, and model refs differ (`moonshot/...` vs `kimi/...`).
</Warning>

## Built-in model catalog

[//]: # "moonshot-kimi-k2-ids:start"

- `kimi-k2.5`
- `kimi-k2-thinking`
- `kimi-k2-thinking-turbo`
- `kimi-k2-turbo`
| Model ref | Name | Reasoning | Input | Context | Max output |
| --------------------------------- | ---------------------- | --------- | ----------- | ------- | ---------- |
| `moonshot/kimi-k2.5` | Kimi K2.5 | No | text, image | 262,144 | 262,144 |
| `moonshot/kimi-k2-thinking` | Kimi K2 Thinking | Yes | text | 262,144 | 262,144 |
| `moonshot/kimi-k2-thinking-turbo` | Kimi K2 Thinking Turbo | Yes | text | 262,144 | 262,144 |
| `moonshot/kimi-k2-turbo` | Kimi K2 Turbo | No | text | 256,000 | 16,384 |

[//]: # "moonshot-kimi-k2-ids:end"

```bash
openclaw onboard --auth-choice moonshot-api-key
# or
openclaw onboard --auth-choice moonshot-api-key-cn
```
## Getting started

Kimi Coding:
Choose your provider and follow the setup steps.

```bash
openclaw onboard --auth-choice kimi-code-api-key
```
<Tabs>
<Tab title="Moonshot API">
**Best for:** Kimi K2 models via the Moonshot Open Platform.

Note: Moonshot and Kimi Coding are separate providers. Keys are not interchangeable, endpoints differ, and model refs differ (Moonshot uses `moonshot/...`, Kimi Coding uses `kimi/...`).
<Steps>
<Step title="Choose your endpoint region">
| Auth choice | Endpoint | Region |
| ---------------------- | ------------------------------ | ------------- |
| `moonshot-api-key` | `https://api.moonshot.ai/v1` | International |
| `moonshot-api-key-cn` | `https://api.moonshot.cn/v1` | China |
</Step>
<Step title="Run onboarding">
```bash
openclaw onboard --auth-choice moonshot-api-key
```

Kimi web search uses the Moonshot plugin too:
Or for the China endpoint:

```bash
openclaw configure --section web
```
```bash
openclaw onboard --auth-choice moonshot-api-key-cn
```
</Step>
<Step title="Set a default model">
```json5
{
agents: {
defaults: {
model: { primary: "moonshot/kimi-k2.5" },
},
},
}
```
</Step>
<Step title="Verify models are available">
```bash
openclaw models list --provider moonshot
```
</Step>
</Steps>

Choose **Kimi** in the web-search section to store
`plugins.entries.moonshot.config.webSearch.*`.
### Config example

## Config snippet (Moonshot API)

```json5
{
env: { MOONSHOT_API_KEY: "sk-..." },
agents: {
defaults: {
model: { primary: "moonshot/kimi-k2.5" },
```json5
{
env: { MOONSHOT_API_KEY: "sk-..." },
agents: {
defaults: {
model: { primary: "moonshot/kimi-k2.5" },
models: {
// moonshot-kimi-k2-aliases:start
"moonshot/kimi-k2.5": { alias: "Kimi K2.5" },
"moonshot/kimi-k2-thinking": { alias: "Kimi K2 Thinking" },
"moonshot/kimi-k2-thinking-turbo": { alias: "Kimi K2 Thinking Turbo" },
"moonshot/kimi-k2-turbo": { alias: "Kimi K2 Turbo" },
// moonshot-kimi-k2-aliases:end
},
},
},
models: {
// moonshot-kimi-k2-aliases:start
"moonshot/kimi-k2.5": { alias: "Kimi K2.5" },
"moonshot/kimi-k2-thinking": { alias: "Kimi K2 Thinking" },
"moonshot/kimi-k2-thinking-turbo": { alias: "Kimi K2 Thinking Turbo" },
"moonshot/kimi-k2-turbo": { alias: "Kimi K2 Turbo" },
// moonshot-kimi-k2-aliases:end
mode: "merge",
providers: {
moonshot: {
baseUrl: "https://api.moonshot.ai/v1",
apiKey: "${MOONSHOT_API_KEY}",
api: "openai-completions",
models: [
// moonshot-kimi-k2-models:start
{
id: "kimi-k2.5",
name: "Kimi K2.5",
reasoning: false,
input: ["text", "image"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 262144,
maxTokens: 262144,
},
{
id: "kimi-k2-thinking",
name: "Kimi K2 Thinking",
reasoning: true,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 262144,
maxTokens: 262144,
},
{
id: "kimi-k2-thinking-turbo",
name: "Kimi K2 Thinking Turbo",
reasoning: true,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 262144,
maxTokens: 262144,
},
{
id: "kimi-k2-turbo",
name: "Kimi K2 Turbo",
reasoning: false,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 256000,
maxTokens: 16384,
},
// moonshot-kimi-k2-models:end
],
},
},
},
},
},
models: {
mode: "merge",
providers: {
moonshot: {
baseUrl: "https://api.moonshot.ai/v1",
apiKey: "${MOONSHOT_API_KEY}",
api: "openai-completions",
models: [
// moonshot-kimi-k2-models:start
{
id: "kimi-k2.5",
name: "Kimi K2.5",
reasoning: false,
input: ["text", "image"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 262144,
maxTokens: 262144,
},
{
id: "kimi-k2-thinking",
name: "Kimi K2 Thinking",
reasoning: true,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 262144,
maxTokens: 262144,
},
{
id: "kimi-k2-thinking-turbo",
name: "Kimi K2 Thinking Turbo",
reasoning: true,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 262144,
maxTokens: 262144,
},
{
id: "kimi-k2-turbo",
name: "Kimi K2 Turbo",
reasoning: false,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 256000,
maxTokens: 16384,
},
// moonshot-kimi-k2-models:end
],
},
},
},
}
```
}
```

## Kimi Coding
</Tab>

```json5
{
env: { KIMI_API_KEY: "sk-..." },
agents: {
defaults: {
model: { primary: "kimi/kimi-code" },
models: {
"kimi/kimi-code": { alias: "Kimi" },
<Tab title="Kimi Coding">
**Best for:** code-focused tasks via the Kimi Coding endpoint.

<Note>
Kimi Coding uses a different API key and provider prefix (`kimi/...`) than Moonshot (`moonshot/...`). Legacy model ref `kimi/k2p5` remains accepted as a compatibility id.
</Note>

<Steps>
<Step title="Run onboarding">
```bash
openclaw onboard --auth-choice kimi-code-api-key
```
</Step>
<Step title="Set a default model">
```json5
{
agents: {
defaults: {
model: { primary: "kimi/kimi-code" },
},
},
}
```
</Step>
<Step title="Verify the model is available">
```bash
openclaw models list --provider kimi
```
</Step>
</Steps>

### Config example

```json5
{
env: { KIMI_API_KEY: "sk-..." },
agents: {
defaults: {
model: { primary: "kimi/kimi-code" },
models: {
"kimi/kimi-code": { alias: "Kimi" },
},
},
},
},
},
}
```
}
```

</Tab>
</Tabs>

## Kimi web search

OpenClaw also ships **Kimi** as a `web_search` provider, backed by Moonshot web
search.

Interactive setup can prompt for:
<Steps>
<Step title="Run interactive web search setup">
```bash
openclaw configure --section web
```

- the Moonshot API region:
- `https://api.moonshot.ai/v1`
- `https://api.moonshot.cn/v1`
- the default Kimi web-search model (defaults to `kimi-k2.5`)
Choose **Kimi** in the web-search section to store
`plugins.entries.moonshot.config.webSearch.*`.

</Step>
<Step title="Configure the web search region and model">
Interactive setup prompts for:

| Setting | Options |
| ------------------- | -------------------------------------------------------------------- |
| API region | `https://api.moonshot.ai/v1` (international) or `https://api.moonshot.cn/v1` (China) |
| Web search model | Defaults to `kimi-k2.5` |

</Step>
</Steps>

Config lives under `plugins.entries.moonshot.config.webSearch`:

@@ -173,52 +250,82 @@ Config lives under `plugins.entries.moonshot.config.webSearch`:
}
```

## Notes
## Advanced

- Moonshot model refs use `moonshot/<modelId>`. Kimi Coding model refs use `kimi/<modelId>`.
- Current Kimi Coding default model ref is `kimi/kimi-code`. Legacy `kimi/k2p5` remains accepted as a compatibility model id.
- Kimi web search uses `KIMI_API_KEY` or `MOONSHOT_API_KEY`, and defaults to `https://api.moonshot.ai/v1` with model `kimi-k2.5`.
- Native Moonshot endpoints (`https://api.moonshot.ai/v1` and
`https://api.moonshot.cn/v1`) advertise streaming usage compatibility on the
shared `openai-completions` transport. OpenClaw now keys that off endpoint
capabilities, so compatible custom provider ids targeting the same native
Moonshot hosts inherit the same streaming-usage behavior.
- Override pricing and context metadata in `models.providers` if needed.
- If Moonshot publishes different context limits for a model, adjust
`contextWindow` accordingly.
- Use `https://api.moonshot.ai/v1` for the international endpoint, and `https://api.moonshot.cn/v1` for the China endpoint.
- Onboarding choices:
- `moonshot-api-key` for `https://api.moonshot.ai/v1`
- `moonshot-api-key-cn` for `https://api.moonshot.cn/v1`
<AccordionGroup>
<Accordion title="Native thinking mode">
Moonshot Kimi supports binary native thinking:

## Native thinking mode (Moonshot)
- `thinking: { type: "enabled" }`
- `thinking: { type: "disabled" }`

Moonshot Kimi supports binary native thinking:
Configure it per model via `agents.defaults.models.<provider/model>.params`:

- `thinking: { type: "enabled" }`
- `thinking: { type: "disabled" }`

Configure it per model via `agents.defaults.models.<provider/model>.params`:

```json5
{
agents: {
defaults: {
models: {
"moonshot/kimi-k2.5": {
params: {
thinking: { type: "disabled" },
```json5
{
agents: {
defaults: {
models: {
"moonshot/kimi-k2.5": {
params: {
thinking: { type: "disabled" },
},
},
},
},
},
},
},
}
```
}
```

OpenClaw also maps runtime `/think` levels for Moonshot:
OpenClaw also maps runtime `/think` levels for Moonshot:

- `/think off` -> `thinking.type=disabled`
- any non-off thinking level -> `thinking.type=enabled`
| `/think` level | Moonshot behavior |
| -------------------- | -------------------------- |
| `/think off` | `thinking.type=disabled` |
| Any non-off level | `thinking.type=enabled` |

When Moonshot thinking is enabled, `tool_choice` must be `auto` or `none`. OpenClaw normalizes incompatible `tool_choice` values to `auto` for compatibility.
<Warning>
When Moonshot thinking is enabled, `tool_choice` must be `auto` or `none`. OpenClaw normalizes incompatible `tool_choice` values to `auto` for compatibility.
</Warning>

</Accordion>

<Accordion title="Streaming usage compatibility">
Native Moonshot endpoints (`https://api.moonshot.ai/v1` and
`https://api.moonshot.cn/v1`) advertise streaming usage compatibility on the
shared `openai-completions` transport. OpenClaw keys that off endpoint
capabilities, so compatible custom provider ids targeting the same native
Moonshot hosts inherit the same streaming-usage behavior.
</Accordion>

<Accordion title="Endpoint and model ref reference">
| Provider | Model ref prefix | Endpoint | Auth env var |
| ---------- | ---------------- | ----------------------------- | ------------------- |
| Moonshot | `moonshot/` | `https://api.moonshot.ai/v1` | `MOONSHOT_API_KEY` |
| Moonshot CN| `moonshot/` | `https://api.moonshot.cn/v1` | `MOONSHOT_API_KEY` |
| Kimi Coding| `kimi/` | Kimi Coding endpoint | `KIMI_API_KEY` |
| Web search | N/A | Same as Moonshot API region | `KIMI_API_KEY` or `MOONSHOT_API_KEY` |

- Kimi web search uses `KIMI_API_KEY` or `MOONSHOT_API_KEY`, and defaults to `https://api.moonshot.ai/v1` with model `kimi-k2.5`.
- Override pricing and context metadata in `models.providers` if needed.
- If Moonshot publishes different context limits for a model, adjust `contextWindow` accordingly.

</Accordion>
</AccordionGroup>

## Related

<CardGroup cols={2}>
<Card title="Model selection" href="/concepts/model-providers" icon="layers">
Choosing providers, model refs, and failover behavior.
</Card>
<Card title="Web search" href="/tools/web-search" icon="magnifying-glass">
Configuring web search providers including Kimi.
</Card>
<Card title="Configuration reference" href="/gateway/configuration-reference" icon="gear">
Full config schema for providers, models, and plugins.
</Card>
<Card title="Moonshot Open Platform" href="https://platform.moonshot.ai" icon="globe">
Moonshot API key management and documentation.
</Card>
</CardGroup>

@@ -17,8 +17,6 @@ background.

</Warning>

## Recommended: Qwen Cloud

OpenClaw now treats Qwen as a first-class bundled provider with canonical id
`qwen`. The bundled provider targets the Qwen Cloud / Alibaba DashScope and
Coding Plan endpoints and keeps legacy `modelstudio` ids working as a
@@ -29,38 +27,108 @@ compatibility alias.
- Also accepted for compatibility: `MODELSTUDIO_API_KEY`, `DASHSCOPE_API_KEY`
- API style: OpenAI-compatible

<Tip>
If you want `qwen3.6-plus`, prefer the **Standard (pay-as-you-go)** endpoint.
Coding Plan support can lag behind the public catalog.
</Tip>

```bash
# Global Coding Plan endpoint
openclaw onboard --auth-choice qwen-api-key
## Getting started

# China Coding Plan endpoint
openclaw onboard --auth-choice qwen-api-key-cn
Choose your plan type and follow the setup steps.

# Global Standard (pay-as-you-go) endpoint
openclaw onboard --auth-choice qwen-standard-api-key
<Tabs>
<Tab title="Coding Plan (subscription)">
**Best for:** subscription-based access through the Qwen Coding Plan.

# China Standard (pay-as-you-go) endpoint
openclaw onboard --auth-choice qwen-standard-api-key-cn
```
<Steps>
<Step title="Get your API key">
Create or copy an API key from [home.qwencloud.com/api-keys](https://home.qwencloud.com/api-keys).
</Step>
<Step title="Run onboarding">
For the **Global** endpoint:

Legacy `modelstudio-*` auth-choice ids and `modelstudio/...` model refs still
work as compatibility aliases, but new setup flows should prefer the canonical
`qwen-*` auth-choice ids and `qwen/...` model refs.
```bash
openclaw onboard --auth-choice qwen-api-key
```

After onboarding, set a default model:
For the **China** endpoint:

```json5
{
agents: {
defaults: {
model: { primary: "qwen/qwen3.5-plus" },
},
},
}
```
```bash
openclaw onboard --auth-choice qwen-api-key-cn
```
</Step>
<Step title="Set a default model">
```json5
{
agents: {
defaults: {
model: { primary: "qwen/qwen3.5-plus" },
},
},
}
```
</Step>
<Step title="Verify the model is available">
```bash
openclaw models list --provider qwen
```
</Step>
</Steps>

<Note>
Legacy `modelstudio-*` auth-choice ids and `modelstudio/...` model refs still
work as compatibility aliases, but new setup flows should prefer the canonical
`qwen-*` auth-choice ids and `qwen/...` model refs.
</Note>

</Tab>

<Tab title="Standard (pay-as-you-go)">
**Best for:** pay-as-you-go access through the Standard Model Studio endpoint, including models like `qwen3.6-plus` that may not be available on the Coding Plan.

<Steps>
<Step title="Get your API key">
Create or copy an API key from [home.qwencloud.com/api-keys](https://home.qwencloud.com/api-keys).
</Step>
<Step title="Run onboarding">
For the **Global** endpoint:

```bash
openclaw onboard --auth-choice qwen-standard-api-key
```

For the **China** endpoint:

```bash
openclaw onboard --auth-choice qwen-standard-api-key-cn
```
</Step>
<Step title="Set a default model">
```json5
{
agents: {
defaults: {
model: { primary: "qwen/qwen3.5-plus" },
},
},
}
```
</Step>
<Step title="Verify the model is available">
```bash
openclaw models list --provider qwen
```
</Step>
</Steps>

<Note>
Legacy `modelstudio-*` auth-choice ids and `modelstudio/...` model refs still
work as compatibility aliases, but new setup flows should prefer the canonical
`qwen-*` auth-choice ids and `qwen/...` model refs.
</Note>

</Tab>
</Tabs>

## Plan types and endpoints

@@ -75,16 +143,10 @@ The provider auto-selects the endpoint based on your auth choice. Canonical
|
||||
choices use the `qwen-*` family; `modelstudio-*` remains compatibility-only.
|
||||
You can override with a custom `baseUrl` in config.
|
||||
|
||||
Native Model Studio endpoints advertise streaming usage compatibility on the
|
||||
shared `openai-completions` transport. OpenClaw keys that off endpoint
|
||||
capabilities now, so DashScope-compatible custom provider ids targeting the
|
||||
same native hosts inherit the same streaming-usage behavior instead of
|
||||
requiring the built-in `qwen` provider id specifically.
|
||||
|
||||
## Get your API key
|
||||
|
||||
- **Manage keys**: [home.qwencloud.com/api-keys](https://home.qwencloud.com/api-keys)
|
||||
- **Docs**: [docs.qwencloud.com](https://docs.qwencloud.com/developer-guides/getting-started/introduction)
|
||||
<Tip>
|
||||
**Manage keys:** [home.qwencloud.com/api-keys](https://home.qwencloud.com/api-keys) |
|
||||
**Docs:** [docs.qwencloud.com](https://docs.qwencloud.com/developer-guides/getting-started/introduction)
|
||||
</Tip>
|
||||
|
||||
## Built-in catalog
|
||||
|
||||
@@ -104,71 +166,20 @@ the Standard endpoint.
|
||||
| `qwen/glm-4.7` | text | 202,752 | GLM |
|
||||
| `qwen/kimi-k2.5` | text, image | 262,144 | Moonshot AI via Alibaba |
|
||||
|
||||
<Note>
|
||||
Availability can still vary by endpoint and billing plan even when a model is
|
||||
present in the bundled catalog.
|
||||
|
||||
Native-streaming usage compatibility applies to both the Coding Plan hosts and
|
||||
the Standard DashScope-compatible hosts:
|
||||
|
||||
- `https://coding.dashscope.aliyuncs.com/v1`
|
||||
- `https://coding-intl.dashscope.aliyuncs.com/v1`
|
||||
- `https://dashscope.aliyuncs.com/compatible-mode/v1`
|
||||
- `https://dashscope-intl.aliyuncs.com/compatible-mode/v1`
|
||||
|
||||
## Qwen 3.6 Plus availability
|
||||
|
||||
`qwen3.6-plus` is available on the Standard (pay-as-you-go) Model Studio
|
||||
endpoints:
|
||||
|
||||
- China: `dashscope.aliyuncs.com/compatible-mode/v1`
|
||||
- Global: `dashscope-intl.aliyuncs.com/compatible-mode/v1`
|
||||
|
||||
If the Coding Plan endpoints return an "unsupported model" error for
|
||||
`qwen3.6-plus`, switch to Standard (pay-as-you-go) instead of the Coding Plan
|
||||
endpoint/key pair.
|
||||
|
||||
## Capability plan
|
||||
|
||||
The `qwen` extension is being positioned as the vendor home for the full Qwen
|
||||
Cloud surface, not just coding/text models.
|
||||
|
||||
- Text/chat models: bundled now
|
||||
- Tool calling, structured output, thinking: inherited from the OpenAI-compatible transport
|
||||
- Image generation: planned at the provider-plugin layer
|
||||
- Image/video understanding: bundled now on the Standard endpoint
|
||||
- Speech/audio: planned at the provider-plugin layer
|
||||
- Memory embeddings/reranking: planned through the embedding adapter surface
|
||||
- Video generation: bundled now through the shared video-generation capability
|
||||
</Note>
|
||||
|
||||
## Multimodal add-ons

The `qwen` extension also exposes multimodal capabilities on the **Standard**
DashScope endpoints (not the Coding Plan endpoints):

- **Video understanding** via `qwen-vl-max-latest`
- **Wan video generation** via `wan2.6-t2v` (default), `wan2.6-i2v`, `wan2.6-r2v`, `wan2.6-r2v-flash`, `wan2.7-r2v`

To use Qwen as the default video provider:
```json5
{
  // ...
}
```
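The exact default-model keys for that block are not shown here. As a hedged sketch, region selection itself only needs the `models.providers.qwen.baseUrl` key path that this page documents; the surrounding object shape is an assumption, not a confirmed OpenClaw config layout:

```json5
{
  models: {
    providers: {
      qwen: {
        // Assumed object shape around the documented baseUrl key path.
        // Standard (pay-as-you-go) Global endpoint; swap in the China host if needed.
        baseUrl: "https://dashscope-intl.aliyuncs.com/compatible-mode/v1",
      },
    },
  },
}
```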
<Note>
See [Video Generation](/tools/video-generation) for shared tool parameters, provider selection, and failover behavior.
</Note>
## Advanced
<AccordionGroup>
<Accordion title="Qwen 3.6 Plus availability">
`qwen3.6-plus` is available on the Standard (pay-as-you-go) Model Studio
endpoints:

- China: `dashscope.aliyuncs.com/compatible-mode/v1`
- Global: `dashscope-intl.aliyuncs.com/compatible-mode/v1`

If the Coding Plan endpoints return an "unsupported model" error for
`qwen3.6-plus`, switch to Standard (pay-as-you-go) instead of the Coding Plan
endpoint/key pair.
</Accordion>
<Accordion title="Capability plan">
The `qwen` extension is being positioned as the vendor home for the full Qwen
Cloud surface, not just coding/text models.

- **Text/chat models:** bundled now
- **Tool calling, structured output, thinking:** inherited from the OpenAI-compatible transport
- **Image generation:** planned at the provider-plugin layer
- **Image/video understanding:** bundled now on the Standard endpoint
- **Speech/audio:** planned at the provider-plugin layer
- **Memory embeddings/reranking:** planned through the embedding adapter surface
- **Video generation:** bundled now through the shared video-generation capability
</Accordion>
<Accordion title="Video generation details">
For video generation, OpenClaw maps the configured Qwen region to the matching
DashScope AIGC host before submitting the job:

- Global/Intl: `https://dashscope-intl.aliyuncs.com`
- China: `https://dashscope.aliyuncs.com`

That means a normal `models.providers.qwen.baseUrl` pointing at either the
Coding Plan or Standard Qwen hosts still keeps video generation on the correct
regional DashScope video endpoint.

Current bundled Qwen video-generation limits:

- Up to **1** output video per request
- Up to **1** input image
- Up to **4** input videos
- Up to **10 seconds** duration
- Supports `size`, `aspectRatio`, `resolution`, `audio`, and `watermark`
- Reference image/video mode currently requires **remote http(s) URLs**. Local
  file paths are rejected up front because the DashScope video endpoint does
  not accept uploaded local buffers for those references.
</Accordion>
<Accordion title="Streaming usage compatibility">
Native Model Studio endpoints advertise streaming usage compatibility on the
shared `openai-completions` transport. OpenClaw now keys that off endpoint
capabilities, so DashScope-compatible custom provider ids targeting the same
native hosts inherit the same streaming-usage behavior instead of requiring
the built-in `qwen` provider id specifically.

Native streaming-usage compatibility applies to both the Coding Plan hosts and
the Standard DashScope-compatible hosts:

- `https://coding.dashscope.aliyuncs.com/v1`
- `https://coding-intl.dashscope.aliyuncs.com/v1`
- `https://dashscope.aliyuncs.com/compatible-mode/v1`
- `https://dashscope-intl.aliyuncs.com/compatible-mode/v1`
</Accordion>
<Accordion title="Multimodal endpoint regions">
Multimodal surfaces (video understanding and Wan video generation) use the
**Standard** DashScope endpoints, not the Coding Plan endpoints:

- Global/Intl Standard base URL: `https://dashscope-intl.aliyuncs.com/compatible-mode/v1`
- China Standard base URL: `https://dashscope.aliyuncs.com/compatible-mode/v1`
</Accordion>
<Accordion title="Environment and daemon setup">
If the Gateway runs as a daemon (launchd/systemd), make sure `QWEN_API_KEY` is
available to that process (for example, in `~/.openclaw/.env` or via
`env.shellEnv`).
</Accordion>
</AccordionGroup>
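For daemon setups, a minimal sketch of persisting the key — the `~/.openclaw/.env` path is the one documented above; the rest is ordinary shell:

```shell
# Create the config dir and persist the key where the daemonized Gateway can read it.
mkdir -p ~/.openclaw
echo 'QWEN_API_KEY=sk-example' >> ~/.openclaw/.env
# Keep the key file private to the current user.
chmod 600 ~/.openclaw/.env
```

Restart the daemon after changing the file so the new environment is picked up.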
## Related

<CardGroup cols={2}>
  <Card title="Model selection" href="/concepts/model-providers" icon="layers">
    Choosing providers, model refs, and failover behavior.
  </Card>
  <Card title="Video generation" href="/tools/video-generation" icon="video">
    Shared video tool parameters and provider selection.
  </Card>
  <Card title="Alibaba (ModelStudio)" href="/providers/alibaba" icon="cloud">
    Legacy ModelStudio provider and migration notes.
  </Card>
  <Card title="Troubleshooting" href="/help/troubleshooting" icon="wrench">
    General troubleshooting and FAQ.
  </Card>
</CardGroup>