mirror of
https://github.com/router-for-me/CLIProxyAPIPlus.git
synced 2026-03-09 15:25:17 +00:00
Merge pull request #20 from Ravens2121/master
feat(kiro): 支持思考模型 (Thinking Mode) 并通过多配额故障转移增强稳定性
This commit is contained in:
49
PR_DOCUMENTATION.md
Normal file
49
PR_DOCUMENTATION.md
Normal file
@@ -0,0 +1,49 @@
|
||||
# PR Title / 拉取请求标题
|
||||
|
||||
`feat(kiro): Add Thinking Mode support & enhance reliability with multi-quota failover`
|
||||
`feat(kiro): 支持思考模型 (Thinking Mode) 并通过多配额故障转移增强稳定性`
|
||||
|
||||
---
|
||||
|
||||
# PR Description / 拉取请求描述
|
||||
|
||||
## 📝 Summary / 摘要
|
||||
|
||||
This PR introduces significant upgrades to the Kiro (AWS CodeWhisperer/Amazon Q) module. It adds native support for **Thinking/Reasoning models** (similar to OpenAI o1/Claude 3.7), implements a robust **Multi-Endpoint Failover** system to handle rate limits (429), and optimizes configuration flexibility.
|
||||
|
||||
本次 PR 对 Kiro (AWS CodeWhisperer/Amazon Q) 模块进行了重大升级。它增加了对 **思考/推理模型 (Thinking/Reasoning models)** 的原生支持(类似 OpenAI o1/Claude 3.7),实现了一套健壮的 **多端点故障转移 (Multi-Endpoint Failover)** 系统以应对速率限制 (429),并优化了配置灵活性。
|
||||
|
||||
## ✨ Key Changes / 主要变更
|
||||
|
||||
### 1. 🧠 Thinking Mode Support / 思考模式支持
|
||||
- **OpenAI Compatibility**: Automatically maps OpenAI's `reasoning_effort` parameter (low/medium/high) to Claude's `budget_tokens` (4k/16k/32k).
|
||||
- **OpenAI 兼容性**:自动将 OpenAI 的 `reasoning_effort` 参数(low/medium/high)映射为 Claude 的 `budget_tokens`(4k/16k/32k)。
|
||||
- **Stream Parsing**: Implemented advanced stream parsing logic to detect and extract content within `<thinking>...</thinking>` tags, even across chunk boundaries.
|
||||
- **流式解析**:实现了高级流式解析逻辑,能够检测并提取 `<thinking>...</thinking>` 标签内的内容,即使标签跨越了数据块边界。
|
||||
- **Protocol Translation**: Converts Kiro's internal thinking content into OpenAI-compatible `reasoning_content` fields (for non-stream) or `thinking_delta` events (for stream).
|
||||
- **协议转换**:将 Kiro 内部的思考内容转换为兼容 OpenAI 的 `reasoning_content` 字段(非流式)或 `thinking_delta` 事件(流式)。
|
||||
|
||||
### 2. 🛡️ Robustness & Failover / 稳健性与故障转移
|
||||
- **Dual Quota System**: Explicitly defined `kiroEndpointConfig` to distinguish between **IDE (CodeWhisperer)** and **CLI (Amazon Q)** quotas.
|
||||
- **双配额系统**:显式定义了 `kiroEndpointConfig` 结构,明确区分 **IDE (CodeWhisperer)** 和 **CLI (Amazon Q)** 的配额来源。
|
||||
- **Auto Failover**: Implemented automatic failover logic. If one endpoint returns `429 Too Many Requests`, the request seamlessly retries on the next available endpoint/quota.
|
||||
- **自动故障转移**:实现了自动故障转移逻辑。如果一个端点返回 `429 Too Many Requests`,请求将无缝在下一个可用端点/配额上重试。
|
||||
- **Strict Protocol Compliance**: Enforced strict matching of `Origin` and `X-Amz-Target` headers for each endpoint to prevent `403 Forbidden` errors due to protocol mismatches.
|
||||
- **严格协议合规**:强制每个端点严格匹配 `Origin` 和 `X-Amz-Target` 头信息,防止因协议不匹配导致的 `403 Forbidden` 错误。
|
||||
|
||||
### 3. ⚙️ Configuration & Models / 配置与模型
|
||||
- **New Config Options**: Added `KiroPreferredEndpoint` (global) and `PreferredEndpoint` (per-key) settings to allow users to prioritize specific quotas (e.g., "ide" or "cli").
|
||||
- **新配置项**:添加了 `KiroPreferredEndpoint`(全局)和 `PreferredEndpoint`(单 Key)设置,允许用户优先选择特定的配额(如 "ide" 或 "cli")。
|
||||
- **Model Registry**: Normalized model IDs (replaced dots with hyphens) and added `-agentic` variants optimized for large code generation tasks.
|
||||
- **模型注册表**:规范化了模型 ID(将点号替换为连字符),并添加了针对大型代码生成任务优化的 `-agentic` 变体。
|
||||
|
||||
### 4. 🔧 Fixes / 修复
|
||||
- **AMP Proxy**: Downgraded client-side context cancellation logs from `Error` to `Debug` to reduce log noise.
|
||||
- **AMP 代理**:将客户端上下文取消的日志级别从 `Error` 降级为 `Debug`,减少日志噪音。
|
||||
|
||||
## ⚠️ Impact / 影响
|
||||
|
||||
- **Authentication**: **No changes** to the login/OAuth process. Existing tokens work as is.
|
||||
- **认证**:登录/OAuth 流程 **无变更**。现有 Token 可直接使用。
|
||||
- **Compatibility**: Fully backward compatible. The new failover logic is transparent to the user.
|
||||
- **兼容性**:完全向后兼容。新的故障转移逻辑对用户是透明的。
|
||||
@@ -3,6 +3,8 @@ package amp
|
||||
import (
|
||||
"bytes"
|
||||
"compress/gzip"
|
||||
"context"
|
||||
"errors"
|
||||
"fmt"
|
||||
"io"
|
||||
"net/http"
|
||||
@@ -148,7 +150,13 @@ func createReverseProxy(upstreamURL string, secretSource SecretSource) (*httputi
|
||||
|
||||
// Error handler for proxy failures
|
||||
proxy.ErrorHandler = func(rw http.ResponseWriter, req *http.Request, err error) {
|
||||
log.Errorf("amp upstream proxy error for %s %s: %v", req.Method, req.URL.Path, err)
|
||||
// Check if this is a client-side cancellation (normal behavior)
|
||||
// Don't log as error for context canceled - it's usually client closing connection
|
||||
if errors.Is(err, context.Canceled) {
|
||||
log.Debugf("amp upstream proxy: client canceled request for %s %s", req.Method, req.URL.Path)
|
||||
} else {
|
||||
log.Errorf("amp upstream proxy error for %s %s: %v", req.Method, req.URL.Path, err)
|
||||
}
|
||||
rw.Header().Set("Content-Type", "application/json")
|
||||
rw.WriteHeader(http.StatusBadGateway)
|
||||
_, _ = rw.Write([]byte(`{"error":"amp_upstream_proxy_error","message":"Failed to reach Amp upstream"}`))
|
||||
|
||||
@@ -349,6 +349,12 @@ func (s *Server) setupRoutes() {
|
||||
},
|
||||
})
|
||||
})
|
||||
|
||||
// Event logging endpoint - handles Claude Code telemetry requests
|
||||
// Returns 200 OK to prevent 404 errors in logs
|
||||
s.engine.POST("/api/event_logging/batch", func(c *gin.Context) {
|
||||
c.JSON(http.StatusOK, gin.H{"status": "ok"})
|
||||
})
|
||||
s.engine.POST("/v1internal:method", geminiCLIHandlers.CLIHandler)
|
||||
|
||||
// OAuth callback endpoints (reuse main server port)
|
||||
|
||||
@@ -64,6 +64,10 @@ type Config struct {
|
||||
// KiroKey defines a list of Kiro (AWS CodeWhisperer) configurations.
|
||||
KiroKey []KiroKey `yaml:"kiro" json:"kiro"`
|
||||
|
||||
// KiroPreferredEndpoint sets the global default preferred endpoint for all Kiro providers.
|
||||
// Values: "ide" (default, CodeWhisperer) or "cli" (Amazon Q).
|
||||
KiroPreferredEndpoint string `yaml:"kiro-preferred-endpoint" json:"kiro-preferred-endpoint"`
|
||||
|
||||
// Codex defines a list of Codex API key configurations as specified in the YAML configuration file.
|
||||
CodexKey []CodexKey `yaml:"codex-api-key" json:"codex-api-key"`
|
||||
|
||||
@@ -278,6 +282,10 @@ type KiroKey struct {
|
||||
// AgentTaskType sets the Kiro API task type. Known values: "vibe", "dev", "chat".
|
||||
// Leave empty to let API use defaults. Different values may inject different system prompts.
|
||||
AgentTaskType string `yaml:"agent-task-type,omitempty" json:"agent-task-type,omitempty"`
|
||||
|
||||
// PreferredEndpoint sets the preferred Kiro API endpoint/quota.
|
||||
// Values: "codewhisperer" (default, IDE quota) or "amazonq" (CLI quota).
|
||||
PreferredEndpoint string `yaml:"preferred-endpoint,omitempty" json:"preferred-endpoint,omitempty"`
|
||||
}
|
||||
|
||||
// OpenAICompatibility represents the configuration for OpenAI API compatibility
|
||||
@@ -504,6 +512,7 @@ func (cfg *Config) SanitizeKiroKeys() {
|
||||
entry.ProfileArn = strings.TrimSpace(entry.ProfileArn)
|
||||
entry.Region = strings.TrimSpace(entry.Region)
|
||||
entry.ProxyURL = strings.TrimSpace(entry.ProxyURL)
|
||||
entry.PreferredEndpoint = strings.TrimSpace(entry.PreferredEndpoint)
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@@ -884,8 +884,9 @@ func GetGitHubCopilotModels() []*ModelInfo {
|
||||
// GetKiroModels returns the Kiro (AWS CodeWhisperer) model definitions
|
||||
func GetKiroModels() []*ModelInfo {
|
||||
return []*ModelInfo{
|
||||
// --- Base Models ---
|
||||
{
|
||||
ID: "kiro-claude-opus-4.5",
|
||||
ID: "kiro-claude-opus-4-5",
|
||||
Object: "model",
|
||||
Created: 1732752000,
|
||||
OwnedBy: "aws",
|
||||
@@ -896,7 +897,7 @@ func GetKiroModels() []*ModelInfo {
|
||||
MaxCompletionTokens: 64000,
|
||||
},
|
||||
{
|
||||
ID: "kiro-claude-sonnet-4.5",
|
||||
ID: "kiro-claude-sonnet-4-5",
|
||||
Object: "model",
|
||||
Created: 1732752000,
|
||||
OwnedBy: "aws",
|
||||
@@ -918,7 +919,7 @@ func GetKiroModels() []*ModelInfo {
|
||||
MaxCompletionTokens: 64000,
|
||||
},
|
||||
{
|
||||
ID: "kiro-claude-haiku-4.5",
|
||||
ID: "kiro-claude-haiku-4-5",
|
||||
Object: "model",
|
||||
Created: 1732752000,
|
||||
OwnedBy: "aws",
|
||||
@@ -928,21 +929,9 @@ func GetKiroModels() []*ModelInfo {
|
||||
ContextLength: 200000,
|
||||
MaxCompletionTokens: 64000,
|
||||
},
|
||||
// --- Chat Variant (No tool calling, for pure conversation) ---
|
||||
{
|
||||
ID: "kiro-claude-opus-4.5-chat",
|
||||
Object: "model",
|
||||
Created: 1732752000,
|
||||
OwnedBy: "aws",
|
||||
Type: "kiro",
|
||||
DisplayName: "Kiro Claude Opus 4.5 (Chat)",
|
||||
Description: "Claude Opus 4.5 for chat only (no tool calling)",
|
||||
ContextLength: 200000,
|
||||
MaxCompletionTokens: 64000,
|
||||
},
|
||||
// --- Agentic Variants (Optimized for coding agents with chunked writes) ---
|
||||
{
|
||||
ID: "kiro-claude-opus-4.5-agentic",
|
||||
ID: "kiro-claude-opus-4-5-agentic",
|
||||
Object: "model",
|
||||
Created: 1732752000,
|
||||
OwnedBy: "aws",
|
||||
@@ -953,7 +942,7 @@ func GetKiroModels() []*ModelInfo {
|
||||
MaxCompletionTokens: 64000,
|
||||
},
|
||||
{
|
||||
ID: "kiro-claude-sonnet-4.5-agentic",
|
||||
ID: "kiro-claude-sonnet-4-5-agentic",
|
||||
Object: "model",
|
||||
Created: 1732752000,
|
||||
OwnedBy: "aws",
|
||||
@@ -963,6 +952,28 @@ func GetKiroModels() []*ModelInfo {
|
||||
ContextLength: 200000,
|
||||
MaxCompletionTokens: 64000,
|
||||
},
|
||||
{
|
||||
ID: "kiro-claude-sonnet-4-agentic",
|
||||
Object: "model",
|
||||
Created: 1732752000,
|
||||
OwnedBy: "aws",
|
||||
Type: "kiro",
|
||||
DisplayName: "Kiro Claude Sonnet 4 (Agentic)",
|
||||
Description: "Claude Sonnet 4 optimized for coding agents (chunked writes)",
|
||||
ContextLength: 200000,
|
||||
MaxCompletionTokens: 64000,
|
||||
},
|
||||
{
|
||||
ID: "kiro-claude-haiku-4-5-agentic",
|
||||
Object: "model",
|
||||
Created: 1732752000,
|
||||
OwnedBy: "aws",
|
||||
Type: "kiro",
|
||||
DisplayName: "Kiro Claude Haiku 4.5 (Agentic)",
|
||||
Description: "Claude Haiku 4.5 optimized for coding agents (chunked writes)",
|
||||
ContextLength: 200000,
|
||||
MaxCompletionTokens: 64000,
|
||||
},
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
File diff suppressed because it is too large
Load Diff
@@ -10,9 +10,18 @@ import (
|
||||
"github.com/tidwall/sjson"
|
||||
)
|
||||
|
||||
// reasoningEffortToBudget maps OpenAI reasoning_effort values to Claude thinking budget_tokens.
|
||||
// OpenAI uses "low", "medium", "high" while Claude uses numeric budget_tokens.
|
||||
var reasoningEffortToBudget = map[string]int{
|
||||
"low": 4000,
|
||||
"medium": 16000,
|
||||
"high": 32000,
|
||||
}
|
||||
|
||||
// ConvertOpenAIRequestToKiro transforms an OpenAI Chat Completions API request into Kiro (Claude) format.
|
||||
// Kiro uses Claude-compatible format internally, so we primarily pass through to Claude format.
|
||||
// Supports tool calling: OpenAI tools -> Claude tools, tool_calls -> tool_use, tool messages -> tool_result.
|
||||
// Supports reasoning/thinking: OpenAI reasoning_effort -> Claude thinking parameter.
|
||||
func ConvertOpenAIRequestToKiro(modelName string, inputRawJSON []byte, stream bool) []byte {
|
||||
rawJSON := bytes.Clone(inputRawJSON)
|
||||
root := gjson.ParseBytes(rawJSON)
|
||||
@@ -38,6 +47,26 @@ func ConvertOpenAIRequestToKiro(modelName string, inputRawJSON []byte, stream bo
|
||||
out, _ = sjson.Set(out, "top_p", v.Float())
|
||||
}
|
||||
|
||||
// Handle OpenAI reasoning_effort parameter -> Claude thinking parameter
|
||||
// OpenAI format: {"reasoning_effort": "low"|"medium"|"high"}
|
||||
// Claude format: {"thinking": {"type": "enabled", "budget_tokens": N}}
|
||||
if v := root.Get("reasoning_effort"); v.Exists() {
|
||||
effort := v.String()
|
||||
if budget, ok := reasoningEffortToBudget[effort]; ok {
|
||||
thinking := map[string]interface{}{
|
||||
"type": "enabled",
|
||||
"budget_tokens": budget,
|
||||
}
|
||||
out, _ = sjson.Set(out, "thinking", thinking)
|
||||
}
|
||||
}
|
||||
|
||||
// Also support direct thinking parameter passthrough (for Claude API compatibility)
|
||||
// Claude format: {"thinking": {"type": "enabled", "budget_tokens": N}}
|
||||
if v := root.Get("thinking"); v.Exists() && v.IsObject() {
|
||||
out, _ = sjson.Set(out, "thinking", v.Value())
|
||||
}
|
||||
|
||||
// Convert OpenAI tools to Claude tools format
|
||||
if tools := root.Get("tools"); tools.Exists() && tools.IsArray() {
|
||||
claudeTools := make([]interface{}, 0)
|
||||
|
||||
@@ -134,6 +134,28 @@ func convertClaudeEventToOpenAI(jsonStr string, model string) []string {
|
||||
result, _ := json.Marshal(response)
|
||||
results = append(results, string(result))
|
||||
}
|
||||
} else if deltaType == "thinking_delta" {
|
||||
// Thinking/reasoning content delta - convert to OpenAI reasoning_content format
|
||||
thinkingDelta := root.Get("delta.thinking").String()
|
||||
if thinkingDelta != "" {
|
||||
response := map[string]interface{}{
|
||||
"id": "chatcmpl-" + uuid.New().String()[:24],
|
||||
"object": "chat.completion.chunk",
|
||||
"created": time.Now().Unix(),
|
||||
"model": model,
|
||||
"choices": []map[string]interface{}{
|
||||
{
|
||||
"index": 0,
|
||||
"delta": map[string]interface{}{
|
||||
"reasoning_content": thinkingDelta,
|
||||
},
|
||||
"finish_reason": nil,
|
||||
},
|
||||
},
|
||||
}
|
||||
result, _ := json.Marshal(response)
|
||||
results = append(results, string(result))
|
||||
}
|
||||
} else if deltaType == "input_json_delta" {
|
||||
// Tool input delta (streaming arguments)
|
||||
partialJSON := root.Get("delta.partial_json").String()
|
||||
@@ -298,6 +320,7 @@ func ConvertKiroResponseToOpenAINonStream(ctx context.Context, model string, ori
|
||||
root := gjson.ParseBytes(rawResponse)
|
||||
|
||||
var content string
|
||||
var reasoningContent string
|
||||
var toolCalls []map[string]interface{}
|
||||
|
||||
contentArray := root.Get("content")
|
||||
@@ -306,6 +329,9 @@ func ConvertKiroResponseToOpenAINonStream(ctx context.Context, model string, ori
|
||||
itemType := item.Get("type").String()
|
||||
if itemType == "text" {
|
||||
content += item.Get("text").String()
|
||||
} else if itemType == "thinking" {
|
||||
// Extract thinking/reasoning content
|
||||
reasoningContent += item.Get("thinking").String()
|
||||
} else if itemType == "tool_use" {
|
||||
// Convert Claude tool_use to OpenAI tool_calls format
|
||||
inputJSON := item.Get("input").String()
|
||||
@@ -339,6 +365,11 @@ func ConvertKiroResponseToOpenAINonStream(ctx context.Context, model string, ori
|
||||
"content": content,
|
||||
}
|
||||
|
||||
// Add reasoning_content if present (OpenAI reasoning format)
|
||||
if reasoningContent != "" {
|
||||
message["reasoning_content"] = reasoningContent
|
||||
}
|
||||
|
||||
// Add tool_calls if present
|
||||
if len(toolCalls) > 0 {
|
||||
message["tool_calls"] = toolCalls
|
||||
|
||||
@@ -1317,6 +1317,12 @@ func (w *Watcher) SnapshotCoreAuths() []*coreauth.Auth {
|
||||
if kk.AgentTaskType != "" {
|
||||
attrs["agent_task_type"] = kk.AgentTaskType
|
||||
}
|
||||
if kk.PreferredEndpoint != "" {
|
||||
attrs["preferred_endpoint"] = kk.PreferredEndpoint
|
||||
} else if cfg.KiroPreferredEndpoint != "" {
|
||||
// Apply global default if not overridden by specific key
|
||||
attrs["preferred_endpoint"] = cfg.KiroPreferredEndpoint
|
||||
}
|
||||
if refreshToken != "" {
|
||||
attrs["refresh_token"] = refreshToken
|
||||
}
|
||||
@@ -1532,6 +1538,17 @@ func (w *Watcher) SnapshotCoreAuths() []*coreauth.Auth {
|
||||
a.NextRefreshAfter = expiresAt.Add(-30 * time.Minute)
|
||||
}
|
||||
}
|
||||
|
||||
// Apply global preferred endpoint setting if not present in metadata
|
||||
if cfg.KiroPreferredEndpoint != "" {
|
||||
// Check if already set in metadata (which takes precedence in executor)
|
||||
if _, hasMeta := metadata["preferred_endpoint"]; !hasMeta {
|
||||
if a.Attributes == nil {
|
||||
a.Attributes = make(map[string]string)
|
||||
}
|
||||
a.Attributes["preferred_endpoint"] = cfg.KiroPreferredEndpoint
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
applyAuthExcludedModelsMeta(a, cfg, nil, "oauth")
|
||||
|
||||
Reference in New Issue
Block a user