Merge pull request #20 from Ravens2121/master

feat(kiro): 支持思考模型 (Thinking Mode) 并通过多配额故障转移增强稳定性
This commit is contained in:
Luis Pater
2025-12-12 16:09:28 +08:00
committed by GitHub
9 changed files with 998 additions and 148 deletions

49
PR_DOCUMENTATION.md Normal file
View File

@@ -0,0 +1,49 @@
# PR Title / 拉取请求标题
`feat(kiro): Add Thinking Mode support & enhance reliability with multi-quota failover`
`feat(kiro): 支持思考模型 (Thinking Mode) 并通过多配额故障转移增强稳定性`
---
# PR Description / 拉取请求描述
## 📝 Summary / 摘要
This PR introduces significant upgrades to the Kiro (AWS CodeWhisperer/Amazon Q) module. It adds native support for **Thinking/Reasoning models** (similar to OpenAI o1/Claude 3.7), implements a robust **Multi-Endpoint Failover** system to handle rate limits (429), and optimizes configuration flexibility.
本次 PR 对 Kiro (AWS CodeWhisperer/Amazon Q) 模块进行了重大升级。它增加了对 **思考/推理模型 (Thinking/Reasoning models)** 的原生支持(类似 OpenAI o1/Claude 3.7),实现了一套健壮的 **多端点故障转移 (Multi-Endpoint Failover)** 系统以应对速率限制 (429),并优化了配置灵活性。
## ✨ Key Changes / 主要变更
### 1. 🧠 Thinking Mode Support / 思考模式支持
- **OpenAI Compatibility**: Automatically maps OpenAI's `reasoning_effort` parameter (low/medium/high) to Claude's `budget_tokens` (4k/16k/32k).
- **OpenAI 兼容性**:自动将 OpenAI 的 `reasoning_effort` 参数low/medium/high映射为 Claude 的 `budget_tokens`4k/16k/32k
- **Stream Parsing**: Implemented advanced stream parsing logic to detect and extract content within `<thinking>...</thinking>` tags, even across chunk boundaries.
- **流式解析**:实现了高级流式解析逻辑,能够检测并提取 `<thinking>...</thinking>` 标签内的内容,即使标签跨越了数据块边界。
- **Protocol Translation**: Converts Kiro's internal thinking content into OpenAI-compatible `reasoning_content` fields (for non-stream) or `thinking_delta` events (for stream).
- **协议转换**:将 Kiro 内部的思考内容转换为兼容 OpenAI 的 `reasoning_content` 字段(非流式)或 `thinking_delta` 事件(流式)。
### 2. 🛡️ Robustness & Failover / 稳健性与故障转移
- **Dual Quota System**: Explicitly defined `kiroEndpointConfig` to distinguish between **IDE (CodeWhisperer)** and **CLI (Amazon Q)** quotas.
- **双配额系统**:显式定义了 `kiroEndpointConfig` 结构,明确区分 **IDE (CodeWhisperer)****CLI (Amazon Q)** 的配额来源。
- **Auto Failover**: Implemented automatic failover logic. If one endpoint returns `429 Too Many Requests`, the request seamlessly retries on the next available endpoint/quota.
- **自动故障转移**:实现了自动故障转移逻辑。如果一个端点返回 `429 Too Many Requests`,请求将无缝在下一个可用端点/配额上重试。
- **Strict Protocol Compliance**: Enforced strict matching of `Origin` and `X-Amz-Target` headers for each endpoint to prevent `403 Forbidden` errors due to protocol mismatches.
- **严格协议合规**:强制每个端点严格匹配 `Origin``X-Amz-Target` 头信息,防止因协议不匹配导致的 `403 Forbidden` 错误。
### 3. ⚙️ Configuration & Models / 配置与模型
- **New Config Options**: Added `KiroPreferredEndpoint` (global) and `PreferredEndpoint` (per-key) settings to allow users to prioritize specific quotas (e.g., "ide" or "cli").
- **新配置项**:添加了 `KiroPreferredEndpoint`(全局)和 `PreferredEndpoint`(单 Key设置允许用户优先选择特定的配额如 "ide" 或 "cli")。
- **Model Registry**: Normalized model IDs (replaced dots with hyphens) and added `-agentic` variants optimized for large code generation tasks.
- **模型注册表**:规范化了模型 ID将点号替换为连字符并添加了针对大型代码生成任务优化的 `-agentic` 变体。
### 4. 🔧 Fixes / 修复
- **AMP Proxy**: Downgraded client-side context cancellation logs from `Error` to `Debug` to reduce log noise.
- **AMP 代理**:将客户端上下文取消的日志级别从 `Error` 降级为 `Debug`,减少日志噪音。
## ⚠️ Impact / 影响
- **Authentication**: **No changes** to the login/OAuth process. Existing tokens work as is.
- **认证**:登录/OAuth 流程 **无变更**。现有 Token 可直接使用。
- **Compatibility**: Fully backward compatible. The new failover logic is transparent to the user.
- **兼容性**:完全向后兼容。新的故障转移逻辑对用户是透明的。

View File

@@ -3,6 +3,8 @@ package amp
import (
"bytes"
"compress/gzip"
"context"
"errors"
"fmt"
"io"
"net/http"
@@ -148,7 +150,13 @@ func createReverseProxy(upstreamURL string, secretSource SecretSource) (*httputi
// Error handler for proxy failures
proxy.ErrorHandler = func(rw http.ResponseWriter, req *http.Request, err error) {
log.Errorf("amp upstream proxy error for %s %s: %v", req.Method, req.URL.Path, err)
// Check if this is a client-side cancellation (normal behavior)
// Don't log as error for context canceled - it's usually client closing connection
if errors.Is(err, context.Canceled) {
log.Debugf("amp upstream proxy: client canceled request for %s %s", req.Method, req.URL.Path)
} else {
log.Errorf("amp upstream proxy error for %s %s: %v", req.Method, req.URL.Path, err)
}
rw.Header().Set("Content-Type", "application/json")
rw.WriteHeader(http.StatusBadGateway)
_, _ = rw.Write([]byte(`{"error":"amp_upstream_proxy_error","message":"Failed to reach Amp upstream"}`))

View File

@@ -349,6 +349,12 @@ func (s *Server) setupRoutes() {
},
})
})
// Event logging endpoint - handles Claude Code telemetry requests
// Returns 200 OK to prevent 404 errors in logs
s.engine.POST("/api/event_logging/batch", func(c *gin.Context) {
c.JSON(http.StatusOK, gin.H{"status": "ok"})
})
s.engine.POST("/v1internal:method", geminiCLIHandlers.CLIHandler)
// OAuth callback endpoints (reuse main server port)

View File

@@ -64,6 +64,10 @@ type Config struct {
// KiroKey defines a list of Kiro (AWS CodeWhisperer) configurations.
KiroKey []KiroKey `yaml:"kiro" json:"kiro"`
// KiroPreferredEndpoint sets the global default preferred endpoint for all Kiro providers.
// Values: "ide" (default, CodeWhisperer) or "cli" (Amazon Q).
KiroPreferredEndpoint string `yaml:"kiro-preferred-endpoint" json:"kiro-preferred-endpoint"`
// Codex defines a list of Codex API key configurations as specified in the YAML configuration file.
CodexKey []CodexKey `yaml:"codex-api-key" json:"codex-api-key"`
@@ -278,6 +282,10 @@ type KiroKey struct {
// AgentTaskType sets the Kiro API task type. Known values: "vibe", "dev", "chat".
// Leave empty to let API use defaults. Different values may inject different system prompts.
AgentTaskType string `yaml:"agent-task-type,omitempty" json:"agent-task-type,omitempty"`
// PreferredEndpoint sets the preferred Kiro API endpoint/quota.
// Values: "codewhisperer" (default, IDE quota) or "amazonq" (CLI quota).
PreferredEndpoint string `yaml:"preferred-endpoint,omitempty" json:"preferred-endpoint,omitempty"`
}
// OpenAICompatibility represents the configuration for OpenAI API compatibility
@@ -504,6 +512,7 @@ func (cfg *Config) SanitizeKiroKeys() {
entry.ProfileArn = strings.TrimSpace(entry.ProfileArn)
entry.Region = strings.TrimSpace(entry.Region)
entry.ProxyURL = strings.TrimSpace(entry.ProxyURL)
entry.PreferredEndpoint = strings.TrimSpace(entry.PreferredEndpoint)
}
}

View File

@@ -884,8 +884,9 @@ func GetGitHubCopilotModels() []*ModelInfo {
// GetKiroModels returns the Kiro (AWS CodeWhisperer) model definitions
func GetKiroModels() []*ModelInfo {
return []*ModelInfo{
// --- Base Models ---
{
ID: "kiro-claude-opus-4.5",
ID: "kiro-claude-opus-4-5",
Object: "model",
Created: 1732752000,
OwnedBy: "aws",
@@ -896,7 +897,7 @@ func GetKiroModels() []*ModelInfo {
MaxCompletionTokens: 64000,
},
{
ID: "kiro-claude-sonnet-4.5",
ID: "kiro-claude-sonnet-4-5",
Object: "model",
Created: 1732752000,
OwnedBy: "aws",
@@ -918,7 +919,7 @@ func GetKiroModels() []*ModelInfo {
MaxCompletionTokens: 64000,
},
{
ID: "kiro-claude-haiku-4.5",
ID: "kiro-claude-haiku-4-5",
Object: "model",
Created: 1732752000,
OwnedBy: "aws",
@@ -928,21 +929,9 @@ func GetKiroModels() []*ModelInfo {
ContextLength: 200000,
MaxCompletionTokens: 64000,
},
// --- Chat Variant (No tool calling, for pure conversation) ---
{
ID: "kiro-claude-opus-4.5-chat",
Object: "model",
Created: 1732752000,
OwnedBy: "aws",
Type: "kiro",
DisplayName: "Kiro Claude Opus 4.5 (Chat)",
Description: "Claude Opus 4.5 for chat only (no tool calling)",
ContextLength: 200000,
MaxCompletionTokens: 64000,
},
// --- Agentic Variants (Optimized for coding agents with chunked writes) ---
{
ID: "kiro-claude-opus-4.5-agentic",
ID: "kiro-claude-opus-4-5-agentic",
Object: "model",
Created: 1732752000,
OwnedBy: "aws",
@@ -953,7 +942,7 @@ func GetKiroModels() []*ModelInfo {
MaxCompletionTokens: 64000,
},
{
ID: "kiro-claude-sonnet-4.5-agentic",
ID: "kiro-claude-sonnet-4-5-agentic",
Object: "model",
Created: 1732752000,
OwnedBy: "aws",
@@ -963,6 +952,28 @@ func GetKiroModels() []*ModelInfo {
ContextLength: 200000,
MaxCompletionTokens: 64000,
},
{
ID: "kiro-claude-sonnet-4-agentic",
Object: "model",
Created: 1732752000,
OwnedBy: "aws",
Type: "kiro",
DisplayName: "Kiro Claude Sonnet 4 (Agentic)",
Description: "Claude Sonnet 4 optimized for coding agents (chunked writes)",
ContextLength: 200000,
MaxCompletionTokens: 64000,
},
{
ID: "kiro-claude-haiku-4-5-agentic",
Object: "model",
Created: 1732752000,
OwnedBy: "aws",
Type: "kiro",
DisplayName: "Kiro Claude Haiku 4.5 (Agentic)",
Description: "Claude Haiku 4.5 optimized for coding agents (chunked writes)",
ContextLength: 200000,
MaxCompletionTokens: 64000,
},
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -10,9 +10,18 @@ import (
"github.com/tidwall/sjson"
)
// reasoningEffortToBudget maps OpenAI reasoning_effort values to Claude thinking budget_tokens.
// OpenAI uses "low", "medium", "high" while Claude uses numeric budget_tokens.
var reasoningEffortToBudget = map[string]int{
"low": 4000,
"medium": 16000,
"high": 32000,
}
// ConvertOpenAIRequestToKiro transforms an OpenAI Chat Completions API request into Kiro (Claude) format.
// Kiro uses Claude-compatible format internally, so we primarily pass through to Claude format.
// Supports tool calling: OpenAI tools -> Claude tools, tool_calls -> tool_use, tool messages -> tool_result.
// Supports reasoning/thinking: OpenAI reasoning_effort -> Claude thinking parameter.
func ConvertOpenAIRequestToKiro(modelName string, inputRawJSON []byte, stream bool) []byte {
rawJSON := bytes.Clone(inputRawJSON)
root := gjson.ParseBytes(rawJSON)
@@ -38,6 +47,26 @@ func ConvertOpenAIRequestToKiro(modelName string, inputRawJSON []byte, stream bo
out, _ = sjson.Set(out, "top_p", v.Float())
}
// Handle OpenAI reasoning_effort parameter -> Claude thinking parameter
// OpenAI format: {"reasoning_effort": "low"|"medium"|"high"}
// Claude format: {"thinking": {"type": "enabled", "budget_tokens": N}}
if v := root.Get("reasoning_effort"); v.Exists() {
effort := v.String()
if budget, ok := reasoningEffortToBudget[effort]; ok {
thinking := map[string]interface{}{
"type": "enabled",
"budget_tokens": budget,
}
out, _ = sjson.Set(out, "thinking", thinking)
}
}
// Also support direct thinking parameter passthrough (for Claude API compatibility)
// Claude format: {"thinking": {"type": "enabled", "budget_tokens": N}}
if v := root.Get("thinking"); v.Exists() && v.IsObject() {
out, _ = sjson.Set(out, "thinking", v.Value())
}
// Convert OpenAI tools to Claude tools format
if tools := root.Get("tools"); tools.Exists() && tools.IsArray() {
claudeTools := make([]interface{}, 0)

View File

@@ -134,6 +134,28 @@ func convertClaudeEventToOpenAI(jsonStr string, model string) []string {
result, _ := json.Marshal(response)
results = append(results, string(result))
}
} else if deltaType == "thinking_delta" {
// Thinking/reasoning content delta - convert to OpenAI reasoning_content format
thinkingDelta := root.Get("delta.thinking").String()
if thinkingDelta != "" {
response := map[string]interface{}{
"id": "chatcmpl-" + uuid.New().String()[:24],
"object": "chat.completion.chunk",
"created": time.Now().Unix(),
"model": model,
"choices": []map[string]interface{}{
{
"index": 0,
"delta": map[string]interface{}{
"reasoning_content": thinkingDelta,
},
"finish_reason": nil,
},
},
}
result, _ := json.Marshal(response)
results = append(results, string(result))
}
} else if deltaType == "input_json_delta" {
// Tool input delta (streaming arguments)
partialJSON := root.Get("delta.partial_json").String()
@@ -298,6 +320,7 @@ func ConvertKiroResponseToOpenAINonStream(ctx context.Context, model string, ori
root := gjson.ParseBytes(rawResponse)
var content string
var reasoningContent string
var toolCalls []map[string]interface{}
contentArray := root.Get("content")
@@ -306,6 +329,9 @@ func ConvertKiroResponseToOpenAINonStream(ctx context.Context, model string, ori
itemType := item.Get("type").String()
if itemType == "text" {
content += item.Get("text").String()
} else if itemType == "thinking" {
// Extract thinking/reasoning content
reasoningContent += item.Get("thinking").String()
} else if itemType == "tool_use" {
// Convert Claude tool_use to OpenAI tool_calls format
inputJSON := item.Get("input").String()
@@ -339,6 +365,11 @@ func ConvertKiroResponseToOpenAINonStream(ctx context.Context, model string, ori
"content": content,
}
// Add reasoning_content if present (OpenAI reasoning format)
if reasoningContent != "" {
message["reasoning_content"] = reasoningContent
}
// Add tool_calls if present
if len(toolCalls) > 0 {
message["tool_calls"] = toolCalls

View File

@@ -1317,6 +1317,12 @@ func (w *Watcher) SnapshotCoreAuths() []*coreauth.Auth {
if kk.AgentTaskType != "" {
attrs["agent_task_type"] = kk.AgentTaskType
}
if kk.PreferredEndpoint != "" {
attrs["preferred_endpoint"] = kk.PreferredEndpoint
} else if cfg.KiroPreferredEndpoint != "" {
// Apply global default if not overridden by specific key
attrs["preferred_endpoint"] = cfg.KiroPreferredEndpoint
}
if refreshToken != "" {
attrs["refresh_token"] = refreshToken
}
@@ -1532,6 +1538,17 @@ func (w *Watcher) SnapshotCoreAuths() []*coreauth.Auth {
a.NextRefreshAfter = expiresAt.Add(-30 * time.Minute)
}
}
// Apply global preferred endpoint setting if not present in metadata
if cfg.KiroPreferredEndpoint != "" {
// Check if already set in metadata (which takes precedence in executor)
if _, hasMeta := metadata["preferred_endpoint"]; !hasMeta {
if a.Attributes == nil {
a.Attributes = make(map[string]string)
}
a.Attributes["preferred_endpoint"] = cfg.KiroPreferredEndpoint
}
}
}
applyAuthExcludedModelsMeta(a, cfg, nil, "oauth")