CLIProxyAPIPlus

mirror of https://github.com/router-for-me/CLIProxyAPIPlus.git synced 2026-03-22 00:50:26 +00:00

Author	SHA1	Message	Date
Luis Pater	843316ea7a	Merge branch 'router-for-me:main' into main	2025-12-19 22:24:26 +08:00
hkfires	2039062845	fix(gemini): add optional skip for gemini3 thinking conversion	2025-12-19 22:07:43 +08:00
Luis Pater	0f646800f6	Merge branch 'router-for-me:main' into main	2025-12-18 08:36:59 +08:00
Luis Pater	13eb5268de	Merge pull request #582 from ben-vargas/fix-gemini-3-thinking-level feat: use thinkingLevel for Gemini 3 models per Google documentation	2025-12-18 07:19:37 +08:00
Ben Vargas	598f0af19b	fix: apply thinkingLevel from model suffix metadata for Gemini 3 The previous commit added thinkingLevel support but didn't apply it when the reasoning effort came from model name suffix (e.g., model(minimal)). This was because ResolveThinkingConfigFromMetadata returns nil for level-based models, bypassing the metadata application. Changes: - Add ApplyGemini3ThinkingLevelFromMetadata for standard Gemini API - Add ApplyGemini3ThinkingLevelFromMetadataCLI for CLI API format - Update gemini_cli_executor to apply Gemini 3 thinkingLevel from metadata - Update antigravity_executor to apply Gemini 3 thinkingLevel from metadata - Update aistudio_executor to apply Gemini 3 thinkingLevel from metadata - Add comprehensive test coverage for Gemini 3 thinkingLevel functions	2025-12-17 16:08:38 -07:00
Ben Vargas	a33f5d31fc	feat: use thinkingLevel for Gemini 3 models per Google documentation Per Google's official documentation, Gemini 3 models should use thinkingLevel (string) instead of thinkingBudget (number) for optimal performance. From Google's Gemini Thinking docs: > Use the thinkingLevel parameter with Gemini 3 models. While > thinkingBudget is accepted for backwards compatibility, using > it with Gemini 3 Pro may result in suboptimal performance. Changes: - Add model family detection functions (IsGemini3Model, IsGemini25Model, IsGemini3ProModel, IsGemini3FlashModel) - Add ApplyGeminiThinkingLevel and ApplyGeminiCLIThinkingLevel functions for applying thinkingLevel config - Add ValidateGemini3ThinkingLevel for model-specific level validation - Add ThinkingBudgetToGemini3Level for backward compatibility conversion - Update NormalizeGeminiThinkingBudget to convert budget to level for Gemini 3 models - Update ApplyDefaultThinkingIfNeeded to not set a default level for Gemini 3 (lets API use its dynamic default "high") - Update ConvertThinkingLevelToBudget to preserve thinkingLevel for Gemini 3 models - Add Levels field to all Gemini 3 model definitions: - Gemini 3 Pro: ["low", "high"] - Gemini 3 Flash: ["minimal", "low", "medium", "high"] Backward compatibility: - Gemini 2.5 models continue to use thinkingBudget as before - If thinkingBudget is provided for Gemini 3, it's converted to the appropriate thinkingLevel - Existing configurations continue to work	2025-12-17 15:28:20 -07:00
Ravens2121	54acd69e9d	Merge branch 'router-for-me:main' into master	2025-12-18 04:39:17 +08:00
Ravens2121	d687ee2777	feat(kiro): implement official reasoningContentEvent and improve metadat	2025-12-18 04:38:22 +08:00
Luis Pater	f7b17ee6ec	Merge pull request #36 from rezhajulio/feat/gpt-5.2 Add GPT-5.2 model support for GitHub Copilot	2025-12-18 03:16:25 +08:00
Luis Pater	408614c74c	Merge branch 'router-for-me:main' into main	2025-12-18 03:13:48 +08:00
Luis Pater	68a27772b3	feat(antigravity): enable token counting via API with resilient routing Introduces the capability to count tokens for Antigravity-backed requests. This implementation leverages the `countTokens` endpoint of the Antigravity API, replacing the prior unsupported stub. Key aspects of this update include: - API Integration: Direct integration with the Antigravity `countTokens` API, including necessary request payload translation and authentication. - Resilient Infrastructure: A fallback mechanism has been established, allowing the system to attempt connections across multiple Antigravity base URLs to ensure request success even in the event of temporary service interruptions. - Model Aliasing: Added mappings for `gemini-3-flash` and `gemini-3-flash-preview` to ensure compatibility with the latest model variants. - Robust Error Handling: Comprehensive error handling and logging are in place to manage failures during API interactions.	2025-12-18 03:12:46 +08:00
Luis Pater	10e0ea1309	Merge main into pr-39	2025-12-18 00:36:51 +08:00
Luis Pater	5fda6f8ef3	feat(antigravity): implement non-streaming execution for Claude model requests	2025-12-17 23:17:11 +08:00
Luis Pater	09923f654c	feat(antigravity): add streaming support for Claude model requests	2025-12-17 22:16:57 +08:00
이대희	1b8e538a77	feature: Improves Gemini JSON schema compatibility Enhances compatibility with the Gemini API by implementing a schema cleaning process. This includes: - Centralizing schema cleaning logic for Gemini in a dedicated utility function. - Converting unsupported schema keywords to hints within the description field. - Flattening complex schema structures like `anyOf`, `oneOf`, and type arrays to simplify the schema. - Handling streaming responses with empty tool names, which can occur in subsequent chunks after the initial tool use.	2025-12-17 17:10:53 +09:00
Rezha Julio	92c62bb2fb	Add GPT-5.2 model support for GitHub Copilot	2025-12-17 02:15:10 +07:00
Luis Pater	1efade8bdb	Merge branch 'main' into plus	2025-12-17 02:50:14 +08:00
hkfires	28a428ae2f	fix(thinking): align budget effort mapping across translators Unify thinking budget-to-effort conversion in a shared helper, handle disabled/default thinking cases in translators, adjust zero-budget mapping, and drop the old OpenAI-specific helper with updated tests.	2025-12-16 18:34:43 +08:00
hkfires	b326ec3641	feat(iflow): add thinking support for iFlow models	2025-12-16 18:34:43 +08:00
Ravens2121	f3d1cc8dc1	chore: change debug logs from INFO to DEBUG level	2025-12-16 05:32:03 +08:00
Ravens2121	0a3a95521c	feat: enhance thinking mode support for Kiro translator Changes:	2025-12-16 05:01:40 +08:00
Luis Pater	407020de0c	Merge branch 'router-for-me:main' into main	2025-12-15 10:36:39 +08:00
hkfires	367a05bdf6	refactor(thinking): export thinking helpers Expose thinking/effort normalization helpers from the executor package so conversion tests use production code and stay aligned with runtime validation behavior.	2025-12-15 09:16:15 +08:00
hkfires	712ce9f781	fix(thinking): drop unsupported none effort When budget 0 maps to "none" for models that use thinking levels but don't support that effort level, strip thinking fields instead of setting an invalid reasoning_effort value. Tests now expect removal for this edge case.	2025-12-15 09:16:14 +08:00
hkfires	27a5ad8ec2	Fixed: #534 fix(aistudio): correct JSON string boundary detection for backslash sequences	2025-12-15 09:00:14 +08:00
Ravens2121	de0ea3ac49	fix(kiro): Always parse thinking tags from Kiro API responses Amp-Thread-ID: https://ampcode.com/threads/T-019b1c00-17b4-713d-a8cc-813b71181934 Co-authored-by: Amp <amp@ampcode.com>	2025-12-14 16:46:17 +08:00
Ravens2121	12116b018d	Merge branch 'router-for-me:main' into master	2025-12-14 16:42:30 +08:00
Ravens2121	c3ed3b40ea	feat(kiro): Add token usage cross-validation and simplify thinking mode handling	2025-12-14 16:40:33 +08:00
Luis Pater	b80c2aabb0	Merge branch 'router-for-me:main' into main	2025-12-14 16:19:29 +08:00
Luis Pater	14ce6aebd1	Merge pull request #449 from sususu98/fix/gemini-cli-429-retry-delay-parsing fix(gemini-cli): enhance 429 retry delay parsing	2025-12-14 14:04:14 +08:00
Ravens2121	9c04c18c04	feat(kiro): enhance request translation and fix streaming issues English: - Fix <thinking> tag parsing: only parse at response start, avoid misinterpreting discussion text - Add token counting support using tiktoken for local estimation - Support top_p parameter in inference config - Handle max_tokens=-1 as maximum (32000 tokens) - Add tool_choice and response_format parameter handling via system prompt hints - Support multiple thinking mode detection formats (Claude API, OpenAI reasoning_effort, AMP/Cursor) - Shorten MCP tool names exceeding 64 characters - Fix duplicate [DONE] marker in OpenAI SSE streaming - Enhance token usage statistics with multiple event format support - Add code fence markers to constants 中文: - 修复 <thinking> 标签解析：仅在响应开头解析，避免误解析讨论文本中的标签 - 使用 tiktoken 实现本地 token 计数功能 - 支持 top_p 推理配置参数 - 处理 max_tokens=-1 转换为最大值（32000 tokens） - 通过系统提示词注入实现 tool_choice 和 response_format 参数支持 - 支持多种思考模式检测格式（Claude API、OpenAI reasoning_effort、AMP/Cursor） - 截断超过64字符的 MCP 工具名称 - 修复 OpenAI SSE 流中重复的 [DONE] 标记 - 增强 token 使用量统计，支持多种事件格式 - 添加代码围栏标记常量	2025-12-14 11:57:16 +08:00
Ravens2121	81ae09d0ec	Merge branch 'kiro-refactor-backup'	2025-12-14 07:03:24 +08:00
Ravens2121	01cf221167	feat(kiro): 代码优化重构 + OpenAI翻译器实现	2025-12-14 06:58:50 +08:00
Luis Pater	79033aee34	Merge branch 'main' into plus	2025-12-14 00:07:46 +08:00
Ravens2121	1ea0cff3a4	fix: add missing import declarations for net and time packages	2025-12-13 12:57:47 +08:00
Ravens2121	75793a18f0	feat(kiro): Add Kiro OAuth login entry and auth file filter in Web UI 为Kiro供应商添加WEB UI OAuth登录入口和认证文件过滤器 ## Changes / 更改内容 ### Frontend / 前端 (management.html) - Add Kiro OAuth card UI with support for AWS Builder ID, Google, and GitHub login methods - 添加Kiro OAuth卡片UI，支持AWS Builder ID、Google和GitHub三种登录方式 - Add i18n translations for Kiro OAuth (Chinese and English) - 添加Kiro OAuth的中英文国际化翻译 - Add Kiro filter button in auth files management page - 在认证文件管理页面添加Kiro过滤按钮 - Implement JavaScript methods: startKiroOAuth(), openKiroLink(), copyKiroLink(), copyKiroDeviceCode(), startKiroOAuthPolling(), resetKiroOAuthUI() - 实现JavaScript方法：startKiroOAuth()、openKiroLink()、copyKiroLink()、copyKiroDeviceCode()、startKiroOAuthPolling()、resetKiroOAuthUI() ### Backend / 后端 - Add /kiro-auth-url endpoint for Kiro OAuth authentication (auth_files.go) - 添加/kiro-auth-url端点用于Kiro OAuth认证 (auth_files.go) - Fix GetAuthStatus() to correctly parse device_code and auth_url status - 修复GetAuthStatus()以正确解析device_code和auth_url状态 - Change status delimiter from ':' to '\|' to avoid URL parsing issues - 将状态分隔符从':'改为'\|'以避免URL解析问题 - Export CreateToken method in social_auth.go - 在social_auth.go中导出CreateToken方法 - Register Kiro OAuth routes in server.go - 在server.go中注册Kiro OAuth路由 ## Files Modified / 修改的文件 - management.html - internal/api/handlers/management/auth_files.go - internal/api/server.go - internal/auth/kiro/social_auth.go	2025-12-13 11:39:22 +08:00
Ravens2121	58866b21cb	feat: optimize connection pooling and improve Kiro executor reliability ## 中文说明 ### 连接池优化 - 为 AMP 代理、SOCKS5 代理和 HTTP 代理配置优化的连接池参数 - MaxIdleConnsPerHost 从默认的 2 增加到 20，支持更多并发用户 - MaxConnsPerHost 设为 0（无限制），避免连接瓶颈 - 添加 IdleConnTimeout (90s) 和其他超时配置 ### Kiro 执行器增强 - 添加 Event Stream 消息解析的边界保护，防止越界访问 - 实现实时使用量估算（每 5000 字符或 15 秒发送 ping 事件） - 正确从上游事件中提取并传递 stop_reason - 改进输入 token 计算，优先使用 Claude 格式解析 - 添加 max_tokens 截断警告日志 ### Token 计算改进 - 添加 tokenizer 缓存（sync.Map）避免重复创建 - 为 Claude/Kiro/AmazonQ 模型添加 1.1 调整因子 - 新增 countClaudeChatTokens 函数支持 Claude API 格式 - 支持图像 token 估算（基于尺寸计算） ### 认证刷新优化 - RefreshLead 从 30 分钟改为 5 分钟，与 Antigravity 保持一致 - 修复 NextRefreshAfter 设置，防止频繁刷新检查 - refreshFailureBackoff 从 5 分钟改为 1 分钟，加快失败恢复 --- ## English Description ### Connection Pool Optimization - Configure optimized connection pool parameters for AMP proxy, SOCKS5 proxy, and HTTP proxy - Increase MaxIdleConnsPerHost from default 2 to 20 to support more concurrent users - Set MaxConnsPerHost to 0 (unlimited) to avoid connection bottlenecks - Add IdleConnTimeout (90s) and other timeout configurations ### Kiro Executor Enhancements - Add boundary protection for Event Stream message parsing to prevent out-of-bounds access - Implement real-time usage estimation (send ping events every 5000 chars or 15 seconds) - Correctly extract and pass stop_reason from upstream events - Improve input token calculation, prioritize Claude format parsing - Add max_tokens truncation warning logs ### Token Calculation Improvements - Add tokenizer cache (sync.Map) to avoid repeated creation - Add 1.1 adjustment factor for Claude/Kiro/AmazonQ models - Add countClaudeChatTokens function to support Claude API format - Support image token estimation (calculated based on dimensions) ### Authentication Refresh Optimization - Change RefreshLead from 30 minutes to 5 minutes, consistent with Antigravity - Fix NextRefreshAfter setting to prevent frequent refresh checks - Change refreshFailureBackoff from 5 minutes to 1 minute for faster failure recovery	2025-12-13 10:21:40 +08:00
Luis Pater	660aabc437	fix(executor): add `allowCompat` support for reasoning effort normalization Introduced `allowCompat` parameter to improve compatibility handling for reasoning effort in payloads across OpenAI and similar models.	2025-12-13 04:06:02 +08:00
Ravens2121	db80b20bc2	feat(kiro): enhance thinking support and fix truncation issues - Thinking Support: - Enabled thinking support for all Kiro Claude models, including Haiku 4.5 and agentic variants. - Updated `model_definitions.go` with thinking configuration (Min: 1024, Max: 32000, ZeroAllowed: true). - Fixed `extended_thinking` field names in `model_registry.go` (from `min_budget`/`max_budget` to `min`/`max`) to comply with Claude API specs, enabling thinking control in clients like Claude Code. - Kiro Executor Fixes: - Fixed `budget_tokens` handling: explicitly disable thinking when budget is 0 or negative. - Removed aggressive duplicate content filtering logic that caused truncation/data loss. - Enhanced thinking tag parsing with `extractThinkingFromContent` to correctly handle interleaved thinking/text blocks. - Added EOF handling to flush pending thinking tag characters, preventing data loss at stream end. - Performance: - Optimized Claude stream handler (v6.2) with reduced buffer size (4KB) and faster flush interval (50ms) to minimize latency and prevent timeouts.	2025-12-13 03:57:13 +08:00
Luis Pater	f3f0f1717d	Merge branch 'dev' into think	2025-12-12 22:16:44 +08:00
Luis Pater	05b499fb83	Merge branch 'router-for-me:main' into main	2025-12-12 22:09:08 +08:00
Luis Pater	7621ec609e	Merge pull request #501 from huynguyen03dev/fix/openai-compat-model-alias-resolution fix(openai-compat): prevent model alias from being overwritten	2025-12-12 21:58:15 +08:00
Luis Pater	9f511f0024	fix(executor): improve model compatibility handling for OpenAI-compatibility Enhances payload handling by introducing OpenAI-compatibility checks and refining how reasoning metadata is resolved, ensuring broader model support.	2025-12-12 21:57:25 +08:00
hkfires	374faa2640	fix(thinking): map budgets to effort levels Ensure thinking settings translate correctly across providers: - Only apply reasoning_effort to level-based models and derive it from numeric budget suffixes when present - Strip effort string fields for budget-based models and skip Claude/Gemini budget resolution for level-based or unsupported models - Default Gemini include_thoughts when a nonzero budget override is set - Add cross-protocol conversion and budget range tests	2025-12-12 21:33:20 +08:00
Luis Pater	7f4f6bc9ca	Merge branch 'main' into plus	2025-12-12 18:41:39 +08:00
huynguyen03.dev	15c3cc3a50	fix(openai-compat): prevent model alias from being overwritten by ResolveOriginalModel When using OpenAI-compatible providers with model aliases (e.g., glm-4.6-zai -> glm-4.6), the alias resolution was correctly applied but then immediately overwritten by ResolveOriginalModel, causing 'Unknown Model' errors from upstream APIs. This fix skips the ResolveOriginalModel override when a model alias has already been resolved, ensuring the correct model name is sent to the upstream provider. Co-authored-by: Amp <amp@ampcode.com>	2025-12-12 17:20:24 +07:00
hkfires	d131435e25	fix(codex): raise default reasoning effort to medium	2025-12-12 18:18:48 +08:00
Ravens2121	fdeb84db2b	Merge branch 'router-for-me:main' into master	2025-12-12 13:44:07 +08:00
Ravens2121	84920cb670	feat(kiro): add multi-endpoint fallback & thinking mode support	2025-12-12 13:43:36 +08:00
Ravens2121	204bba9dea	refactor(kiro): update Kiro executor to use CodeWhisperer endpoint and improve tool calling support	2025-12-12 09:27:30 +08:00

1 2 3 4 5 ...

264 Commits