CLIProxyAPIPlus

mirror of https://github.com/router-for-me/CLIProxyAPIPlus.git synced 2026-03-09 15:25:17 +00:00

Author	SHA1	Message	Date
Joao	7fd98f3556	feat: add IDC auth support with Kiro IDE headers	2025-12-23 08:18:10 +00:00
Luis Pater	b1aecc2bf1	Merge branch 'router-for-me:main' into main	2025-12-23 02:49:37 +08:00
Luis Pater	83b90e106f	refactor(antigravity): add sandbox URL constant and update base URLs routine	2025-12-23 02:47:56 +08:00
Luis Pater	e755e567ea	Merge branch 'router-for-me:main' into main	2025-12-21 19:54:13 +08:00
Luis Pater	63908869f6	Merge pull request #611 from soilSpoon/feature/antigravity feat(antigravity): Improve Claude model compatibility	2025-12-21 16:27:29 +08:00
이대희	4070c9de81	Remove interleaved-thinking header from requests Removes the addition of the "anthropic-beta: interleaved-thinking-2025-05-14" header for Claude thinking models when building HTTP requests. This prevents sending an experimental/feature flag header that is no longer required and avoids potential compatibility or routing issues with downstream services. Keeps request headers simpler and more standard.	2025-12-21 15:29:36 +09:00
이대희	1e9e4a86a2	Improve thinking/tool signature handling for Claude and Gemini requests Prefer cached signatures and avoid injecting dummy thinking blocks; instead remove unsigned thinking blocks and add a skip sentinel for tool calls without a valid signature. Generate stable session IDs from the first user message, apply schema cleaning only for Claude models, and reorder thinking parts so thinking appears first. For Gemini, remove thinking blocks and attach a skip sentinel to function calls. Simplify response handling by passing raw function args through (remove special Bash conversion). Update and add tests to reflect the new behavior. These changes prevent rejected dummy signatures, improve compatibility with Antigravity’s signature validation, provide more stable session IDs for conversation grouping, and make request/response translation more robust.	2025-12-21 15:15:50 +09:00
Luis Pater	5418bbc338	Merge branch 'router-for-me:main' into main	2025-12-20 23:40:09 +08:00
Ben Vargas	1231dc9cda	feat(antigravity): add payload config support to Antigravity executor Add applyPayloadConfig calls to all Antigravity executor paths (Execute, executeClaudeNonStream, ExecuteStream) to enable config.yaml payload overrides for Antigravity/Gemini-Claude models. This allows users to configure thinking budget and other parameters via payload.override in config.yaml for models like gemini-claude-opus-4-5*.	2025-12-19 22:30:44 -07:00
Luis Pater	843316ea7a	Merge branch 'router-for-me:main' into main	2025-12-19 22:24:26 +08:00
hkfires	2039062845	fix(gemini): add optional skip for gemini3 thinking conversion	2025-12-19 22:07:43 +08:00
이대희	b6ba15fcbd	fix(runtime/executor): Antigravity executor schema handling and Claude-specific headers	2025-12-19 10:28:23 +09:00
Luis Pater	0f646800f6	Merge branch 'router-for-me:main' into main	2025-12-18 08:36:59 +08:00
Luis Pater	13eb5268de	Merge pull request #582 from ben-vargas/fix-gemini-3-thinking-level feat: use thinkingLevel for Gemini 3 models per Google documentation	2025-12-18 07:19:37 +08:00
Ben Vargas	598f0af19b	fix: apply thinkingLevel from model suffix metadata for Gemini 3 The previous commit added thinkingLevel support but didn't apply it when the reasoning effort came from model name suffix (e.g., model(minimal)). This was because ResolveThinkingConfigFromMetadata returns nil for level-based models, bypassing the metadata application. Changes: - Add ApplyGemini3ThinkingLevelFromMetadata for standard Gemini API - Add ApplyGemini3ThinkingLevelFromMetadataCLI for CLI API format - Update gemini_cli_executor to apply Gemini 3 thinkingLevel from metadata - Update antigravity_executor to apply Gemini 3 thinkingLevel from metadata - Update aistudio_executor to apply Gemini 3 thinkingLevel from metadata - Add comprehensive test coverage for Gemini 3 thinkingLevel functions	2025-12-17 16:08:38 -07:00
Ben Vargas	a33f5d31fc	feat: use thinkingLevel for Gemini 3 models per Google documentation Per Google's official documentation, Gemini 3 models should use thinkingLevel (string) instead of thinkingBudget (number) for optimal performance. From Google's Gemini Thinking docs: > Use the thinkingLevel parameter with Gemini 3 models. While > thinkingBudget is accepted for backwards compatibility, using > it with Gemini 3 Pro may result in suboptimal performance. Changes: - Add model family detection functions (IsGemini3Model, IsGemini25Model, IsGemini3ProModel, IsGemini3FlashModel) - Add ApplyGeminiThinkingLevel and ApplyGeminiCLIThinkingLevel functions for applying thinkingLevel config - Add ValidateGemini3ThinkingLevel for model-specific level validation - Add ThinkingBudgetToGemini3Level for backward compatibility conversion - Update NormalizeGeminiThinkingBudget to convert budget to level for Gemini 3 models - Update ApplyDefaultThinkingIfNeeded to not set a default level for Gemini 3 (lets API use its dynamic default "high") - Update ConvertThinkingLevelToBudget to preserve thinkingLevel for Gemini 3 models - Add Levels field to all Gemini 3 model definitions: - Gemini 3 Pro: ["low", "high"] - Gemini 3 Flash: ["minimal", "low", "medium", "high"] Backward compatibility: - Gemini 2.5 models continue to use thinkingBudget as before - If thinkingBudget is provided for Gemini 3, it's converted to the appropriate thinkingLevel - Existing configurations continue to work	2025-12-17 15:28:20 -07:00
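The level lists in the commit above (Gemini 3 Pro: low, high; Gemini 3 Flash: minimal, low, medium, high) suggest a simple per-family validation step. The following is a minimal sketch of such a check, not the project's actual `ValidateGemini3ThinkingLevel`. The function name `validateLevel`, the prefix-matching rule, the family keys, and the example model name `gemini-3-pro-preview` are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// gemini3Levels holds the thinking levels listed in the commit message.
// The map keys are illustrative family prefixes, not project identifiers.
var gemini3Levels = map[string][]string{
	"gemini-3-pro":   {"low", "high"},
	"gemini-3-flash": {"minimal", "low", "medium", "high"},
}

// validateLevel (hypothetical helper) reports whether level is allowed for
// the model family whose prefix matches model. Unknown models return false.
func validateLevel(model, level string) bool {
	want := strings.ToLower(level)
	for family, levels := range gemini3Levels {
		if !strings.HasPrefix(model, family) {
			continue
		}
		for _, l := range levels {
			if l == want {
				return true
			}
		}
		return false
	}
	return false
}

func main() {
	// "medium" is only listed for the Flash family.
	fmt.Println(validateLevel("gemini-3-pro-preview", "medium"))   // false
	fmt.Println(validateLevel("gemini-3-flash-preview", "medium")) // true
}
```

A lookup table keeps the per-model rules in one place, so adding a level for a family would be a data change rather than a logic change.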
Ravens2121	54acd69e9d	Merge branch 'router-for-me:main' into master	2025-12-18 04:39:17 +08:00
Ravens2121	d687ee2777	feat(kiro): implement official reasoningContentEvent and improve metadat	2025-12-18 04:38:22 +08:00
Luis Pater	f7b17ee6ec	Merge pull request #36 from rezhajulio/feat/gpt-5.2 Add GPT-5.2 model support for GitHub Copilot	2025-12-18 03:16:25 +08:00
Luis Pater	408614c74c	Merge branch 'router-for-me:main' into main	2025-12-18 03:13:48 +08:00
Luis Pater	68a27772b3	feat(antigravity): enable token counting via API with resilient routing Introduces the capability to count tokens for Antigravity-backed requests. This implementation leverages the `countTokens` endpoint of the Antigravity API, replacing the prior unsupported stub. Key aspects of this update include: - API Integration: Direct integration with the Antigravity `countTokens` API, including necessary request payload translation and authentication. - Resilient Infrastructure: A fallback mechanism has been established, allowing the system to attempt connections across multiple Antigravity base URLs to ensure request success even in the event of temporary service interruptions. - Model Aliasing: Added mappings for `gemini-3-flash` and `gemini-3-flash-preview` to ensure compatibility with the latest model variants. - Robust Error Handling: Comprehensive error handling and logging are in place to manage failures during API interactions.	2025-12-18 03:12:46 +08:00
Luis Pater	10e0ea1309	Merge main into pr-39	2025-12-18 00:36:51 +08:00
Luis Pater	5fda6f8ef3	feat(antigravity): implement non-streaming execution for Claude model requests	2025-12-17 23:17:11 +08:00
Luis Pater	09923f654c	feat(antigravity): add streaming support for Claude model requests	2025-12-17 22:16:57 +08:00
이대희	1b8e538a77	feature: Improves Gemini JSON schema compatibility Enhances compatibility with the Gemini API by implementing a schema cleaning process. This includes: - Centralizing schema cleaning logic for Gemini in a dedicated utility function. - Converting unsupported schema keywords to hints within the description field. - Flattening complex schema structures like `anyOf`, `oneOf`, and type arrays to simplify the schema. - Handling streaming responses with empty tool names, which can occur in subsequent chunks after the initial tool use.	2025-12-17 17:10:53 +09:00
Rezha Julio	92c62bb2fb	Add GPT-5.2 model support for GitHub Copilot	2025-12-17 02:15:10 +07:00
Luis Pater	1efade8bdb	Merge branch 'main' into plus	2025-12-17 02:50:14 +08:00
hkfires	28a428ae2f	fix(thinking): align budget effort mapping across translators Unify thinking budget-to-effort conversion in a shared helper, handle disabled/default thinking cases in translators, adjust zero-budget mapping, and drop the old OpenAI-specific helper with updated tests.	2025-12-16 18:34:43 +08:00
hkfires	b326ec3641	feat(iflow): add thinking support for iFlow models	2025-12-16 18:34:43 +08:00
Ravens2121	f3d1cc8dc1	chore: change debug logs from INFO to DEBUG level	2025-12-16 05:32:03 +08:00
Ravens2121	0a3a95521c	feat: enhance thinking mode support for Kiro translator Changes:	2025-12-16 05:01:40 +08:00
Luis Pater	407020de0c	Merge branch 'router-for-me:main' into main	2025-12-15 10:36:39 +08:00
hkfires	367a05bdf6	refactor(thinking): export thinking helpers Expose thinking/effort normalization helpers from the executor package so conversion tests use production code and stay aligned with runtime validation behavior.	2025-12-15 09:16:15 +08:00
hkfires	712ce9f781	fix(thinking): drop unsupported none effort When budget 0 maps to "none" for models that use thinking levels but don't support that effort level, strip thinking fields instead of setting an invalid reasoning_effort value. Tests now expect removal for this edge case.	2025-12-15 09:16:14 +08:00
hkfires	27a5ad8ec2	Fixed: #534 fix(aistudio): correct JSON string boundary detection for backslash sequences	2025-12-15 09:00:14 +08:00
Ravens2121	de0ea3ac49	fix(kiro): Always parse thinking tags from Kiro API responses Amp-Thread-ID: https://ampcode.com/threads/T-019b1c00-17b4-713d-a8cc-813b71181934 Co-authored-by: Amp <amp@ampcode.com>	2025-12-14 16:46:17 +08:00
Ravens2121	12116b018d	Merge branch 'router-for-me:main' into master	2025-12-14 16:42:30 +08:00
Ravens2121	c3ed3b40ea	feat(kiro): Add token usage cross-validation and simplify thinking mode handling	2025-12-14 16:40:33 +08:00
Luis Pater	b80c2aabb0	Merge branch 'router-for-me:main' into main	2025-12-14 16:19:29 +08:00
Luis Pater	14ce6aebd1	Merge pull request #449 from sususu98/fix/gemini-cli-429-retry-delay-parsing fix(gemini-cli): enhance 429 retry delay parsing	2025-12-14 14:04:14 +08:00
Ravens2121	9c04c18c04	feat(kiro): enhance request translation and fix streaming issues - Fix <thinking> tag parsing: only parse at response start, avoid misinterpreting discussion text - Add token counting support using tiktoken for local estimation - Support top_p parameter in inference config - Handle max_tokens=-1 as maximum (32000 tokens) - Add tool_choice and response_format parameter handling via system prompt hints - Support multiple thinking mode detection formats (Claude API, OpenAI reasoning_effort, AMP/Cursor) - Shorten MCP tool names exceeding 64 characters - Fix duplicate [DONE] marker in OpenAI SSE streaming - Enhance token usage statistics with multiple event format support - Add code fence markers to constants	2025-12-14 11:57:16 +08:00
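Two of the request-translation rules in the commit above are small enough to sketch directly: treating `max_tokens=-1` as the maximum (32000) and shortening tool names longer than 64 characters. The helper names below are hypothetical, and plain truncation is an assumption; the commit does not say how names are shortened, and truncation alone can produce collisions between long names that share a prefix.

```go
package main

import "fmt"

const (
	maxOutputTokens = 32000 // maximum stated in the commit message
	maxToolNameLen  = 64    // tool-name limit stated in the commit message
)

// normalizeMaxTokens (hypothetical) maps the sentinel -1 to the maximum and
// leaves every other value unchanged.
func normalizeMaxTokens(n int) int {
	if n == -1 {
		return maxOutputTokens
	}
	return n
}

// shortenToolName (hypothetical) truncates names longer than 64 characters.
// It counts runes rather than bytes so multi-byte characters are not split.
func shortenToolName(name string) string {
	r := []rune(name)
	if len(r) <= maxToolNameLen {
		return name
	}
	return string(r[:maxToolNameLen])
}

func main() {
	fmt.Println(normalizeMaxTokens(-1))   // 32000
	fmt.Println(normalizeMaxTokens(4096)) // 4096
	long := "mcp__example_server__" + fmt.Sprintf("%060d", 0)
	fmt.Println(len(shortenToolName(long))) // 64
}
```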
Ravens2121	81ae09d0ec	Merge branch 'kiro-refactor-backup'	2025-12-14 07:03:24 +08:00
Ravens2121	01cf221167	feat(kiro): code optimization and refactoring + OpenAI translator implementation	2025-12-14 06:58:50 +08:00
Luis Pater	79033aee34	Merge branch 'main' into plus	2025-12-14 00:07:46 +08:00
Ravens2121	1ea0cff3a4	fix: add missing import declarations for net and time packages	2025-12-13 12:57:47 +08:00
Ravens2121	75793a18f0	feat(kiro): Add Kiro OAuth login entry and auth file filter in Web UI ## Changes ### Frontend (management.html) - Add Kiro OAuth card UI with support for AWS Builder ID, Google, and GitHub login methods - Add i18n translations for Kiro OAuth (Chinese and English) - Add Kiro filter button in auth files management page - Implement JavaScript methods: startKiroOAuth(), openKiroLink(), copyKiroLink(), copyKiroDeviceCode(), startKiroOAuthPolling(), resetKiroOAuthUI() ### Backend - Add /kiro-auth-url endpoint for Kiro OAuth authentication (auth_files.go) - Fix GetAuthStatus() to correctly parse device_code and auth_url status - Change status delimiter from ':' to '|' to avoid URL parsing issues - Export CreateToken method in social_auth.go - Register Kiro OAuth routes in server.go ## Files Modified - management.html - internal/api/handlers/management/auth_files.go - internal/api/server.go - internal/auth/kiro/social_auth.go	2025-12-13 11:39:22 +08:00
Ravens2121	58866b21cb	feat: optimize connection pooling and improve Kiro executor reliability ### Connection Pool Optimization - Configure optimized connection pool parameters for AMP proxy, SOCKS5 proxy, and HTTP proxy - Increase MaxIdleConnsPerHost from default 2 to 20 to support more concurrent users - Set MaxConnsPerHost to 0 (unlimited) to avoid connection bottlenecks - Add IdleConnTimeout (90s) and other timeout configurations ### Kiro Executor Enhancements - Add boundary protection for Event Stream message parsing to prevent out-of-bounds access - Implement real-time usage estimation (send ping events every 5000 chars or 15 seconds) - Correctly extract and pass stop_reason from upstream events - Improve input token calculation, prioritize Claude format parsing - Add max_tokens truncation warning logs ### Token Calculation Improvements - Add tokenizer cache (sync.Map) to avoid repeated creation - Add 1.1 adjustment factor for Claude/Kiro/AmazonQ models - Add countClaudeChatTokens function to support Claude API format - Support image token estimation (calculated based on dimensions) ### Authentication Refresh Optimization - Change RefreshLead from 30 minutes to 5 minutes, consistent with Antigravity - Fix NextRefreshAfter setting to prevent frequent refresh checks - Change refreshFailureBackoff from 5 minutes to 1 minute for faster failure recovery	2025-12-13 10:21:40 +08:00
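The connection pool values named in the commit above map directly onto fields of Go's standard `net/http.Transport`. The sketch below shows only those three settings; the constructor name `newPooledTransport` is hypothetical, and the commit's "other timeout configurations" are not specified in the log, so they are left out.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newPooledTransport (hypothetical name) builds a transport using the three
// values stated in the commit message.
func newPooledTransport() *http.Transport {
	return &http.Transport{
		MaxIdleConnsPerHost: 20,               // raised from the default of 2
		MaxConnsPerHost:     0,                // 0 means no per-host limit
		IdleConnTimeout:     90 * time.Second, // close idle connections after 90s
	}
}

func main() {
	tr := newPooledTransport()
	client := &http.Client{Transport: tr}
	_ = client // a real caller would issue requests through this client
	fmt.Println(tr.MaxIdleConnsPerHost, tr.MaxConnsPerHost, tr.IdleConnTimeout)
}
```

A larger per-host idle pool mainly helps a proxy like this one because most traffic goes to a small number of upstream hosts, so connections can be reused instead of re-established per request.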
Luis Pater	660aabc437	fix(executor): add `allowCompat` support for reasoning effort normalization Introduced `allowCompat` parameter to improve compatibility handling for reasoning effort in payloads across OpenAI and similar models.	2025-12-13 04:06:02 +08:00
Ravens2121	db80b20bc2	feat(kiro): enhance thinking support and fix truncation issues - Thinking Support: - Enabled thinking support for all Kiro Claude models, including Haiku 4.5 and agentic variants. - Updated `model_definitions.go` with thinking configuration (Min: 1024, Max: 32000, ZeroAllowed: true). - Fixed `extended_thinking` field names in `model_registry.go` (from `min_budget`/`max_budget` to `min`/`max`) to comply with Claude API specs, enabling thinking control in clients like Claude Code. - Kiro Executor Fixes: - Fixed `budget_tokens` handling: explicitly disable thinking when budget is 0 or negative. - Removed aggressive duplicate content filtering logic that caused truncation/data loss. - Enhanced thinking tag parsing with `extractThinkingFromContent` to correctly handle interleaved thinking/text blocks. - Added EOF handling to flush pending thinking tag characters, preventing data loss at stream end. - Performance: - Optimized Claude stream handler (v6.2) with reduced buffer size (4KB) and faster flush interval (50ms) to minimize latency and prevent timeouts.	2025-12-13 03:57:13 +08:00
Luis Pater	f3f0f1717d	Merge branch 'dev' into think	2025-12-12 22:16:44 +08:00

274 Commits