Luis Pater
407020de0c
Merge branch 'router-for-me:main' into main
v6.6.14-0
2025-12-15 10:36:39 +08:00
Luis Pater
8e4fbcaa7d
Merge pull request #533 from router-for-me/think
...
refactor(thinking): centralize reasoning effort mapping and normalize budget values
2025-12-15 10:34:41 +08:00
hkfires
09c339953d
fix(openai): forward reasoning.effort value
...
Drop the hardcoded effort mapping in request conversion so
unknown values are preserved instead of being coerced to `auto`.
2025-12-15 09:16:15 +08:00
hkfires
367a05bdf6
refactor(thinking): export thinking helpers
...
Expose thinking/effort normalization helpers from the executor package
so conversion tests use production code and stay aligned with runtime
validation behavior.
2025-12-15 09:16:15 +08:00
hkfires
d20b71deb9
fix(thinking): normalize effort mapping
...
Route OpenAI reasoning effort through ThinkingEffortToBudget for Claude
translators, preserve "minimal" when translating OpenAI Responses, and
treat blank/unknown efforts as no-ops for Gemini thinking configs.
Also map budget -1 to "auto" and expand cross-protocol thinking tests.
2025-12-15 09:16:15 +08:00
hkfires
712ce9f781
fix(thinking): drop unsupported none effort
...
When budget 0 maps to "none" for models that use thinking levels
but don't support that effort level, strip thinking fields instead
of setting an invalid reasoning_effort value.
Tests now expect removal for this edge case.
2025-12-15 09:16:14 +08:00
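The edge case in the commit above can be sketched roughly as follows. This is not the project's code: `budgetToEffort` and `applyBudget` are hypothetical names, and the positive budget thresholds are illustrative assumptions. Only the 0 to "none" and -1 to "auto" mappings come from the commit messages.

```go
package main

import "fmt"

// budgetToEffort maps a thinking budget to an effort level.
// Only 0 -> "none" and -1 -> "auto" are taken from the commit messages;
// every negative value is treated like -1 here, and the positive
// thresholds are illustrative assumptions.
func budgetToEffort(budget int) string {
	switch {
	case budget == 0:
		return "none"
	case budget < 0:
		return "auto"
	case budget <= 1024:
		return "low"
	case budget <= 8192:
		return "medium"
	default:
		return "high"
	}
}

// applyBudget (hypothetical) sets reasoning_effort when the model supports
// the mapped level and strips the field otherwise, so that an invalid
// value is never sent upstream.
func applyBudget(payload map[string]any, budget int, supported []string) {
	effort := budgetToEffort(budget)
	for _, level := range supported {
		if level == effort {
			payload["reasoning_effort"] = effort
			return
		}
	}
	delete(payload, "reasoning_effort")
}

func main() {
	levels := []string{"low", "medium", "high"} // this model has no "none" level
	payload := map[string]any{"model": "example-model", "reasoning_effort": "high"}
	applyBudget(payload, 0, levels)
	fmt.Println(payload) // reasoning_effort is no longer present
}
```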
hkfires
a4a3274a55
test(thinking): expand conversion edge case coverage
2025-12-15 09:16:14 +08:00
hkfires
716aa71f6e
fix(thinking): centralize reasoning_effort mapping
...
Move OpenAI `reasoning_effort` -> Gemini `thinkingConfig` budget logic into
shared helpers used by Gemini, Gemini CLI, and antigravity translators.
Normalize Claude thinking handling by preferring positive budgets, applying
budget token normalization, and gating by model support.
Always convert Gemini `thinkingBudget` back to OpenAI `reasoning_effort` to
support allowCompat models, and update tests for normalization behavior.
2025-12-15 09:16:14 +08:00
hkfires
e8976f9898
fix(thinking): map budgets to effort for level models
2025-12-15 09:16:14 +08:00
hkfires
8496cc2444
test(thinking): cover openai-compat reasoning passthrough
2025-12-15 09:16:14 +08:00
hkfires
5ef2d59e05
fix(thinking): gate reasoning effort by model support
...
Only map OpenAI reasoning effort to Claude thinking for models that support
thinking and use budget tokens (not level-based thinking).
Also add "xhigh" effort mapping and adjust minimal/low budgets, with new
raw-payload conversion tests across protocols and models.
2025-12-15 09:16:14 +08:00
Chén Mù
07bb89ae80
Merge pull request #542 from router-for-me/aistudio
2025-12-15 09:13:25 +08:00
hkfires
27a5ad8ec2
Fixed: #534
...
fix(aistudio): correct JSON string boundary detection for backslash sequences
2025-12-15 09:00:14 +08:00
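The commit above does not describe the exact bug, so the following is only a sketch of one common way to find the end of a JSON string when backslash sequences are present. `findStringEnd` is a hypothetical name. The key point is that a backslash always consumes the next byte, so `\\"` ends the string while `\"` does not.

```go
package main

import "fmt"

// findStringEnd returns the index of the quote that closes the JSON string
// whose opening quote is at data[start], or -1 if the string is unterminated.
func findStringEnd(data []byte, start int) int {
	for i := start + 1; i < len(data); i++ {
		switch data[i] {
		case '\\':
			i++ // skip the escaped byte, whatever it is
		case '"':
			return i
		}
	}
	return -1
}

func main() {
	fmt.Println(findStringEnd([]byte(`"a\\" tail`), 0))   // escaped backslash, then closing quote
	fmt.Println(findStringEnd([]byte(`"say \"hi\""`), 0)) // escaped quotes inside the string
	fmt.Println(findStringEnd([]byte(`"abc\"`), 0))       // unterminated
}
```

A scanner that only checks whether the previous byte is a backslash would misread the first input, because the backslash before the quote is itself escaped.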
Luis Pater
707b07c5f5
Merge pull request #537 from sukakcoding/fix/function-response-fallback
...
fix: handle malformed json in function response parsing
2025-12-15 03:31:09 +08:00
sukakcoding
4a764afd76
refactor: extract parseFunctionResponse helper to reduce duplication
2025-12-15 01:05:36 +08:00
sukakcoding
ecf49d574b
fix: handle malformed json in function response parsing
2025-12-15 00:59:46 +08:00
Luis Pater
188de4ff2a
Merge branch 'router-for-me:main' into main
v6.6.12-0
2025-12-15 00:31:29 +08:00
Luis Pater
5a75ef8ffd
Merge pull request #536 from AoaoMH/feature/auth-model-check
...
feat: using Client Model Infos;
2025-12-15 00:29:33 +08:00
Test
07279f8746
feat: using Client Model Infos;
2025-12-15 00:13:05 +08:00
Luis Pater
71f788b13a
fix(registry): remove unused ThinkingSupport from DeepSeek-R1 model
2025-12-14 21:30:17 +08:00
Luis Pater
59c62dc580
fix(registry): correct DeepSeek-V3.2 experimental model ID
2025-12-14 21:27:43 +08:00
Luis Pater
8fb1f114bc
Merge pull request #25 from Ravens2121/master
...
fix(kiro): Always parse thinking tags from Kiro API responses
v6.6.11-1
2025-12-14 20:29:14 +08:00
Luis Pater
6a4cff6699
Merge branch 'router-for-me:main' into main
v6.6.11-0
2025-12-14 17:28:28 +08:00
Luis Pater
d5310a3300
Merge pull request #531 from AoaoMH/feature/auth-model-check
...
feat: add API endpoint to query models for auth credentials
2025-12-14 16:46:43 +08:00
Ravens2121
de0ea3ac49
fix(kiro): Always parse thinking tags from Kiro API responses
...
Amp-Thread-ID: https://ampcode.com/threads/T-019b1c00-17b4-713d-a8cc-813b71181934
Co-authored-by: Amp <amp@ampcode.com>
2025-12-14 16:46:17 +08:00
Ravens2121
12116b018d
Merge branch 'router-for-me:main' into master
2025-12-14 16:42:30 +08:00
Ravens2121
c3ed3b40ea
feat(kiro): Add token usage cross-validation and simplify thinking mode handling
2025-12-14 16:40:33 +08:00
Luis Pater
b80c2aabb0
Merge branch 'router-for-me:main' into main
v6.6.10-0
2025-12-14 16:19:29 +08:00
Luis Pater
f0a3eb574e
fix(registry): update DeepSeek model definitions with new IDs and descriptions
2025-12-14 16:17:11 +08:00
Test
bb15855443
feat: add API endpoint to query models for auth credentials
2025-12-14 15:16:26 +08:00
Luis Pater
14ce6aebd1
Merge pull request #449 from sususu98/fix/gemini-cli-429-retry-delay-parsing
...
fix(gemini-cli): enhance 429 retry delay parsing
2025-12-14 14:04:14 +08:00
Luis Pater
2fe83723f2
Merge pull request #515 from teeverc/fix/response-rewriter-streaming-flush
...
fix(amp): flush response buffer after each streaming chunk write
2025-12-14 13:26:05 +08:00
Luis Pater
e73b9e10a6
Merge pull request #24 from Ravens2121/master
...
feat(kiro): Major Refactoring + OpenAI Translator Implementation + Streaming Fixes
v6.6.9-1
2025-12-14 12:51:28 +08:00
Ravens2121
9c04c18c04
feat(kiro): enhance request translation and fix streaming issues
...
- Fix <thinking> tag parsing: only parse at response start, avoid misinterpreting discussion text
- Add token counting support using tiktoken for local estimation
- Support top_p parameter in inference config
- Handle max_tokens=-1 as maximum (32000 tokens)
- Add tool_choice and response_format parameter handling via system prompt hints
- Support multiple thinking mode detection formats (Claude API, OpenAI reasoning_effort, AMP/Cursor)
- Shorten MCP tool names exceeding 64 characters
- Fix duplicate [DONE] marker in OpenAI SSE streaming
- Enhance token usage statistics with multiple event format support
- Add code fence markers to constants
2025-12-14 11:57:16 +08:00
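As an illustration of the 64-character limit on MCP tool names mentioned in the commit above, here is a minimal sketch that assumes plain truncation. The commit does not say how names are shortened, and `shortenToolName` and `maxToolNameLen` are hypothetical names.

```go
package main

import "fmt"

// maxToolNameLen is the limit stated in the commit message.
const maxToolNameLen = 64

// shortenToolName truncates a tool name to at most 64 characters.
// It counts runes rather than bytes so multi-byte characters are not split.
func shortenToolName(name string) string {
	runes := []rune(name)
	if len(runes) <= maxToolNameLen {
		return name
	}
	return string(runes[:maxToolNameLen])
}

func main() {
	long := "mcp__example_server__a_tool_with_a_very_long_descriptive_name_that_keeps_going"
	short := shortenToolName(long)
	fmt.Println(len([]rune(short)), short)
}
```

Plain truncation can make two long names collide, so a real implementation may also need to keep a mapping back to the original name.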
Ravens2121
81ae09d0ec
Merge branch 'kiro-refactor-backup'
2025-12-14 07:03:24 +08:00
Ravens2121
01cf221167
feat(kiro): code optimization and refactoring + OpenAI translator implementation
2025-12-14 06:58:50 +08:00
teeverc
cd8c86c6fb
refactor: only flush stream response on successful write
2025-12-13 13:32:54 -08:00
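The idea in the commit above, flushing a streaming response only when the write succeeded, might look like the sketch below. `writeChunk` and `fakeWriter` are hypothetical names used for illustration; `http.Flusher` is the standard library interface.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net/http"
)

// writeChunk writes one streaming chunk and flushes only if the write succeeded.
func writeChunk(w io.Writer, chunk []byte) error {
	if _, err := w.Write(chunk); err != nil {
		return err // the write failed, so there is nothing useful to flush
	}
	if f, ok := w.(http.Flusher); ok {
		f.Flush()
	}
	return nil
}

// fakeWriter stands in for an http.ResponseWriter in this sketch.
type fakeWriter struct {
	fail    bool
	flushes int
}

func (f *fakeWriter) Write(p []byte) (int, error) {
	if f.fail {
		return 0, errors.New("client disconnected")
	}
	return len(p), nil
}

func (f *fakeWriter) Flush() { f.flushes++ }

func main() {
	good := &fakeWriter{}
	_ = writeChunk(good, []byte("data: hello\n\n"))

	bad := &fakeWriter{fail: true}
	err := writeChunk(bad, []byte("data: hello\n\n"))

	fmt.Println(good.flushes, bad.flushes, err) // prints "1 0 client disconnected"
}
```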
teeverc
52d5fd1a67
fix: streaming for amp cli
2025-12-13 13:17:53 -08:00
Luis Pater
7ecc7aabda
Merge pull request #23 from router-for-me/plus
...
v6.6.9
v6.6.9-0
2025-12-14 00:07:57 +08:00
Luis Pater
79033aee34
Merge branch 'main' into plus
2025-12-14 00:07:46 +08:00
Luis Pater
b6ad243e9e
Merge pull request #498 from teeverc/fix/claude-streaming-flush
...
fix(claude): flush Claude SSE chunks immediately
2025-12-13 23:58:34 +08:00
Luis Pater
92ca5078c1
docs(readme): update contributors for Kiro integration (AWS CodeWhisperer)
v6.6.8-1
2025-12-13 13:40:39 +08:00
Luis Pater
aca8523060
Merge pull request #22 from Ravens2121/master
...
feat(kiro): enhance thinking support and fix truncation issues
2025-12-13 13:37:47 +08:00
Ravens2121
1ea0cff3a4
fix: add missing import declarations for net and time packages
2025-12-13 12:57:47 +08:00
Ravens2121
75793a18f0
feat(kiro): Add Kiro OAuth login entry and auth file filter in Web UI
...
Add a Web UI OAuth login entry and an auth file filter for the Kiro provider.
## Changes
### Frontend (management.html)
- Add Kiro OAuth card UI with support for AWS Builder ID, Google, and GitHub login methods
- Add i18n translations for Kiro OAuth (Chinese and English)
- Add Kiro filter button in auth files management page
- Implement JavaScript methods: startKiroOAuth(), openKiroLink(), copyKiroLink(), copyKiroDeviceCode(), startKiroOAuthPolling(), resetKiroOAuthUI()
### Backend
- Add /kiro-auth-url endpoint for Kiro OAuth authentication (auth_files.go)
- Fix GetAuthStatus() to correctly parse device_code and auth_url status
- Change status delimiter from ':' to '|' to avoid URL parsing issues
- Export CreateToken method in social_auth.go
- Register Kiro OAuth routes in server.go
## Files Modified
- management.html
- internal/api/handlers/management/auth_files.go
- internal/api/server.go
- internal/auth/kiro/social_auth.go
2025-12-13 11:39:22 +08:00
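The delimiter change in the commit above can be illustrated with a short sketch. The `kind|value` status format and the `parseStatus` name are assumptions made for this example; the commit only states that ':' was replaced with '|' because URLs contain colons.

```go
package main

import (
	"fmt"
	"strings"
)

// parseStatus splits a status string of the assumed form "kind|value".
// ok is false when the delimiter is missing.
func parseStatus(status string) (kind, value string, ok bool) {
	return strings.Cut(status, "|")
}

func main() {
	// With ':' as the delimiter, the colon inside the URL produces an extra field.
	old := "auth_url:https://example.com/authorize?state=abc"
	fmt.Println(len(strings.Split(old, ":"))) // prints 3, not the expected 2

	kind, value, ok := parseStatus("auth_url|https://example.com/authorize?state=abc")
	fmt.Println(kind, value, ok)
}
```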
Ravens2121
58866b21cb
feat: optimize connection pooling and improve Kiro executor reliability
...
## Description
### Connection Pool Optimization
- Configure optimized connection pool parameters for AMP proxy, SOCKS5 proxy, and HTTP proxy
- Increase MaxIdleConnsPerHost from default 2 to 20 to support more concurrent users
- Set MaxConnsPerHost to 0 (unlimited) to avoid connection bottlenecks
- Add IdleConnTimeout (90s) and other timeout configurations
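A minimal sketch of the pool settings listed above, using the standard `net/http` transport fields. `newPooledTransport` is a hypothetical name, and the other timeouts mentioned in the commit are omitted because their values are not given.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newPooledTransport returns a transport with the pool values named in the
// commit message.
func newPooledTransport() *http.Transport {
	return &http.Transport{
		MaxIdleConnsPerHost: 20,               // the net/http default is 2
		MaxConnsPerHost:     0,                // 0 means no per-host limit
		IdleConnTimeout:     90 * time.Second, // close idle connections after 90s
	}
}

func main() {
	transport := newPooledTransport()
	client := &http.Client{Transport: transport}
	_ = client // use this client for upstream requests
	fmt.Println(transport.MaxIdleConnsPerHost, transport.MaxConnsPerHost, transport.IdleConnTimeout)
}
```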
### Kiro Executor Enhancements
- Add boundary protection for Event Stream message parsing to prevent out-of-bounds access
- Implement real-time usage estimation (send ping events every 5000 chars or 15 seconds)
- Correctly extract and pass stop_reason from upstream events
- Improve input token calculation, prioritize Claude format parsing
- Add max_tokens truncation warning logs
### Token Calculation Improvements
- Add tokenizer cache (sync.Map) to avoid repeated creation
- Add 1.1 adjustment factor for Claude/Kiro/AmazonQ models
- Add countClaudeChatTokens function to support Claude API format
- Support image token estimation (calculated based on dimensions)
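The tokenizer cache described above could be sketched like this. The `tokenizer` type, `newTokenizer`, and `getTokenizer` are hypothetical stand-ins, since the commit only states that a `sync.Map` is used to avoid repeated creation.

```go
package main

import (
	"fmt"
	"sync"
)

// tokenizer is a placeholder for whatever tokenizer type the project uses.
type tokenizer struct{ model string }

// newTokenizer stands in for an expensive constructor.
func newTokenizer(model string) *tokenizer { return &tokenizer{model: model} }

// tokenizerCache maps a model name to its tokenizer.
var tokenizerCache sync.Map

// getTokenizer returns the cached tokenizer for a model, creating it on first use.
func getTokenizer(model string) *tokenizer {
	if cached, ok := tokenizerCache.Load(model); ok {
		return cached.(*tokenizer)
	}
	// LoadOrStore keeps the first stored value if two goroutines race here.
	actual, _ := tokenizerCache.LoadOrStore(model, newTokenizer(model))
	return actual.(*tokenizer)
}

func main() {
	a := getTokenizer("claude-example")
	b := getTokenizer("claude-example")
	fmt.Println(a == b) // prints true: both calls return the same instance
}
```

Under a race, this sketch may build a tokenizer that is then discarded, which is acceptable when construction has no side effects.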
### Authentication Refresh Optimization
- Change RefreshLead from 30 minutes to 5 minutes, consistent with Antigravity
- Fix NextRefreshAfter setting to prevent frequent refresh checks
- Change refreshFailureBackoff from 5 minutes to 1 minute for faster failure recovery
2025-12-13 10:21:40 +08:00
Luis Pater
660aabc437
fix(executor): add allowCompat support for reasoning effort normalization
...
Introduced `allowCompat` parameter to improve compatibility handling for reasoning effort in payloads across OpenAI and similar models.
2025-12-13 04:06:02 +08:00
Ravens2121
db80b20bc2
feat(kiro): enhance thinking support and fix truncation issues
...
- **Thinking Support**:
  - Enabled thinking support for all Kiro Claude models, including Haiku 4.5 and agentic variants.
  - Updated `model_definitions.go` with thinking configuration (Min: 1024, Max: 32000, ZeroAllowed: true).
  - Fixed `extended_thinking` field names in `model_registry.go` (from `min_budget`/`max_budget` to `min`/`max`) to comply with Claude API specs, enabling thinking control in clients like Claude Code.
- **Kiro Executor Fixes**:
  - Fixed `budget_tokens` handling: explicitly disable thinking when budget is 0 or negative.
  - Removed aggressive duplicate content filtering logic that caused truncation/data loss.
  - Enhanced thinking tag parsing with `extractThinkingFromContent` to correctly handle interleaved thinking/text blocks.
  - Added EOF handling to flush pending thinking tag characters, preventing data loss at stream end.
- **Performance**:
  - Optimized Claude stream handler (v6.2) with reduced buffer size (4KB) and faster flush interval (50ms) to minimize latency and prevent timeouts.
2025-12-13 03:57:13 +08:00
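The `budget_tokens` rule from the commit above, combined with the stated Min 1024 and Max 32000 range, might be sketched as follows. `thinkingRange` and `resolveBudget` are hypothetical names, and clamping out-of-range positive budgets is an assumption rather than confirmed behavior.

```go
package main

import "fmt"

// thinkingRange holds the budget bounds from the commit message.
type thinkingRange struct {
	Min int
	Max int
}

// resolveBudget reports whether thinking should be enabled and with which budget.
// A budget of 0 or below disables thinking, as the commit describes; clamping
// positive budgets into [Min, Max] is an assumption of this sketch.
func resolveBudget(budget int, r thinkingRange) (tokens int, enabled bool) {
	if budget <= 0 {
		return 0, false
	}
	if budget < r.Min {
		budget = r.Min
	}
	if budget > r.Max {
		budget = r.Max
	}
	return budget, true
}

func main() {
	r := thinkingRange{Min: 1024, Max: 32000}
	for _, budget := range []int{-1, 0, 512, 4096, 100000} {
		tokens, enabled := resolveBudget(budget, r)
		fmt.Println(budget, tokens, enabled)
	}
}
```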
Luis Pater
566120e8d5
Merge pull request #505 from router-for-me/think
...
fix(thinking): map budgets to effort levels
2025-12-12 22:17:11 +08:00
Luis Pater
f3f0f1717d
Merge branch 'dev' into think
2025-12-12 22:16:44 +08:00