Luis Pater
6a4cff6699
Merge branch 'router-for-me:main' into main
2025-12-14 17:28:28 +08:00
Luis Pater
d5310a3300
Merge pull request #531 from AoaoMH/feature/auth-model-check
...
feat: add API endpoint to query models for auth credentials
2025-12-14 16:46:43 +08:00
Luis Pater
b80c2aabb0
Merge branch 'router-for-me:main' into main
2025-12-14 16:19:29 +08:00
Luis Pater
f0a3eb574e
fix(registry): update DeepSeek model definitions with new IDs and descriptions
2025-12-14 16:17:11 +08:00
Test
bb15855443
feat: add API endpoint to query models for auth credentials
2025-12-14 15:16:26 +08:00
Luis Pater
14ce6aebd1
Merge pull request #449 from sususu98/fix/gemini-cli-429-retry-delay-parsing
...
fix(gemini-cli): enhance 429 retry delay parsing
2025-12-14 14:04:14 +08:00
Luis Pater
2fe83723f2
Merge pull request #515 from teeverc/fix/response-rewriter-streaming-flush
...
fix(amp): flush response buffer after each streaming chunk write
2025-12-14 13:26:05 +08:00
Ravens2121
9c04c18c04
feat(kiro): enhance request translation and fix streaming issues
...
- Fix <thinking> tag parsing: only parse at response start, to avoid misinterpreting tags in discussion text
- Add token counting support using tiktoken for local estimation
- Support top_p parameter in inference config
- Handle max_tokens=-1 as maximum (32000 tokens)
- Add tool_choice and response_format parameter handling via system prompt hints
- Support multiple thinking mode detection formats (Claude API, OpenAI reasoning_effort, AMP/Cursor)
- Truncate MCP tool names exceeding 64 characters
- Fix duplicate [DONE] marker in OpenAI SSE streaming
- Enhance token usage statistics with multiple event format support
- Add constants for code fence markers
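A minimal sketch of two of the rules listed above: treating max_tokens=-1 as the 32000-token maximum and cutting MCP tool names down to 64 characters. The function names are hypothetical, and plain truncation is an assumption; the actual code may shorten names differently to keep them unique.

```go
package main

import (
	"fmt"
	"strings"
)

// Limits quoted in the commit message; the identifiers are hypothetical.
const (
	maxOutputTokens = 32000
	maxToolNameLen  = 64
)

// resolveMaxTokens maps the sentinel -1 to the maximum and leaves
// every other value untouched.
func resolveMaxTokens(requested int) int {
	if requested == -1 {
		return maxOutputTokens
	}
	return requested
}

// shortenToolName truncates names longer than 64 characters.
// It works on runes so multi-byte characters are never split.
func shortenToolName(name string) string {
	r := []rune(name)
	if len(r) <= maxToolNameLen {
		return name
	}
	return string(r[:maxToolNameLen])
}

func main() {
	fmt.Println(resolveMaxTokens(-1))
	fmt.Println(len(shortenToolName(strings.Repeat("x", 80))))
}
```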
2025-12-14 11:57:16 +08:00
Ravens2121
81ae09d0ec
Merge branch 'kiro-refactor-backup'
2025-12-14 07:03:24 +08:00
Ravens2121
01cf221167
feat(kiro): code optimization and refactoring + OpenAI translator implementation
2025-12-14 06:58:50 +08:00
teeverc
cd8c86c6fb
refactor: only flush stream response on successful write
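As an illustration of the idea in this commit title, the sketch below flushes a streaming response only when the preceding write succeeded. `writeChunk` and `demo` are hypothetical names; only the standard `http.Flusher` interface and `httptest.ResponseRecorder` are real APIs.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// writeChunk writes one streaming chunk and flushes only if the
// write succeeded. Name and signature are hypothetical.
func writeChunk(w http.ResponseWriter, chunk []byte) error {
	if _, err := w.Write(chunk); err != nil {
		return err // write failed, so there is nothing new to flush
	}
	if f, ok := w.(http.Flusher); ok {
		f.Flush()
	}
	return nil
}

// demo runs writeChunk against an in-memory recorder and reports
// the recorded body and whether a flush happened.
func demo() (body string, flushed bool, err error) {
	rec := httptest.NewRecorder()
	err = writeChunk(rec, []byte("data: hi\n\n"))
	return rec.Body.String(), rec.Flushed, err
}

func main() {
	body, flushed, err := demo()
	fmt.Printf("%q %v %v\n", body, flushed, err)
}
```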
2025-12-13 13:32:54 -08:00
teeverc
52d5fd1a67
fix: streaming for amp cli
2025-12-13 13:17:53 -08:00
Luis Pater
79033aee34
Merge branch 'main' into plus
2025-12-14 00:07:46 +08:00
Ravens2121
1ea0cff3a4
fix: add missing import declarations for net and time packages
2025-12-13 12:57:47 +08:00
Ravens2121
75793a18f0
feat(kiro): Add Kiro OAuth login entry and auth file filter in Web UI
...
Add a Web UI OAuth login entry and auth file filter for the Kiro provider
## Changes
### Frontend (management.html)
- Add Kiro OAuth card UI with support for AWS Builder ID, Google, and GitHub login methods
- Add i18n translations for Kiro OAuth (Chinese and English)
- Add Kiro filter button in auth files management page
- Implement JavaScript methods: startKiroOAuth(), openKiroLink(), copyKiroLink(), copyKiroDeviceCode(), startKiroOAuthPolling(), resetKiroOAuthUI()
### Backend
- Add /kiro-auth-url endpoint for Kiro OAuth authentication (auth_files.go)
- Fix GetAuthStatus() to correctly parse device_code and auth_url status
- Change status delimiter from ':' to '|' to avoid URL parsing issues
- Export CreateToken method in social_auth.go
- Register Kiro OAuth routes in server.go
## Files Modified
- management.html
- internal/api/handlers/management/auth_files.go
- internal/api/server.go
- internal/auth/kiro/social_auth.go
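The delimiter change described above can be illustrated with a short sketch. The `kind|value` layout of the status string and the `parseStatus` helper are assumptions made for this example; the commit only states that ':' was replaced by '|' because URLs contain colons.

```go
package main

import (
	"fmt"
	"strings"
)

// parseStatus splits a status string at the first '|'.
// The "kind|value" layout is a hypothetical format for illustration.
func parseStatus(status string) (kind, value string, ok bool) {
	return strings.Cut(status, "|")
}

func main() {
	// With ':' as the delimiter, a naive split also cuts the URL scheme.
	fmt.Println(strings.Split("auth_url:https://example.com/login", ":"))

	// With '|' the URL survives intact.
	kind, value, ok := parseStatus("auth_url|https://example.com/login")
	fmt.Println(kind, value, ok)
}
```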
2025-12-13 11:39:22 +08:00
Ravens2121
58866b21cb
feat: optimize connection pooling and improve Kiro executor reliability
...
## Description
### Connection Pool Optimization
- Configure optimized connection pool parameters for AMP proxy, SOCKS5 proxy, and HTTP proxy
- Increase MaxIdleConnsPerHost from default 2 to 20 to support more concurrent users
- Set MaxConnsPerHost to 0 (unlimited) to avoid connection bottlenecks
- Add IdleConnTimeout (90s) and other timeout configurations
### Kiro Executor Enhancements
- Add boundary protection for Event Stream message parsing to prevent out-of-bounds access
- Implement real-time usage estimation (send ping events every 5000 chars or 15 seconds)
- Correctly extract and pass stop_reason from upstream events
- Improve input token calculation, prioritize Claude format parsing
- Add max_tokens truncation warning logs
### Token Calculation Improvements
- Add tokenizer cache (sync.Map) to avoid repeated creation
- Add 1.1 adjustment factor for Claude/Kiro/AmazonQ models
- Add countClaudeChatTokens function to support Claude API format
- Support image token estimation (calculated based on dimensions)
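The caching and adjustment ideas above might look roughly like the sketch below. The `tokenizer` type, the model-name matching, and the integer rounding are all assumptions for illustration; only the use of `sync.Map` and the 1.1 factor come from the commit message.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// tokenizer stands in for a real, expensive-to-build tokenizer (hypothetical).
type tokenizer struct{ model string }

var tokenizerCache sync.Map // model name -> *tokenizer

// getTokenizer returns the cached tokenizer for a model, creating it once.
func getTokenizer(model string) *tokenizer {
	if v, ok := tokenizerCache.Load(model); ok {
		return v.(*tokenizer)
	}
	// LoadOrStore keeps the first stored value if two goroutines race.
	v, _ := tokenizerCache.LoadOrStore(model, &tokenizer{model: model})
	return v.(*tokenizer)
}

// adjustCount applies the 1.1 factor for Claude/Kiro/AmazonQ models.
// Substring matching and truncating integer math are assumptions.
func adjustCount(count int, model string) int {
	m := strings.ToLower(model)
	for _, family := range []string{"claude", "kiro", "amazonq"} {
		if strings.Contains(m, family) {
			return count * 11 / 10
		}
	}
	return count
}

func main() {
	fmt.Println(getTokenizer("claude-example") == getTokenizer("claude-example"))
	fmt.Println(adjustCount(100, "claude-example"), adjustCount(100, "other-model"))
}
```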
### Authentication Refresh Optimization
- Change RefreshLead from 30 minutes to 5 minutes, consistent with Antigravity
- Fix NextRefreshAfter setting to prevent frequent refresh checks
- Change refreshFailureBackoff from 5 minutes to 1 minute for faster failure recovery
2025-12-13 10:21:40 +08:00
Luis Pater
660aabc437
fix(executor): add allowCompat support for reasoning effort normalization
...
Introduced `allowCompat` parameter to improve compatibility handling for reasoning effort in payloads across OpenAI and similar models.
2025-12-13 04:06:02 +08:00
Ravens2121
db80b20bc2
feat(kiro): enhance thinking support and fix truncation issues
...
- **Thinking Support**:
  - Enabled thinking support for all Kiro Claude models, including Haiku 4.5 and agentic variants.
  - Updated `model_definitions.go` with thinking configuration (Min: 1024, Max: 32000, ZeroAllowed: true).
  - Fixed `extended_thinking` field names in `model_registry.go` (from `min_budget`/`max_budget` to `min`/`max`) to comply with Claude API specs, enabling thinking control in clients like Claude Code.
- **Kiro Executor Fixes**:
  - Fixed `budget_tokens` handling: explicitly disable thinking when budget is 0 or negative.
  - Removed aggressive duplicate content filtering logic that caused truncation/data loss.
  - Enhanced thinking tag parsing with `extractThinkingFromContent` to correctly handle interleaved thinking/text blocks.
  - Added EOF handling to flush pending thinking tag characters, preventing data loss at stream end.
- **Performance**:
  - Optimized Claude stream handler (v6.2) with reduced buffer size (4KB) and faster flush interval (50ms) to minimize latency and prevent timeouts.
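A small sketch of the `budget_tokens` rule described above, using the Min/Max values from the thinking configuration. Disabling thinking for a budget of 0 or less is taken from the commit message; clamping other values into the 1024 to 32000 range is an assumption, and `resolveThinking` is a hypothetical name.

```go
package main

import "fmt"

// Range quoted in the commit message (Min: 1024, Max: 32000).
const (
	minThinkingBudget = 1024
	maxThinkingBudget = 32000
)

// resolveThinking decides whether thinking is enabled and with what budget.
// A budget of 0 or less disables thinking; clamping is an assumption.
func resolveThinking(budgetTokens int) (enabled bool, budget int) {
	if budgetTokens <= 0 {
		return false, 0
	}
	if budgetTokens < minThinkingBudget {
		return true, minThinkingBudget
	}
	if budgetTokens > maxThinkingBudget {
		return true, maxThinkingBudget
	}
	return true, budgetTokens
}

func main() {
	for _, b := range []int{-1, 0, 500, 4096, 100000} {
		enabled, budget := resolveThinking(b)
		fmt.Println(b, enabled, budget)
	}
}
```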
2025-12-13 03:57:13 +08:00
Luis Pater
f3f0f1717d
Merge branch 'dev' into think
2025-12-12 22:16:44 +08:00
Luis Pater
05b499fb83
Merge branch 'router-for-me:main' into main
2025-12-12 22:09:08 +08:00
Luis Pater
7621ec609e
Merge pull request #501 from huynguyen03dev/fix/openai-compat-model-alias-resolution
...
fix(openai-compat): prevent model alias from being overwritten
2025-12-12 21:58:15 +08:00
Luis Pater
9f511f0024
fix(executor): improve model compatibility handling for OpenAI-compatibility
...
Enhances payload handling by introducing OpenAI-compatibility checks and refining how reasoning metadata is resolved, ensuring broader model support.
2025-12-12 21:57:25 +08:00
hkfires
374faa2640
fix(thinking): map budgets to effort levels
...
Ensure thinking settings translate correctly across providers:
- Only apply reasoning_effort to level-based models and derive it from numeric
budget suffixes when present
- Strip effort string fields for budget-based models and skip Claude/Gemini
budget resolution for level-based or unsupported models
- Default Gemini include_thoughts when a nonzero budget override is set
- Add cross-protocol conversion and budget range tests
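To make the budget-to-effort idea concrete, here is a sketch of deriving an effort level from a numeric budget. The thresholds and the `budgetToEffort` name are purely hypothetical placeholders; the commit message does not state the actual cut-off values.

```go
package main

import "fmt"

// budgetToEffort derives a reasoning effort level from a numeric budget
// for level-based models. All thresholds are hypothetical placeholders.
func budgetToEffort(budget int) string {
	switch {
	case budget <= 0:
		return "none"
	case budget <= 1024:
		return "low"
	case budget <= 8192:
		return "medium"
	default:
		return "high"
	}
}

func main() {
	for _, b := range []int{0, 512, 4096, 20000} {
		fmt.Println(b, budgetToEffort(b))
	}
}
```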
2025-12-12 21:33:20 +08:00
Luis Pater
ba6aa5fbbe
Merge branch 'router-for-me:main' into main
2025-12-12 20:09:31 +08:00
Luis Pater
1c52a89535
Merge pull request #502 from router-for-me/iflow
...
fix(auth): prevent duplicate iflow BXAuth tokens
2025-12-12 20:03:37 +08:00
hkfires
e7cedbee6e
fix(auth): prevent duplicate iflow BXAuth tokens
2025-12-12 19:57:19 +08:00
Luis Pater
7f4f6bc9ca
Merge branch 'main' into plus
2025-12-12 18:41:39 +08:00
Luis Pater
b8194e717c
Merge pull request #500 from router-for-me/think
...
fix(codex): raise default reasoning effort to medium
2025-12-12 18:35:26 +08:00
huynguyen03.dev
15c3cc3a50
fix(openai-compat): prevent model alias from being overwritten by ResolveOriginalModel
...
When using OpenAI-compatible providers with model aliases (e.g., glm-4.6-zai -> glm-4.6),
the alias resolution was correctly applied but then immediately overwritten by
ResolveOriginalModel, causing 'Unknown Model' errors from upstream APIs.
This fix skips the ResolveOriginalModel override when a model alias has already
been resolved, ensuring the correct model name is sent to the upstream provider.
Co-authored-by: Amp <amp@ampcode.com>
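The fix described in this commit can be sketched as a guard around the fallback resolver. The alias map and the function signature are assumptions for illustration; `resolveOriginal` merely stands in for ResolveOriginalModel, whose real signature is not shown in the log.

```go
package main

import "fmt"

// upstreamModel picks the model name to send upstream. If the requested
// name is a configured alias, the alias target wins and the fallback
// resolver is skipped. The signature is hypothetical.
func upstreamModel(requested string, aliases map[string]string, resolveOriginal func(string) string) string {
	if target, ok := aliases[requested]; ok {
		return target // alias already resolved: do not override it
	}
	return resolveOriginal(requested)
}

func main() {
	aliases := map[string]string{"glm-4.6-zai": "glm-4.6"}
	// Stand-in resolver that would otherwise clobber the alias result.
	resolver := func(name string) string { return "resolved:" + name }

	fmt.Println(upstreamModel("glm-4.6-zai", aliases, resolver))
	fmt.Println(upstreamModel("some-model", aliases, resolver))
}
```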
2025-12-12 17:20:24 +07:00
hkfires
d131435e25
fix(codex): raise default reasoning effort to medium
2025-12-12 18:18:48 +08:00
Luis Pater
6e43669498
Fixed: #440
...
feat(watcher): normalize auth file paths and implement debounce for remove events
2025-12-12 16:50:56 +08:00
Ravens2121
fdeb84db2b
Merge branch 'router-for-me:main' into master
2025-12-12 13:44:07 +08:00
Ravens2121
84920cb670
feat(kiro): add multi-endpoint fallback & thinking mode support
2025-12-12 13:43:36 +08:00
Ravens2121
204bba9dea
refactor(kiro): update Kiro executor to use CodeWhisperer endpoint and improve tool calling support
2025-12-12 09:27:30 +08:00
Luis Pater
35fdd7bc05
Merge branch 'router-for-me:main' into main
2025-12-12 08:54:36 +08:00
Ben Vargas
b09e2115d1
fix(models): add "none" reasoning effort level to gpt-5.2
...
Per OpenAI API documentation, gpt-5.2 supports reasoning_effort values
of "none", "low", "medium", "high", and "xhigh". The "none" level was
missing from the model definition.
Reference: https://platform.openai.com/docs/api-reference/chat/create#chat_create-reasoning_effort
2025-12-11 15:26:23 -07:00
Luis Pater
6a94afab6c
Merge branch 'router-for-me:main' into main
2025-12-12 04:08:54 +08:00
Luis Pater
a68c97a40f
Fixed: #492
2025-12-12 04:08:11 +08:00
Luis Pater
218dc17713
Merge branch 'router-for-me:main' into main
2025-12-12 03:03:36 +08:00
Luis Pater
cd2da152d4
feat(models): add GPT 5.2 model definition and prompts
2025-12-12 03:02:27 +08:00
Luis Pater
28469576bf
Merge branch 'router-for-me:main' into main
2025-12-12 02:41:05 +08:00
Ravens2121
40e7f066e4
feat(kiro): enhance Kiro executor with retry, deduplication and event filtering
2025-12-12 01:59:06 +08:00
Luis Pater
ef0edbfe69
refactor(claude): replace strings.Builder with simpler output string concatenation
2025-12-11 22:34:06 +08:00
hkfires
3c315551b0
refactor(executor): relocate gemini token counters
2025-12-11 21:56:44 +08:00
hkfires
27c9c5c4da
refactor(executor): clarify executor comments and oauth names
2025-12-11 21:56:44 +08:00
hkfires
fc9f6c974a
refactor(executor): clarify providers and streams
...
Add package and constructor documentation for AI Studio, Antigravity,
Gemini CLI, Gemini API, and Vertex executors to describe their roles and
inputs.
Introduce a shared stream scanner buffer constant in the Gemini API
executor and reuse it in Gemini CLI and Vertex streaming code so stream
handling uses a consistent configuration.
Update Refresh implementations for AI Studio, Gemini CLI, Gemini API
(API key), and Vertex executors to short‑circuit and simply return the
incoming auth object, while keeping Antigravity token renewal as the
only executor that performs OAuth refresh.
Remove OAuth2-based token refresh logic and related dependencies from
the Gemini API executor, since it now operates strictly with API key
credentials.
2025-12-11 21:56:43 +08:00
Luis Pater
4ce7c61a17
Merge branch 'main' into plus
2025-12-11 21:33:49 +08:00
Luis Pater
a74ee3f319
Merge pull request #481 from sususu98/fix/increase-buffer-size
...
fix: increase buffer size for stream scanners to 50MB across multiple executors
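For context, `bufio.Scanner` rejects tokens longer than 64KB by default, which is why long streaming lines need a larger limit. The sketch below raises the limit to 50MB with `Scanner.Buffer`; the constant and helper names are hypothetical, and the 64KB initial buffer is an assumption.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// streamScannerBuffer is the maximum token size (50MB, per the commit title).
const streamScannerBuffer = 50 * 1024 * 1024

// countLines scans r line by line with the enlarged limit.
func countLines(r io.Reader) (int, error) {
	sc := bufio.NewScanner(r)
	// Start with a 64KB buffer and let it grow up to the 50MB cap.
	sc.Buffer(make([]byte, 0, 64*1024), streamScannerBuffer)
	n := 0
	for sc.Scan() {
		n++
	}
	return n, sc.Err()
}

// countLinesInString is a convenience wrapper for string input.
func countLinesInString(s string) (int, error) {
	return countLines(strings.NewReader(s))
}

func main() {
	// One 100KB line exceeds the default 64KB limit but fits the new cap.
	long := strings.Repeat("x", 100*1024)
	n, err := countLinesInString(long + "\nshort line")
	fmt.Println(n, err)
}
```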
2025-12-11 21:20:54 +08:00
hkfires
e79f65fd8e
refactor(thinking): use parentheses for metadata suffix
2025-12-11 18:39:07 +08:00
hkfires
facfe7c518
refactor(thinking): use bracket tags for thinking meta
...
Align thinking suffix handling on a single bracket-style marker.
NormalizeThinkingModel strips a terminal `[value]` segment from
model identifiers and turns it into either a thinking budget (for
numeric values) or a reasoning effort hint (for strings). Emission
of `ThinkingIncludeThoughtsMetadataKey` is removed.
Executor helpers and the example config are updated so their
comments reference the new `[value]` suffix format instead of the
legacy dash variants.
BREAKING CHANGE: dash-based thinking suffixes (`-thinking`,
`-thinking-N`, `-reasoning`, `-nothinking`) are no longer parsed
for thinking metadata; only `[value]` annotations are recognized.
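A rough sketch of the `[value]` suffix handling described in this commit: a terminal bracket segment is stripped and read as a budget when numeric, or as an effort hint otherwise. `parseThinkingSuffix` is a hypothetical helper, not the real NormalizeThinkingModel, and edge-case behavior (empty brackets, no base name) is assumed.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseThinkingSuffix strips a terminal "[value]" from a model identifier.
// Numeric values become a budget; other values become an effort hint.
// ok is false when no usable suffix is present.
func parseThinkingSuffix(model string) (base string, budget int, effort string, ok bool) {
	if !strings.HasSuffix(model, "]") {
		return model, 0, "", false
	}
	open := strings.LastIndex(model, "[")
	if open <= 0 {
		return model, 0, "", false // no opening bracket or empty base name
	}
	value := model[open+1 : len(model)-1]
	if value == "" {
		return model, 0, "", false
	}
	base = model[:open]
	if n, err := strconv.Atoi(value); err == nil {
		return base, n, "", true
	}
	return base, 0, value, true
}

func main() {
	for _, m := range []string{"gpt-5.2[high]", "example-model[2048]", "example-model-thinking"} {
		base, budget, effort, ok := parseThinkingSuffix(m)
		fmt.Printf("%s -> base=%s budget=%d effort=%q ok=%v\n", m, base, budget, effort, ok)
	}
}
```

Under this sketch, a legacy dash suffix such as `-thinking` is left in the model name untouched, which matches the breaking change noted above.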
2025-12-11 18:17:28 +08:00