Commit Graph

805 Commits

Author SHA1 Message Date
Luis Pater
bbb21d7c2b Merge branch 'main' into plus 2025-12-15 16:36:11 +08:00
hkfires
8f1dd69e72 feat(amp): require API key authentication for management routes
All Amp management endpoints (e.g., /api/user, /threads) are now protected by the standard API key authentication middleware. This ensures that all management operations require a valid API key, significantly improving security.

As a result of this change:
- The `restrict-management-to-localhost` setting now defaults to `false`. API key authentication provides a stronger and more flexible security control than IP-based restrictions, improving usability in containerized environments.
- The reverse proxy logic now strips the client's `Authorization` header after authenticating the initial request. It then injects the configured `upstream-api-key` for the request to the upstream Amp service.

BREAKING CHANGE: Amp management endpoints now require a valid API key for authentication. Requests without a valid API key in the `Authorization` header will be rejected with a 401 Unauthorized error.
2025-12-15 13:24:53 +08:00
Luis Pater
407020de0c Merge branch 'router-for-me:main' into main 2025-12-15 10:36:39 +08:00
hkfires
09c339953d fix(openai): forward reasoning.effort value
Drop the hardcoded effort mapping in request conversion so
unknown values are preserved instead of being coerced to `auto
2025-12-15 09:16:15 +08:00
hkfires
367a05bdf6 refactor(thinking): export thinking helpers
Expose thinking/effort normalization helpers from the executor package
so conversion tests use production code and stay aligned with runtime
validation behavior.
2025-12-15 09:16:15 +08:00
hkfires
d20b71deb9 fix(thinking): normalize effort mapping
Route OpenAI reasoning effort through ThinkingEffortToBudget for Claude
translators, preserve "minimal" when translating OpenAI Responses, and
treat blank/unknown efforts as no-ops for Gemini thinking configs.

Also map budget -1 to "auto" and expand cross-protocol thinking tests.
2025-12-15 09:16:15 +08:00
hkfires
712ce9f781 fix(thinking): drop unsupported none effort
When budget 0 maps to "none" for models that use thinking levels
but don't support that effort level, strip thinking fields instead
of setting an invalid reasoning_effort value.
Tests now expect removal for this edge case.
2025-12-15 09:16:14 +08:00
hkfires
716aa71f6e fix(thinking): centralize reasoning_effort mapping
Move OpenAI `reasoning_effort` -> Gemini `thinkingConfig` budget logic into
shared helpers used by Gemini, Gemini CLI, and antigravity translators.

Normalize Claude thinking handling by preferring positive budgets, applying
budget token normalization, and gating by model support.

Always convert Gemini `thinkingBudget` back to OpenAI `reasoning_effort` to
support allowCompat models, and update tests for normalization behavior.
2025-12-15 09:16:14 +08:00
hkfires
e8976f9898 fix(thinking): map budgets to effort for level models 2025-12-15 09:16:14 +08:00
hkfires
5ef2d59e05 fix(thinking): gate reasoning effort by model support
Only map OpenAI reasoning effort to Claude thinking for models that support
thinking and use budget tokens (not level-based thinking).

Also add "xhigh" effort mapping and adjust minimal/low budgets, with new
raw-payload conversion tests across protocols and models.
2025-12-15 09:16:14 +08:00
hkfires
27a5ad8ec2 Fixed: #534
fix(aistudio): correct JSON string boundary detection for backslash sequences
2025-12-15 09:00:14 +08:00
sukakcoding
4a764afd76 refactor: extract parseFunctionResponse helper to reduce duplication 2025-12-15 01:05:36 +08:00
sukakcoding
ecf49d574b fix: handle malformed json in function response parsing 2025-12-15 00:59:46 +08:00
Luis Pater
188de4ff2a Merge branch 'router-for-me:main' into main 2025-12-15 00:31:29 +08:00
Luis Pater
5a75ef8ffd Merge pull request #536 from AoaoMH/feature/auth-model-check
feat: using Client Model Infos;
2025-12-15 00:29:33 +08:00
Test
07279f8746 feat: using Client Model Infos; 2025-12-15 00:13:05 +08:00
Luis Pater
71f788b13a fix(registry): remove unused ThinkingSupport from DeepSeek-R1 model 2025-12-14 21:30:17 +08:00
Luis Pater
59c62dc580 fix(registry): correct DeepSeek-V3.2 experimental model ID 2025-12-14 21:27:43 +08:00
Luis Pater
8fb1f114bc Merge pull request #25 from Ravens2121/master
fix(kiro): Always parse thinking tags from Kiro API responses
2025-12-14 20:29:14 +08:00
Luis Pater
6a4cff6699 Merge branch 'router-for-me:main' into main 2025-12-14 17:28:28 +08:00
Luis Pater
d5310a3300 Merge pull request #531 from AoaoMH/feature/auth-model-check
feat: add API endpoint to query models for auth credentials
2025-12-14 16:46:43 +08:00
Ravens2121
de0ea3ac49 fix(kiro): Always parse thinking tags from Kiro API responses
Amp-Thread-ID: https://ampcode.com/threads/T-019b1c00-17b4-713d-a8cc-813b71181934
Co-authored-by: Amp <amp@ampcode.com>
2025-12-14 16:46:17 +08:00
Ravens2121
12116b018d Merge branch 'router-for-me:main' into master 2025-12-14 16:42:30 +08:00
Ravens2121
c3ed3b40ea feat(kiro): Add token usage cross-validation and simplify thinking mode handling 2025-12-14 16:40:33 +08:00
Luis Pater
b80c2aabb0 Merge branch 'router-for-me:main' into main 2025-12-14 16:19:29 +08:00
Luis Pater
f0a3eb574e fix(registry): update DeepSeek model definitions with new IDs and descriptions 2025-12-14 16:17:11 +08:00
Test
bb15855443 feat: add API endpoint to query models for auth credentials 2025-12-14 15:16:26 +08:00
Luis Pater
14ce6aebd1 Merge pull request #449 from sususu98/fix/gemini-cli-429-retry-delay-parsing
fix(gemini-cli): enhance 429 retry delay parsing
2025-12-14 14:04:14 +08:00
Luis Pater
2fe83723f2 Merge pull request #515 from teeverc/fix/response-rewriter-streaming-flush
fix(amp): flush response buffer after each streaming chunk write
2025-12-14 13:26:05 +08:00
Ravens2121
9c04c18c04 feat(kiro): enhance request translation and fix streaming issues
English:
- Fix <thinking> tag parsing: only parse at response start, avoid misinterpreting discussion text
- Add token counting support using tiktoken for local estimation
- Support top_p parameter in inference config
- Handle max_tokens=-1 as maximum (32000 tokens)
- Add tool_choice and response_format parameter handling via system prompt hints
- Support multiple thinking mode detection formats (Claude API, OpenAI reasoning_effort, AMP/Cursor)
- Shorten MCP tool names exceeding 64 characters
- Fix duplicate [DONE] marker in OpenAI SSE streaming
- Enhance token usage statistics with multiple event format support
- Add code fence markers to constants

中文:
- 修复 <thinking> 标签解析:仅在响应开头解析,避免误解析讨论文本中的标签
- 使用 tiktoken 实现本地 token 计数功能
- 支持 top_p 推理配置参数
- 处理 max_tokens=-1 转换为最大值(32000 tokens)
- 通过系统提示词注入实现 tool_choice 和 response_format 参数支持
- 支持多种思考模式检测格式(Claude API、OpenAI reasoning_effort、AMP/Cursor)
- 截断超过64字符的 MCP 工具名称
- 修复 OpenAI SSE 流中重复的 [DONE] 标记
- 增强 token 使用量统计,支持多种事件格式
- 添加代码围栏标记常量
2025-12-14 11:57:16 +08:00
Ravens2121
81ae09d0ec Merge branch 'kiro-refactor-backup' 2025-12-14 07:03:24 +08:00
Ravens2121
01cf221167 feat(kiro): 代码优化重构 + OpenAI翻译器实现 2025-12-14 06:58:50 +08:00
teeverc
cd8c86c6fb refactor: only flush stream response on successful write 2025-12-13 13:32:54 -08:00
teeverc
52d5fd1a67 fix: streaming for amp cli 2025-12-13 13:17:53 -08:00
Luis Pater
79033aee34 Merge branch 'main' into plus 2025-12-14 00:07:46 +08:00
Ravens2121
1ea0cff3a4 fix: add missing import declarations for net and time packages 2025-12-13 12:57:47 +08:00
Ravens2121
75793a18f0 feat(kiro): Add Kiro OAuth login entry and auth file filter in Web UI
为Kiro供应商添加WEB UI OAuth登录入口和认证文件过滤器

## Changes / 更改内容

### Frontend / 前端 (management.html)
- Add Kiro OAuth card UI with support for AWS Builder ID, Google, and GitHub login methods
- 添加Kiro OAuth卡片UI,支持AWS Builder ID、Google和GitHub三种登录方式
- Add i18n translations for Kiro OAuth (Chinese and English)
- 添加Kiro OAuth的中英文国际化翻译
- Add Kiro filter button in auth files management page
- 在认证文件管理页面添加Kiro过滤按钮
- Implement JavaScript methods: startKiroOAuth(), openKiroLink(), copyKiroLink(), copyKiroDeviceCode(), startKiroOAuthPolling(), resetKiroOAuthUI()
- 实现JavaScript方法:startKiroOAuth()、openKiroLink()、copyKiroLink()、copyKiroDeviceCode()、startKiroOAuthPolling()、resetKiroOAuthUI()

### Backend / 后端
- Add /kiro-auth-url endpoint for Kiro OAuth authentication (auth_files.go)
- 添加/kiro-auth-url端点用于Kiro OAuth认证 (auth_files.go)
- Fix GetAuthStatus() to correctly parse device_code and auth_url status
- 修复GetAuthStatus()以正确解析device_code和auth_url状态
- Change status delimiter from ':' to '|' to avoid URL parsing issues
- 将状态分隔符从':'改为'|'以避免URL解析问题
- Export CreateToken method in social_auth.go
- 在social_auth.go中导出CreateToken方法
- Register Kiro OAuth routes in server.go
- 在server.go中注册Kiro OAuth路由

## Files Modified / 修改的文件
- management.html
- internal/api/handlers/management/auth_files.go
- internal/api/server.go
- internal/auth/kiro/social_auth.go
2025-12-13 11:39:22 +08:00
Ravens2121
58866b21cb feat: optimize connection pooling and improve Kiro executor reliability
## 中文说明

### 连接池优化
- 为 AMP 代理、SOCKS5 代理和 HTTP 代理配置优化的连接池参数
- MaxIdleConnsPerHost 从默认的 2 增加到 20,支持更多并发用户
- MaxConnsPerHost 设为 0(无限制),避免连接瓶颈
- 添加 IdleConnTimeout (90s) 和其他超时配置

### Kiro 执行器增强
- 添加 Event Stream 消息解析的边界保护,防止越界访问
- 实现实时使用量估算(每 5000 字符或 15 秒发送 ping 事件)
- 正确从上游事件中提取并传递 stop_reason
- 改进输入 token 计算,优先使用 Claude 格式解析
- 添加 max_tokens 截断警告日志

### Token 计算改进
- 添加 tokenizer 缓存(sync.Map)避免重复创建
- 为 Claude/Kiro/AmazonQ 模型添加 1.1 调整因子
- 新增 countClaudeChatTokens 函数支持 Claude API 格式
- 支持图像 token 估算(基于尺寸计算)

### 认证刷新优化
- RefreshLead 从 30 分钟改为 5 分钟,与 Antigravity 保持一致
- 修复 NextRefreshAfter 设置,防止频繁刷新检查
- refreshFailureBackoff 从 5 分钟改为 1 分钟,加快失败恢复

---

## English Description

### Connection Pool Optimization
- Configure optimized connection pool parameters for AMP proxy, SOCKS5 proxy, and HTTP proxy
- Increase MaxIdleConnsPerHost from default 2 to 20 to support more concurrent users
- Set MaxConnsPerHost to 0 (unlimited) to avoid connection bottlenecks
- Add IdleConnTimeout (90s) and other timeout configurations

### Kiro Executor Enhancements
- Add boundary protection for Event Stream message parsing to prevent out-of-bounds access
- Implement real-time usage estimation (send ping events every 5000 chars or 15 seconds)
- Correctly extract and pass stop_reason from upstream events
- Improve input token calculation, prioritize Claude format parsing
- Add max_tokens truncation warning logs

### Token Calculation Improvements
- Add tokenizer cache (sync.Map) to avoid repeated creation
- Add 1.1 adjustment factor for Claude/Kiro/AmazonQ models
- Add countClaudeChatTokens function to support Claude API format
- Support image token estimation (calculated based on dimensions)

### Authentication Refresh Optimization
- Change RefreshLead from 30 minutes to 5 minutes, consistent with Antigravity
- Fix NextRefreshAfter setting to prevent frequent refresh checks
- Change refreshFailureBackoff from 5 minutes to 1 minute for faster failure recovery
2025-12-13 10:21:40 +08:00
Luis Pater
660aabc437 fix(executor): add allowCompat support for reasoning effort normalization
Introduced `allowCompat` parameter to improve compatibility handling for reasoning effort in payloads across OpenAI and similar models.
2025-12-13 04:06:02 +08:00
Ravens2121
db80b20bc2 feat(kiro): enhance thinking support and fix truncation issues
- **Thinking Support**:
    - Enabled thinking support for all Kiro Claude models, including Haiku 4.5 and agentic variants.
    - Updated `model_definitions.go` with thinking configuration (Min: 1024, Max: 32000, ZeroAllowed: true).
    - Fixed `extended_thinking` field names in `model_registry.go` (from `min_budget`/`max_budget` to `min`/`max`) to comply with Claude API specs, enabling thinking control in clients like Claude Code.

- **Kiro Executor Fixes**:
    - Fixed `budget_tokens` handling: explicitly disable thinking when budget is 0 or negative.
    - Removed aggressive duplicate content filtering logic that caused truncation/data loss.
    - Enhanced thinking tag parsing with `extractThinkingFromContent` to correctly handle interleaved thinking/text blocks.
    - Added EOF handling to flush pending thinking tag characters, preventing data loss at stream end.

- **Performance**:
    - Optimized Claude stream handler (v6.2) with reduced buffer size (4KB) and faster flush interval (50ms) to minimize latency and prevent timeouts.
2025-12-13 03:57:13 +08:00
Luis Pater
f3f0f1717d Merge branch 'dev' into think 2025-12-12 22:16:44 +08:00
Luis Pater
05b499fb83 Merge branch 'router-for-me:main' into main 2025-12-12 22:09:08 +08:00
Luis Pater
7621ec609e Merge pull request #501 from huynguyen03dev/fix/openai-compat-model-alias-resolution
fix(openai-compat): prevent model alias from being overwritten
2025-12-12 21:58:15 +08:00
Luis Pater
9f511f0024 fix(executor): improve model compatibility handling for OpenAI-compatibility
Enhances payload handling by introducing OpenAI-compatibility checks and refining how reasoning metadata is resolved, ensuring broader model support.
2025-12-12 21:57:25 +08:00
hkfires
374faa2640 fix(thinking): map budgets to effort levels
Ensure thinking settings translate correctly across providers:
- Only apply reasoning_effort to level-based models and derive it from numeric
  budget suffixes when present
- Strip effort string fields for budget-based models and skip Claude/Gemini
  budget resolution for level-based or unsupported models
- Default Gemini include_thoughts when a nonzero budget override is set
- Add cross-protocol conversion and budget range tests
2025-12-12 21:33:20 +08:00
Luis Pater
ba6aa5fbbe Merge branch 'router-for-me:main' into main 2025-12-12 20:09:31 +08:00
Luis Pater
1c52a89535 Merge pull request #502 from router-for-me/iflow
fix(auth): prevent duplicate iflow BXAuth tokens
2025-12-12 20:03:37 +08:00
hkfires
e7cedbee6e fix(auth): prevent duplicate iflow BXAuth tokens 2025-12-12 19:57:19 +08:00
Luis Pater
7f4f6bc9ca Merge branch 'main' into plus 2025-12-12 18:41:39 +08:00
Luis Pater
b8194e717c Merge pull request #500 from router-for-me/think
fix(codex): raise default reasoning effort to medium
2025-12-12 18:35:26 +08:00