Commit Graph

154 Commits

Author SHA1 Message Date
Luis Pater
1efade8bdb Merge branch 'main' into plus 2025-12-17 02:50:14 +08:00
Luis Pater
670685139a fix(api): update route patterns to support wildcards for Gemini actions
Normalize action handling by accommodating wildcard patterns in route definitions for Gemini endpoints. Adjust `request.Action` parsing logic to correctly process routes with prefixed actions.
2025-12-17 01:17:02 +08:00
Luis Pater
52b6306388 feat(config): add support for model prefixes and prefix normalization
Refactor model management to include an optional `prefix` field for model credentials, enabling better namespace handling. Update affected configuration files, APIs, and handlers to support prefix normalization and routing. Remove unused OpenAI compatibility provider logic to simplify processing.
2025-12-17 01:07:26 +08:00
Luis Pater
59ac1a3f60 Merge branch 'main' into plus 2025-12-15 23:53:23 +08:00
Luis Pater
bbb21d7c2b Merge branch 'main' into plus 2025-12-15 16:36:11 +08:00
hkfires
3bc489254b fix(api): prevent double logging for error responses
The WriteErrorResponse function now caches the error response body in the gin context.

The deferred request logger checks for this cached response. If an error response is found, it bypasses the standard response logging. This prevents scenarios where an error is logged twice or an empty payload log overwrites the original, more detailed error log.
2025-12-15 16:36:01 +08:00
hkfires
4c07ea41c3 feat(api): return structured JSON error responses
The API error handling is updated to return a structured JSON payload
instead of a plain text message. This provides more context and allows
clients to programmatically handle different error types.

The new error response has the following structure:
{
  "error": {
    "message": "...",
    "type": "..."
  }
}

The `type` field is determined by the HTTP status code, such as
`authentication_error`, `rate_limit_error`, or `server_error`.

If the underlying error message from an upstream service is already a
valid JSON string, it will be preserved and returned directly.

BREAKING CHANGE: API error responses are now in a structured JSON
format instead of plain text. Clients expecting plain text error
messages will need to be updated to parse the new JSON body.
2025-12-15 16:19:52 +08:00
hkfires
f26da24a2f feat(auth): add proxy information to debug logs 2025-12-15 13:14:55 +08:00
Luis Pater
79033aee34 Merge branch 'main' into plus 2025-12-14 00:07:46 +08:00
Luis Pater
b6ad243e9e Merge pull request #498 from teeverc/fix/claude-streaming-flush
fix(claude): flush Claude SSE chunks immediately
2025-12-13 23:58:34 +08:00
Ravens2121
58866b21cb feat: optimize connection pooling and improve Kiro executor reliability
## 中文说明

### 连接池优化
- 为 AMP 代理、SOCKS5 代理和 HTTP 代理配置优化的连接池参数
- MaxIdleConnsPerHost 从默认的 2 增加到 20,支持更多并发用户
- MaxConnsPerHost 设为 0(无限制),避免连接瓶颈
- 添加 IdleConnTimeout (90s) 和其他超时配置

### Kiro 执行器增强
- 添加 Event Stream 消息解析的边界保护,防止越界访问
- 实现实时使用量估算(每 5000 字符或 15 秒发送 ping 事件)
- 正确从上游事件中提取并传递 stop_reason
- 改进输入 token 计算,优先使用 Claude 格式解析
- 添加 max_tokens 截断警告日志

### Token 计算改进
- 添加 tokenizer 缓存(sync.Map)避免重复创建
- 为 Claude/Kiro/AmazonQ 模型添加 1.1 调整因子
- 新增 countClaudeChatTokens 函数支持 Claude API 格式
- 支持图像 token 估算(基于尺寸计算)

### 认证刷新优化
- RefreshLead 从 30 分钟改为 5 分钟,与 Antigravity 保持一致
- 修复 NextRefreshAfter 设置,防止频繁刷新检查
- refreshFailureBackoff 从 5 分钟改为 1 分钟,加快失败恢复

---

## English Description

### Connection Pool Optimization
- Configure optimized connection pool parameters for AMP proxy, SOCKS5 proxy, and HTTP proxy
- Increase MaxIdleConnsPerHost from default 2 to 20 to support more concurrent users
- Set MaxConnsPerHost to 0 (unlimited) to avoid connection bottlenecks
- Add IdleConnTimeout (90s) and other timeout configurations

### Kiro Executor Enhancements
- Add boundary protection for Event Stream message parsing to prevent out-of-bounds access
- Implement real-time usage estimation (send ping events every 5000 chars or 15 seconds)
- Correctly extract and pass stop_reason from upstream events
- Improve input token calculation, prioritize Claude format parsing
- Add max_tokens truncation warning logs

### Token Calculation Improvements
- Add tokenizer cache (sync.Map) to avoid repeated creation
- Add 1.1 adjustment factor for Claude/Kiro/AmazonQ models
- Add countClaudeChatTokens function to support Claude API format
- Support image token estimation (calculated based on dimensions)

### Authentication Refresh Optimization
- Change RefreshLead from 30 minutes to 5 minutes, consistent with Antigravity
- Fix NextRefreshAfter setting to prevent frequent refresh checks
- Change refreshFailureBackoff from 5 minutes to 1 minute for faster failure recovery
2025-12-13 10:21:40 +08:00
Ravens2121
db80b20bc2 feat(kiro): enhance thinking support and fix truncation issues
- **Thinking Support**:
    - Enabled thinking support for all Kiro Claude models, including Haiku 4.5 and agentic variants.
    - Updated `model_definitions.go` with thinking configuration (Min: 1024, Max: 32000, ZeroAllowed: true).
    - Fixed `extended_thinking` field names in `model_registry.go` (from `min_budget`/`max_budget` to `min`/`max`) to comply with Claude API specs, enabling thinking control in clients like Claude Code.

- **Kiro Executor Fixes**:
    - Fixed `budget_tokens` handling: explicitly disable thinking when budget is 0 or negative.
    - Removed aggressive duplicate content filtering logic that caused truncation/data loss.
    - Enhanced thinking tag parsing with `extractThinkingFromContent` to correctly handle interleaved thinking/text blocks.
    - Added EOF handling to flush pending thinking tag characters, preventing data loss at stream end.

- **Performance**:
    - Optimized Claude stream handler (v6.2) with reduced buffer size (4KB) and faster flush interval (50ms) to minimize latency and prevent timeouts.
2025-12-13 03:57:13 +08:00
Luis Pater
ba6aa5fbbe Merge branch 'router-for-me:main' into main 2025-12-12 20:09:31 +08:00
hkfires
e7cedbee6e fix(auth): prevent duplicate iflow BXAuth tokens 2025-12-12 19:57:19 +08:00
teeverc
5ab3032335 Update sdk/api/handlers/claude/code_handlers.go
thank you gemini

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-12 00:26:01 -08:00
teeverc
1215c635a0 fix: flush Claude SSE chunks immediately to match OpenAI behavior
- Write each SSE chunk directly to c.Writer and flush immediately
- Remove buffered writer and ticker-based flushing that caused delayed output
- Add 500ms timeout case for consistency with OpenAI/Gemini handlers
- Clean up unused bufio import

This fixes the 'not streaming' issue where small responses were held
in the buffer until timeout/threshold was reached.

Amp-Thread-ID: https://ampcode.com/threads/T-019b1186-164e-740c-96ab-856f64ee6bee
Co-authored-by: Amp <amp@ampcode.com>
2025-12-12 00:14:19 -08:00
Luis Pater
35fdd7bc05 Merge branch 'router-for-me:main' into main 2025-12-12 08:54:36 +08:00
Luis Pater
6e2306a5f2 refactor(handlers): improve request logging and payload handling 2025-12-12 08:52:52 +08:00
Luis Pater
4ce7c61a17 Merge branch 'main' into plus 2025-12-11 21:33:49 +08:00
hkfires
88bdd25f06 fix(amp): set status on claude stream errors 2025-12-11 20:12:06 +08:00
Luis Pater
4360ed8a7b Merge branch 'router-for-me:main' into main 2025-12-11 03:17:55 +08:00
Luis Pater
423ce97665 feat(util): implement dynamic thinking suffix normalization and refactor budget resolution logic
- Added support for parsing and normalizing dynamic thinking model suffixes.
- Centralized budget resolution across executors and payload helpers.
- Retired legacy Gemini-specific thinking handlers in favor of unified logic.
- Updated executors to use metadata-based thinking configuration.
- Added `ResolveOriginalModel` utility for resolving normalized upstream models using request metadata.
- Updated executors (Gemini, Codex, iFlow, OpenAI, Qwen) to incorporate upstream model resolution and substitute model values in payloads and request URLs.
- Ensured fallbacks handle cases with missing or malformed metadata to derive models robustly.
- Refactored upstream model resolution to dynamically incorporate metadata for selecting and normalizing models.
- Improved handling of thinking configurations and model overrides in executors.
- Removed hardcoded thinking model entries and migrated logic to metadata-based resolution.
- Updated payload mutations to always include the resolved model.
2025-12-11 03:10:50 +08:00
Luis Pater
1fd1ccca17 Merge branch 'router-for-me:main' into main 2025-12-09 21:13:08 +08:00
hkfires
347769b3e3 fix(openai-compat): use model id for auth model display 2025-12-09 18:09:14 +08:00
hkfires
da23ddb061 fix(gemini): normalize model listing output 2025-12-09 17:34:15 +08:00
Luis Pater
9f41894573 Merge branch 'main' into v6.5.57 2025-12-08 23:33:39 +08:00
vuonglv(Andy)
5c3a013cd1 feat(config): add configurable host binding for server (#454)
* feat(config): add configurable host binding for server
2025-12-08 23:16:39 +08:00
Luis Pater
f77c22e6ff Merge branch 'main' into feature/kiro-integration 2025-12-06 11:52:59 +08:00
Mansi
02d8a1cfec feat(kiro): add AWS Builder ID authentication support
- Add --kiro-aws-login flag for AWS Builder ID device code flow
- Add DoKiroAWSLogin function for AWS SSO OIDC authentication
- Complete Kiro integration with AWS, Google OAuth, and social auth
- Add kiro executor, translator, and SDK components
- Update browser support for Kiro authentication flows
2025-12-05 22:46:24 +03:00
Luis Pater
92f033dec0 Merge branch 'router-for-me:main' into main 2025-12-06 01:33:34 +08:00
Luis Pater
0ebabf5152 feat(antigravity): add FetchAntigravityProjectID function and integrate project ID retrieval 2025-12-06 01:32:12 +08:00
Luis Pater
f241124599 Merge branch 'router-for-me:main' into main 2025-12-06 00:43:02 +08:00
Luis Pater
c44c46dd80 Fixed: #421
feat(antigravity): implement project ID retrieval and integration in payload processing
2025-12-06 00:40:55 +08:00
Luis Pater
43cac7b5f6 Merge branch 'main' into v6.5.32 2025-12-02 11:46:05 +08:00
Luis Pater
0fd2abbc3b **refactor(cliproxy, config): remove vertex-compat flow, streamline Vertex API key handling**
- Removed `vertex-compat` executor and related configuration.
- Consolidated Vertex compatibility checks into `vertex` handling with `apikey`-based model resolution.
- Streamlined model generation logic for Vertex API key entries.
2025-12-02 09:18:24 +08:00
Aero
0ebb654019 feat: Add support for VertexAI compatible service (#375)
feat: consolidate Vertex AI compatibility with API key support in Gemini
2025-12-02 08:14:22 +08:00
Luis Pater
1a9f939eac Merge branch 'plus-dev' into feature/github-copilot-auth 2025-11-30 17:08:00 +08:00
Luis Pater
a748e93fd9 **fix(executor, auth): ensure index assignment consistency for auth objects**
- Updated `usage_helpers.go` to call `EnsureIndex()` for proper index assignment in reporter initialization.
- Adjusted `auth/manager.go` to assign auth indices inside a locked section when they are unassigned, ensuring thread safety and consistency.
2025-11-30 16:56:29 +08:00
hkfires
022aa81be1 feat(cliproxy): support wildcard exclusions for models 2025-11-30 08:02:00 +08:00
hkfires
c43f0ea7b1 refactor(config): rename model blacklist fields to excluded models 2025-11-29 21:23:47 +08:00
hkfires
6a191358af fix(auth): fix runtime auth reload on oauth blacklist change 2025-11-29 20:30:11 +08:00
Ernesto Martínez
7515090cb6 refactor(executor): improve concurrency and code quality in GitHub Copilot executor
- Replace concurrent-unsafe metadata caching with thread-safe sync.RWMutex-protected map
- Extract magic numbers and hardcoded header values to named constants
- Replace verbose status code checks with isHTTPSuccess() helper
- Simplify normalizeModel() to no-op with explanatory comment (models already canonical)
- Remove redundant metadata manipulation in token caching
- Improve code clarity and performance with proper cache management
2025-11-28 08:33:51 +01:00
hkfires
5983e3ec87 feat(auth): add oauth provider model blacklist 2025-11-28 10:37:10 +08:00
Ernesto Martínez
3a9ac7ef33 feat(auth): add GitHub Copilot authentication and API integration
Add complete GitHub Copilot support including:
- Device flow OAuth authentication via GitHub's official client ID
- Token management with automatic caching (25 min TTL)
- OpenAI-compatible API executor for api.githubcopilot.com
- 16 model definitions (GPT-5 variants, Claude variants, Gemini, Grok, Raptor)
- CLI login command via -github-copilot-login flag
- SDK authenticator and refresh registry integration

Enables users to authenticate with their GitHub Copilot subscription and
use it as a backend provider alongside existing providers.
2025-11-27 20:14:30 +01:00
hkfires
f8cebb9343 feat(config): add per-key model blacklist for providers 2025-11-27 21:57:07 +08:00
hkfires
6c17dbc4da style(amp): tidy whitespace in proxy module and tests 2025-11-26 18:57:26 +08:00
Luis Pater
a4a26d978e Fixed: #339
**feat(handlers, executor): add Gemini 3 Pro Preview support and refine Claude system instructions**

- Added support for the new "Gemini 3 Pro Preview" action in Gemini handlers, including detailed metadata and configuration.
- Removed redundant `cache_control` field from Claude system instructions for cleaner payload structure.
2025-11-26 11:42:57 +08:00
Luis Pater
506f1117dd **fix(handlers): refactor API response capture to append data safely**
- Introduced `appendAPIResponse` helper to preserve and append data to existing API responses.
- Ensured newline inclusion when appending, if necessary.
- Improved `nil` and data type checks for response handling.
- Updated middleware to skip request logging for `GET` requests.
2025-11-25 11:37:02 +08:00
Luis Pater
bb9955e461 **fix(auth): resolve index reassignment issue during auth management**
- Fixed improper handling of `indexAssigned` and `Index` during auth reassignment.
- Ensured `EnsureIndex` is invoked after validating existing auth entries.
2025-11-24 10:10:09 +08:00
Luis Pater
7063a176f4 #293
**feat(retry): add configurable retry logic with cooldown support**

- Introduced `max-retry-interval` configuration for cooldown durations between retries.
- Added `SetRetryConfig` in `Manager` to handle retry attempts and cooldown intervals.
- Enhanced provider execution logic to include retry attempts, cooldown management, and dynamic wait periods.
- Updated API endpoints and YAML configuration to support `max-retry-interval`.
2025-11-24 09:55:15 +08:00