Commit Graph

339 Commits

Author SHA1 Message Date
Luis Pater
514b9bf9fc Merge origin/main into pr-92 2026-01-10 01:12:22 +08:00
Luis Pater
95f87d5669 Merge pull request #947 from pykancha/fix-memory-leak
Resolve memory leaks causing OOM in k8s deployment
2026-01-10 00:40:47 +08:00
Luis Pater
c83365a349 Merge pull request #938 from router-for-me/log
refactor(logging): clean up oauth logs and debugs
2026-01-10 00:02:45 +08:00
Luis Pater
6b3604cf2b Merge pull request #943 from ben-vargas/fix-tool-mappings
Fix Claude OAuth tool name mapping (proxy_)
2026-01-09 23:52:29 +08:00
Luis Pater
af6bdca14f Fixed: #942
fix(executor): ignore non-SSE lines in OpenAI-compatible streams
2026-01-09 23:41:50 +08:00
Luis Pater
1906ebcfce Merge branch 'main' into plus 2026-01-09 21:52:24 +08:00
Ben Vargas
e785bfcd12 Use unprefixed Claude request for translation
Keep the upstream payload prefixed for OAuth while passing the unprefixed request body into response translators. This avoids proxy_ leaking into OpenAI Responses echoed tool metadata while preserving the Claude OAuth workaround.
2026-01-09 00:54:35 -07:00
hemanta212
47dacce6ea fix(server): resolve memory leaks causing OOM in k8s deployment
- usage/logger_plugin: cap modelStats.Details at 1000 entries per model
- cache/signature_cache: add background cleanup for expired sessions (10 min)
- management/handler: add background cleanup for stale IP rate-limit entries (1 hr)
- executor/cache_helpers: add mutex protection and TTL cleanup for codexCacheMap (15 min)
- executor/codex_executor: use thread-safe cache accessors

Add reproduction tests demonstrating leak behavior before/after fixes.

Amp-Thread-ID: https://ampcode.com/threads/T-019ba0fc-1d7b-7338-8e1d-ca0520412777
Co-authored-by: Amp <amp@ampcode.com>
2026-01-09 13:33:46 +05:45
Ben Vargas
dcac3407ab Fix Claude OAuth tool name mapping
Prefix tool names with proxy_ for Claude OAuth requests and strip the prefix from streaming and non-streaming responses to restore client-facing names.

Updates the Claude executor to:
- add prefixing for tools, tool_choice, and tool_use messages when using OAuth tokens
- strip the prefix from tool_use events in SSE and non-streaming payloads
- add focused unit tests for prefix/strip helpers
2026-01-09 00:10:38 -07:00
hkfires
ee62ef4745 refactor(logging): clean up oauth logs and debugs 2026-01-09 11:20:55 +08:00
Luis Pater
ef6bafbf7e fix(executor): handle context cancellation and deadline errors explicitly 2026-01-09 10:48:29 +08:00
Luis Pater
49b9709ce5 Merge pull request #787 from sususu98/fix/antigravity-429-retry-delay-parsing
fix(antigravity): parse retry-after delay from 429 response body
2026-01-09 04:45:25 +08:00
Luis Pater
d3533f81fc Merge branch 'router-for-me:main' into main 2026-01-08 21:06:24 +08:00
Luis Pater
59a448b645 feat(executor): centralize systemInstruction handling for Claude and Gemini-3-Pro models 2026-01-08 21:05:33 +08:00
Luis Pater
3de7a7f0cd Merge branch 'router-for-me:main' into main 2026-01-08 20:32:08 +08:00
hkfires
b6a0f7a07f fix(executor): update gemini model identifier to gemini-3-pro-preview
Update the model name check in `buildRequest` to target "gemini-3-pro-preview" instead of "gemini-3-pro" when applying specific system instruction handling.
2026-01-08 19:14:52 +08:00
Luis Pater
b2566368f8 Merge branch 'router-for-me:main' into main 2026-01-08 12:45:39 +08:00
Luis Pater
1b2f907671 feat(executor): update system instruction handling for Claude and Gemini-3-Pro models 2026-01-08 12:42:26 +08:00
Luis Pater
bda04eed8a feat(executor): add model-specific support for "gemini-3-pro" in execution and payload handling 2026-01-08 12:27:03 +08:00
Luis Pater
e0735977b5 Merge branch 'router-for-me:main' into main 2026-01-08 11:17:28 +08:00
Luis Pater
67985d8226 feat(executor): enhance Antigravity payload with user role and dynamic system instructions 2026-01-08 10:55:25 +08:00
Luis Pater
1fb4f2b12e Merge branch 'router-for-me:main' into main 2026-01-07 18:18:15 +08:00
Luis Pater
f4ba1ab910 fix(executor): remove unused tokenRefreshTimeout constant and pass zero timeout to HTTP client 2026-01-07 18:16:49 +08:00
Luis Pater
771fec9447 Merge branch 'main' into plus 2026-01-04 01:38:40 +08:00
Luis Pater
7a77b23f2d feat(executor): add token refresh timeout and improve context handling during refresh
Introduced `tokenRefreshTimeout` constant for token refresh operations and enhanced context propagation for `refreshToken` by embedding roundtrip information if available. Adjusted `refreshAuth` to ensure default context initialization and handle cancellation errors appropriately.
2026-01-04 00:26:08 +08:00
Luis Pater
b1f1cee1e5 feat(executor): refine payload handling by integrating original request context
Updated `applyPayloadConfig` to `applyPayloadConfigWithRoot` across payload translation logic, enabling validation against the original request payload when available. Added support for improved model normalization and translation consistency.
2026-01-02 03:28:37 +08:00
Luis Pater
a1ecc9ab00 Merge branch 'router-for-me:main' into main 2026-01-02 00:04:50 +08:00
Luis Pater
2a663d5cba feat(executor): enhance payload translation with original request context
Refactored `applyPayloadConfig` to `applyPayloadConfigWithRoot`, adding support for default rule validation against the original payload when available. Updated all executors to use `applyPayloadConfigWithRoot` and incorporate an optional original request payload for translations.
2026-01-02 00:03:26 +08:00
Luis Pater
ba486ca6b7 Merge branch 'router-for-me:main' into main 2026-01-01 21:03:16 +08:00
hkfires
3902fd7501 fix(iflow): remove thinking field from request body in thinking config handler 2026-01-01 19:40:28 +08:00
hkfires
4fc3d5e935 refactor(iflow): simplify thinking config handling for GLM and MiniMax models 2026-01-01 19:31:08 +08:00
hkfires
8bf3305b2b fix(thinking): fallback to upstream model for thinking support when alias not in registry 2025-12-31 18:07:13 +08:00
hkfires
89db4e9481 fix(thinking): use model alias for thinking config resolution in mapped models 2025-12-31 17:09:22 +08:00
Luis Pater
e85c9d9322 Merge branch 'main' into plus 2025-12-30 23:33:35 +08:00
hkfires
26efbed05c refactor(executor): remove redundant upstream model parameter from translateRequest 2025-12-30 20:20:42 +08:00
hkfires
96340bf136 refactor(executor): resolve upstream model at conductor level before execution 2025-12-30 19:31:54 +08:00
hkfires
b055e00c1a fix(executor): use upstream model for thinking config and payload translation 2025-12-30 17:49:44 +08:00
sususu
414db44c00 fix(antigravity): parse retry-after delay from 429 response body
When receiving HTTP 429 (Too Many Requests) responses, parse the retry
delay from the response body using parseRetryDelay and populate the
statusErr.retryAfter field. This allows upstream callers to respect
the server's requested retry timing.

Applied to all error paths in Execute, executeClaudeNonStream,
ExecuteStream, CountTokens, and refreshToken functions.
2025-12-30 16:07:32 +08:00
hkfires
08ab6a7d77 feat(gemini): add per-key model alias support for Gemini provider 2025-12-30 13:27:57 +08:00
Luis Pater
5c95129884 Merge branch 'router-for-me:main' into main 2025-12-30 11:48:29 +08:00
Luis Pater
50e6d845f4 feat(cliproxy): introduce global model name mappings for improved aliasing and routing 2025-12-30 08:13:06 +08:00
Luis Pater
e3d8d726e6 Merge branch 'router-for-me:main' into main 2025-12-28 15:09:33 +08:00
Luis Pater
457924828a Merge pull request #757 from ben-vargas/fix-thinking-toolchoice-conflict
Fix: disable thinking when tool_choice forces tool use
2025-12-28 14:04:30 +08:00
Ben Vargas
aca2ef6359 Fix: disable thinking when tool_choice forces tool use
Anthropic API does not allow extended thinking when tool_choice is set
to "any" or a specific tool. This was causing 400 errors when using
features like Amp's /handoff command which forces tool_choice.

Added disableThinkingIfToolChoiceForced() that removes thinking config
when incompatible tool_choice is detected, applied to both streaming
and non-streaming paths.

Fixes router-for-me/CLIProxyAPI#630
2025-12-27 16:31:37 -07:00
Luis Pater
0f51e73baa Merge branch 'router-for-me:main' into main 2025-12-28 03:07:58 +08:00
Luis Pater
3a436e116a feat(cliproxy): implement model aliasing and hashing for Codex configurations, enhance request routing logic, and normalize Codex model entries 2025-12-28 03:06:51 +08:00
Luis Pater
d06e2dc83c Merge branch 'router-for-me:main' into main 2025-12-28 02:10:16 +08:00
leaph
6403ff4ec4 feat(iflow): add model-specific thinking configs for GLM-4.7 and MiniMax-M2.1
- GLM-4.7: Uses extra_body={"thinking": {"type": "enabled"}, "clear_thinking": false}
- MiniMax-M2.1: Uses reasoning_split=true for OpenAI-style reasoning separation
- Added preserveReasoningContentInMessages() to support re-injection of reasoning
  content in assistant message history for multi-turn conversations
- Added ThinkingSupport to MiniMax-M2.1 model definition
2025-12-27 18:39:15 +01:00
Luis Pater
d35152bbef Merge branch 'router-for-me:main' into main 2025-12-27 22:03:50 +08:00
Luis Pater
c281f4cbaf Fixed: #747
fix(translators): rename and integrate `usageMetadata` as `cpaUsageMetadata` in Claude processing logic
2025-12-27 22:02:11 +08:00