Commit Graph

1800 Commits

Author SHA1 Message Date
Luis Pater
1cbc4834e1 Merge pull request #1771 from edlsh/fix/claude-cache-control-1769
Fix Claude OAuth cache_control regressions and gzip error decoding
2026-03-01 20:17:22 +08:00
hkfires
a8a5d03c33 chore: ignore .idea directory in git and docker builds 2026-03-01 12:42:59 +08:00
edlsh
76aa917882 Optimize cache-control JSON mutations in Claude executor 2026-02-28 22:47:04 -05:00
edlsh
6ac9b31e4e Handle compressed error decode failures safely 2026-02-28 22:43:46 -05:00
edlsh
0ad3e8457f Clarify cloaking system block cache-control comments 2026-02-28 22:34:14 -05:00
edlsh
444a47ae63 Fix Claude cache-control guardrails and gzip error decoding 2026-02-28 22:32:33 -05:00
Luis Pater
725f4fdff4 Merge pull request #1768 from router-for-me/claude
fix(translator): handle Claude thinking type "auto" like adaptive
2026-03-01 11:03:13 +08:00
Luis Pater
c23e46f45d Merge pull request #1767 from router-for-me/antigravity
fix(antigravity): update model configurations and add new models for Antigravity
2026-03-01 11:02:20 +08:00
hkfires
b148820c35 fix(translator): handle Claude thinking type "auto" like adaptive 2026-03-01 10:30:19 +08:00
hkfires
134f41496d fix(antigravity): update model configurations and add new models for Antigravity 2026-03-01 10:05:29 +08:00
Luis Pater
1ae994b4aa fix(antigravity): adjust thinkingBudget default to 64000 and update model definitions for Claude 2026-03-01 09:39:39 +08:00
Luis Pater
cc1d8f6629 Fixed: #1747
feat(auth): add configurable max-retry-credentials for finer control over cross-credential retries
2026-03-01 02:42:36 +08:00
Luis Pater
5446cd2b02 Merge pull request #1761 from margbug01/fix/thinking-chain-display
fix: support thinking.type=auto from Amp client and decouple thinking translation from unsigned history
2026-03-01 02:30:42 +08:00
margbug01
8de0885b7d fix: support thinking.type="auto" from Amp client for Antigravity Claude models
## Problem

When using Antigravity Claude models through CLIProxyAPI, the thinking
chain (reasoning content) does not display in the Amp client.

## Root Cause

The Amp client sends `thinking: {"type": "auto"}` in its requests,
but `ConvertClaudeRequestToAntigravity` only handled `"enabled"` and
`"adaptive"` types in its switch statement. The `"auto"` type was
silently ignored, resulting in no `thinkingConfig` being set in the
translated Gemini request. Without `thinkingConfig`, the Antigravity
API returns responses without any thinking content.

Additionally, the Antigravity API for Claude models does not support
`thinkingBudget: -1` (auto mode sentinel). It requires a concrete
positive budget value. The fix uses 128000 as the budget for "auto"
mode, which `ApplyThinking` will then normalize to stay within the
model's actual limits (e.g., capped to `maxOutputTokens - 1`).

## Changes

### internal/translator/antigravity/claude/antigravity_claude_request.go

1. **Add "auto" case** to the thinking type switch statement.
   Sets `thinkingBudget: 128000` and `includeThoughts: true`.
   The budget is subsequently normalized by `ApplyThinking` based
   on model-specific limits.

2. **Add "auto" to hasThinking check** so that interleaved thinking
   hints are injected for tool-use scenarios when Amp sends
   `thinking.type="auto"`.

### internal/registry/model_definitions_static_data.go

3. **Add Thinking configuration** for `claude-sonnet-4-6`,
   `claude-sonnet-4-5`, and `claude-opus-4-6` in
   `GetAntigravityModelConfig()` -- these were previously missing,
   causing `ApplyThinking` to skip thinking config entirely.

## Testing

- Deployed to Railway test instance (cpa-thinking-test)
- Verified via debug logging that:
  - Amp sends `thinking: {"type": "auto"}`
  - CPA now translates this to `thinkingConfig: {thinkingBudget: 128000, includeThoughts: true}`
  - `ApplyThinking` normalizes the budget to model-specific limits
  - Antigravity API receives the correct thinkingConfig

Amp-Thread-ID: https://ampcode.com/threads/T-019ca511-710d-776d-a07c-4b750f871a93
Co-authored-by: Amp <amp@ampcode.com>
2026-03-01 02:18:43 +08:00
Luis Pater
a6ce5f36e6 Fixed: #1758
fix(codex): filter billing headers from system result text and update template logic
2026-03-01 01:45:35 +08:00
Luis Pater
e73cf42e28 Merge pull request #1750 from tpm2dot0/fix/claude-code-request-fingerprint-alignment
fix(cloak): align outgoing requests with real Claude Code 2.1.63
2026-03-01 01:27:28 +08:00
exe.dev user
b45343e812 fix(cloak): align outgoing requests with real Claude Code 2.1.63 fingerprint
Captured and compared outgoing requests from CLIProxyAPI against real
Claude Code 2.1.63 and fixed all detectable differences:

Headers:
- Update anthropic-beta to match 2.1.63: replace fine-grained-tool-streaming
  and prompt-caching-2024-07-31 with context-management-2025-06-27 and
  prompt-caching-scope-2026-01-05
- Remove X-Stainless-Helper-Method header (real Claude Code does not send it)
- Update default User-Agent from "claude-cli/2.1.44 (external, sdk-cli)" to
  "claude-cli/2.1.63 (external, cli)"
- Force Claude Code User-Agent for non-Claude clients to avoid leaking
  real client identity (e.g. curl, OpenAI SDKs) during cloaking

Body:
- Inject x-anthropic-billing-header as system[0] (matches real format)
- Change system prompt identifier from "You are Claude Code..." to
  "You are a Claude agent, built on Anthropic's Claude Agent SDK."
- Add cache_control with ttl:"1h" to match real request format
- Fix user_id format: user_[64hex]_account_[uuid]_session_[uuid]
  (was missing account UUID)
- Disable tool name prefix (set claudeToolPrefix to empty string)

TLS:
- Switch utls fingerprint from HelloFirefox_Auto to HelloChrome_Auto
  (closer to Node.js/OpenSSL used by real Claude Code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 09:19:06 +00:00
Luis Pater
8599b1560e Fixed: #1716
feat(kimi): add support for explicit disabled thinking and reasoning effort handling
2026-02-28 05:29:07 +08:00
Luis Pater
8bde8c37c0 Fixed: #1711
fix(server): use resolved log directory for request logger initialization and test fallback logic
2026-02-28 05:21:01 +08:00
Luis Pater
27c68f5bb2 fix(auth): replace MarkResult with hook OnResult for result handling 2026-02-27 20:47:46 +08:00
Luis Pater
41b1cf2273 Merge pull request #1734 from huangusaki/main
feat(registry): add gemini-3.1-flash-image support
2026-02-27 16:12:05 +08:00
huang_usaki
3b4f9f43db feat(registry): add gemini-3.1-flash-image support 2026-02-27 10:20:46 +08:00
Luis Pater
0da34d3c2d Merge pull request #1668 from lyd123qw2008/fix/codex-usage-limit-retry-after
fix(codex): honor usage_limit_reached resets_at for retry_after
2026-02-27 06:01:44 +08:00
Luis Pater
74bf7eda8f Merge pull request #1686 from lyd123qw2008/fix/auth-refresh-concurrency-limit
fix(auth): limit auto-refresh concurrency to prevent refresh storms
2026-02-27 05:59:20 +08:00
Luis Pater
8c6c90da74 fix(registry): clean up outdated model definitions in static data 2026-02-26 23:12:40 +08:00
Luis Pater
24bcfd9c03 Merge pull request #1699 from 123hi123/fix/antigravity-primary-model-fallback
fix(antigravity): keep primary model list and backfill empty auths
2026-02-26 04:28:29 +08:00
Luis Pater
816fb4c5da Merge pull request #1682 from sususu98/fix/tool-result-image-parts
fix(antigravity): place tool_result images in functionResponse.parts and unify mimeType
2026-02-25 23:14:35 +08:00
Luis Pater
d24ea4ce2a Merge pull request #1664 from ciberponk/pr/responses-compaction-compat
feat: add codex responses compatibility for compaction payloads
2026-02-25 01:21:59 +08:00
Luis Pater
2c30c981ae Merge pull request #1687 from lyd123qw2008/fix/codex-refresh-token-reused-no-retry
fix(codex): stop retrying refresh_token_reused errors
2026-02-25 01:19:30 +08:00
Luis Pater
aa1da8a858 Merge pull request #1685 from lyd123qw2008/fix/auth-auto-refresh-interval
fix(auth): respect configured auto-refresh interval
2026-02-25 01:13:47 +08:00
Luis Pater
f1e9a787d7 Merge pull request #1676 from piexian/feat/qwen-quota-handling-clean
feat(qwen): add rate limiting and quota error handling
2026-02-25 01:07:55 +08:00
Luis Pater
c66cb0afd2 Merge pull request #1683 from dusty-du/codex/device-login-flow
Add additive Codex device-code login flow
2026-02-25 00:50:48 +08:00
Luis Pater
fb48eee973 Merge pull request #1680 from canxin121/fix/responses-stream-error-chunks
fix(responses): emit schema-valid SSE chunks
2026-02-25 00:49:06 +08:00
Luis Pater
bb44e5ec44 Merge pull request #1701 from router-for-me/openai
Revert "Merge pull request #1627 from thebtf/fix/reasoning-effort-clamping"
2026-02-25 00:46:13 +08:00
comalot
514ae341c8 fix(antigravity): deep copy cached model metadata 2026-02-24 20:14:01 +08:00
hkfires
0659ffab75 Revert "Merge pull request #1627 from thebtf/fix/reasoning-effort-clamping" 2026-02-24 19:47:53 +08:00
comalot
8ce07f38dd fix(antigravity): keep primary model list and backfill empty auths 2026-02-24 16:16:44 +08:00
Luis Pater
7cb398d167 Merge pull request #1663 from rensumo/main
feat: implement credential-based round-robin for gemini-cli
2026-02-24 06:02:50 +08:00
Luis Pater
c3e12c5e58 Merge pull request #1654 from alexey-yanchenko/feature/pass-file-inputs
Pass file input from /chat/completions and /responses to codex and claude
2026-02-24 05:53:11 +08:00
Luis Pater
1825fc7503 Merge pull request #1643 from alexey-yanchenko/fix/gemini-prompt-tokens
Fix usage convertation from gemini response to openai format
2026-02-24 05:46:13 +08:00
Luis Pater
48732ba05e Merge pull request #1527 from HEUDavid/feat/auth-hook
feat(auth): add post-auth hook mechanism
2026-02-24 05:33:13 +08:00
canxin121
acf483c9e6 fix(responses): reject invalid SSE data JSON
Guard the openai-response streaming path against truncated/invalid SSE data payloads by validating data: JSON before forwarding; surface a 502 terminal error instead of letting clients crash with JSON parse errors.
2026-02-24 01:42:54 +08:00
lyd123qw2008
3b3e0d1141 test(codex): log non-retryable refresh error and cover single-attempt behavior 2026-02-23 22:41:33 +08:00
lyd123qw2008
7acd428507 fix(codex): stop retrying refresh_token_reused errors 2026-02-23 22:31:30 +08:00
lyd123qw2008
0aaf177640 fix(auth): limit auto-refresh concurrency to prevent refresh storms 2026-02-23 22:28:41 +08:00
lyd123qw2008
450d1227bd fix(auth): respect configured auto-refresh interval 2026-02-23 22:07:50 +08:00
test
492b9c46f0 Add additive Codex device-code login flow 2026-02-23 06:30:04 -05:00
sususu98
4e26182d14 fix(antigravity): place tool_result images in functionResponse.parts and unify mimeType
Move base64 image data from Claude tool_result into functionResponse.parts
as inlineData instead of outer sibling parts, preventing context bloat.
Unify all inlineData field naming to camelCase mimeType across Claude,
OpenAI, and Gemini translators. Add comprehensive edge case tests and
Gemini-side regression test for functionResponse.parts preservation.
2026-02-23 13:38:21 +08:00
canxin121
eb7571936c revert: translator changes (path guard)
CI blocks PRs that modify internal/translator. Revert translator edits and keep only the /v1/responses streaming error-chunk fix; file an issue for translator conformance work.
2026-02-23 13:30:43 +08:00
canxin121
5382764d8a fix(responses): include model and usage in translated streams
Ensure response.created and response.completed chunks produced by the OpenAI/Gemini/Claude translators always include required fields (response.model and response.usage) so clients validating Responses SSE do not fail schema validation.
2026-02-23 13:22:06 +08:00