Compare commits

...

299 Commits

Author SHA1 Message Date
Luis Pater
0c48ef58e0 chore: remove GitLab Duo codex parity plan document
- Deleted `gitlab-duo-codex-parity-plan.md` as it is no longer required.
2026-04-17 01:44:13 +08:00
Luis Pater
15b0d8d039 Merge branch 'pr-529'
# Conflicts:
#	README.md
#	README_CN.md
#	README_JA.md
#	cmd/server/main.go
#	config.example.yaml
#	internal/api/handlers/management/auth_files.go
#	internal/api/server.go
#	internal/auth/iflow/iflow_auth.go
#	internal/cmd/auth_manager.go
#	internal/config/config.go
#	sdk/cliproxy/auth/oauth_model_alias.go
#	sdk/cliproxy/auth/types.go
#	test/thinking_conversion_test.go
2026-04-17 01:42:35 +08:00
Luis Pater
5dcca69e8c feat(models): add Claude Opus 4.7 model entry to registry JSON 2026-04-17 01:08:19 +08:00
Luis Pater
f5dc6483d5 chore: remove iFlow-related modules and dependencies
- Deleted `iflow` provider implementation, including thinking configuration (`apply.go`) and authentication modules.
- Removed iFlow-specific tests, executors, and helpers across SDK and internal components.
- Updated all references to exclude iFlow functionality.
2026-04-17 01:07:12 +08:00
Luis Pater
d949921143 feat(auth): add proxy URL override support to auth constructors and executors
- Introduced `WithProxyURL` variants for `CodexAuth`, `ClaudeAuth`, `IFlowAuth`, and `DeviceFlowClient`.
- Updated executors to use proxy-aware constructors for improved configurability.
- Added unit tests to validate proxy override precedence and functionality.

Closes: #2823
2026-04-16 22:11:39 +08:00
Luis Pater
7b03f04670 fix(handlers): include execution session metadata and skip idempotency key when absent
- Refactored `requestExecutionMetadata` to handle empty `Idempotency-Key` gracefully.
- Added test to validate metadata inclusion of execution session without idempotency key.
2026-04-16 21:44:32 +08:00
hkfires
1267fddf61 fix(docker-build): improve argument handling and error messaging for usage option 2026-04-16 09:19:03 +08:00
Luis Pater
85c7d43bea Merge pull request #515 from JokerRun/fix/claude-tool-streaming-arguments
fix(copilot): route Claude models through native messages
2026-04-16 03:20:21 +08:00
Luis Pater
44c74d6ea2 Merge PR #525 (v6.9.27) 2026-04-16 03:16:28 +08:00
Luis Pater
ba454dbfbf Merge pull request #2817 from sususu98/fix/antigravity-strip-billing-header
fix(antigravity): strip billing header from system instruction before upstream call
2026-04-16 02:46:05 +08:00
Luis Pater
d1508ca030 Merge pull request #2816 from sususu98/feat/session-affinity
feat(session-affinity): add session-sticky routing for multi-account load balancing
2026-04-16 02:45:31 +08:00
sususu98
d4a6a5ae15 fix(antigravity): strip billing header from system instruction before upstream call
The x-anthropic-billing-header block in the Claude system array is
client-internal metadata and should not be forwarded to the Gemini
upstream as part of systemInstruction.parts.
2026-04-16 00:19:01 +08:00
sususu98
7c24d54ca8 feat(session-affinity): add session-sticky routing for multi-account load balancing
When multiple auth credentials are configured, requests from the same
session are now routed to the same credential, improving upstream prompt
cache hit rates and maintaining context continuity.

Core components:
- SessionAffinitySelector: wraps RoundRobin/FillFirst selectors with
  session-to-auth binding; automatic failover when bound auth is
  unavailable, re-binding via the fallback selector for even distribution
- SessionCache: TTL-based in-memory cache with background cleanup
  goroutine, supporting per-session and per-auth invalidation
- StoppableSelector interface: lifecycle hook for selectors holding
  resources, called during Manager.StopAutoRefresh()

Session ID extraction priority (extractSessionIDs):
1. metadata.user_id with Claude Code session format (old
   user_{hash}_session_{uuid} and new JSON {session_id} format)
2. X-Session-ID header (generic client support)
3. metadata.user_id (non-Claude format, used as-is)
4. conversation_id field
5. Stable FNV hash from system prompt + first user/assistant messages
   (fallback for clients with no explicit session ID); returns both a
   full hash (primaryID) and a short hash without assistant content
   (fallbackID) to inherit bindings from the first turn

Multi-format message hash covers OpenAI messages, Claude system array,
Gemini contents/systemInstruction, and OpenAI Responses API input items
(including inline messages with role but no type field).

Configuration (config.yaml routing section):
- session-affinity: bool (default false)
- session-affinity-ttl: duration string (default "1h")
- claude-code-session-affinity: bool (deprecated, alias for above)
All three fields trigger selector rebuild on config hot reload.

Side effect: Idempotency-Key header is no longer auto-generated with a
random UUID when absent — only forwarded when explicitly provided by the
client, to avoid polluting session hash extraction.
2026-04-16 00:18:47 +08:00
Luis Pater
a4c1e32ff6 chore(models): remove outdated GPT-5 and related model entries from registry JSON 2026-04-15 20:37:32 +08:00
Luis Pater
f56cf42461 Merge pull request #2800 from sususu98/fix/antigravity-max-output-tokens-cap
fix(antigravity): cap maxOutputTokens using registry max_completion_tokens
2026-04-15 20:35:11 +08:00
Luis Pater
3dea1da249 Merge pull request #2782 from sususu98/fix/strip-invalid-signature-thinking-blocks
fix(antigravity): use E-prefixed fake signature in strict bypass test
2026-04-15 20:34:32 +08:00
Luis Pater
8fac29631d chore: remove Qwen support from SDK and internal components
- Deleted `QwenAuthenticator`, internal `qwen_auth`, and `qwen_executor` implementations.
- Removed all Qwen-related OAuth flows, token handling, and execution logic.
- Cleaned up dependencies and references to Qwen across the codebase.
2026-04-15 12:16:08 +08:00
sususu98
8fecd625d2 fix(antigravity): cap maxOutputTokens using registry max_completion_tokens
Claude models on antigravity have a 64000 token output limit but
max_tokens from downstream requests was passed through uncapped,
causing 400 INVALID_ARGUMENT from Google when clients sent 128000.
2026-04-15 11:57:55 +08:00
sususu98
10b55b5ddd fix(antigravity): use E-prefixed fake signature in strict bypass test
The strict bypass test used testGeminiSignaturePayload() which produces
a base64 string starting with 'C'. Since StripInvalidSignatureThinkingBlocks
now strips all non-E/R signatures unconditionally, the test payload was
stripped before reaching ValidateClaudeBypassSignatures, causing the test
to pass the request through instead of rejecting it with 400.

Replace with testFakeClaudeSignature() which produces a base64 string
starting with 'E' (valid at the lightweight check) but with invalid
protobuf content (no valid field 2), so strict mode correctly rejects
it at the deep validation layer.
2026-04-14 15:46:02 +08:00
sususu98
41ae2c81e7 fix(antigravity): discard thinking blocks with non-Claude-format signatures
Proxy-generated thinking blocks may carry hex hashes or other non-Claude
signatures (e.g. "d5cb9cd0823142109f451861") from Gemini responses. These
are now discarded alongside empty-signature blocks during the strip phase,
before validation runs. Valid Claude signatures always start with 'E' or 'R'
(after stripping any cache prefix).
2026-04-14 15:14:48 +08:00
sususu98
278a89824c fix(antigravity): strip thinking blocks with empty signatures instead of rejecting
Thinking blocks with empty signatures come from proxy-generated
responses (Antigravity/Gemini routed as Claude). These should be
silently dropped from the request payload before forwarding, not
rejected with 400. Fixes 10 "missing thinking signature" errors.
2026-04-14 15:14:48 +08:00
sususu98
c4459c4346 Merge pull request #2724 from sususu98/fix/skip-schema-cleanup-empty-tools
fix(antigravity): skip full schema cleanup for empty tool requests
2026-04-12 14:05:47 +08:00
sususu98
61e0447f92 Merge pull request #2723 from sususu98/fix/drop-redacted-thinking-blocks
fix(antigravity): drop redacted thinking blocks with empty text
2026-04-12 14:05:41 +08:00
sususu98
1dc3018fd6 Merge pull request #2716 from sususu98/pr/antigravity-bypass-log-noise
fix(antigravity): reduce bypass mode log noise
2026-04-12 14:05:34 +08:00
sususu98
26fd3eff03 Merge pull request #2715 from sususu98/pr/antigravity-32mb-bypass-signatures
fix(antigravity): allow 32MB bypass signatures
2026-04-12 14:05:27 +08:00
Luis Pater
5bfaf8086b feat(auth): add configurable worker pool size for auto-refresh loop
- Introduced `auth-auto-refresh-workers` config option to override default concurrency.
- Updated `authAutoRefreshLoop` to support customizable worker counts.
- Enhanced token refresh scheduling flexibility by aligning worker pool with runtime configurations.
2026-04-12 13:56:05 +08:00
Luis Pater
6c0a1efd71 refactor(auth): simplify auth directory scanning and improve JSON processing logic
- Replaced `filepath.Walk` with `os.ReadDir` for cleaner directory traversal.
- Fixed `isAuthJSON` check to use `filepath.Dir` for directory comparison.
- Updated auth hash cache generation and file synthesis to improve readability and maintainability.
2026-04-12 13:32:03 +08:00
sususu98
f5ed5c7453 fix(antigravity): skip full schema cleanup for empty tool requests
Avoid whole-payload schema sanitization when translated Antigravity requests have no actual tool schemas, including missing and empty tools arrays. Add regression coverage so image-heavy no-tool requests keep bypassing the old memory amplification path.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-12 12:51:42 +08:00
sususu98
65158cce46 fix(antigravity): drop redacted thinking blocks with empty text
Antigravity wraps empty thinking text into a prompt-caching-scope
object that omits the required inner "thinking" field, causing 400
"messages.N.content.0.thinking.thinking: Field required" when Claude
Max requests are routed through Antigravity in bypass mode.
2026-04-12 12:30:43 +08:00
JokerRun
1c6c3675d1 fix(copilot): route claude models through native messages 2026-04-12 04:11:03 +08:00
Luis Pater
a583463d60 feat(auth): implement auto-refresh loop for managing auth token schedule
- Introduced `authAutoRefreshLoop` to handle token refresh scheduling.
- Replaced semaphore-based refresh logic in `Manager` with the new loop.
- Added unit tests to verify refresh schedule logic and edge cases.
2026-04-12 02:06:40 +08:00
sususu98
8ed290c1c4 fix(antigravity): reduce bypass mode log noise
Keep cache-disable visibility at info level while suppressing duplicate state-change logs and moving strict-mode chatter down to debug.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-12 00:52:41 +08:00
sususu98
727221df2e fix(antigravity): allow 32MB bypass signatures
Raise the local bypass-signature ceiling so long Claude thinking signatures are not rejected before request translation, and keep the oversized-signature test cheap to execute.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-12 00:51:53 +08:00
Luis Pater
1d8e68ad15 fix(executor): remove immediate retry logic for 429 in Qwen, add enhanced Retry-After handling, and update tests 2026-04-11 21:15:15 +08:00
Luis Pater
0ab1f5412f fix(executor): handle 429 Retry-After header and default retry logic for quota exhaustion
- Added proper parsing of `Retry-After` headers for 429 responses.
- Set default retry duration when "disable cooling" is active on quota exhaustion.
- Updated tests to verify `Retry-After` handling and default behavior.
2026-04-11 21:04:55 +08:00
Luis Pater
9ded75d335 Merge pull request #2702 from AllenReder/docs/add-quota-inspector
docs(README): add CLIproxyAPI Quota Inspector to community projects list
2026-04-11 16:42:02 +08:00
Allen Yi
f135fdf7fc docs: clarify codex quota window wording in README locales 2026-04-11 16:39:32 +08:00
Luis Pater
828df80088 refactor(executor): remove immediate retry with token refresh on 429 for Qwen and update tests accordingly 2026-04-11 16:35:18 +08:00
Allen Yi
c585caa0ce docs: fix CLIProxyAPI Quota Inspector naming and link casing 2026-04-11 16:22:45 +08:00
Allen Yi
5bb69fa4ab docs: refine CLIproxyAPI Quota Inspector description in all README locales 2026-04-11 15:22:27 +08:00
Luis Pater
344043b9f1 Merge pull request #506 from router-for-me/plus
v6.9.22
2026-04-10 21:58:39 +08:00
Luis Pater
26c298ced1 Merge branch 'main' into plus 2026-04-10 21:58:14 +08:00
Luis Pater
5ab9afac83 fix(executor): handle OAuth tool name remapping with rename detection and add tests
Closes: #2656
2026-04-10 21:54:59 +08:00
Luis Pater
65ce86338b fix(executor): implement immediate retry with token refresh on 429 for Qwen and add associated tests
Closes: #2661
2026-04-10 21:12:03 +08:00
Chén Mù
2a97037d7b Merge pull request #2670 from sususu98/feat/antigravity-prefer-prod-url
feat(antigravity): prefer prod URL as first priority
2026-04-10 19:43:27 +08:00
sususu98
d801393841 feat(antigravity): prefer prod URL as first priority
Promote cloudcode-pa.googleapis.com to the first position in the
fallback order, with daily and sandbox URLs as fallbacks.
2026-04-10 19:37:56 +08:00
Luis Pater
b2c0cdfc88 Merge pull request #2621 from wykk-12138/fix/oauth-extra-usage-detection
fix(claude): prevent OAuth extra-usage billing via tool name fingerprinting and system prompt cloaking
2026-04-10 10:29:27 +08:00
Luis Pater
f32c8c9620 fix(handlers): update listener to bind on all interfaces instead of localhost
Fixed: #2640
2026-04-10 07:24:34 +08:00
wykk-12138
0f45d89255 fix(claude): address PR review feedback for OAuth cloaking
- Use buildTextBlock for billing header to avoid raw JSON string interpolation
- Fix empty array edge case in prependToFirstUserMessage
- Allow remapOAuthToolNames to process messages even without tools array
- Move claude_system_prompt.go to helps/ per repo convention
- Export prompt constants (ClaudeCode* prefix) for cross-package access

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-10 00:07:11 +08:00
wykk-12138
96056d0137 Merge remote-tracking branch 'upstream/main' into fix/oauth-extra-usage-detection 2026-04-09 22:59:31 +08:00
wykk-12138
f780c289e8 fix(claude): map question/skill to TitleCase instead of removing them
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-09 22:28:00 +08:00
wykk-12138
ac36119a02 fix(claude): preserve OAuth tool renames when filtering tools
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-09 22:20:15 +08:00
Luis Pater
39dc4557c1 Merge pull request #2412 from sususu98/feat/signature-cache-toggle
feat: configurable signature cache toggle for Antigravity/Claude thinking blocks
2026-04-09 21:54:47 +08:00
ZTXBOSS666
30e94b6792 fix(antigravity): refine 429 handling and credits fallback
Includes: restore SDK docs under docs/; update antigravity executor credits tests; gofmt.
2026-04-09 21:48:32 +08:00
Luis Pater
938af75954 Merge branch 'router-for-me:main' into main 2026-04-09 21:14:30 +08:00
sususu98
38f0ae5970 docs(antigravity): document signature validation spec alignment
Add package-level comment documenting the protobuf tree structure,
base64 encoding equivalence proof, output dimensions, and spec
section references. Remove unreachable legacy_vertex_group dead code.
2026-04-09 21:12:40 +08:00
sususu98
cf249586a9 feat(antigravity): configurable signature cache with bypass-mode validation
Antigravity 的 Claude thinking signature 处理新增 cache/bypass 双模式,
并为 bypass 模式实现按 SIGNATURE-CHANNEL-SPEC.md 的签名校验。

新增 antigravity-signature-cache-enabled 配置项(默认 true):
- cache mode(true):使用服务端缓存的签名,行为与原有逻辑完全一致
- bypass mode(false):直接使用客户端提供的签名,经过校验和归一化

支持配置热重载,运行时可切换模式。

校验流程:
1. 剥离历史 cache-mode 的 'modelGroup#' 前缀(如 claude#Exxxx → Exxxx)
2. 首字符必须为 'E'(单层编码)或 'R'(双层编码),否则拒绝
3. R 开头:base64 解码 → 内层必须以 'E' 开头 → 继续单层校验
4. E 开头:base64 解码 → 首字节必须为 0x12(Claude protobuf 标识)
5. 所有合法签名归一化为 R 形式(双层 base64)发往 Antigravity 后端

非法签名处理策略:
- 非严格模式(默认):translator 静默丢弃无签名的 thinking block
- 严格模式(antigravity-signature-bypass-strict: true):
  executor 层在请求发往上游前直接返回 HTTP 400

按 SIGNATURE-CHANNEL-SPEC.md 解析 Claude 签名的完整 protobuf 结构:
- Top-level Field 2(容器)→ Field 1(渠道块)
- 渠道块提取:channel_id (Field 1)、infrastructure (Field 2)、
  model_text (Field 6)、field7 (Field 7)
- 计算 routing_class、infrastructure_class、schema_features
- 使用 google.golang.org/protobuf/encoding/protowire 解析

- resolveThinkingSignature 拆分为 resolveCacheModeSignature / resolveBypassModeSignature
- hasResolvedThinkingSignature:mode-aware 签名有效性判断
  (cache: len>=50 via HasValidSignature,bypass: non-empty)
- validateAntigravityRequestSignatures:executor 预检,
  仅在 bypass + strict 模式下拦截非法签名返回 400
- 响应侧签名缓存逻辑与 cache mode 集成
- Cache mode 行为完全保留:无 '#' 前缀的原生签名静默丢弃
2026-04-09 21:12:40 +08:00
Luis Pater
1dba2d0f81 fix(handlers): add base URL validation and improve API key deletion tests 2026-04-09 20:51:54 +08:00
Luis Pater
730809d8ea fix(auth): preserve and restore ready view cursors during index rebuilds 2026-04-09 20:26:16 +08:00
wykk-12138
e8d1b79cb3 fix(claude): remap OAuth tool names to Claude Code style to avoid third-party fingerprint detection
A/B testing confirmed that Anthropic uses tool name fingerprinting to detect
third-party clients on OAuth traffic. OpenCode-style lowercase names like
'bash', 'read', 'todowrite' trigger extra-usage billing, while Claude Code
TitleCase names like 'Bash', 'Read', 'TodoWrite' pass through normally.

Changes:
- Add oauthToolRenameMap: maps lowercase tool names to Claude Code equivalents
- Add oauthToolsToRemove: removes 'question' and 'skill' (no Claude Code counterpart)
- remapOAuthToolNames: renames tools, removes blacklisted ones, updates tool_choice and messages
- reverseRemapOAuthToolNames/reverseRemapOAuthToolNamesFromStreamLine: reverse map for responses
- Apply in Execute(), ExecuteStream(), and CountTokens() for OAuth token requests
2026-04-09 20:15:16 +08:00
Luis Pater
5e81b65f2f fix(auth, executor): normalize Qwen base URL, adjust RefreshLead duration, and add tests 2026-04-09 18:07:07 +08:00
wykk-12138
7e8e2226a6 fix(claude): reduce forwarded OAuth prompt to minimal tool reminder
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-09 17:12:07 +08:00
wykk-12138
f0c20e852f fix(claude): remove invalid cache_control scope from static system block
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
2026-04-09 17:00:04 +08:00
wykk-12138
7cdf8e9872 fix(claude): sanitize forwarded third-party prompts for OAuth cloaking
Only for Claude OAuth requests, sanitize forwarded system-prompt context before
it is prepended into the first user message. This preserves neutral task/tool
instructions while removing OpenCode branding, docs links, environment banners,
and product-specific workflow sections that still triggered Anthropic extra-usage
classification after top-level system[] cloaking.
2026-04-09 16:45:29 +08:00
Supra4E8C
c42480a574 Merge pull request #501 from Ve-ria/main
feat: add glm-5.1 to CodeBuddy model list
2026-04-09 14:37:28 +08:00
rensumo
55c146a0e7 feat: add glm-5.1 to CodeBuddy model list 2026-04-09 14:20:26 +08:00
wykk-12138
e2e3c7dde0 fix: remove invalid org scope and match Claude Code block layout 2026-04-09 14:09:52 +08:00
wykk-12138
9e0ab4d116 fix: build cache_control JSON manually to avoid sjson map marshaling 2026-04-09 14:03:23 +08:00
wykk-12138
8783caf313 fix: buildTextBlock cache_control sjson path issue
sjson treats 'cache_control.type' as nested path, creating
{ephemeral: {scope: org}} instead of {type: ephemeral, scope: org}.
Pass the whole map to sjson.SetBytes as a single value.
2026-04-09 13:58:04 +08:00
wykk-12138
f6f4640c5e fix: use sjson to build system blocks, avoid raw newlines in JSON
The previous commit used fmt.Sprintf with %s to insert multi-line string
constants into JSON strings. Go raw string literals contain actual newline
bytes, which produce invalid JSON (control characters in string values).

Replace with buildTextBlock() helper that uses sjson.SetBytes to properly
escape text content for JSON serialization.
2026-04-09 13:50:49 +08:00
wykk-12138
613fe6768d fix(executor): inject full Claude Code system prompt blocks with proper cache scopes
Previous fix only injected billing header + agent identifier (2 blocks).
Anthropic's updated detection now validates system prompt content depth:
- Block count (needs 4-6 blocks, not 2)
- Cache control scopes (org for agent, global for core prompt)
- Presence of known Claude Code instruction sections

Changes:
- Add claude_system_prompt.go with extracted Claude Code v2.1.63 system prompt
  sections (intro, system instructions, doing tasks, tone & style, output efficiency)
- Rewrite checkSystemInstructionsWithSigningMode to build 5 system blocks:
  [0] billing header (no cache_control)
  [1] agent identifier (cache_control: ephemeral, scope=org)
  [2] core intro prompt (cache_control: ephemeral, scope=global)
  [3] system instructions (no cache_control)
  [4] doing tasks (no cache_control)
- Third-party client system instructions still moved to first user message

Follow-up to 69b950db4c
2026-04-09 12:58:50 +08:00
Luis Pater
ad8e3964ff fix(auth): add retry logic for 429 status with Retry-After and improve testing 2026-04-09 07:07:19 +08:00
Luis Pater
e9dc576409 Merge branch 'router-for-me:main' into main 2026-04-09 03:49:09 +08:00
Luis Pater
941334da79 fix(auth): handle OAuth model alias in retry logic and refine Qwen quota handling 2026-04-09 03:44:19 +08:00
Luis Pater
d54f816363 fix(executor): update Qwen user agent and enhance header configuration 2026-04-09 01:45:52 +08:00
wykk-12138
69b950db4c fix(executor): fix OAuth extra usage detection by Anthropic API
Three changes to avoid Anthropic's content-based system prompt validation:

1. Fix identity prefix: Use 'You are Claude Code, Anthropic's official CLI
   for Claude.' instead of the SDK agent prefix, matching real Claude Code.

2. Move user system instructions to user message: Only keep billing header +
   identity prefix in system[] array. User system instructions are prepended
   to the first user message as <system-reminder> blocks.

3. Enable cch signing for OAuth tokens by default: The xxHash64 cch integrity
   check was previously gated behind experimentalCCHSigning config flag.
   Now automatically enabled when using OAuth tokens.

Related: router-for-me/CLIProxyAPI#2599
2026-04-09 00:06:38 +08:00
Luis Pater
f43d25def1 Merge pull request #496 from kunish/fix/copilot-premium-request-inflation
fix(copilot): prevent intermittent context overflow for Claude models
2026-04-08 23:43:15 +08:00
Luis Pater
a279192881 Merge pull request #498 from router-for-me/plus
v6.9.17
2026-04-08 23:42:40 +08:00
Luis Pater
6a43d7285c Merge branch 'main' into plus 2026-04-08 23:42:05 +08:00
kunish
578c312660 fix(copilot): lower static Claude context limits and expose them to Claude Code
The Copilot API enforces per-account prompt token limits (128K individual,
168K business) that are lower than the total context window (200K). When
the dynamic /models API fetch fails or returns no capabilities.limits,
the static fallback of 200K exceeds the real enforced limit, causing
intermittent "prompt token count exceeds the limit" errors.

Two complementary fixes:

1. Lower static Copilot Claude model ContextLength from 200000 to 128000
   (the conservative default matching defaultCopilotContextLength). Dynamic
   API limits override this when available.

2. Add context_length and max_completion_tokens to Claude-format model
   responses so Claude Code CLI can learn the actual Copilot limit instead
   of relying on its built-in 1M context configuration.
2026-04-08 17:02:53 +08:00
Supra4E8C
6bb9bf3132 Merge pull request #495 from Ve-ria/main
feat(codebuddy): 新增 glm-5v-turbo 模型并更新上下文长度
2026-04-08 14:27:43 +08:00
hkfires
343a2fc2f7 docs: update AGENTS.md for improved clarity and detail in commands and architecture 2026-04-08 12:33:16 +08:00
Luis Pater
12b967118b Merge pull request #2592 from router-for-me/tests
fix(tests): update test cases
2026-04-08 11:57:15 +08:00
Luis Pater
70efd4e016 chore: add workflow to retarget main PRs to dev automatically 2026-04-08 10:35:49 +08:00
Luis Pater
f5aa68ecda chore: add workflow to prevent AGENTS.md modifications in pull requests 2026-04-08 10:12:51 +08:00
rensumo
9a5f142c33 feat(codebuddy): add glm-5v-turbo model and update context lengths 2026-04-08 09:48:25 +08:00
hkfires
d390b95b76 fix(tests): update test cases 2026-04-08 08:53:50 +08:00
Luis Pater
d1f6224b70 Merge pull request #2569 from LucasInsight/fix/record-zero-usage
fix: record zero usage
2026-04-08 08:13:11 +08:00
Luis Pater
fcc59d606d fix(translator): add unit tests to validate output_item.done fallback logic for Gemini and Claude 2026-04-08 03:54:15 +08:00
Luis Pater
91e7591955 fix(executor): add transient 429 resource exhausted handling with retry logic 2026-04-08 02:48:53 +08:00
Luis Pater
4607356333 Merge pull request #491 from Ve-ria/main
修复 CodeBuddy 不支持非流式请求的问题
2026-04-07 18:25:21 +08:00
Luis Pater
9a9ed99072 Merge pull request #494 from router-for-me/plus
v6.9.16
2026-04-07 18:23:51 +08:00
Luis Pater
5ae38584b8 Merge branch 'main' into plus 2026-04-07 18:23:31 +08:00
Luis Pater
c8b7e2b8d6 fix(executor): ensure empty stream completions use output_item.done as fallback
Fixed: #2583
2026-04-07 18:21:12 +08:00
Luis Pater
cad45ffa33 Merge pull request #2578 from LemonZuo/feat_socks5h
feat: support socks5h scheme for proxy settings
2026-04-07 09:57:18 +08:00
Luis Pater
6a27bceec0 Merge pull request #2576 from zilianpn/fix/disable-cooling-auth-errors
fix(auth): honor disable-cooling and enrich no-auth errors
2026-04-07 09:56:25 +08:00
Lemon
163d68318f feat: support socks5h scheme for proxy settings 2026-04-07 07:46:11 +08:00
zilianpn
0ea768011b fix(auth): honor disable-cooling and enrich no-auth errors 2026-04-07 01:12:13 +08:00
Michael
8b9dbe10f0 fix: record zero usage 2026-04-06 20:19:42 +08:00
rensumo
341b4beea1 Update internal/runtime/executor/codebuddy_executor.go
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-04-06 14:16:56 +08:00
rensumo
bea13f9724 fix(executor): support non-stream requests for CodeBuddy 2026-04-06 13:59:06 +08:00
Luis Pater
9f5bdfaa31 Merge pull request #2531 from jamestut/openai-vertex-token-usage-fix
Fix missing `response.completed.usage` for late-usage OpenAI-compatible streams
2026-04-06 09:30:49 +08:00
Luis Pater
9eabdd09db Merge pull request #2522 from aikins01/fix/strip-tool-use-signature
fix(amp): strip signature from tool_use blocks before forwarding to Claude
2026-04-06 09:30:14 +08:00
Luis Pater
c3f8dc362e Merge pull request #2491 from mpfo0106/feature/claude-code-safe-alignment-sentinels
test(claude): add compatibility sentinels and centralize builtin fallback handling
2026-04-06 09:27:08 +08:00
Luis Pater
b85120873b Merge pull request #2332 from RaviTharuma/fix/claude-thinking-signature
fix: preserve Claude thinking signatures in Codex translator
2026-04-06 09:25:06 +08:00
Luis Pater
6f58518c69 docs(readme): remove redundant GITSTORE_GIT_BRANCH description in README files 2026-04-06 09:23:04 +08:00
Luis Pater
000fcb15fa Merge pull request #2298 from snoyiatk/feat/add-gitstore-branch
feat(gitstore): add support for specifying git branch (via GITSTORE_G…
2026-04-06 09:21:03 +08:00
Luis Pater
ea43361492 Merge pull request #2121 from destinoantagonista-wq/main
Reconcile registry model states on auth changes
2026-04-06 09:13:27 +08:00
Luis Pater
c1818f197b Merge pull request #1940 from Blue-B/fix/claude-interleaved-thinking-amp-gzip-budget
fix(claude): enable interleaved-thinking beta, decode AMP error gzip, fix budget 400
2026-04-06 09:08:23 +08:00
Aikins Laryea
b0653cec7b fix(amp): strip signature from tool_use blocks before forwarding to Claude
ensureAmpSignature injects signature:"" into tool_use blocks so the
Amp TUI does not crash on P.signature.length. when Amp sends the
conversation back, Claude rejects the extra field with 400:
  tool_use.signature: Extra inputs are not permitted

strip the proxy-injected signature from tool_use blocks in
SanitizeAmpRequestBody before forwarding to the upstream API.
2026-04-05 12:26:24 +00:00
Luis Pater
22a1a24cf5 feat(executor): add tests for preserving key order in cache control functions
Added comprehensive tests to ensure key order is maintained when modifying payloads in `normalizeCacheControlTTL` and `enforceCacheControlLimit` functions. Removed unused helper functions and refactored implementations for better readability and efficiency.
2026-04-05 17:58:13 +08:00
Luis Pater
7223fee2de Merge branch 'pr-488'
# Conflicts:
#	README.md
#	README_CN.md
#	README_JA.md
2026-04-05 02:08:45 +08:00
Luis Pater
ada8e2905e feat(api): enhance proxy resolution for API key-based auth
Added comprehensive support for resolving proxy URLs from configuration based on API key and provider attributes. Introduced new helper functions and extended the test suite to validate fallback mechanisms and compatibility cases.
2026-04-05 01:56:34 +08:00
Luis Pater
4ba10531da feat(docs): add Poixe AI sponsorship details to README files
Added Poixe AI sponsorship information, including referral bonuses and platform capabilities, to README files in English, Japanese, and Chinese. Updated assets to include Poixe AI logo.
2026-04-05 01:20:50 +08:00
Luis Pater
3774b56e9f feat(misc): add background updater for Antigravity version caching
Introduce `StartAntigravityVersionUpdater` to periodically refresh the cached Antigravity version using a non-blocking background process. Updated main server flow to initialize the updater.
2026-04-04 22:09:11 +08:00
Luis Pater
c2d4137fb9 feat(executor): enhance Qwen system message handling with strict injection and merging rules
Closes: #2537
2026-04-04 21:51:02 +08:00
Luis Pater
2ee938acaf Merge pull request #2535 from rensumo/main
feat: 动态获取 Antigravity User-Agent 版本号
2026-04-04 21:00:47 +08:00
rensumo
8d5e470e1f feat: dynamically fetch antigravity UA version from releases API
Fetch the latest version from the antigravity auto-updater releases
endpoint and cache it for 6 hours. Falls back to 1.21.9 if the API
is unreachable or returns unexpected data.
2026-04-04 14:52:59 +08:00
James
65e9e892a4 Fix missing response.completed.usage for late-usage OpenAI-compatible streams 2026-04-04 05:58:04 +00:00
Luis Pater
3882494878 Merge pull request #486 from router-for-me/plus
v6.9.14
2026-04-04 11:40:13 +08:00
Luis Pater
088c1d07f4 Merge branch 'main' into plus 2026-04-04 11:40:03 +08:00
Luis Pater
8430b28cfa Merge pull request #2526 from rensumo/main
feat: 升级反重力 (antigravity) UA 版本为 1.21.9
2026-04-04 11:32:16 +08:00
rensumo
f3ab8f4bc5 chore: update antigravity UA version to 1.21.9 2026-04-04 07:35:08 +08:00
Luis Pater
0e4f189c2e Merge pull request #1302 from dinhkarate/feat(vertex)/add-prefix-field
Feat(vertex): add prefix field
2026-04-04 04:17:12 +08:00
Luis Pater
98509f615c Merge pull request #485 from kunish/fix/copilot-premium-request-inflation
fix(copilot): reduce premium request inflation, enable thinking, and use dynamic API limits
2026-04-04 02:19:56 +08:00
Luis Pater
e7a66ae504 Merge branch 'router-for-me:main' into main 2026-04-04 02:18:06 +08:00
Luis Pater
754b126944 fix(executor): remove commented-out code in QwenExecutor 2026-04-04 02:14:48 +08:00
Luis Pater
ae37ccffbf Merge pull request #2520 from Arronlong/main
fix:qwen invalid_parameter_error
2026-04-04 02:13:09 +08:00
Luis Pater
42c062bb5b Merge pull request #2509 from adamhelfgott/fix-claude-thinking-temperature
Normalize Claude temperature when thinking is enabled
2026-04-03 23:55:50 +08:00
kunish
87bf0b73d5 fix(copilot): use dynamic API limits to prevent prompt token overflow
The Copilot API enforces per-account prompt token limits (128K individual,
168K business) that differ from the static 200K context length advertised
by the proxy. This mismatch caused Claude Code to accumulate context
beyond the actual limit, triggering "prompt token count exceeds the limit
of 128000" errors.

Changes:
- Extract max_prompt_tokens and max_output_tokens from the Copilot
  /models API response (capabilities.limits) and use them as the
  authoritative ContextLength and MaxCompletionTokens values
- Add CopilotModelLimits struct and Limits() helper to parse limits
  from the existing Capabilities map
- Fix GitLab Duo context-1m beta header not being set when routing
  through the Anthropic gateway (gitlab_duo_force_context_1m attr
  was set but only gin headers were checked)
- Fix flaky parallel tests that shared global model registry state
2026-04-03 23:54:17 +08:00
Luis Pater
f389667ec3 Merge pull request #2513 from lonr-6/codex/fix-ws-custom-tool-repair-v2
fix: repair responses websocket custom tool call pairing
2026-04-03 23:45:38 +08:00
Arronlong
29dba0399b Comment out system message check in Qwen executor
fix qwen invalid_parameter_error
2026-04-03 23:07:33 +08:00
Luis Pater
a824e7cd0b feat(models): add GPT-5.3, GPT-5.4, and GPT-5.4-mini with enhanced "thinking" levels 2026-04-03 23:05:10 +08:00
Luis Pater
140faef7dc Merge branch 'router-for-me:main' into main 2026-04-03 21:48:23 +08:00
Luis Pater
adb580b344 feat(security): add configuration to toggle Gemini CLI endpoint access
Closes: #2445
2026-04-03 21:46:49 +08:00
Luis Pater
06405f2129 fix(security): enforce stricter localhost validation for GeminiCLIAPIHandler
Closes: #2445
2026-04-03 21:22:03 +08:00
kunish
b849bf79d6 fix(copilot): address code review — SSE reasoning, multi-choice, agent detection
- Strip SSE `data:` prefix before normalizing reasoning_text→reasoning_content
  in streaming mode; re-wrap afterward for the translator
- Iterate all choices in normalizeGitHubCopilotReasoningField (not just
  choices[0]) to support n>1 requests
- Remove over-broad tool-role fallback in isAgentInitiated that scanned
  all messages for role:"tool", aligning with opencode's approach of only
  detecting active tool loops — genuine user follow-ups after tool use are
  no longer mis-classified as agent-initiated
- Add 5 reasoning normalization tests; update 2 X-Initiator tests to match
  refined semantics
2026-04-03 20:51:19 +08:00
kunish
59af2c57b1 fix(copilot): reduce premium request inflation and enable thinking
This commit addresses three issues with Claude Code through GitHub
Copilot:

1. **Premium request inflation**: Responses API requests were missing
   Openai-Intent headers and proper defaults, causing Copilot to bill
   each tool-loop continuation as a new premium request. Fixed by adding
   isAgentInitiated() heuristic (checks for tool_result content or
   preceding assistant tool_use), applying Responses API defaults
   (store, include, reasoning.summary), and local tiktoken-based token
   counting to avoid extra API calls.

2. **Context overflow**: Claude Code's modelSupports1M() hardcodes
   opus-4-6 as 1M-capable, but Copilot only supports ~128K-200K.
   Fixed by stripping the context-1m-2025-08-07 beta from translated
   request bodies. Also forwards response headers in non-streaming
   Execute() and registers the GET /copilot-quota management API route.

3. **Thinking not working**: Add ThinkingSupport with level-based
   reasoning to Claude models in the static definitions. Normalize
   Copilot's non-standard 'reasoning_text' response field to
   'reasoning_content' before passing to the SDK translator. Use
   caller-provided context in CountTokens instead of Background().
2026-04-03 20:24:30 +08:00
Kai Wang
d1fd2c4ad4 fix: repair websocket custom tool calls 2026-04-03 17:11:44 +08:00
Kai Wang
b6c6379bfa fix: repair websocket custom tool calls 2026-04-03 17:11:42 +08:00
Kai Wang
8f0e66b72e fix: repair websocket custom tool calls 2026-04-03 17:11:41 +08:00
Adam Helfgott
f63cf6ff7a Normalize Claude temperature for thinking 2026-04-03 03:45:51 -04:00
Luis Pater
d2419ed49d feat(executor): ensure default system message in QwenExecutor payload 2026-04-03 11:18:48 +08:00
Luis Pater
516d22c695 Merge pull request #484 from Ve-ria/main
更新CodeBuddy CN的模型列表
2026-04-03 11:10:32 +08:00
rensumo
73cda6e836 Update CodeBuddy DeepSeek model description 2026-04-03 11:03:33 +08:00
rensumo
0805989ee5 更新CodeBuddy CN的模型列表 2026-04-03 10:59:27 +08:00
mpfo0106
9b5ce8c64f Keep Claude builtin helpers aligned with the shared helper layout
The review asked for the builtin tool registry helper to live with the rest
of executor support utilities. This moves the registry code into the helps
package, exports the minimal surface executor needs, and keeps behavior tests
with the executor while leaving registry-focused checks with the helper.

Constraint: Requested layout keeps executor helper utilities centralized under internal/runtime/executor/helps
Rejected: Keep the files in executor and reply with rationale | conflicts with requested package organization
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep executor behavior tests near applyClaudeToolPrefix and keep pure registry tests in helps
Tested: go test ./internal/runtime/executor/helps ./internal/runtime/executor -run 'Claude|Builtin|Tool'; go test ./test/...; go test ./...
Not-tested: End-to-end Claude Code direct-connect/session runtime behavior
2026-04-03 00:13:02 +09:00
Duong M. CUONG
058793c73a feat(gitstore): honor configured branch and follow live remote default 2026-04-02 14:44:44 +00:00
Luis Pater
75da02af55 Merge branch 'router-for-me:main' into main 2026-04-02 22:34:47 +08:00
Luis Pater
ab9ebea592 Merge PR #2474
# Conflicts:
#	internal/api/modules/amp/response_rewriter.go
#	internal/api/modules/amp/response_rewriter_test.go
2026-04-02 22:31:12 +08:00
Luis Pater
7ee37ee4b9 feat: add /healthz endpoint and test coverage for health check
Closes: #2493
2026-04-02 21:56:27 +08:00
Luis Pater
837afffb31 docs: remove README_JA.md and clean up related links from README files 2026-04-02 21:37:47 +08:00
Luis Pater
03a1bac898 Merge upstream v6.9.9 (PR #483) 2026-04-02 21:31:21 +08:00
Luis Pater
3171d524f0 docs: fix duplicated ProxyPal entry in README files 2026-04-02 21:22:40 +08:00
Luis Pater
3e78a8d500 Merge branch 'main' into dev 2026-04-02 21:21:26 +08:00
Luis Pater
fcba912cc4 Merge pull request #2492 from davidwushi1145/main
fix(responses): reassemble split SSE event/data frames before streaming
2026-04-02 21:20:31 +08:00
Luis Pater
7170eeea5f Merge pull request #2454 from buddingnewinsights/add-proxypal-to-readme
docs: add ProxyPal to "Who is with us?" section
2026-04-02 21:18:22 +08:00
Luis Pater
e3eb048c7a Merge pull request #2489 from Soein/upstream-pr
fix: 增强 Claude 反代检测对抗能力
2026-04-02 21:16:58 +08:00
Luis Pater
a59e92435b Merge pull request #2490 from router-for-me/logs
Refactor websocket logging and error handling
2026-04-02 20:47:31 +08:00
davidwushi1145
108895fc04 Harden Responses SSE framing against partial chunk boundaries
Follow-up review found two real framing hazards in the handler-layer
framer: it could flush a partial `data:` payload before the JSON was
complete, and it could inject an extra newline before chunks that
already began with `\n`/`\r\n`. This commit tightens the framer so it
only emits undelimited events when the buffered `data:` payload is
already valid JSON (or `[DONE]`), skips newline injection for chunks
that already start with a line break, and avoids the heavier
`bytes.Split` path while scanning SSE fields.

The regression suite now covers split `data:` payload chunks,
newline-prefixed chunks, and dropping incomplete trailing data on
flush, so the original Responses fix remains intact while the review
concerns are explicitly locked down.

Constraint: Keep the follow-up limited to handler-layer framing and tests
Rejected: Ignore the review and rely on current executor chunk shapes | leaves partial data payload corruption possible
Rejected: Build a fully generic SSE parser | wider change than needed for the identified risks
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Do not emit undelimited Responses SSE events unless buffered `data:` content is already complete and valid
Tested: /tmp/go1.26.1/go/bin/go test ./sdk/api/handlers/openai -count=1
Tested: /tmp/go1.26.1/go/bin/go test ./sdk/api/handlers -count=1
Tested: /tmp/go1.26.1/go/bin/go vet ./sdk/api/handlers/...
Not-tested: Full repository test suite outside sdk/api/handlers packages
2026-04-02 20:39:49 +08:00
davidwushi1145
abc293c642 Prevent malformed Responses SSE frames from breaking stream clients
Line-oriented upstream executors can emit `event:` and `data:` as
separate chunks, but the Responses handler had started terminating
each incoming chunk as a full SSE event. That split `response.created`
into an empty event plus a later data block, which broke downstream
clients like OpenClaw.

This keeps the fix in the handler layer: a small stateful framer now
buffers standalone `event:` lines until the matching `data:` arrives,
preserves already-framed events, and ignores delimiter-only leftovers.
The regression suite now covers split event/data framing, full-event
passthrough, terminal errors, and the bootstrap path that forwards
line-oriented openai-response streams from non-Codex executors too.

Constraint: Keep the fix localized to Responses handler framing instead of patching every executor
Rejected: Revert to v6.9.7 chunk writing | would reintroduce data-only framing regressions
Rejected: Patch each line-oriented executor separately | duplicates fragile SSE assembly logic
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Do not assume incoming Responses stream chunks are already complete SSE events; preserve handler-layer reassembly for split `event:`/`data:` inputs
Tested: /tmp/go1.26.1/go/bin/go test ./sdk/api/handlers/openai -count=1
Tested: /tmp/go1.26.1/go/bin/go test ./sdk/api/handlers -count=1
Tested: /tmp/go1.26.1/go test ./sdk/api/handlers/... -count=1
Tested: /tmp/go1.26.1/go/bin/go vet ./sdk/api/handlers/...
Tested: Temporary patched server on 127.0.0.1:18317 -> /v1/models 200, /v1/responses non-stream 200, /v1/responses stream emitted combined `event:` + `data:` frames
Not-tested: Full repository test suite outside sdk/api/handlers packages
2026-04-02 20:26:42 +08:00
mpfo0106
da3a498a28 Keep Claude Code compatibility work low-risk and reviewable
This change stops short of broader Claude Code runtime alignment and instead
hardens two safe edges: builtin tool prefix handling and source-informed
sentinel coverage for future drift checks.

Constraint: Must preserve existing default behavior for current users
Rejected: Implement control-plane/session alignment now | too much runtime risk for a first slice
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Treat the new fixtures as compatibility sentinels, not a full Claude Code schema contract
Tested: go test ./test/...; go test ./sdk/translator/...; go test ./internal/runtime/executor -run 'Claude|Builtin|Tool'; go test ./...
Not-tested: End-to-end Claude Code direct-connect/session runtime behavior
2026-04-02 20:35:39 +09:00
pzy
bb44671845 fix: 修复反代检测对抗的 3 个问题
- computeFingerprint 使用 rune 索引替代字节索引,修复多字节字符指纹不匹配
- utls Chrome TLS 指纹仅对 Anthropic 官方域名生效,自定义 base_url 走标准 transport
- IPv6 地址使用 net.JoinHostPort 正确拼接端口
2026-04-02 19:12:55 +08:00
Luis Pater
09e480036a feat(auth): add support for managing custom headers in auth files
Closes #2457
2026-04-02 19:11:09 +08:00
pzy
249f969110 fix: Claude API 请求使用 utls Chrome TLS 指纹
Claude executor 的 API 请求之前使用 Go 标准库 crypto/tls,JA3 指纹
与真实 Claude Code(Bun/BoringSSL)不匹配,可被 Cloudflare 识别。

- 新增 helps/utls_client.go,封装 utls Chrome 指纹 + HTTP/2 + 代理支持
- Claude executor 的 4 处 NewProxyAwareHTTPClient 替换为 NewUtlsHTTPClient
- 其他 executor(Gemini/Codex/iFlow 等)不受影响,仍用标准 TLS
- 非 HTTPS 请求自动回退到标准 transport
2026-04-02 19:09:56 +08:00
hkfires
4f8acec2d8 refactor(logging): centralize websocket handshake recording 2026-04-02 18:39:32 +08:00
hkfires
34339f61ee Refactor websocket logging and error handling
- Introduced new logging functions for websocket requests, handshakes, errors, and responses in `logging_helpers.go`.
- Updated `CodexWebsocketsExecutor` to utilize the new logging functions for improved clarity and consistency in websocket operations.
- Modified the handling of websocket upgrade rejections to log relevant metadata.
- Changed the request body key to a timeline body key in `openai_responses_websocket.go` to better reflect its purpose.
- Enhanced tests to verify the correct logging of websocket events and responses, including disconnect events and error handling scenarios.
2026-04-02 17:30:51 +08:00
pzy
4045378cb4 fix: 增强 Claude 反代检测对抗能力
基于 Claude Code v2.1.88 源码分析,修复多个可被 Anthropic 检测的差距:

- 实现消息指纹算法(SHA256 盐值 + 字符索引),替代随机 buildHash
- billing header cc_version 从设备 profile 动态取版本号,不再硬编码
- billing header cc_entrypoint 从客户端 UA 解析,支持 cli/vscode/local-agent
- billing header 新增 cc_workload 支持(通过 X-CPA-Claude-Workload 头传入)
- 新增 X-Claude-Code-Session-Id 头(每 apiKey 缓存 UUID,TTL=1h)
- 新增 x-client-request-id 头(仅 api.anthropic.com,每请求 UUID)
- 补全 4 个缺失的 beta flags(structured-outputs/fast-mode/redact-thinking/token-efficient-tools)
- OAuth scope 对齐 Claude Code 2.1.88(移除 org:create_api_key,添加 sessions/mcp/file_upload)
- Anthropic-Dangerous-Direct-Browser-Access 仅在 API key 模式发送
- 响应头网关指纹清洗(剥离 litellm/helicone/portkey/cloudflare/kong/braintrust 前缀头)
2026-04-02 15:55:22 +08:00
Luis Pater
2df35449fe Fix executor compat helpers 2026-04-02 12:20:12 +08:00
Luis Pater
c744179645 Merge PR #479 2026-04-02 12:15:33 +08:00
Luis Pater
9720b03a6b Merge pull request #477 from ben-vargas/plus-main
fix(copilot): route Gemini preview models to chat endpoint and correct context lengths
2026-04-02 11:36:51 +08:00
Luis Pater
f2c0f3d325 Merge pull request #476 from hungthai1401/fix/ghc-gpt54mini
Fix GitHub Copilot gpt-5.4-mini endpoint routing
2026-04-02 11:36:26 +08:00
Luis Pater
4f99bc54f1 test: update codex header expectations 2026-04-02 11:19:37 +08:00
Luis Pater
913f4a9c5f test: fix executor tests after helpers refactor 2026-04-02 11:12:30 +08:00
Luis Pater
25d1c18a3f fix: scope experimental cch signing to billing header 2026-04-02 11:03:11 +08:00
Luis Pater
d09dd4d0b2 Merge commit '15c2f274ea690c9a7c9db22f9f454af869db5375' into dev 2026-04-02 10:59:54 +08:00
Luis Pater
474fb042da Merge pull request #2476 from router-for-me/cherry-pick/pr-2438-to-dev
Cherry-pick PR #2438 onto dev
2026-04-02 10:36:50 +08:00
Michael
8435c3d7be feat(tui): show time in usage details 2026-04-02 10:35:13 +08:00
Luis Pater
e783d0a62e Merge pull request #2441 from MonsterQiu/issue-2421-alias-before-suspension
fix(auth): resolve oauth aliases before suspension checks
2026-04-02 10:27:39 +08:00
Luis Pater
b05f575e9b Merge pull request #2444 from 0oAstro/fix/codex-nonstream-finish-reason-tool-calls
fix(codex): set finish_reason to "tool_calls" in non-streaming response when tool calls are present
2026-04-02 10:01:25 +08:00
Aikins Laryea
f5e9f01811 test(amp): update tests to expect thinking blocks to pass through during streaming 2026-04-01 20:35:23 +00:00
Aikins Laryea
ff7dbb5867 test(amp): update tests to expect thinking blocks to pass through during streaming 2026-04-01 20:21:39 +00:00
Aikins Laryea
e34b2b4f1d fix(gemini): clean tool schemas and eager_input_streaming
delegate schema sanitization to util.CleanJSONSchemaForGemini and drop the top-level eager_input_streaming key to prevent validation errors when sending claude tools to the gemini api
2026-04-01 19:49:38 +00:00
edlsh
15c2f274ea fix: preserve cloak config defaults when mode omitted 2026-04-01 13:20:11 -04:00
edlsh
37249339ac feat: add opt-in experimental Claude cch signing 2026-04-01 13:03:17 -04:00
Luis Pater
c422d16beb Merge pull request #2398 from 7RPH/fix/responses-sse-framing
fix: preserve SSE event boundaries for Responses streams
2026-04-02 00:46:51 +08:00
Luis Pater
66cd50f603 Merge pull request #2468 from router-for-me/ip
fix(openai): improve client IP retrieval in websocket handler
2026-04-02 00:03:35 +08:00
hkfires
caa529c282 fix(openai): improve client IP retrieval in websocket handler 2026-04-01 20:16:01 +08:00
hkfires
51a4379bf4 refactor(openai): remove websocket body log truncation limit 2026-04-01 18:11:43 +08:00
Luis Pater
acf98ed10e fix(openai): add session reference counter and cache lifecycle management for websocket tools 2026-04-01 17:28:50 +08:00
Luis Pater
d1c07a091e fix(openai): add websocket tool call repair with caching and tests to improve transcript consistency 2026-04-01 17:16:49 +08:00
Ben Vargas
c1a8adf1ab feat(registry): add GitHub Copilot gemini-3.1-pro-preview model 2026-04-01 01:25:03 -06:00
Ben Vargas
08e078fc25 fix(openai): route copilot Gemini preview models to chat endpoint 2026-04-01 01:24:58 -06:00
Luis Pater
105a21548f fix(codex): centralize session management with global store and add tests for executor session lifecycle 2026-04-01 13:17:10 +08:00
Luis Pater
1734aa1664 fix(codex): prioritize websocket-enabled credentials across priority tiers in scheduler logic 2026-04-01 12:51:12 +08:00
Luis Pater
ca11b236a7 refactor(runtime, openai): simplify header management and remove redundant websocket logging logic 2026-04-01 11:57:31 +08:00
huynhgiabuu
6fdff8227d docs: add ProxyPal to 'Who is with us?' section
Add ProxyPal (https://github.com/buddingnewinsights/proxypal) to the
community projects list in all three README files (EN, CN, JA).
Placed after CCS, restoring its original position.

ProxyPal is a cross-platform desktop app (macOS, Windows, Linux) that
wraps CLIProxyAPI with a native GUI, supporting multiple AI providers,
usage analytics, request monitoring, and auto-configuration for popular
coding tools.

Closes #2420
2026-04-01 10:23:22 +07:00
Luis Pater
330e12d3c2 fix(codex): conditionally set Session_id header for Mac OS user agents and clean up redundant logic 2026-04-01 11:11:45 +08:00
Thai Nguyen Hung
bd09c0bf09 feat(registry): add gpt-5.4-mini model to GitHub Copilot registry 2026-04-01 10:04:38 +07:00
Luis Pater
b468ca79c3 Merge branch 'dev' of github.com:router-for-me/CLIProxyAPI into dev 2026-04-01 03:09:03 +08:00
Luis Pater
d2c7e4e96a refactor(runtime): move executor utilities to helps package and update references 2026-04-01 03:08:20 +08:00
Luis Pater
1c7003ff68 Merge pull request #2452 from Lucaszmv/fix-qwen-cli-v0.13.2
fix(qwen): update CLI simulation to v0.13.2 and adjust header casing
2026-04-01 02:44:27 +08:00
Lucaszmv
1b44364e78 fix(qwen): update CLI simulation to v0.13.2 2026-03-31 15:19:07 -03:00
0oAstro
ec77f4a4f5 fix(codex): set finish_reason to tool_calls in non-streaming response when tool calls are present 2026-03-31 14:12:15 +05:30
MonsterQiu
f611dd6e96 refactor(auth): dedupe route-aware model support checks 2026-03-31 15:42:25 +08:00
MonsterQiu
07b7c1a1e0 fix(auth): resolve oauth aliases before suspension checks 2026-03-31 14:27:14 +08:00
Luis Pater
51fd58d74f fix(codex): use normalizeCodexInstructions to set default instructions 2026-03-31 12:16:57 +08:00
Luis Pater
faae9c2f7c Merge pull request #2422 from MonsterQiu/fix/codex-compact-instructions
fix(codex): add default instructions for /responses/compact
2026-03-31 12:14:20 +08:00
Luis Pater
bc3a6e4646 Merge pull request #2434 from MonsterQiu/fix/codex-responses-null-instructions
fix(codex): normalize null instructions for /responses requests
2026-03-31 12:01:21 +08:00
Luis Pater
b09b03e35e Merge pull request #2424 from possible055/fix/websocket-transcript-replacement
fix(openai): handle transcript replacement after websocket v2 compaction
2026-03-31 11:00:33 +08:00
Luis Pater
16231947e7 Merge pull request #2426 from xixiwenxuanhe/feature/antigravity-credits
feat(antigravity): add AI credits quota fallback
2026-03-31 10:51:40 +08:00
MonsterQiu
39b9a38fbc fix(codex): normalize null instructions across responses paths 2026-03-31 10:32:39 +08:00
MonsterQiu
bd855abec9 fix(codex): normalize null instructions for responses requests 2026-03-31 10:29:02 +08:00
Luis Pater
7c3c2e9f64 Merge pull request #2417 from CharTyr/fix/amp-streaming-thinking-regression
fix(amp): 修复流式响应中 thinking block 被错误抑制导致的 TUI 空白回复
2026-03-31 10:12:13 +08:00
Luis Pater
c10f8ae2e2 Fixed: #2420
docs(readme): remove ProxyPal section from all README translations
2026-03-31 07:23:02 +08:00
xixiwenxuanhe
a0bf33eca6 fix(antigravity): preserve fallback and honor config gate 2026-03-31 00:14:05 +08:00
xixiwenxuanhe
88dd9c715d feat(antigravity): add AI credits quota fallback 2026-03-30 23:58:12 +08:00
apparition
a3e21df814 fix(openai): avoid developer transcript resets
- Narrow websocket transcript replacement detection to assistant outputs and function calls
- Preserve existing merge behavior for follow-up developer messages without previous_response_id
- Add a regression test covering mid-session developer message updates
2026-03-30 23:33:16 +08:00
MonsterQiu
d3b94c9241 fix(codex): normalize null instructions for compact requests 2026-03-30 22:58:05 +08:00
apparition
c1d7599829 fix(openai): handle transcript replacement after websocket compaction
- Add shouldReplaceWebsocketTranscript() to detect historical model output in input
- Add normalizeResponseTranscriptReplacement() for full transcript reset handling
- Prevent duplicate stale turn-state when clients replace local history post-compaction
- Avoid orphaned function_call items from incremental append on compact transcripts
- Add unit tests for transcript replacement detection and state reset behavior
2026-03-30 22:44:58 +08:00
MonsterQiu
d11936f292 fix(codex): add default instructions for /responses/compact 2026-03-30 22:44:46 +08:00
Luis Pater
17363edf25 fix(auth): skip downtime for request-scoped 404 errors in model state management 2026-03-30 22:22:42 +08:00
CharTyr
279cbbbb8a fix(amp): don't suppress thinking blocks in streaming mode
Reverts the streaming thinking suppression introduced in b15453c.
rewriteStreamEvent should only inject signatures and rewrite model
names — suppressing thinking blocks in streaming mode breaks SSE
index alignment and causes the Amp TUI to render empty responses
on the second message onward (especially with model-mapped
non-Claude providers like GPT-5.4).

Non-streaming responses still suppress thinking when tool_use is
present via rewriteModelInResponse.
2026-03-30 20:09:32 +08:00
Luis Pater
486cd4c343 Merge pull request #2409 from sususu98/fix/tool-use-pairing-break
fix(antigravity): reorder model parts to prevent tool_use↔tool_result pairing breakage
2026-03-30 16:59:46 +08:00
sususu98
25feceb783 fix(antigravity): reorder model parts to prevent tool_use↔tool_result pairing breakage
When a Claude assistant message contains [text, tool_use, text], the
Antigravity API internally splits the model message at functionCall
boundaries, creating an extra assistant turn between tool_use and the
following tool_result. Claude then rejects with:

  tool_use ids were found without tool_result blocks immediately after

Fix: extend the existing 2-way part reordering (thinking-first) to a
3-way partition: thinking → regular → functionCall. This ensures
functionCall parts are always last, so Antigravity's split cannot
insert an extra assistant turn before the user's tool_result.

Fixes #989
2026-03-30 15:09:33 +08:00
Luis Pater
d26752250d Merge pull request #2403 from CharTyr/clean-pr
fix(amp): 修复Amp CLI 集成 缺失/无效 signature 导致的 TUI 崩溃与上游 400 问题
2026-03-30 12:54:15 +08:00
CharTyr
b15453c369 fix(amp): address PR review - stream thinking suppression, SSE detection, test init
- Call suppressAmpThinking in rewriteStreamEvent for streaming path
- Handle nil return from suppressAmpThinking to skip suppressed events
- Narrow looksLikeSSEChunk to line-prefix detection (HasPrefix vs Contains)
- Initialize suppressedContentBlock map in test
2026-03-30 00:42:04 -04:00
CharTyr
04ba8c8bc3 feat(amp): sanitize signatures and handle stream suppression for Amp compatibility 2026-03-29 22:23:18 -04:00
Luis Pater
6570692291 Merge pull request #2400 from router-for-me/revert-2374-codex-cache-clean
Revert "fix(codex): restore prompt cache continuity for Codex requests"
2026-03-29 22:19:39 +08:00
trph
f73d55ddaa fix: simplify responses SSE suffix handling 2026-03-29 22:19:25 +08:00
Luis Pater
13aa5b3375 Revert "fix(codex): restore prompt cache continuity for Codex requests" 2026-03-29 22:18:14 +08:00
trph
0fcc02fbea fix: tighten responses SSE review follow-up 2026-03-29 22:10:28 +08:00
trph
c03883ccf0 fix: address responses SSE review feedback 2026-03-29 22:00:46 +08:00
trph
134a9eac9d fix: preserve SSE event boundaries for Responses streams 2026-03-29 17:23:16 +08:00
Luis Pater
6d8de0ade4 feat(auth): implement weighted provider rotation for improved scheduling fairness 2026-03-29 13:49:01 +08:00
Luis Pater
1587ff5e74 Merge pull request #2389 from router-for-me/claude
fix(claude): add default max_tokens for models
2026-03-29 13:03:20 +08:00
hkfires
f033d3a6df fix(claude): enhance ensureModelMaxTokens to use registered max_completion_tokens and fallback to default 2026-03-29 13:00:43 +08:00
hkfires
145e0e0b5d fix(claude): add default max_tokens for models 2026-03-29 12:46:00 +08:00
Luis Pater
f8d1bc06ea Merge pull request #469 from router-for-me/plus
v6.9.5
2026-03-29 12:40:26 +08:00
Luis Pater
d5930f4e44 Merge branch 'main' into plus 2026-03-29 12:40:17 +08:00
Luis Pater
9b7d7021af docs(readme): update LingtrueAPI link in all README translations 2026-03-29 12:30:24 +08:00
Luis Pater
e41c22ef44 docs(readme): add LingtrueAPI sponsorship details to all README translations 2026-03-29 12:23:37 +08:00
Ravi Tharuma
5fc2bd393e fix: retain codex thinking signature until item done 2026-03-28 14:41:25 +01:00
Luis Pater
55271403fb Merge pull request #2374 from VooDisss/codex-cache-clean
fix(codex): restore prompt cache continuity for Codex requests
2026-03-28 21:16:51 +08:00
Luis Pater
36fba66619 Merge pull request #2371 from RaviTharuma/docs/provider-specific-routes
docs: clarify provider-specific routing for aliased models
2026-03-28 21:11:29 +08:00
Ravi Tharuma
66eb12294a fix: clear stale thinking signature when no block is open 2026-03-28 14:08:31 +01:00
Ravi Tharuma
73b22ec29b fix: omit empty signature field from thinking blocks
Emit signature only when non-empty in both streaming content_block_start
and non-streaming thinking blocks. Avoids turning 'missing signature'
into 'empty/invalid signature' which Claude clients may reject.
2026-03-28 14:08:31 +01:00
Ravi Tharuma
c31ae2f3b5 fix: retain previously captured thinking signature on new summary part 2026-03-28 14:08:31 +01:00
Ravi Tharuma
76b53d6b5b fix: finalize pending thinking block before next summary part 2026-03-28 14:08:31 +01:00
Ravi Tharuma
a34dfed378 fix: preserve Claude thinking signatures in Codex translator 2026-03-28 14:08:31 +01:00
Luis Pater
b9b127a7ea Merge pull request #2347 from edlsh/fix/codex-strip-stream-options
fix(codex): strip stream_options from Responses API requests
2026-03-28 21:03:01 +08:00
Luis Pater
2741e7b7b3 Merge pull request #2346 from pjpjq/codex/fix-codex-capacity-retry
fix(codex): Treat Codex capacity errors as retryable
2026-03-28 21:00:50 +08:00
Luis Pater
1767a56d4f Merge pull request #2343 from kongkk233/fix/proxy-transport-defaults
Preserve default transport settings for proxy clients
2026-03-28 20:58:24 +08:00
Luis Pater
779e6c2d2f Merge pull request #2231 from 7RPH/fix/responses-stream-multi-tool-calls
fix: preserve separate streamed tool calls in Responses API
2026-03-28 20:53:19 +08:00
Luis Pater
73c831747b Merge pull request #2133 from DragonFSKY/fix/2061-stale-modelstates
fix(auth): prevent stale runtime state inheritance from disabled auth entries
2026-03-28 20:50:57 +08:00
Luis Pater
b8b89f34f4 Merge pull request #442 from LuxVTZ/feat/gitlab-duo-panel-parity
Improve GitLab Duo gateway compatibility\n\nRestore internal/runtime/executor/claude_executor.go to main during merge.
2026-03-28 05:06:41 +08:00
Luis Pater
1fa094dac6 Merge pull request #461 from MrHuangJser/main
feat(cursor): Full Cursor provider with H2 streaming, MCP tools, multi-turn & multi-account
2026-03-28 05:01:27 +08:00
VooDisss
e5d3541b5a refactor(codex): remove stale affinity cleanup leftovers
Drop the last affinity-related executor artifacts so the PR stays focused on the minimal Codex continuity fix set: stable prompt cache identity, stable session_id, and the executor-only behavior that was validated to restore cache reads.
2026-03-27 20:40:26 +02:00
VooDisss
79755e76ea refactor(pr): remove forbidden translator changes
Drop the chat-completions translator edits from this PR so the branch complies with the repository policy that forbids pull-request changes under internal/translator. The remaining PR stays focused on the executor-level Codex continuity fix that was validated to restore cache reuse.
2026-03-27 19:34:13 +02:00
VooDisss
35f158d526 refactor(pr): narrow Codex cache fix scope
Remove the experimental auth-affinity routing changes from this PR so it stays focused on the validated Codex continuity fix. This keeps the prompt-cache repair while avoiding unrelated routing-policy concerns such as provider/model affinity scope, lifecycle cleanup, and hard-pin fallback semantics.
2026-03-27 19:06:34 +02:00
VooDisss
6962e09dd9 fix(auth): scope affinity by provider
Keep sticky auth affinity limited to matching providers and stop persisting execution-session IDs as long-lived affinity keys so provider switching and normal streaming traffic do not create incorrect pins or stale affinity state.
2026-03-27 18:52:58 +02:00
VooDisss
4c4cbd44da fix(auth): avoid leaking or over-persisting affinity keys
Stop using one-shot idempotency keys as long-lived auth-affinity identifiers and remove raw affinity-key values from debug logs so sticky routing keeps its continuity benefits without creating avoidable memory growth or credential exposure risks.
2026-03-27 18:34:51 +02:00
VooDisss
26eca8b6ba fix(codex): preserve continuity and safe affinity fallback
Restore Claude continuity after the continuity refactor, keep auth-affinity keys out of upstream Codex session identifiers, and only persist affinity after successful execution so retries can still rotate to healthy credentials when the first auth fails.
2026-03-27 18:27:33 +02:00
VooDisss
62b17f40a1 refactor(codex): align continuity helpers with review feedback
Align websocket continuity resolution with the HTTP Codex path, make auth-affinity principal keys use a stable string representation, and extract small helpers that remove duplicated continuity and affinity logic without changing the validated cache-hit behavior.
2026-03-27 18:11:57 +02:00
VooDisss
511b8a992e fix(codex): restore prompt cache continuity for Codex requests
Prompt caching on Codex was not reliably reusable through the proxy because repeated chat-completions requests could reach the upstream without the same continuity envelope. In practice this showed up most clearly with OpenCode, where cache reads worked in the reference client but not through CLIProxyAPI, although the root cause is broader than OpenCode itself.

The proxy was breaking continuity in several ways: executor-layer Codex request preparation stripped prompt_cache_retention, chat-completions translation did not preserve that field, continuity headers used a different shape than the working client behavior, and OpenAI-style Codex requests could be sent without a stable prompt_cache_key. When that happened, session_id fell back to a fresh random value per request, so upstream Codex treated repeated requests as unrelated turns instead of as part of the same cacheable context.

This change fixes that by preserving caller-provided prompt_cache_retention on Codex execution paths, preserving prompt_cache_retention when translating OpenAI chat-completions requests to Codex, aligning Codex continuity headers to session_id, and introducing an explicit Codex continuity policy that derives a stable continuity key from the best available signal. The resolution order prefers an explicit prompt_cache_key, then execution session metadata, then an explicit idempotency key, then stable request-affinity metadata, then a stable client-principal hash, and finally a stable auth-ID hash when no better continuity signal exists.

The same continuity key is applied to both prompt_cache_key in the request body and session_id in the request headers so repeated requests reuse the same upstream cache/session identity. The auth manager also keeps auth selection sticky for repeated request sequences, preventing otherwise-equivalent Codex requests from drifting across different upstream auth contexts and accidentally breaking cache reuse.

To keep the implementation maintainable, the continuity resolution and diagnostics are centralized in a dedicated Codex continuity helper instead of being scattered across executor flow code. Regression coverage now verifies retention preservation, continuity-key precedence, stable auth-ID fallback, websocket parity, translator preservation, and auth-affinity behavior. Manual validation confirmed prompt cache reads now occur through CLIProxyAPI when using Codex via OpenCode, and the fix should also benefit other clients that rely on stable repeated Codex request continuity.
2026-03-27 17:49:29 +02:00
Ravi Tharuma
0ab977c236 docs: clarify provider path limitations 2026-03-27 11:13:08 +01:00
Ravi Tharuma
224f0de353 docs: neutralize provider-specific path wording 2026-03-27 11:11:06 +01:00
Ravi Tharuma
d54de441d3 docs: clarify provider-specific routing for aliased models 2026-03-27 10:53:09 +01:00
MrHuangJser
7386a70724 feat(cursor): auto-identify accounts from JWT sub for multi-account support
Previously Cursor required a manual ?label=xxx parameter to distinguish
accounts (unlike Codex which auto-generates filenames from JWT claims).

Cursor JWTs contain a "sub" claim (e.g. "auth0|user_XXXX") that uniquely
identifies each account. Now we:

- Add ParseJWTSub() + SubToShortHash() to extract and hash the sub claim
- Refactor GetTokenExpiry() to share the new decodeJWTPayload() helper
- Update CredentialFileName(label, subHash) to auto-generate filenames
  from the sub hash when no explicit label is provided
  (e.g. "cursor.8f202e67.json" instead of always "cursor.json")
- Add DisplayLabel() for human-readable account identification
- Store "sub" in metadata for observability
- Update both management API handler and SDK authenticator

Same account always produces the same filename (deterministic), different
accounts get different files. Explicit ?label= still takes priority.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 17:40:02 +08:00
MrHuangJser
1b7447b682 feat(cursor): implement StatusError for conductor cooldown integration
Cursor executor errors were plain fmt.Errorf — the conductor couldn't
extract HTTP status codes, so exhausted accounts never entered cooldown.

Changes:
- Add ConnectError struct to proto/connect.go: ParseConnectEndStream now
  returns *ConnectError with Code/Message fields for precise matching
- Add cursorStatusErr implementing StatusError + RetryAfter interfaces
- Add classifyCursorError() with two-layer classification:
  Layer 1: exact match on ConnectError.Code (gRPC standard codes)
    resource_exhausted → 429, unauthenticated → 401,
    permission_denied → 403, unavailable → 503, internal → 500
  Layer 2: fuzzy string match for H2 errors (RST_STREAM → 502)
- Log all ConnectError code/message pairs for observing real server
  error codes (we have no samples yet)
- Wrap Execute and ExecuteStream error returns with classifyCursorError

Now the conductor properly marks Cursor auths as cooldown on quota errors,
enabling exponential backoff and round-robin failover.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 11:42:22 +08:00
MrHuangJser
40dee4453a feat(cursor): auto-migrate sessions to healthy account on quota exhaustion
When a Cursor account's quota is exhausted, sessions bound to it can now
seamlessly continue on a different account:

Layer 1 — Checkpoint decoupling:
  Key checkpoints by conversationId (not authID:conversationId). Store
  authID inside savedCheckpoint. On lookup, if auth changed, discard the
  stale checkpoint and flatten conversation history into userText.

Layer 2 — Cross-account session cleanup:
  When a request arrives for a conversation whose session belongs to a
  different (now-exhausted) auth, close the old H2 stream and remove
  the stale session to free resources.

Layer 3 — H2Stream.Err() exposure:
  New Err() method on H2Stream so callers can inspect RST_STREAM,
  GOAWAY, or other stream-level errors after closure.

Layer 4 — processH2SessionFrames error propagation:
  Returns error instead of bare return. Connect EndStream errors (quota,
  rate limit) are now propagated instead of being logged and swallowed.

Layer 5 — Pre-response transparent retry:
  If the stream fails before any data is sent to the client, return an
  error to the conductor so it retries with a different auth — fully
  transparent to the client.

Layer 6 — Post-response error logging:
  If the stream fails after data was already sent, log a warning. The
  conductor's existing cooldown mechanism ensures the next request routes
  to a healthy account.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 10:50:32 +08:00
MrHuangJser
8902e1cccb style(cursor): replace fmt.Print* with log package for consistent logging
Address Gemini Code Assist review feedback: use logrus log package
instead of fmt.Printf/Println in Cursor auth handlers and CLI for
unified log formatting and level control.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 17:03:32 +08:00
黄姜恒
de5fe71478 feat(cursor): multi-account routing with round-robin and session isolation
- Add cursor/filename.go for multi-account credential file naming
- Include auth.ID in session and checkpoint keys for per-account isolation
- Record authID in cursorSession, validate on resume to prevent cross-account access
- Management API /cursor-auth-url supports ?label= for creating named accounts
- Leverages existing conductor round-robin + failover framework

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 11:27:49 +08:00
黄姜恒
dcfbec2990 feat(cursor): add management API for Cursor OAuth authentication
- Add RequestCursorToken handler with PKCE + polling flow
- Register /v0/management/cursor-auth-url route
- Returns login URL + state for browser auth, polls in background
- Saves cursor.json with access/refresh tokens on success

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 11:10:07 +08:00
黄姜恒
c95620f90e feat(cursor): conversation checkpoint + session_id for multi-turn context
- Capture conversation_checkpoint_update from Cursor server (was ignored)
- Store checkpoint per conversationId, replay as conversation_state on next request
- Use protowire to embed raw checkpoint bytes directly (no deserialization)
- Extract session_id from Claude Code metadata for stable conversationId across resume
- Flatten conversation history into userText as fallback when no checkpoint available
- Use conversationId as session key for reliable tool call resume
- Add checkpoint TTL cleanup (30min)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 10:51:47 +08:00
edlsh
754f3bcbc3 fix(codex): strip stream_options from Responses API requests
The Codex/OpenAI Responses API does not support the stream_options
parameter. When clients (e.g. Amp CLI) include stream_options in their
requests, CLIProxyAPI forwards it as-is, causing a 400 error:

  {"detail":"Unsupported parameter: stream_options"}

Strip stream_options alongside the other unsupported parameters
(previous_response_id, prompt_cache_retention, safety_identifier)
in Execute, ExecuteStream, and CountTokens.
2026-03-25 11:58:36 -04:00
pjpj
36973d4a6f Handle Codex capacity errors as retryable 2026-03-25 23:25:31 +08:00
黄姜恒
9613f0b3f9 feat(cursor): deterministic conversation_id from Claude Code session cch
Extract the cch hash from Claude Code's billing header in the system
prompt (x-anthropic-billing-header: ...cch=XXXXX;) and use it to derive
a deterministic conversation_id instead of generating a random UUID.

Same Claude Code session → same cch → same conversation_id → Cursor
server can reuse conversation state across multiple turns, preserving
tool call results and other context without re-encoding history.

Also cleans up temporary debug logging from previous iterations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 20:29:49 +08:00
黄姜恒
274f29e26b fix(cursor): improve session key uniqueness for multi-session safety
Include system prompt prefix (first 200 chars) in session key derivation.
Claude Code sessions have unique system prompts containing cwd, session_id,
file paths, etc., making collisions between concurrent sessions from the
same user virtually impossible.

Session key now = SHA256(apiKey + model + systemPrompt[:200] + firstUserMsg)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 17:24:37 +08:00
黄姜恒
c8e79c3787 fix(cursor): prevent session key collision across users
Include client API key in session key derivation to prevent different
users sharing the same proxy from accidentally resuming each other's
H2 streams when they send identical first messages with the same model.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 17:19:11 +08:00
黄姜恒
8afef43887 fix(cursor): preserve tool call context in multi-turn conversations
When an assistant message appears after tool results without a pending
user message, append it to the last turn's assistant text instead of
dropping it. Also add bakeToolResultsIntoTurns() to merge tool results
into turn context when no active H2 session exists for resume, ensuring
the model sees the full tool interaction history in follow-up requests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 17:15:24 +08:00
黄姜恒
c1083cbfc6 fix(cursor): MCP tool call resume, H2 flow control, and token usage
- Rewrite tool call mechanism from interrupt-resume to inline-wait mode:
  processH2SessionFrames no longer exits on mcpArgs; instead blocks on
  toolResultCh while continuing to handle KV/heartbeat messages, then
  sends MCP result and continues processing text in the same goroutine.
  Fixes the issue where server stopped generating text after resume.

- Add switchable output channel (outMu/currentOut) so first HTTP response
  closes after tool_calls+[DONE], and resumed text goes to a new channel
  returned by resumeWithToolResults. Reset streamParam on switch so
  Translator produces fresh message_start/content_block_start events.

- Implement send-side H2 flow control: track server's initial window size
  and WINDOW_UPDATE increments; Write() blocks when window exhausted.
  Fixes RST_STREAM FLOW_CONTROL_ERROR on large requests (178KB+).

- Decode new InteractionUpdate fields: TurnEndedUpdate (field 14) as
  stream termination signal, HeartbeatUpdate (field 13) silently ignored,
  TokenDeltaUpdate (field 8) for token usage tracking.

- Include token usage in final stop chunk (prompt_tokens estimated from
  payload size, completion_tokens from accumulated TokenDeltaUpdate deltas)
  so Claude CLI status bar shows non-zero token counts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 17:03:14 +08:00
kwz
c89d19b300 Preserve default transport settings for proxy clients 2026-03-25 15:33:09 +08:00
黄姜恒
19c52bcb60 feat: stash code 2026-03-25 10:14:14 +08:00
trph
cc32f5ff61 fix: unify Responses output indexes for streamed items 2026-03-24 08:59:09 +08:00
trph
fbff68b9e0 fix: preserve choice-aware output indexes for streamed tool calls 2026-03-24 08:54:43 +08:00
trph
7e1a543b79 fix: preserve separate streamed tool calls in Responses API 2026-03-24 08:51:15 +08:00
DragonFSKY
74b862d8b8 test(cliproxy): cover delete re-add stale state flow 2026-03-24 00:21:04 +08:00
dinhkarate
36efcc6e28 fix(vertex): include prefix in auth filename and validate at import
Address two blocking issues from PR review:
- Auth file now named vertex-{prefix}-{project}.json so importing the
  same project with different prefixes no longer overwrites credentials
- Prefix containing "/" is rejected at import time instead of being
  silently ignored at runtime
- Add prefix to in-memory metadata map for consistency

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 15:06:04 +07:00
Pham Quang Dinh
a337ecf35c Merge branch 'router-for-me:main' into feat(vertex)/add-prefix-field 2026-03-17 11:48:40 +07:00
DragonFSKY
5c817a9b42 fix(auth): prevent stale ModelStates inheritance from disabled auth entries
When an auth file is deleted and re-created with the same path/ID, the
new auth could inherit stale ModelStates (cooldown/backoff) from the
previously disabled entry, preventing it from being routed.

Gate runtime state inheritance (ModelStates, LastRefreshedAt,
NextRefreshAfter) on both existing and incoming auth being non-disabled
in Manager.Update and Service.applyCoreAuthAddOrUpdate.

Closes #2061
2026-03-14 23:46:23 +08:00
destinoantagonista-wq
e08f68ed7c chore(auth): drop reconcile test file from pr 2026-03-14 14:41:26 +00:00
destinoantagonista-wq
f09ed25fd3 fix(auth): tighten registry model reconciliation 2026-03-14 14:40:06 +00:00
luxvtz
5da0decef6 Improve GitLab Duo gateway compatibility 2026-03-14 03:18:43 -07:00
destinoantagonista-wq
e166e56249 Reconcile registry model states on auth changes
Add Manager.ReconcileRegistryModelStates to clear stale per-model runtime failures for models currently registered in the global model registry. The method finds models supported for an auth, resets non-clean ModelState entries, updates aggregated availability, persists changes, and pushes a snapshot to the scheduler. Introduce modelStateIsClean helper to determine when a model state needs resetting. Call ReconcileRegistryModelStates from Service paths that register/refresh models (applyCoreAuthAddOrUpdate and refreshModelRegistrationForAuth) to keep the scheduler and global registry aligned after model re-registration.
2026-03-13 19:41:49 +00:00
Blue-B
5f58248016 fix(claude): clamp max_tokens to model limit in normalizeClaudeBudget
When adjustedBudget < minBudget, the previous fix blindly set
max_tokens = budgetTokens+1 which could exceed MaxCompletionTokens.

Now: cap max_tokens at MaxCompletionTokens, recalculate budget, and
disable thinking entirely if constraints are unsatisfiable.

Add unit tests covering raise, clamp, disable, and no-op scenarios.
2026-03-09 22:10:30 +09:00
Blue-B
07d6689d87 fix(claude): add interleaved-thinking beta header, AMP gzip error decoding, normalizeClaudeBudget max_tokens
1. Always include interleaved-thinking-2025-05-14 beta header so that
   thinking blocks are returned correctly for all Claude models.

2. Remove status-code guard in AMP reverse proxy ModifyResponse so that
   error responses (4xx/5xx) with hidden gzip encoding are decoded
   properly — prevents garbled error messages reaching the client.

3. In normalizeClaudeBudget, when the adjusted budget falls below the
   model minimum, set max_tokens = budgetTokens+1 instead of leaving
   the request unchanged (which causes a 400 from the API).
2026-03-07 21:31:10 +09:00
dinhkarate
14cb2b95c6 feat(vertex): add --vertex-import-prefix flag for model namespacing 2026-01-29 13:32:38 +07:00
dinhkarate
fdeef48498 feat(vertex): Add Prefix field to VertexCredentialStorage for per-file model namespacing 2026-01-29 13:32:38 +07:00
213 changed files with 25633 additions and 6031 deletions

81
.github/workflows/agents-md-guard.yml vendored Normal file
View File

@@ -0,0 +1,81 @@
name: agents-md-guard
on:
pull_request_target:
types:
- opened
- synchronize
- reopened
permissions:
contents: read
issues: write
pull-requests: write
jobs:
close-when-agents-md-changed:
runs-on: ubuntu-latest
steps:
- name: Detect AGENTS.md changes and close PR
uses: actions/github-script@v7
with:
script: |
const prNumber = context.payload.pull_request.number;
const { owner, repo } = context.repo;
const files = await github.paginate(github.rest.pulls.listFiles, {
owner,
repo,
pull_number: prNumber,
per_page: 100,
});
const touchesAgentsMd = (path) =>
typeof path === "string" &&
(path === "AGENTS.md" || path.endsWith("/AGENTS.md"));
const touched = files.filter(
(f) => touchesAgentsMd(f.filename) || touchesAgentsMd(f.previous_filename),
);
if (touched.length === 0) {
core.info("No AGENTS.md changes detected.");
return;
}
const changedList = touched
.map((f) =>
f.previous_filename && f.previous_filename !== f.filename
? `- ${f.previous_filename} -> ${f.filename}`
: `- ${f.filename}`,
)
.join("\n");
const body = [
"This repository does not allow modifying `AGENTS.md` in pull requests.",
"",
"Detected changes:",
changedList,
"",
"Please revert these changes and open a new PR without touching `AGENTS.md`.",
].join("\n");
try {
await github.rest.issues.createComment({
owner,
repo,
issue_number: prNumber,
body,
});
} catch (error) {
core.warning(`Failed to comment on PR #${prNumber}: ${error.message}`);
}
await github.rest.pulls.update({
owner,
repo,
pull_number: prNumber,
state: "closed",
});
core.setFailed("PR modifies AGENTS.md");

View File

@@ -0,0 +1,73 @@
name: auto-retarget-main-pr-to-dev
on:
pull_request_target:
types:
- opened
- reopened
- edited
branches:
- main
permissions:
contents: read
issues: write
pull-requests: write
jobs:
retarget:
if: github.actor != 'github-actions[bot]'
runs-on: ubuntu-latest
steps:
- name: Retarget PR base to dev
uses: actions/github-script@v7
with:
script: |
const pr = context.payload.pull_request;
const prNumber = pr.number;
const { owner, repo } = context.repo;
const baseRef = pr.base?.ref;
const headRef = pr.head?.ref;
const desiredBase = "dev";
if (baseRef !== "main") {
core.info(`PR #${prNumber} base is ${baseRef}; nothing to do.`);
return;
}
if (headRef === desiredBase) {
core.info(`PR #${prNumber} is ${desiredBase} -> main; skipping retarget.`);
return;
}
core.info(`Retargeting PR #${prNumber} base from ${baseRef} to ${desiredBase}.`);
try {
await github.rest.pulls.update({
owner,
repo,
pull_number: prNumber,
base: desiredBase,
});
} catch (error) {
core.setFailed(`Failed to retarget PR #${prNumber} to ${desiredBase}: ${error.message}`);
return;
}
const body = [
`This pull request targeted \`${baseRef}\`.`,
"",
`The base branch has been automatically changed to \`${desiredBase}\`.`,
].join("\n");
try {
await github.rest.issues.createComment({
owner,
repo,
issue_number: prNumber,
body,
});
} catch (error) {
core.warning(`Failed to comment on PR #${prNumber}: ${error.message}`);
}

10
.gitignore vendored
View File

@@ -1,6 +1,7 @@
# Binaries
cli-proxy-api
cliproxy
/server
*.exe
@@ -36,15 +37,16 @@ GEMINI.md
# Tooling metadata
.vscode/*
.worktrees/
.codex/*
.claude/*
.gemini/*
.serena/*
.agent/*
.agents/*
.agents/*
.opencode/*
.idea/*
.beads/*
.bmad/*
_bmad/*
_bmad-output/*
@@ -53,4 +55,10 @@ _bmad-output/*
# macOS
.DS_Store
._*
# Opencode
.beads/
.opencode/
.cli-proxy-api/
.venv/
*.bak

58
AGENTS.md Normal file
View File

@@ -0,0 +1,58 @@
# AGENTS.md
Go 1.26+ proxy server providing OpenAI/Gemini/Claude/Codex compatible APIs with OAuth and round-robin load balancing.
## Repository
- GitHub: https://github.com/router-for-me/CLIProxyAPI
## Commands
```bash
gofmt -w . # Format (required after Go changes)
go build -o cli-proxy-api ./cmd/server # Build
go run ./cmd/server # Run dev server
go test ./... # Run all tests
go test -v -run TestName ./path/to/pkg # Run single test
go build -o test-output ./cmd/server && rm test-output # Verify compile (REQUIRED after changes)
```
- Common flags: `--config <path>`, `--tui`, `--standalone`, `--local-model`, `--no-browser`, `--oauth-callback-port <port>`
## Config
- Default config: `config.yaml` (template: `config.example.yaml`)
- `.env` is auto-loaded from the working directory
- Auth material defaults under `auths/`
- Storage backends: file-based default; optional Postgres/git/object store (`PGSTORE_*`, `GITSTORE_*`, `OBJECTSTORE_*`)
## Architecture
- `cmd/server/` — Server entrypoint
- `internal/api/` — Gin HTTP API (routes, middleware, modules)
- `internal/api/modules/amp/` — Amp integration (Amp-style routes + reverse proxy)
- `internal/thinking/` — Main thinking/reasoning pipeline. `ApplyThinking()` (apply.go) parses suffixes (`suffix.go`, suffix overrides body), normalizes config to canonical `ThinkingConfig` (`types.go`), normalizes and validates centrally (`validate.go`/`convert.go`), then applies provider-specific output via `ProviderApplier`. Do not break this "canonical representation → per-provider translation" architecture.
- `internal/runtime/executor/` — Per-provider runtime executors (incl. Codex WebSocket)
- `internal/translator/` — Provider protocol translators (and shared `common`)
- `internal/registry/` — Model registry + remote updater (`StartModelsUpdater`); `--local-model` disables remote updates
- `internal/store/` — Storage implementations and secret resolution
- `internal/managementasset/` — Config snapshots and management assets
- `internal/cache/` — Request signature caching
- `internal/watcher/` — Config hot-reload and watchers
- `internal/wsrelay/` — WebSocket relay sessions
- `internal/usage/` — Usage and token accounting
- `internal/tui/` — Bubbletea terminal UI (`--tui`, `--standalone`)
- `sdk/cliproxy/` — Embeddable SDK entry (service/builder/watchers/pipeline)
- `test/` — Cross-module integration tests
## Code Conventions
- Keep changes small and simple (KISS)
- Comments in English only
- If editing code that already contains non-English comments, translate them to English (dont add new non-English comments)
- For user-visible strings, keep the existing language used in that file/area
- New Markdown docs should be in English unless the file is explicitly language-specific (e.g. `README_CN.md`)
- As a rule, do not make standalone changes to `internal/translator/`. You may modify it only as part of broader changes elsewhere.
- If a task requires changing only `internal/translator/`, run `gh repo view --json viewerPermission -q .viewerPermission` to confirm you have `WRITE`, `MAINTAIN`, or `ADMIN`. If you do, you may proceed; otherwise, file a GitHub issue including the goal, rationale, and the intended implementation code, then stop further work.
- `internal/runtime/executor/` should contain executors and their unit tests only. Place any helper/supporting files under `internal/runtime/executor/helps/`.
- Follow `gofmt`; keep imports goimports-style; wrap errors with context where helpful
- Do not use `log.Fatal`/`log.Fatalf` (terminates the process); prefer returning errors and logging via logrus
- Shadowed variables: use method suffix (`errStart := server.Start()`)
- Wrap defer errors: `defer func() { if err := f.Close(); err != nil { log.Errorf(...) } }()`
- Use logrus structured logging; avoid leaking secrets/tokens in logs
- Avoid panics in HTTP handlers; prefer logged errors and meaningful HTTP status codes
- Timeouts are allowed only during credential acquisition; after an upstream connection is established, do not set timeouts for any subsequent network behavior. Intentional exceptions that must remain allowed are the Codex websocket liveness deadlines in `internal/runtime/executor/codex_websockets_executor.go`, the wsrelay session deadlines in `internal/wsrelay/session.go`, the management APICall timeout in `internal/api/handlers/management/api_tools.go`, and the `cmd/fetch_antigravity_models` utility timeouts

View File

@@ -1,6 +1,6 @@
# CLIProxyAPI Plus
[English](README.md) | 中文 | [日本語](README_JA.md)
[English](README.md) | 中文
这是 [CLIProxyAPI](https://github.com/router-for-me/CLIProxyAPI) 的 Plus 版本,在原有基础上增加了第三方供应商的支持。

View File

@@ -1,187 +0,0 @@
# CLI Proxy API
[English](README.md) | [中文](README_CN.md) | 日本語
CLI向けのOpenAI/Gemini/Claude/Codex互換APIインターフェースを提供するプロキシサーバーです。
OAuth経由でOpenAI CodexGPTモデルおよびClaude Codeもサポートしています。
ローカルまたはマルチアカウントのCLIアクセスを、OpenAIResponses含む/Gemini/Claude互換のクライアントやSDKで利用できます。
## スポンサー
[![z.ai](https://assets.router-for.me/english-5-0.jpg)](https://z.ai/subscribe?ic=8JVLJQFSKB)
本プロジェクトはZ.aiにスポンサーされており、GLM CODING PLANの提供を受けています。
GLM CODING PLANはAIコーディング向けに設計されたサブスクリプションサービスで、月額わずか$10から利用可能です。フラッグシップのGLM-4.7およびGLM-5はProユーザーのみ利用可能モデルを10以上の人気AIコーディングツールClaude Code、Cline、Roo Codeなどで利用でき、開発者にトップクラスの高速かつ安定したコーディング体験を提供します。
GLM CODING PLANを10%割引で取得https://z.ai/subscribe?ic=8JVLJQFSKB
---
<table>
<tbody>
<tr>
<td width="180"><a href="https://www.packyapi.com/register?aff=cliproxyapi"><img src="./assets/packycode.png" alt="PackyCode" width="150"></a></td>
<td>PackyCodeのスポンサーシップに感謝しますPackyCodeは信頼性が高く効率的なAPIリレーサービスプロバイダーで、Claude Code、Codex、Geminiなどのリレーサービスを提供しています。PackyCodeは当ソフトウェアのユーザーに特別割引を提供しています<a href="https://www.packyapi.com/register?aff=cliproxyapi">こちらのリンク</a>から登録し、チャージ時にプロモーションコード「cliproxyapi」を入力すると10%割引になります。</td>
</tr>
<tr>
<td width="180"><a href="https://www.aicodemirror.com/register?invitecode=TJNAIF"><img src="./assets/aicodemirror.png" alt="AICodeMirror" width="150"></a></td>
<td>AICodeMirrorのスポンサーシップに感謝しますAICodeMirrorはClaude Code / Codex / Gemini CLI向けの公式高安定性リレーサービスを提供しており、エンタープライズグレードの同時接続、迅速な請求書発行、24時間365日の専任技術サポートを備えています。Claude Code / Codex / Geminiの公式チャネルが元の価格の38% / 2% / 9%で利用でき、チャージ時にはさらに割引がありますCLIProxyAPIユーザー向けの特別特典<a href="https://www.aicodemirror.com/register?invitecode=TJNAIF">こちらのリンク</a>から登録すると、初回チャージが20%割引になり、エンタープライズのお客様は最大25%割引を受けられます!</td>
</tr>
<tr>
<td width="180"><a href="https://shop.bmoplus.com/?utm_source=github"><img src="./assets/bmoplus.png" alt="BmoPlus" width="150"></a></td>
<td>本プロジェクトにご支援いただいた BmoPlus に感謝いたしますBmoPlusは、AIサブスクリプションのヘビーユーザー向けに特化した信頼性の高いAIアカウントサービスプロバイダーであり、安定した ChatGPT Plus / ChatGPT Pro (完全保証) / Claude Pro / Super Grok / Gemini Pro の公式代行チャージおよび即納アカウントを提供しています。こちらの<a href="https://shop.bmoplus.com/?utm_source=github">BmoPlus AIアカウント専門店/代行チャージ</a>経由でご登録・ご注文いただいたユーザー様は、GPTを <b>公式サイト価格の約1割90% OFF</b> という驚異的な価格でご利用いただけます!</td>
</tr>
</tbody>
</table>
## 概要
- CLIモデル向けのOpenAI/Gemini/Claude互換APIエンドポイント
- OAuthログインによるOpenAI CodexサポートGPTモデル
- OAuthログインによるClaude Codeサポート
- OAuthログインによるQwen Codeサポート
- OAuthログインによるiFlowサポート
- プロバイダールーティングによるAmp CLIおよびIDE拡張機能のサポート
- ストリーミングおよび非ストリーミングレスポンス
- 関数呼び出し/ツールのサポート
- マルチモーダル入力サポート(テキストと画像)
- ラウンドロビン負荷分散による複数アカウント対応Gemini、OpenAI、Claude、QwenおよびiFlow
- シンプルなCLI認証フローGemini、OpenAI、Claude、QwenおよびiFlow
- Generative Language APIキーのサポート
- AI Studioビルドのマルチアカウント負荷分散
- Gemini CLIのマルチアカウント負荷分散
- Claude Codeのマルチアカウント負荷分散
- Qwen Codeのマルチアカウント負荷分散
- iFlowのマルチアカウント負荷分散
- OpenAI Codexのマルチアカウント負荷分散
- 設定によるOpenAI互換アップストリームプロバイダーOpenRouter
- プロキシ埋め込み用の再利用可能なGo SDK`docs/sdk-usage.md`を参照)
## はじめに
CLIProxyAPIガイド[https://help.router-for.me/](https://help.router-for.me/)
## 管理API
[MANAGEMENT_API.md](https://help.router-for.me/management/api)を参照
## Amp CLIサポート
CLIProxyAPIは[Amp CLI](https://ampcode.com)およびAmp IDE拡張機能の統合サポートを含んでおり、Google/ChatGPT/ClaudeのOAuthサブスクリプションをAmpのコーディングツールで使用できます
- Ampの APIパターン用のプロバイダールートエイリアス`/api/provider/{provider}/v1...`
- OAuth認証およびアカウント機能用の管理プロキシ
- 自動ルーティングによるスマートモデルフォールバック
- 利用できないモデルを代替モデルにルーティングする**モデルマッピング**(例:`claude-opus-4.5``claude-sonnet-4`
- localhostのみの管理エンドポイントによるセキュリティファーストの設計
**→ [Amp CLI統合ガイドの完全版](https://help.router-for.me/agent-client/amp-cli.html)**
## SDKドキュメント
- 使い方:[docs/sdk-usage.md](docs/sdk-usage.md)
- 上級(エグゼキューターとトランスレーター):[docs/sdk-advanced.md](docs/sdk-advanced.md)
- アクセス:[docs/sdk-access.md](docs/sdk-access.md)
- ウォッチャー:[docs/sdk-watcher.md](docs/sdk-watcher.md)
- カスタムプロバイダーの例:`examples/custom-provider`
## コントリビューション
コントリビューションを歓迎しますお気軽にPull Requestを送ってください。
1. リポジトリをフォーク
2. フィーチャーブランチを作成(`git checkout -b feature/amazing-feature`
3. 変更をコミット(`git commit -m 'Add some amazing feature'`
4. ブランチにプッシュ(`git push origin feature/amazing-feature`
5. Pull Requestを作成
## 関連プロジェクト
CLIProxyAPIをベースにした以下のプロジェクトがあります
### [vibeproxy](https://github.com/automazeio/vibeproxy)
macOSネイティブのメニューバーアプリで、Claude CodeとChatGPTのサブスクリプションをAIコーディングツールで使用可能 - APIキー不要
### [Subtitle Translator](https://github.com/VjayC/SRT-Subtitle-Translator-Validator)
CLIProxyAPI経由でGeminiサブスクリプションを使用してSRT字幕を翻訳するブラウザベースのツール。自動検証/エラー修正機能付き - APIキー不要
### [CCS (Claude Code Switch)](https://github.com/kaitranntt/ccs)
CLIProxyAPI OAuthを使用して複数のClaudeアカウントや代替モデルGemini、Codex、Antigravityを即座に切り替えるCLIラッパー - APIキー不要
### [ProxyPal](https://github.com/heyhuynhgiabuu/proxypal)
CLIProxyAPI管理用のmacOSネイティブGUIOAuth経由でプロバイダー、モデルマッピング、エンドポイントを設定 - APIキー不要
### [Quotio](https://github.com/nguyenphutrong/quotio)
Claude、Gemini、OpenAI、Qwen、Antigravityのサブスクリプションを統合し、リアルタイムのクォータ追跡とスマート自動フェイルオーバーを備えたmacOSネイティブのメニューバーアプリ。Claude Code、OpenCode、Droidなどのコーディングツール向け - APIキー不要
### [CodMate](https://github.com/loocor/CodMate)
CLI AIセッションCodex、Claude Code、Gemini CLIを管理するmacOS SwiftUIネイティブアプリ。統合プロバイダー管理、Gitレビュー、プロジェクト整理、グローバル検索、ターミナル統合機能を搭載。CLIProxyAPIと統合し、Codex、Claude、Gemini、Antigravity、Qwen CodeのOAuth認証を提供。単一のプロキシエンドポイントを通じた組み込みおよびサードパーティプロバイダーの再ルーティングに対応 - OAuthプロバイダーではAPIキー不要
### [ProxyPilot](https://github.com/Finesssee/ProxyPilot)
TUI、システムトレイ、マルチプロバイダーOAuthを備えたWindows向けCLIProxyAPIフォーク - AIコーディングツール用、APIキー不要
### [Claude Proxy VSCode](https://github.com/uzhao/claude-proxy-vscode)
Claude Codeモデルを素早く切り替えるVSCode拡張機能。バックエンドとしてCLIProxyAPIを統合し、バックグラウンドでの自動ライフサイクル管理を搭載
### [ZeroLimit](https://github.com/0xtbug/zero-limit)
CLIProxyAPIを使用してAIコーディングアシスタントのクォータを監視するTauri + React製のWindowsデスクトップアプリ。Gemini、Claude、OpenAI Codex、Antigravityアカウントの使用量をリアルタイムダッシュボード、システムトレイ統合、ワンクリックプロキシコントロールで追跡 - APIキー不要
### [CPA-XXX Panel](https://github.com/ferretgeek/CPA-X)
CLIProxyAPI向けの軽量Web管理パネル。ヘルスチェック、リソース監視、リアルタイムログ、自動更新、リクエスト統計、料金表示機能を搭載。ワンクリックインストールとsystemdサービスに対応
### [CLIProxyAPI Tray](https://github.com/kitephp/CLIProxyAPI_Tray)
PowerShellスクリプトで実装されたWindowsトレイアプリケーション。サードパーティライブラリに依存せず、ショートカットの自動作成、サイレント実行、パスワード管理、チャネル切り替えMain / Plus、自動ダウンロードおよび自動更新に対応
### [霖君](https://github.com/wangdabaoqq/LinJun)
霖君はAIプログラミングアシスタントを管理するクロスプラットフォームデスクトップアプリケーションで、macOS、Windows、Linuxシステムに対応。Claude Code、Gemini CLI、OpenAI Codex、Qwen Codeなどのコーディングツールを統合管理し、ローカルプロキシによるマルチアカウントクォータ追跡とワンクリック設定が可能
### [CLIProxyAPI Dashboard](https://github.com/itsmylife44/cliproxyapi-dashboard)
Next.js、React、PostgreSQLで構築されたCLIProxyAPI用のモダンなWebベース管理ダッシュボード。リアルタイムログストリーミング、構造化された設定編集、APIキー管理、Claude/Gemini/Codex向けOAuthプロバイダー統合、使用量分析、コンテナ管理、コンパニオンプラグインによるOpenCodeとの設定同期機能を搭載 - 手動でのYAML編集は不要
### [All API Hub](https://github.com/qixing-jk/all-api-hub)
New API互換リレーサイトアカウントをワンストップで管理するブラウザ拡張機能。残高と使用量のダッシュボード、自動チェックイン、一般的なアプリへのワンクリックキーエクスポート、ページ内API可用性テスト、チャネル/モデルの同期とリダイレクト機能を搭載。Management APIを通じてCLIProxyAPIと統合し、ワンクリックでプロバイダーのインポートと設定同期が可能
### [Shadow AI](https://github.com/HEUDavid/shadow-ai)
Shadow AIは制限された環境向けに特別に設計されたAIアシスタントツールです。ウィンドウや痕跡のないステルス動作モードを提供し、LANローカルエリアネットワークを介したクロスデバイスAI質疑応答のインタラクションと制御を可能にします。本質的には「画面/音声キャプチャ + AI推論 + 低摩擦デリバリー」の自動化コラボレーションレイヤーであり、制御されたデバイスや制限された環境でアプリケーション横断的にAIアシスタントを没入的に使用できるようユーザーを支援します。
> [!NOTE]
> CLIProxyAPIをベースにプロジェクトを開発した場合は、PRを送ってこのリストに追加してください。
## その他の選択肢
以下のプロジェクトはCLIProxyAPIの移植版またはそれに触発されたものです
### [9Router](https://github.com/decolua/9router)
CLIProxyAPIに触発されたNext.js実装。インストールと使用が簡単で、フォーマット変換OpenAI/Claude/Gemini/Ollama、自動フォールバック付きコンボシステム、指数バックオフ付きマルチアカウント管理、Next.js Webダッシュボード、CLIツールCursor、Claude Code、Cline、RooCodeのサポートをゼロから構築 - APIキー不要
### [OmniRoute](https://github.com/diegosouzapw/OmniRoute)
コーディングを止めない。無料および低コストのAIモデルへのスマートルーティングと自動フォールバック。
OmniRouteはマルチプロバイダーLLM向けのAIゲートウェイですスマートルーティング、負荷分散、リトライ、フォールバックを備えたOpenAI互換エンドポイント。ポリシー、レート制限、キャッシュ、可観測性を追加して、信頼性が高くコストを意識した推論を実現します。
> [!NOTE]
> CLIProxyAPIの移植版またはそれに触発されたプロジェクトを開発した場合は、PRを送ってこのリストに追加してください。
## ライセンス
本プロジェクトはMITライセンスの下でライセンスされています - 詳細は[LICENSE](LICENSE)ファイルを参照してください。

BIN
assets/lingtrue.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 129 KiB

View File

@@ -26,6 +26,7 @@ import (
"time"
"github.com/router-for-me/CLIProxyAPI/v6/internal/logging"
"github.com/router-for-me/CLIProxyAPI/v6/internal/misc"
sdkauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/auth"
coreauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
"github.com/router-for-me/CLIProxyAPI/v6/sdk/proxyutil"
@@ -188,7 +189,7 @@ func fetchModels(ctx context.Context, auth *coreauth.Auth) []modelEntry {
httpReq.Close = true
httpReq.Header.Set("Content-Type", "application/json")
httpReq.Header.Set("Authorization", "Bearer "+accessToken)
httpReq.Header.Set("User-Agent", "antigravity/1.19.6 darwin/arm64")
httpReq.Header.Set("User-Agent", misc.AntigravityUserAgent())
httpClient := &http.Client{Timeout: 30 * time.Second}
if transport, _, errProxy := proxyutil.BuildHTTPTransport(auth.ProxyURL); errProxy == nil && transport != nil {

20
cmd/mcpdebug/main.go Normal file
View File

@@ -0,0 +1,20 @@
package main
import (
"encoding/hex"
"fmt"
"os"
cursorproto "github.com/router-for-me/CLIProxyAPI/v6/internal/auth/cursor/proto"
)
func main() {
// Encode MCP result with empty execId
resultBytes := cursorproto.EncodeExecMcpResult(1, "", `{"test": "data"}`, false)
fmt.Printf("Result protobuf hex: %s\n", hex.EncodeToString(resultBytes))
fmt.Printf("Result length: %d bytes\n", len(resultBytes))
// Write to file for analysis
os.WriteFile("mcp_result.bin", resultBytes)
fmt.Println("Wrote mcp_result.bin")
}

32
cmd/protocheck/main.go Normal file
View File

@@ -0,0 +1,32 @@
package main
import (
"fmt"
cursorproto "github.com/router-for-me/CLIProxyAPI/v6/internal/auth/cursor/proto"
)
func main() {
ecm := cursorproto.NewMsg("ExecClientMessage")
// Try different field names
names := []string{
"mcp_result", "mcpResult", "McpResult", "MCP_RESULT",
"shell_result", "shellResult",
}
for _, name := range names {
fd := ecm.Descriptor().Fields().ByName(name)
if fd != nil {
fmt.Printf("Found field %q: number=%d, kind=%s\n", name, fd.Number(), fd.Kind())
} else {
fmt.Printf("Field %q NOT FOUND\n", name)
}
}
// List all fields
fmt.Println("\nAll fields in ExecClientMessage:")
for i := 0; i < ecm.Descriptor().Fields().Len(); i++ {
f := ecm.Descriptor().Fields().Get(i)
fmt.Printf(" %d: %q (number=%d)\n", i, f.Name(), f.Number())
}
}

View File

@@ -75,7 +75,6 @@ func main() {
var codexLogin bool
var codexDeviceLogin bool
var claudeLogin bool
var qwenLogin bool
var kiloLogin bool
var iflowLogin bool
var iflowCookie bool
@@ -85,6 +84,7 @@ func main() {
var oauthCallbackPort int
var antigravityLogin bool
var kimiLogin bool
var cursorLogin bool
var kiroLogin bool
var kiroGoogleLogin bool
var kiroAWSLogin bool
@@ -98,6 +98,7 @@ func main() {
var codeBuddyLogin bool
var projectID string
var vertexImport string
var vertexImportPrefix string
var configPath string
var password string
var tuiMode bool
@@ -111,7 +112,6 @@ func main() {
flag.BoolVar(&codexLogin, "codex-login", false, "Login to Codex using OAuth")
flag.BoolVar(&codexDeviceLogin, "codex-device-login", false, "Login to Codex using device code flow")
flag.BoolVar(&claudeLogin, "claude-login", false, "Login to Claude using OAuth")
flag.BoolVar(&qwenLogin, "qwen-login", false, "Login to Qwen using OAuth")
flag.BoolVar(&kiloLogin, "kilo-login", false, "Login to Kilo AI using device flow")
flag.BoolVar(&iflowLogin, "iflow-login", false, "Login to iFlow using OAuth")
flag.BoolVar(&iflowCookie, "iflow-cookie", false, "Login to iFlow using Cookie")
@@ -123,6 +123,7 @@ func main() {
flag.BoolVar(&noIncognito, "no-incognito", false, "Force disable incognito mode (uses existing browser session)")
flag.BoolVar(&antigravityLogin, "antigravity-login", false, "Login to Antigravity using OAuth")
flag.BoolVar(&kimiLogin, "kimi-login", false, "Login to Kimi using OAuth")
flag.BoolVar(&cursorLogin, "cursor-login", false, "Login to Cursor using OAuth")
flag.BoolVar(&kiroLogin, "kiro-login", false, "Login to Kiro using Google OAuth")
flag.BoolVar(&kiroGoogleLogin, "kiro-google-login", false, "Login to Kiro using Google OAuth (same as --kiro-login)")
flag.BoolVar(&kiroAWSLogin, "kiro-aws-login", false, "Login to Kiro using AWS Builder ID (device code flow)")
@@ -137,6 +138,7 @@ func main() {
flag.StringVar(&projectID, "project_id", "", "Project ID (Gemini only, not required)")
flag.StringVar(&configPath, "config", DefaultConfigPath, "Configure File Path")
flag.StringVar(&vertexImport, "vertex-import", "", "Import Vertex service account key JSON file")
flag.StringVar(&vertexImportPrefix, "vertex-import-prefix", "", "Prefix for Vertex model namespacing (use with -vertex-import)")
flag.StringVar(&password, "password", "", "")
flag.BoolVar(&tuiMode, "tui", false, "Start with terminal management UI")
flag.BoolVar(&standalone, "standalone", false, "In TUI mode, start an embedded local server")
@@ -186,6 +188,7 @@ func main() {
gitStoreRemoteURL string
gitStoreUser string
gitStorePassword string
gitStoreBranch string
gitStoreLocalPath string
gitStoreInst *store.GitTokenStore
gitStoreRoot string
@@ -255,6 +258,9 @@ func main() {
if value, ok := lookupEnv("GITSTORE_LOCAL_PATH", "gitstore_local_path"); ok {
gitStoreLocalPath = value
}
if value, ok := lookupEnv("GITSTORE_GIT_BRANCH", "gitstore_git_branch"); ok {
gitStoreBranch = value
}
if value, ok := lookupEnv("OBJECTSTORE_ENDPOINT", "objectstore_endpoint"); ok {
useObjectStore = true
objectStoreEndpoint = value
@@ -389,7 +395,7 @@ func main() {
}
gitStoreRoot = filepath.Join(gitStoreLocalPath, "gitstore")
authDir := filepath.Join(gitStoreRoot, "auths")
gitStoreInst = store.NewGitTokenStore(gitStoreRemoteURL, gitStoreUser, gitStorePassword)
gitStoreInst = store.NewGitTokenStore(gitStoreRemoteURL, gitStoreUser, gitStorePassword, gitStoreBranch)
gitStoreInst.SetBaseDir(authDir)
if errRepo := gitStoreInst.EnsureRepository(); errRepo != nil {
log.Errorf("failed to prepare git token store: %v", errRepo)
@@ -508,7 +514,7 @@ func main() {
if vertexImport != "" {
// Handle Vertex service account import
cmd.DoVertexImport(cfg, vertexImport)
cmd.DoVertexImport(cfg, vertexImport, vertexImportPrefix)
} else if login {
// Handle Google/Gemini login
cmd.DoLogin(cfg, projectID, options)
@@ -530,8 +536,6 @@ func main() {
} else if claudeLogin {
// Handle Claude login
cmd.DoClaudeLogin(cfg, options)
} else if qwenLogin {
cmd.DoQwenLogin(cfg, options)
} else if kiloLogin {
cmd.DoKiloLogin(cfg, options)
} else if iflowLogin {
@@ -544,6 +548,8 @@ func main() {
cmd.DoGitLabTokenLogin(cfg, options)
} else if kimiLogin {
cmd.DoKimiLogin(cfg, options)
} else if cursorLogin {
cmd.DoCursorLogin(cfg, options)
} else if kiroLogin {
// For Kiro auth, default to incognito mode for multi-account support
// Users can explicitly override with --no-incognito
@@ -592,6 +598,7 @@ func main() {
if standalone {
// Standalone mode: start an embedded local server and connect TUI client to it.
managementasset.StartAutoUpdater(context.Background(), configFilePath)
misc.StartAntigravityVersionUpdater(context.Background())
if !localModel {
registry.StartModelsUpdater(context.Background())
}
@@ -667,6 +674,7 @@ func main() {
} else {
// Start the main proxy service
managementasset.StartAutoUpdater(context.Background(), configFilePath)
misc.StartAntigravityVersionUpdater(context.Background())
if !localModel {
registry.StartModelsUpdater(context.Background())
}

View File

@@ -92,26 +92,54 @@ max-retry-credentials: 0
# Maximum wait time in seconds for a cooled-down credential before triggering a retry.
max-retry-interval: 30
# When true, disable auth/model cooldown scheduling globally (prevents blackout windows after failure states).
disable-cooling: false
# Core auth auto-refresh worker pool size (OAuth/file-based auth token refresh).
# When > 0, overrides the default worker count (16).
# auth-auto-refresh-workers: 16
# Quota exceeded behavior
quota-exceeded:
switch-project: true # Whether to automatically switch to another project when a quota is exceeded
switch-preview-model: true # Whether to automatically switch to a preview model when a quota is exceeded
antigravity-credits: true # Whether to retry Antigravity quota_exhausted 429s once with enabledCreditTypes=["GOOGLE_ONE_AI"]
# Routing strategy for selecting credentials when multiple match.
routing:
strategy: 'round-robin' # round-robin (default), fill-first
strategy: "round-robin" # round-robin (default), fill-first
# Enable universal session-sticky routing for all clients.
# Session IDs are extracted from: X-Session-ID header, Idempotency-Key,
# metadata.user_id, conversation_id, or first few messages hash.
# Automatic failover is always enabled when bound auth becomes unavailable.
session-affinity: false # default: false
# How long session-to-auth bindings are retained. Default: 1h
session-affinity-ttl: "1h"
# When true, enable authentication for the WebSocket API (/v1/ws).
ws-auth: false
# When true, enable Gemini CLI internal endpoints (/v1internal:*).
# Default is false for safety.
enable-gemini-cli-endpoint: false
# When > 0, emit blank lines every N seconds for non-streaming responses to prevent idle timeouts.
nonstream-keepalive-interval: 0
# Streaming behavior (SSE keep-alives + safe bootstrap retries).
# streaming:
# keepalive-seconds: 15 # Default: 0 (disabled). <= 0 disables keep-alives.
# bootstrap-retries: 1 # Default: 0 (disabled). Retries before first byte is sent.
# Signature cache validation for thinking blocks (Antigravity/Claude).
# When true (default), cached signatures are preferred and validated.
# When false, client signatures are used directly after normalization (bypass mode for testing).
# antigravity-signature-cache-enabled: true
# Bypass mode signature validation strictness (only applies when signature cache is disabled).
# When true, validates full Claude protobuf tree (Field 2 -> Field 1 structure).
# When false (default), only checks R/E prefix + base64 + first byte 0x12.
# antigravity-signature-bypass-strict: false
# Gemini API keys
# gemini-api-key:
# - api-key: "AIzaSy...01"
@@ -177,6 +205,8 @@ nonstream-keepalive-interval: 0
# - "API"
# - "proxy"
# cache-user-id: true # optional: default is false; set true to reuse cached user_id per API key instead of generating a random one each request
# experimental-cch-signing: false # optional: default is false; when true, sign the final /v1/messages body using the current Claude Code cch algorithm
# # keep this disabled unless you explicitly need the behavior, so upstream seed changes fall back to legacy proxy behavior
# Default headers for Claude API requests. Update when Claude Code releases new versions.
# In legacy mode, user-agent/package-version/runtime-version/timeout are used as fallbacks
@@ -250,7 +280,7 @@ nonstream-keepalive-interval: 0
# # Requests to that alias will round-robin across the upstream names below,
# # and if the chosen upstream fails before producing output, the request will
# # continue with the next upstream model in the same alias pool.
# - name: "qwen3.5-plus"
# - name: "deepseek-v3.1"
# alias: "claude-opus-4.66"
# - name: "glm-5"
# alias: "claude-opus-4.66"
@@ -311,8 +341,12 @@ nonstream-keepalive-interval: 0
# Global OAuth model name aliases (per channel)
# These aliases rename model IDs for both model listing and request routing.
# Supported channels: gemini-cli, vertex, aistudio, antigravity, claude, codex, qwen, iflow, kiro, github-copilot, kimi.
# Supported channels: gemini-cli, vertex, aistudio, antigravity, claude, codex, iflow, kiro, github-copilot, kimi.
# NOTE: Aliases do not apply to gemini-api-key, codex-api-key, claude-api-key, openai-compatibility, vertex-api-key, or ampcode.
# NOTE: Because aliases affect the merged /v1 model list and merged request routing, overlapping
# client-visible names can become ambiguous across providers. /api/provider/{provider}/... helps
# you select the protocol surface, but inference backend selection can still follow the resolved
# model/alias. For strict backend pinning, use unique aliases/prefixes or avoid overlapping names.
# You can repeat the same name with different aliases to expose multiple client model names.
# oauth-model-alias:
# antigravity:
@@ -346,12 +380,6 @@ nonstream-keepalive-interval: 0
# codex:
# - name: "gpt-5"
# alias: "g5"
# qwen:
# - name: "qwen3-coder-plus"
# alias: "qwen-plus"
# iflow:
# - name: "glm-4.7"
# alias: "glm-god"
# kimi:
# - name: "kimi-k2.5"
# alias: "k2.5"
@@ -380,10 +408,6 @@ nonstream-keepalive-interval: 0
# - "claude-3-5-haiku-20241022"
# codex:
# - "gpt-5-codex-mini"
# qwen:
# - "vision-model"
# iflow:
# - "tstars2.0"
# kimi:
# - "kimi-k2-thinking"
# kiro:

View File

@@ -109,10 +109,19 @@ wait_for_service() {
sleep 2
}
if [[ "${1:-}" == "--with-usage" ]]; then
WITH_USAGE=true
export_stats_api_secret
fi
case "${1:-}" in
"")
;;
"--with-usage")
WITH_USAGE=true
export_stats_api_secret
;;
*)
echo "Error: unknown option '${1}'. Did you mean '--with-usage'?"
echo "Usage: ./docker-build.sh [--with-usage]"
exit 1
;;
esac
# --- Step 1: Choose Environment ---
echo "Please select an option:"

View File

@@ -1,278 +0,0 @@
# Plan: GitLab Duo Codex Parity
**Generated**: 2026-03-10
**Estimated Complexity**: High
## Overview
Bring GitLab Duo support from the current "auth + basic executor" stage to the same practical level as `codex` inside `CLIProxyAPI`: a user logs in once, points external clients such as Claude Code at `CLIProxyAPI`, selects GitLab Duo-backed models, and gets stable streaming, multi-turn behavior, tool calling compatibility, and predictable model routing without manual provider-specific workarounds.
The core architectural shift is to stop treating GitLab Duo as only two REST wrappers (`/api/v4/chat/completions` and `/api/v4/code_suggestions/completions`) and instead use GitLab's `direct_access` contract as the primary runtime entrypoint wherever possible. Official GitLab docs confirm that `direct_access` returns AI gateway connection details, headers, token, and expiry; that contract is the closest path to codex-like provider behavior.
## Prerequisites
- Official GitLab Duo API references confirmed during implementation:
- `POST /api/v4/code_suggestions/direct_access`
- `POST /api/v4/code_suggestions/completions`
- `POST /api/v4/chat/completions`
- Access to at least one real GitLab Duo account for manual verification.
- One downstream client target for acceptance testing:
- Claude Code against Claude-compatible endpoint
- OpenAI-compatible client against `/v1/chat/completions` and `/v1/responses`
- Existing PR branch as starting point:
- `feat/gitlab-duo-auth`
- PR [#2028](https://github.com/router-for-me/CLIProxyAPI/pull/2028)
## Definition Of Done
- GitLab Duo models can be used via `CLIProxyAPI` from the same client surfaces that already work for `codex`.
- Upstream streaming is real passthrough or faithful chunked forwarding, not synthetic whole-response replay.
- Tool/function calling survives translation layers without dropping fields or corrupting names.
- Multi-turn and session semantics are stable across `chat/completions`, `responses`, and Claude-compatible routes.
- Model exposure stays current from GitLab metadata or gateway discovery without hardcoded stale model tables.
- `go test ./...` stays green and at least one real manual end-to-end client flow is documented.
## Sprint 1: Contract And Gap Closure
**Goal**: Replace assumptions with a hard compatibility contract between current `codex` behavior and what GitLab Duo can actually support.
**Demo/Validation**:
- Written matrix showing `codex` features vs current GitLab Duo behavior.
- One checked-in developer note or test fixture for real GitLab Duo payload examples.
### Task 1.1: Freeze Codex Parity Checklist
- **Location**: [internal/runtime/executor/codex_executor.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/runtime/executor/codex_executor.go), [internal/runtime/executor/codex_websockets_executor.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/runtime/executor/codex_websockets_executor.go), [sdk/api/handlers/openai/openai_responses_handlers.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/sdk/api/handlers/openai/openai_responses_handlers.go), [sdk/api/handlers/openai/openai_responses_websocket.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/sdk/api/handlers/openai/openai_responses_websocket.go)
- **Description**: Produce a concrete feature matrix for `codex`: HTTP execute, SSE execute, `/v1/responses`, websocket downstream path, tool calling, request IDs, session close semantics, and model registration behavior.
- **Dependencies**: None
- **Acceptance Criteria**:
- A checklist exists in repo docs or issue notes.
- Each capability is marked `required`, `optional`, or `not possible` for GitLab Duo.
- **Validation**:
- Review against current `codex` code paths.
### Task 1.2: Lock GitLab Duo Runtime Contract
- **Location**: [internal/auth/gitlab/gitlab.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/auth/gitlab/gitlab.go), [internal/runtime/executor/gitlab_executor.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/runtime/executor/gitlab_executor.go)
- **Description**: Validate the exact upstream contract we can rely on:
- `direct_access` fields and refresh cadence
- whether AI gateway path is usable directly
- when `chat/completions` is available vs when fallback is required
- what streaming shape is returned by `code_suggestions/completions?stream=true`
- **Dependencies**: Task 1.1
- **Acceptance Criteria**:
- GitLab transport decision is explicit: `gateway-first`, `REST-first`, or `hybrid`.
- Unknown areas are isolated behind feature flags, not spread across executor logic.
- **Validation**:
- Official docs + captured real responses from a Duo account.
### Task 1.3: Define Client-Facing Compatibility Targets
- **Location**: [README.md](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/README.md), [gitlab-duo-codex-parity-plan.md](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/gitlab-duo-codex-parity-plan.md)
- **Description**: Define exactly which external flows must work to call GitLab Duo support "like codex".
- **Dependencies**: Task 1.2
- **Acceptance Criteria**:
- Required surfaces are listed:
- Claude-compatible route
- OpenAI `chat/completions`
- OpenAI `responses`
- optional downstream websocket path
- Non-goals are explicit if GitLab upstream cannot support them.
- **Validation**:
- Maintainer review of stated scope.
## Sprint 2: Primary Transport Parity
**Goal**: Move GitLab Duo execution onto a transport that supports codex-like runtime behavior.
**Demo/Validation**:
- A GitLab Duo model works over real streaming through `/v1/chat/completions`.
- No synthetic "collect full body then fake stream" path remains on the primary flow.
### Task 2.1: Refactor GitLab Executor Into Strategy Layers
- **Location**: [internal/runtime/executor/gitlab_executor.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/runtime/executor/gitlab_executor.go)
- **Description**: Split current executor into explicit strategies:
- auth refresh/direct access refresh
- gateway transport
- GitLab REST fallback transport
- downstream translation helpers
- **Dependencies**: Sprint 1
- **Acceptance Criteria**:
- Executor no longer mixes discovery, refresh, fallback selection, and response synthesis in one path.
- Transport choice is testable in isolation.
- **Validation**:
- Unit tests for strategy selection and fallback boundaries.
### Task 2.2: Implement Real Streaming Path
- **Location**: [internal/runtime/executor/gitlab_executor.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/runtime/executor/gitlab_executor.go), [internal/runtime/executor/gitlab_executor_test.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/runtime/executor/gitlab_executor_test.go)
- **Description**: Replace synthetic streaming with true upstream incremental forwarding:
- use gateway stream if available
- otherwise consume GitLab Code Suggestions streaming response and map chunks incrementally
- **Dependencies**: Task 2.1
- **Acceptance Criteria**:
- `ExecuteStream` emits chunks before upstream completion.
- error handling preserves status and early failure semantics.
- **Validation**:
- tests with chunked upstream server
- manual curl check against `/v1/chat/completions` with `stream=true`
### Task 2.3: Preserve Upstream Auth And Headers Correctly
- **Location**: [internal/runtime/executor/gitlab_executor.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/runtime/executor/gitlab_executor.go), [internal/auth/gitlab/gitlab.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/auth/gitlab/gitlab.go)
- **Description**: Use `direct_access` connection details as first-class transport state:
- gateway token
- expiry
- mandatory forwarded headers
- model metadata
- **Dependencies**: Task 2.1
- **Acceptance Criteria**:
- executor stops ignoring gateway headers/token when transport requires them
- refresh logic never over-fetches `direct_access`
- **Validation**:
- tests verifying propagated headers and refresh interval behavior
## Sprint 3: Request/Response Semantics Parity
**Goal**: Make GitLab Duo behave correctly under the same request shapes that current `codex` consumers send.
**Demo/Validation**:
- OpenAI and Claude-compatible clients can do non-streaming and streaming conversations without losing structure.
### Task 3.1: Normalize Multi-Turn Message Mapping
- **Location**: [internal/runtime/executor/gitlab_executor.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/runtime/executor/gitlab_executor.go), [sdk/translator](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/sdk/translator)
- **Description**: Replace the current "flatten prompt into one instruction" behavior with stable multi-turn mapping:
- preserve system context
- preserve user/assistant ordering
- maintain bounded context truncation
- **Dependencies**: Sprint 2
- **Acceptance Criteria**:
- multi-turn requests are not collapsed into a lossy single string unless fallback mode explicitly requires it
- truncation policy is deterministic and tested
- **Validation**:
- golden tests for request mapping
### Task 3.2: Tool Calling Compatibility Layer
- **Location**: [internal/runtime/executor/gitlab_executor.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/runtime/executor/gitlab_executor.go), [sdk/api/handlers/openai/openai_responses_handlers.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/sdk/api/handlers/openai/openai_responses_handlers.go)
- **Description**: Decide and implement one of two paths:
- native pass-through if GitLab gateway supports tool/function structures
- strict downgrade path with explicit unsupported errors instead of silent field loss
- **Dependencies**: Task 3.1
- **Acceptance Criteria**:
- tool-related fields are either preserved correctly or rejected explicitly
- no silent corruption of tool names, tool calls, or tool results
- **Validation**:
- table-driven tests for tool payloads
- one manual client scenario using tools
### Task 3.3: Token Counting And Usage Reporting Fidelity
- **Location**: [internal/runtime/executor/gitlab_executor.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/runtime/executor/gitlab_executor.go), [internal/runtime/executor/usage_helpers.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/runtime/executor/usage_helpers.go)
- **Description**: Improve token/usage reporting so GitLab models behave like first-class providers in logs and scheduling.
- **Dependencies**: Sprint 2
- **Acceptance Criteria**:
- `CountTokens` uses the closest supported estimation path
- usage logging distinguishes prompt vs completion when possible
- **Validation**:
- unit tests for token estimation outputs
## Sprint 4: Responses And Session Parity
**Goal**: Reach codex-level support for OpenAI Responses clients and long-lived sessions where GitLab upstream permits it.
**Demo/Validation**:
- `/v1/responses` works with GitLab Duo in a realistic client flow.
- If websocket parity is not possible, the code explicitly declines it and keeps HTTP paths stable.
### Task 4.1: Make GitLab Compatible With `/v1/responses`
- **Location**: [sdk/api/handlers/openai/openai_responses_handlers.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/sdk/api/handlers/openai/openai_responses_handlers.go), [internal/runtime/executor/gitlab_executor.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/runtime/executor/gitlab_executor.go)
- **Description**: Ensure GitLab transport can safely back the Responses API path, including compact responses if applicable.
- **Dependencies**: Sprint 3
- **Acceptance Criteria**:
- GitLab Duo can be selected behind `/v1/responses`
- response IDs and follow-up semantics are defined
- **Validation**:
- handler tests analogous to codex/openai responses tests
### Task 4.2: Evaluate Downstream Websocket Parity
- **Location**: [sdk/api/handlers/openai/openai_responses_websocket.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/sdk/api/handlers/openai/openai_responses_websocket.go), [internal/runtime/executor/gitlab_executor.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/runtime/executor/gitlab_executor.go)
- **Description**: Decide whether GitLab Duo can support downstream websocket sessions like codex:
- if yes, add session-aware execution path
- if no, mark GitLab auth as websocket-ineligible and keep HTTP routes first-class
- **Dependencies**: Task 4.1
- **Acceptance Criteria**:
- websocket behavior is explicit, not accidental
- no route claims websocket support when the upstream cannot honor it
- **Validation**:
- websocket handler tests or explicit capability tests
### Task 4.3: Add Session Cleanup And Failure Recovery Semantics
- **Location**: [internal/runtime/executor/gitlab_executor.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/runtime/executor/gitlab_executor.go), [sdk/cliproxy/auth/conductor.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/sdk/cliproxy/auth/conductor.go)
- **Description**: Add codex-like session cleanup, retry boundaries, and model suspension/resume behavior for GitLab failures and quota events.
- **Dependencies**: Sprint 2
- **Acceptance Criteria**:
- auth/model cooldown behavior is predictable on GitLab 4xx/5xx/quota responses
- executor cleans up per-session resources if any are introduced
- **Validation**:
- tests for quota and retry behavior
## Sprint 5: Client UX, Model UX, And Manual E2E
**Goal**: Make GitLab Duo feel like a normal built-in provider to operators and downstream clients.
**Demo/Validation**:
- A documented setup exists for "login once, point Claude Code at CLIProxyAPI, use GitLab Duo-backed model".
### Task 5.1: Model Alias And Provider UX Cleanup
- **Location**: [sdk/cliproxy/service.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/sdk/cliproxy/service.go), [README.md](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/README.md)
- **Description**: Normalize what users see:
- stable alias such as `gitlab-duo`
- discovered upstream model names
- optional prefix behavior
- account labels that clearly distinguish OAuth vs PAT
- **Dependencies**: Sprint 3
- **Acceptance Criteria**:
- users can select a stable GitLab alias even when upstream model changes
- dynamic model discovery does not cause confusing model churn
- **Validation**:
- registry tests and manual `/v1/models` inspection
### Task 5.2: Add Real End-To-End Acceptance Tests
- **Location**: [internal/runtime/executor/gitlab_executor_test.go](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/internal/runtime/executor/gitlab_executor_test.go), [sdk/api/handlers/openai](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/sdk/api/handlers/openai)
- **Description**: Add higher-level tests covering the actual proxy surfaces:
- OpenAI `chat/completions`
- OpenAI `responses`
- Claude-compatible request path if GitLab is routed there
- **Dependencies**: Sprint 4
- **Acceptance Criteria**:
- tests fail if streaming regresses into synthetic buffering again
- tests cover at least one tool-related request and one multi-turn request
- **Validation**:
- `go test ./...`
### Task 5.3: Publish Operator Documentation
- **Location**: [README.md](/home/luxvtz/projects/cliproxyapi/CLIProxyAPI/README.md)
- **Description**: Document:
- OAuth setup requirements
- PAT requirements
- current capability matrix
- known limitations if websocket/tool parity is partial
- **Dependencies**: Sprint 5.1
- **Acceptance Criteria**:
- setup instructions are enough for a new user to reproduce the GitLab Duo flow
- limitations are explicit
- **Validation**:
- dry-run docs review from a clean environment
## Testing Strategy
- Keep `go test ./...` green after every committable task.
- Add table-driven tests first for request mapping, refresh behavior, and dynamic model registration.
- Add transport tests with `httptest.Server` for:
- real chunked streaming
- header propagation from `direct_access`
- upstream fallback rules
- Add at least one manual acceptance checklist:
- login via OAuth
- login via PAT
- list models
- run one streaming prompt via OpenAI route
- run one prompt from the target downstream client
## Potential Risks & Gotchas
- GitLab public docs expose `direct_access`, but do not fully document every possible AI gateway path. We should isolate any empirically discovered gateway assumptions behind one transport layer and feature flags.
- `chat/completions` availability differs by GitLab offering and version. The executor must not assume it always exists.
- Code Suggestions is completion-oriented; lossy mapping from rich chat/tool payloads will make GitLab Duo feel worse than codex unless explicitly handled.
- Synthetic streaming is not good enough for codex parity and will cause regressions in interactive clients.
- Dynamic model discovery can create unstable UX if the stable alias and discovered model IDs are not separated cleanly.
- PAT auth may validate successfully while still lacking effective Duo permissions. Error reporting must surface this explicitly.
## Rollback Plan
- Keep the current basic GitLab executor behind a fallback mode until the new transport path is stable.
- If parity work destabilizes existing providers, revert only GitLab-specific executor changes and leave auth support intact.
- Preserve the stable `gitlab-duo` alias so rollback does not break client configuration.

1
go.mod
View File

@@ -83,6 +83,7 @@ require (
github.com/muesli/cancelreader v0.2.2 // indirect
github.com/muesli/termenv v0.16.0 // indirect
github.com/pelletier/go-toml/v2 v2.2.2 // indirect
github.com/pierrec/xxHash v0.1.5
github.com/pjbgf/sha1cd v0.5.0 // indirect
github.com/rivo/uniseg v0.4.7 // indirect
github.com/rs/xid v1.5.0 // indirect

2
go.sum
View File

@@ -154,6 +154,8 @@ github.com/muesli/termenv v0.16.0 h1:S5AlUN9dENB57rsbnkPyfdGuWIlkmzJjbFf0Tf5FWUc
github.com/muesli/termenv v0.16.0/go.mod h1:ZRfOIKPFDYQoDFF4Olj7/QJbW60Ol/kL1pU3VfY/Cnk=
github.com/pelletier/go-toml/v2 v2.2.2 h1:aYUidT7k73Pcl9nb2gScu7NSrKCSHIDE89b3+6Wq+LM=
github.com/pelletier/go-toml/v2 v2.2.2/go.mod h1:1t835xjRzz80PqgE6HHgN2JOsmgYu/h4qDAS4n929Rs=
github.com/pierrec/xxHash v0.1.5 h1:n/jBpwTHiER4xYvK3/CdPVnLDPchj8eTJFFLUb4QHBo=
github.com/pierrec/xxHash v0.1.5/go.mod h1:w2waW5Zoa/Wc4Yqe0wgrIYAGKqRMf7czn2HNKXmuL+I=
github.com/pjbgf/sha1cd v0.5.0 h1:a+UkboSi1znleCDUNT3M5YxjOnN1fz2FhN48FlwCxs0=
github.com/pjbgf/sha1cd v0.5.0/go.mod h1:lhpGlyHLpQZoxMv8HcgXvZEhcGs0PG/vsZnEJ7H0iCM=
github.com/pkg/browser v0.0.0-20240102092130-5ac0b6a4141c h1:+mdjkGKdHQG3305AYmdv1U2eRNDiU2ErMBj1gwrq8eQ=

View File

@@ -13,6 +13,7 @@ import (
"github.com/fxamacker/cbor/v2"
"github.com/gin-gonic/gin"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/runtime/geminicli"
coreauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
"github.com/router-for-me/CLIProxyAPI/v6/sdk/proxyutil"
@@ -700,6 +701,11 @@ func (h *Handler) apiCallTransport(auth *coreauth.Auth) http.RoundTripper {
if proxyStr := strings.TrimSpace(auth.ProxyURL); proxyStr != "" {
proxyCandidates = append(proxyCandidates, proxyStr)
}
if h != nil && h.cfg != nil {
if proxyStr := strings.TrimSpace(proxyURLFromAPIKeyConfig(h.cfg, auth)); proxyStr != "" {
proxyCandidates = append(proxyCandidates, proxyStr)
}
}
}
if h != nil && h.cfg != nil {
if proxyStr := strings.TrimSpace(h.cfg.ProxyURL); proxyStr != "" {
@@ -722,6 +728,123 @@ func (h *Handler) apiCallTransport(auth *coreauth.Auth) http.RoundTripper {
return clone
}
type apiKeyConfigEntry interface {
GetAPIKey() string
GetBaseURL() string
}
func resolveAPIKeyConfig[T apiKeyConfigEntry](entries []T, auth *coreauth.Auth) *T {
if auth == nil || len(entries) == 0 {
return nil
}
attrKey, attrBase := "", ""
if auth.Attributes != nil {
attrKey = strings.TrimSpace(auth.Attributes["api_key"])
attrBase = strings.TrimSpace(auth.Attributes["base_url"])
}
for i := range entries {
entry := &entries[i]
cfgKey := strings.TrimSpace((*entry).GetAPIKey())
cfgBase := strings.TrimSpace((*entry).GetBaseURL())
if attrKey != "" && attrBase != "" {
if strings.EqualFold(cfgKey, attrKey) && strings.EqualFold(cfgBase, attrBase) {
return entry
}
continue
}
if attrKey != "" && strings.EqualFold(cfgKey, attrKey) {
if cfgBase == "" || strings.EqualFold(cfgBase, attrBase) {
return entry
}
}
if attrKey == "" && attrBase != "" && strings.EqualFold(cfgBase, attrBase) {
return entry
}
}
if attrKey != "" {
for i := range entries {
entry := &entries[i]
if strings.EqualFold(strings.TrimSpace((*entry).GetAPIKey()), attrKey) {
return entry
}
}
}
return nil
}
func proxyURLFromAPIKeyConfig(cfg *config.Config, auth *coreauth.Auth) string {
if cfg == nil || auth == nil {
return ""
}
authKind, authAccount := auth.AccountInfo()
if !strings.EqualFold(strings.TrimSpace(authKind), "api_key") {
return ""
}
attrs := auth.Attributes
compatName := ""
providerKey := ""
if len(attrs) > 0 {
compatName = strings.TrimSpace(attrs["compat_name"])
providerKey = strings.TrimSpace(attrs["provider_key"])
}
if compatName != "" || strings.EqualFold(strings.TrimSpace(auth.Provider), "openai-compatibility") {
return resolveOpenAICompatAPIKeyProxyURL(cfg, auth, strings.TrimSpace(authAccount), providerKey, compatName)
}
switch strings.ToLower(strings.TrimSpace(auth.Provider)) {
case "gemini":
if entry := resolveAPIKeyConfig(cfg.GeminiKey, auth); entry != nil {
return strings.TrimSpace(entry.ProxyURL)
}
case "claude":
if entry := resolveAPIKeyConfig(cfg.ClaudeKey, auth); entry != nil {
return strings.TrimSpace(entry.ProxyURL)
}
case "codex":
if entry := resolveAPIKeyConfig(cfg.CodexKey, auth); entry != nil {
return strings.TrimSpace(entry.ProxyURL)
}
}
return ""
}
func resolveOpenAICompatAPIKeyProxyURL(cfg *config.Config, auth *coreauth.Auth, apiKey, providerKey, compatName string) string {
if cfg == nil || auth == nil {
return ""
}
apiKey = strings.TrimSpace(apiKey)
if apiKey == "" {
return ""
}
candidates := make([]string, 0, 3)
if v := strings.TrimSpace(compatName); v != "" {
candidates = append(candidates, v)
}
if v := strings.TrimSpace(providerKey); v != "" {
candidates = append(candidates, v)
}
if v := strings.TrimSpace(auth.Provider); v != "" {
candidates = append(candidates, v)
}
for i := range cfg.OpenAICompatibility {
compat := &cfg.OpenAICompatibility[i]
for _, candidate := range candidates {
if candidate != "" && strings.EqualFold(strings.TrimSpace(candidate), compat.Name) {
for j := range compat.APIKeyEntries {
entry := &compat.APIKeyEntries[j]
if strings.EqualFold(strings.TrimSpace(entry.APIKey), apiKey) {
return strings.TrimSpace(entry.ProxyURL)
}
}
return ""
}
}
}
return ""
}
func buildProxyTransport(proxyStr string) *http.Transport {
transport, _, errBuild := proxyutil.BuildHTTPTransport(proxyStr)
if errBuild != nil {

View File

@@ -58,6 +58,105 @@ func TestAPICallTransportInvalidAuthFallsBackToGlobalProxy(t *testing.T) {
}
}
func TestAPICallTransportAPIKeyAuthFallsBackToConfigProxyURL(t *testing.T) {
t.Parallel()
h := &Handler{
cfg: &config.Config{
SDKConfig: sdkconfig.SDKConfig{ProxyURL: "http://global-proxy.example.com:8080"},
GeminiKey: []config.GeminiKey{{
APIKey: "gemini-key",
ProxyURL: "http://gemini-proxy.example.com:8080",
}},
ClaudeKey: []config.ClaudeKey{{
APIKey: "claude-key",
ProxyURL: "http://claude-proxy.example.com:8080",
}},
CodexKey: []config.CodexKey{{
APIKey: "codex-key",
ProxyURL: "http://codex-proxy.example.com:8080",
}},
OpenAICompatibility: []config.OpenAICompatibility{{
Name: "bohe",
BaseURL: "https://bohe.example.com",
APIKeyEntries: []config.OpenAICompatibilityAPIKey{{
APIKey: "compat-key",
ProxyURL: "http://compat-proxy.example.com:8080",
}},
}},
},
}
cases := []struct {
name string
auth *coreauth.Auth
wantProxy string
}{
{
name: "gemini",
auth: &coreauth.Auth{
Provider: "gemini",
Attributes: map[string]string{"api_key": "gemini-key"},
},
wantProxy: "http://gemini-proxy.example.com:8080",
},
{
name: "claude",
auth: &coreauth.Auth{
Provider: "claude",
Attributes: map[string]string{"api_key": "claude-key"},
},
wantProxy: "http://claude-proxy.example.com:8080",
},
{
name: "codex",
auth: &coreauth.Auth{
Provider: "codex",
Attributes: map[string]string{"api_key": "codex-key"},
},
wantProxy: "http://codex-proxy.example.com:8080",
},
{
name: "openai-compatibility",
auth: &coreauth.Auth{
Provider: "bohe",
Attributes: map[string]string{
"api_key": "compat-key",
"compat_name": "bohe",
"provider_key": "bohe",
},
},
wantProxy: "http://compat-proxy.example.com:8080",
},
}
for _, tc := range cases {
tc := tc
t.Run(tc.name, func(t *testing.T) {
t.Parallel()
transport := h.apiCallTransport(tc.auth)
httpTransport, ok := transport.(*http.Transport)
if !ok {
t.Fatalf("transport type = %T, want *http.Transport", transport)
}
req, errRequest := http.NewRequest(http.MethodGet, "https://example.com", nil)
if errRequest != nil {
t.Fatalf("http.NewRequest returned error: %v", errRequest)
}
proxyURL, errProxy := httpTransport.Proxy(req)
if errProxy != nil {
t.Fatalf("httpTransport.Proxy returned error: %v", errProxy)
}
if proxyURL == nil || proxyURL.String() != tc.wantProxy {
t.Fatalf("proxy URL = %v, want %s", proxyURL, tc.wantProxy)
}
})
}
}
func TestAuthByIndexDistinguishesSharedAPIKeysAcrossProviders(t *testing.T) {
t.Parallel()

View File

@@ -29,13 +29,13 @@ import (
"github.com/router-for-me/CLIProxyAPI/v6/internal/auth/claude"
"github.com/router-for-me/CLIProxyAPI/v6/internal/auth/codex"
"github.com/router-for-me/CLIProxyAPI/v6/internal/auth/copilot"
cursorauth "github.com/router-for-me/CLIProxyAPI/v6/internal/auth/cursor"
geminiAuth "github.com/router-for-me/CLIProxyAPI/v6/internal/auth/gemini"
gitlabauth "github.com/router-for-me/CLIProxyAPI/v6/internal/auth/gitlab"
iflowauth "github.com/router-for-me/CLIProxyAPI/v6/internal/auth/iflow"
"github.com/router-for-me/CLIProxyAPI/v6/internal/auth/kilo"
"github.com/router-for-me/CLIProxyAPI/v6/internal/auth/kimi"
kiroauth "github.com/router-for-me/CLIProxyAPI/v6/internal/auth/kiro"
"github.com/router-for-me/CLIProxyAPI/v6/internal/auth/qwen"
"github.com/router-for-me/CLIProxyAPI/v6/internal/interfaces"
"github.com/router-for-me/CLIProxyAPI/v6/internal/misc"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
@@ -151,7 +151,7 @@ func startCallbackForwarder(port int, provider, targetBase string) (*callbackFor
stopForwarderInstance(port, prev)
}
addr := fmt.Sprintf("127.0.0.1:%d", port)
addr := fmt.Sprintf("0.0.0.0:%d", port)
ln, err := net.Listen("tcp", addr)
if err != nil {
return nil, fmt.Errorf("failed to listen on %s: %w", addr, err)
@@ -1046,6 +1046,7 @@ func (h *Handler) buildAuthFromFileData(path string, data []byte) (*coreauth.Aut
auth.Runtime = existing.Runtime
}
}
coreauth.ApplyCustomHeadersFromMetadata(auth)
return auth, nil
}
@@ -1128,7 +1129,7 @@ func (h *Handler) PatchAuthFileStatus(c *gin.Context) {
c.JSON(http.StatusOK, gin.H{"status": "ok", "disabled": *req.Disabled})
}
// PatchAuthFileFields updates editable fields (prefix, proxy_url, priority, note) of an auth file.
// PatchAuthFileFields updates editable fields (prefix, proxy_url, headers, priority, note) of an auth file.
func (h *Handler) PatchAuthFileFields(c *gin.Context) {
if h.authManager == nil {
c.JSON(http.StatusServiceUnavailable, gin.H{"error": "core auth manager unavailable"})
@@ -1136,11 +1137,12 @@ func (h *Handler) PatchAuthFileFields(c *gin.Context) {
}
var req struct {
Name string `json:"name"`
Prefix *string `json:"prefix"`
ProxyURL *string `json:"proxy_url"`
Priority *int `json:"priority"`
Note *string `json:"note"`
Name string `json:"name"`
Prefix *string `json:"prefix"`
ProxyURL *string `json:"proxy_url"`
Headers map[string]string `json:"headers"`
Priority *int `json:"priority"`
Note *string `json:"note"`
}
if err := c.ShouldBindJSON(&req); err != nil {
c.JSON(http.StatusBadRequest, gin.H{"error": "invalid request body"})
@@ -1176,13 +1178,107 @@ func (h *Handler) PatchAuthFileFields(c *gin.Context) {
changed := false
if req.Prefix != nil {
targetAuth.Prefix = *req.Prefix
prefix := strings.TrimSpace(*req.Prefix)
targetAuth.Prefix = prefix
if targetAuth.Metadata == nil {
targetAuth.Metadata = make(map[string]any)
}
if prefix == "" {
delete(targetAuth.Metadata, "prefix")
} else {
targetAuth.Metadata["prefix"] = prefix
}
changed = true
}
if req.ProxyURL != nil {
targetAuth.ProxyURL = *req.ProxyURL
proxyURL := strings.TrimSpace(*req.ProxyURL)
targetAuth.ProxyURL = proxyURL
if targetAuth.Metadata == nil {
targetAuth.Metadata = make(map[string]any)
}
if proxyURL == "" {
delete(targetAuth.Metadata, "proxy_url")
} else {
targetAuth.Metadata["proxy_url"] = proxyURL
}
changed = true
}
if len(req.Headers) > 0 {
existingHeaders := coreauth.ExtractCustomHeadersFromMetadata(targetAuth.Metadata)
nextHeaders := make(map[string]string, len(existingHeaders))
for k, v := range existingHeaders {
nextHeaders[k] = v
}
headerChanged := false
for key, value := range req.Headers {
name := strings.TrimSpace(key)
if name == "" {
continue
}
val := strings.TrimSpace(value)
attrKey := "header:" + name
if val == "" {
if _, ok := nextHeaders[name]; ok {
delete(nextHeaders, name)
headerChanged = true
}
if targetAuth.Attributes != nil {
if _, ok := targetAuth.Attributes[attrKey]; ok {
headerChanged = true
}
}
continue
}
if prev, ok := nextHeaders[name]; !ok || prev != val {
headerChanged = true
}
nextHeaders[name] = val
if targetAuth.Attributes != nil {
if prev, ok := targetAuth.Attributes[attrKey]; !ok || prev != val {
headerChanged = true
}
} else {
headerChanged = true
}
}
if headerChanged {
if targetAuth.Metadata == nil {
targetAuth.Metadata = make(map[string]any)
}
if targetAuth.Attributes == nil {
targetAuth.Attributes = make(map[string]string)
}
for key, value := range req.Headers {
name := strings.TrimSpace(key)
if name == "" {
continue
}
val := strings.TrimSpace(value)
attrKey := "header:" + name
if val == "" {
delete(nextHeaders, name)
delete(targetAuth.Attributes, attrKey)
continue
}
nextHeaders[name] = val
targetAuth.Attributes[attrKey] = val
}
if len(nextHeaders) == 0 {
delete(targetAuth.Metadata, "headers")
} else {
metaHeaders := make(map[string]any, len(nextHeaders))
for k, v := range nextHeaders {
metaHeaders[k] = v
}
targetAuth.Metadata["headers"] = metaHeaders
}
changed = true
}
}
if req.Priority != nil || req.Note != nil {
if targetAuth.Metadata == nil {
targetAuth.Metadata = make(map[string]any)
@@ -2137,9 +2233,6 @@ func (h *Handler) RequestGitLabToken(c *gin.Context) {
metadata := buildGitLabAuthMetadata(baseURL, gitLabLoginModeOAuth, tokenResp, direct)
metadata["auth_kind"] = "oauth"
metadata["oauth_client_id"] = clientID
if clientSecret != "" {
metadata["oauth_client_secret"] = clientSecret
}
metadata["username"] = strings.TrimSpace(user.Username)
if email := primaryGitLabEmail(user); email != "" {
metadata["email"] = email
@@ -2432,62 +2525,6 @@ func (h *Handler) RequestAntigravityToken(c *gin.Context) {
c.JSON(200, gin.H{"status": "ok", "url": authURL, "state": state})
}
func (h *Handler) RequestQwenToken(c *gin.Context) {
ctx := context.Background()
ctx = PopulateAuthContext(ctx, c)
fmt.Println("Initializing Qwen authentication...")
state := fmt.Sprintf("gem-%d", time.Now().UnixNano())
// Initialize Qwen auth service
qwenAuth := qwen.NewQwenAuth(h.cfg)
// Generate authorization URL
deviceFlow, err := qwenAuth.InitiateDeviceFlow(ctx)
if err != nil {
log.Errorf("Failed to generate authorization URL: %v", err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to generate authorization url"})
return
}
authURL := deviceFlow.VerificationURIComplete
RegisterOAuthSession(state, "qwen")
go func() {
fmt.Println("Waiting for authentication...")
tokenData, errPollForToken := qwenAuth.PollForToken(deviceFlow.DeviceCode, deviceFlow.CodeVerifier)
if errPollForToken != nil {
SetOAuthSessionError(state, "Authentication failed")
fmt.Printf("Authentication failed: %v\n", errPollForToken)
return
}
// Create token storage
tokenStorage := qwenAuth.CreateTokenStorage(tokenData)
tokenStorage.Email = fmt.Sprintf("%d", time.Now().UnixMilli())
record := &coreauth.Auth{
ID: fmt.Sprintf("qwen-%s.json", tokenStorage.Email),
Provider: "qwen",
FileName: fmt.Sprintf("qwen-%s.json", tokenStorage.Email),
Storage: tokenStorage,
Metadata: map[string]any{"email": tokenStorage.Email},
}
savedPath, errSave := h.saveTokenRecord(ctx, record)
if errSave != nil {
log.Errorf("Failed to save authentication tokens: %v", errSave)
SetOAuthSessionError(state, "Failed to save authentication tokens")
return
}
fmt.Printf("Authentication successful! Token saved to %s\n", savedPath)
fmt.Println("You can now use Qwen services through this CLI")
CompleteOAuthSession(state)
}()
c.JSON(200, gin.H{"status": "ok", "url": authURL, "state": state})
}
func (h *Handler) RequestKimiToken(c *gin.Context) {
ctx := context.Background()
ctx = PopulateAuthContext(ctx, c)
@@ -3707,3 +3744,84 @@ func (h *Handler) RequestKiloToken(c *gin.Context) {
"verification_uri": resp.VerificationURL,
})
}
// RequestCursorToken initiates the Cursor PKCE authentication flow.
// Supports multiple accounts via ?label=xxx query parameter.
// The user opens the returned URL in a browser, logs in, and the server polls
// until the authentication completes.
func (h *Handler) RequestCursorToken(c *gin.Context) {
ctx := context.Background()
ctx = PopulateAuthContext(ctx, c)
label := strings.TrimSpace(c.Query("label"))
log.Infof("Initializing Cursor authentication (label=%q)...", label)
authParams, err := cursorauth.GenerateAuthParams()
if err != nil {
log.Errorf("Failed to generate Cursor auth params: %v", err)
c.JSON(http.StatusInternalServerError, gin.H{"error": "failed to generate auth params"})
return
}
state := fmt.Sprintf("cur-%d", time.Now().UnixNano())
RegisterOAuthSession(state, "cursor")
go func() {
log.Info("Waiting for Cursor authentication...")
log.Infof("Open this URL in your browser: %s", authParams.LoginURL)
tokens, errPoll := cursorauth.PollForAuth(ctx, authParams.UUID, authParams.Verifier)
if errPoll != nil {
SetOAuthSessionError(state, "Authentication failed: "+errPoll.Error())
log.Errorf("Cursor authentication failed: %v", errPoll)
return
}
// Build metadata
metadata := map[string]any{
"type": "cursor",
"access_token": tokens.AccessToken,
"refresh_token": tokens.RefreshToken,
"timestamp": time.Now().UnixMilli(),
}
// Extract expiry and account identity from JWT
expiry := cursorauth.GetTokenExpiry(tokens.AccessToken)
if !expiry.IsZero() {
metadata["expires_at"] = expiry.Format(time.RFC3339)
}
// Auto-identify account from JWT sub claim for multi-account support
sub := cursorauth.ParseJWTSub(tokens.AccessToken)
subHash := cursorauth.SubToShortHash(sub)
if sub != "" {
metadata["sub"] = sub
}
fileName := cursorauth.CredentialFileName(label, subHash)
displayLabel := cursorauth.DisplayLabel(label, subHash)
record := &coreauth.Auth{
ID: fileName,
Provider: "cursor",
FileName: fileName,
Label: displayLabel,
Metadata: metadata,
}
savedPath, errSave := h.saveTokenRecord(ctx, record)
if errSave != nil {
log.Errorf("Failed to save Cursor tokens: %v", errSave)
SetOAuthSessionError(state, "Failed to save tokens")
return
}
log.Infof("Cursor authentication successful! Token saved to %s", savedPath)
CompleteOAuthSession(state)
CompleteOAuthSessionsByProvider("cursor")
}()
c.JSON(200, gin.H{
"status": "ok",
"url": authParams.LoginURL,
"state": state,
})
}

View File

@@ -0,0 +1,164 @@
package management
import (
"context"
"encoding/json"
"net/http"
"net/http/httptest"
"strings"
"testing"
"github.com/gin-gonic/gin"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
coreauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
)
func TestPatchAuthFileFields_MergeHeadersAndDeleteEmptyValues(t *testing.T) {
t.Setenv("MANAGEMENT_PASSWORD", "")
gin.SetMode(gin.TestMode)
store := &memoryAuthStore{}
manager := coreauth.NewManager(store, nil, nil)
record := &coreauth.Auth{
ID: "test.json",
FileName: "test.json",
Provider: "claude",
Attributes: map[string]string{
"path": "/tmp/test.json",
"header:X-Old": "old",
"header:X-Remove": "gone",
},
Metadata: map[string]any{
"type": "claude",
"headers": map[string]any{
"X-Old": "old",
"X-Remove": "gone",
},
},
}
if _, errRegister := manager.Register(context.Background(), record); errRegister != nil {
t.Fatalf("failed to register auth record: %v", errRegister)
}
h := NewHandlerWithoutConfigFilePath(&config.Config{AuthDir: t.TempDir()}, manager)
body := `{"name":"test.json","prefix":"p1","proxy_url":"http://proxy.local","headers":{"X-Old":"new","X-New":"v","X-Remove":" ","X-Nope":""}}`
rec := httptest.NewRecorder()
ctx, _ := gin.CreateTestContext(rec)
req := httptest.NewRequest(http.MethodPatch, "/v0/management/auth-files/fields", strings.NewReader(body))
req.Header.Set("Content-Type", "application/json")
ctx.Request = req
h.PatchAuthFileFields(ctx)
if rec.Code != http.StatusOK {
t.Fatalf("expected status %d, got %d with body %s", http.StatusOK, rec.Code, rec.Body.String())
}
updated, ok := manager.GetByID("test.json")
if !ok || updated == nil {
t.Fatalf("expected auth record to exist after patch")
}
if updated.Prefix != "p1" {
t.Fatalf("prefix = %q, want %q", updated.Prefix, "p1")
}
if updated.ProxyURL != "http://proxy.local" {
t.Fatalf("proxy_url = %q, want %q", updated.ProxyURL, "http://proxy.local")
}
if updated.Metadata == nil {
t.Fatalf("expected metadata to be non-nil")
}
if got, _ := updated.Metadata["prefix"].(string); got != "p1" {
t.Fatalf("metadata.prefix = %q, want %q", got, "p1")
}
if got, _ := updated.Metadata["proxy_url"].(string); got != "http://proxy.local" {
t.Fatalf("metadata.proxy_url = %q, want %q", got, "http://proxy.local")
}
headersMeta, ok := updated.Metadata["headers"].(map[string]any)
if !ok {
raw, _ := json.Marshal(updated.Metadata["headers"])
t.Fatalf("metadata.headers = %T (%s), want map[string]any", updated.Metadata["headers"], string(raw))
}
if got := headersMeta["X-Old"]; got != "new" {
t.Fatalf("metadata.headers.X-Old = %#v, want %q", got, "new")
}
if got := headersMeta["X-New"]; got != "v" {
t.Fatalf("metadata.headers.X-New = %#v, want %q", got, "v")
}
if _, ok := headersMeta["X-Remove"]; ok {
t.Fatalf("expected metadata.headers.X-Remove to be deleted")
}
if _, ok := headersMeta["X-Nope"]; ok {
t.Fatalf("expected metadata.headers.X-Nope to be absent")
}
if got := updated.Attributes["header:X-Old"]; got != "new" {
t.Fatalf("attrs header:X-Old = %q, want %q", got, "new")
}
if got := updated.Attributes["header:X-New"]; got != "v" {
t.Fatalf("attrs header:X-New = %q, want %q", got, "v")
}
if _, ok := updated.Attributes["header:X-Remove"]; ok {
t.Fatalf("expected attrs header:X-Remove to be deleted")
}
if _, ok := updated.Attributes["header:X-Nope"]; ok {
t.Fatalf("expected attrs header:X-Nope to be absent")
}
}
func TestPatchAuthFileFields_HeadersEmptyMapIsNoop(t *testing.T) {
t.Setenv("MANAGEMENT_PASSWORD", "")
gin.SetMode(gin.TestMode)
store := &memoryAuthStore{}
manager := coreauth.NewManager(store, nil, nil)
record := &coreauth.Auth{
ID: "noop.json",
FileName: "noop.json",
Provider: "claude",
Attributes: map[string]string{
"path": "/tmp/noop.json",
"header:X-Kee": "1",
},
Metadata: map[string]any{
"type": "claude",
"headers": map[string]any{
"X-Kee": "1",
},
},
}
if _, errRegister := manager.Register(context.Background(), record); errRegister != nil {
t.Fatalf("failed to register auth record: %v", errRegister)
}
h := NewHandlerWithoutConfigFilePath(&config.Config{AuthDir: t.TempDir()}, manager)
body := `{"name":"noop.json","note":"hello","headers":{}}`
rec := httptest.NewRecorder()
ctx, _ := gin.CreateTestContext(rec)
req := httptest.NewRequest(http.MethodPatch, "/v0/management/auth-files/fields", strings.NewReader(body))
req.Header.Set("Content-Type", "application/json")
ctx.Request = req
h.PatchAuthFileFields(ctx)
if rec.Code != http.StatusOK {
t.Fatalf("expected status %d, got %d with body %s", http.StatusOK, rec.Code, rec.Body.String())
}
updated, ok := manager.GetByID("noop.json")
if !ok || updated == nil {
t.Fatalf("expected auth record to exist after patch")
}
if got := updated.Attributes["header:X-Kee"]; got != "1" {
t.Fatalf("attrs header:X-Kee = %q, want %q", got, "1")
}
headersMeta, ok := updated.Metadata["headers"].(map[string]any)
if !ok {
t.Fatalf("expected metadata.headers to remain a map, got %T", updated.Metadata["headers"])
}
if got := headersMeta["X-Kee"]; got != "1" {
t.Fatalf("metadata.headers.X-Kee = %#v, want %q", got, "1")
}
}

View File

@@ -214,19 +214,46 @@ func (h *Handler) PatchGeminiKey(c *gin.Context) {
func (h *Handler) DeleteGeminiKey(c *gin.Context) {
if val := strings.TrimSpace(c.Query("api-key")); val != "" {
out := make([]config.GeminiKey, 0, len(h.cfg.GeminiKey))
for _, v := range h.cfg.GeminiKey {
if v.APIKey != val {
if baseRaw, okBase := c.GetQuery("base-url"); okBase {
base := strings.TrimSpace(baseRaw)
out := make([]config.GeminiKey, 0, len(h.cfg.GeminiKey))
for _, v := range h.cfg.GeminiKey {
if strings.TrimSpace(v.APIKey) == val && strings.TrimSpace(v.BaseURL) == base {
continue
}
out = append(out, v)
}
if len(out) != len(h.cfg.GeminiKey) {
h.cfg.GeminiKey = out
h.cfg.SanitizeGeminiKeys()
h.persist(c)
} else {
c.JSON(404, gin.H{"error": "item not found"})
}
return
}
if len(out) != len(h.cfg.GeminiKey) {
h.cfg.GeminiKey = out
h.cfg.SanitizeGeminiKeys()
h.persist(c)
} else {
matchIndex := -1
matchCount := 0
for i := range h.cfg.GeminiKey {
if strings.TrimSpace(h.cfg.GeminiKey[i].APIKey) == val {
matchCount++
if matchIndex == -1 {
matchIndex = i
}
}
}
if matchCount == 0 {
c.JSON(404, gin.H{"error": "item not found"})
return
}
if matchCount > 1 {
c.JSON(400, gin.H{"error": "multiple items match api-key; base-url is required"})
return
}
h.cfg.GeminiKey = append(h.cfg.GeminiKey[:matchIndex], h.cfg.GeminiKey[matchIndex+1:]...)
h.cfg.SanitizeGeminiKeys()
h.persist(c)
return
}
if idxStr := c.Query("index"); idxStr != "" {
@@ -335,14 +362,39 @@ func (h *Handler) PatchClaudeKey(c *gin.Context) {
}
func (h *Handler) DeleteClaudeKey(c *gin.Context) {
if val := c.Query("api-key"); val != "" {
out := make([]config.ClaudeKey, 0, len(h.cfg.ClaudeKey))
for _, v := range h.cfg.ClaudeKey {
if v.APIKey != val {
if val := strings.TrimSpace(c.Query("api-key")); val != "" {
if baseRaw, okBase := c.GetQuery("base-url"); okBase {
base := strings.TrimSpace(baseRaw)
out := make([]config.ClaudeKey, 0, len(h.cfg.ClaudeKey))
for _, v := range h.cfg.ClaudeKey {
if strings.TrimSpace(v.APIKey) == val && strings.TrimSpace(v.BaseURL) == base {
continue
}
out = append(out, v)
}
h.cfg.ClaudeKey = out
h.cfg.SanitizeClaudeKeys()
h.persist(c)
return
}
matchIndex := -1
matchCount := 0
for i := range h.cfg.ClaudeKey {
if strings.TrimSpace(h.cfg.ClaudeKey[i].APIKey) == val {
matchCount++
if matchIndex == -1 {
matchIndex = i
}
}
}
if matchCount > 1 {
c.JSON(400, gin.H{"error": "multiple items match api-key; base-url is required"})
return
}
if matchIndex != -1 {
h.cfg.ClaudeKey = append(h.cfg.ClaudeKey[:matchIndex], h.cfg.ClaudeKey[matchIndex+1:]...)
}
h.cfg.ClaudeKey = out
h.cfg.SanitizeClaudeKeys()
h.persist(c)
return
@@ -601,13 +653,38 @@ func (h *Handler) PatchVertexCompatKey(c *gin.Context) {
func (h *Handler) DeleteVertexCompatKey(c *gin.Context) {
if val := strings.TrimSpace(c.Query("api-key")); val != "" {
out := make([]config.VertexCompatKey, 0, len(h.cfg.VertexCompatAPIKey))
for _, v := range h.cfg.VertexCompatAPIKey {
if v.APIKey != val {
if baseRaw, okBase := c.GetQuery("base-url"); okBase {
base := strings.TrimSpace(baseRaw)
out := make([]config.VertexCompatKey, 0, len(h.cfg.VertexCompatAPIKey))
for _, v := range h.cfg.VertexCompatAPIKey {
if strings.TrimSpace(v.APIKey) == val && strings.TrimSpace(v.BaseURL) == base {
continue
}
out = append(out, v)
}
h.cfg.VertexCompatAPIKey = out
h.cfg.SanitizeVertexCompatKeys()
h.persist(c)
return
}
matchIndex := -1
matchCount := 0
for i := range h.cfg.VertexCompatAPIKey {
if strings.TrimSpace(h.cfg.VertexCompatAPIKey[i].APIKey) == val {
matchCount++
if matchIndex == -1 {
matchIndex = i
}
}
}
if matchCount > 1 {
c.JSON(400, gin.H{"error": "multiple items match api-key; base-url is required"})
return
}
if matchIndex != -1 {
h.cfg.VertexCompatAPIKey = append(h.cfg.VertexCompatAPIKey[:matchIndex], h.cfg.VertexCompatAPIKey[matchIndex+1:]...)
}
h.cfg.VertexCompatAPIKey = out
h.cfg.SanitizeVertexCompatKeys()
h.persist(c)
return
@@ -919,14 +996,39 @@ func (h *Handler) PatchCodexKey(c *gin.Context) {
}
func (h *Handler) DeleteCodexKey(c *gin.Context) {
if val := c.Query("api-key"); val != "" {
out := make([]config.CodexKey, 0, len(h.cfg.CodexKey))
for _, v := range h.cfg.CodexKey {
if v.APIKey != val {
if val := strings.TrimSpace(c.Query("api-key")); val != "" {
if baseRaw, okBase := c.GetQuery("base-url"); okBase {
base := strings.TrimSpace(baseRaw)
out := make([]config.CodexKey, 0, len(h.cfg.CodexKey))
for _, v := range h.cfg.CodexKey {
if strings.TrimSpace(v.APIKey) == val && strings.TrimSpace(v.BaseURL) == base {
continue
}
out = append(out, v)
}
h.cfg.CodexKey = out
h.cfg.SanitizeCodexKeys()
h.persist(c)
return
}
matchIndex := -1
matchCount := 0
for i := range h.cfg.CodexKey {
if strings.TrimSpace(h.cfg.CodexKey[i].APIKey) == val {
matchCount++
if matchIndex == -1 {
matchIndex = i
}
}
}
if matchCount > 1 {
c.JSON(400, gin.H{"error": "multiple items match api-key; base-url is required"})
return
}
if matchIndex != -1 {
h.cfg.CodexKey = append(h.cfg.CodexKey[:matchIndex], h.cfg.CodexKey[matchIndex+1:]...)
}
h.cfg.CodexKey = out
h.cfg.SanitizeCodexKeys()
h.persist(c)
return

View File

@@ -0,0 +1,172 @@
package management
import (
"net/http"
"net/http/httptest"
"os"
"path/filepath"
"testing"
"github.com/gin-gonic/gin"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
)
func writeTestConfigFile(t *testing.T) string {
t.Helper()
dir := t.TempDir()
path := filepath.Join(dir, "config.yaml")
if errWrite := os.WriteFile(path, []byte("{}\n"), 0o600); errWrite != nil {
t.Fatalf("failed to write test config: %v", errWrite)
}
return path
}
func TestDeleteGeminiKey_RequiresBaseURLWhenAPIKeyDuplicated(t *testing.T) {
t.Parallel()
gin.SetMode(gin.TestMode)
h := &Handler{
cfg: &config.Config{
GeminiKey: []config.GeminiKey{
{APIKey: "shared-key", BaseURL: "https://a.example.com"},
{APIKey: "shared-key", BaseURL: "https://b.example.com"},
},
},
configFilePath: writeTestConfigFile(t),
}
rec := httptest.NewRecorder()
c, _ := gin.CreateTestContext(rec)
c.Request = httptest.NewRequest(http.MethodDelete, "/v0/management/gemini-api-key?api-key=shared-key", nil)
h.DeleteGeminiKey(c)
if rec.Code != http.StatusBadRequest {
t.Fatalf("status = %d, want %d; body=%s", rec.Code, http.StatusBadRequest, rec.Body.String())
}
if got := len(h.cfg.GeminiKey); got != 2 {
t.Fatalf("gemini keys len = %d, want 2", got)
}
}
func TestDeleteGeminiKey_DeletesOnlyMatchingBaseURL(t *testing.T) {
t.Parallel()
gin.SetMode(gin.TestMode)
h := &Handler{
cfg: &config.Config{
GeminiKey: []config.GeminiKey{
{APIKey: "shared-key", BaseURL: "https://a.example.com"},
{APIKey: "shared-key", BaseURL: "https://b.example.com"},
},
},
configFilePath: writeTestConfigFile(t),
}
rec := httptest.NewRecorder()
c, _ := gin.CreateTestContext(rec)
c.Request = httptest.NewRequest(http.MethodDelete, "/v0/management/gemini-api-key?api-key=shared-key&base-url=https://a.example.com", nil)
h.DeleteGeminiKey(c)
if rec.Code != http.StatusOK {
t.Fatalf("status = %d, want %d; body=%s", rec.Code, http.StatusOK, rec.Body.String())
}
if got := len(h.cfg.GeminiKey); got != 1 {
t.Fatalf("gemini keys len = %d, want 1", got)
}
if got := h.cfg.GeminiKey[0].BaseURL; got != "https://b.example.com" {
t.Fatalf("remaining base-url = %q, want %q", got, "https://b.example.com")
}
}
func TestDeleteClaudeKey_DeletesEmptyBaseURLWhenExplicitlyProvided(t *testing.T) {
t.Parallel()
gin.SetMode(gin.TestMode)
h := &Handler{
cfg: &config.Config{
ClaudeKey: []config.ClaudeKey{
{APIKey: "shared-key", BaseURL: ""},
{APIKey: "shared-key", BaseURL: "https://claude.example.com"},
},
},
configFilePath: writeTestConfigFile(t),
}
rec := httptest.NewRecorder()
c, _ := gin.CreateTestContext(rec)
c.Request = httptest.NewRequest(http.MethodDelete, "/v0/management/claude-api-key?api-key=shared-key&base-url=", nil)
h.DeleteClaudeKey(c)
if rec.Code != http.StatusOK {
t.Fatalf("status = %d, want %d; body=%s", rec.Code, http.StatusOK, rec.Body.String())
}
if got := len(h.cfg.ClaudeKey); got != 1 {
t.Fatalf("claude keys len = %d, want 1", got)
}
if got := h.cfg.ClaudeKey[0].BaseURL; got != "https://claude.example.com" {
t.Fatalf("remaining base-url = %q, want %q", got, "https://claude.example.com")
}
}
func TestDeleteVertexCompatKey_DeletesOnlyMatchingBaseURL(t *testing.T) {
t.Parallel()
gin.SetMode(gin.TestMode)
h := &Handler{
cfg: &config.Config{
VertexCompatAPIKey: []config.VertexCompatKey{
{APIKey: "shared-key", BaseURL: "https://a.example.com"},
{APIKey: "shared-key", BaseURL: "https://b.example.com"},
},
},
configFilePath: writeTestConfigFile(t),
}
rec := httptest.NewRecorder()
c, _ := gin.CreateTestContext(rec)
c.Request = httptest.NewRequest(http.MethodDelete, "/v0/management/vertex-api-key?api-key=shared-key&base-url=https://b.example.com", nil)
h.DeleteVertexCompatKey(c)
if rec.Code != http.StatusOK {
t.Fatalf("status = %d, want %d; body=%s", rec.Code, http.StatusOK, rec.Body.String())
}
if got := len(h.cfg.VertexCompatAPIKey); got != 1 {
t.Fatalf("vertex keys len = %d, want 1", got)
}
if got := h.cfg.VertexCompatAPIKey[0].BaseURL; got != "https://a.example.com" {
t.Fatalf("remaining base-url = %q, want %q", got, "https://a.example.com")
}
}
func TestDeleteCodexKey_RequiresBaseURLWhenAPIKeyDuplicated(t *testing.T) {
t.Parallel()
gin.SetMode(gin.TestMode)
h := &Handler{
cfg: &config.Config{
CodexKey: []config.CodexKey{
{APIKey: "shared-key", BaseURL: "https://a.example.com"},
{APIKey: "shared-key", BaseURL: "https://b.example.com"},
},
},
configFilePath: writeTestConfigFile(t),
}
rec := httptest.NewRecorder()
c, _ := gin.CreateTestContext(rec)
c.Request = httptest.NewRequest(http.MethodDelete, "/v0/management/codex-api-key?api-key=shared-key", nil)
h.DeleteCodexKey(c)
if rec.Code != http.StatusBadRequest {
t.Fatalf("status = %d, want %d; body=%s", rec.Code, http.StatusBadRequest, rec.Body.String())
}
if got := len(h.cfg.CodexKey); got != 2 {
t.Fatalf("codex keys len = %d, want 2", got)
}
}

View File

@@ -232,12 +232,8 @@ func NormalizeOAuthProvider(provider string) (string, error) {
return "gitlab", nil
case "gemini", "google":
return "gemini", nil
case "iflow", "i-flow":
return "iflow", nil
case "antigravity", "anti-gravity":
return "antigravity", nil
case "qwen":
return "qwen", nil
case "kiro":
return "kiro", nil
case "github":

View File

@@ -15,6 +15,8 @@ import (
)
const requestBodyOverrideContextKey = "REQUEST_BODY_OVERRIDE"
const responseBodyOverrideContextKey = "RESPONSE_BODY_OVERRIDE"
const websocketTimelineOverrideContextKey = "WEBSOCKET_TIMELINE_OVERRIDE"
// RequestInfo holds essential details of an incoming HTTP request for logging purposes.
type RequestInfo struct {
@@ -304,6 +306,10 @@ func (w *ResponseWriterWrapper) Finalize(c *gin.Context) error {
if len(apiResponse) > 0 {
_ = w.streamWriter.WriteAPIResponse(apiResponse)
}
apiWebsocketTimeline := w.extractAPIWebsocketTimeline(c)
if len(apiWebsocketTimeline) > 0 {
_ = w.streamWriter.WriteAPIWebsocketTimeline(apiWebsocketTimeline)
}
if err := w.streamWriter.Close(); err != nil {
w.streamWriter = nil
return err
@@ -312,7 +318,7 @@ func (w *ResponseWriterWrapper) Finalize(c *gin.Context) error {
return nil
}
return w.logRequest(w.extractRequestBody(c), finalStatusCode, w.cloneHeaders(), w.body.Bytes(), w.extractAPIRequest(c), w.extractAPIResponse(c), w.extractAPIResponseTimestamp(c), slicesAPIResponseError, forceLog)
return w.logRequest(w.extractRequestBody(c), finalStatusCode, w.cloneHeaders(), w.extractResponseBody(c), w.extractWebsocketTimeline(c), w.extractAPIRequest(c), w.extractAPIResponse(c), w.extractAPIWebsocketTimeline(c), w.extractAPIResponseTimestamp(c), slicesAPIResponseError, forceLog)
}
func (w *ResponseWriterWrapper) cloneHeaders() map[string][]string {
@@ -352,6 +358,18 @@ func (w *ResponseWriterWrapper) extractAPIResponse(c *gin.Context) []byte {
return data
}
func (w *ResponseWriterWrapper) extractAPIWebsocketTimeline(c *gin.Context) []byte {
apiTimeline, isExist := c.Get("API_WEBSOCKET_TIMELINE")
if !isExist {
return nil
}
data, ok := apiTimeline.([]byte)
if !ok || len(data) == 0 {
return nil
}
return bytes.Clone(data)
}
func (w *ResponseWriterWrapper) extractAPIResponseTimestamp(c *gin.Context) time.Time {
ts, isExist := c.Get("API_RESPONSE_TIMESTAMP")
if !isExist {
@@ -364,19 +382,8 @@ func (w *ResponseWriterWrapper) extractAPIResponseTimestamp(c *gin.Context) time
}
func (w *ResponseWriterWrapper) extractRequestBody(c *gin.Context) []byte {
if c != nil {
if bodyOverride, isExist := c.Get(requestBodyOverrideContextKey); isExist {
switch value := bodyOverride.(type) {
case []byte:
if len(value) > 0 {
return bytes.Clone(value)
}
case string:
if strings.TrimSpace(value) != "" {
return []byte(value)
}
}
}
if body := extractBodyOverride(c, requestBodyOverrideContextKey); len(body) > 0 {
return body
}
if w.requestInfo != nil && len(w.requestInfo.Body) > 0 {
return w.requestInfo.Body
@@ -384,13 +391,48 @@ func (w *ResponseWriterWrapper) extractRequestBody(c *gin.Context) []byte {
return nil
}
func (w *ResponseWriterWrapper) logRequest(requestBody []byte, statusCode int, headers map[string][]string, body []byte, apiRequestBody, apiResponseBody []byte, apiResponseTimestamp time.Time, apiResponseErrors []*interfaces.ErrorMessage, forceLog bool) error {
func (w *ResponseWriterWrapper) extractResponseBody(c *gin.Context) []byte {
if body := extractBodyOverride(c, responseBodyOverrideContextKey); len(body) > 0 {
return body
}
if w.body == nil || w.body.Len() == 0 {
return nil
}
return bytes.Clone(w.body.Bytes())
}
func (w *ResponseWriterWrapper) extractWebsocketTimeline(c *gin.Context) []byte {
return extractBodyOverride(c, websocketTimelineOverrideContextKey)
}
func extractBodyOverride(c *gin.Context, key string) []byte {
if c == nil {
return nil
}
bodyOverride, isExist := c.Get(key)
if !isExist {
return nil
}
switch value := bodyOverride.(type) {
case []byte:
if len(value) > 0 {
return bytes.Clone(value)
}
case string:
if strings.TrimSpace(value) != "" {
return []byte(value)
}
}
return nil
}
func (w *ResponseWriterWrapper) logRequest(requestBody []byte, statusCode int, headers map[string][]string, body, websocketTimeline, apiRequestBody, apiResponseBody, apiWebsocketTimeline []byte, apiResponseTimestamp time.Time, apiResponseErrors []*interfaces.ErrorMessage, forceLog bool) error {
if w.requestInfo == nil {
return nil
}
if loggerWithOptions, ok := w.logger.(interface {
LogRequestWithOptions(string, string, map[string][]string, []byte, int, map[string][]string, []byte, []byte, []byte, []*interfaces.ErrorMessage, bool, string, time.Time, time.Time) error
LogRequestWithOptions(string, string, map[string][]string, []byte, int, map[string][]string, []byte, []byte, []byte, []byte, []byte, []*interfaces.ErrorMessage, bool, string, time.Time, time.Time) error
}); ok {
return loggerWithOptions.LogRequestWithOptions(
w.requestInfo.URL,
@@ -400,8 +442,10 @@ func (w *ResponseWriterWrapper) logRequest(requestBody []byte, statusCode int, h
statusCode,
headers,
body,
websocketTimeline,
apiRequestBody,
apiResponseBody,
apiWebsocketTimeline,
apiResponseErrors,
forceLog,
w.requestInfo.RequestID,
@@ -418,8 +462,10 @@ func (w *ResponseWriterWrapper) logRequest(requestBody []byte, statusCode int, h
statusCode,
headers,
body,
websocketTimeline,
apiRequestBody,
apiResponseBody,
apiWebsocketTimeline,
apiResponseErrors,
w.requestInfo.RequestID,
w.requestInfo.Timestamp,

View File

@@ -1,10 +1,14 @@
package middleware
import (
"bytes"
"net/http/httptest"
"testing"
"time"
"github.com/gin-gonic/gin"
"github.com/router-for-me/CLIProxyAPI/v6/internal/interfaces"
"github.com/router-for-me/CLIProxyAPI/v6/internal/logging"
)
func TestExtractRequestBodyPrefersOverride(t *testing.T) {
@@ -33,7 +37,7 @@ func TestExtractRequestBodySupportsStringOverride(t *testing.T) {
recorder := httptest.NewRecorder()
c, _ := gin.CreateTestContext(recorder)
wrapper := &ResponseWriterWrapper{}
wrapper := &ResponseWriterWrapper{body: &bytes.Buffer{}}
c.Set(requestBodyOverrideContextKey, "override-as-string")
body := wrapper.extractRequestBody(c)
@@ -41,3 +45,158 @@ func TestExtractRequestBodySupportsStringOverride(t *testing.T) {
t.Fatalf("request body = %q, want %q", string(body), "override-as-string")
}
}
func TestExtractResponseBodyPrefersOverride(t *testing.T) {
gin.SetMode(gin.TestMode)
recorder := httptest.NewRecorder()
c, _ := gin.CreateTestContext(recorder)
wrapper := &ResponseWriterWrapper{body: &bytes.Buffer{}}
wrapper.body.WriteString("original-response")
body := wrapper.extractResponseBody(c)
if string(body) != "original-response" {
t.Fatalf("response body = %q, want %q", string(body), "original-response")
}
c.Set(responseBodyOverrideContextKey, []byte("override-response"))
body = wrapper.extractResponseBody(c)
if string(body) != "override-response" {
t.Fatalf("response body = %q, want %q", string(body), "override-response")
}
body[0] = 'X'
if got := wrapper.extractResponseBody(c); string(got) != "override-response" {
t.Fatalf("response override should be cloned, got %q", string(got))
}
}
func TestExtractResponseBodySupportsStringOverride(t *testing.T) {
gin.SetMode(gin.TestMode)
recorder := httptest.NewRecorder()
c, _ := gin.CreateTestContext(recorder)
wrapper := &ResponseWriterWrapper{}
c.Set(responseBodyOverrideContextKey, "override-response-as-string")
body := wrapper.extractResponseBody(c)
if string(body) != "override-response-as-string" {
t.Fatalf("response body = %q, want %q", string(body), "override-response-as-string")
}
}
func TestExtractBodyOverrideClonesBytes(t *testing.T) {
gin.SetMode(gin.TestMode)
recorder := httptest.NewRecorder()
c, _ := gin.CreateTestContext(recorder)
override := []byte("body-override")
c.Set(requestBodyOverrideContextKey, override)
body := extractBodyOverride(c, requestBodyOverrideContextKey)
if !bytes.Equal(body, override) {
t.Fatalf("body override = %q, want %q", string(body), string(override))
}
body[0] = 'X'
if !bytes.Equal(override, []byte("body-override")) {
t.Fatalf("override mutated: %q", string(override))
}
}
func TestExtractWebsocketTimelineUsesOverride(t *testing.T) {
gin.SetMode(gin.TestMode)
recorder := httptest.NewRecorder()
c, _ := gin.CreateTestContext(recorder)
wrapper := &ResponseWriterWrapper{}
if got := wrapper.extractWebsocketTimeline(c); got != nil {
t.Fatalf("expected nil websocket timeline, got %q", string(got))
}
c.Set(websocketTimelineOverrideContextKey, []byte("timeline"))
body := wrapper.extractWebsocketTimeline(c)
if string(body) != "timeline" {
t.Fatalf("websocket timeline = %q, want %q", string(body), "timeline")
}
}
func TestFinalizeStreamingWritesAPIWebsocketTimeline(t *testing.T) {
gin.SetMode(gin.TestMode)
recorder := httptest.NewRecorder()
c, _ := gin.CreateTestContext(recorder)
streamWriter := &testStreamingLogWriter{}
wrapper := &ResponseWriterWrapper{
ResponseWriter: c.Writer,
logger: &testRequestLogger{enabled: true},
requestInfo: &RequestInfo{
URL: "/v1/responses",
Method: "POST",
Headers: map[string][]string{"Content-Type": {"application/json"}},
RequestID: "req-1",
Timestamp: time.Date(2026, time.April, 1, 12, 0, 0, 0, time.UTC),
},
isStreaming: true,
streamWriter: streamWriter,
}
c.Set("API_WEBSOCKET_TIMELINE", []byte("Timestamp: 2026-04-01T12:00:00Z\nEvent: api.websocket.request\n{}"))
if err := wrapper.Finalize(c); err != nil {
t.Fatalf("Finalize error: %v", err)
}
if string(streamWriter.apiWebsocketTimeline) != "Timestamp: 2026-04-01T12:00:00Z\nEvent: api.websocket.request\n{}" {
t.Fatalf("stream writer websocket timeline = %q", string(streamWriter.apiWebsocketTimeline))
}
if !streamWriter.closed {
t.Fatal("expected stream writer to be closed")
}
}
type testRequestLogger struct {
enabled bool
}
func (l *testRequestLogger) LogRequest(string, string, map[string][]string, []byte, int, map[string][]string, []byte, []byte, []byte, []byte, []byte, []*interfaces.ErrorMessage, string, time.Time, time.Time) error {
return nil
}
func (l *testRequestLogger) LogStreamingRequest(string, string, map[string][]string, []byte, string) (logging.StreamingLogWriter, error) {
return &testStreamingLogWriter{}, nil
}
func (l *testRequestLogger) IsEnabled() bool {
return l.enabled
}
type testStreamingLogWriter struct {
apiWebsocketTimeline []byte
closed bool
}
func (w *testStreamingLogWriter) WriteChunkAsync([]byte) {}
func (w *testStreamingLogWriter) WriteStatus(int, map[string][]string) error {
return nil
}
func (w *testStreamingLogWriter) WriteAPIRequest([]byte) error {
return nil
}
func (w *testStreamingLogWriter) WriteAPIResponse([]byte) error {
return nil
}
func (w *testStreamingLogWriter) WriteAPIWebsocketTimeline(apiWebsocketTimeline []byte) error {
w.apiWebsocketTimeline = bytes.Clone(apiWebsocketTimeline)
return nil
}
func (w *testStreamingLogWriter) SetFirstChunkTimestamp(time.Time) {}
func (w *testStreamingLogWriter) Close() error {
w.closed = true
return nil
}

View File

@@ -123,6 +123,10 @@ func (fh *FallbackHandler) WrapHandler(handler gin.HandlerFunc) gin.HandlerFunc
return
}
// Sanitize request body: remove thinking blocks with invalid signatures
// to prevent upstream API 400 errors
bodyBytes = SanitizeAmpRequestBody(bodyBytes)
// Restore the body for the handler to read
c.Request.Body = io.NopCloser(bytes.NewReader(bodyBytes))
@@ -249,6 +253,7 @@ func (fh *FallbackHandler) WrapHandler(handler gin.HandlerFunc) gin.HandlerFunc
log.Debugf("amp model mapping: request %s -> %s", normalizedModel, resolvedModel)
logAmpRouting(RouteTypeModelMapping, modelName, resolvedModel, providerName, requestPath)
rewriter := NewResponseRewriter(c.Writer, modelName)
rewriter.suppressThinking = true
c.Writer = rewriter
// Filter Anthropic-Beta header only for local handling paths
filterAntropicBetaHeader(c)
@@ -259,10 +264,17 @@ func (fh *FallbackHandler) WrapHandler(handler gin.HandlerFunc) gin.HandlerFunc
} else if len(providers) > 0 {
// Log: Using local provider (free)
logAmpRouting(RouteTypeLocalProvider, modelName, resolvedModel, providerName, requestPath)
// Wrap with ResponseRewriter for local providers too, because upstream
// proxies (e.g. NewAPI) may return a different model name and lack
// Amp-required fields like thinking.signature.
rewriter := NewResponseRewriter(c.Writer, modelName)
rewriter.suppressThinking = providerName != "claude"
c.Writer = rewriter
// Filter Anthropic-Beta header only for local handling paths
filterAntropicBetaHeader(c)
c.Request.Body = io.NopCloser(bytes.NewReader(bodyBytes))
handler(c)
rewriter.Flush()
} else {
// No provider, no mapping, no proxy: fall back to the wrapped handler so it can return an error response
c.Request.Body = io.NopCloser(bytes.NewReader(bodyBytes))

View File

@@ -129,11 +129,11 @@ func TestModifyResponse_GzipScenarios(t *testing.T) {
wantCE: "",
},
{
name: "skips_non_2xx_status",
name: "decompresses_non_2xx_status_when_gzip_detected",
header: http.Header{},
body: good,
status: 404,
wantBody: good,
wantBody: goodJSON,
wantCE: "",
},
}

View File

@@ -2,6 +2,8 @@ package amp
import (
"bytes"
"encoding/json"
"fmt"
"net/http"
"strings"
@@ -12,15 +14,17 @@ import (
)
// ResponseRewriter wraps a gin.ResponseWriter to intercept and modify the response body
// It's used to rewrite model names in responses when model mapping is used
// It is used to rewrite model names in responses when model mapping is used
// and to keep Amp-compatible response shapes.
type ResponseRewriter struct {
gin.ResponseWriter
body *bytes.Buffer
originalModel string
isStreaming bool
body *bytes.Buffer
originalModel string
isStreaming bool
suppressThinking bool
}
// NewResponseRewriter creates a new response rewriter for model name substitution
// NewResponseRewriter creates a new response rewriter for model name substitution.
func NewResponseRewriter(w gin.ResponseWriter, originalModel string) *ResponseRewriter {
return &ResponseRewriter{
ResponseWriter: w,
@@ -33,15 +37,15 @@ const maxBufferedResponseBytes = 2 * 1024 * 1024 // 2MB safety cap
func looksLikeSSEChunk(data []byte) bool {
// Fallback detection: some upstreams may omit/lie about Content-Type, causing SSE to be buffered.
// Heuristics are intentionally simple and cheap.
return bytes.Contains(data, []byte("data:")) ||
bytes.Contains(data, []byte("event:")) ||
bytes.Contains(data, []byte("message_start")) ||
bytes.Contains(data, []byte("message_delta")) ||
bytes.Contains(data, []byte("content_block_start")) ||
bytes.Contains(data, []byte("content_block_delta")) ||
bytes.Contains(data, []byte("content_block_stop")) ||
bytes.Contains(data, []byte("\n\n"))
// We conservatively detect SSE by checking for "data:" / "event:" at the start of any line.
for _, line := range bytes.Split(data, []byte("\n")) {
trimmed := bytes.TrimSpace(line)
if bytes.HasPrefix(trimmed, []byte("data:")) ||
bytes.HasPrefix(trimmed, []byte("event:")) {
return true
}
}
return false
}
func (rw *ResponseRewriter) enableStreaming(reason string) error {
@@ -95,7 +99,8 @@ func (rw *ResponseRewriter) Write(data []byte) (int, error) {
}
if rw.isStreaming {
n, err := rw.ResponseWriter.Write(rw.rewriteStreamChunk(data))
rewritten := rw.rewriteStreamChunk(data)
n, err := rw.ResponseWriter.Write(rewritten)
if err == nil {
if flusher, ok := rw.ResponseWriter.(http.Flusher); ok {
flusher.Flush()
@@ -106,7 +111,6 @@ func (rw *ResponseRewriter) Write(data []byte) (int, error) {
return rw.body.Write(data)
}
// Flush writes the buffered response with model names rewritten
func (rw *ResponseRewriter) Flush() {
if rw.isStreaming {
if flusher, ok := rw.ResponseWriter.(http.Flusher); ok {
@@ -115,40 +119,79 @@ func (rw *ResponseRewriter) Flush() {
return
}
if rw.body.Len() > 0 {
if _, err := rw.ResponseWriter.Write(rw.rewriteModelInResponse(rw.body.Bytes())); err != nil {
rewritten := rw.rewriteModelInResponse(rw.body.Bytes())
// Update Content-Length to match the rewritten body size, since
// signature injection and model name changes alter the payload length.
rw.ResponseWriter.Header().Set("Content-Length", fmt.Sprintf("%d", len(rewritten)))
if _, err := rw.ResponseWriter.Write(rewritten); err != nil {
log.Warnf("amp response rewriter: failed to write rewritten response: %v", err)
}
}
}
// modelFieldPaths lists all JSON paths where model name may appear
var modelFieldPaths = []string{"message.model", "model", "modelVersion", "response.model", "response.modelVersion"}
// rewriteModelInResponse replaces all occurrences of the mapped model with the original model in JSON
// It also suppresses "thinking" blocks if "tool_use" is present to ensure Amp client compatibility
func (rw *ResponseRewriter) rewriteModelInResponse(data []byte) []byte {
// 1. Amp Compatibility: Suppress thinking blocks if tool use is detected
// The Amp client struggles when both thinking and tool_use blocks are present
// ensureAmpSignature injects empty signature fields into tool_use/thinking blocks
// in API responses so that the Amp TUI does not crash on P.signature.length.
func ensureAmpSignature(data []byte) []byte {
for index, block := range gjson.GetBytes(data, "content").Array() {
blockType := block.Get("type").String()
if blockType != "tool_use" && blockType != "thinking" {
continue
}
signaturePath := fmt.Sprintf("content.%d.signature", index)
if gjson.GetBytes(data, signaturePath).Exists() {
continue
}
var err error
data, err = sjson.SetBytes(data, signaturePath, "")
if err != nil {
log.Warnf("Amp ResponseRewriter: failed to add empty signature to %s block: %v", blockType, err)
break
}
}
contentBlockType := gjson.GetBytes(data, "content_block.type").String()
if (contentBlockType == "tool_use" || contentBlockType == "thinking") && !gjson.GetBytes(data, "content_block.signature").Exists() {
var err error
data, err = sjson.SetBytes(data, "content_block.signature", "")
if err != nil {
log.Warnf("Amp ResponseRewriter: failed to add empty signature to streaming %s block: %v", contentBlockType, err)
}
}
return data
}
func (rw *ResponseRewriter) suppressAmpThinking(data []byte) []byte {
if !rw.suppressThinking {
return data
}
if gjson.GetBytes(data, `content.#(type=="tool_use")`).Exists() {
filtered := gjson.GetBytes(data, `content.#(type!="thinking")#`)
if filtered.Exists() {
originalCount := gjson.GetBytes(data, "content.#").Int()
filteredCount := filtered.Get("#").Int()
if originalCount > filteredCount {
var err error
data, err = sjson.SetBytes(data, "content", filtered.Value())
if err != nil {
log.Warnf("Amp ResponseRewriter: failed to suppress thinking blocks: %v", err)
} else {
log.Debugf("Amp ResponseRewriter: Suppressed %d thinking blocks due to tool usage", originalCount-filteredCount)
// Log the result for verification
log.Debugf("Amp ResponseRewriter: Resulting content: %s", gjson.GetBytes(data, "content").String())
}
}
}
}
return data
}
func (rw *ResponseRewriter) rewriteModelInResponse(data []byte) []byte {
data = ensureAmpSignature(data)
data = rw.suppressAmpThinking(data)
if len(data) == 0 {
return data
}
if rw.originalModel == "" {
return data
}
@@ -160,24 +203,164 @@ func (rw *ResponseRewriter) rewriteModelInResponse(data []byte) []byte {
return data
}
// rewriteStreamChunk rewrites model names in SSE stream chunks
func (rw *ResponseRewriter) rewriteStreamChunk(chunk []byte) []byte {
if rw.originalModel == "" {
return chunk
lines := bytes.Split(chunk, []byte("\n"))
var out [][]byte
i := 0
for i < len(lines) {
line := lines[i]
trimmed := bytes.TrimSpace(line)
// Case 1: "event:" line - look ahead for its "data:" line
if bytes.HasPrefix(trimmed, []byte("event: ")) {
// Scan forward past blank lines to find the data: line
dataIdx := -1
for j := i + 1; j < len(lines); j++ {
t := bytes.TrimSpace(lines[j])
if len(t) == 0 {
continue
}
if bytes.HasPrefix(t, []byte("data: ")) {
dataIdx = j
}
break
}
if dataIdx >= 0 {
// Found event+data pair - process through rewriter
jsonData := bytes.TrimPrefix(bytes.TrimSpace(lines[dataIdx]), []byte("data: "))
if len(jsonData) > 0 && jsonData[0] == '{' {
rewritten := rw.rewriteStreamEvent(jsonData)
if rewritten == nil {
i = dataIdx + 1
continue
}
// Emit event line
out = append(out, line)
// Emit blank lines between event and data
for k := i + 1; k < dataIdx; k++ {
out = append(out, lines[k])
}
// Emit rewritten data
out = append(out, append([]byte("data: "), rewritten...))
i = dataIdx + 1
continue
}
}
// No data line found (orphan event from cross-chunk split)
// Pass it through as-is - the data will arrive in the next chunk
out = append(out, line)
i++
continue
}
// Case 2: standalone "data:" line (no preceding event: in this chunk)
if bytes.HasPrefix(trimmed, []byte("data: ")) {
jsonData := bytes.TrimPrefix(trimmed, []byte("data: "))
if len(jsonData) > 0 && jsonData[0] == '{' {
rewritten := rw.rewriteStreamEvent(jsonData)
if rewritten != nil {
out = append(out, append([]byte("data: "), rewritten...))
}
i++
continue
}
}
// Case 3: everything else
out = append(out, line)
i++
}
// SSE format: "data: {json}\n\n"
lines := bytes.Split(chunk, []byte("\n"))
for i, line := range lines {
if bytes.HasPrefix(line, []byte("data: ")) {
jsonData := bytes.TrimPrefix(line, []byte("data: "))
if len(jsonData) > 0 && jsonData[0] == '{' {
// Rewrite JSON in the data line
rewritten := rw.rewriteModelInResponse(jsonData)
lines[i] = append([]byte("data: "), rewritten...)
return bytes.Join(out, []byte("\n"))
}
// rewriteStreamEvent processes a single JSON event in the SSE stream.
// It rewrites model names and ensures signature fields exist.
// NOTE: streaming mode does NOT suppress thinking blocks - they are
// passed through with signature injection to avoid breaking SSE index
// alignment and TUI rendering.
func (rw *ResponseRewriter) rewriteStreamEvent(data []byte) []byte {
// Inject empty signature where needed
data = ensureAmpSignature(data)
// Rewrite model name
if rw.originalModel != "" {
for _, path := range modelFieldPaths {
if gjson.GetBytes(data, path).Exists() {
data, _ = sjson.SetBytes(data, path, rw.originalModel)
}
}
}
return bytes.Join(lines, []byte("\n"))
return data
}
// SanitizeAmpRequestBody removes thinking blocks with empty/missing/invalid signatures
// and strips the proxy-injected "signature" field from tool_use blocks in the messages
// array before forwarding to the upstream API.
// This prevents 400 errors from the API which requires valid signatures on thinking
// blocks and does not accept a signature field on tool_use blocks.
func SanitizeAmpRequestBody(body []byte) []byte {
messages := gjson.GetBytes(body, "messages")
if !messages.Exists() || !messages.IsArray() {
return body
}
modified := false
for msgIdx, msg := range messages.Array() {
if msg.Get("role").String() != "assistant" {
continue
}
content := msg.Get("content")
if !content.Exists() || !content.IsArray() {
continue
}
var keepBlocks []interface{}
contentModified := false
for _, block := range content.Array() {
blockType := block.Get("type").String()
if blockType == "thinking" {
sig := block.Get("signature")
if !sig.Exists() || sig.Type != gjson.String || strings.TrimSpace(sig.String()) == "" {
contentModified = true
continue
}
}
// Use raw JSON to prevent float64 rounding of large integers in tool_use inputs
blockRaw := []byte(block.Raw)
if blockType == "tool_use" && block.Get("signature").Exists() {
blockRaw, _ = sjson.DeleteBytes(blockRaw, "signature")
contentModified = true
}
// sjson.SetBytes supports raw JSON strings if wrapped in gjson.Raw
keepBlocks = append(keepBlocks, json.RawMessage(blockRaw))
}
if contentModified {
contentPath := fmt.Sprintf("messages.%d.content", msgIdx)
var err error
if len(keepBlocks) == 0 {
body, err = sjson.SetBytes(body, contentPath, []interface{}{})
} else {
body, err = sjson.SetBytes(body, contentPath, keepBlocks)
}
if err != nil {
log.Warnf("Amp RequestSanitizer: failed to sanitize message %d: %v", msgIdx, err)
continue
}
modified = true
}
}
if modified {
log.Debugf("Amp RequestSanitizer: sanitized request body")
}
return body
}

View File

@@ -1,6 +1,7 @@
package amp
import (
"strings"
"testing"
)
@@ -100,6 +101,80 @@ func TestRewriteStreamChunk_MessageModel(t *testing.T) {
}
}
func TestRewriteStreamChunk_PreservesThinkingWithSignatureInjection(t *testing.T) {
rw := &ResponseRewriter{}
chunk := []byte("event: content_block_start\ndata: {\"type\":\"content_block_start\",\"index\":0,\"content_block\":{\"type\":\"thinking\",\"thinking\":\"\"}}\n\nevent: content_block_delta\ndata: {\"type\":\"content_block_delta\",\"index\":0,\"delta\":{\"type\":\"thinking_delta\",\"thinking\":\"abc\"}}\n\nevent: content_block_stop\ndata: {\"type\":\"content_block_stop\",\"index\":0}\n\nevent: content_block_start\ndata: {\"type\":\"content_block_start\",\"index\":1,\"content_block\":{\"type\":\"tool_use\",\"name\":\"bash\",\"input\":{}}}\n\n")
result := rw.rewriteStreamChunk(chunk)
// Streaming mode preserves thinking blocks (does NOT suppress them)
// to avoid breaking SSE index alignment and TUI rendering
if !contains(result, []byte(`"content_block":{"type":"thinking"`)) {
t.Fatalf("expected thinking content_block_start to be preserved, got %s", string(result))
}
if !contains(result, []byte(`"delta":{"type":"thinking_delta"`)) {
t.Fatalf("expected thinking_delta to be preserved, got %s", string(result))
}
if !contains(result, []byte(`"type":"content_block_stop","index":0`)) {
t.Fatalf("expected content_block_stop for thinking block to be preserved, got %s", string(result))
}
if !contains(result, []byte(`"content_block":{"type":"tool_use"`)) {
t.Fatalf("expected tool_use content_block frame to remain, got %s", string(result))
}
// Signature should be injected into both thinking and tool_use blocks
if count := strings.Count(string(result), `"signature":""`); count != 2 {
t.Fatalf("expected 2 signature injections, but got %d in %s", count, string(result))
}
}
func TestSanitizeAmpRequestBody_RemovesWhitespaceAndNonStringSignatures(t *testing.T) {
input := []byte(`{"messages":[{"role":"assistant","content":[{"type":"thinking","thinking":"drop-whitespace","signature":" "},{"type":"thinking","thinking":"drop-number","signature":123},{"type":"thinking","thinking":"keep-valid","signature":"valid-signature"},{"type":"text","text":"keep-text"}]}]}`)
result := SanitizeAmpRequestBody(input)
if contains(result, []byte("drop-whitespace")) {
t.Fatalf("expected whitespace-only signature block to be removed, got %s", string(result))
}
if contains(result, []byte("drop-number")) {
t.Fatalf("expected non-string signature block to be removed, got %s", string(result))
}
if !contains(result, []byte("keep-valid")) {
t.Fatalf("expected valid thinking block to remain, got %s", string(result))
}
if !contains(result, []byte("keep-text")) {
t.Fatalf("expected non-thinking content to remain, got %s", string(result))
}
}
func TestSanitizeAmpRequestBody_StripsSignatureFromToolUseBlocks(t *testing.T) {
input := []byte(`{"messages":[{"role":"assistant","content":[{"type":"thinking","thinking":"thought","signature":"valid-sig"},{"type":"tool_use","id":"toolu_01","name":"Bash","input":{"cmd":"ls"},"signature":""}]}]}`)
result := SanitizeAmpRequestBody(input)
if contains(result, []byte(`"signature":""`)) {
t.Fatalf("expected signature to be stripped from tool_use block, got %s", string(result))
}
if !contains(result, []byte(`"valid-sig"`)) {
t.Fatalf("expected thinking signature to remain, got %s", string(result))
}
if !contains(result, []byte(`"tool_use"`)) {
t.Fatalf("expected tool_use block to remain, got %s", string(result))
}
}
func TestSanitizeAmpRequestBody_MixedInvalidThinkingAndToolUseSignature(t *testing.T) {
input := []byte(`{"messages":[{"role":"assistant","content":[{"type":"thinking","thinking":"drop-me","signature":""},{"type":"tool_use","id":"toolu_01","name":"Bash","input":{"cmd":"ls"},"signature":""}]}]}`)
result := SanitizeAmpRequestBody(input)
if contains(result, []byte("drop-me")) {
t.Fatalf("expected invalid thinking block to be removed, got %s", string(result))
}
if contains(result, []byte(`"signature"`)) {
t.Fatalf("expected signature to be stripped from tool_use block, got %s", string(result))
}
if !contains(result, []byte(`"tool_use"`)) {
t.Fatalf("expected tool_use block to remain, got %s", string(result))
}
}
func contains(data, substr []byte) bool {
for i := 0; i <= len(data)-len(substr); i++ {
if string(data[i:i+len(substr)]) == string(substr) {

View File

@@ -25,6 +25,7 @@ import (
"github.com/router-for-me/CLIProxyAPI/v6/internal/api/modules"
ampmodule "github.com/router-for-me/CLIProxyAPI/v6/internal/api/modules/amp"
"github.com/router-for-me/CLIProxyAPI/v6/internal/auth/kiro"
"github.com/router-for-me/CLIProxyAPI/v6/internal/cache"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/logging"
"github.com/router-for-me/CLIProxyAPI/v6/internal/managementasset"
@@ -262,6 +263,7 @@ func NewServer(cfg *config.Config, authManager *auth.Manager, accessManager *sdk
}
managementasset.SetCurrentConfig(cfg)
auth.SetQuotaCooldownDisabled(cfg.DisableCooling)
applySignatureCacheConfig(nil, cfg)
// Initialize management handler
s.mgmt = managementHandlers.NewHandler(cfg, configFilePath, authManager)
if optionState.localPassword != "" {
@@ -323,6 +325,10 @@ func NewServer(cfg *config.Config, authManager *auth.Manager, accessManager *sdk
// setupRoutes configures the API routes for the server.
// It defines the endpoints and associates them with their respective handlers.
func (s *Server) setupRoutes() {
s.engine.GET("/healthz", func(c *gin.Context) {
c.JSON(http.StatusOK, gin.H{"status": "ok"})
})
s.engine.GET("/management.html", s.serveManagementControlPanel)
openaiHandlers := openai.NewOpenAIAPIHandler(s.handlers)
geminiHandlers := gemini.NewGeminiAPIHandler(s.handlers)
@@ -431,20 +437,6 @@ func (s *Server) setupRoutes() {
c.String(http.StatusOK, oauthCallbackSuccessHTML)
})
s.engine.GET("/iflow/callback", func(c *gin.Context) {
code := c.Query("code")
state := c.Query("state")
errStr := c.Query("error")
if errStr == "" {
errStr = c.Query("error_description")
}
if state != "" {
_, _ = managementHandlers.WriteOAuthCallbackFileForPendingSession(s.cfg.AuthDir, "iflow", state, code, errStr)
}
c.Header("Content-Type", "text/html; charset=utf-8")
c.String(http.StatusOK, oauthCallbackSuccessHTML)
})
s.engine.GET("/antigravity/callback", func(c *gin.Context) {
code := c.Query("code")
state := c.Query("state")
@@ -569,6 +561,8 @@ func (s *Server) registerManagementRoutes() {
mgmt.PUT("/quota-exceeded/switch-preview-model", s.mgmt.PutSwitchPreviewModel)
mgmt.PATCH("/quota-exceeded/switch-preview-model", s.mgmt.PutSwitchPreviewModel)
mgmt.GET("/copilot-quota", s.mgmt.GetCopilotQuota)
mgmt.GET("/api-keys", s.mgmt.GetAPIKeys)
mgmt.PUT("/api-keys", s.mgmt.PutAPIKeys)
mgmt.PATCH("/api-keys", s.mgmt.PatchAPIKeys)
@@ -675,18 +669,18 @@ func (s *Server) registerManagementRoutes() {
mgmt.GET("/gitlab-auth-url", s.mgmt.RequestGitLabToken)
mgmt.POST("/gitlab-auth-url", s.mgmt.RequestGitLabPATToken)
mgmt.GET("/gemini-cli-auth-url", s.mgmt.RequestGeminiCLIToken)
mgmt.GET("/antigravity-auth-url", s.mgmt.RequestAntigravityToken)
mgmt.GET("/qwen-auth-url", s.mgmt.RequestQwenToken)
mgmt.GET("/kilo-auth-url", s.mgmt.RequestKiloToken)
mgmt.GET("/kimi-auth-url", s.mgmt.RequestKimiToken)
mgmt.GET("/iflow-auth-url", s.mgmt.RequestIFlowToken)
mgmt.POST("/iflow-auth-url", s.mgmt.RequestIFlowCookieToken)
mgmt.GET("/kiro-auth-url", s.mgmt.RequestKiroToken)
mgmt.GET("/github-auth-url", s.mgmt.RequestGitHubToken)
mgmt.POST("/oauth-callback", s.mgmt.PostOAuthCallback)
mgmt.GET("/get-auth-status", s.mgmt.GetAuthStatus)
mgmt.GET("/antigravity-auth-url", s.mgmt.RequestAntigravityToken)
mgmt.GET("/kilo-auth-url", s.mgmt.RequestKiloToken)
mgmt.GET("/kimi-auth-url", s.mgmt.RequestKimiToken)
mgmt.GET("/iflow-auth-url", s.mgmt.RequestIFlowToken)
mgmt.POST("/iflow-auth-url", s.mgmt.RequestIFlowCookieToken)
mgmt.GET("/kiro-auth-url", s.mgmt.RequestKiroToken)
mgmt.GET("/cursor-auth-url", s.mgmt.RequestCursorToken)
mgmt.GET("/github-auth-url", s.mgmt.RequestGitHubToken)
mgmt.POST("/oauth-callback", s.mgmt.PostOAuthCallback)
mgmt.GET("/get-auth-status", s.mgmt.GetAuthStatus)
}
}
}
func (s *Server) managementAvailabilityMiddleware() gin.HandlerFunc {
return func(c *gin.Context) {
@@ -959,6 +953,8 @@ func (s *Server) UpdateClients(cfg *config.Config) {
auth.SetQuotaCooldownDisabled(cfg.DisableCooling)
}
applySignatureCacheConfig(oldCfg, cfg)
if s.handlers != nil && s.handlers.AuthManager != nil {
s.handlers.AuthManager.SetRetryConfig(cfg.RequestRetry, time.Duration(cfg.MaxRetryInterval)*time.Second, cfg.MaxRetryCredentials)
}
@@ -1097,3 +1093,37 @@ func AuthMiddleware(manager *sdkaccess.Manager) gin.HandlerFunc {
c.AbortWithStatusJSON(statusCode, gin.H{"error": err.Message})
}
}
func configuredSignatureCacheEnabled(cfg *config.Config) bool {
if cfg != nil && cfg.AntigravitySignatureCacheEnabled != nil {
return *cfg.AntigravitySignatureCacheEnabled
}
return true
}
func applySignatureCacheConfig(oldCfg, cfg *config.Config) {
newVal := configuredSignatureCacheEnabled(cfg)
newStrict := configuredSignatureBypassStrict(cfg)
if oldCfg == nil {
cache.SetSignatureCacheEnabled(newVal)
cache.SetSignatureBypassStrictMode(newStrict)
return
}
oldVal := configuredSignatureCacheEnabled(oldCfg)
if oldVal != newVal {
cache.SetSignatureCacheEnabled(newVal)
}
oldStrict := configuredSignatureBypassStrict(oldCfg)
if oldStrict != newStrict {
cache.SetSignatureBypassStrictMode(newStrict)
}
}
func configuredSignatureBypassStrict(cfg *config.Config) bool {
if cfg != nil && cfg.AntigravitySignatureBypassStrict != nil {
return *cfg.AntigravitySignatureBypassStrict
}
return false
}

View File

@@ -1,6 +1,7 @@
package api
import (
"encoding/json"
"net/http"
"net/http/httptest"
"os"
@@ -46,6 +47,28 @@ func newTestServer(t *testing.T) *Server {
return NewServer(cfg, authManager, accessManager, configPath)
}
func TestHealthz(t *testing.T) {
server := newTestServer(t)
req := httptest.NewRequest(http.MethodGet, "/healthz", nil)
rr := httptest.NewRecorder()
server.engine.ServeHTTP(rr, req)
if rr.Code != http.StatusOK {
t.Fatalf("unexpected status code: got %d want %d; body=%s", rr.Code, http.StatusOK, rr.Body.String())
}
var resp struct {
Status string `json:"status"`
}
if err := json.Unmarshal(rr.Body.Bytes(), &resp); err != nil {
t.Fatalf("failed to parse response JSON: %v; body=%s", err, rr.Body.String())
}
if resp.Status != "ok" {
t.Fatalf("unexpected response status: got %q want %q", resp.Status, "ok")
}
}
func TestAmpProviderModelRoutes(t *testing.T) {
testCases := []struct {
name string
@@ -172,6 +195,8 @@ func TestDefaultRequestLoggerFactory_UsesResolvedLogDirectory(t *testing.T) {
nil,
nil,
nil,
nil,
nil,
true,
"issue-1711",
time.Now(),

View File

@@ -59,10 +59,30 @@ type ClaudeAuth struct {
// Returns:
// - *ClaudeAuth: A new Claude authentication service instance
func NewClaudeAuth(cfg *config.Config) *ClaudeAuth {
return NewClaudeAuthWithProxyURL(cfg, "")
}
// NewClaudeAuthWithProxyURL creates a new Anthropic authentication service with a proxy override.
// proxyURL takes precedence over cfg.ProxyURL when non-empty.
func NewClaudeAuthWithProxyURL(cfg *config.Config, proxyURL string) *ClaudeAuth {
effectiveProxyURL := strings.TrimSpace(proxyURL)
var sdkCfg *config.SDKConfig
if cfg != nil {
sdkCfgCopy := cfg.SDKConfig
if effectiveProxyURL == "" {
effectiveProxyURL = strings.TrimSpace(cfg.ProxyURL)
}
sdkCfgCopy.ProxyURL = effectiveProxyURL
sdkCfg = &sdkCfgCopy
} else if effectiveProxyURL != "" {
sdkCfgCopy := config.SDKConfig{ProxyURL: effectiveProxyURL}
sdkCfg = &sdkCfgCopy
}
// Use custom HTTP client with Firefox TLS fingerprint to bypass
// Cloudflare's bot detection on Anthropic domains
return &ClaudeAuth{
httpClient: NewAnthropicHttpClient(&cfg.SDKConfig),
httpClient: NewAnthropicHttpClient(sdkCfg),
}
}
@@ -88,7 +108,7 @@ func (o *ClaudeAuth) GenerateAuthURL(state string, pkceCodes *PKCECodes) (string
"client_id": {ClientID},
"response_type": {"code"},
"redirect_uri": {RedirectURI},
"scope": {"org:create_api_key user:profile user:inference"},
"scope": {"user:profile user:inference user:sessions:claude_code user:mcp_servers user:file_upload"},
"code_challenge": {pkceCodes.CodeChallenge},
"code_challenge_method": {"S256"},
"state": {state},

View File

@@ -0,0 +1,33 @@
package claude
import (
"testing"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"golang.org/x/net/proxy"
)
func TestNewClaudeAuthWithProxyURL_OverrideDirectTakesPrecedence(t *testing.T) {
cfg := &config.Config{SDKConfig: config.SDKConfig{ProxyURL: "socks5://proxy.example.com:1080"}}
auth := NewClaudeAuthWithProxyURL(cfg, "direct")
transport, ok := auth.httpClient.Transport.(*utlsRoundTripper)
if !ok || transport == nil {
t.Fatalf("expected utlsRoundTripper, got %T", auth.httpClient.Transport)
}
if transport.dialer != proxy.Direct {
t.Fatalf("expected proxy.Direct, got %T", transport.dialer)
}
}
func TestNewClaudeAuthWithProxyURL_OverrideProxyAppliedWithoutConfig(t *testing.T) {
auth := NewClaudeAuthWithProxyURL(nil, "socks5://proxy.example.com:1080")
transport, ok := auth.httpClient.Transport.(*utlsRoundTripper)
if !ok || transport == nil {
t.Fatalf("expected utlsRoundTripper, got %T", auth.httpClient.Transport)
}
if transport.dialer == proxy.Direct {
t.Fatalf("expected proxy dialer, got %T", transport.dialer)
}
}

View File

@@ -63,7 +63,7 @@ func (a *CodeBuddyAuth) FetchAuthState(ctx context.Context) (*AuthState, error)
return nil, fmt.Errorf("codebuddy: failed to create auth state request: %w", err)
}
requestID := uuid.NewString()
requestID := uuid.NewString()
req.Header.Set("Accept", "application/json, text/plain, */*")
req.Header.Set("Content-Type", "application/json")
req.Header.Set("X-Requested-With", "XMLHttpRequest")

View File

@@ -19,4 +19,3 @@ func TestDecodeUserID_ValidJWT(t *testing.T) {
t.Errorf("expected 'test-user-id-123', got '%s'", userID)
}
}

View File

@@ -37,8 +37,23 @@ type CodexAuth struct {
// NewCodexAuth creates a new CodexAuth service instance.
// It initializes an HTTP client with proxy settings from the provided configuration.
func NewCodexAuth(cfg *config.Config) *CodexAuth {
return NewCodexAuthWithProxyURL(cfg, "")
}
// NewCodexAuthWithProxyURL creates a new CodexAuth service instance.
// proxyURL takes precedence over cfg.ProxyURL when non-empty.
func NewCodexAuthWithProxyURL(cfg *config.Config, proxyURL string) *CodexAuth {
effectiveProxyURL := strings.TrimSpace(proxyURL)
var sdkCfg config.SDKConfig
if cfg != nil {
sdkCfg = cfg.SDKConfig
if effectiveProxyURL == "" {
effectiveProxyURL = strings.TrimSpace(cfg.ProxyURL)
}
}
sdkCfg.ProxyURL = effectiveProxyURL
return &CodexAuth{
httpClient: util.SetProxy(&cfg.SDKConfig, &http.Client{}),
httpClient: util.SetProxy(&sdkCfg, &http.Client{}),
}
}

View File

@@ -7,6 +7,8 @@ import (
"strings"
"sync/atomic"
"testing"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
)
type roundTripFunc func(*http.Request) (*http.Response, error)
@@ -42,3 +44,37 @@ func TestRefreshTokensWithRetry_NonRetryableOnlyAttemptsOnce(t *testing.T) {
t.Fatalf("expected 1 refresh attempt, got %d", got)
}
}
func TestNewCodexAuthWithProxyURL_OverrideDirectDisablesProxy(t *testing.T) {
cfg := &config.Config{SDKConfig: config.SDKConfig{ProxyURL: "http://proxy.example.com:8080"}}
auth := NewCodexAuthWithProxyURL(cfg, "direct")
transport, ok := auth.httpClient.Transport.(*http.Transport)
if !ok || transport == nil {
t.Fatalf("expected http.Transport, got %T", auth.httpClient.Transport)
}
if transport.Proxy != nil {
t.Fatal("expected direct transport to disable proxy function")
}
}
func TestNewCodexAuthWithProxyURL_OverrideProxyTakesPrecedence(t *testing.T) {
cfg := &config.Config{SDKConfig: config.SDKConfig{ProxyURL: "http://global.example.com:8080"}}
auth := NewCodexAuthWithProxyURL(cfg, "http://override.example.com:8081")
transport, ok := auth.httpClient.Transport.(*http.Transport)
if !ok || transport == nil {
t.Fatalf("expected http.Transport, got %T", auth.httpClient.Transport)
}
req, errReq := http.NewRequest(http.MethodGet, "https://example.com", nil)
if errReq != nil {
t.Fatalf("new request: %v", errReq)
}
proxyURL, errProxy := transport.Proxy(req)
if errProxy != nil {
t.Fatalf("proxy func: %v", errProxy)
}
if proxyURL == nil || proxyURL.String() != "http://override.example.com:8081" {
t.Fatalf("proxy URL = %v, want http://override.example.com:8081", proxyURL)
}
}

View File

@@ -24,11 +24,11 @@ const (
copilotAPIEndpoint = "https://api.githubcopilot.com"
// Common HTTP header values for Copilot API requests.
copilotUserAgent = "GithubCopilot/1.0"
copilotEditorVersion = "vscode/1.100.0"
copilotPluginVersion = "copilot/1.300.0"
copilotIntegrationID = "vscode-chat"
copilotOpenAIIntent = "conversation-panel"
copilotUserAgent = "GithubCopilot/1.0"
copilotEditorVersion = "vscode/1.100.0"
copilotPluginVersion = "copilot/1.300.0"
copilotIntegrationID = "vscode-chat"
copilotOpenAIIntent = "conversation-panel"
)
// CopilotAPIToken represents the Copilot API token response.
@@ -235,6 +235,74 @@ type CopilotModelEntry struct {
Capabilities map[string]any `json:"capabilities,omitempty"`
}
// CopilotModelLimits holds the token limits returned by the Copilot /models API
// under capabilities.limits. These limits vary by account type (individual vs
// business) and are the authoritative source for enforcing prompt size.
type CopilotModelLimits struct {
// MaxContextWindowTokens is the total context window (prompt + output).
MaxContextWindowTokens int
// MaxPromptTokens is the hard limit on input/prompt tokens.
// Exceeding this triggers a 400 error from the Copilot API.
MaxPromptTokens int
// MaxOutputTokens is the maximum number of output/completion tokens.
MaxOutputTokens int
}
// Limits extracts the token limits from the model's capabilities map.
// Returns nil if no limits are available or the structure is unexpected.
//
// Expected Copilot API shape:
//
// "capabilities": {
// "limits": {
// "max_context_window_tokens": 200000,
// "max_prompt_tokens": 168000,
// "max_output_tokens": 32000
// }
// }
func (e *CopilotModelEntry) Limits() *CopilotModelLimits {
if e.Capabilities == nil {
return nil
}
limitsRaw, ok := e.Capabilities["limits"]
if !ok {
return nil
}
limitsMap, ok := limitsRaw.(map[string]any)
if !ok {
return nil
}
result := &CopilotModelLimits{
MaxContextWindowTokens: anyToInt(limitsMap["max_context_window_tokens"]),
MaxPromptTokens: anyToInt(limitsMap["max_prompt_tokens"]),
MaxOutputTokens: anyToInt(limitsMap["max_output_tokens"]),
}
// Only return if at least one field is populated.
if result.MaxContextWindowTokens == 0 && result.MaxPromptTokens == 0 && result.MaxOutputTokens == 0 {
return nil
}
return result
}
// anyToInt converts a JSON-decoded numeric value to int.
// Go's encoding/json decodes numbers into float64 when the target is any/interface{}.
func anyToInt(v any) int {
switch n := v.(type) {
case float64:
return int(n)
case float32:
return int(n)
case int:
return n
case int64:
return int(n)
default:
return 0
}
}
// CopilotModelsResponse represents the response from the Copilot /models endpoint.
type CopilotModelsResponse struct {
Data []CopilotModelEntry `json:"data"`
@@ -246,9 +314,9 @@ const maxModelsResponseSize = 2 * 1024 * 1024
// allowedCopilotAPIHosts is the set of hosts that are considered safe for Copilot API requests.
var allowedCopilotAPIHosts = map[string]bool{
"api.githubcopilot.com": true,
"api.individual.githubcopilot.com": true,
"api.business.githubcopilot.com": true,
"api.githubcopilot.com": true,
"api.individual.githubcopilot.com": true,
"api.business.githubcopilot.com": true,
"copilot-proxy.githubusercontent.com": true,
}

View File

@@ -0,0 +1,33 @@
package cursor
import (
"fmt"
"strings"
)
// CredentialFileName returns the filename used to persist Cursor credentials.
// Priority: explicit label > auto-generated from JWT sub hash.
// If both label and subHash are empty, falls back to "cursor.json".
func CredentialFileName(label, subHash string) string {
label = strings.TrimSpace(label)
subHash = strings.TrimSpace(subHash)
if label != "" {
return fmt.Sprintf("cursor.%s.json", label)
}
if subHash != "" {
return fmt.Sprintf("cursor.%s.json", subHash)
}
return "cursor.json"
}
// DisplayLabel returns a human-readable label for the Cursor account.
func DisplayLabel(label, subHash string) string {
label = strings.TrimSpace(label)
if label != "" {
return "Cursor " + label
}
if subHash != "" {
return "Cursor " + subHash
}
return "Cursor User"
}

View File

@@ -0,0 +1,249 @@
// Package cursor implements Cursor OAuth PKCE authentication and token refresh.
package cursor
import (
"context"
"crypto/rand"
"crypto/sha256"
"encoding/base64"
"encoding/json"
"fmt"
"io"
"math"
"net/http"
"strings"
"time"
)
const (
CursorLoginURL = "https://cursor.com/loginDeepControl"
CursorPollURL = "https://api2.cursor.sh/auth/poll"
CursorRefreshURL = "https://api2.cursor.sh/auth/exchange_user_api_key"
pollMaxAttempts = 150
pollBaseDelay = 1 * time.Second
pollMaxDelay = 10 * time.Second
pollBackoffMultiply = 1.2
maxConsecutiveErrors = 10
)
// AuthParams holds the PKCE parameters for Cursor login.
type AuthParams struct {
Verifier string
Challenge string
UUID string
LoginURL string
}
// TokenPair holds the access and refresh tokens from Cursor.
type TokenPair struct {
AccessToken string `json:"accessToken"`
RefreshToken string `json:"refreshToken"`
}
// GeneratePKCE creates a PKCE verifier and challenge pair.
func GeneratePKCE() (verifier, challenge string, err error) {
verifierBytes := make([]byte, 96)
if _, err = rand.Read(verifierBytes); err != nil {
return "", "", fmt.Errorf("cursor: failed to generate PKCE verifier: %w", err)
}
verifier = base64.RawURLEncoding.EncodeToString(verifierBytes)
h := sha256.Sum256([]byte(verifier))
challenge = base64.RawURLEncoding.EncodeToString(h[:])
return verifier, challenge, nil
}
// GenerateAuthParams creates the full set of auth params for Cursor login.
func GenerateAuthParams() (*AuthParams, error) {
verifier, challenge, err := GeneratePKCE()
if err != nil {
return nil, err
}
uuidBytes := make([]byte, 16)
if _, err = rand.Read(uuidBytes); err != nil {
return nil, fmt.Errorf("cursor: failed to generate UUID: %w", err)
}
uuid := fmt.Sprintf("%x-%x-%x-%x-%x",
uuidBytes[0:4], uuidBytes[4:6], uuidBytes[6:8], uuidBytes[8:10], uuidBytes[10:16])
loginURL := fmt.Sprintf("%s?challenge=%s&uuid=%s&mode=login&redirectTarget=cli",
CursorLoginURL, challenge, uuid)
return &AuthParams{
Verifier: verifier,
Challenge: challenge,
UUID: uuid,
LoginURL: loginURL,
}, nil
}
// PollForAuth polls the Cursor auth endpoint until the user completes login.
func PollForAuth(ctx context.Context, uuid, verifier string) (*TokenPair, error) {
delay := pollBaseDelay
consecutiveErrors := 0
client := &http.Client{Timeout: 10 * time.Second}
for attempt := 0; attempt < pollMaxAttempts; attempt++ {
select {
case <-ctx.Done():
return nil, ctx.Err()
case <-time.After(delay):
}
url := fmt.Sprintf("%s?uuid=%s&verifier=%s", CursorPollURL, uuid, verifier)
req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
if err != nil {
return nil, fmt.Errorf("cursor: failed to create poll request: %w", err)
}
resp, err := client.Do(req)
if err != nil {
consecutiveErrors++
if consecutiveErrors >= maxConsecutiveErrors {
return nil, fmt.Errorf("cursor: too many consecutive poll errors (last: %v)", err)
}
delay = minDuration(time.Duration(float64(delay)*pollBackoffMultiply), pollMaxDelay)
continue
}
body, _ := io.ReadAll(resp.Body)
resp.Body.Close()
if resp.StatusCode == http.StatusNotFound {
// Still waiting for user to authorize
consecutiveErrors = 0
delay = minDuration(time.Duration(float64(delay)*pollBackoffMultiply), pollMaxDelay)
continue
}
if resp.StatusCode >= 200 && resp.StatusCode < 300 {
var tokens TokenPair
if err := json.Unmarshal(body, &tokens); err != nil {
return nil, fmt.Errorf("cursor: failed to parse auth response: %w", err)
}
return &tokens, nil
}
return nil, fmt.Errorf("cursor: poll failed with status %d: %s", resp.StatusCode, string(body))
}
return nil, fmt.Errorf("cursor: authentication polling timeout (waited ~%.0f seconds)",
float64(pollMaxAttempts)*pollMaxDelay.Seconds()/2)
}
// RefreshToken refreshes a Cursor access token using the refresh token.
func RefreshToken(ctx context.Context, refreshToken string) (*TokenPair, error) {
client := &http.Client{Timeout: 10 * time.Second}
req, err := http.NewRequestWithContext(ctx, http.MethodPost, CursorRefreshURL,
strings.NewReader("{}"))
if err != nil {
return nil, fmt.Errorf("cursor: failed to create refresh request: %w", err)
}
req.Header.Set("Authorization", "Bearer "+refreshToken)
req.Header.Set("Content-Type", "application/json")
resp, err := client.Do(req)
if err != nil {
return nil, fmt.Errorf("cursor: token refresh request failed: %w", err)
}
defer resp.Body.Close()
body, _ := io.ReadAll(resp.Body)
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
return nil, fmt.Errorf("cursor: token refresh failed (status %d): %s", resp.StatusCode, string(body))
}
var tokens TokenPair
if err := json.Unmarshal(body, &tokens); err != nil {
return nil, fmt.Errorf("cursor: failed to parse refresh response: %w", err)
}
// Keep original refresh token if not returned
if tokens.RefreshToken == "" {
tokens.RefreshToken = refreshToken
}
return &tokens, nil
}
// ParseJWTSub extracts the "sub" claim from a Cursor JWT access token.
// Cursor JWTs contain "sub" like "auth0|user_XXXX" which uniquely identifies
// the account. Returns empty string if parsing fails.
func ParseJWTSub(token string) string {
decoded := decodeJWTPayload(token)
if decoded == nil {
return ""
}
var claims struct {
Sub string `json:"sub"`
}
if err := json.Unmarshal(decoded, &claims); err != nil {
return ""
}
return claims.Sub
}
// SubToShortHash converts a JWT sub claim to a short hex hash for use in filenames.
// e.g. "auth0|user_2x..." → "a3f8b2c1"
func SubToShortHash(sub string) string {
if sub == "" {
return ""
}
h := sha256.Sum256([]byte(sub))
return fmt.Sprintf("%x", h[:4]) // 8 hex chars
}
// decodeJWTPayload decodes the payload (middle) part of a JWT.
func decodeJWTPayload(token string) []byte {
parts := strings.Split(token, ".")
if len(parts) != 3 {
return nil
}
payload := parts[1]
switch len(payload) % 4 {
case 2:
payload += "=="
case 3:
payload += "="
}
payload = strings.ReplaceAll(payload, "-", "+")
payload = strings.ReplaceAll(payload, "_", "/")
decoded, err := base64.StdEncoding.DecodeString(payload)
if err != nil {
return nil
}
return decoded
}
// GetTokenExpiry extracts the JWT expiry from an access token with a 5-minute safety margin.
// Falls back to 1 hour from now if the token can't be parsed.
func GetTokenExpiry(token string) time.Time {
decoded := decodeJWTPayload(token)
if decoded == nil {
return time.Now().Add(1 * time.Hour)
}
var claims struct {
Exp float64 `json:"exp"`
}
if err := json.Unmarshal(decoded, &claims); err != nil || claims.Exp == 0 {
return time.Now().Add(1 * time.Hour)
}
sec, frac := math.Modf(claims.Exp)
expiry := time.Unix(int64(sec), int64(frac*1e9))
// Subtract 5-minute safety margin
return expiry.Add(-5 * time.Minute)
}
func minDuration(a, b time.Duration) time.Duration {
if a < b {
return a
}
return b
}

View File

@@ -0,0 +1,84 @@
package proto
import (
"encoding/binary"
"encoding/json"
"fmt"
)
const (
// ConnectEndStreamFlag marks the end-of-stream frame (trailers).
ConnectEndStreamFlag byte = 0x02
// ConnectCompressionFlag indicates the payload is compressed (not supported).
ConnectCompressionFlag byte = 0x01
// ConnectFrameHeaderSize is the fixed 5-byte frame header.
ConnectFrameHeaderSize = 5
)
// FrameConnectMessage wraps a protobuf payload in a Connect frame.
// Frame format: [1 byte flags][4 bytes payload length (big-endian)][payload]
func FrameConnectMessage(data []byte, flags byte) []byte {
frame := make([]byte, ConnectFrameHeaderSize+len(data))
frame[0] = flags
binary.BigEndian.PutUint32(frame[1:5], uint32(len(data)))
copy(frame[5:], data)
return frame
}
// ParseConnectFrame extracts one frame from a buffer.
// Returns (flags, payload, bytesConsumed, ok).
// ok is false when the buffer is too short for a complete frame.
func ParseConnectFrame(buf []byte) (flags byte, payload []byte, consumed int, ok bool) {
if len(buf) < ConnectFrameHeaderSize {
return 0, nil, 0, false
}
flags = buf[0]
length := binary.BigEndian.Uint32(buf[1:5])
total := ConnectFrameHeaderSize + int(length)
if len(buf) < total {
return 0, nil, 0, false
}
return flags, buf[5:total], total, true
}
// ConnectError is a structured error from the Connect protocol end-of-stream trailer.
// The Code field contains the server-defined error code (e.g. gRPC standard codes
// like "resource_exhausted", "unauthenticated", "permission_denied", "unavailable").
type ConnectError struct {
Code string // server-defined error code
Message string // human-readable error description
}
func (e *ConnectError) Error() string {
return fmt.Sprintf("Connect error %s: %s", e.Code, e.Message)
}
// ParseConnectEndStream parses a Connect end-of-stream frame payload (JSON).
// Returns nil if there is no error in the trailer.
// On error, returns a *ConnectError with the server's error code and message.
func ParseConnectEndStream(data []byte) error {
if len(data) == 0 {
return nil
}
var trailer struct {
Error *struct {
Code string `json:"code"`
Message string `json:"message"`
} `json:"error"`
}
if err := json.Unmarshal(data, &trailer); err != nil {
return fmt.Errorf("failed to parse Connect end stream: %w", err)
}
if trailer.Error != nil {
code := trailer.Error.Code
if code == "" {
code = "unknown"
}
msg := trailer.Error.Message
if msg == "" {
msg = "Unknown error"
}
return &ConnectError{Code: code, Message: msg}
}
return nil
}

View File

@@ -0,0 +1,563 @@
package proto
import (
"encoding/hex"
"fmt"
log "github.com/sirupsen/logrus"
"google.golang.org/protobuf/encoding/protowire"
)
// ServerMessageType identifies the kind of decoded server message.
type ServerMessageType int
const (
ServerMsgUnknown ServerMessageType = iota
ServerMsgTextDelta // Text content delta
ServerMsgThinkingDelta // Thinking/reasoning delta
ServerMsgThinkingCompleted // Thinking completed
ServerMsgKvGetBlob // Server wants a blob
ServerMsgKvSetBlob // Server wants to store a blob
ServerMsgExecRequestCtx // Server requests context (tools, etc.)
ServerMsgExecMcpArgs // Server wants MCP tool execution
ServerMsgExecShellArgs // Rejected: shell command
ServerMsgExecReadArgs // Rejected: file read
ServerMsgExecWriteArgs // Rejected: file write
ServerMsgExecDeleteArgs // Rejected: file delete
ServerMsgExecLsArgs // Rejected: directory listing
ServerMsgExecGrepArgs // Rejected: grep search
ServerMsgExecFetchArgs // Rejected: HTTP fetch
ServerMsgExecDiagnostics // Respond with empty diagnostics
ServerMsgExecShellStream // Rejected: shell stream
ServerMsgExecBgShellSpawn // Rejected: background shell
ServerMsgExecWriteShellStdin // Rejected: write shell stdin
ServerMsgExecOther // Other exec types (respond with empty)
ServerMsgTurnEnded // Turn has ended (no more output)
ServerMsgHeartbeat // Server heartbeat
ServerMsgTokenDelta // Token usage delta
ServerMsgCheckpoint // Conversation checkpoint update
)
// DecodedServerMessage holds parsed data from an AgentServerMessage.
type DecodedServerMessage struct {
Type ServerMessageType
// For text/thinking deltas
Text string
// For KV messages
KvId uint32
BlobId []byte // hex-encoded blob ID
BlobData []byte // for setBlobArgs
// For exec messages
ExecMsgId uint32
ExecId string
// For MCP args
McpToolName string
McpToolCallId string
McpArgs map[string][]byte // arg name -> protobuf-encoded value
// For rejection context
Path string
Command string
WorkingDirectory string
Url string
// For other exec - the raw field number for building a response
ExecFieldNumber int
// For TokenDeltaUpdate
TokenDelta int64
// For conversation checkpoint update (raw bytes, not decoded)
CheckpointData []byte
}
// DecodeAgentServerMessage parses an AgentServerMessage and returns
// a structured representation of the first meaningful message found.
func DecodeAgentServerMessage(data []byte) (*DecodedServerMessage, error) {
msg := &DecodedServerMessage{Type: ServerMsgUnknown}
for len(data) > 0 {
num, typ, n := protowire.ConsumeTag(data)
if n < 0 {
return msg, fmt.Errorf("invalid tag")
}
data = data[n:]
switch typ {
case protowire.BytesType:
val, n := protowire.ConsumeBytes(data)
if n < 0 {
return msg, fmt.Errorf("invalid bytes field %d", num)
}
data = data[n:]
// Debug: log top-level ASM fields
log.Debugf("DecodeAgentServerMessage: found ASM field %d, len=%d", num, len(val))
switch num {
case ASM_InteractionUpdate:
log.Debugf("DecodeAgentServerMessage: calling decodeInteractionUpdate")
decodeInteractionUpdate(val, msg)
case ASM_ExecServerMessage:
log.Debugf("DecodeAgentServerMessage: calling decodeExecServerMessage")
decodeExecServerMessage(val, msg)
case ASM_KvServerMessage:
decodeKvServerMessage(val, msg)
case ASM_ConversationCheckpoint:
msg.Type = ServerMsgCheckpoint
msg.CheckpointData = append([]byte(nil), val...) // copy raw bytes
log.Debugf("DecodeAgentServerMessage: captured checkpoint %d bytes", len(val))
}
case protowire.VarintType:
_, n := protowire.ConsumeVarint(data)
if n < 0 {
return msg, fmt.Errorf("invalid varint field %d", num)
}
data = data[n:]
default:
// Skip unknown wire types
n := protowire.ConsumeFieldValue(num, typ, data)
if n < 0 {
return msg, fmt.Errorf("invalid field %d", num)
}
data = data[n:]
}
}
return msg, nil
}
func decodeInteractionUpdate(data []byte, msg *DecodedServerMessage) {
log.Debugf("decodeInteractionUpdate: input len=%d, hex=%x", len(data), data)
for len(data) > 0 {
num, typ, n := protowire.ConsumeTag(data)
if n < 0 {
log.Debugf("decodeInteractionUpdate: invalid tag, remaining=%x", data)
return
}
data = data[n:]
log.Debugf("decodeInteractionUpdate: field=%d wire=%d remaining=%d bytes", num, typ, len(data))
if typ == protowire.BytesType {
val, n := protowire.ConsumeBytes(data)
if n < 0 {
log.Debugf("decodeInteractionUpdate: invalid bytes field %d", num)
return
}
data = data[n:]
log.Debugf("decodeInteractionUpdate: field %d content len=%d, first 20 bytes: %x", num, len(val), val[:min(20, len(val))])
switch num {
case IU_TextDelta:
msg.Type = ServerMsgTextDelta
msg.Text = decodeStringField(val, TDU_Text)
log.Debugf("decodeInteractionUpdate: TextDelta text=%q", msg.Text)
case IU_ThinkingDelta:
msg.Type = ServerMsgThinkingDelta
msg.Text = decodeStringField(val, TKD_Text)
log.Debugf("decodeInteractionUpdate: ThinkingDelta text=%q", msg.Text)
case IU_ThinkingCompleted:
msg.Type = ServerMsgThinkingCompleted
log.Debugf("decodeInteractionUpdate: ThinkingCompleted")
case 2:
// tool_call_started - ignore but log
log.Debugf("decodeInteractionUpdate: ToolCallStarted (ignored)")
case 3:
// tool_call_completed - ignore but log
log.Debugf("decodeInteractionUpdate: ToolCallCompleted (ignored)")
case 8:
// token_delta - extract token count
msg.Type = ServerMsgTokenDelta
msg.TokenDelta = decodeVarintField(val, 1)
log.Debugf("decodeInteractionUpdate: TokenDeltaUpdate tokens=%d", msg.TokenDelta)
case 13:
// heartbeat from server
msg.Type = ServerMsgHeartbeat
case 14:
// turn_ended - critical: model finished generating
msg.Type = ServerMsgTurnEnded
log.Debugf("decodeInteractionUpdate: TurnEndedUpdate - stream should end")
case 16:
// step_started - ignore
log.Debugf("decodeInteractionUpdate: StepStartedUpdate (ignored)")
case 17:
// step_completed - ignore
log.Debugf("decodeInteractionUpdate: StepCompletedUpdate (ignored)")
default:
log.Debugf("decodeInteractionUpdate: unknown field %d", num)
}
} else {
n := protowire.ConsumeFieldValue(num, typ, data)
if n < 0 {
return
}
data = data[n:]
}
}
}
func decodeKvServerMessage(data []byte, msg *DecodedServerMessage) {
for len(data) > 0 {
num, typ, n := protowire.ConsumeTag(data)
if n < 0 {
return
}
data = data[n:]
switch typ {
case protowire.VarintType:
val, n := protowire.ConsumeVarint(data)
if n < 0 {
return
}
data = data[n:]
if num == KSM_Id {
msg.KvId = uint32(val)
}
case protowire.BytesType:
val, n := protowire.ConsumeBytes(data)
if n < 0 {
return
}
data = data[n:]
switch num {
case KSM_GetBlobArgs:
msg.Type = ServerMsgKvGetBlob
msg.BlobId = decodeBytesField(val, GBA_BlobId)
case KSM_SetBlobArgs:
msg.Type = ServerMsgKvSetBlob
decodeSetBlobArgs(val, msg)
}
default:
n := protowire.ConsumeFieldValue(num, typ, data)
if n < 0 {
return
}
data = data[n:]
}
}
}
func decodeSetBlobArgs(data []byte, msg *DecodedServerMessage) {
for len(data) > 0 {
num, typ, n := protowire.ConsumeTag(data)
if n < 0 {
return
}
data = data[n:]
if typ == protowire.BytesType {
val, n := protowire.ConsumeBytes(data)
if n < 0 {
return
}
data = data[n:]
switch num {
case SBA_BlobId:
msg.BlobId = val
case SBA_BlobData:
msg.BlobData = val
}
} else {
n := protowire.ConsumeFieldValue(num, typ, data)
if n < 0 {
return
}
data = data[n:]
}
}
}
func decodeExecServerMessage(data []byte, msg *DecodedServerMessage) {
for len(data) > 0 {
num, typ, n := protowire.ConsumeTag(data)
if n < 0 {
return
}
data = data[n:]
switch typ {
case protowire.VarintType:
val, n := protowire.ConsumeVarint(data)
if n < 0 {
return
}
data = data[n:]
if num == ESM_Id {
msg.ExecMsgId = uint32(val)
log.Debugf("decodeExecServerMessage: ESM_Id = %d", val)
}
case protowire.BytesType:
val, n := protowire.ConsumeBytes(data)
if n < 0 {
return
}
data = data[n:]
// Debug: log all fields found in ExecServerMessage
log.Debugf("decodeExecServerMessage: found field %d, len=%d, first 20 bytes: %x", num, len(val), val[:min(20, len(val))])
switch num {
case ESM_ExecId:
msg.ExecId = string(val)
log.Debugf("decodeExecServerMessage: ESM_ExecId = %q", msg.ExecId)
case ESM_RequestContextArgs:
msg.Type = ServerMsgExecRequestCtx
case ESM_McpArgs:
msg.Type = ServerMsgExecMcpArgs
decodeMcpArgs(val, msg)
case ESM_ShellArgs:
msg.Type = ServerMsgExecShellArgs
decodeShellArgs(val, msg)
case ESM_ShellStreamArgs:
msg.Type = ServerMsgExecShellStream
decodeShellArgs(val, msg)
case ESM_ReadArgs:
msg.Type = ServerMsgExecReadArgs
msg.Path = decodeStringField(val, RA_Path)
case ESM_WriteArgs:
msg.Type = ServerMsgExecWriteArgs
msg.Path = decodeStringField(val, WA_Path)
case ESM_DeleteArgs:
msg.Type = ServerMsgExecDeleteArgs
msg.Path = decodeStringField(val, DA_Path)
case ESM_LsArgs:
msg.Type = ServerMsgExecLsArgs
msg.Path = decodeStringField(val, LA_Path)
case ESM_GrepArgs:
msg.Type = ServerMsgExecGrepArgs
case ESM_FetchArgs:
msg.Type = ServerMsgExecFetchArgs
msg.Url = decodeStringField(val, FA_Url)
case ESM_DiagnosticsArgs:
msg.Type = ServerMsgExecDiagnostics
case ESM_BackgroundShellSpawn:
msg.Type = ServerMsgExecBgShellSpawn
decodeShellArgs(val, msg) // same structure
case ESM_WriteShellStdinArgs:
msg.Type = ServerMsgExecWriteShellStdin
default:
// Unknown exec types - only set if we haven't identified the type yet
// (other fields like span_context (19) come after the exec type field)
if msg.Type == ServerMsgUnknown {
msg.Type = ServerMsgExecOther
msg.ExecFieldNumber = int(num)
}
}
default:
n := protowire.ConsumeFieldValue(num, typ, data)
if n < 0 {
return
}
data = data[n:]
}
}
}
func decodeMcpArgs(data []byte, msg *DecodedServerMessage) {
msg.McpArgs = make(map[string][]byte)
for len(data) > 0 {
num, typ, n := protowire.ConsumeTag(data)
if n < 0 {
return
}
data = data[n:]
if typ == protowire.BytesType {
val, n := protowire.ConsumeBytes(data)
if n < 0 {
return
}
data = data[n:]
switch num {
case MCA_Name:
msg.McpToolName = string(val)
case MCA_Args:
// Map entries are encoded as submessages with key=1, value=2
decodeMapEntry(val, msg.McpArgs)
case MCA_ToolCallId:
msg.McpToolCallId = string(val)
case MCA_ToolName:
// ToolName takes precedence if present
if msg.McpToolName == "" || string(val) != "" {
msg.McpToolName = string(val)
}
}
} else {
n := protowire.ConsumeFieldValue(num, typ, data)
if n < 0 {
return
}
data = data[n:]
}
}
}
func decodeMapEntry(data []byte, m map[string][]byte) {
var key string
var value []byte
for len(data) > 0 {
num, typ, n := protowire.ConsumeTag(data)
if n < 0 {
return
}
data = data[n:]
if typ == protowire.BytesType {
val, n := protowire.ConsumeBytes(data)
if n < 0 {
return
}
data = data[n:]
if num == 1 {
key = string(val)
} else if num == 2 {
value = append([]byte(nil), val...)
}
} else {
n := protowire.ConsumeFieldValue(num, typ, data)
if n < 0 {
return
}
data = data[n:]
}
}
if key != "" {
m[key] = value
}
}
func decodeShellArgs(data []byte, msg *DecodedServerMessage) {
for len(data) > 0 {
num, typ, n := protowire.ConsumeTag(data)
if n < 0 {
return
}
data = data[n:]
if typ == protowire.BytesType {
val, n := protowire.ConsumeBytes(data)
if n < 0 {
return
}
data = data[n:]
switch num {
case SHA_Command:
msg.Command = string(val)
case SHA_WorkingDirectory:
msg.WorkingDirectory = string(val)
}
} else {
n := protowire.ConsumeFieldValue(num, typ, data)
if n < 0 {
return
}
data = data[n:]
}
}
}
// --- Helper decoders ---
// decodeStringField extracts a string from the first matching field in a submessage.
func decodeStringField(data []byte, targetField protowire.Number) string {
for len(data) > 0 {
num, typ, n := protowire.ConsumeTag(data)
if n < 0 {
return ""
}
data = data[n:]
if typ == protowire.BytesType {
val, n := protowire.ConsumeBytes(data)
if n < 0 {
return ""
}
data = data[n:]
if num == targetField {
return string(val)
}
} else {
n := protowire.ConsumeFieldValue(num, typ, data)
if n < 0 {
return ""
}
data = data[n:]
}
}
return ""
}
// decodeBytesField extracts bytes from the first matching field in a submessage.
func decodeBytesField(data []byte, targetField protowire.Number) []byte {
for len(data) > 0 {
num, typ, n := protowire.ConsumeTag(data)
if n < 0 {
return nil
}
data = data[n:]
if typ == protowire.BytesType {
val, n := protowire.ConsumeBytes(data)
if n < 0 {
return nil
}
data = data[n:]
if num == targetField {
return append([]byte(nil), val...)
}
} else {
n := protowire.ConsumeFieldValue(num, typ, data)
if n < 0 {
return nil
}
data = data[n:]
}
}
return nil
}
// decodeVarintField extracts an int64 from the first matching varint field in a submessage.
func decodeVarintField(data []byte, targetField protowire.Number) int64 {
for len(data) > 0 {
num, typ, n := protowire.ConsumeTag(data)
if n < 0 {
return 0
}
data = data[n:]
if typ == protowire.VarintType {
val, n := protowire.ConsumeVarint(data)
if n < 0 {
return 0
}
data = data[n:]
if num == targetField {
return int64(val)
}
} else {
n := protowire.ConsumeFieldValue(num, typ, data)
if n < 0 {
return 0
}
data = data[n:]
}
}
return 0
}
// BlobIdHex returns the hex string of a blob ID for use as a map key.
func BlobIdHex(blobId []byte) string {
return hex.EncodeToString(blobId)
}

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,664 @@
// Package proto provides protobuf encoding for Cursor's gRPC API,
// using dynamicpb with the embedded FileDescriptorProto from agent.proto.
// This mirrors the cursor-auth TS plugin's use of @bufbuild/protobuf create()+toBinary().
package proto
import (
"crypto/sha256"
"encoding/hex"
"encoding/json"
"fmt"
log "github.com/sirupsen/logrus"
"google.golang.org/protobuf/encoding/protowire"
"google.golang.org/protobuf/proto"
"google.golang.org/protobuf/reflect/protoreflect"
"google.golang.org/protobuf/types/dynamicpb"
"google.golang.org/protobuf/types/known/structpb"
)
// --- Public types ---
// RunRequestParams holds all data needed to build an AgentRunRequest.
type RunRequestParams struct {
ModelId string
SystemPrompt string
UserText string
MessageId string
ConversationId string
Images []ImageData
Turns []TurnData
McpTools []McpToolDef
BlobStore map[string][]byte // hex(sha256) -> data, populated during encoding
RawCheckpoint []byte // if non-nil, use as conversation_state directly (from server checkpoint)
}
type ImageData struct {
MimeType string
Data []byte
}
type TurnData struct {
UserText string
AssistantText string
}
type McpToolDef struct {
Name string
Description string
InputSchema json.RawMessage
}
// --- Helper: create a dynamic message and set fields ---
func newMsg(name string) *dynamicpb.Message {
return dynamicpb.NewMessage(Msg(name))
}
func field(msg *dynamicpb.Message, name string) protoreflect.FieldDescriptor {
return msg.Descriptor().Fields().ByName(protoreflect.Name(name))
}
func setStr(msg *dynamicpb.Message, name, val string) {
if val != "" {
msg.Set(field(msg, name), protoreflect.ValueOfString(val))
}
}
func setBytes(msg *dynamicpb.Message, name string, val []byte) {
if len(val) > 0 {
msg.Set(field(msg, name), protoreflect.ValueOfBytes(val))
}
}
func setUint32(msg *dynamicpb.Message, name string, val uint32) {
msg.Set(field(msg, name), protoreflect.ValueOfUint32(val))
}
func setBool(msg *dynamicpb.Message, name string, val bool) {
msg.Set(field(msg, name), protoreflect.ValueOfBool(val))
}
func setMsg(msg *dynamicpb.Message, name string, sub *dynamicpb.Message) {
msg.Set(field(msg, name), protoreflect.ValueOfMessage(sub.ProtoReflect()))
}
func marshal(msg *dynamicpb.Message) []byte {
b, err := proto.Marshal(msg)
if err != nil {
panic("cursor proto marshal: " + err.Error())
}
return b
}
// --- Encode functions mirroring cursor-fetch.ts ---
// EncodeHeartbeat returns an encoded AgentClientMessage with clientHeartbeat.
// Mirrors: create(AgentClientMessageSchema, { message: { case: 'clientHeartbeat', value: create(ClientHeartbeatSchema, {}) } })
func EncodeHeartbeat() []byte {
hb := newMsg("ClientHeartbeat")
acm := newMsg("AgentClientMessage")
setMsg(acm, "client_heartbeat", hb)
return marshal(acm)
}
// EncodeRunRequest builds a full AgentClientMessage wrapping an AgentRunRequest.
// Mirrors buildCursorRequest() in cursor-fetch.ts.
// If p.RawCheckpoint is set, it is used directly as the conversation_state bytes
// (from a previous conversation_checkpoint_update), skipping manual turn construction.
func EncodeRunRequest(p *RunRequestParams) []byte {
if p.RawCheckpoint != nil {
return encodeRunRequestWithCheckpoint(p)
}
if p.BlobStore == nil {
p.BlobStore = make(map[string][]byte)
}
// --- Conversation turns ---
// Each turn is serialized as bytes (ConversationTurnStructure → bytes)
var turnBytes [][]byte
for _, turn := range p.Turns {
// UserMessage for this turn
um := newMsg("UserMessage")
setStr(um, "text", turn.UserText)
setStr(um, "message_id", generateId())
umBytes := marshal(um)
// Steps (assistant response)
var stepBytes [][]byte
if turn.AssistantText != "" {
am := newMsg("AssistantMessage")
setStr(am, "text", turn.AssistantText)
step := newMsg("ConversationStep")
setMsg(step, "assistant_message", am)
stepBytes = append(stepBytes, marshal(step))
}
// AgentConversationTurnStructure (fields are bytes, not submessages)
agentTurn := newMsg("AgentConversationTurnStructure")
setBytes(agentTurn, "user_message", umBytes)
for _, sb := range stepBytes {
stepsField := field(agentTurn, "steps")
list := agentTurn.Mutable(stepsField).List()
list.Append(protoreflect.ValueOfBytes(sb))
}
// ConversationTurnStructure (oneof turn → agentConversationTurn)
cts := newMsg("ConversationTurnStructure")
setMsg(cts, "agent_conversation_turn", agentTurn)
turnBytes = append(turnBytes, marshal(cts))
}
// --- System prompt blob ---
systemJSON, _ := json.Marshal(map[string]string{"role": "system", "content": p.SystemPrompt})
blobId := sha256Sum(systemJSON)
p.BlobStore[hex.EncodeToString(blobId)] = systemJSON
// --- ConversationStateStructure ---
css := newMsg("ConversationStateStructure")
// rootPromptMessagesJson: repeated bytes
rootField := field(css, "root_prompt_messages_json")
rootList := css.Mutable(rootField).List()
rootList.Append(protoreflect.ValueOfBytes(blobId))
// turns: repeated bytes (field 8) + turns_old (field 2) for compatibility
turnsField := field(css, "turns")
turnsList := css.Mutable(turnsField).List()
for _, tb := range turnBytes {
turnsList.Append(protoreflect.ValueOfBytes(tb))
}
turnsOldField := field(css, "turns_old")
if turnsOldField != nil {
turnsOldList := css.Mutable(turnsOldField).List()
for _, tb := range turnBytes {
turnsOldList.Append(protoreflect.ValueOfBytes(tb))
}
}
// --- UserMessage (current) ---
userMessage := newMsg("UserMessage")
setStr(userMessage, "text", p.UserText)
setStr(userMessage, "message_id", p.MessageId)
// Images via SelectedContext
if len(p.Images) > 0 {
sc := newMsg("SelectedContext")
imgsField := field(sc, "selected_images")
imgsList := sc.Mutable(imgsField).List()
for _, img := range p.Images {
si := newMsg("SelectedImage")
setStr(si, "uuid", generateId())
setStr(si, "mime_type", img.MimeType)
setBytes(si, "data", img.Data)
imgsList.Append(protoreflect.ValueOfMessage(si.ProtoReflect()))
}
setMsg(userMessage, "selected_context", sc)
}
// --- UserMessageAction ---
uma := newMsg("UserMessageAction")
setMsg(uma, "user_message", userMessage)
// --- ConversationAction ---
ca := newMsg("ConversationAction")
setMsg(ca, "user_message_action", uma)
// --- ModelDetails ---
md := newMsg("ModelDetails")
setStr(md, "model_id", p.ModelId)
setStr(md, "display_model_id", p.ModelId)
setStr(md, "display_name", p.ModelId)
// --- AgentRunRequest ---
arr := newMsg("AgentRunRequest")
setMsg(arr, "conversation_state", css)
setMsg(arr, "action", ca)
setMsg(arr, "model_details", md)
setStr(arr, "conversation_id", p.ConversationId)
// McpTools
if len(p.McpTools) > 0 {
mcpTools := newMsg("McpTools")
toolsField := field(mcpTools, "mcp_tools")
toolsList := mcpTools.Mutable(toolsField).List()
for _, tool := range p.McpTools {
td := newMsg("McpToolDefinition")
setStr(td, "name", tool.Name)
setStr(td, "description", tool.Description)
if len(tool.InputSchema) > 0 {
setBytes(td, "input_schema", jsonToProtobufValueBytes(tool.InputSchema))
}
setStr(td, "provider_identifier", "proxy")
setStr(td, "tool_name", tool.Name)
toolsList.Append(protoreflect.ValueOfMessage(td.ProtoReflect()))
}
setMsg(arr, "mcp_tools", mcpTools)
}
// --- AgentClientMessage ---
acm := newMsg("AgentClientMessage")
setMsg(acm, "run_request", arr)
return marshal(acm)
}
// encodeRunRequestWithCheckpoint builds an AgentClientMessage using a raw checkpoint
// as conversation_state. The checkpoint bytes are embedded directly without deserialization.
func encodeRunRequestWithCheckpoint(p *RunRequestParams) []byte {
// Build UserMessage
userMessage := newMsg("UserMessage")
setStr(userMessage, "text", p.UserText)
setStr(userMessage, "message_id", p.MessageId)
if len(p.Images) > 0 {
sc := newMsg("SelectedContext")
imgsField := field(sc, "selected_images")
imgsList := sc.Mutable(imgsField).List()
for _, img := range p.Images {
si := newMsg("SelectedImage")
setStr(si, "uuid", generateId())
setStr(si, "mime_type", img.MimeType)
setBytes(si, "data", img.Data)
imgsList.Append(protoreflect.ValueOfMessage(si.ProtoReflect()))
}
setMsg(userMessage, "selected_context", sc)
}
// Build ConversationAction with UserMessageAction
uma := newMsg("UserMessageAction")
setMsg(uma, "user_message", userMessage)
ca := newMsg("ConversationAction")
setMsg(ca, "user_message_action", uma)
caBytes := marshal(ca)
// Build ModelDetails
md := newMsg("ModelDetails")
setStr(md, "model_id", p.ModelId)
setStr(md, "display_model_id", p.ModelId)
setStr(md, "display_name", p.ModelId)
mdBytes := marshal(md)
// Build McpTools
var mcpToolsBytes []byte
if len(p.McpTools) > 0 {
mcpTools := newMsg("McpTools")
toolsField := field(mcpTools, "mcp_tools")
toolsList := mcpTools.Mutable(toolsField).List()
for _, tool := range p.McpTools {
td := newMsg("McpToolDefinition")
setStr(td, "name", tool.Name)
setStr(td, "description", tool.Description)
if len(tool.InputSchema) > 0 {
setBytes(td, "input_schema", jsonToProtobufValueBytes(tool.InputSchema))
}
setStr(td, "provider_identifier", "proxy")
setStr(td, "tool_name", tool.Name)
toolsList.Append(protoreflect.ValueOfMessage(td.ProtoReflect()))
}
mcpToolsBytes = marshal(mcpTools)
}
// Manually assemble AgentRunRequest using protowire to embed raw checkpoint
var arrBuf []byte
// field 1: conversation_state = raw checkpoint bytes (length-delimited)
arrBuf = protowire.AppendTag(arrBuf, ARR_ConversationState, protowire.BytesType)
arrBuf = protowire.AppendBytes(arrBuf, p.RawCheckpoint)
// field 2: action = ConversationAction
arrBuf = protowire.AppendTag(arrBuf, ARR_Action, protowire.BytesType)
arrBuf = protowire.AppendBytes(arrBuf, caBytes)
// field 3: model_details = ModelDetails
arrBuf = protowire.AppendTag(arrBuf, ARR_ModelDetails, protowire.BytesType)
arrBuf = protowire.AppendBytes(arrBuf, mdBytes)
// field 4: mcp_tools = McpTools
if len(mcpToolsBytes) > 0 {
arrBuf = protowire.AppendTag(arrBuf, ARR_McpTools, protowire.BytesType)
arrBuf = protowire.AppendBytes(arrBuf, mcpToolsBytes)
}
// field 5: conversation_id = string
if p.ConversationId != "" {
arrBuf = protowire.AppendTag(arrBuf, ARR_ConversationId, protowire.BytesType)
arrBuf = protowire.AppendString(arrBuf, p.ConversationId)
}
// Wrap in AgentClientMessage field 1 (run_request)
var acmBuf []byte
acmBuf = protowire.AppendTag(acmBuf, ACM_RunRequest, protowire.BytesType)
acmBuf = protowire.AppendBytes(acmBuf, arrBuf)
log.Debugf("cursor encode: built RunRequest with checkpoint (%d bytes), total=%d bytes", len(p.RawCheckpoint), len(acmBuf))
return acmBuf
}
// ResumeRequestParams holds data for a ResumeAction request.
type ResumeRequestParams struct {
ModelId string
ConversationId string
McpTools []McpToolDef
}
// EncodeResumeRequest builds an AgentClientMessage with ResumeAction.
// Used to resume a conversation by conversation_id without re-sending full history.
func EncodeResumeRequest(p *ResumeRequestParams) []byte {
// RequestContext with tools
rc := newMsg("RequestContext")
if len(p.McpTools) > 0 {
toolsField := field(rc, "tools")
toolsList := rc.Mutable(toolsField).List()
for _, tool := range p.McpTools {
td := newMsg("McpToolDefinition")
setStr(td, "name", tool.Name)
setStr(td, "description", tool.Description)
if len(tool.InputSchema) > 0 {
setBytes(td, "input_schema", jsonToProtobufValueBytes(tool.InputSchema))
}
setStr(td, "provider_identifier", "proxy")
setStr(td, "tool_name", tool.Name)
toolsList.Append(protoreflect.ValueOfMessage(td.ProtoReflect()))
}
}
// ResumeAction
ra := newMsg("ResumeAction")
setMsg(ra, "request_context", rc)
// ConversationAction with resume_action
ca := newMsg("ConversationAction")
setMsg(ca, "resume_action", ra)
// ModelDetails
md := newMsg("ModelDetails")
setStr(md, "model_id", p.ModelId)
setStr(md, "display_model_id", p.ModelId)
setStr(md, "display_name", p.ModelId)
// AgentRunRequest — no conversation_state needed for resume
arr := newMsg("AgentRunRequest")
setMsg(arr, "action", ca)
setMsg(arr, "model_details", md)
setStr(arr, "conversation_id", p.ConversationId)
// McpTools at top level
if len(p.McpTools) > 0 {
mcpTools := newMsg("McpTools")
toolsField := field(mcpTools, "mcp_tools")
toolsList := mcpTools.Mutable(toolsField).List()
for _, tool := range p.McpTools {
td := newMsg("McpToolDefinition")
setStr(td, "name", tool.Name)
setStr(td, "description", tool.Description)
if len(tool.InputSchema) > 0 {
setBytes(td, "input_schema", jsonToProtobufValueBytes(tool.InputSchema))
}
setStr(td, "provider_identifier", "proxy")
setStr(td, "tool_name", tool.Name)
toolsList.Append(protoreflect.ValueOfMessage(td.ProtoReflect()))
}
setMsg(arr, "mcp_tools", mcpTools)
}
acm := newMsg("AgentClientMessage")
setMsg(acm, "run_request", arr)
return marshal(acm)
}
// --- KV response encoders ---
// Mirrors handleKvMessage() in cursor-fetch.ts
// EncodeKvGetBlobResult responds to a getBlobArgs request.
func EncodeKvGetBlobResult(kvId uint32, blobData []byte) []byte {
result := newMsg("GetBlobResult")
if blobData != nil {
setBytes(result, "blob_data", blobData)
}
kvc := newMsg("KvClientMessage")
setUint32(kvc, "id", kvId)
setMsg(kvc, "get_blob_result", result)
acm := newMsg("AgentClientMessage")
setMsg(acm, "kv_client_message", kvc)
return marshal(acm)
}
// EncodeKvSetBlobResult responds to a setBlobArgs request.
func EncodeKvSetBlobResult(kvId uint32) []byte {
result := newMsg("SetBlobResult")
kvc := newMsg("KvClientMessage")
setUint32(kvc, "id", kvId)
setMsg(kvc, "set_blob_result", result)
acm := newMsg("AgentClientMessage")
setMsg(acm, "kv_client_message", kvc)
return marshal(acm)
}
// --- Exec response encoders ---
// Mirrors handleExecMessage() and sendExec() in cursor-fetch.ts
// EncodeExecRequestContextResult responds to requestContextArgs with tool definitions.
func EncodeExecRequestContextResult(execMsgId uint32, execId string, tools []McpToolDef) []byte {
// RequestContext with tools
rc := newMsg("RequestContext")
if len(tools) > 0 {
toolsField := field(rc, "tools")
toolsList := rc.Mutable(toolsField).List()
for _, tool := range tools {
td := newMsg("McpToolDefinition")
setStr(td, "name", tool.Name)
setStr(td, "description", tool.Description)
if len(tool.InputSchema) > 0 {
setBytes(td, "input_schema", jsonToProtobufValueBytes(tool.InputSchema))
}
setStr(td, "provider_identifier", "proxy")
setStr(td, "tool_name", tool.Name)
toolsList.Append(protoreflect.ValueOfMessage(td.ProtoReflect()))
}
}
// RequestContextSuccess
rcs := newMsg("RequestContextSuccess")
setMsg(rcs, "request_context", rc)
// RequestContextResult (oneof success)
rcr := newMsg("RequestContextResult")
setMsg(rcr, "success", rcs)
return encodeExecClientMsg(execMsgId, execId, "request_context_result", rcr)
}
// EncodeExecMcpResult responds with MCP tool result.
func EncodeExecMcpResult(execMsgId uint32, execId string, content string, isError bool) []byte {
textContent := newMsg("McpTextContent")
setStr(textContent, "text", content)
contentItem := newMsg("McpToolResultContentItem")
setMsg(contentItem, "text", textContent)
success := newMsg("McpSuccess")
contentField := field(success, "content")
contentList := success.Mutable(contentField).List()
contentList.Append(protoreflect.ValueOfMessage(contentItem.ProtoReflect()))
setBool(success, "is_error", isError)
result := newMsg("McpResult")
setMsg(result, "success", success)
return encodeExecClientMsg(execMsgId, execId, "mcp_result", result)
}
// EncodeExecMcpError responds with MCP error.
func EncodeExecMcpError(execMsgId uint32, execId string, errMsg string) []byte {
mcpErr := newMsg("McpError")
setStr(mcpErr, "error", errMsg)
result := newMsg("McpResult")
setMsg(result, "error", mcpErr)
return encodeExecClientMsg(execMsgId, execId, "mcp_result", result)
}
// --- Rejection encoders (mirror handleExecMessage rejections) ---
func EncodeExecReadRejected(execMsgId uint32, execId string, path, reason string) []byte {
rej := newMsg("ReadRejected")
setStr(rej, "path", path)
setStr(rej, "reason", reason)
result := newMsg("ReadResult")
setMsg(result, "rejected", rej)
return encodeExecClientMsg(execMsgId, execId, "read_result", result)
}
func EncodeExecShellRejected(execMsgId uint32, execId string, command, workDir, reason string) []byte {
rej := newMsg("ShellRejected")
setStr(rej, "command", command)
setStr(rej, "working_directory", workDir)
setStr(rej, "reason", reason)
result := newMsg("ShellResult")
setMsg(result, "rejected", rej)
return encodeExecClientMsg(execMsgId, execId, "shell_result", result)
}
func EncodeExecWriteRejected(execMsgId uint32, execId string, path, reason string) []byte {
rej := newMsg("WriteRejected")
setStr(rej, "path", path)
setStr(rej, "reason", reason)
result := newMsg("WriteResult")
setMsg(result, "rejected", rej)
return encodeExecClientMsg(execMsgId, execId, "write_result", result)
}
func EncodeExecDeleteRejected(execMsgId uint32, execId string, path, reason string) []byte {
rej := newMsg("DeleteRejected")
setStr(rej, "path", path)
setStr(rej, "reason", reason)
result := newMsg("DeleteResult")
setMsg(result, "rejected", rej)
return encodeExecClientMsg(execMsgId, execId, "delete_result", result)
}
func EncodeExecLsRejected(execMsgId uint32, execId string, path, reason string) []byte {
rej := newMsg("LsRejected")
setStr(rej, "path", path)
setStr(rej, "reason", reason)
result := newMsg("LsResult")
setMsg(result, "rejected", rej)
return encodeExecClientMsg(execMsgId, execId, "ls_result", result)
}
func EncodeExecGrepError(execMsgId uint32, execId string, errMsg string) []byte {
grepErr := newMsg("GrepError")
setStr(grepErr, "error", errMsg)
result := newMsg("GrepResult")
setMsg(result, "error", grepErr)
return encodeExecClientMsg(execMsgId, execId, "grep_result", result)
}
func EncodeExecFetchError(execMsgId uint32, execId string, url, errMsg string) []byte {
fetchErr := newMsg("FetchError")
setStr(fetchErr, "url", url)
setStr(fetchErr, "error", errMsg)
result := newMsg("FetchResult")
setMsg(result, "error", fetchErr)
return encodeExecClientMsg(execMsgId, execId, "fetch_result", result)
}
func EncodeExecDiagnosticsResult(execMsgId uint32, execId string) []byte {
result := newMsg("DiagnosticsResult")
return encodeExecClientMsg(execMsgId, execId, "diagnostics_result", result)
}
func EncodeExecBackgroundShellSpawnRejected(execMsgId uint32, execId string, command, workDir, reason string) []byte {
rej := newMsg("ShellRejected")
setStr(rej, "command", command)
setStr(rej, "working_directory", workDir)
setStr(rej, "reason", reason)
result := newMsg("BackgroundShellSpawnResult")
setMsg(result, "rejected", rej)
return encodeExecClientMsg(execMsgId, execId, "background_shell_spawn_result", result)
}
func EncodeExecWriteShellStdinError(execMsgId uint32, execId string, errMsg string) []byte {
wsErr := newMsg("WriteShellStdinError")
setStr(wsErr, "error", errMsg)
result := newMsg("WriteShellStdinResult")
setMsg(result, "error", wsErr)
return encodeExecClientMsg(execMsgId, execId, "write_shell_stdin_result", result)
}
// encodeExecClientMsg wraps an exec result in AgentClientMessage.
// Mirrors sendExec() in cursor-fetch.ts.
func encodeExecClientMsg(id uint32, execId string, resultFieldName string, resultMsg *dynamicpb.Message) []byte {
ecm := newMsg("ExecClientMessage")
setUint32(ecm, "id", id)
// Force set exec_id even if empty - Cursor requires this field to be set
ecm.Set(field(ecm, "exec_id"), protoreflect.ValueOfString(execId))
// Debug: check if field exists
fd := field(ecm, resultFieldName)
if fd == nil {
panic(fmt.Sprintf("field %q NOT FOUND in ExecClientMessage! Available fields: %v", resultFieldName, listFields(ecm)))
}
// Debug: log the actual field being set
log.Debugf("encodeExecClientMsg: setting field %q (number=%d, kind=%s)", fd.Name(), fd.Number(), fd.Kind())
ecm.Set(fd, protoreflect.ValueOfMessage(resultMsg.ProtoReflect()))
acm := newMsg("AgentClientMessage")
setMsg(acm, "exec_client_message", ecm)
return marshal(acm)
}
func listFields(msg *dynamicpb.Message) []string {
var names []string
for i := 0; i < msg.Descriptor().Fields().Len(); i++ {
names = append(names, string(msg.Descriptor().Fields().Get(i).Name()))
}
return names
}
// --- Utilities ---
// jsonToProtobufValueBytes converts a JSON schema (json.RawMessage) to protobuf Value binary.
// This mirrors the TS pattern: toBinary(ValueSchema, fromJson(ValueSchema, jsonSchema))
func jsonToProtobufValueBytes(jsonData json.RawMessage) []byte {
if len(jsonData) == 0 {
return nil
}
var v interface{}
if err := json.Unmarshal(jsonData, &v); err != nil {
return jsonData // fallback to raw JSON if parsing fails
}
pbVal, err := structpb.NewValue(v)
if err != nil {
return jsonData // fallback
}
b, err := proto.Marshal(pbVal)
if err != nil {
return jsonData // fallback
}
return b
}
// ProtobufValueBytesToJSON converts protobuf Value binary back to JSON.
// This mirrors the TS pattern: toJson(ValueSchema, fromBinary(ValueSchema, value))
func ProtobufValueBytesToJSON(data []byte) (interface{}, error) {
val := &structpb.Value{}
if err := proto.Unmarshal(data, val); err != nil {
return nil, err
}
return val.AsInterface(), nil
}
func sha256Sum(data []byte) []byte {
h := sha256.Sum256(data)
return h[:]
}
var idCounter uint64
func generateId() string {
idCounter++
h := sha256.Sum256([]byte{byte(idCounter), byte(idCounter >> 8), byte(idCounter >> 16)})
return hex.EncodeToString(h[:16])
}

View File

@@ -0,0 +1,332 @@
// Package proto provides hand-rolled protobuf encode/decode for Cursor's gRPC API.
// Field numbers are extracted from the TypeScript generated proto/agent_pb.ts in alma-plugins/cursor-auth.
package proto
// AgentClientMessage (msg 118) oneof "message"
const (
ACM_RunRequest = 1 // AgentRunRequest
ACM_ExecClientMessage = 2 // ExecClientMessage
ACM_KvClientMessage = 3 // KvClientMessage
ACM_ConversationAction = 4 // ConversationAction
ACM_ExecClientControlMsg = 5 // ExecClientControlMessage
ACM_InteractionResponse = 6 // InteractionResponse
ACM_ClientHeartbeat = 7 // ClientHeartbeat
)
// AgentServerMessage (msg 119) oneof "message"
const (
ASM_InteractionUpdate = 1 // InteractionUpdate
ASM_ExecServerMessage = 2 // ExecServerMessage
ASM_ConversationCheckpoint = 3 // ConversationStateStructure
ASM_KvServerMessage = 4 // KvServerMessage
ASM_ExecServerControlMessage = 5 // ExecServerControlMessage
ASM_InteractionQuery = 7 // InteractionQuery
)
// AgentRunRequest (msg 91)
const (
ARR_ConversationState = 1 // ConversationStateStructure
ARR_Action = 2 // ConversationAction
ARR_ModelDetails = 3 // ModelDetails
ARR_McpTools = 4 // McpTools
ARR_ConversationId = 5 // string (optional)
)
// ConversationStateStructure (msg 83)
const (
CSS_RootPromptMessagesJson = 1 // repeated bytes
CSS_TurnsOld = 2 // repeated bytes (deprecated)
CSS_Todos = 3 // repeated bytes
CSS_PendingToolCalls = 4 // repeated string
CSS_Turns = 8 // repeated bytes (CURRENT field for turns)
CSS_PreviousWorkspaceUris = 9 // repeated string
CSS_SelfSummaryCount = 17 // uint32
CSS_ReadPaths = 18 // repeated string
)
// ConversationAction (msg 54) oneof "action"
const (
CA_UserMessageAction = 1 // UserMessageAction
)
// UserMessageAction (msg 55)
const (
UMA_UserMessage = 1 // UserMessage
)
// UserMessage (msg 63)
const (
UM_Text = 1 // string
UM_MessageId = 2 // string
UM_SelectedContext = 3 // SelectedContext (optional)
)
// SelectedContext
const (
SC_SelectedImages = 1 // repeated SelectedImage
)
// SelectedImage
const (
SI_BlobId = 1 // bytes (oneof dataOrBlobId)
SI_Uuid = 2 // string
SI_Path = 3 // string
SI_MimeType = 7 // string
SI_Data = 8 // bytes (oneof dataOrBlobId)
)
// ModelDetails (msg 88)
const (
MD_ModelId = 1 // string
MD_ThinkingDetails = 2 // ThinkingDetails (optional)
MD_DisplayModelId = 3 // string
MD_DisplayName = 4 // string
)
// McpTools (msg 307)
const (
MT_McpTools = 1 // repeated McpToolDefinition
)
// McpToolDefinition (msg 306)
const (
MTD_Name = 1 // string
MTD_Description = 2 // string
MTD_InputSchema = 3 // bytes
MTD_ProviderIdentifier = 4 // string
MTD_ToolName = 5 // string
)
// ConversationTurnStructure (msg 70) oneof "turn"
const (
CTS_AgentConversationTurn = 1 // AgentConversationTurnStructure
)
// AgentConversationTurnStructure (msg 72)
const (
ACTS_UserMessage = 1 // bytes (serialized UserMessage)
ACTS_Steps = 2 // repeated bytes (serialized ConversationStep)
)
// ConversationStep (msg 53) oneof "message"
const (
CS_AssistantMessage = 1 // AssistantMessage
)
// AssistantMessage
const (
AM_Text = 1 // string
)
// --- Server-side message fields ---
// InteractionUpdate oneof "message"
const (
IU_TextDelta = 1 // TextDeltaUpdate
IU_ThinkingDelta = 4 // ThinkingDeltaUpdate
IU_ThinkingCompleted = 5 // ThinkingCompletedUpdate
)
// TextDeltaUpdate (msg 92)
const (
TDU_Text = 1 // string
)
// ThinkingDeltaUpdate (msg 97)
const (
TKD_Text = 1 // string
)
// KvServerMessage (msg 271)
const (
KSM_Id = 1 // uint32
KSM_GetBlobArgs = 2 // GetBlobArgs
KSM_SetBlobArgs = 3 // SetBlobArgs
)
// GetBlobArgs (msg 267)
const (
GBA_BlobId = 1 // bytes
)
// SetBlobArgs (msg 269)
const (
SBA_BlobId = 1 // bytes
SBA_BlobData = 2 // bytes
)
// KvClientMessage (msg 272)
const (
KCM_Id = 1 // uint32
KCM_GetBlobResult = 2 // GetBlobResult
KCM_SetBlobResult = 3 // SetBlobResult
)
// GetBlobResult (msg 268)
const (
GBR_BlobData = 1 // bytes (optional)
)
// ExecServerMessage
const (
ESM_Id = 1 // uint32
ESM_ExecId = 15 // string
// oneof message:
ESM_ShellArgs = 2 // ShellArgs
ESM_WriteArgs = 3 // WriteArgs
ESM_DeleteArgs = 4 // DeleteArgs
ESM_GrepArgs = 5 // GrepArgs
ESM_ReadArgs = 7 // ReadArgs (NOTE: 6 is skipped)
ESM_LsArgs = 8 // LsArgs
ESM_DiagnosticsArgs = 9 // DiagnosticsArgs
ESM_RequestContextArgs = 10 // RequestContextArgs
ESM_McpArgs = 11 // McpArgs
ESM_ShellStreamArgs = 14 // ShellArgs (stream variant)
ESM_BackgroundShellSpawn = 16 // BackgroundShellSpawnArgs
ESM_FetchArgs = 20 // FetchArgs
ESM_WriteShellStdinArgs = 23 // WriteShellStdinArgs
)
// ExecClientMessage
const (
ECM_Id = 1 // uint32
ECM_ExecId = 15 // string
// oneof message (mirrors server fields):
ECM_ShellResult = 2
ECM_WriteResult = 3
ECM_DeleteResult = 4
ECM_GrepResult = 5
ECM_ReadResult = 7
ECM_LsResult = 8
ECM_DiagnosticsResult = 9
ECM_RequestContextResult = 10
ECM_McpResult = 11
ECM_ShellStream = 14
ECM_BackgroundShellSpawnRes = 16
ECM_FetchResult = 20
ECM_WriteShellStdinResult = 23
)
// McpArgs
const (
MCA_Name = 1 // string
MCA_Args = 2 // map<string, bytes>
MCA_ToolCallId = 3 // string
MCA_ProviderIdentifier = 4 // string
MCA_ToolName = 5 // string
)
// RequestContextResult oneof "result"
const (
RCR_Success = 1 // RequestContextSuccess
RCR_Error = 2 // RequestContextError
)
// RequestContextSuccess (msg 337)
const (
RCS_RequestContext = 1 // RequestContext
)
// RequestContext
const (
RC_Rules = 2 // repeated CursorRule
RC_Tools = 7 // repeated McpToolDefinition
)
// McpResult oneof "result"
const (
MCR_Success = 1 // McpSuccess
MCR_Error = 2 // McpError
MCR_Rejected = 3 // McpRejected
)
// McpSuccess (msg 290)
const (
MCS_Content = 1 // repeated McpToolResultContentItem
MCS_IsError = 2 // bool
)
// McpToolResultContentItem oneof "content"
const (
MTRCI_Text = 1 // McpTextContent
)
// McpTextContent (msg 287)
const (
MTC_Text = 1 // string
)
// McpError (msg 291)
const (
MCE_Error = 1 // string
)
// --- Rejection messages ---
// ReadRejected: path=1, reason=2
// ShellRejected: command=1, workingDirectory=2, reason=3, isReadonly=4
// WriteRejected: path=1, reason=2
// DeleteRejected: path=1, reason=2
// LsRejected: path=1, reason=2
// GrepError: error=1
// FetchError: url=1, error=2
// WriteShellStdinError: error=1
// ReadResult oneof: success=1, error=2, rejected=3
// ShellResult oneof: success=1 (+ various), rejected=?
// The TS code uses specific result field numbers from the oneof:
const (
RR_Rejected = 3 // ReadResult.rejected
SR_Rejected = 5 // ShellResult.rejected (from TS: ShellResult has success/various/rejected)
WR_Rejected = 5 // WriteResult.rejected
DR_Rejected = 3 // DeleteResult.rejected
LR_Rejected = 3 // LsResult.rejected
GR_Error = 2 // GrepResult.error
FR_Error = 2 // FetchResult.error
BSSR_Rejected = 2 // BackgroundShellSpawnResult.rejected (error field)
WSSR_Error = 2 // WriteShellStdinResult.error
)
// --- Rejection struct fields ---
const (
REJ_Path = 1
REJ_Reason = 2
SREJ_Command = 1
SREJ_WorkingDir = 2
SREJ_Reason = 3
SREJ_IsReadonly = 4
GERR_Error = 1
FERR_Url = 1
FERR_Error = 2
)
// ReadArgs
const (
RA_Path = 1 // string
)
// WriteArgs
const (
WA_Path = 1 // string
)
// DeleteArgs
const (
DA_Path = 1 // string
)
// LsArgs
const (
LA_Path = 1 // string
)
// ShellArgs
const (
SHA_Command = 1 // string
SHA_WorkingDirectory = 2 // string
)
// FetchArgs
const (
FA_Url = 1 // string
)

View File

@@ -0,0 +1,313 @@
package proto
import (
"crypto/tls"
"fmt"
"io"
"net"
"sync"
"time"
log "github.com/sirupsen/logrus"
"golang.org/x/net/http2"
"golang.org/x/net/http2/hpack"
)
const (
defaultInitialWindowSize = 65535 // HTTP/2 default
maxFramePayload = 16384 // HTTP/2 default max frame size
)
// H2Stream provides bidirectional HTTP/2 streaming for the Connect protocol.
// Go's net/http does not support full-duplex HTTP/2, so we use the low-level framer.
type H2Stream struct {
framer *http2.Framer
conn net.Conn
streamID uint32
mu sync.Mutex
id string // unique identifier for debugging
frameNum int64 // sequential frame counter for debugging
dataCh chan []byte
doneCh chan struct{}
err error
// Send-side flow control
sendWindow int32 // available bytes we can send on this stream
connWindow int32 // available bytes on the connection level
windowCond *sync.Cond // signaled when window is updated
windowMu sync.Mutex // protects sendWindow, connWindow
}
// ID returns the unique identifier for this stream (for logging).
func (s *H2Stream) ID() string { return s.id }
// FrameNum returns the current frame number for debugging.
func (s *H2Stream) FrameNum() int64 {
s.mu.Lock()
defer s.mu.Unlock()
return s.frameNum
}
// DialH2Stream establishes a TLS+HTTP/2 connection and opens a new stream.
func DialH2Stream(host string, headers map[string]string) (*H2Stream, error) {
tlsConn, err := tls.Dial("tcp", host+":443", &tls.Config{
NextProtos: []string{"h2"},
})
if err != nil {
return nil, fmt.Errorf("h2: TLS dial failed: %w", err)
}
if tlsConn.ConnectionState().NegotiatedProtocol != "h2" {
tlsConn.Close()
return nil, fmt.Errorf("h2: server did not negotiate h2")
}
framer := http2.NewFramer(tlsConn, tlsConn)
// Client connection preface
if _, err := tlsConn.Write([]byte(http2.ClientPreface)); err != nil {
tlsConn.Close()
return nil, fmt.Errorf("h2: preface write failed: %w", err)
}
// Send initial SETTINGS (tell server how much WE can receive)
if err := framer.WriteSettings(
http2.Setting{ID: http2.SettingInitialWindowSize, Val: 4 * 1024 * 1024},
http2.Setting{ID: http2.SettingMaxConcurrentStreams, Val: 100},
); err != nil {
tlsConn.Close()
return nil, fmt.Errorf("h2: settings write failed: %w", err)
}
// Connection-level window update (for receiving)
if err := framer.WriteWindowUpdate(0, 3*1024*1024); err != nil {
tlsConn.Close()
return nil, fmt.Errorf("h2: window update failed: %w", err)
}
// Read and handle initial server frames (SETTINGS, WINDOW_UPDATE)
// Track server's initial window size (how much WE can send)
serverInitialWindowSize := int32(defaultInitialWindowSize)
connWindowSize := int32(defaultInitialWindowSize) // connection-level send window
for i := 0; i < 10; i++ {
f, err := framer.ReadFrame()
if err != nil {
tlsConn.Close()
return nil, fmt.Errorf("h2: initial frame read failed: %w", err)
}
switch sf := f.(type) {
case *http2.SettingsFrame:
if !sf.IsAck() {
sf.ForeachSetting(func(s http2.Setting) error {
if s.ID == http2.SettingInitialWindowSize {
serverInitialWindowSize = int32(s.Val)
log.Debugf("h2: server initial window size: %d", s.Val)
}
return nil
})
framer.WriteSettingsAck()
} else {
goto handshakeDone
}
case *http2.WindowUpdateFrame:
if sf.StreamID == 0 {
connWindowSize += int32(sf.Increment)
log.Debugf("h2: initial conn window update: +%d, total=%d", sf.Increment, connWindowSize)
}
default:
// unexpected but continue
}
}
handshakeDone:
// Build HEADERS
streamID := uint32(1)
var hdrBuf []byte
enc := hpack.NewEncoder(&sliceWriter{buf: &hdrBuf})
enc.WriteField(hpack.HeaderField{Name: ":method", Value: "POST"})
enc.WriteField(hpack.HeaderField{Name: ":scheme", Value: "https"})
enc.WriteField(hpack.HeaderField{Name: ":authority", Value: host})
if p, ok := headers[":path"]; ok {
enc.WriteField(hpack.HeaderField{Name: ":path", Value: p})
}
for k, v := range headers {
if len(k) > 0 && k[0] == ':' {
continue
}
enc.WriteField(hpack.HeaderField{Name: k, Value: v})
}
if err := framer.WriteHeaders(http2.HeadersFrameParam{
StreamID: streamID,
BlockFragment: hdrBuf,
EndStream: false,
EndHeaders: true,
}); err != nil {
tlsConn.Close()
return nil, fmt.Errorf("h2: headers write failed: %w", err)
}
s := &H2Stream{
framer: framer,
conn: tlsConn,
streamID: streamID,
dataCh: make(chan []byte, 256),
doneCh: make(chan struct{}),
id: fmt.Sprintf("%d-%s", streamID, time.Now().Format("150405.000")),
frameNum: 0,
sendWindow: serverInitialWindowSize,
connWindow: connWindowSize,
}
s.windowCond = sync.NewCond(&s.windowMu)
go s.readLoop()
return s, nil
}
// Write sends a DATA frame on the stream, respecting flow control.
func (s *H2Stream) Write(data []byte) error {
for len(data) > 0 {
chunk := data
if len(chunk) > maxFramePayload {
chunk = data[:maxFramePayload]
}
// Wait for flow control window
s.windowMu.Lock()
for s.sendWindow <= 0 || s.connWindow <= 0 {
s.windowCond.Wait()
}
// Limit chunk to available window
allowed := int(s.sendWindow)
if int(s.connWindow) < allowed {
allowed = int(s.connWindow)
}
if len(chunk) > allowed {
chunk = chunk[:allowed]
}
s.sendWindow -= int32(len(chunk))
s.connWindow -= int32(len(chunk))
s.windowMu.Unlock()
s.mu.Lock()
err := s.framer.WriteData(s.streamID, false, chunk)
s.mu.Unlock()
if err != nil {
return err
}
data = data[len(chunk):]
}
return nil
}
// Data returns the channel of received data chunks.
func (s *H2Stream) Data() <-chan []byte { return s.dataCh }
// Done returns a channel closed when the stream ends.
func (s *H2Stream) Done() <-chan struct{} { return s.doneCh }
// Err returns the error (if any) that caused the stream to close.
// Returns nil for a clean shutdown (EOF / StreamEnded).
func (s *H2Stream) Err() error { return s.err }
// Close tears down the connection.
func (s *H2Stream) Close() {
s.conn.Close()
// Unblock any writers waiting on flow control
s.windowCond.Broadcast()
}
func (s *H2Stream) readLoop() {
defer close(s.doneCh)
defer close(s.dataCh)
for {
f, err := s.framer.ReadFrame()
if err != nil {
if err != io.EOF {
s.err = err
log.Debugf("h2stream[%s]: readLoop error: %v", s.id, err)
}
return
}
// Increment frame counter
s.mu.Lock()
s.frameNum++
s.mu.Unlock()
switch frame := f.(type) {
case *http2.DataFrame:
if frame.StreamID == s.streamID && len(frame.Data()) > 0 {
cp := make([]byte, len(frame.Data()))
copy(cp, frame.Data())
s.dataCh <- cp
// Flow control: send WINDOW_UPDATE for received data
s.mu.Lock()
s.framer.WriteWindowUpdate(0, uint32(len(cp)))
s.framer.WriteWindowUpdate(s.streamID, uint32(len(cp)))
s.mu.Unlock()
}
if frame.StreamEnded() {
return
}
case *http2.HeadersFrame:
if frame.StreamEnded() {
return
}
case *http2.RSTStreamFrame:
s.err = fmt.Errorf("h2: RST_STREAM code=%d", frame.ErrCode)
log.Debugf("h2stream[%s]: received RST_STREAM code=%d", s.id, frame.ErrCode)
return
case *http2.GoAwayFrame:
s.err = fmt.Errorf("h2: GOAWAY code=%d", frame.ErrCode)
return
case *http2.PingFrame:
if !frame.IsAck() {
s.mu.Lock()
s.framer.WritePing(true, frame.Data)
s.mu.Unlock()
}
case *http2.SettingsFrame:
if !frame.IsAck() {
// Check for window size changes
frame.ForeachSetting(func(setting http2.Setting) error {
if setting.ID == http2.SettingInitialWindowSize {
s.windowMu.Lock()
delta := int32(setting.Val) - s.sendWindow
s.sendWindow += delta
s.windowMu.Unlock()
s.windowCond.Broadcast()
}
return nil
})
s.mu.Lock()
s.framer.WriteSettingsAck()
s.mu.Unlock()
}
case *http2.WindowUpdateFrame:
// Update send-side flow control window
s.windowMu.Lock()
if frame.StreamID == 0 {
s.connWindow += int32(frame.Increment)
} else if frame.StreamID == s.streamID {
s.sendWindow += int32(frame.Increment)
}
s.windowMu.Unlock()
s.windowCond.Broadcast()
}
}
}
type sliceWriter struct{ buf *[]byte }
func (w *sliceWriter) Write(p []byte) (int, error) {
*w.buf = append(*w.buf, p...)
return len(p), nil
}

View File

@@ -102,10 +102,24 @@ func NewDeviceFlowClient(cfg *config.Config) *DeviceFlowClient {
// NewDeviceFlowClientWithDeviceID creates a new device flow client with the specified device ID.
func NewDeviceFlowClientWithDeviceID(cfg *config.Config, deviceID string) *DeviceFlowClient {
return NewDeviceFlowClientWithDeviceIDAndProxyURL(cfg, deviceID, "")
}
// NewDeviceFlowClientWithDeviceIDAndProxyURL creates a new device flow client with a proxy override.
// proxyURL takes precedence over cfg.ProxyURL when non-empty.
func NewDeviceFlowClientWithDeviceIDAndProxyURL(cfg *config.Config, deviceID string, proxyURL string) *DeviceFlowClient {
client := &http.Client{Timeout: 30 * time.Second}
effectiveProxyURL := strings.TrimSpace(proxyURL)
var sdkCfg config.SDKConfig
if cfg != nil {
client = util.SetProxy(&cfg.SDKConfig, client)
sdkCfg = cfg.SDKConfig
if effectiveProxyURL == "" {
effectiveProxyURL = strings.TrimSpace(cfg.ProxyURL)
}
}
sdkCfg.ProxyURL = effectiveProxyURL
client = util.SetProxy(&sdkCfg, client)
resolvedDeviceID := strings.TrimSpace(deviceID)
if resolvedDeviceID == "" {
resolvedDeviceID = getOrCreateDeviceID()

View File

@@ -0,0 +1,42 @@
package kimi
import (
"net/http"
"testing"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
)
func TestNewDeviceFlowClientWithDeviceIDAndProxyURL_OverrideDirectDisablesProxy(t *testing.T) {
cfg := &config.Config{SDKConfig: config.SDKConfig{ProxyURL: "http://proxy.example.com:8080"}}
client := NewDeviceFlowClientWithDeviceIDAndProxyURL(cfg, "device-1", "direct")
transport, ok := client.httpClient.Transport.(*http.Transport)
if !ok || transport == nil {
t.Fatalf("expected http.Transport, got %T", client.httpClient.Transport)
}
if transport.Proxy != nil {
t.Fatal("expected direct transport to disable proxy function")
}
}
func TestNewDeviceFlowClientWithDeviceIDAndProxyURL_OverrideProxyTakesPrecedence(t *testing.T) {
cfg := &config.Config{SDKConfig: config.SDKConfig{ProxyURL: "http://global.example.com:8080"}}
client := NewDeviceFlowClientWithDeviceIDAndProxyURL(cfg, "device-1", "http://override.example.com:8081")
transport, ok := client.httpClient.Transport.(*http.Transport)
if !ok || transport == nil {
t.Fatalf("expected http.Transport, got %T", client.httpClient.Transport)
}
req, errReq := http.NewRequest(http.MethodGet, "https://example.com", nil)
if errReq != nil {
t.Fatalf("new request: %v", errReq)
}
proxyURL, errProxy := transport.Proxy(req)
if errProxy != nil {
t.Fatalf("proxy func: %v", errProxy)
}
if proxyURL == nil || proxyURL.String() != "http://override.example.com:8081" {
t.Fatalf("proxy URL = %v, want http://override.example.com:8081", proxyURL)
}
}

View File

@@ -748,4 +748,3 @@ func TestExtractRegionFromMetadata(t *testing.T) {
})
}
}

View File

@@ -6,8 +6,8 @@ import (
)
const (
CooldownReason429 = "rate_limit_exceeded"
CooldownReasonSuspended = "account_suspended"
CooldownReason429 = "rate_limit_exceeded"
CooldownReasonSuspended = "account_suspended"
CooldownReasonQuotaExhausted = "quota_exhausted"
DefaultShortCooldown = 1 * time.Minute

View File

@@ -26,9 +26,9 @@ const (
)
var (
jitterRand *rand.Rand
jitterRandOnce sync.Once
jitterMu sync.Mutex
jitterRand *rand.Rand
jitterRandOnce sync.Once
jitterMu sync.Mutex
lastRequestTime time.Time
)

View File

@@ -24,10 +24,10 @@ type TokenScorer struct {
metrics map[string]*TokenMetrics
// Scoring weights
successRateWeight float64
quotaWeight float64
latencyWeight float64
lastUsedWeight float64
successRateWeight float64
quotaWeight float64
latencyWeight float64
lastUsedWeight float64
failPenaltyMultiplier float64
}

View File

@@ -1,359 +0,0 @@
package qwen
import (
"context"
"crypto/rand"
"crypto/sha256"
"encoding/base64"
"encoding/json"
"fmt"
"io"
"net/http"
"net/url"
"strings"
"time"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
log "github.com/sirupsen/logrus"
)
const (
// QwenOAuthDeviceCodeEndpoint is the URL for initiating the OAuth 2.0 device authorization flow.
QwenOAuthDeviceCodeEndpoint = "https://chat.qwen.ai/api/v1/oauth2/device/code"
// QwenOAuthTokenEndpoint is the URL for exchanging device codes or refresh tokens for access tokens.
QwenOAuthTokenEndpoint = "https://chat.qwen.ai/api/v1/oauth2/token"
// QwenOAuthClientID is the client identifier for the Qwen OAuth 2.0 application.
QwenOAuthClientID = "f0304373b74a44d2b584a3fb70ca9e56"
// QwenOAuthScope defines the permissions requested by the application.
QwenOAuthScope = "openid profile email model.completion"
// QwenOAuthGrantType specifies the grant type for the device code flow.
QwenOAuthGrantType = "urn:ietf:params:oauth:grant-type:device_code"
)
// QwenTokenData represents the OAuth credentials, including access and refresh tokens.
type QwenTokenData struct {
AccessToken string `json:"access_token"`
// RefreshToken is used to obtain a new access token when the current one expires.
RefreshToken string `json:"refresh_token,omitempty"`
// TokenType indicates the type of token, typically "Bearer".
TokenType string `json:"token_type"`
// ResourceURL specifies the base URL of the resource server.
ResourceURL string `json:"resource_url,omitempty"`
// Expire indicates the expiration date and time of the access token.
Expire string `json:"expiry_date,omitempty"`
}
// DeviceFlow represents the response from the device authorization endpoint.
type DeviceFlow struct {
// DeviceCode is the code that the client uses to poll for an access token.
DeviceCode string `json:"device_code"`
// UserCode is the code that the user enters at the verification URI.
UserCode string `json:"user_code"`
// VerificationURI is the URL where the user can enter the user code to authorize the device.
VerificationURI string `json:"verification_uri"`
// VerificationURIComplete is a URI that includes the user_code, which can be used to automatically
// fill in the code on the verification page.
VerificationURIComplete string `json:"verification_uri_complete"`
// ExpiresIn is the time in seconds until the device_code and user_code expire.
ExpiresIn int `json:"expires_in"`
// Interval is the minimum time in seconds that the client should wait between polling requests.
Interval int `json:"interval"`
// CodeVerifier is the cryptographically random string used in the PKCE flow.
CodeVerifier string `json:"code_verifier"`
}
// QwenTokenResponse represents the successful token response from the token endpoint.
type QwenTokenResponse struct {
// AccessToken is the token used to access protected resources.
AccessToken string `json:"access_token"`
// RefreshToken is used to obtain a new access token.
RefreshToken string `json:"refresh_token,omitempty"`
// TokenType indicates the type of token, typically "Bearer".
TokenType string `json:"token_type"`
// ResourceURL specifies the base URL of the resource server.
ResourceURL string `json:"resource_url,omitempty"`
// ExpiresIn is the time in seconds until the access token expires.
ExpiresIn int `json:"expires_in"`
}
// QwenAuth manages authentication and token handling for the Qwen API.
type QwenAuth struct {
httpClient *http.Client
}
// NewQwenAuth creates a new QwenAuth instance with a proxy-configured HTTP client.
func NewQwenAuth(cfg *config.Config) *QwenAuth {
return &QwenAuth{
httpClient: util.SetProxy(&cfg.SDKConfig, &http.Client{}),
}
}
// generateCodeVerifier generates a cryptographically random string for the PKCE code verifier.
func (qa *QwenAuth) generateCodeVerifier() (string, error) {
bytes := make([]byte, 32)
if _, err := rand.Read(bytes); err != nil {
return "", err
}
return base64.RawURLEncoding.EncodeToString(bytes), nil
}
// generateCodeChallenge creates a SHA-256 hash of the code verifier, used as the PKCE code challenge.
func (qa *QwenAuth) generateCodeChallenge(codeVerifier string) string {
hash := sha256.Sum256([]byte(codeVerifier))
return base64.RawURLEncoding.EncodeToString(hash[:])
}
// generatePKCEPair creates a new code verifier and its corresponding code challenge for PKCE.
func (qa *QwenAuth) generatePKCEPair() (string, string, error) {
codeVerifier, err := qa.generateCodeVerifier()
if err != nil {
return "", "", err
}
codeChallenge := qa.generateCodeChallenge(codeVerifier)
return codeVerifier, codeChallenge, nil
}
// RefreshTokens exchanges a refresh token for a new access token.
func (qa *QwenAuth) RefreshTokens(ctx context.Context, refreshToken string) (*QwenTokenData, error) {
data := url.Values{}
data.Set("grant_type", "refresh_token")
data.Set("refresh_token", refreshToken)
data.Set("client_id", QwenOAuthClientID)
req, err := http.NewRequestWithContext(ctx, "POST", QwenOAuthTokenEndpoint, strings.NewReader(data.Encode()))
if err != nil {
return nil, fmt.Errorf("failed to create token request: %w", err)
}
req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
req.Header.Set("Accept", "application/json")
resp, err := qa.httpClient.Do(req)
// resp, err := qa.httpClient.PostForm(QwenOAuthTokenEndpoint, data)
if err != nil {
return nil, fmt.Errorf("token refresh request failed: %w", err)
}
defer func() {
_ = resp.Body.Close()
}()
body, err := io.ReadAll(resp.Body)
if err != nil {
return nil, fmt.Errorf("failed to read response body: %w", err)
}
if resp.StatusCode != http.StatusOK {
var errorData map[string]interface{}
if err = json.Unmarshal(body, &errorData); err == nil {
return nil, fmt.Errorf("token refresh failed: %v - %v", errorData["error"], errorData["error_description"])
}
return nil, fmt.Errorf("token refresh failed: %s", string(body))
}
var tokenData QwenTokenResponse
if err = json.Unmarshal(body, &tokenData); err != nil {
return nil, fmt.Errorf("failed to parse token response: %w", err)
}
return &QwenTokenData{
AccessToken: tokenData.AccessToken,
TokenType: tokenData.TokenType,
RefreshToken: tokenData.RefreshToken,
ResourceURL: tokenData.ResourceURL,
Expire: time.Now().Add(time.Duration(tokenData.ExpiresIn) * time.Second).Format(time.RFC3339),
}, nil
}
// InitiateDeviceFlow starts the OAuth 2.0 device authorization flow and returns the device flow details.
func (qa *QwenAuth) InitiateDeviceFlow(ctx context.Context) (*DeviceFlow, error) {
// Generate PKCE code verifier and challenge
codeVerifier, codeChallenge, err := qa.generatePKCEPair()
if err != nil {
return nil, fmt.Errorf("failed to generate PKCE pair: %w", err)
}
data := url.Values{}
data.Set("client_id", QwenOAuthClientID)
data.Set("scope", QwenOAuthScope)
data.Set("code_challenge", codeChallenge)
data.Set("code_challenge_method", "S256")
req, err := http.NewRequestWithContext(ctx, "POST", QwenOAuthDeviceCodeEndpoint, strings.NewReader(data.Encode()))
if err != nil {
return nil, fmt.Errorf("failed to create token request: %w", err)
}
req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
req.Header.Set("Accept", "application/json")
resp, err := qa.httpClient.Do(req)
// resp, err := qa.httpClient.PostForm(QwenOAuthDeviceCodeEndpoint, data)
if err != nil {
return nil, fmt.Errorf("device authorization request failed: %w", err)
}
defer func() {
_ = resp.Body.Close()
}()
body, err := io.ReadAll(resp.Body)
if err != nil {
return nil, fmt.Errorf("failed to read response body: %w", err)
}
if resp.StatusCode != http.StatusOK {
return nil, fmt.Errorf("device authorization failed: %d %s. Response: %s", resp.StatusCode, resp.Status, string(body))
}
var result DeviceFlow
if err = json.Unmarshal(body, &result); err != nil {
return nil, fmt.Errorf("failed to parse device flow response: %w", err)
}
// Check if the response indicates success
if result.DeviceCode == "" {
return nil, fmt.Errorf("device authorization failed: device_code not found in response")
}
// Add the code_verifier to the result so it can be used later for polling
result.CodeVerifier = codeVerifier
return &result, nil
}
// PollForToken polls the token endpoint with the device code to obtain an access token.
func (qa *QwenAuth) PollForToken(deviceCode, codeVerifier string) (*QwenTokenData, error) {
pollInterval := 5 * time.Second
maxAttempts := 60 // 5 minutes max
for attempt := 0; attempt < maxAttempts; attempt++ {
data := url.Values{}
data.Set("grant_type", QwenOAuthGrantType)
data.Set("client_id", QwenOAuthClientID)
data.Set("device_code", deviceCode)
data.Set("code_verifier", codeVerifier)
resp, err := http.PostForm(QwenOAuthTokenEndpoint, data)
if err != nil {
fmt.Printf("Polling attempt %d/%d failed: %v\n", attempt+1, maxAttempts, err)
time.Sleep(pollInterval)
continue
}
body, err := io.ReadAll(resp.Body)
_ = resp.Body.Close()
if err != nil {
fmt.Printf("Polling attempt %d/%d failed: %v\n", attempt+1, maxAttempts, err)
time.Sleep(pollInterval)
continue
}
if resp.StatusCode != http.StatusOK {
// Parse the response as JSON to check for OAuth RFC 8628 standard errors
var errorData map[string]interface{}
if err = json.Unmarshal(body, &errorData); err == nil {
// According to OAuth RFC 8628, handle standard polling responses
if resp.StatusCode == http.StatusBadRequest {
errorType, _ := errorData["error"].(string)
switch errorType {
case "authorization_pending":
// User has not yet approved the authorization request. Continue polling.
fmt.Printf("Polling attempt %d/%d...\n\n", attempt+1, maxAttempts)
time.Sleep(pollInterval)
continue
case "slow_down":
// Client is polling too frequently. Increase poll interval.
pollInterval = time.Duration(float64(pollInterval) * 1.5)
if pollInterval > 10*time.Second {
pollInterval = 10 * time.Second
}
fmt.Printf("Server requested to slow down, increasing poll interval to %v\n\n", pollInterval)
time.Sleep(pollInterval)
continue
case "expired_token":
return nil, fmt.Errorf("device code expired. Please restart the authentication process")
case "access_denied":
return nil, fmt.Errorf("authorization denied by user. Please restart the authentication process")
}
}
// For other errors, return with proper error information
errorType, _ := errorData["error"].(string)
errorDesc, _ := errorData["error_description"].(string)
return nil, fmt.Errorf("device token poll failed: %s - %s", errorType, errorDesc)
}
// If JSON parsing fails, fall back to text response
return nil, fmt.Errorf("device token poll failed: %d %s. Response: %s", resp.StatusCode, resp.Status, string(body))
}
// log.Debugf("%s", string(body))
// Success - parse token data
var response QwenTokenResponse
if err = json.Unmarshal(body, &response); err != nil {
return nil, fmt.Errorf("failed to parse token response: %w", err)
}
// Convert to QwenTokenData format and save
tokenData := &QwenTokenData{
AccessToken: response.AccessToken,
RefreshToken: response.RefreshToken,
TokenType: response.TokenType,
ResourceURL: response.ResourceURL,
Expire: time.Now().Add(time.Duration(response.ExpiresIn) * time.Second).Format(time.RFC3339),
}
return tokenData, nil
}
return nil, fmt.Errorf("authentication timeout. Please restart the authentication process")
}
// RefreshTokensWithRetry attempts to refresh tokens with a specified number of retries upon failure.
func (o *QwenAuth) RefreshTokensWithRetry(ctx context.Context, refreshToken string, maxRetries int) (*QwenTokenData, error) {
var lastErr error
for attempt := 0; attempt < maxRetries; attempt++ {
if attempt > 0 {
// Wait before retry
select {
case <-ctx.Done():
return nil, ctx.Err()
case <-time.After(time.Duration(attempt) * time.Second):
}
}
tokenData, err := o.RefreshTokens(ctx, refreshToken)
if err == nil {
return tokenData, nil
}
lastErr = err
log.Warnf("Token refresh attempt %d failed: %v", attempt+1, err)
}
return nil, fmt.Errorf("token refresh failed after %d attempts: %w", maxRetries, lastErr)
}
// CreateTokenStorage creates a QwenTokenStorage object from a QwenTokenData object.
func (o *QwenAuth) CreateTokenStorage(tokenData *QwenTokenData) *QwenTokenStorage {
storage := &QwenTokenStorage{
AccessToken: tokenData.AccessToken,
RefreshToken: tokenData.RefreshToken,
LastRefresh: time.Now().Format(time.RFC3339),
ResourceURL: tokenData.ResourceURL,
Expire: tokenData.Expire,
}
return storage
}
// UpdateTokenStorage updates an existing token storage with new token data
func (o *QwenAuth) UpdateTokenStorage(storage *QwenTokenStorage, tokenData *QwenTokenData) {
storage.AccessToken = tokenData.AccessToken
storage.RefreshToken = tokenData.RefreshToken
storage.LastRefresh = time.Now().Format(time.RFC3339)
storage.ResourceURL = tokenData.ResourceURL
storage.Expire = tokenData.Expire
}

View File

@@ -1,79 +0,0 @@
// Package qwen provides authentication and token management functionality
// for Alibaba's Qwen AI services. It handles OAuth2 token storage, serialization,
// and retrieval for maintaining authenticated sessions with the Qwen API.
package qwen
import (
"encoding/json"
"fmt"
"os"
"path/filepath"
"github.com/router-for-me/CLIProxyAPI/v6/internal/misc"
)
// QwenTokenStorage stores OAuth2 token information for Alibaba Qwen API authentication.
// It maintains compatibility with the existing auth system while adding Qwen-specific fields
// for managing access tokens, refresh tokens, and user account information.
type QwenTokenStorage struct {
// AccessToken is the OAuth2 access token used for authenticating API requests.
AccessToken string `json:"access_token"`
// RefreshToken is used to obtain new access tokens when the current one expires.
RefreshToken string `json:"refresh_token"`
// LastRefresh is the timestamp of the last token refresh operation.
LastRefresh string `json:"last_refresh"`
// ResourceURL is the base URL for API requests.
ResourceURL string `json:"resource_url"`
// Email is the Qwen account email address associated with this token.
Email string `json:"email"`
// Type indicates the authentication provider type, always "qwen" for this storage.
Type string `json:"type"`
// Expire is the timestamp when the current access token expires.
Expire string `json:"expired"`
// Metadata holds arbitrary key-value pairs injected via hooks.
// It is not exported to JSON directly to allow flattening during serialization.
Metadata map[string]any `json:"-"`
}
// SetMetadata allows external callers to inject metadata into the storage before saving.
func (ts *QwenTokenStorage) SetMetadata(meta map[string]any) {
ts.Metadata = meta
}
// SaveTokenToFile serializes the Qwen token storage to a JSON file.
// This method creates the necessary directory structure and writes the token
// data in JSON format to the specified file path for persistent storage.
// It merges any injected metadata into the top-level JSON object.
//
// Parameters:
// - authFilePath: The full path where the token file should be saved
//
// Returns:
// - error: An error if the operation fails, nil otherwise
func (ts *QwenTokenStorage) SaveTokenToFile(authFilePath string) error {
misc.LogSavingCredentials(authFilePath)
ts.Type = "qwen"
if err := os.MkdirAll(filepath.Dir(authFilePath), 0700); err != nil {
return fmt.Errorf("failed to create directory: %v", err)
}
f, err := os.Create(authFilePath)
if err != nil {
return fmt.Errorf("failed to create token file: %w", err)
}
defer func() {
_ = f.Close()
}()
// Merge metadata using helper
data, errMerge := misc.MergeMetadata(ts, ts.Metadata)
if errMerge != nil {
return fmt.Errorf("failed to merge metadata: %w", errMerge)
}
if err = json.NewEncoder(f).Encode(data); err != nil {
return fmt.Errorf("failed to write token to file: %w", err)
}
return nil
}

View File

@@ -30,6 +30,10 @@ type VertexCredentialStorage struct {
// Type is the provider identifier stored alongside credentials. Always "vertex".
Type string `json:"type"`
// Prefix optionally namespaces models for this credential (e.g., "teamA").
// This results in model names like "teamA/gemini-2.0-flash".
Prefix string `json:"prefix,omitempty"`
}
// SaveTokenToFile writes the credential payload to the given file path in JSON format.

View File

@@ -5,7 +5,10 @@ import (
"encoding/hex"
"strings"
"sync"
"sync/atomic"
"time"
log "github.com/sirupsen/logrus"
)
// SignatureEntry holds a cached thinking signature with timestamp
@@ -193,3 +196,45 @@ func GetModelGroup(modelName string) string {
}
return modelName
}
var signatureCacheEnabled atomic.Bool
var signatureBypassStrictMode atomic.Bool
func init() {
signatureCacheEnabled.Store(true)
signatureBypassStrictMode.Store(false)
}
// SetSignatureCacheEnabled switches Antigravity signature handling between cache mode and bypass mode.
func SetSignatureCacheEnabled(enabled bool) {
previous := signatureCacheEnabled.Swap(enabled)
if previous == enabled {
return
}
if !enabled {
log.Info("antigravity signature cache DISABLED - bypass mode active, cached signatures will not be used for request translation")
}
}
// SignatureCacheEnabled returns whether signature cache validation is enabled.
func SignatureCacheEnabled() bool {
return signatureCacheEnabled.Load()
}
// SetSignatureBypassStrictMode controls whether bypass mode uses strict protobuf-tree validation.
func SetSignatureBypassStrictMode(strict bool) {
previous := signatureBypassStrictMode.Swap(strict)
if previous == strict {
return
}
if strict {
log.Debug("antigravity bypass signature validation: strict mode (protobuf tree)")
} else {
log.Debug("antigravity bypass signature validation: basic mode (R/E + 0x12)")
}
}
// SignatureBypassStrictMode returns whether bypass mode uses strict protobuf-tree validation.
func SignatureBypassStrictMode() bool {
return signatureBypassStrictMode.Load()
}

View File

@@ -1,8 +1,12 @@
package cache
import (
"bytes"
"strings"
"testing"
"time"
log "github.com/sirupsen/logrus"
)
const testModelName = "claude-sonnet-4-5"
@@ -208,3 +212,90 @@ func TestCacheSignature_ExpirationLogic(t *testing.T) {
// but the logic is verified by the implementation
_ = time.Now() // Acknowledge we're not testing time passage
}
func TestSignatureModeSetters_LogAtInfoLevel(t *testing.T) {
logger := log.StandardLogger()
previousOutput := logger.Out
previousLevel := logger.Level
previousCache := SignatureCacheEnabled()
previousStrict := SignatureBypassStrictMode()
SetSignatureCacheEnabled(true)
SetSignatureBypassStrictMode(false)
buffer := &bytes.Buffer{}
log.SetOutput(buffer)
log.SetLevel(log.InfoLevel)
t.Cleanup(func() {
log.SetOutput(previousOutput)
log.SetLevel(previousLevel)
SetSignatureCacheEnabled(previousCache)
SetSignatureBypassStrictMode(previousStrict)
})
SetSignatureCacheEnabled(false)
SetSignatureBypassStrictMode(true)
SetSignatureBypassStrictMode(false)
output := buffer.String()
if !strings.Contains(output, "antigravity signature cache DISABLED") {
t.Fatalf("expected info output for disabling signature cache, got: %q", output)
}
if strings.Contains(output, "strict mode (protobuf tree)") {
t.Fatalf("expected strict bypass mode log to stay below info level, got: %q", output)
}
if strings.Contains(output, "basic mode (R/E + 0x12)") {
t.Fatalf("expected basic bypass mode log to stay below info level, got: %q", output)
}
}
func TestSignatureModeSetters_DoNotRepeatSameStateLogs(t *testing.T) {
logger := log.StandardLogger()
previousOutput := logger.Out
previousLevel := logger.Level
previousCache := SignatureCacheEnabled()
previousStrict := SignatureBypassStrictMode()
SetSignatureCacheEnabled(false)
SetSignatureBypassStrictMode(true)
buffer := &bytes.Buffer{}
log.SetOutput(buffer)
log.SetLevel(log.InfoLevel)
t.Cleanup(func() {
log.SetOutput(previousOutput)
log.SetLevel(previousLevel)
SetSignatureCacheEnabled(previousCache)
SetSignatureBypassStrictMode(previousStrict)
})
SetSignatureCacheEnabled(false)
SetSignatureBypassStrictMode(true)
if buffer.Len() != 0 {
t.Fatalf("expected repeated setter calls with unchanged state to stay silent, got: %q", buffer.String())
}
}
func TestSignatureBypassStrictMode_LogsAtDebugLevel(t *testing.T) {
logger := log.StandardLogger()
previousOutput := logger.Out
previousLevel := logger.Level
previousStrict := SignatureBypassStrictMode()
SetSignatureBypassStrictMode(false)
buffer := &bytes.Buffer{}
log.SetOutput(buffer)
log.SetLevel(log.DebugLevel)
t.Cleanup(func() {
log.SetOutput(previousOutput)
log.SetLevel(previousLevel)
SetSignatureBypassStrictMode(previousStrict)
})
SetSignatureBypassStrictMode(true)
SetSignatureBypassStrictMode(false)
output := buffer.String()
if !strings.Contains(output, "strict mode (protobuf tree)") {
t.Fatalf("expected debug output for strict bypass mode, got: %q", output)
}
if !strings.Contains(output, "basic mode (R/E + 0x12)") {
t.Fatalf("expected debug output for basic bypass mode, got: %q", output)
}
}

View File

@@ -15,8 +15,6 @@ func newAuthManager() *sdkAuth.Manager {
sdkAuth.NewGeminiAuthenticator(),
sdkAuth.NewCodexAuthenticator(),
sdkAuth.NewClaudeAuthenticator(),
sdkAuth.NewQwenAuthenticator(),
sdkAuth.NewIFlowAuthenticator(),
sdkAuth.NewAntigravityAuthenticator(),
sdkAuth.NewKimiAuthenticator(),
sdkAuth.NewKiroAuthenticator(),
@@ -24,6 +22,7 @@ func newAuthManager() *sdkAuth.Manager {
sdkAuth.NewKiloAuthenticator(),
sdkAuth.NewGitLabAuthenticator(),
sdkAuth.NewCodeBuddyAuthenticator(),
sdkAuth.NewCursorAuthenticator(),
)
return manager
}

View File

@@ -0,0 +1,37 @@
package cmd
import (
"context"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
sdkAuth "github.com/router-for-me/CLIProxyAPI/v6/sdk/auth"
log "github.com/sirupsen/logrus"
)
// DoCursorLogin triggers the OAuth PKCE flow for Cursor and saves tokens.
func DoCursorLogin(cfg *config.Config, options *LoginOptions) {
if options == nil {
options = &LoginOptions{}
}
manager := newAuthManager()
authOpts := &sdkAuth.LoginOptions{
NoBrowser: options.NoBrowser,
Metadata: map[string]string{},
Prompt: options.Prompt,
}
record, savedPath, err := manager.Login(context.Background(), "cursor", cfg, authOpts)
if err != nil {
log.Errorf("Cursor authentication failed: %v", err)
return
}
if savedPath != "" {
log.Infof("Authentication saved to %s", savedPath)
}
if record != nil && record.Label != "" {
log.Infof("Authenticated as %s", record.Label)
}
log.Info("Cursor authentication successful!")
}

View File

@@ -1,60 +0,0 @@
package cmd
import (
"context"
"errors"
"fmt"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
sdkAuth "github.com/router-for-me/CLIProxyAPI/v6/sdk/auth"
log "github.com/sirupsen/logrus"
)
// DoQwenLogin handles the Qwen device flow using the shared authentication manager.
// It initiates the device-based authentication process for Qwen services and saves
// the authentication tokens to the configured auth directory.
//
// Parameters:
// - cfg: The application configuration
// - options: Login options including browser behavior and prompts
func DoQwenLogin(cfg *config.Config, options *LoginOptions) {
if options == nil {
options = &LoginOptions{}
}
manager := newAuthManager()
promptFn := options.Prompt
if promptFn == nil {
promptFn = func(prompt string) (string, error) {
fmt.Println()
fmt.Println(prompt)
var value string
_, err := fmt.Scanln(&value)
return value, err
}
}
authOpts := &sdkAuth.LoginOptions{
NoBrowser: options.NoBrowser,
CallbackPort: options.CallbackPort,
Metadata: map[string]string{},
Prompt: promptFn,
}
_, savedPath, err := manager.Login(context.Background(), "qwen", cfg, authOpts)
if err != nil {
if emailErr, ok := errors.AsType[*sdkAuth.EmailRequiredError](err); ok {
log.Error(emailErr.Error())
return
}
fmt.Printf("Qwen authentication failed: %v\n", err)
return
}
if savedPath != "" {
fmt.Printf("Authentication saved to %s\n", savedPath)
}
fmt.Println("Qwen authentication successful!")
}

View File

@@ -20,7 +20,7 @@ import (
// DoVertexImport imports a Google Cloud service account key JSON and persists
// it as a "vertex" provider credential. The file content is embedded in the auth
// file to allow portable deployment across stores.
func DoVertexImport(cfg *config.Config, keyPath string) {
func DoVertexImport(cfg *config.Config, keyPath string, prefix string) {
if cfg == nil {
cfg = &config.Config{}
}
@@ -62,13 +62,28 @@ func DoVertexImport(cfg *config.Config, keyPath string) {
// Default location if not provided by user. Can be edited in the saved file later.
location := "us-central1"
fileName := fmt.Sprintf("vertex-%s.json", sanitizeFilePart(projectID))
// Normalize and validate prefix: must be a single segment (no "/" allowed).
prefix = strings.TrimSpace(prefix)
prefix = strings.Trim(prefix, "/")
if prefix != "" && strings.Contains(prefix, "/") {
log.Errorf("vertex-import: prefix must be a single segment (no '/' allowed): %q", prefix)
return
}
// Include prefix in filename so importing the same project with different
// prefixes creates separate credential files instead of overwriting.
baseName := sanitizeFilePart(projectID)
if prefix != "" {
baseName = sanitizeFilePart(prefix) + "-" + baseName
}
fileName := fmt.Sprintf("vertex-%s.json", baseName)
// Build auth record
storage := &vertex.VertexCredentialStorage{
ServiceAccount: sa,
ProjectID: projectID,
Email: email,
Location: location,
Prefix: prefix,
}
metadata := map[string]any{
"service_account": sa,
@@ -76,6 +91,7 @@ func DoVertexImport(cfg *config.Config, keyPath string) {
"email": email,
"location": location,
"type": "vertex",
"prefix": prefix,
"label": labelForVertex(projectID, email),
}
record := &coreauth.Auth{

View File

@@ -68,6 +68,10 @@ type Config struct {
// DisableCooling disables quota cooldown scheduling when true.
DisableCooling bool `yaml:"disable-cooling" json:"disable-cooling"`
// AuthAutoRefreshWorkers overrides the size of the core auth auto-refresh worker pool.
// When <= 0, the default worker count is used.
AuthAutoRefreshWorkers int `yaml:"auth-auto-refresh-workers" json:"auth-auto-refresh-workers"`
// RequestRetry defines the retry times when the request failed.
RequestRetry int `yaml:"request-retry" json:"request-retry"`
// MaxRetryCredentials defines the maximum number of credentials to try for a failed request.
@@ -85,6 +89,13 @@ type Config struct {
// WebsocketAuth enables or disables authentication for the WebSocket API.
WebsocketAuth bool `yaml:"ws-auth" json:"ws-auth"`
// AntigravitySignatureCacheEnabled controls whether signature cache validation is enabled for thinking blocks.
// When true (default), cached signatures are preferred and validated.
// When false, client signatures are used directly after normalization (bypass mode).
AntigravitySignatureCacheEnabled *bool `yaml:"antigravity-signature-cache-enabled,omitempty" json:"antigravity-signature-cache-enabled,omitempty"`
AntigravitySignatureBypassStrict *bool `yaml:"antigravity-signature-bypass-strict,omitempty" json:"antigravity-signature-bypass-strict,omitempty"`
// GeminiKey defines Gemini API key configurations with optional routing overrides.
GeminiKey []GeminiKey `yaml:"gemini-api-key" json:"gemini-api-key"`
@@ -124,12 +135,12 @@ type Config struct {
AmpCode AmpCode `yaml:"ampcode" json:"ampcode"`
// OAuthExcludedModels defines per-provider global model exclusions applied to OAuth/file-backed auth entries.
// Supported channels: gemini-cli, vertex, aistudio, antigravity, claude, codex, qwen, iflow, kiro, github-copilot.
// Supported channels: gemini-cli, vertex, aistudio, antigravity, claude, codex, iflow, kiro, github-copilot, kimi.
OAuthExcludedModels map[string][]string `yaml:"oauth-excluded-models,omitempty" json:"oauth-excluded-models,omitempty"`
// OAuthModelAlias defines global model name aliases for OAuth/file-backed auth channels.
// These aliases affect both model listing and model routing for supported channels:
// gemini-cli, vertex, aistudio, antigravity, claude, codex, qwen, iflow, kiro, github-copilot.
// gemini-cli, vertex, aistudio, antigravity, claude, codex, iflow, kiro, github-copilot, kimi.
//
// NOTE: This does not apply to existing per-credential model alias features under:
// gemini-api-key, codex-api-key, claude-api-key, openai-compatibility, vertex-api-key, and ampcode.
@@ -211,6 +222,10 @@ type QuotaExceeded struct {
// SwitchPreviewModel indicates whether to automatically switch to a preview model when a quota is exceeded.
SwitchPreviewModel bool `yaml:"switch-preview-model" json:"switch-preview-model"`
// AntigravityCredits indicates whether to retry Antigravity quota_exhausted 429s once
// on the same credential with enabledCreditTypes=["GOOGLE_ONE_AI"].
AntigravityCredits bool `yaml:"antigravity-credits" json:"antigravity-credits"`
}
// RoutingConfig configures how credentials are selected for requests.
@@ -218,6 +233,22 @@ type RoutingConfig struct {
// Strategy selects the credential selection strategy.
// Supported values: "round-robin" (default), "fill-first".
Strategy string `yaml:"strategy,omitempty" json:"strategy,omitempty"`
// ClaudeCodeSessionAffinity enables session-sticky routing for Claude Code clients.
// When enabled, requests with the same session ID (extracted from metadata.user_id)
// are routed to the same auth credential when available.
// Deprecated: Use SessionAffinity instead for universal session support.
ClaudeCodeSessionAffinity bool `yaml:"claude-code-session-affinity,omitempty" json:"claude-code-session-affinity,omitempty"`
// SessionAffinity enables universal session-sticky routing for all clients.
// Session IDs are extracted from multiple sources:
// X-Session-ID header, Idempotency-Key, metadata.user_id, conversation_id, or message hash.
// Automatic failover is always enabled when bound auth becomes unavailable.
SessionAffinity bool `yaml:"session-affinity,omitempty" json:"session-affinity,omitempty"`
// SessionAffinityTTL specifies how long session-to-auth bindings are retained.
// Default: 1h. Accepts duration strings like "30m", "1h", "2h30m".
SessionAffinityTTL string `yaml:"session-affinity-ttl,omitempty" json:"session-affinity-ttl,omitempty"`
}
// OAuthModelAlias defines a model ID alias for a specific channel.
@@ -257,8 +288,8 @@ type AmpCode struct {
UpstreamAPIKey string `yaml:"upstream-api-key" json:"upstream-api-key"`
// UpstreamAPIKeys maps client API keys (from top-level api-keys) to upstream API keys.
// When a client authenticates with a key that matches an entry, that upstream key is used.
// If no match is found, falls back to UpstreamAPIKey (default behavior).
// When a request is authenticated with one of the APIKeys, the corresponding UpstreamAPIKey
// is used for the upstream Amp request.
UpstreamAPIKeys []AmpUpstreamAPIKeyEntry `yaml:"upstream-api-keys,omitempty" json:"upstream-api-keys,omitempty"`
// RestrictManagementToLocalhost restricts Amp management routes (/api/user, /api/threads, etc.)
@@ -380,6 +411,11 @@ type ClaudeKey struct {
// Cloak configures request cloaking for non-Claude-Code clients.
Cloak *CloakConfig `yaml:"cloak,omitempty" json:"cloak,omitempty"`
// ExperimentalCCHSigning enables opt-in final-body cch signing for cloaked
// Claude /v1/messages requests. It is disabled by default so upstream seed
// changes do not alter the proxy's legacy behavior.
ExperimentalCCHSigning bool `yaml:"experimental-cch-signing,omitempty" json:"experimental-cch-signing,omitempty"`
}
func (k ClaudeKey) GetAPIKey() string { return k.APIKey }
@@ -972,6 +1008,7 @@ func (cfg *Config) SanitizeKiroKeys() {
}
// SanitizeGeminiKeys deduplicates and normalizes Gemini credentials.
// It uses API key + base URL as the uniqueness key.
func (cfg *Config) SanitizeGeminiKeys() {
if cfg == nil {
return
@@ -990,10 +1027,11 @@ func (cfg *Config) SanitizeGeminiKeys() {
entry.ProxyURL = strings.TrimSpace(entry.ProxyURL)
entry.Headers = NormalizeHeaders(entry.Headers)
entry.ExcludedModels = NormalizeExcludedModels(entry.ExcludedModels)
if _, exists := seen[entry.APIKey]; exists {
uniqueKey := entry.APIKey + "|" + entry.BaseURL
if _, exists := seen[uniqueKey]; exists {
continue
}
seen[entry.APIKey] = struct{}{}
seen[uniqueKey] = struct{}{}
out = append(out, entry)
}
cfg.GeminiKey = out

View File

@@ -9,6 +9,10 @@ type SDKConfig struct {
// ProxyURL is the URL of an optional proxy server to use for outbound requests.
ProxyURL string `yaml:"proxy-url" json:"proxy-url"`
// EnableGeminiCLIEndpoint controls whether Gemini CLI internal endpoints (/v1internal:*) are enabled.
// Default is false for safety; when false, /v1internal:* requests are rejected.
EnableGeminiCLIEndpoint bool `yaml:"enable-gemini-cli-endpoint" json:"enable-gemini-cli-endpoint"`
// ForceModelPrefix requires explicit model prefixes (e.g., "teamA/gemini-3-pro-preview")
// to target prefixed credentials. When false, unprefixed model requests may use prefixed
// credentials as well.

View File

@@ -4,6 +4,7 @@
package logging
import (
"bufio"
"bytes"
"compress/flate"
"compress/gzip"
@@ -41,15 +42,17 @@ type RequestLogger interface {
// - statusCode: The response status code
// - responseHeaders: The response headers
// - response: The raw response data
// - websocketTimeline: Optional downstream websocket event timeline
// - apiRequest: The API request data
// - apiResponse: The API response data
// - apiWebsocketTimeline: Optional upstream websocket event timeline
// - requestID: Optional request ID for log file naming
// - requestTimestamp: When the request was received
// - apiResponseTimestamp: When the API response was received
//
// Returns:
// - error: An error if logging fails, nil otherwise
LogRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, requestID string, requestTimestamp, apiResponseTimestamp time.Time) error
LogRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, websocketTimeline, apiRequest, apiResponse, apiWebsocketTimeline []byte, apiResponseErrors []*interfaces.ErrorMessage, requestID string, requestTimestamp, apiResponseTimestamp time.Time) error
// LogStreamingRequest initiates logging for a streaming request and returns a writer for chunks.
//
@@ -111,6 +114,16 @@ type StreamingLogWriter interface {
// - error: An error if writing fails, nil otherwise
WriteAPIResponse(apiResponse []byte) error
// WriteAPIWebsocketTimeline writes the upstream websocket timeline to the log.
// This should be called when upstream communication happened over websocket.
//
// Parameters:
// - apiWebsocketTimeline: The upstream websocket event timeline
//
// Returns:
// - error: An error if writing fails, nil otherwise
WriteAPIWebsocketTimeline(apiWebsocketTimeline []byte) error
// SetFirstChunkTimestamp sets the TTFB timestamp captured when first chunk was received.
//
// Parameters:
@@ -203,17 +216,17 @@ func (l *FileRequestLogger) SetErrorLogsMaxFiles(maxFiles int) {
//
// Returns:
// - error: An error if logging fails, nil otherwise
func (l *FileRequestLogger) LogRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, requestID string, requestTimestamp, apiResponseTimestamp time.Time) error {
return l.logRequest(url, method, requestHeaders, body, statusCode, responseHeaders, response, apiRequest, apiResponse, apiResponseErrors, false, requestID, requestTimestamp, apiResponseTimestamp)
func (l *FileRequestLogger) LogRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, websocketTimeline, apiRequest, apiResponse, apiWebsocketTimeline []byte, apiResponseErrors []*interfaces.ErrorMessage, requestID string, requestTimestamp, apiResponseTimestamp time.Time) error {
return l.logRequest(url, method, requestHeaders, body, statusCode, responseHeaders, response, websocketTimeline, apiRequest, apiResponse, apiWebsocketTimeline, apiResponseErrors, false, requestID, requestTimestamp, apiResponseTimestamp)
}
// LogRequestWithOptions logs a request with optional forced logging behavior.
// The force flag allows writing error logs even when regular request logging is disabled.
func (l *FileRequestLogger) LogRequestWithOptions(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, force bool, requestID string, requestTimestamp, apiResponseTimestamp time.Time) error {
return l.logRequest(url, method, requestHeaders, body, statusCode, responseHeaders, response, apiRequest, apiResponse, apiResponseErrors, force, requestID, requestTimestamp, apiResponseTimestamp)
func (l *FileRequestLogger) LogRequestWithOptions(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, websocketTimeline, apiRequest, apiResponse, apiWebsocketTimeline []byte, apiResponseErrors []*interfaces.ErrorMessage, force bool, requestID string, requestTimestamp, apiResponseTimestamp time.Time) error {
return l.logRequest(url, method, requestHeaders, body, statusCode, responseHeaders, response, websocketTimeline, apiRequest, apiResponse, apiWebsocketTimeline, apiResponseErrors, force, requestID, requestTimestamp, apiResponseTimestamp)
}
func (l *FileRequestLogger) logRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, apiRequest, apiResponse []byte, apiResponseErrors []*interfaces.ErrorMessage, force bool, requestID string, requestTimestamp, apiResponseTimestamp time.Time) error {
func (l *FileRequestLogger) logRequest(url, method string, requestHeaders map[string][]string, body []byte, statusCode int, responseHeaders map[string][]string, response, websocketTimeline, apiRequest, apiResponse, apiWebsocketTimeline []byte, apiResponseErrors []*interfaces.ErrorMessage, force bool, requestID string, requestTimestamp, apiResponseTimestamp time.Time) error {
if !l.enabled && !force {
return nil
}
@@ -260,8 +273,10 @@ func (l *FileRequestLogger) logRequest(url, method string, requestHeaders map[st
requestHeaders,
body,
requestBodyPath,
websocketTimeline,
apiRequest,
apiResponse,
apiWebsocketTimeline,
apiResponseErrors,
statusCode,
responseHeaders,
@@ -518,8 +533,10 @@ func (l *FileRequestLogger) writeNonStreamingLog(
requestHeaders map[string][]string,
requestBody []byte,
requestBodyPath string,
websocketTimeline []byte,
apiRequest []byte,
apiResponse []byte,
apiWebsocketTimeline []byte,
apiResponseErrors []*interfaces.ErrorMessage,
statusCode int,
responseHeaders map[string][]string,
@@ -531,7 +548,16 @@ func (l *FileRequestLogger) writeNonStreamingLog(
if requestTimestamp.IsZero() {
requestTimestamp = time.Now()
}
if errWrite := writeRequestInfoWithBody(w, url, method, requestHeaders, requestBody, requestBodyPath, requestTimestamp); errWrite != nil {
isWebsocketTranscript := hasSectionPayload(websocketTimeline)
downstreamTransport := inferDownstreamTransport(requestHeaders, websocketTimeline)
upstreamTransport := inferUpstreamTransport(apiRequest, apiResponse, apiWebsocketTimeline, apiResponseErrors)
if errWrite := writeRequestInfoWithBody(w, url, method, requestHeaders, requestBody, requestBodyPath, requestTimestamp, downstreamTransport, upstreamTransport, !isWebsocketTranscript); errWrite != nil {
return errWrite
}
if errWrite := writeAPISection(w, "=== WEBSOCKET TIMELINE ===\n", "=== WEBSOCKET TIMELINE", websocketTimeline, time.Time{}); errWrite != nil {
return errWrite
}
if errWrite := writeAPISection(w, "=== API WEBSOCKET TIMELINE ===\n", "=== API WEBSOCKET TIMELINE", apiWebsocketTimeline, time.Time{}); errWrite != nil {
return errWrite
}
if errWrite := writeAPISection(w, "=== API REQUEST ===\n", "=== API REQUEST", apiRequest, time.Time{}); errWrite != nil {
@@ -543,6 +569,12 @@ func (l *FileRequestLogger) writeNonStreamingLog(
if errWrite := writeAPISection(w, "=== API RESPONSE ===\n", "=== API RESPONSE", apiResponse, apiResponseTimestamp); errWrite != nil {
return errWrite
}
if isWebsocketTranscript {
// Intentionally omit the generic downstream HTTP response section for websocket
// transcripts. The durable session exchange is captured in WEBSOCKET TIMELINE,
// and appending a one-off upgrade response snapshot would dilute that transcript.
return nil
}
return writeResponseSection(w, statusCode, true, responseHeaders, bytes.NewReader(response), decompressErr, true)
}
@@ -553,6 +585,9 @@ func writeRequestInfoWithBody(
body []byte,
bodyPath string,
timestamp time.Time,
downstreamTransport string,
upstreamTransport string,
includeBody bool,
) error {
if _, errWrite := io.WriteString(w, "=== REQUEST INFO ===\n"); errWrite != nil {
return errWrite
@@ -566,10 +601,20 @@ func writeRequestInfoWithBody(
if _, errWrite := io.WriteString(w, fmt.Sprintf("Method: %s\n", method)); errWrite != nil {
return errWrite
}
if strings.TrimSpace(downstreamTransport) != "" {
if _, errWrite := io.WriteString(w, fmt.Sprintf("Downstream Transport: %s\n", downstreamTransport)); errWrite != nil {
return errWrite
}
}
if strings.TrimSpace(upstreamTransport) != "" {
if _, errWrite := io.WriteString(w, fmt.Sprintf("Upstream Transport: %s\n", upstreamTransport)); errWrite != nil {
return errWrite
}
}
if _, errWrite := io.WriteString(w, fmt.Sprintf("Timestamp: %s\n", timestamp.Format(time.RFC3339Nano))); errWrite != nil {
return errWrite
}
if _, errWrite := io.WriteString(w, "\n"); errWrite != nil {
if errWrite := writeSectionSpacing(w, 1); errWrite != nil {
return errWrite
}
@@ -584,36 +629,121 @@ func writeRequestInfoWithBody(
}
}
}
if _, errWrite := io.WriteString(w, "\n"); errWrite != nil {
if errWrite := writeSectionSpacing(w, 1); errWrite != nil {
return errWrite
}
if !includeBody {
return nil
}
if _, errWrite := io.WriteString(w, "=== REQUEST BODY ===\n"); errWrite != nil {
return errWrite
}
bodyTrailingNewlines := 1
if bodyPath != "" {
bodyFile, errOpen := os.Open(bodyPath)
if errOpen != nil {
return errOpen
}
if _, errCopy := io.Copy(w, bodyFile); errCopy != nil {
tracker := &trailingNewlineTrackingWriter{writer: w}
written, errCopy := io.Copy(tracker, bodyFile)
if errCopy != nil {
_ = bodyFile.Close()
return errCopy
}
if written > 0 {
bodyTrailingNewlines = tracker.trailingNewlines
}
if errClose := bodyFile.Close(); errClose != nil {
log.WithError(errClose).Warn("failed to close request body temp file")
}
} else if _, errWrite := w.Write(body); errWrite != nil {
return errWrite
} else if len(body) > 0 {
bodyTrailingNewlines = countTrailingNewlinesBytes(body)
}
if _, errWrite := io.WriteString(w, "\n\n"); errWrite != nil {
if errWrite := writeSectionSpacing(w, bodyTrailingNewlines); errWrite != nil {
return errWrite
}
return nil
}
func countTrailingNewlinesBytes(payload []byte) int {
count := 0
for i := len(payload) - 1; i >= 0; i-- {
if payload[i] != '\n' {
break
}
count++
}
return count
}
func writeSectionSpacing(w io.Writer, trailingNewlines int) error {
missingNewlines := 3 - trailingNewlines
if missingNewlines <= 0 {
return nil
}
_, errWrite := io.WriteString(w, strings.Repeat("\n", missingNewlines))
return errWrite
}
type trailingNewlineTrackingWriter struct {
writer io.Writer
trailingNewlines int
}
func (t *trailingNewlineTrackingWriter) Write(payload []byte) (int, error) {
written, errWrite := t.writer.Write(payload)
if written > 0 {
writtenPayload := payload[:written]
trailingNewlines := countTrailingNewlinesBytes(writtenPayload)
if trailingNewlines == len(writtenPayload) {
t.trailingNewlines += trailingNewlines
} else {
t.trailingNewlines = trailingNewlines
}
}
return written, errWrite
}
func hasSectionPayload(payload []byte) bool {
return len(bytes.TrimSpace(payload)) > 0
}
func inferDownstreamTransport(headers map[string][]string, websocketTimeline []byte) string {
if hasSectionPayload(websocketTimeline) {
return "websocket"
}
for key, values := range headers {
if strings.EqualFold(strings.TrimSpace(key), "Upgrade") {
for _, value := range values {
if strings.EqualFold(strings.TrimSpace(value), "websocket") {
return "websocket"
}
}
}
}
return "http"
}
func inferUpstreamTransport(apiRequest, apiResponse, apiWebsocketTimeline []byte, _ []*interfaces.ErrorMessage) string {
hasHTTP := hasSectionPayload(apiRequest) || hasSectionPayload(apiResponse)
hasWS := hasSectionPayload(apiWebsocketTimeline)
switch {
case hasHTTP && hasWS:
return "websocket+http"
case hasWS:
return "websocket"
case hasHTTP:
return "http"
default:
return ""
}
}
func writeAPISection(w io.Writer, sectionHeader string, sectionPrefix string, payload []byte, timestamp time.Time) error {
if len(payload) == 0 {
return nil
@@ -623,11 +753,6 @@ func writeAPISection(w io.Writer, sectionHeader string, sectionPrefix string, pa
if _, errWrite := w.Write(payload); errWrite != nil {
return errWrite
}
if !bytes.HasSuffix(payload, []byte("\n")) {
if _, errWrite := io.WriteString(w, "\n"); errWrite != nil {
return errWrite
}
}
} else {
if _, errWrite := io.WriteString(w, sectionHeader); errWrite != nil {
return errWrite
@@ -640,12 +765,9 @@ func writeAPISection(w io.Writer, sectionHeader string, sectionPrefix string, pa
if _, errWrite := w.Write(payload); errWrite != nil {
return errWrite
}
if _, errWrite := io.WriteString(w, "\n"); errWrite != nil {
return errWrite
}
}
if _, errWrite := io.WriteString(w, "\n"); errWrite != nil {
if errWrite := writeSectionSpacing(w, countTrailingNewlinesBytes(payload)); errWrite != nil {
return errWrite
}
return nil
@@ -662,12 +784,17 @@ func writeAPIErrorResponses(w io.Writer, apiResponseErrors []*interfaces.ErrorMe
if _, errWrite := io.WriteString(w, fmt.Sprintf("HTTP Status: %d\n", apiResponseErrors[i].StatusCode)); errWrite != nil {
return errWrite
}
trailingNewlines := 1
if apiResponseErrors[i].Error != nil {
if _, errWrite := io.WriteString(w, apiResponseErrors[i].Error.Error()); errWrite != nil {
errText := apiResponseErrors[i].Error.Error()
if _, errWrite := io.WriteString(w, errText); errWrite != nil {
return errWrite
}
if errText != "" {
trailingNewlines = countTrailingNewlinesBytes([]byte(errText))
}
}
if _, errWrite := io.WriteString(w, "\n\n"); errWrite != nil {
if errWrite := writeSectionSpacing(w, trailingNewlines); errWrite != nil {
return errWrite
}
}
@@ -694,12 +821,18 @@ func writeResponseSection(w io.Writer, statusCode int, statusWritten bool, respo
}
}
if _, errWrite := io.WriteString(w, "\n"); errWrite != nil {
return errWrite
var bufferedReader *bufio.Reader
if responseReader != nil {
bufferedReader = bufio.NewReader(responseReader)
}
if !responseBodyStartsWithLeadingNewline(bufferedReader) {
if _, errWrite := io.WriteString(w, "\n"); errWrite != nil {
return errWrite
}
}
if responseReader != nil {
if _, errCopy := io.Copy(w, responseReader); errCopy != nil {
if bufferedReader != nil {
if _, errCopy := io.Copy(w, bufferedReader); errCopy != nil {
return errCopy
}
}
@@ -717,6 +850,19 @@ func writeResponseSection(w io.Writer, statusCode int, statusWritten bool, respo
return nil
}
func responseBodyStartsWithLeadingNewline(reader *bufio.Reader) bool {
if reader == nil {
return false
}
if peeked, _ := reader.Peek(2); len(peeked) >= 2 && peeked[0] == '\r' && peeked[1] == '\n' {
return true
}
if peeked, _ := reader.Peek(1); len(peeked) >= 1 && peeked[0] == '\n' {
return true
}
return false
}
// formatLogContent creates the complete log content for non-streaming requests.
//
// Parameters:
@@ -724,6 +870,7 @@ func writeResponseSection(w io.Writer, statusCode int, statusWritten bool, respo
// - method: The HTTP method
// - headers: The request headers
// - body: The request body
// - websocketTimeline: The downstream websocket event timeline
// - apiRequest: The API request data
// - apiResponse: The API response data
// - response: The raw response data
@@ -732,11 +879,42 @@ func writeResponseSection(w io.Writer, statusCode int, statusWritten bool, respo
//
// Returns:
// - string: The formatted log content
func (l *FileRequestLogger) formatLogContent(url, method string, headers map[string][]string, body, apiRequest, apiResponse, response []byte, status int, responseHeaders map[string][]string, apiResponseErrors []*interfaces.ErrorMessage) string {
func (l *FileRequestLogger) formatLogContent(url, method string, headers map[string][]string, body, websocketTimeline, apiRequest, apiResponse, apiWebsocketTimeline, response []byte, status int, responseHeaders map[string][]string, apiResponseErrors []*interfaces.ErrorMessage) string {
var content strings.Builder
isWebsocketTranscript := hasSectionPayload(websocketTimeline)
downstreamTransport := inferDownstreamTransport(headers, websocketTimeline)
upstreamTransport := inferUpstreamTransport(apiRequest, apiResponse, apiWebsocketTimeline, apiResponseErrors)
// Request info
content.WriteString(l.formatRequestInfo(url, method, headers, body))
content.WriteString(l.formatRequestInfo(url, method, headers, body, downstreamTransport, upstreamTransport, !isWebsocketTranscript))
if len(websocketTimeline) > 0 {
if bytes.HasPrefix(websocketTimeline, []byte("=== WEBSOCKET TIMELINE")) {
content.Write(websocketTimeline)
if !bytes.HasSuffix(websocketTimeline, []byte("\n")) {
content.WriteString("\n")
}
} else {
content.WriteString("=== WEBSOCKET TIMELINE ===\n")
content.Write(websocketTimeline)
content.WriteString("\n")
}
content.WriteString("\n")
}
if len(apiWebsocketTimeline) > 0 {
if bytes.HasPrefix(apiWebsocketTimeline, []byte("=== API WEBSOCKET TIMELINE")) {
content.Write(apiWebsocketTimeline)
if !bytes.HasSuffix(apiWebsocketTimeline, []byte("\n")) {
content.WriteString("\n")
}
} else {
content.WriteString("=== API WEBSOCKET TIMELINE ===\n")
content.Write(apiWebsocketTimeline)
content.WriteString("\n")
}
content.WriteString("\n")
}
if len(apiRequest) > 0 {
if bytes.HasPrefix(apiRequest, []byte("=== API REQUEST")) {
@@ -773,6 +951,12 @@ func (l *FileRequestLogger) formatLogContent(url, method string, headers map[str
content.WriteString("\n")
}
if isWebsocketTranscript {
// Mirror writeNonStreamingLog: websocket transcripts end with the dedicated
// timeline sections instead of a generic downstream HTTP response block.
return content.String()
}
// Response section
content.WriteString("=== RESPONSE ===\n")
content.WriteString(fmt.Sprintf("Status: %d\n", status))
@@ -933,13 +1117,19 @@ func (l *FileRequestLogger) decompressZstd(data []byte) ([]byte, error) {
//
// Returns:
// - string: The formatted request information
func (l *FileRequestLogger) formatRequestInfo(url, method string, headers map[string][]string, body []byte) string {
func (l *FileRequestLogger) formatRequestInfo(url, method string, headers map[string][]string, body []byte, downstreamTransport string, upstreamTransport string, includeBody bool) string {
var content strings.Builder
content.WriteString("=== REQUEST INFO ===\n")
content.WriteString(fmt.Sprintf("Version: %s\n", buildinfo.Version))
content.WriteString(fmt.Sprintf("URL: %s\n", url))
content.WriteString(fmt.Sprintf("Method: %s\n", method))
if strings.TrimSpace(downstreamTransport) != "" {
content.WriteString(fmt.Sprintf("Downstream Transport: %s\n", downstreamTransport))
}
if strings.TrimSpace(upstreamTransport) != "" {
content.WriteString(fmt.Sprintf("Upstream Transport: %s\n", upstreamTransport))
}
content.WriteString(fmt.Sprintf("Timestamp: %s\n", time.Now().Format(time.RFC3339Nano)))
content.WriteString("\n")
@@ -952,6 +1142,10 @@ func (l *FileRequestLogger) formatRequestInfo(url, method string, headers map[st
}
content.WriteString("\n")
if !includeBody {
return content.String()
}
content.WriteString("=== REQUEST BODY ===\n")
content.Write(body)
content.WriteString("\n\n")
@@ -1011,6 +1205,9 @@ type FileStreamingLogWriter struct {
// apiResponse stores the upstream API response data.
apiResponse []byte
// apiWebsocketTimeline stores the upstream websocket event timeline.
apiWebsocketTimeline []byte
// apiResponseTimestamp captures when the API response was received.
apiResponseTimestamp time.Time
}
@@ -1092,6 +1289,21 @@ func (w *FileStreamingLogWriter) WriteAPIResponse(apiResponse []byte) error {
return nil
}
// WriteAPIWebsocketTimeline buffers the upstream websocket timeline for later writing.
//
// Parameters:
// - apiWebsocketTimeline: The upstream websocket event timeline
//
// Returns:
// - error: Always returns nil (buffering cannot fail)
func (w *FileStreamingLogWriter) WriteAPIWebsocketTimeline(apiWebsocketTimeline []byte) error {
if len(apiWebsocketTimeline) == 0 {
return nil
}
w.apiWebsocketTimeline = bytes.Clone(apiWebsocketTimeline)
return nil
}
func (w *FileStreamingLogWriter) SetFirstChunkTimestamp(timestamp time.Time) {
if !timestamp.IsZero() {
w.apiResponseTimestamp = timestamp
@@ -1100,7 +1312,7 @@ func (w *FileStreamingLogWriter) SetFirstChunkTimestamp(timestamp time.Time) {
// Close finalizes the log file and cleans up resources.
// It writes all buffered data to the file in the correct order:
// API REQUEST -> API RESPONSE -> RESPONSE (status, headers, body chunks)
// API WEBSOCKET TIMELINE -> API REQUEST -> API RESPONSE -> RESPONSE (status, headers, body chunks)
//
// Returns:
// - error: An error if closing fails, nil otherwise
@@ -1182,7 +1394,10 @@ func (w *FileStreamingLogWriter) asyncWriter() {
}
func (w *FileStreamingLogWriter) writeFinalLog(logFile *os.File) error {
if errWrite := writeRequestInfoWithBody(logFile, w.url, w.method, w.requestHeaders, nil, w.requestBodyPath, w.timestamp); errWrite != nil {
if errWrite := writeRequestInfoWithBody(logFile, w.url, w.method, w.requestHeaders, nil, w.requestBodyPath, w.timestamp, "http", inferUpstreamTransport(w.apiRequest, w.apiResponse, w.apiWebsocketTimeline, nil), true); errWrite != nil {
return errWrite
}
if errWrite := writeAPISection(logFile, "=== API WEBSOCKET TIMELINE ===\n", "=== API WEBSOCKET TIMELINE", w.apiWebsocketTimeline, time.Time{}); errWrite != nil {
return errWrite
}
if errWrite := writeAPISection(logFile, "=== API REQUEST ===\n", "=== API REQUEST", w.apiRequest, time.Time{}); errWrite != nil {
@@ -1265,6 +1480,17 @@ func (w *NoOpStreamingLogWriter) WriteAPIResponse(_ []byte) error {
return nil
}
// WriteAPIWebsocketTimeline is a no-op implementation that does nothing and always returns nil.
//
// Parameters:
// - apiWebsocketTimeline: The upstream websocket event timeline (ignored)
//
// Returns:
// - error: Always returns nil
func (w *NoOpStreamingLogWriter) WriteAPIWebsocketTimeline(_ []byte) error {
return nil
}
func (w *NoOpStreamingLogWriter) SetFirstChunkTimestamp(_ time.Time) {}
// Close is a no-op implementation that does nothing and always returns nil.

View File

@@ -0,0 +1,151 @@
// Package misc provides miscellaneous utility functions for the CLI Proxy API server.
package misc
import (
"context"
"encoding/json"
"errors"
"fmt"
"net/http"
"sync"
"time"
log "github.com/sirupsen/logrus"
)
const (
antigravityReleasesURL = "https://antigravity-auto-updater-974169037036.us-central1.run.app/releases"
antigravityFallbackVersion = "1.21.9"
antigravityVersionCacheTTL = 6 * time.Hour
antigravityFetchTimeout = 10 * time.Second
)
type antigravityRelease struct {
Version string `json:"version"`
ExecutionID string `json:"execution_id"`
}
var (
cachedAntigravityVersion = antigravityFallbackVersion
antigravityVersionMu sync.RWMutex
antigravityVersionExpiry time.Time
antigravityUpdaterOnce sync.Once
)
// StartAntigravityVersionUpdater starts a background goroutine that periodically refreshes the cached antigravity version.
// This is intentionally decoupled from request execution to avoid blocking executors on version lookups.
func StartAntigravityVersionUpdater(ctx context.Context) {
antigravityUpdaterOnce.Do(func() {
go runAntigravityVersionUpdater(ctx)
})
}
func runAntigravityVersionUpdater(ctx context.Context) {
if ctx == nil {
ctx = context.Background()
}
ticker := time.NewTicker(antigravityVersionCacheTTL / 2)
defer ticker.Stop()
log.Infof("periodic antigravity version refresh started (interval=%s)", antigravityVersionCacheTTL/2)
refreshAntigravityVersion(ctx)
for {
select {
case <-ctx.Done():
return
case <-ticker.C:
refreshAntigravityVersion(ctx)
}
}
}
func refreshAntigravityVersion(ctx context.Context) {
version, errFetch := fetchAntigravityLatestVersion(ctx)
antigravityVersionMu.Lock()
defer antigravityVersionMu.Unlock()
now := time.Now()
if errFetch == nil {
cachedAntigravityVersion = version
antigravityVersionExpiry = now.Add(antigravityVersionCacheTTL)
log.WithField("version", version).Info("fetched latest antigravity version")
return
}
if cachedAntigravityVersion == "" || now.After(antigravityVersionExpiry) {
cachedAntigravityVersion = antigravityFallbackVersion
antigravityVersionExpiry = now.Add(antigravityVersionCacheTTL)
log.WithError(errFetch).Warn("failed to refresh antigravity version, using fallback version")
return
}
log.WithError(errFetch).Debug("failed to refresh antigravity version, keeping cached value")
}
// AntigravityLatestVersion returns the cached antigravity version refreshed by StartAntigravityVersionUpdater.
// It falls back to antigravityFallbackVersion if the cache is empty or stale.
func AntigravityLatestVersion() string {
antigravityVersionMu.RLock()
if cachedAntigravityVersion != "" && time.Now().Before(antigravityVersionExpiry) {
v := cachedAntigravityVersion
antigravityVersionMu.RUnlock()
return v
}
antigravityVersionMu.RUnlock()
return antigravityFallbackVersion
}
// AntigravityUserAgent returns the User-Agent string for antigravity requests
// using the latest version fetched from the releases API.
func AntigravityUserAgent() string {
return fmt.Sprintf("antigravity/%s darwin/arm64", AntigravityLatestVersion())
}
func fetchAntigravityLatestVersion(ctx context.Context) (string, error) {
if ctx == nil {
ctx = context.Background()
}
client := &http.Client{Timeout: antigravityFetchTimeout}
httpReq, errReq := http.NewRequestWithContext(ctx, http.MethodGet, antigravityReleasesURL, nil)
if errReq != nil {
return "", fmt.Errorf("build antigravity releases request: %w", errReq)
}
resp, errDo := client.Do(httpReq)
if errDo != nil {
return "", fmt.Errorf("fetch antigravity releases: %w", errDo)
}
defer func() {
if errClose := resp.Body.Close(); errClose != nil {
log.WithError(errClose).Warn("antigravity releases response body close error")
}
}()
if resp.StatusCode != http.StatusOK {
return "", fmt.Errorf("antigravity releases API returned status %d", resp.StatusCode)
}
var releases []antigravityRelease
if errDecode := json.NewDecoder(resp.Body).Decode(&releases); errDecode != nil {
return "", fmt.Errorf("decode antigravity releases response: %w", errDecode)
}
if len(releases) == 0 {
return "", errors.New("antigravity releases API returned empty list")
}
version := releases[0].Version
if version == "" {
return "", errors.New("antigravity releases API returned empty version")
}
return version, nil
}

View File

@@ -17,8 +17,6 @@ type staticModelsJSON struct {
CodexTeam []*ModelInfo `json:"codex-team"`
CodexPlus []*ModelInfo `json:"codex-plus"`
CodexPro []*ModelInfo `json:"codex-pro"`
Qwen []*ModelInfo `json:"qwen"`
IFlow []*ModelInfo `json:"iflow"`
Kimi []*ModelInfo `json:"kimi"`
Antigravity []*ModelInfo `json:"antigravity"`
}
@@ -68,16 +66,6 @@ func GetCodexProModels() []*ModelInfo {
return cloneModelInfos(getModels().CodexPro)
}
// GetQwenModels returns the standard Qwen model definitions.
func GetQwenModels() []*ModelInfo {
return cloneModelInfos(getModels().Qwen)
}
// GetIFlowModels returns the standard iFlow model definitions.
func GetIFlowModels() []*ModelInfo {
return cloneModelInfos(getModels().IFlow)
}
// GetKimiModels returns the standard Kimi (Moonshot AI) model definitions.
func GetKimiModels() []*ModelInfo {
return cloneModelInfos(getModels().Kimi)
@@ -93,6 +81,54 @@ func GetAntigravityModels() []*ModelInfo {
func GetCodeBuddyModels() []*ModelInfo {
now := int64(1748044800) // 2025-05-24
return []*ModelInfo{
{
ID: "auto",
Object: "model",
Created: now,
OwnedBy: "tencent",
Type: "codebuddy",
DisplayName: "Auto",
Description: "Automatic model selection via CodeBuddy",
ContextLength: 128000,
MaxCompletionTokens: 32768,
SupportedEndpoints: []string{"/chat/completions"},
},
{
ID: "glm-5v-turbo",
Object: "model",
Created: now,
OwnedBy: "tencent",
Type: "codebuddy",
DisplayName: "GLM-5v Turbo",
Description: "GLM-5v Turbo via CodeBuddy",
ContextLength: 200000,
MaxCompletionTokens: 32768,
SupportedEndpoints: []string{"/chat/completions"},
},
{
ID: "glm-5.1",
Object: "model",
Created: now,
OwnedBy: "tencent",
Type: "codebuddy",
DisplayName: "GLM-5.1",
Description: "GLM-5.1 via CodeBuddy",
ContextLength: 200000,
MaxCompletionTokens: 32768,
SupportedEndpoints: []string{"/chat/completions"},
},
{
ID: "glm-5.0-turbo",
Object: "model",
Created: now,
OwnedBy: "tencent",
Type: "codebuddy",
DisplayName: "GLM-5.0 Turbo",
Description: "GLM-5.0 Turbo via CodeBuddy",
ContextLength: 200000,
MaxCompletionTokens: 32768,
SupportedEndpoints: []string{"/chat/completions"},
},
{
ID: "glm-5.0",
Object: "model",
@@ -101,7 +137,7 @@ func GetCodeBuddyModels() []*ModelInfo {
Type: "codebuddy",
DisplayName: "GLM-5.0",
Description: "GLM-5.0 via CodeBuddy",
ContextLength: 128000,
ContextLength: 200000,
MaxCompletionTokens: 32768,
SupportedEndpoints: []string{"/chat/completions"},
},
@@ -113,18 +149,18 @@ func GetCodeBuddyModels() []*ModelInfo {
Type: "codebuddy",
DisplayName: "GLM-4.7",
Description: "GLM-4.7 via CodeBuddy",
ContextLength: 128000,
ContextLength: 200000,
MaxCompletionTokens: 32768,
SupportedEndpoints: []string{"/chat/completions"},
},
{
ID: "minimax-m2.5",
ID: "minimax-m2.7",
Object: "model",
Created: now,
OwnedBy: "tencent",
Type: "codebuddy",
DisplayName: "MiniMax M2.5",
Description: "MiniMax M2.5 via CodeBuddy",
DisplayName: "MiniMax M2.7",
Description: "MiniMax M2.7 via CodeBuddy",
ContextLength: 200000,
MaxCompletionTokens: 32768,
SupportedEndpoints: []string{"/chat/completions"},
@@ -137,10 +173,23 @@ func GetCodeBuddyModels() []*ModelInfo {
Type: "codebuddy",
DisplayName: "Kimi K2.5",
Description: "Kimi K2.5 via CodeBuddy",
ContextLength: 128000,
ContextLength: 256000,
MaxCompletionTokens: 32768,
SupportedEndpoints: []string{"/chat/completions"},
},
{
ID: "kimi-k2-thinking",
Object: "model",
Created: now,
OwnedBy: "tencent",
Type: "codebuddy",
DisplayName: "Kimi K2 Thinking",
Description: "Kimi K2 Thinking via CodeBuddy",
ContextLength: 256000,
MaxCompletionTokens: 32768,
Thinking: &ThinkingSupport{ZeroAllowed: true},
SupportedEndpoints: []string{"/chat/completions"},
},
{
ID: "deepseek-v3-2-volc",
Object: "model",
@@ -148,24 +197,11 @@ func GetCodeBuddyModels() []*ModelInfo {
OwnedBy: "tencent",
Type: "codebuddy",
DisplayName: "DeepSeek V3.2 (Volc)",
Description: "DeepSeek V3.2 via CodeBuddy (Volcano Engine)",
Description: "DeepSeek V3.2 via CodeBuddy",
ContextLength: 128000,
MaxCompletionTokens: 32768,
SupportedEndpoints: []string{"/chat/completions"},
},
{
ID: "hunyuan-2.0-thinking",
Object: "model",
Created: now,
OwnedBy: "tencent",
Type: "codebuddy",
DisplayName: "Hunyuan 2.0 Thinking",
Description: "Tencent Hunyuan 2.0 Thinking via CodeBuddy",
ContextLength: 128000,
MaxCompletionTokens: 32768,
Thinking: &ThinkingSupport{ZeroAllowed: true},
SupportedEndpoints: []string{"/chat/completions"},
},
}
}
@@ -191,8 +227,6 @@ func cloneModelInfos(models []*ModelInfo) []*ModelInfo {
// - gemini-cli
// - aistudio
// - codex
// - qwen
// - iflow
// - kimi
// - kilo
// - github-copilot
@@ -213,10 +247,6 @@ func GetStaticModelDefinitionsByChannel(channel string) []*ModelInfo {
return GetAIStudioModels()
case "codex":
return GetCodexProModels()
case "qwen":
return GetQwenModels()
case "iflow":
return GetIFlowModels()
case "kimi":
return GetKimiModels()
case "github-copilot":
@@ -231,11 +261,25 @@ func GetStaticModelDefinitionsByChannel(channel string) []*ModelInfo {
return GetAntigravityModels()
case "codebuddy":
return GetCodeBuddyModels()
case "cursor":
return GetCursorModels()
default:
return nil
}
}
// GetCursorModels returns the fallback Cursor model definitions.
func GetCursorModels() []*ModelInfo {
return []*ModelInfo{
{ID: "composer-2", Object: "model", OwnedBy: "cursor", Type: "cursor", DisplayName: "Composer 2", ContextLength: 200000, MaxCompletionTokens: 64000, Thinking: &ThinkingSupport{Max: 50000, DynamicAllowed: true}},
{ID: "claude-4-sonnet", Object: "model", OwnedBy: "cursor", Type: "cursor", DisplayName: "Claude 4 Sonnet", ContextLength: 200000, MaxCompletionTokens: 64000, Thinking: &ThinkingSupport{Max: 50000, DynamicAllowed: true}},
{ID: "claude-3.5-sonnet", Object: "model", OwnedBy: "cursor", Type: "cursor", DisplayName: "Claude 3.5 Sonnet", ContextLength: 200000, MaxCompletionTokens: 8192},
{ID: "gpt-4o", Object: "model", OwnedBy: "cursor", Type: "cursor", DisplayName: "GPT-4o", ContextLength: 128000, MaxCompletionTokens: 16384},
{ID: "cursor-small", Object: "model", OwnedBy: "cursor", Type: "cursor", DisplayName: "Cursor Small", ContextLength: 200000, MaxCompletionTokens: 64000},
{ID: "gemini-2.5-pro", Object: "model", OwnedBy: "cursor", Type: "cursor", DisplayName: "Gemini 2.5 Pro", ContextLength: 1000000, MaxCompletionTokens: 65536, Thinking: &ThinkingSupport{Max: 50000, DynamicAllowed: true}},
}
}
// LookupStaticModelInfo searches all static model definitions for a model by ID.
// Returns nil if no matching model is found.
func LookupStaticModelInfo(modelID string) *ModelInfo {
@@ -251,8 +295,6 @@ func LookupStaticModelInfo(modelID string) *ModelInfo {
data.GeminiCLI,
data.AIStudio,
data.CodexPro,
data.Qwen,
data.IFlow,
data.Kimi,
data.Antigravity,
GetGitHubCopilotModels(),
@@ -260,6 +302,7 @@ func LookupStaticModelInfo(modelID string) *ModelInfo {
GetKiloModels(),
GetAmazonQModels(),
GetCodeBuddyModels(),
GetCursorModels(),
}
for _, models := range allModels {
for _, m := range models {
@@ -272,10 +315,18 @@ func LookupStaticModelInfo(modelID string) *ModelInfo {
return nil
}
// defaultCopilotClaudeContextLength is the conservative prompt token limit for
// Claude models accessed via the GitHub Copilot API. Individual accounts are
// capped at 128K; business accounts at 168K. When the dynamic /models API fetch
// succeeds, the real per-account limit overrides this value. This constant is
// only used as a safe fallback.
const defaultCopilotClaudeContextLength = 128000
// GetGitHubCopilotModels returns the available models for GitHub Copilot.
// These models are available through the GitHub Copilot API at api.githubcopilot.com.
func GetGitHubCopilotModels() []*ModelInfo {
now := int64(1732752000) // 2024-11-27
copilotClaudeEndpoints := []string{"/chat/completions", "/messages"}
gpt4oEntries := []struct {
ID string
DisplayName string
@@ -462,6 +513,19 @@ func GetGitHubCopilotModels() []*ModelInfo {
SupportedEndpoints: []string{"/responses"},
Thinking: &ThinkingSupport{Levels: []string{"none", "low", "medium", "high", "xhigh"}},
},
{
ID: "gpt-5.4-mini",
Object: "model",
Created: now,
OwnedBy: "github-copilot",
Type: "github-copilot",
DisplayName: "GPT-5.4 mini",
Description: "OpenAI GPT-5.4 mini via GitHub Copilot",
ContextLength: 200000,
MaxCompletionTokens: 32768,
SupportedEndpoints: []string{"/responses"},
Thinking: &ThinkingSupport{Levels: []string{"none", "low", "medium", "high", "xhigh"}},
},
{
ID: "claude-haiku-4.5",
Object: "model",
@@ -470,9 +534,9 @@ func GetGitHubCopilotModels() []*ModelInfo {
Type: "github-copilot",
DisplayName: "Claude Haiku 4.5",
Description: "Anthropic Claude Haiku 4.5 via GitHub Copilot",
ContextLength: 200000,
ContextLength: defaultCopilotClaudeContextLength,
MaxCompletionTokens: 64000,
SupportedEndpoints: []string{"/chat/completions"},
SupportedEndpoints: copilotClaudeEndpoints,
},
{
ID: "claude-opus-4.1",
@@ -482,9 +546,9 @@ func GetGitHubCopilotModels() []*ModelInfo {
Type: "github-copilot",
DisplayName: "Claude Opus 4.1",
Description: "Anthropic Claude Opus 4.1 via GitHub Copilot",
ContextLength: 200000,
ContextLength: defaultCopilotClaudeContextLength,
MaxCompletionTokens: 32000,
SupportedEndpoints: []string{"/chat/completions"},
SupportedEndpoints: copilotClaudeEndpoints,
},
{
ID: "claude-opus-4.5",
@@ -494,9 +558,10 @@ func GetGitHubCopilotModels() []*ModelInfo {
Type: "github-copilot",
DisplayName: "Claude Opus 4.5",
Description: "Anthropic Claude Opus 4.5 via GitHub Copilot",
ContextLength: 200000,
ContextLength: defaultCopilotClaudeContextLength,
MaxCompletionTokens: 64000,
SupportedEndpoints: []string{"/chat/completions"},
SupportedEndpoints: copilotClaudeEndpoints,
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high"}},
},
{
ID: "claude-opus-4.6",
@@ -506,9 +571,10 @@ func GetGitHubCopilotModels() []*ModelInfo {
Type: "github-copilot",
DisplayName: "Claude Opus 4.6",
Description: "Anthropic Claude Opus 4.6 via GitHub Copilot",
ContextLength: 200000,
ContextLength: defaultCopilotClaudeContextLength,
MaxCompletionTokens: 64000,
SupportedEndpoints: []string{"/chat/completions"},
SupportedEndpoints: copilotClaudeEndpoints,
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high"}},
},
{
ID: "claude-sonnet-4",
@@ -518,9 +584,10 @@ func GetGitHubCopilotModels() []*ModelInfo {
Type: "github-copilot",
DisplayName: "Claude Sonnet 4",
Description: "Anthropic Claude Sonnet 4 via GitHub Copilot",
ContextLength: 200000,
ContextLength: defaultCopilotClaudeContextLength,
MaxCompletionTokens: 64000,
SupportedEndpoints: []string{"/chat/completions"},
SupportedEndpoints: copilotClaudeEndpoints,
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high"}},
},
{
ID: "claude-sonnet-4.5",
@@ -530,9 +597,10 @@ func GetGitHubCopilotModels() []*ModelInfo {
Type: "github-copilot",
DisplayName: "Claude Sonnet 4.5",
Description: "Anthropic Claude Sonnet 4.5 via GitHub Copilot",
ContextLength: 200000,
ContextLength: defaultCopilotClaudeContextLength,
MaxCompletionTokens: 64000,
SupportedEndpoints: []string{"/chat/completions"},
SupportedEndpoints: copilotClaudeEndpoints,
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high"}},
},
{
ID: "claude-sonnet-4.6",
@@ -542,9 +610,10 @@ func GetGitHubCopilotModels() []*ModelInfo {
Type: "github-copilot",
DisplayName: "Claude Sonnet 4.6",
Description: "Anthropic Claude Sonnet 4.6 via GitHub Copilot",
ContextLength: 200000,
ContextLength: defaultCopilotClaudeContextLength,
MaxCompletionTokens: 64000,
SupportedEndpoints: []string{"/chat/completions"},
SupportedEndpoints: copilotClaudeEndpoints,
Thinking: &ThinkingSupport{Levels: []string{"low", "medium", "high"}},
},
{
ID: "gemini-2.5-pro",
@@ -556,6 +625,7 @@ func GetGitHubCopilotModels() []*ModelInfo {
Description: "Google Gemini 2.5 Pro via GitHub Copilot",
ContextLength: 1048576,
MaxCompletionTokens: 65536,
SupportedEndpoints: []string{"/chat/completions"},
},
{
ID: "gemini-3-pro-preview",
@@ -567,6 +637,7 @@ func GetGitHubCopilotModels() []*ModelInfo {
Description: "Google Gemini 3 Pro Preview via GitHub Copilot",
ContextLength: 1048576,
MaxCompletionTokens: 65536,
SupportedEndpoints: []string{"/chat/completions"},
},
{
ID: "gemini-3.1-pro-preview",
@@ -576,8 +647,9 @@ func GetGitHubCopilotModels() []*ModelInfo {
Type: "github-copilot",
DisplayName: "Gemini 3.1 Pro (Preview)",
Description: "Google Gemini 3.1 Pro Preview via GitHub Copilot",
ContextLength: 1048576,
ContextLength: 173000,
MaxCompletionTokens: 65536,
SupportedEndpoints: []string{"/chat/completions"},
},
{
ID: "gemini-3-flash-preview",
@@ -587,8 +659,9 @@ func GetGitHubCopilotModels() []*ModelInfo {
Type: "github-copilot",
DisplayName: "Gemini 3 Flash (Preview)",
Description: "Google Gemini 3 Flash Preview via GitHub Copilot",
ContextLength: 1048576,
ContextLength: 173000,
MaxCompletionTokens: 65536,
SupportedEndpoints: []string{"/chat/completions"},
},
{
ID: "grok-code-fast-1",

View File

@@ -0,0 +1,70 @@
package registry
import "testing"
func TestGitHubCopilotGeminiModelsAreChatOnly(t *testing.T) {
models := GetGitHubCopilotModels()
required := map[string]bool{
"gemini-2.5-pro": false,
"gemini-3-pro-preview": false,
"gemini-3.1-pro-preview": false,
"gemini-3-flash-preview": false,
}
for _, model := range models {
if _, ok := required[model.ID]; !ok {
continue
}
required[model.ID] = true
if len(model.SupportedEndpoints) != 1 || model.SupportedEndpoints[0] != "/chat/completions" {
t.Fatalf("model %q supported endpoints = %v, want [/chat/completions]", model.ID, model.SupportedEndpoints)
}
}
for modelID, found := range required {
if !found {
t.Fatalf("expected GitHub Copilot model %q in definitions", modelID)
}
}
}
func TestGitHubCopilotClaudeModelsSupportMessages(t *testing.T) {
models := GetGitHubCopilotModels()
required := map[string]bool{
"claude-haiku-4.5": false,
"claude-opus-4.1": false,
"claude-opus-4.5": false,
"claude-opus-4.6": false,
"claude-sonnet-4": false,
"claude-sonnet-4.5": false,
"claude-sonnet-4.6": false,
}
for _, model := range models {
if _, ok := required[model.ID]; !ok {
continue
}
required[model.ID] = true
if !containsString(model.SupportedEndpoints, "/chat/completions") {
t.Fatalf("model %q supported endpoints = %v, missing /chat/completions", model.ID, model.SupportedEndpoints)
}
if !containsString(model.SupportedEndpoints, "/messages") {
t.Fatalf("model %q supported endpoints = %v, missing /messages", model.ID, model.SupportedEndpoints)
}
}
for modelID, found := range required {
if !found {
t.Fatalf("expected GitHub Copilot model %q in definitions", modelID)
}
}
}
func containsString(items []string, want string) bool {
for _, item := range items {
if item == want {
return true
}
}
return false
}

View File

@@ -1177,6 +1177,16 @@ func (r *ModelRegistry) convertModelToMap(model *ModelInfo, handlerType string)
"dynamic_allowed": model.Thinking.DynamicAllowed,
}
}
// Include context limits so Claude Code can manage conversation
// context correctly, especially for Copilot-proxied models whose
// real prompt limit (128K-168K) is much lower than the 1M window
// that Claude Code may assume for Opus 4.6 with 1M context enabled.
if model.ContextLength > 0 {
result["context_length"] = model.ContextLength
}
if model.MaxCompletionTokens > 0 {
result["max_completion_tokens"] = model.MaxCompletionTokens
}
return result
case "gemini":

View File

@@ -213,8 +213,6 @@ func detectChangedProviders(oldData, newData *staticModelsJSON) []string {
{"codex", oldData.CodexTeam, newData.CodexTeam},
{"codex", oldData.CodexPlus, newData.CodexPlus},
{"codex", oldData.CodexPro, newData.CodexPro},
{"qwen", oldData.Qwen, newData.Qwen},
{"iflow", oldData.IFlow, newData.IFlow},
{"kimi", oldData.Kimi, newData.Kimi},
{"antigravity", oldData.Antigravity, newData.Antigravity},
}
@@ -335,8 +333,6 @@ func validateModelsCatalog(data *staticModelsJSON) error {
{name: "codex-team", models: data.CodexTeam},
{name: "codex-plus", models: data.CodexPlus},
{name: "codex-pro", models: data.CodexPro},
{name: "qwen", models: data.Qwen},
{name: "iflow", models: data.IFlow},
{name: "kimi", models: data.Kimi},
{name: "antigravity", models: data.Antigravity},
}

File diff suppressed because it is too large Load Diff

View File

@@ -14,7 +14,9 @@ import (
"strings"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/runtime/executor/helps"
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
"github.com/router-for-me/CLIProxyAPI/v6/internal/wsrelay"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
cliproxyexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
@@ -46,8 +48,16 @@ func NewAIStudioExecutor(cfg *config.Config, provider string, relay *wsrelay.Man
// Identifier returns the executor identifier.
func (e *AIStudioExecutor) Identifier() string { return "aistudio" }
// PrepareRequest prepares the HTTP request for execution (no-op for AI Studio).
func (e *AIStudioExecutor) PrepareRequest(_ *http.Request, _ *cliproxyauth.Auth) error {
// PrepareRequest prepares the HTTP request for execution.
func (e *AIStudioExecutor) PrepareRequest(req *http.Request, auth *cliproxyauth.Auth) error {
if req == nil {
return nil
}
var attrs map[string]string
if auth != nil {
attrs = auth.Attributes
}
util.ApplyCustomHeadersFromAttrs(req, attrs)
return nil
}
@@ -66,6 +76,9 @@ func (e *AIStudioExecutor) HttpRequest(ctx context.Context, auth *cliproxyauth.A
return nil, fmt.Errorf("aistudio executor: missing auth")
}
httpReq := req.WithContext(ctx)
if err := e.PrepareRequest(httpReq, auth); err != nil {
return nil, err
}
if httpReq.URL == nil || strings.TrimSpace(httpReq.URL.String()) == "" {
return nil, fmt.Errorf("aistudio executor: request URL is empty")
}
@@ -115,8 +128,8 @@ func (e *AIStudioExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth,
return resp, statusErr{code: http.StatusNotImplemented, msg: "/responses/compact not supported"}
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
reporter := helps.NewUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.TrackFailure(ctx, &err)
translatedReq, body, err := e.translateRequest(req, opts, false)
if err != nil {
@@ -130,6 +143,11 @@ func (e *AIStudioExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth,
Headers: http.Header{"Content-Type": []string{"application/json"}},
Body: body.payload,
}
var attrs map[string]string
if auth != nil {
attrs = auth.Attributes
}
util.ApplyCustomHeadersFromAttrs(&http.Request{Header: wsReq.Headers}, attrs)
var authID, authLabel, authType, authValue string
if auth != nil {
@@ -137,7 +155,7 @@ func (e *AIStudioExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth,
authLabel = auth.Label
authType, authValue = auth.AccountInfo()
}
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: endpoint,
Method: http.MethodPost,
Headers: wsReq.Headers.Clone(),
@@ -151,17 +169,17 @@ func (e *AIStudioExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth,
wsResp, err := e.relay.NonStream(ctx, authID, wsReq)
if err != nil {
recordAPIResponseError(ctx, e.cfg, err)
helps.RecordAPIResponseError(ctx, e.cfg, err)
return resp, err
}
recordAPIResponseMetadata(ctx, e.cfg, wsResp.Status, wsResp.Headers.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, wsResp.Status, wsResp.Headers.Clone())
if len(wsResp.Body) > 0 {
appendAPIResponseChunk(ctx, e.cfg, wsResp.Body)
helps.AppendAPIResponseChunk(ctx, e.cfg, wsResp.Body)
}
if wsResp.Status < 200 || wsResp.Status >= 300 {
return resp, statusErr{code: wsResp.Status, msg: string(wsResp.Body)}
}
reporter.publish(ctx, parseGeminiUsage(wsResp.Body))
reporter.Publish(ctx, helps.ParseGeminiUsage(wsResp.Body))
var param any
out := sdktranslator.TranslateNonStream(ctx, body.toFormat, opts.SourceFormat, req.Model, opts.OriginalRequest, translatedReq, wsResp.Body, &param)
resp = cliproxyexecutor.Response{Payload: ensureColonSpacedJSON(out), Headers: wsResp.Headers.Clone()}
@@ -174,8 +192,8 @@ func (e *AIStudioExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth
return nil, statusErr{code: http.StatusNotImplemented, msg: "/responses/compact not supported"}
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
reporter := helps.NewUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.TrackFailure(ctx, &err)
translatedReq, body, err := e.translateRequest(req, opts, true)
if err != nil {
@@ -189,13 +207,18 @@ func (e *AIStudioExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth
Headers: http.Header{"Content-Type": []string{"application/json"}},
Body: body.payload,
}
var attrs map[string]string
if auth != nil {
attrs = auth.Attributes
}
util.ApplyCustomHeadersFromAttrs(&http.Request{Header: wsReq.Headers}, attrs)
var authID, authLabel, authType, authValue string
if auth != nil {
authID = auth.ID
authLabel = auth.Label
authType, authValue = auth.AccountInfo()
}
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: endpoint,
Method: http.MethodPost,
Headers: wsReq.Headers.Clone(),
@@ -208,24 +231,24 @@ func (e *AIStudioExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth
})
wsStream, err := e.relay.Stream(ctx, authID, wsReq)
if err != nil {
recordAPIResponseError(ctx, e.cfg, err)
helps.RecordAPIResponseError(ctx, e.cfg, err)
return nil, err
}
firstEvent, ok := <-wsStream
if !ok {
err = fmt.Errorf("wsrelay: stream closed before start")
recordAPIResponseError(ctx, e.cfg, err)
helps.RecordAPIResponseError(ctx, e.cfg, err)
return nil, err
}
if firstEvent.Status > 0 && firstEvent.Status != http.StatusOK {
metadataLogged := false
if firstEvent.Status > 0 {
recordAPIResponseMetadata(ctx, e.cfg, firstEvent.Status, firstEvent.Headers.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, firstEvent.Status, firstEvent.Headers.Clone())
metadataLogged = true
}
var body bytes.Buffer
if len(firstEvent.Payload) > 0 {
appendAPIResponseChunk(ctx, e.cfg, firstEvent.Payload)
helps.AppendAPIResponseChunk(ctx, e.cfg, firstEvent.Payload)
body.Write(firstEvent.Payload)
}
if firstEvent.Type == wsrelay.MessageTypeStreamEnd {
@@ -233,18 +256,18 @@ func (e *AIStudioExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth
}
for event := range wsStream {
if event.Err != nil {
recordAPIResponseError(ctx, e.cfg, event.Err)
helps.RecordAPIResponseError(ctx, e.cfg, event.Err)
if body.Len() == 0 {
body.WriteString(event.Err.Error())
}
break
}
if !metadataLogged && event.Status > 0 {
recordAPIResponseMetadata(ctx, e.cfg, event.Status, event.Headers.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, event.Status, event.Headers.Clone())
metadataLogged = true
}
if len(event.Payload) > 0 {
appendAPIResponseChunk(ctx, e.cfg, event.Payload)
helps.AppendAPIResponseChunk(ctx, e.cfg, event.Payload)
body.Write(event.Payload)
}
if event.Type == wsrelay.MessageTypeStreamEnd {
@@ -260,23 +283,23 @@ func (e *AIStudioExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth
metadataLogged := false
processEvent := func(event wsrelay.StreamEvent) bool {
if event.Err != nil {
recordAPIResponseError(ctx, e.cfg, event.Err)
reporter.publishFailure(ctx)
helps.RecordAPIResponseError(ctx, e.cfg, event.Err)
reporter.PublishFailure(ctx)
out <- cliproxyexecutor.StreamChunk{Err: fmt.Errorf("wsrelay: %v", event.Err)}
return false
}
switch event.Type {
case wsrelay.MessageTypeStreamStart:
if !metadataLogged && event.Status > 0 {
recordAPIResponseMetadata(ctx, e.cfg, event.Status, event.Headers.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, event.Status, event.Headers.Clone())
metadataLogged = true
}
case wsrelay.MessageTypeStreamChunk:
if len(event.Payload) > 0 {
appendAPIResponseChunk(ctx, e.cfg, event.Payload)
filtered := FilterSSEUsageMetadata(event.Payload)
if detail, ok := parseGeminiStreamUsage(filtered); ok {
reporter.publish(ctx, detail)
helps.AppendAPIResponseChunk(ctx, e.cfg, event.Payload)
filtered := helps.FilterSSEUsageMetadata(event.Payload)
if detail, ok := helps.ParseGeminiStreamUsage(filtered); ok {
reporter.Publish(ctx, detail)
}
lines := sdktranslator.TranslateStream(ctx, body.toFormat, opts.SourceFormat, req.Model, opts.OriginalRequest, translatedReq, filtered, &param)
for i := range lines {
@@ -288,21 +311,21 @@ func (e *AIStudioExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth
return false
case wsrelay.MessageTypeHTTPResp:
if !metadataLogged && event.Status > 0 {
recordAPIResponseMetadata(ctx, e.cfg, event.Status, event.Headers.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, event.Status, event.Headers.Clone())
metadataLogged = true
}
if len(event.Payload) > 0 {
appendAPIResponseChunk(ctx, e.cfg, event.Payload)
helps.AppendAPIResponseChunk(ctx, e.cfg, event.Payload)
}
lines := sdktranslator.TranslateStream(ctx, body.toFormat, opts.SourceFormat, req.Model, opts.OriginalRequest, translatedReq, event.Payload, &param)
for i := range lines {
out <- cliproxyexecutor.StreamChunk{Payload: ensureColonSpacedJSON(lines[i])}
}
reporter.publish(ctx, parseGeminiUsage(event.Payload))
reporter.Publish(ctx, helps.ParseGeminiUsage(event.Payload))
return false
case wsrelay.MessageTypeError:
recordAPIResponseError(ctx, e.cfg, event.Err)
reporter.publishFailure(ctx)
helps.RecordAPIResponseError(ctx, e.cfg, event.Err)
reporter.PublishFailure(ctx)
out <- cliproxyexecutor.StreamChunk{Err: fmt.Errorf("wsrelay: %v", event.Err)}
return false
}
@@ -345,7 +368,7 @@ func (e *AIStudioExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.A
authLabel = auth.Label
authType, authValue = auth.AccountInfo()
}
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: endpoint,
Method: http.MethodPost,
Headers: wsReq.Headers.Clone(),
@@ -358,12 +381,12 @@ func (e *AIStudioExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.A
})
resp, err := e.relay.NonStream(ctx, authID, wsReq)
if err != nil {
recordAPIResponseError(ctx, e.cfg, err)
helps.RecordAPIResponseError(ctx, e.cfg, err)
return cliproxyexecutor.Response{}, err
}
recordAPIResponseMetadata(ctx, e.cfg, resp.Status, resp.Headers.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, resp.Status, resp.Headers.Clone())
if len(resp.Body) > 0 {
appendAPIResponseChunk(ctx, e.cfg, resp.Body)
helps.AppendAPIResponseChunk(ctx, e.cfg, resp.Body)
}
if resp.Status < 200 || resp.Status >= 300 {
return cliproxyexecutor.Response{}, statusErr{code: resp.Status, msg: string(resp.Body)}
@@ -404,8 +427,8 @@ func (e *AIStudioExecutor) translateRequest(req cliproxyexecutor.Request, opts c
return nil, translatedPayload{}, err
}
payload = fixGeminiImageAspectRatio(baseModel, payload)
requestedModel := payloadRequestedModel(opts, req.Model)
payload = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", payload, originalTranslated, requestedModel)
requestedModel := helps.PayloadRequestedModel(opts, req.Model)
payload = helps.ApplyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", payload, originalTranslated, requestedModel)
payload, _ = sjson.DeleteBytes(payload, "generationConfig.maxOutputTokens")
payload, _ = sjson.DeleteBytes(payload, "generationConfig.responseMimeType")
payload, _ = sjson.DeleteBytes(payload, "generationConfig.responseJsonSchema")

File diff suppressed because it is too large Load Diff

View File

@@ -35,12 +35,102 @@ func TestAntigravityBuildRequest_SanitizesAntigravityToolSchema(t *testing.T) {
assertSchemaSanitizedAndPropertyPreserved(t, params)
}
func buildRequestBodyFromPayload(t *testing.T, modelName string) map[string]any {
func TestAntigravityBuildRequest_SkipsSchemaSanitizationWithoutToolsField(t *testing.T) {
body := buildRequestBodyFromRawPayload(t, "gemini-3.1-flash-image", []byte(`{
"request": {
"contents": [
{
"role": "user",
"x-debug": "keep-me",
"parts": [
{
"text": "hello"
}
]
}
],
"nonSchema": {
"nullable": true,
"x-extra": "keep-me"
},
"generationConfig": {
"maxOutputTokens": 128
}
}
}`))
assertNonSchemaRequestPreserved(t, body)
}
func TestAntigravityBuildRequest_SkipsSchemaSanitizationWithEmptyToolsArray(t *testing.T) {
body := buildRequestBodyFromRawPayload(t, "gemini-3.1-flash-image", []byte(`{
"request": {
"tools": [],
"contents": [
{
"role": "user",
"x-debug": "keep-me",
"parts": [
{
"text": "hello"
}
]
}
],
"nonSchema": {
"nullable": true,
"x-extra": "keep-me"
},
"generationConfig": {
"maxOutputTokens": 128
}
}
}`))
assertNonSchemaRequestPreserved(t, body)
}
func assertNonSchemaRequestPreserved(t *testing.T, body map[string]any) {
t.Helper()
executor := &AntigravityExecutor{}
auth := &cliproxyauth.Auth{}
payload := []byte(`{
request, ok := body["request"].(map[string]any)
if !ok {
t.Fatalf("request missing or invalid type")
}
contents, ok := request["contents"].([]any)
if !ok || len(contents) == 0 {
t.Fatalf("contents missing or empty")
}
content, ok := contents[0].(map[string]any)
if !ok {
t.Fatalf("content missing or invalid type")
}
if got, ok := content["x-debug"].(string); !ok || got != "keep-me" {
t.Fatalf("x-debug should be preserved when no tool schema exists, got=%v", content["x-debug"])
}
nonSchema, ok := request["nonSchema"].(map[string]any)
if !ok {
t.Fatalf("nonSchema missing or invalid type")
}
if _, ok := nonSchema["nullable"]; !ok {
t.Fatalf("nullable should be preserved outside schema cleanup path")
}
if got, ok := nonSchema["x-extra"].(string); !ok || got != "keep-me" {
t.Fatalf("x-extra should be preserved outside schema cleanup path, got=%v", nonSchema["x-extra"])
}
if generationConfig, ok := request["generationConfig"].(map[string]any); ok {
if _, ok := generationConfig["maxOutputTokens"]; ok {
t.Fatalf("maxOutputTokens should still be removed for non-Claude requests")
}
}
}
func buildRequestBodyFromPayload(t *testing.T, modelName string) map[string]any {
t.Helper()
return buildRequestBodyFromRawPayload(t, modelName, []byte(`{
"request": {
"tools": [
{
@@ -75,7 +165,14 @@ func buildRequestBodyFromPayload(t *testing.T, modelName string) map[string]any
}
]
}
}`)
}`))
}
func buildRequestBodyFromRawPayload(t *testing.T, modelName string, payload []byte) map[string]any {
t.Helper()
executor := &AntigravityExecutor{}
auth := &cliproxyauth.Auth{}
req, err := executor.buildRequest(context.Background(), auth, "token", modelName, payload, false, "", "https://example.com")
if err != nil {

View File

@@ -0,0 +1,490 @@
package executor
import (
"context"
"io"
"net/http"
"net/http/httptest"
"strings"
"sync"
"testing"
"time"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
cliproxyexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
sdktranslator "github.com/router-for-me/CLIProxyAPI/v6/sdk/translator"
)
func resetAntigravityCreditsRetryState() {
antigravityCreditsFailureByAuth = sync.Map{}
antigravityPreferCreditsByModel = sync.Map{}
antigravityShortCooldownByAuth = sync.Map{}
}
func TestClassifyAntigravity429(t *testing.T) {
t.Run("quota exhausted", func(t *testing.T) {
body := []byte(`{"error":{"status":"RESOURCE_EXHAUSTED","message":"QUOTA_EXHAUSTED"}}`)
if got := classifyAntigravity429(body); got != antigravity429QuotaExhausted {
t.Fatalf("classifyAntigravity429() = %q, want %q", got, antigravity429QuotaExhausted)
}
})
t.Run("structured rate limit", func(t *testing.T) {
body := []byte(`{
"error": {
"status": "RESOURCE_EXHAUSTED",
"details": [
{"@type": "type.googleapis.com/google.rpc.ErrorInfo", "reason": "RATE_LIMIT_EXCEEDED"},
{"@type": "type.googleapis.com/google.rpc.RetryInfo", "retryDelay": "0.5s"}
]
}
}`)
if got := classifyAntigravity429(body); got != antigravity429RateLimited {
t.Fatalf("classifyAntigravity429() = %q, want %q", got, antigravity429RateLimited)
}
})
t.Run("structured quota exhausted", func(t *testing.T) {
body := []byte(`{
"error": {
"status": "RESOURCE_EXHAUSTED",
"details": [
{"@type": "type.googleapis.com/google.rpc.ErrorInfo", "reason": "QUOTA_EXHAUSTED"}
]
}
}`)
if got := classifyAntigravity429(body); got != antigravity429QuotaExhausted {
t.Fatalf("classifyAntigravity429() = %q, want %q", got, antigravity429QuotaExhausted)
}
})
t.Run("unstructured 429 defaults to soft rate limit", func(t *testing.T) {
body := []byte(`{"error":{"message":"too many requests"}}`)
if got := classifyAntigravity429(body); got != antigravity429SoftRateLimit {
t.Fatalf("classifyAntigravity429() = %q, want %q", got, antigravity429SoftRateLimit)
}
})
}
func TestInjectEnabledCreditTypes(t *testing.T) {
body := []byte(`{"model":"gemini-2.5-flash","request":{}}`)
got := injectEnabledCreditTypes(body)
if got == nil {
t.Fatal("injectEnabledCreditTypes() returned nil")
}
if !strings.Contains(string(got), `"enabledCreditTypes":["GOOGLE_ONE_AI"]`) {
t.Fatalf("injectEnabledCreditTypes() = %s, want enabledCreditTypes", string(got))
}
if got := injectEnabledCreditTypes([]byte(`not json`)); got != nil {
t.Fatalf("injectEnabledCreditTypes() for invalid json = %s, want nil", string(got))
}
}
func TestShouldMarkAntigravityCreditsExhausted(t *testing.T) {
t.Run("credit errors are marked", func(t *testing.T) {
for _, body := range [][]byte{
[]byte(`{"error":{"message":"Insufficient GOOGLE_ONE_AI credits"}}`),
[]byte(`{"error":{"message":"minimumCreditAmountForUsage requirement not met"}}`),
} {
if !shouldMarkAntigravityCreditsExhausted(http.StatusForbidden, body, nil) {
t.Fatalf("shouldMarkAntigravityCreditsExhausted(%s) = false, want true", string(body))
}
}
})
t.Run("transient 429 resource exhausted is not marked", func(t *testing.T) {
body := []byte(`{"error":{"code":429,"message":"Resource has been exhausted (e.g. check quota).","status":"RESOURCE_EXHAUSTED"}}`)
if shouldMarkAntigravityCreditsExhausted(http.StatusTooManyRequests, body, nil) {
t.Fatalf("shouldMarkAntigravityCreditsExhausted(%s) = true, want false", string(body))
}
})
t.Run("resource exhausted with quota metadata is still marked", func(t *testing.T) {
body := []byte(`{"error":{"code":429,"message":"Resource has been exhausted","status":"RESOURCE_EXHAUSTED","details":[{"@type":"type.googleapis.com/google.rpc.ErrorInfo","metadata":{"quotaResetDelay":"1h","model":"claude-sonnet-4-6"}}]}}`)
if !shouldMarkAntigravityCreditsExhausted(http.StatusTooManyRequests, body, nil) {
t.Fatalf("shouldMarkAntigravityCreditsExhausted(%s) = false, want true", string(body))
}
})
if shouldMarkAntigravityCreditsExhausted(http.StatusServiceUnavailable, []byte(`{"error":{"message":"credits exhausted"}}`), nil) {
t.Fatal("shouldMarkAntigravityCreditsExhausted() = true for 5xx, want false")
}
}
func TestAntigravityExecute_RetriesTransient429ResourceExhausted(t *testing.T) {
resetAntigravityCreditsRetryState()
t.Cleanup(resetAntigravityCreditsRetryState)
var requestCount int
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
requestCount++
switch requestCount {
case 1:
w.WriteHeader(http.StatusTooManyRequests)
_, _ = w.Write([]byte(`{"error":{"code":429,"message":"Resource has been exhausted (e.g. check quota).","status":"RESOURCE_EXHAUSTED"}}`))
case 2:
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"response":{"candidates":[{"content":{"role":"model","parts":[{"text":"ok"}]}}],"usageMetadata":{"promptTokenCount":1,"candidatesTokenCount":1,"totalTokenCount":2}}}`))
default:
t.Fatalf("unexpected request count %d", requestCount)
}
}))
defer server.Close()
exec := NewAntigravityExecutor(&config.Config{RequestRetry: 1})
auth := &cliproxyauth.Auth{
ID: "auth-transient-429",
Attributes: map[string]string{
"base_url": server.URL,
},
Metadata: map[string]any{
"access_token": "token",
"project_id": "project-1",
"expired": time.Now().Add(1 * time.Hour).Format(time.RFC3339),
},
}
resp, err := exec.Execute(context.Background(), auth, cliproxyexecutor.Request{
Model: "gemini-2.5-flash",
Payload: []byte(`{"request":{"contents":[{"role":"user","parts":[{"text":"hi"}]}]}}`),
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FormatAntigravity,
})
if err != nil {
t.Fatalf("Execute() error = %v", err)
}
if len(resp.Payload) == 0 {
t.Fatal("Execute() returned empty payload")
}
if requestCount != 2 {
t.Fatalf("request count = %d, want 2", requestCount)
}
}
func TestAntigravityExecute_RetriesQuotaExhaustedWithCredits(t *testing.T) {
resetAntigravityCreditsRetryState()
t.Cleanup(resetAntigravityCreditsRetryState)
var (
mu sync.Mutex
requestBodies []string
)
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
body, _ := io.ReadAll(r.Body)
_ = r.Body.Close()
mu.Lock()
requestBodies = append(requestBodies, string(body))
reqNum := len(requestBodies)
mu.Unlock()
if reqNum == 1 {
w.WriteHeader(http.StatusTooManyRequests)
_, _ = w.Write([]byte(`{"error":{"status":"RESOURCE_EXHAUSTED","message":"QUOTA_EXHAUSTED"}}`))
return
}
if !strings.Contains(string(body), `"enabledCreditTypes":["GOOGLE_ONE_AI"]`) {
t.Fatalf("second request body missing enabledCreditTypes: %s", string(body))
}
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"response":{"candidates":[{"content":{"role":"model","parts":[{"text":"ok"}]}}],"usageMetadata":{"promptTokenCount":1,"candidatesTokenCount":1,"totalTokenCount":2}}}`))
}))
defer server.Close()
exec := NewAntigravityExecutor(&config.Config{
QuotaExceeded: config.QuotaExceeded{AntigravityCredits: true},
})
auth := &cliproxyauth.Auth{
ID: "auth-credits-ok",
Attributes: map[string]string{
"base_url": server.URL,
},
Metadata: map[string]any{
"access_token": "token",
"project_id": "project-1",
"expired": time.Now().Add(1 * time.Hour).Format(time.RFC3339),
},
}
resp, err := exec.Execute(context.Background(), auth, cliproxyexecutor.Request{
Model: "gemini-2.5-flash",
Payload: []byte(`{"request":{"contents":[{"role":"user","parts":[{"text":"hi"}]}]}}`),
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FormatAntigravity,
})
if err != nil {
t.Fatalf("Execute() error = %v", err)
}
if len(resp.Payload) == 0 {
t.Fatal("Execute() returned empty payload")
}
mu.Lock()
defer mu.Unlock()
if len(requestBodies) != 2 {
t.Fatalf("request count = %d, want 2", len(requestBodies))
}
}
func TestAntigravityExecute_SkipsCreditsRetryWhenAlreadyExhausted(t *testing.T) {
resetAntigravityCreditsRetryState()
t.Cleanup(resetAntigravityCreditsRetryState)
var requestCount int
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
requestCount++
w.WriteHeader(http.StatusTooManyRequests)
_, _ = w.Write([]byte(`{"error":{"status":"RESOURCE_EXHAUSTED","message":"QUOTA_EXHAUSTED"}}`))
}))
defer server.Close()
exec := NewAntigravityExecutor(&config.Config{
QuotaExceeded: config.QuotaExceeded{AntigravityCredits: true},
})
auth := &cliproxyauth.Auth{
ID: "auth-credits-exhausted",
Attributes: map[string]string{
"base_url": server.URL,
},
Metadata: map[string]any{
"access_token": "token",
"project_id": "project-1",
"expired": time.Now().Add(1 * time.Hour).Format(time.RFC3339),
},
}
recordAntigravityCreditsFailure(auth, time.Now())
_, err := exec.Execute(context.Background(), auth, cliproxyexecutor.Request{
Model: "gemini-2.5-flash",
Payload: []byte(`{"request":{"contents":[{"role":"user","parts":[{"text":"hi"}]}]}}`),
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FormatAntigravity,
})
if err == nil {
t.Fatal("Execute() error = nil, want 429")
}
sErr, ok := err.(statusErr)
if !ok {
t.Fatalf("Execute() error type = %T, want statusErr", err)
}
if got := sErr.StatusCode(); got != http.StatusTooManyRequests {
t.Fatalf("Execute() status code = %d, want %d", got, http.StatusTooManyRequests)
}
if requestCount != 1 {
t.Fatalf("request count = %d, want 1", requestCount)
}
}
func TestAntigravityExecute_PrefersCreditsAfterSuccessfulFallback(t *testing.T) {
resetAntigravityCreditsRetryState()
t.Cleanup(resetAntigravityCreditsRetryState)
var (
mu sync.Mutex
requestBodies []string
)
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
body, _ := io.ReadAll(r.Body)
_ = r.Body.Close()
mu.Lock()
requestBodies = append(requestBodies, string(body))
reqNum := len(requestBodies)
mu.Unlock()
switch reqNum {
case 1:
w.WriteHeader(http.StatusTooManyRequests)
_, _ = w.Write([]byte(`{"error":{"status":"RESOURCE_EXHAUSTED","details":[{"@type":"type.googleapis.com/google.rpc.ErrorInfo","reason":"QUOTA_EXHAUSTED"},{"@type":"type.googleapis.com/google.rpc.RetryInfo","retryDelay":"10s"}]}}`))
case 2, 3:
if !strings.Contains(string(body), `"enabledCreditTypes":["GOOGLE_ONE_AI"]`) {
t.Fatalf("request %d body missing enabledCreditTypes: %s", reqNum, string(body))
}
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"response":{"candidates":[{"content":{"role":"model","parts":[{"text":"OK"}]}}],"usageMetadata":{"promptTokenCount":1,"candidatesTokenCount":1,"totalTokenCount":2}}}`))
default:
t.Fatalf("unexpected request count %d", reqNum)
}
}))
defer server.Close()
exec := NewAntigravityExecutor(&config.Config{
QuotaExceeded: config.QuotaExceeded{AntigravityCredits: true},
})
auth := &cliproxyauth.Auth{
ID: "auth-prefer-credits",
Attributes: map[string]string{
"base_url": server.URL,
},
Metadata: map[string]any{
"access_token": "token",
"project_id": "project-1",
"expired": time.Now().Add(1 * time.Hour).Format(time.RFC3339),
},
}
request := cliproxyexecutor.Request{
Model: "gemini-2.5-flash",
Payload: []byte(`{"request":{"contents":[{"role":"user","parts":[{"text":"hi"}]}]}}`),
}
opts := cliproxyexecutor.Options{SourceFormat: sdktranslator.FormatAntigravity}
if _, err := exec.Execute(context.Background(), auth, request, opts); err != nil {
t.Fatalf("first Execute() error = %v", err)
}
if _, err := exec.Execute(context.Background(), auth, request, opts); err != nil {
t.Fatalf("second Execute() error = %v", err)
}
mu.Lock()
defer mu.Unlock()
if len(requestBodies) != 3 {
t.Fatalf("request count = %d, want 3", len(requestBodies))
}
if strings.Contains(requestBodies[0], `"enabledCreditTypes":["GOOGLE_ONE_AI"]`) {
t.Fatalf("first request unexpectedly used credits: %s", requestBodies[0])
}
if !strings.Contains(requestBodies[1], `"enabledCreditTypes":["GOOGLE_ONE_AI"]`) {
t.Fatalf("fallback request missing credits: %s", requestBodies[1])
}
if !strings.Contains(requestBodies[2], `"enabledCreditTypes":["GOOGLE_ONE_AI"]`) {
t.Fatalf("preferred request missing credits: %s", requestBodies[2])
}
}
func TestAntigravityExecute_PreservesBaseURLFallbackAfterCreditsRetryFailure(t *testing.T) {
resetAntigravityCreditsRetryState()
t.Cleanup(resetAntigravityCreditsRetryState)
var (
mu sync.Mutex
firstCount int
secondCount int
)
firstServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
body, _ := io.ReadAll(r.Body)
_ = r.Body.Close()
mu.Lock()
firstCount++
reqNum := firstCount
mu.Unlock()
switch reqNum {
case 1:
w.WriteHeader(http.StatusTooManyRequests)
_, _ = w.Write([]byte(`{"error":{"status":"RESOURCE_EXHAUSTED","details":[{"@type":"type.googleapis.com/google.rpc.ErrorInfo","reason":"QUOTA_EXHAUSTED"}]}}`))
case 2:
if !strings.Contains(string(body), `"enabledCreditTypes":["GOOGLE_ONE_AI"]`) {
t.Fatalf("credits retry missing enabledCreditTypes: %s", string(body))
}
w.WriteHeader(http.StatusForbidden)
_, _ = w.Write([]byte(`{"error":{"message":"permission denied"}}`))
default:
t.Fatalf("unexpected first server request count %d", reqNum)
}
}))
defer firstServer.Close()
secondServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
mu.Lock()
secondCount++
mu.Unlock()
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"response":{"candidates":[{"content":{"role":"model","parts":[{"text":"ok"}]}}],"usageMetadata":{"promptTokenCount":1,"candidatesTokenCount":1,"totalTokenCount":2}}}`))
}))
defer secondServer.Close()
exec := NewAntigravityExecutor(&config.Config{
QuotaExceeded: config.QuotaExceeded{AntigravityCredits: true},
})
auth := &cliproxyauth.Auth{
ID: "auth-baseurl-fallback",
Attributes: map[string]string{
"base_url": firstServer.URL,
},
Metadata: map[string]any{
"access_token": "token",
"project_id": "project-1",
"expired": time.Now().Add(1 * time.Hour).Format(time.RFC3339),
},
}
originalOrder := antigravityBaseURLFallbackOrder
defer func() { antigravityBaseURLFallbackOrder = originalOrder }()
antigravityBaseURLFallbackOrder = func(auth *cliproxyauth.Auth) []string {
return []string{firstServer.URL, secondServer.URL}
}
resp, err := exec.Execute(context.Background(), auth, cliproxyexecutor.Request{
Model: "gemini-2.5-flash",
Payload: []byte(`{"request":{"contents":[{"role":"user","parts":[{"text":"hi"}]}]}}`),
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FormatAntigravity,
})
if err != nil {
t.Fatalf("Execute() error = %v", err)
}
if len(resp.Payload) == 0 {
t.Fatal("Execute() returned empty payload")
}
if firstCount != 2 {
t.Fatalf("first server request count = %d, want 2", firstCount)
}
if secondCount != 1 {
t.Fatalf("second server request count = %d, want 1", secondCount)
}
}
func TestAntigravityExecute_DoesNotDirectInjectCreditsWhenFlagDisabled(t *testing.T) {
resetAntigravityCreditsRetryState()
t.Cleanup(resetAntigravityCreditsRetryState)
var requestBodies []string
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
body, _ := io.ReadAll(r.Body)
_ = r.Body.Close()
requestBodies = append(requestBodies, string(body))
w.WriteHeader(http.StatusTooManyRequests)
_, _ = w.Write([]byte(`{"error":{"status":"RESOURCE_EXHAUSTED","message":"QUOTA_EXHAUSTED"}}`))
}))
defer server.Close()
exec := NewAntigravityExecutor(&config.Config{
QuotaExceeded: config.QuotaExceeded{AntigravityCredits: false},
})
auth := &cliproxyauth.Auth{
ID: "auth-flag-disabled",
Attributes: map[string]string{
"base_url": server.URL,
},
Metadata: map[string]any{
"access_token": "token",
"project_id": "project-1",
"expired": time.Now().Add(1 * time.Hour).Format(time.RFC3339),
},
}
markAntigravityPreferCredits(auth, "gemini-2.5-flash", time.Now(), nil)
_, err := exec.Execute(context.Background(), auth, cliproxyexecutor.Request{
Model: "gemini-2.5-flash",
Payload: []byte(`{"request":{"contents":[{"role":"user","parts":[{"text":"hi"}]}]}}`),
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FormatAntigravity,
})
if err == nil {
t.Fatal("Execute() error = nil, want 429")
}
if len(requestBodies) != 1 {
t.Fatalf("request count = %d, want 1", len(requestBodies))
}
if strings.Contains(requestBodies[0], `"enabledCreditTypes":["GOOGLE_ONE_AI"]`) {
t.Fatalf("request unexpectedly used enabledCreditTypes with flag disabled: %s", requestBodies[0])
}
}

View File

@@ -0,0 +1,165 @@
package executor
import (
"bytes"
"context"
"encoding/base64"
"net/http"
"net/http/httptest"
"sync/atomic"
"testing"
"time"
"github.com/router-for-me/CLIProxyAPI/v6/internal/cache"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
cliproxyexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
sdktranslator "github.com/router-for-me/CLIProxyAPI/v6/sdk/translator"
)
func testGeminiSignaturePayload() string {
payload := append([]byte{0x0A}, bytes.Repeat([]byte{0x56}, 48)...)
return base64.StdEncoding.EncodeToString(payload)
}
// testFakeClaudeSignature returns a base64 string starting with 'E' that passes
// the lightweight hasValidClaudeSignature check but has invalid protobuf content
// (first decoded byte 0x12 is correct, but no valid protobuf field 2 follows),
// so it fails deep validation in strict mode.
func testFakeClaudeSignature() string {
return base64.StdEncoding.EncodeToString([]byte{0x12, 0xFF, 0xFE, 0xFD})
}
func testAntigravityAuth(baseURL string) *cliproxyauth.Auth {
return &cliproxyauth.Auth{
Attributes: map[string]string{
"base_url": baseURL,
},
Metadata: map[string]any{
"access_token": "token-123",
"expired": time.Now().Add(24 * time.Hour).Format(time.RFC3339),
},
}
}
func invalidClaudeThinkingPayload() []byte {
return []byte(`{
"model": "claude-sonnet-4-5-thinking",
"messages": [
{
"role": "assistant",
"content": [
{"type": "thinking", "thinking": "bad", "signature": "` + testFakeClaudeSignature() + `"},
{"type": "text", "text": "hello"}
]
}
]
}`)
}
func TestAntigravityExecutor_StrictBypassRejectsInvalidSignature(t *testing.T) {
previousCache := cache.SignatureCacheEnabled()
previousStrict := cache.SignatureBypassStrictMode()
cache.SetSignatureCacheEnabled(false)
cache.SetSignatureBypassStrictMode(true)
t.Cleanup(func() {
cache.SetSignatureCacheEnabled(previousCache)
cache.SetSignatureBypassStrictMode(previousStrict)
})
var hits atomic.Int32
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
hits.Add(1)
w.WriteHeader(http.StatusOK)
_, _ = w.Write([]byte(`{"response":{"candidates":[{"content":{"parts":[{"text":"ok"}]}}]}}`))
}))
defer server.Close()
executor := NewAntigravityExecutor(nil)
auth := testAntigravityAuth(server.URL)
payload := invalidClaudeThinkingPayload()
opts := cliproxyexecutor.Options{SourceFormat: sdktranslator.FromString("claude"), OriginalRequest: payload}
req := cliproxyexecutor.Request{Model: "claude-sonnet-4-5-thinking", Payload: payload}
tests := []struct {
name string
invoke func() error
}{
{
name: "execute",
invoke: func() error {
_, err := executor.Execute(context.Background(), auth, req, opts)
return err
},
},
{
name: "stream",
invoke: func() error {
_, err := executor.ExecuteStream(context.Background(), auth, req, cliproxyexecutor.Options{SourceFormat: opts.SourceFormat, OriginalRequest: payload, Stream: true})
return err
},
},
{
name: "count tokens",
invoke: func() error {
_, err := executor.CountTokens(context.Background(), auth, req, opts)
return err
},
},
}
for _, tt := range tests {
tt := tt
t.Run(tt.name, func(t *testing.T) {
err := tt.invoke()
if err == nil {
t.Fatal("expected invalid signature to return an error")
}
statusProvider, ok := err.(interface{ StatusCode() int })
if !ok {
t.Fatalf("expected status error, got %T: %v", err, err)
}
if statusProvider.StatusCode() != http.StatusBadRequest {
t.Fatalf("status = %d, want %d", statusProvider.StatusCode(), http.StatusBadRequest)
}
})
}
if got := hits.Load(); got != 0 {
t.Fatalf("expected invalid signature to be rejected before upstream request, got %d upstream hits", got)
}
}
func TestAntigravityExecutor_NonStrictBypassSkipsPrecheck(t *testing.T) {
previousCache := cache.SignatureCacheEnabled()
previousStrict := cache.SignatureBypassStrictMode()
cache.SetSignatureCacheEnabled(false)
cache.SetSignatureBypassStrictMode(false)
t.Cleanup(func() {
cache.SetSignatureCacheEnabled(previousCache)
cache.SetSignatureBypassStrictMode(previousStrict)
})
payload := invalidClaudeThinkingPayload()
from := sdktranslator.FromString("claude")
_, err := validateAntigravityRequestSignatures(from, payload)
if err != nil {
t.Fatalf("non-strict bypass should skip precheck, got: %v", err)
}
}
func TestAntigravityExecutor_CacheModeSkipsPrecheck(t *testing.T) {
previous := cache.SignatureCacheEnabled()
cache.SetSignatureCacheEnabled(true)
t.Cleanup(func() {
cache.SetSignatureCacheEnabled(previous)
})
payload := invalidClaudeThinkingPayload()
from := sdktranslator.FromString("claude")
_, err := validateAntigravityRequestSignatures(from, payload)
if err != nil {
t.Fatalf("cache mode should skip precheck, got: %v", err)
}
}

File diff suppressed because it is too large Load Diff

View File

@@ -4,9 +4,11 @@ import (
"bytes"
"compress/gzip"
"context"
"fmt"
"io"
"net/http"
"net/http/httptest"
"regexp"
"strings"
"sync"
"testing"
@@ -14,7 +16,10 @@ import (
"github.com/gin-gonic/gin"
"github.com/klauspost/compress/zstd"
xxHash64 "github.com/pierrec/xxHash/xxHash64"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
"github.com/router-for-me/CLIProxyAPI/v6/internal/runtime/executor/helps"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
cliproxyexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
sdktranslator "github.com/router-for-me/CLIProxyAPI/v6/sdk/translator"
@@ -23,9 +28,7 @@ import (
)
func resetClaudeDeviceProfileCache() {
claudeDeviceProfileCacheMu.Lock()
claudeDeviceProfileCache = make(map[string]claudeDeviceProfileCacheEntry)
claudeDeviceProfileCacheMu.Unlock()
helps.ResetClaudeDeviceProfileCache()
}
func newClaudeHeaderTestRequest(t *testing.T, incoming http.Header) *http.Request {
@@ -98,7 +101,7 @@ func TestApplyClaudeHeaders_UsesConfiguredBaselineFingerprint(t *testing.T) {
req := newClaudeHeaderTestRequest(t, incoming)
applyClaudeHeaders(req, auth, "key-baseline", false, nil, cfg)
assertClaudeFingerprint(t, req.Header, "claude-cli/2.1.70 (external, cli)", "0.80.0", "v24.5.0", "MacOS", "arm64")
assertClaudeFingerprint(t, req.Header, "evil-client/9.9", "9.9.9", "v24.5.0", "Linux", "x64")
if got := req.Header.Get("X-Stainless-Timeout"); got != "900" {
t.Fatalf("X-Stainless-Timeout = %q, want %q", got, "900")
}
@@ -338,7 +341,7 @@ func TestResolveClaudeDeviceProfile_RechecksCacheBeforeStoringCandidate(t *testi
var pauseOnce sync.Once
var releaseOnce sync.Once
claudeDeviceProfileBeforeCandidateStore = func(candidate claudeDeviceProfile) {
helps.ClaudeDeviceProfileBeforeCandidateStore = func(candidate helps.ClaudeDeviceProfile) {
if candidate.UserAgent != "claude-cli/2.1.62 (external, cli)" {
return
}
@@ -346,13 +349,13 @@ func TestResolveClaudeDeviceProfile_RechecksCacheBeforeStoringCandidate(t *testi
<-releaseLow
}
t.Cleanup(func() {
claudeDeviceProfileBeforeCandidateStore = nil
helps.ClaudeDeviceProfileBeforeCandidateStore = nil
releaseOnce.Do(func() { close(releaseLow) })
})
lowResultCh := make(chan claudeDeviceProfile, 1)
lowResultCh := make(chan helps.ClaudeDeviceProfile, 1)
go func() {
lowResultCh <- resolveClaudeDeviceProfile(auth, "key-racy-upgrade", http.Header{
lowResultCh <- helps.ResolveClaudeDeviceProfile(auth, "key-racy-upgrade", http.Header{
"User-Agent": []string{"claude-cli/2.1.62 (external, cli)"},
"X-Stainless-Package-Version": []string{"0.74.0"},
"X-Stainless-Runtime-Version": []string{"v24.3.0"},
@@ -367,7 +370,7 @@ func TestResolveClaudeDeviceProfile_RechecksCacheBeforeStoringCandidate(t *testi
t.Fatal("timed out waiting for lower candidate to pause before storing")
}
highResult := resolveClaudeDeviceProfile(auth, "key-racy-upgrade", http.Header{
highResult := helps.ResolveClaudeDeviceProfile(auth, "key-racy-upgrade", http.Header{
"User-Agent": []string{"claude-cli/2.1.63 (external, cli)"},
"X-Stainless-Package-Version": []string{"0.75.0"},
"X-Stainless-Runtime-Version": []string{"v24.4.0"},
@@ -398,7 +401,7 @@ func TestResolveClaudeDeviceProfile_RechecksCacheBeforeStoringCandidate(t *testi
t.Fatalf("highResult platform = %s/%s, want %s/%s", highResult.OS, highResult.Arch, "MacOS", "arm64")
}
cached := resolveClaudeDeviceProfile(auth, "key-racy-upgrade", http.Header{
cached := helps.ResolveClaudeDeviceProfile(auth, "key-racy-upgrade", http.Header{
"User-Agent": []string{"curl/8.7.1"},
}, cfg)
if cached.UserAgent != "claude-cli/2.1.63 (external, cli)" {
@@ -564,7 +567,7 @@ func TestApplyClaudeHeaders_LegacyModeFallsBackToRuntimeOSArchWhenMissing(t *tes
})
applyClaudeHeaders(req, auth, "key-legacy-runtime-os-arch", false, nil, cfg)
assertClaudeFingerprint(t, req.Header, "claude-cli/2.1.60 (external, cli)", "0.70.0", "v22.0.0", mapStainlessOS(), mapStainlessArch())
assertClaudeFingerprint(t, req.Header, "claude-cli/2.1.60 (external, cli)", "0.70.0", "v22.0.0", helps.MapStainlessOS(), helps.MapStainlessArch())
}
func TestApplyClaudeHeaders_UnsetStabilizationAlsoUsesLegacyRuntimeOSArchFallback(t *testing.T) {
@@ -591,14 +594,14 @@ func TestApplyClaudeHeaders_UnsetStabilizationAlsoUsesLegacyRuntimeOSArchFallbac
})
applyClaudeHeaders(req, auth, "key-unset-runtime-os-arch", false, nil, cfg)
assertClaudeFingerprint(t, req.Header, "claude-cli/2.1.60 (external, cli)", "0.70.0", "v22.0.0", mapStainlessOS(), mapStainlessArch())
assertClaudeFingerprint(t, req.Header, "claude-cli/2.1.60 (external, cli)", "0.70.0", "v22.0.0", helps.MapStainlessOS(), helps.MapStainlessArch())
}
func TestClaudeDeviceProfileStabilizationEnabled_DefaultFalse(t *testing.T) {
if claudeDeviceProfileStabilizationEnabled(nil) {
if helps.ClaudeDeviceProfileStabilizationEnabled(nil) {
t.Fatal("expected nil config to default to disabled stabilization")
}
if claudeDeviceProfileStabilizationEnabled(&config.Config{}) {
if helps.ClaudeDeviceProfileStabilizationEnabled(&config.Config{}) {
t.Fatal("expected unset stabilize-device-profile to default to disabled stabilization")
}
}
@@ -736,6 +739,35 @@ func TestApplyClaudeToolPrefix_ToolChoiceBuiltin(t *testing.T) {
}
}
func TestApplyClaudeToolPrefix_KnownFallbackBuiltinsRemainUnprefixed(t *testing.T) {
for _, builtin := range []string{"web_search", "code_execution", "text_editor", "computer"} {
t.Run(builtin, func(t *testing.T) {
input := []byte(fmt.Sprintf(`{
"tools":[{"name":"Read"}],
"tool_choice":{"type":"tool","name":%q},
"messages":[{"role":"assistant","content":[{"type":"tool_use","name":%q,"id":"toolu_1","input":{}},{"type":"tool_reference","tool_name":%q},{"type":"tool_result","tool_use_id":"toolu_1","content":[{"type":"tool_reference","tool_name":%q}]}]}]
}`, builtin, builtin, builtin, builtin))
out := applyClaudeToolPrefix(input, "proxy_")
if got := gjson.GetBytes(out, "tool_choice.name").String(); got != builtin {
t.Fatalf("tool_choice.name = %q, want %q", got, builtin)
}
if got := gjson.GetBytes(out, "messages.0.content.0.name").String(); got != builtin {
t.Fatalf("messages.0.content.0.name = %q, want %q", got, builtin)
}
if got := gjson.GetBytes(out, "messages.0.content.1.tool_name").String(); got != builtin {
t.Fatalf("messages.0.content.1.tool_name = %q, want %q", got, builtin)
}
if got := gjson.GetBytes(out, "messages.0.content.2.content.0.tool_name").String(); got != builtin {
t.Fatalf("messages.0.content.2.content.0.tool_name = %q, want %q", got, builtin)
}
if got := gjson.GetBytes(out, "tools.0.name").String(); got != "proxy_Read" {
t.Fatalf("tools.0.name = %q, want %q", got, "proxy_Read")
}
})
}
}
func TestStripClaudeToolPrefixFromResponse(t *testing.T) {
input := []byte(`{"content":[{"type":"tool_use","name":"proxy_alpha","id":"t1","input":{}},{"type":"tool_use","name":"bravo","id":"t2","input":{}}]}`)
out := stripClaudeToolPrefixFromResponse(input, "proxy_")
@@ -796,8 +828,6 @@ func TestApplyClaudeToolPrefix_NestedToolReference(t *testing.T) {
}
func TestClaudeExecutor_ReusesUserIDAcrossModelsWhenCacheEnabled(t *testing.T) {
resetUserIDCache()
var userIDs []string
var requestModels []string
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
@@ -857,15 +887,13 @@ func TestClaudeExecutor_ReusesUserIDAcrossModelsWhenCacheEnabled(t *testing.T) {
if userIDs[0] != userIDs[1] {
t.Fatalf("expected user_id to be reused across models, got %q and %q", userIDs[0], userIDs[1])
}
if !isValidUserID(userIDs[0]) {
if !helps.IsValidUserID(userIDs[0]) {
t.Fatalf("user_id %q is not valid", userIDs[0])
}
t.Logf("✓ End-to-end test passed: Same user_id (%s) was used for both models", userIDs[0])
}
func TestClaudeExecutor_GeneratesNewUserIDByDefault(t *testing.T) {
resetUserIDCache()
var userIDs []string
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
body, _ := io.ReadAll(r.Body)
@@ -903,7 +931,7 @@ func TestClaudeExecutor_GeneratesNewUserIDByDefault(t *testing.T) {
if userIDs[0] == userIDs[1] {
t.Fatalf("expected user_id to change when caching is not enabled, got identical values %q", userIDs[0])
}
if !isValidUserID(userIDs[0]) || !isValidUserID(userIDs[1]) {
if !helps.IsValidUserID(userIDs[0]) || !helps.IsValidUserID(userIDs[1]) {
t.Fatalf("user_ids should be valid, got %q and %q", userIDs[0], userIDs[1])
}
}
@@ -966,6 +994,28 @@ func TestNormalizeCacheControlTTL_PreservesOriginalBytesWhenNoChange(t *testing.
}
}
func TestNormalizeCacheControlTTL_PreservesKeyOrderWhenModified(t *testing.T) {
payload := []byte(`{"model":"m","messages":[{"role":"user","content":[{"type":"text","text":"u1","cache_control":{"type":"ephemeral","ttl":"1h"}}]}],"tools":[{"name":"t1","cache_control":{"type":"ephemeral"}}],"system":[{"type":"text","text":"s1","cache_control":{"type":"ephemeral"}}]}`)
out := normalizeCacheControlTTL(payload)
if gjson.GetBytes(out, "messages.0.content.0.cache_control.ttl").Exists() {
t.Fatalf("messages.0.content.0.cache_control.ttl should be removed after a default-5m block")
}
outStr := string(out)
idxModel := strings.Index(outStr, `"model"`)
idxMessages := strings.Index(outStr, `"messages"`)
idxTools := strings.Index(outStr, `"tools"`)
idxSystem := strings.Index(outStr, `"system"`)
if idxModel == -1 || idxMessages == -1 || idxTools == -1 || idxSystem == -1 {
t.Fatalf("failed to locate top-level keys in output: %s", outStr)
}
if !(idxModel < idxMessages && idxMessages < idxTools && idxTools < idxSystem) {
t.Fatalf("top-level key order changed:\noriginal: %s\ngot: %s", payload, out)
}
}
func TestEnforceCacheControlLimit_StripsNonLastToolBeforeMessages(t *testing.T) {
payload := []byte(`{
"tools": [
@@ -995,6 +1045,31 @@ func TestEnforceCacheControlLimit_StripsNonLastToolBeforeMessages(t *testing.T)
}
}
func TestEnforceCacheControlLimit_PreservesKeyOrderWhenModified(t *testing.T) {
payload := []byte(`{"model":"m","messages":[{"role":"user","content":[{"type":"text","text":"u1","cache_control":{"type":"ephemeral"}},{"type":"text","text":"u2","cache_control":{"type":"ephemeral"}}]}],"tools":[{"name":"t1","cache_control":{"type":"ephemeral"}},{"name":"t2","cache_control":{"type":"ephemeral"}}],"system":[{"type":"text","text":"s1","cache_control":{"type":"ephemeral"}}]}`)
out := enforceCacheControlLimit(payload, 4)
if got := countCacheControls(out); got != 4 {
t.Fatalf("cache_control count = %d, want 4", got)
}
if gjson.GetBytes(out, "tools.0.cache_control").Exists() {
t.Fatalf("tools.0.cache_control should be removed first (non-last tool)")
}
outStr := string(out)
idxModel := strings.Index(outStr, `"model"`)
idxMessages := strings.Index(outStr, `"messages"`)
idxTools := strings.Index(outStr, `"tools"`)
idxSystem := strings.Index(outStr, `"system"`)
if idxModel == -1 || idxMessages == -1 || idxTools == -1 || idxSystem == -1 {
t.Fatalf("failed to locate top-level keys in output: %s", outStr)
}
if !(idxModel < idxMessages && idxMessages < idxTools && idxTools < idxSystem) {
t.Fatalf("top-level key order changed:\noriginal: %s\ngot: %s", payload, out)
}
}
func TestEnforceCacheControlLimit_ToolOnlyPayloadStillRespectsLimit(t *testing.T) {
payload := []byte(`{
"tools": [
@@ -1183,6 +1258,83 @@ func testClaudeExecutorInvalidCompressedErrorBody(
}
}
func TestEnsureModelMaxTokens_UsesRegisteredMaxCompletionTokens(t *testing.T) {
reg := registry.GetGlobalRegistry()
clientID := "test-claude-max-completion-tokens-client"
modelID := "test-claude-max-completion-tokens-model"
reg.RegisterClient(clientID, "claude", []*registry.ModelInfo{{
ID: modelID,
Type: "claude",
OwnedBy: "anthropic",
Object: "model",
Created: time.Now().Unix(),
MaxCompletionTokens: 4096,
UserDefined: true,
}})
defer reg.UnregisterClient(clientID)
input := []byte(`{"model":"test-claude-max-completion-tokens-model","messages":[{"role":"user","content":"hi"}]}`)
out := ensureModelMaxTokens(input, modelID)
if got := gjson.GetBytes(out, "max_tokens").Int(); got != 4096 {
t.Fatalf("max_tokens = %d, want %d", got, 4096)
}
}
func TestEnsureModelMaxTokens_DefaultsMissingValue(t *testing.T) {
reg := registry.GetGlobalRegistry()
clientID := "test-claude-default-max-tokens-client"
modelID := "test-claude-default-max-tokens-model"
reg.RegisterClient(clientID, "claude", []*registry.ModelInfo{{
ID: modelID,
Type: "claude",
OwnedBy: "anthropic",
Object: "model",
Created: time.Now().Unix(),
UserDefined: true,
}})
defer reg.UnregisterClient(clientID)
input := []byte(`{"model":"test-claude-default-max-tokens-model","messages":[{"role":"user","content":"hi"}]}`)
out := ensureModelMaxTokens(input, modelID)
if got := gjson.GetBytes(out, "max_tokens").Int(); got != defaultModelMaxTokens {
t.Fatalf("max_tokens = %d, want %d", got, defaultModelMaxTokens)
}
}
func TestEnsureModelMaxTokens_PreservesExplicitValue(t *testing.T) {
reg := registry.GetGlobalRegistry()
clientID := "test-claude-preserve-max-tokens-client"
modelID := "test-claude-preserve-max-tokens-model"
reg.RegisterClient(clientID, "claude", []*registry.ModelInfo{{
ID: modelID,
Type: "claude",
OwnedBy: "anthropic",
Object: "model",
Created: time.Now().Unix(),
MaxCompletionTokens: 4096,
UserDefined: true,
}})
defer reg.UnregisterClient(clientID)
input := []byte(`{"model":"test-claude-preserve-max-tokens-model","max_tokens":2048,"messages":[{"role":"user","content":"hi"}]}`)
out := ensureModelMaxTokens(input, modelID)
if got := gjson.GetBytes(out, "max_tokens").Int(); got != 2048 {
t.Fatalf("max_tokens = %d, want %d", got, 2048)
}
}
func TestEnsureModelMaxTokens_SkipsUnregisteredModel(t *testing.T) {
input := []byte(`{"model":"test-claude-unregistered-model","messages":[{"role":"user","content":"hi"}]}`)
out := ensureModelMaxTokens(input, "test-claude-unregistered-model")
if gjson.GetBytes(out, "max_tokens").Exists() {
t.Fatalf("max_tokens should remain unset, got %s", gjson.GetBytes(out, "max_tokens").Raw)
}
}
// TestClaudeExecutor_ExecuteStream_SetsIdentityAcceptEncoding verifies that streaming
// requests use Accept-Encoding: identity so the upstream cannot respond with a
// compressed SSE body that would silently break the line scanner.
@@ -1340,6 +1492,35 @@ func TestDecodeResponseBody_MagicByteGzipNoHeader(t *testing.T) {
}
}
// TestDecodeResponseBody_MagicByteZstdNoHeader verifies that decodeResponseBody
// detects zstd-compressed content via magic bytes even when Content-Encoding is absent.
func TestDecodeResponseBody_MagicByteZstdNoHeader(t *testing.T) {
const plaintext = "data: {\"type\":\"message_stop\"}\n"
var buf bytes.Buffer
enc, err := zstd.NewWriter(&buf)
if err != nil {
t.Fatalf("zstd.NewWriter: %v", err)
}
_, _ = enc.Write([]byte(plaintext))
_ = enc.Close()
rc := io.NopCloser(&buf)
decoded, err := decodeResponseBody(rc, "")
if err != nil {
t.Fatalf("decodeResponseBody error: %v", err)
}
defer decoded.Close()
got, err := io.ReadAll(decoded)
if err != nil {
t.Fatalf("ReadAll error: %v", err)
}
if string(got) != plaintext {
t.Errorf("decoded = %q, want %q", got, plaintext)
}
}
// TestDecodeResponseBody_PlainTextNoHeader verifies that decodeResponseBody returns
// plain text untouched when Content-Encoding is absent and no magic bytes match.
func TestDecodeResponseBody_PlainTextNoHeader(t *testing.T) {
@@ -1411,77 +1592,6 @@ func TestClaudeExecutor_ExecuteStream_GzipNoContentEncodingHeader(t *testing.T)
}
}
// TestClaudeExecutor_ExecuteStream_AcceptEncodingOverrideCannotBypassIdentity verifies
// that injecting Accept-Encoding via auth.Attributes cannot override the stream
// path's enforced identity encoding.
func TestClaudeExecutor_ExecuteStream_AcceptEncodingOverrideCannotBypassIdentity(t *testing.T) {
var gotEncoding string
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotEncoding = r.Header.Get("Accept-Encoding")
w.Header().Set("Content-Type", "text/event-stream")
_, _ = w.Write([]byte("data: {\"type\":\"message_stop\"}\n\n"))
}))
defer server.Close()
executor := NewClaudeExecutor(&config.Config{})
// Inject Accept-Encoding via the custom header attribute mechanism.
auth := &cliproxyauth.Auth{Attributes: map[string]string{
"api_key": "key-123",
"base_url": server.URL,
"header:Accept-Encoding": "gzip, deflate, br, zstd",
}}
payload := []byte(`{"messages":[{"role":"user","content":[{"type":"text","text":"hi"}]}]}`)
result, err := executor.ExecuteStream(context.Background(), auth, cliproxyexecutor.Request{
Model: "claude-3-5-sonnet-20241022",
Payload: payload,
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FromString("claude"),
})
if err != nil {
t.Fatalf("ExecuteStream error: %v", err)
}
for chunk := range result.Chunks {
if chunk.Err != nil {
t.Fatalf("unexpected chunk error: %v", chunk.Err)
}
}
if gotEncoding != "identity" {
t.Errorf("Accept-Encoding = %q; stream path must enforce identity regardless of auth.Attributes override", gotEncoding)
}
}
// TestDecodeResponseBody_MagicByteZstdNoHeader verifies that decodeResponseBody
// detects zstd-compressed content via magic bytes (28 b5 2f fd) even when
// Content-Encoding is absent.
func TestDecodeResponseBody_MagicByteZstdNoHeader(t *testing.T) {
const plaintext = "data: {\"type\":\"message_stop\"}\n"
var buf bytes.Buffer
enc, err := zstd.NewWriter(&buf)
if err != nil {
t.Fatalf("zstd.NewWriter: %v", err)
}
_, _ = enc.Write([]byte(plaintext))
_ = enc.Close()
rc := io.NopCloser(&buf)
decoded, err := decodeResponseBody(rc, "")
if err != nil {
t.Fatalf("decodeResponseBody error: %v", err)
}
defer decoded.Close()
got, err := io.ReadAll(decoded)
if err != nil {
t.Fatalf("ReadAll error: %v", err)
}
if string(got) != plaintext {
t.Errorf("decoded = %q, want %q", got, plaintext)
}
}
// TestClaudeExecutor_Execute_GzipErrorBodyNoContentEncodingHeader verifies that the
// error path (4xx) correctly decompresses a gzip body even when the upstream omits
// the Content-Encoding header. This closes the gap left by PR #1771, which only
@@ -1565,6 +1675,45 @@ func TestClaudeExecutor_ExecuteStream_GzipErrorBodyNoContentEncodingHeader(t *te
}
}
// TestClaudeExecutor_ExecuteStream_AcceptEncodingOverrideCannotBypassIdentity verifies that the
// streaming executor enforces Accept-Encoding: identity regardless of auth.Attributes override.
func TestClaudeExecutor_ExecuteStream_AcceptEncodingOverrideCannotBypassIdentity(t *testing.T) {
var gotEncoding string
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotEncoding = r.Header.Get("Accept-Encoding")
w.Header().Set("Content-Type", "text/event-stream")
_, _ = w.Write([]byte("data: {\"type\":\"message_stop\"}\n\n"))
}))
defer server.Close()
executor := NewClaudeExecutor(&config.Config{})
auth := &cliproxyauth.Auth{Attributes: map[string]string{
"api_key": "key-123",
"base_url": server.URL,
"header:Accept-Encoding": "gzip, deflate, br, zstd",
}}
payload := []byte(`{"messages":[{"role":"user","content":[{"type":"text","text":"hi"}]}]}`)
result, err := executor.ExecuteStream(context.Background(), auth, cliproxyexecutor.Request{
Model: "claude-3-5-sonnet-20241022",
Payload: payload,
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FromString("claude"),
})
if err != nil {
t.Fatalf("ExecuteStream error: %v", err)
}
for chunk := range result.Chunks {
if chunk.Err != nil {
t.Fatalf("unexpected chunk error: %v", chunk.Err)
}
}
if gotEncoding != "identity" {
t.Errorf("Accept-Encoding = %q; stream path must enforce identity regardless of auth.Attributes override", gotEncoding)
}
}
// Test case 1: String system prompt is preserved and converted to a content block
func TestCheckSystemInstructionsWithMode_StringSystemPreserved(t *testing.T) {
payload := []byte(`{"system":"You are a helpful assistant.","messages":[{"role":"user","content":"hi"}]}`)
@@ -1648,3 +1797,197 @@ func TestCheckSystemInstructionsWithMode_StringWithSpecialChars(t *testing.T) {
t.Fatalf("blocks[2] text mangled, got %q", blocks[2].Get("text").String())
}
}
func TestClaudeExecutor_ExperimentalCCHSigningDisabledByDefaultKeepsLegacyHeader(t *testing.T) {
var seenBody []byte
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
body, _ := io.ReadAll(r.Body)
seenBody = bytes.Clone(body)
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"id":"msg_1","type":"message","model":"claude-3-5-sonnet","role":"assistant","content":[{"type":"text","text":"ok"}],"usage":{"input_tokens":1,"output_tokens":1}}`))
}))
defer server.Close()
executor := NewClaudeExecutor(&config.Config{})
auth := &cliproxyauth.Auth{Attributes: map[string]string{
"api_key": "key-123",
"base_url": server.URL,
}}
payload := []byte(`{"messages":[{"role":"user","content":[{"type":"text","text":"hi"}]}]}`)
_, err := executor.Execute(context.Background(), auth, cliproxyexecutor.Request{
Model: "claude-3-5-sonnet-20241022",
Payload: payload,
}, cliproxyexecutor.Options{SourceFormat: sdktranslator.FromString("claude")})
if err != nil {
t.Fatalf("Execute() error = %v", err)
}
if len(seenBody) == 0 {
t.Fatal("expected request body to be captured")
}
billingHeader := gjson.GetBytes(seenBody, "system.0.text").String()
if !strings.HasPrefix(billingHeader, "x-anthropic-billing-header:") {
t.Fatalf("system.0.text = %q, want billing header", billingHeader)
}
if strings.Contains(billingHeader, "cch=00000;") {
t.Fatalf("legacy mode should not forward cch placeholder, got %q", billingHeader)
}
}
func TestClaudeExecutor_ExperimentalCCHSigningOptInSignsFinalBody(t *testing.T) {
var seenBody []byte
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
body, _ := io.ReadAll(r.Body)
seenBody = bytes.Clone(body)
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"id":"msg_1","type":"message","model":"claude-3-5-sonnet","role":"assistant","content":[{"type":"text","text":"ok"}],"usage":{"input_tokens":1,"output_tokens":1}}`))
}))
defer server.Close()
executor := NewClaudeExecutor(&config.Config{
ClaudeKey: []config.ClaudeKey{{
APIKey: "key-123",
BaseURL: server.URL,
ExperimentalCCHSigning: true,
}},
})
auth := &cliproxyauth.Auth{Attributes: map[string]string{
"api_key": "key-123",
"base_url": server.URL,
}}
const messageText = "please keep literal cch=00000 in this message"
payload := []byte(`{"messages":[{"role":"user","content":[{"type":"text","text":"please keep literal cch=00000 in this message"}]}]}`)
_, err := executor.Execute(context.Background(), auth, cliproxyexecutor.Request{
Model: "claude-3-5-sonnet-20241022",
Payload: payload,
}, cliproxyexecutor.Options{SourceFormat: sdktranslator.FromString("claude")})
if err != nil {
t.Fatalf("Execute() error = %v", err)
}
if len(seenBody) == 0 {
t.Fatal("expected request body to be captured")
}
if got := gjson.GetBytes(seenBody, "messages.0.content.0.text").String(); got != messageText {
t.Fatalf("message text = %q, want %q", got, messageText)
}
billingPattern := regexp.MustCompile(`(x-anthropic-billing-header:[^"]*?\bcch=)([0-9a-f]{5})(;)`)
match := billingPattern.FindSubmatch(seenBody)
if match == nil {
t.Fatalf("expected signed billing header in body: %s", string(seenBody))
}
actualCCH := string(match[2])
unsignedBody := billingPattern.ReplaceAll(seenBody, []byte(`${1}00000${3}`))
wantCCH := fmt.Sprintf("%05x", xxHash64.Checksum(unsignedBody, 0x6E52736AC806831E)&0xFFFFF)
if actualCCH != wantCCH {
t.Fatalf("cch = %q, want %q\nbody: %s", actualCCH, wantCCH, string(seenBody))
}
}
func TestApplyCloaking_PreservesConfiguredStrictModeAndSensitiveWordsWhenModeOmitted(t *testing.T) {
cfg := &config.Config{
ClaudeKey: []config.ClaudeKey{{
APIKey: "key-123",
Cloak: &config.CloakConfig{
StrictMode: true,
SensitiveWords: []string{"proxy"},
},
}},
}
auth := &cliproxyauth.Auth{Attributes: map[string]string{"api_key": "key-123"}}
payload := []byte(`{"system":"proxy rules","messages":[{"role":"user","content":[{"type":"text","text":"proxy access"}]}]}`)
out := applyCloaking(context.Background(), cfg, auth, payload, "claude-3-5-sonnet-20241022", "key-123")
blocks := gjson.GetBytes(out, "system").Array()
if len(blocks) != 2 {
t.Fatalf("expected strict mode to keep only injected system blocks, got %d", len(blocks))
}
if got := gjson.GetBytes(out, "messages.0.content.0.text").String(); !strings.Contains(got, "\u200B") {
t.Fatalf("expected configured sensitive word obfuscation to apply, got %q", got)
}
}
func TestNormalizeClaudeTemperatureForThinking_AdaptiveCoercesToOne(t *testing.T) {
payload := []byte(`{"temperature":0,"thinking":{"type":"adaptive"},"output_config":{"effort":"max"}}`)
out := normalizeClaudeTemperatureForThinking(payload)
if got := gjson.GetBytes(out, "temperature").Float(); got != 1 {
t.Fatalf("temperature = %v, want 1", got)
}
}
func TestNormalizeClaudeTemperatureForThinking_EnabledCoercesToOne(t *testing.T) {
payload := []byte(`{"temperature":0.2,"thinking":{"type":"enabled","budget_tokens":2048}}`)
out := normalizeClaudeTemperatureForThinking(payload)
if got := gjson.GetBytes(out, "temperature").Float(); got != 1 {
t.Fatalf("temperature = %v, want 1", got)
}
}
func TestNormalizeClaudeTemperatureForThinking_NoThinkingLeavesTemperatureAlone(t *testing.T) {
payload := []byte(`{"temperature":0,"messages":[{"role":"user","content":"hi"}]}`)
out := normalizeClaudeTemperatureForThinking(payload)
if got := gjson.GetBytes(out, "temperature").Float(); got != 0 {
t.Fatalf("temperature = %v, want 0", got)
}
}
func TestNormalizeClaudeTemperatureForThinking_AfterForcedToolChoiceKeepsOriginalTemperature(t *testing.T) {
payload := []byte(`{"temperature":0,"thinking":{"type":"adaptive"},"output_config":{"effort":"max"},"tool_choice":{"type":"any"}}`)
out := disableThinkingIfToolChoiceForced(payload)
out = normalizeClaudeTemperatureForThinking(out)
if gjson.GetBytes(out, "thinking").Exists() {
t.Fatalf("thinking should be removed when tool_choice forces tool use")
}
if got := gjson.GetBytes(out, "temperature").Float(); got != 0 {
t.Fatalf("temperature = %v, want 0", got)
}
}
func TestRemapOAuthToolNames_TitleCase_NoReverseNeeded(t *testing.T) {
body := []byte(`{"tools":[{"name":"Bash","description":"Run shell commands","input_schema":{"type":"object","properties":{"cmd":{"type":"string"}}}}],"messages":[{"role":"user","content":[{"type":"text","text":"hi"}]}]}`)
out, renamed := remapOAuthToolNames(body)
if renamed {
t.Fatalf("renamed = true, want false")
}
if got := gjson.GetBytes(out, "tools.0.name").String(); got != "Bash" {
t.Fatalf("tools.0.name = %q, want %q", got, "Bash")
}
resp := []byte(`{"content":[{"type":"tool_use","id":"toolu_01","name":"Bash","input":{"cmd":"ls"}}]}`)
reversed := resp
if renamed {
reversed = reverseRemapOAuthToolNames(resp)
}
if got := gjson.GetBytes(reversed, "content.0.name").String(); got != "Bash" {
t.Fatalf("content.0.name = %q, want %q", got, "Bash")
}
}
func TestRemapOAuthToolNames_Lowercase_ReverseApplied(t *testing.T) {
body := []byte(`{"tools":[{"name":"bash","description":"Run shell commands","input_schema":{"type":"object","properties":{"cmd":{"type":"string"}}}}],"messages":[{"role":"user","content":[{"type":"text","text":"hi"}]}]}`)
out, renamed := remapOAuthToolNames(body)
if !renamed {
t.Fatalf("renamed = false, want true")
}
if got := gjson.GetBytes(out, "tools.0.name").String(); got != "Bash" {
t.Fatalf("tools.0.name = %q, want %q", got, "Bash")
}
resp := []byte(`{"content":[{"type":"tool_use","id":"toolu_01","name":"Bash","input":{"cmd":"ls"}}]}`)
reversed := resp
if renamed {
reversed = reverseRemapOAuthToolNames(resp)
}
if got := gjson.GetBytes(reversed, "content.0.name").String(); got != "bash" {
t.Fatalf("content.0.name = %q, want %q", got, "bash")
}
}

View File

@@ -0,0 +1,81 @@
package executor
import (
"fmt"
"regexp"
"strings"
xxHash64 "github.com/pierrec/xxHash/xxHash64"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
"github.com/tidwall/gjson"
"github.com/tidwall/sjson"
)
const claudeCCHSeed uint64 = 0x6E52736AC806831E
var claudeBillingHeaderCCHPattern = regexp.MustCompile(`\bcch=([0-9a-f]{5});`)
func signAnthropicMessagesBody(body []byte) []byte {
billingHeader := gjson.GetBytes(body, "system.0.text").String()
if !strings.HasPrefix(billingHeader, "x-anthropic-billing-header:") {
return body
}
if !claudeBillingHeaderCCHPattern.MatchString(billingHeader) {
return body
}
unsignedBillingHeader := claudeBillingHeaderCCHPattern.ReplaceAllString(billingHeader, "cch=00000;")
unsignedBody, err := sjson.SetBytes(body, "system.0.text", unsignedBillingHeader)
if err != nil {
return body
}
cch := fmt.Sprintf("%05x", xxHash64.Checksum(unsignedBody, claudeCCHSeed)&0xFFFFF)
signedBillingHeader := claudeBillingHeaderCCHPattern.ReplaceAllString(unsignedBillingHeader, "cch="+cch+";")
signedBody, err := sjson.SetBytes(unsignedBody, "system.0.text", signedBillingHeader)
if err != nil {
return unsignedBody
}
return signedBody
}
func resolveClaudeKeyConfig(cfg *config.Config, auth *cliproxyauth.Auth) *config.ClaudeKey {
if cfg == nil || auth == nil {
return nil
}
apiKey, baseURL := claudeCreds(auth)
if apiKey == "" {
return nil
}
for i := range cfg.ClaudeKey {
entry := &cfg.ClaudeKey[i]
cfgKey := strings.TrimSpace(entry.APIKey)
cfgBase := strings.TrimSpace(entry.BaseURL)
if !strings.EqualFold(cfgKey, apiKey) {
continue
}
if baseURL != "" && cfgBase != "" && !strings.EqualFold(cfgBase, baseURL) {
continue
}
return entry
}
return nil
}
// resolveClaudeKeyCloakConfig finds the matching ClaudeKey config and returns its CloakConfig.
func resolveClaudeKeyCloakConfig(cfg *config.Config, auth *cliproxyauth.Auth) *config.CloakConfig {
entry := resolveClaudeKeyConfig(cfg, auth)
if entry == nil {
return nil
}
return entry.Cloak
}
func experimentalCCHSigningEnabled(cfg *config.Config, auth *cliproxyauth.Auth) bool {
entry := resolveClaudeKeyConfig(cfg, auth)
return entry != nil && entry.ExperimentalCCHSigning
}

View File

@@ -4,9 +4,11 @@ import (
"bufio"
"bytes"
"context"
"encoding/json"
"fmt"
"io"
"net/http"
"strings"
"time"
"github.com/router-for-me/CLIProxyAPI/v6/internal/auth/codebuddy"
@@ -14,8 +16,11 @@ import (
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
cliproxyexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
"github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/usage"
sdktranslator "github.com/router-for-me/CLIProxyAPI/v6/sdk/translator"
log "github.com/sirupsen/logrus"
"github.com/tidwall/gjson"
"github.com/tidwall/sjson"
)
const (
@@ -98,10 +103,12 @@ func (e *CodeBuddyExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
if len(opts.OriginalRequest) > 0 {
originalPayloadSource = opts.OriginalRequest
}
originalTranslated := sdktranslator.TranslateRequest(from, to, baseModel, originalPayloadSource, false)
translated := sdktranslator.TranslateRequest(from, to, baseModel, req.Payload, false)
originalTranslated := sdktranslator.TranslateRequest(from, to, baseModel, originalPayloadSource, true)
translated := sdktranslator.TranslateRequest(from, to, baseModel, req.Payload, true)
requestedModel := payloadRequestedModel(opts, req.Model)
translated = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", translated, originalTranslated, requestedModel)
translated, _ = sjson.SetBytes(translated, "stream", true)
translated, _ = sjson.SetBytes(translated, "stream_options.include_usage", true)
translated, err = thinking.ApplyThinking(translated, req.Model, from.String(), to.String(), e.Identifier())
if err != nil {
@@ -114,6 +121,8 @@ func (e *CodeBuddyExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
return resp, err
}
e.applyHeaders(httpReq, accessToken, userID, domain)
httpReq.Header.Set("Accept", "text/event-stream")
httpReq.Header.Set("Cache-Control", "no-cache")
var authID, authLabel, authType, authValue string
if auth != nil {
@@ -160,11 +169,16 @@ func (e *CodeBuddyExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
return resp, err
}
appendAPIResponseChunk(ctx, e.cfg, body)
reporter.publish(ctx, parseOpenAIUsage(body))
aggregatedBody, usageDetail, err := aggregateOpenAIChatCompletionStream(body)
if err != nil {
recordAPIResponseError(ctx, e.cfg, err)
return resp, err
}
reporter.publish(ctx, usageDetail)
reporter.ensurePublished(ctx)
var param any
out := sdktranslator.TranslateNonStream(ctx, to, from, req.Model, opts.OriginalRequest, translated, body, &param)
out := sdktranslator.TranslateNonStream(ctx, to, from, req.Model, opts.OriginalRequest, translated, aggregatedBody, &param)
resp = cliproxyexecutor.Response{Payload: []byte(out), Headers: httpResp.Header.Clone()}
return resp, nil
}
@@ -341,3 +355,197 @@ func (e *CodeBuddyExecutor) applyHeaders(req *http.Request, accessToken, userID,
req.Header.Set("X-IDE-Version", "2.63.2")
req.Header.Set("X-Requested-With", "XMLHttpRequest")
}
type openAIChatStreamChoiceAccumulator struct {
Role string
ContentParts []string
ReasoningParts []string
FinishReason string
ToolCalls map[int]*openAIChatStreamToolCallAccumulator
ToolCallOrder []int
NativeFinishReason any
}
type openAIChatStreamToolCallAccumulator struct {
ID string
Type string
Name string
Arguments strings.Builder
}
func aggregateOpenAIChatCompletionStream(raw []byte) ([]byte, usage.Detail, error) {
lines := bytes.Split(raw, []byte("\n"))
var (
responseID string
model string
created int64
serviceTier string
systemFP string
usageDetail usage.Detail
choices = map[int]*openAIChatStreamChoiceAccumulator{}
choiceOrder []int
)
for _, line := range lines {
line = bytes.TrimSpace(line)
if len(line) == 0 || !bytes.HasPrefix(line, []byte("data:")) {
continue
}
payload := bytes.TrimSpace(line[5:])
if len(payload) == 0 || bytes.Equal(payload, []byte("[DONE]")) {
continue
}
if !gjson.ValidBytes(payload) {
continue
}
root := gjson.ParseBytes(payload)
if responseID == "" {
responseID = root.Get("id").String()
}
if model == "" {
model = root.Get("model").String()
}
if created == 0 {
created = root.Get("created").Int()
}
if serviceTier == "" {
serviceTier = root.Get("service_tier").String()
}
if systemFP == "" {
systemFP = root.Get("system_fingerprint").String()
}
if detail, ok := parseOpenAIStreamUsage(line); ok {
usageDetail = detail
}
for _, choiceResult := range root.Get("choices").Array() {
idx := int(choiceResult.Get("index").Int())
choice := choices[idx]
if choice == nil {
choice = &openAIChatStreamChoiceAccumulator{ToolCalls: map[int]*openAIChatStreamToolCallAccumulator{}}
choices[idx] = choice
choiceOrder = append(choiceOrder, idx)
}
delta := choiceResult.Get("delta")
if role := delta.Get("role").String(); role != "" {
choice.Role = role
}
if content := delta.Get("content").String(); content != "" {
choice.ContentParts = append(choice.ContentParts, content)
}
if reasoning := delta.Get("reasoning_content").String(); reasoning != "" {
choice.ReasoningParts = append(choice.ReasoningParts, reasoning)
}
if finishReason := choiceResult.Get("finish_reason").String(); finishReason != "" {
choice.FinishReason = finishReason
}
if nativeFinishReason := choiceResult.Get("native_finish_reason"); nativeFinishReason.Exists() {
choice.NativeFinishReason = nativeFinishReason.Value()
}
for _, toolCallResult := range delta.Get("tool_calls").Array() {
toolIdx := int(toolCallResult.Get("index").Int())
toolCall := choice.ToolCalls[toolIdx]
if toolCall == nil {
toolCall = &openAIChatStreamToolCallAccumulator{}
choice.ToolCalls[toolIdx] = toolCall
choice.ToolCallOrder = append(choice.ToolCallOrder, toolIdx)
}
if id := toolCallResult.Get("id").String(); id != "" {
toolCall.ID = id
}
if typ := toolCallResult.Get("type").String(); typ != "" {
toolCall.Type = typ
}
if name := toolCallResult.Get("function.name").String(); name != "" {
toolCall.Name = name
}
if args := toolCallResult.Get("function.arguments").String(); args != "" {
toolCall.Arguments.WriteString(args)
}
}
}
}
if responseID == "" && model == "" && len(choiceOrder) == 0 {
return nil, usageDetail, fmt.Errorf("codebuddy: streaming response did not contain any chat completion chunks")
}
response := map[string]any{
"id": responseID,
"object": "chat.completion",
"created": created,
"model": model,
"choices": make([]map[string]any, 0, len(choiceOrder)),
"usage": map[string]any{
"prompt_tokens": usageDetail.InputTokens,
"completion_tokens": usageDetail.OutputTokens,
"total_tokens": usageDetail.TotalTokens,
},
}
if serviceTier != "" {
response["service_tier"] = serviceTier
}
if systemFP != "" {
response["system_fingerprint"] = systemFP
}
for _, idx := range choiceOrder {
choice := choices[idx]
message := map[string]any{
"role": choice.Role,
"content": strings.Join(choice.ContentParts, ""),
}
if message["role"] == "" {
message["role"] = "assistant"
}
if len(choice.ReasoningParts) > 0 {
message["reasoning_content"] = strings.Join(choice.ReasoningParts, "")
}
if len(choice.ToolCallOrder) > 0 {
toolCalls := make([]map[string]any, 0, len(choice.ToolCallOrder))
for _, toolIdx := range choice.ToolCallOrder {
toolCall := choice.ToolCalls[toolIdx]
toolCallType := toolCall.Type
if toolCallType == "" {
toolCallType = "function"
}
arguments := toolCall.Arguments.String()
if arguments == "" {
arguments = "{}"
}
toolCalls = append(toolCalls, map[string]any{
"id": toolCall.ID,
"type": toolCallType,
"function": map[string]any{
"name": toolCall.Name,
"arguments": arguments,
},
})
}
message["tool_calls"] = toolCalls
}
finishReason := choice.FinishReason
if finishReason == "" {
finishReason = "stop"
}
choicePayload := map[string]any{
"index": idx,
"message": message,
"finish_reason": finishReason,
}
if choice.NativeFinishReason != nil {
choicePayload["native_finish_reason"] = choice.NativeFinishReason
}
response["choices"] = append(response["choices"].([]map[string]any), choicePayload)
}
out, err := json.Marshal(response)
if err != nil {
return nil, usageDetail, fmt.Errorf("codebuddy: failed to encode aggregated response: %w", err)
}
return out, usageDetail, nil
}

View File

@@ -7,12 +7,14 @@ import (
"fmt"
"io"
"net/http"
"sort"
"strings"
"time"
codexauth "github.com/router-for-me/CLIProxyAPI/v6/internal/auth/codex"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/misc"
"github.com/router-for-me/CLIProxyAPI/v6/internal/runtime/executor/helps"
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
@@ -28,8 +30,8 @@ import (
)
const (
codexUserAgent = "codex_cli_rs/0.116.0 (Mac OS 26.0.1; arm64) Apple_Terminal/464"
codexOriginator = "codex_cli_rs"
codexUserAgent = "codex-tui/0.118.0 (Mac OS 26.3.1; arm64) iTerm.app/3.6.9 (codex-tui; 0.118.0)"
codexOriginator = "codex-tui"
)
var dataTag = []byte("data:")
@@ -73,7 +75,7 @@ func (e *CodexExecutor) HttpRequest(ctx context.Context, auth *cliproxyauth.Auth
if err := e.PrepareRequest(httpReq, auth); err != nil {
return nil, err
}
httpClient := newProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpClient := helps.NewProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
return httpClient.Do(httpReq)
}
@@ -88,8 +90,8 @@ func (e *CodexExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, re
baseURL = "https://chatgpt.com/backend-api/codex"
}
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
reporter := helps.NewUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.TrackFailure(ctx, &err)
from := opts.SourceFormat
to := sdktranslator.FromString("codex")
@@ -106,16 +108,15 @@ func (e *CodexExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, re
return resp, err
}
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
requestedModel := helps.PayloadRequestedModel(opts, req.Model)
body = helps.ApplyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
body, _ = sjson.SetBytes(body, "stream", true)
body, _ = sjson.DeleteBytes(body, "previous_response_id")
body, _ = sjson.DeleteBytes(body, "prompt_cache_retention")
body, _ = sjson.DeleteBytes(body, "safety_identifier")
if !gjson.GetBytes(body, "instructions").Exists() {
body, _ = sjson.SetBytes(body, "instructions", "")
}
body, _ = sjson.DeleteBytes(body, "stream_options")
body = normalizeCodexInstructions(body)
url := strings.TrimSuffix(baseURL, "/") + "/responses"
httpReq, err := e.cacheHelper(ctx, from, url, req, body)
@@ -129,7 +130,7 @@ func (e *CodexExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, re
authLabel = auth.Label
authType, authValue = auth.AccountInfo()
}
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: url,
Method: http.MethodPost,
Headers: httpReq.Header.Clone(),
@@ -140,10 +141,10 @@ func (e *CodexExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, re
AuthType: authType,
AuthValue: authValue,
})
httpClient := newProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpClient := helps.NewProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpResp, err := httpClient.Do(httpReq)
if err != nil {
recordAPIResponseError(ctx, e.cfg, err)
helps.RecordAPIResponseError(ctx, e.cfg, err)
return resp, err
}
defer func() {
@@ -151,38 +152,79 @@ func (e *CodexExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, re
log.Errorf("codex executor: close response body error: %v", errClose)
}
}()
recordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
helps.AppendAPIResponseChunk(ctx, e.cfg, b)
helps.LogWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, helps.SummarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
err = newCodexStatusErr(httpResp.StatusCode, b)
return resp, err
}
data, err := io.ReadAll(httpResp.Body)
if err != nil {
recordAPIResponseError(ctx, e.cfg, err)
helps.RecordAPIResponseError(ctx, e.cfg, err)
return resp, err
}
appendAPIResponseChunk(ctx, e.cfg, data)
helps.AppendAPIResponseChunk(ctx, e.cfg, data)
lines := bytes.Split(data, []byte("\n"))
outputItemsByIndex := make(map[int64][]byte)
var outputItemsFallback [][]byte
for _, line := range lines {
if !bytes.HasPrefix(line, dataTag) {
continue
}
line = bytes.TrimSpace(line[5:])
if gjson.GetBytes(line, "type").String() != "response.completed" {
eventData := bytes.TrimSpace(line[5:])
eventType := gjson.GetBytes(eventData, "type").String()
if eventType == "response.output_item.done" {
itemResult := gjson.GetBytes(eventData, "item")
if !itemResult.Exists() || itemResult.Type != gjson.JSON {
continue
}
outputIndexResult := gjson.GetBytes(eventData, "output_index")
if outputIndexResult.Exists() {
outputItemsByIndex[outputIndexResult.Int()] = []byte(itemResult.Raw)
} else {
outputItemsFallback = append(outputItemsFallback, []byte(itemResult.Raw))
}
continue
}
if detail, ok := parseCodexUsage(line); ok {
reporter.publish(ctx, detail)
if eventType != "response.completed" {
continue
}
if detail, ok := helps.ParseCodexUsage(eventData); ok {
reporter.Publish(ctx, detail)
}
completedData := eventData
outputResult := gjson.GetBytes(completedData, "response.output")
shouldPatchOutput := (!outputResult.Exists() || !outputResult.IsArray() || len(outputResult.Array()) == 0) && (len(outputItemsByIndex) > 0 || len(outputItemsFallback) > 0)
if shouldPatchOutput {
completedDataPatched := completedData
completedDataPatched, _ = sjson.SetRawBytes(completedDataPatched, "response.output", []byte(`[]`))
indexes := make([]int64, 0, len(outputItemsByIndex))
for idx := range outputItemsByIndex {
indexes = append(indexes, idx)
}
sort.Slice(indexes, func(i, j int) bool {
return indexes[i] < indexes[j]
})
for _, idx := range indexes {
completedDataPatched, _ = sjson.SetRawBytes(completedDataPatched, "response.output.-1", outputItemsByIndex[idx])
}
for _, item := range outputItemsFallback {
completedDataPatched, _ = sjson.SetRawBytes(completedDataPatched, "response.output.-1", item)
}
completedData = completedDataPatched
}
var param any
out := sdktranslator.TranslateNonStream(ctx, to, from, req.Model, originalPayload, body, line, &param)
out := sdktranslator.TranslateNonStream(ctx, to, from, req.Model, originalPayload, body, completedData, &param)
resp = cliproxyexecutor.Response{Payload: out, Headers: httpResp.Header.Clone()}
return resp, nil
}
@@ -198,8 +240,8 @@ func (e *CodexExecutor) executeCompact(ctx context.Context, auth *cliproxyauth.A
baseURL = "https://chatgpt.com/backend-api/codex"
}
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
reporter := helps.NewUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.TrackFailure(ctx, &err)
from := opts.SourceFormat
to := sdktranslator.FromString("openai-response")
@@ -216,10 +258,11 @@ func (e *CodexExecutor) executeCompact(ctx context.Context, auth *cliproxyauth.A
return resp, err
}
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
requestedModel := helps.PayloadRequestedModel(opts, req.Model)
body = helps.ApplyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
body, _ = sjson.DeleteBytes(body, "stream")
body = normalizeCodexInstructions(body)
url := strings.TrimSuffix(baseURL, "/") + "/responses/compact"
httpReq, err := e.cacheHelper(ctx, from, url, req, body)
@@ -233,7 +276,7 @@ func (e *CodexExecutor) executeCompact(ctx context.Context, auth *cliproxyauth.A
authLabel = auth.Label
authType, authValue = auth.AccountInfo()
}
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: url,
Method: http.MethodPost,
Headers: httpReq.Header.Clone(),
@@ -244,10 +287,10 @@ func (e *CodexExecutor) executeCompact(ctx context.Context, auth *cliproxyauth.A
AuthType: authType,
AuthValue: authValue,
})
httpClient := newProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpClient := helps.NewProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpResp, err := httpClient.Do(httpReq)
if err != nil {
recordAPIResponseError(ctx, e.cfg, err)
helps.RecordAPIResponseError(ctx, e.cfg, err)
return resp, err
}
defer func() {
@@ -255,22 +298,22 @@ func (e *CodexExecutor) executeCompact(ctx context.Context, auth *cliproxyauth.A
log.Errorf("codex executor: close response body error: %v", errClose)
}
}()
recordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
helps.AppendAPIResponseChunk(ctx, e.cfg, b)
helps.LogWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, helps.SummarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
err = newCodexStatusErr(httpResp.StatusCode, b)
return resp, err
}
data, err := io.ReadAll(httpResp.Body)
if err != nil {
recordAPIResponseError(ctx, e.cfg, err)
helps.RecordAPIResponseError(ctx, e.cfg, err)
return resp, err
}
appendAPIResponseChunk(ctx, e.cfg, data)
reporter.publish(ctx, parseOpenAIUsage(data))
reporter.ensurePublished(ctx)
helps.AppendAPIResponseChunk(ctx, e.cfg, data)
reporter.Publish(ctx, helps.ParseOpenAIUsage(data))
reporter.EnsurePublished(ctx)
var param any
out := sdktranslator.TranslateNonStream(ctx, to, from, req.Model, originalPayload, body, data, &param)
resp = cliproxyexecutor.Response{Payload: out, Headers: httpResp.Header.Clone()}
@@ -288,8 +331,8 @@ func (e *CodexExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
baseURL = "https://chatgpt.com/backend-api/codex"
}
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
reporter := helps.NewUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.TrackFailure(ctx, &err)
from := opts.SourceFormat
to := sdktranslator.FromString("codex")
@@ -306,15 +349,14 @@ func (e *CodexExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
return nil, err
}
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
requestedModel := helps.PayloadRequestedModel(opts, req.Model)
body = helps.ApplyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.DeleteBytes(body, "previous_response_id")
body, _ = sjson.DeleteBytes(body, "prompt_cache_retention")
body, _ = sjson.DeleteBytes(body, "safety_identifier")
body, _ = sjson.DeleteBytes(body, "stream_options")
body, _ = sjson.SetBytes(body, "model", baseModel)
if !gjson.GetBytes(body, "instructions").Exists() {
body, _ = sjson.SetBytes(body, "instructions", "")
}
body = normalizeCodexInstructions(body)
url := strings.TrimSuffix(baseURL, "/") + "/responses"
httpReq, err := e.cacheHelper(ctx, from, url, req, body)
@@ -328,7 +370,7 @@ func (e *CodexExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
authLabel = auth.Label
authType, authValue = auth.AccountInfo()
}
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: url,
Method: http.MethodPost,
Headers: httpReq.Header.Clone(),
@@ -340,24 +382,24 @@ func (e *CodexExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
AuthValue: authValue,
})
httpClient := newProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpClient := helps.NewProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpResp, err := httpClient.Do(httpReq)
if err != nil {
recordAPIResponseError(ctx, e.cfg, err)
helps.RecordAPIResponseError(ctx, e.cfg, err)
return nil, err
}
recordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
data, readErr := io.ReadAll(httpResp.Body)
if errClose := httpResp.Body.Close(); errClose != nil {
log.Errorf("codex executor: close response body error: %v", errClose)
}
if readErr != nil {
recordAPIResponseError(ctx, e.cfg, readErr)
helps.RecordAPIResponseError(ctx, e.cfg, readErr)
return nil, readErr
}
appendAPIResponseChunk(ctx, e.cfg, data)
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), data))
helps.AppendAPIResponseChunk(ctx, e.cfg, data)
helps.LogWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, helps.SummarizeErrorBody(httpResp.Header.Get("Content-Type"), data))
err = newCodexStatusErr(httpResp.StatusCode, data)
return nil, err
}
@@ -374,13 +416,13 @@ func (e *CodexExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
var param any
for scanner.Scan() {
line := scanner.Bytes()
appendAPIResponseChunk(ctx, e.cfg, line)
helps.AppendAPIResponseChunk(ctx, e.cfg, line)
if bytes.HasPrefix(line, dataTag) {
data := bytes.TrimSpace(line[5:])
if gjson.GetBytes(data, "type").String() == "response.completed" {
if detail, ok := parseCodexUsage(data); ok {
reporter.publish(ctx, detail)
if detail, ok := helps.ParseCodexUsage(data); ok {
reporter.Publish(ctx, detail)
}
}
}
@@ -391,8 +433,8 @@ func (e *CodexExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
}
}
if errScan := scanner.Err(); errScan != nil {
recordAPIResponseError(ctx, e.cfg, errScan)
reporter.publishFailure(ctx)
helps.RecordAPIResponseError(ctx, e.cfg, errScan)
reporter.PublishFailure(ctx)
out <- cliproxyexecutor.StreamChunk{Err: errScan}
}
}()
@@ -415,10 +457,9 @@ func (e *CodexExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.Auth
body, _ = sjson.DeleteBytes(body, "previous_response_id")
body, _ = sjson.DeleteBytes(body, "prompt_cache_retention")
body, _ = sjson.DeleteBytes(body, "safety_identifier")
body, _ = sjson.DeleteBytes(body, "stream_options")
body, _ = sjson.SetBytes(body, "stream", false)
if !gjson.GetBytes(body, "instructions").Exists() {
body, _ = sjson.SetBytes(body, "instructions", "")
}
body = normalizeCodexInstructions(body)
enc, err := tokenizerForCodexModel(baseModel)
if err != nil {
@@ -571,7 +612,7 @@ func (e *CodexExecutor) Refresh(ctx context.Context, auth *cliproxyauth.Auth) (*
if refreshToken == "" {
return auth, nil
}
svc := codexauth.NewCodexAuth(e.cfg)
svc := codexauth.NewCodexAuthWithProxyURL(e.cfg, auth.ProxyURL)
td, err := svc.RefreshTokensWithRetry(ctx, refreshToken, 3)
if err != nil {
return nil, err
@@ -597,18 +638,18 @@ func (e *CodexExecutor) Refresh(ctx context.Context, auth *cliproxyauth.Auth) (*
}
func (e *CodexExecutor) cacheHelper(ctx context.Context, from sdktranslator.Format, url string, req cliproxyexecutor.Request, rawJSON []byte) (*http.Request, error) {
var cache codexCache
var cache helps.CodexCache
if from == "claude" {
userIDResult := gjson.GetBytes(req.Payload, "metadata.user_id")
if userIDResult.Exists() {
key := fmt.Sprintf("%s-%s", req.Model, userIDResult.String())
var ok bool
if cache, ok = getCodexCache(key); !ok {
cache = codexCache{
if cache, ok = helps.GetCodexCache(key); !ok {
cache = helps.CodexCache{
ID: uuid.New().String(),
Expire: time.Now().Add(1 * time.Hour),
}
setCodexCache(key, cache)
helps.SetCodexCache(key, cache)
}
}
} else if from == "openai-response" {
@@ -617,7 +658,7 @@ func (e *CodexExecutor) cacheHelper(ctx context.Context, from sdktranslator.Form
cache.ID = promptCacheKey.String()
}
} else if from == "openai" {
if apiKey := strings.TrimSpace(apiKeyFromContext(ctx)); apiKey != "" {
if apiKey := strings.TrimSpace(helps.APIKeyFromContext(ctx)); apiKey != "" {
cache.ID = uuid.NewSHA1(uuid.NameSpaceOID, []byte("cli-proxy-api:codex:prompt-cache:"+apiKey)).String()
}
}
@@ -630,7 +671,6 @@ func (e *CodexExecutor) cacheHelper(ctx context.Context, from sdktranslator.Form
return nil, err
}
if cache.ID != "" {
httpReq.Header.Set("Conversation_id", cache.ID)
httpReq.Header.Set("Session_id", cache.ID)
}
return httpReq, nil
@@ -645,13 +685,19 @@ func applyCodexHeaders(r *http.Request, auth *cliproxyauth.Auth, token string, s
ginHeaders = ginCtx.Request.Header
}
if ginHeaders.Get("X-Codex-Beta-Features") != "" {
r.Header.Set("X-Codex-Beta-Features", ginHeaders.Get("X-Codex-Beta-Features"))
}
misc.EnsureHeader(r.Header, ginHeaders, "Version", "")
misc.EnsureHeader(r.Header, ginHeaders, "Session_id", uuid.NewString())
misc.EnsureHeader(r.Header, ginHeaders, "X-Codex-Turn-Metadata", "")
misc.EnsureHeader(r.Header, ginHeaders, "X-Client-Request-Id", "")
cfgUserAgent, _ := codexHeaderDefaults(cfg, auth)
ensureHeaderWithConfigPrecedence(r.Header, ginHeaders, "User-Agent", cfgUserAgent, codexUserAgent)
if strings.Contains(r.Header.Get("User-Agent"), "Mac OS") {
misc.EnsureHeader(r.Header, ginHeaders, "Session_id", uuid.NewString())
}
if stream {
r.Header.Set("Accept", "text/event-stream")
} else {
@@ -685,13 +731,47 @@ func applyCodexHeaders(r *http.Request, auth *cliproxyauth.Auth, token string, s
}
func newCodexStatusErr(statusCode int, body []byte) statusErr {
err := statusErr{code: statusCode, msg: string(body)}
if retryAfter := parseCodexRetryAfter(statusCode, body, time.Now()); retryAfter != nil {
errCode := statusCode
if isCodexModelCapacityError(body) {
errCode = http.StatusTooManyRequests
}
err := statusErr{code: errCode, msg: string(body)}
if retryAfter := parseCodexRetryAfter(errCode, body, time.Now()); retryAfter != nil {
err.retryAfter = retryAfter
}
return err
}
func normalizeCodexInstructions(body []byte) []byte {
instructions := gjson.GetBytes(body, "instructions")
if !instructions.Exists() || instructions.Type == gjson.Null {
body, _ = sjson.SetBytes(body, "instructions", "")
}
return body
}
func isCodexModelCapacityError(errorBody []byte) bool {
if len(errorBody) == 0 {
return false
}
candidates := []string{
gjson.GetBytes(errorBody, "error.message").String(),
gjson.GetBytes(errorBody, "message").String(),
string(errorBody),
}
for _, candidate := range candidates {
lower := strings.ToLower(strings.TrimSpace(candidate))
if lower == "" {
continue
}
if strings.Contains(lower, "selected model is at capacity") ||
strings.Contains(lower, "model is at capacity. please try a different model") {
return true
}
}
return false
}
func parseCodexRetryAfter(statusCode int, errorBody []byte, now time.Time) *time.Duration {
if statusCode != http.StatusTooManyRequests || len(errorBody) == 0 {
return nil

View File

@@ -42,8 +42,8 @@ func TestCodexExecutorCacheHelper_OpenAIChatCompletions_StablePromptCacheKeyFrom
if gotKey != expectedKey {
t.Fatalf("prompt_cache_key = %q, want %q", gotKey, expectedKey)
}
if gotConversation := httpReq.Header.Get("Conversation_id"); gotConversation != expectedKey {
t.Fatalf("Conversation_id = %q, want %q", gotConversation, expectedKey)
if gotConversation := httpReq.Header.Get("Conversation_id"); gotConversation != "" {
t.Fatalf("Conversation_id = %q, want empty", gotConversation)
}
if gotSession := httpReq.Header.Get("Session_id"); gotSession != expectedKey {
t.Fatalf("Session_id = %q, want %q", gotSession, expectedKey)

View File

@@ -0,0 +1,79 @@
package executor
import (
"context"
"io"
"net/http"
"net/http/httptest"
"testing"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
cliproxyexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
sdktranslator "github.com/router-for-me/CLIProxyAPI/v6/sdk/translator"
"github.com/tidwall/gjson"
)
func TestCodexExecutorCompactAddsDefaultInstructions(t *testing.T) {
cases := []struct {
name string
payload string
}{
{
name: "missing instructions",
payload: `{"model":"gpt-5.4","input":"hello"}`,
},
{
name: "null instructions",
payload: `{"model":"gpt-5.4","instructions":null,"input":"hello"}`,
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
var gotPath string
var gotBody []byte
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotPath = r.URL.Path
body, _ := io.ReadAll(r.Body)
gotBody = body
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"id":"resp_1","object":"response.compaction","usage":{"input_tokens":1,"output_tokens":2,"total_tokens":3}}`))
}))
defer server.Close()
executor := NewCodexExecutor(&config.Config{})
auth := &cliproxyauth.Auth{Attributes: map[string]string{
"base_url": server.URL,
"api_key": "test",
}}
resp, err := executor.Execute(context.Background(), auth, cliproxyexecutor.Request{
Model: "gpt-5.4",
Payload: []byte(tc.payload),
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FromString("openai-response"),
Alt: "responses/compact",
Stream: false,
})
if err != nil {
t.Fatalf("Execute error: %v", err)
}
if gotPath != "/responses/compact" {
t.Fatalf("path = %q, want %q", gotPath, "/responses/compact")
}
if !gjson.GetBytes(gotBody, "instructions").Exists() {
t.Fatalf("expected instructions in compact request body, got %s", string(gotBody))
}
if gjson.GetBytes(gotBody, "instructions").Type != gjson.String {
t.Fatalf("instructions type = %v, want string", gjson.GetBytes(gotBody, "instructions").Type)
}
if gjson.GetBytes(gotBody, "instructions").String() != "" {
t.Fatalf("instructions = %q, want empty string", gjson.GetBytes(gotBody, "instructions").String())
}
if string(resp.Payload) != `{"id":"resp_1","object":"response.compaction","usage":{"input_tokens":1,"output_tokens":2,"total_tokens":3}}` {
t.Fatalf("payload = %s", string(resp.Payload))
}
})
}
}

View File

@@ -0,0 +1,123 @@
package executor
import (
"context"
"io"
"net/http"
"net/http/httptest"
"testing"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
cliproxyexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
sdktranslator "github.com/router-for-me/CLIProxyAPI/v6/sdk/translator"
"github.com/tidwall/gjson"
)
func TestCodexExecutorExecuteNormalizesNullInstructions(t *testing.T) {
var gotPath string
var gotBody []byte
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotPath = r.URL.Path
body, _ := io.ReadAll(r.Body)
gotBody = body
w.Header().Set("Content-Type", "text/event-stream")
_, _ = w.Write([]byte("data: {\"type\":\"response.completed\",\"response\":{\"id\":\"resp_1\",\"object\":\"response\",\"created_at\":0,\"status\":\"completed\",\"background\":false,\"error\":null}}\n\n"))
}))
defer server.Close()
executor := NewCodexExecutor(&config.Config{})
auth := &cliproxyauth.Auth{Attributes: map[string]string{
"base_url": server.URL,
"api_key": "test",
}}
_, err := executor.Execute(context.Background(), auth, cliproxyexecutor.Request{
Model: "gpt-5.4",
Payload: []byte(`{"model":"gpt-5.4","instructions":null,"input":"hello"}`),
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FromString("openai-response"),
Stream: false,
})
if err != nil {
t.Fatalf("Execute error: %v", err)
}
if gotPath != "/responses" {
t.Fatalf("path = %q, want %q", gotPath, "/responses")
}
if gjson.GetBytes(gotBody, "instructions").Type != gjson.String {
t.Fatalf("instructions type = %v, want string", gjson.GetBytes(gotBody, "instructions").Type)
}
if gjson.GetBytes(gotBody, "instructions").String() != "" {
t.Fatalf("instructions = %q, want empty string", gjson.GetBytes(gotBody, "instructions").String())
}
}
func TestCodexExecutorExecuteStreamNormalizesNullInstructions(t *testing.T) {
var gotPath string
var gotBody []byte
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotPath = r.URL.Path
body, _ := io.ReadAll(r.Body)
gotBody = body
w.Header().Set("Content-Type", "text/event-stream")
_, _ = w.Write([]byte("data: {\"type\":\"response.completed\",\"response\":{\"id\":\"resp_1\",\"object\":\"response\",\"created_at\":0,\"status\":\"completed\",\"background\":false,\"error\":null}}\n\n"))
}))
defer server.Close()
executor := NewCodexExecutor(&config.Config{})
auth := &cliproxyauth.Auth{Attributes: map[string]string{
"base_url": server.URL,
"api_key": "test",
}}
result, err := executor.ExecuteStream(context.Background(), auth, cliproxyexecutor.Request{
Model: "gpt-5.4",
Payload: []byte(`{"model":"gpt-5.4","instructions":null,"input":"hello"}`),
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FromString("openai-response"),
Stream: true,
})
if err != nil {
t.Fatalf("ExecuteStream error: %v", err)
}
for range result.Chunks {
}
if gotPath != "/responses" {
t.Fatalf("path = %q, want %q", gotPath, "/responses")
}
if gjson.GetBytes(gotBody, "instructions").Type != gjson.String {
t.Fatalf("instructions type = %v, want string", gjson.GetBytes(gotBody, "instructions").Type)
}
if gjson.GetBytes(gotBody, "instructions").String() != "" {
t.Fatalf("instructions = %q, want empty string", gjson.GetBytes(gotBody, "instructions").String())
}
}
func TestCodexExecutorCountTokensTreatsNullInstructionsAsEmpty(t *testing.T) {
executor := NewCodexExecutor(&config.Config{})
nullResp, err := executor.CountTokens(context.Background(), nil, cliproxyexecutor.Request{
Model: "gpt-5.4",
Payload: []byte(`{"model":"gpt-5.4","instructions":null,"input":"hello"}`),
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FromString("openai-response"),
})
if err != nil {
t.Fatalf("CountTokens(null) error: %v", err)
}
emptyResp, err := executor.CountTokens(context.Background(), nil, cliproxyexecutor.Request{
Model: "gpt-5.4",
Payload: []byte(`{"model":"gpt-5.4","instructions":"","input":"hello"}`),
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FromString("openai-response"),
})
if err != nil {
t.Fatalf("CountTokens(empty) error: %v", err)
}
if string(nullResp.Payload) != string(emptyResp.Payload) {
t.Fatalf("token count payload mismatch:\nnull=%s\nempty=%s", string(nullResp.Payload), string(emptyResp.Payload))
}
}

View File

@@ -60,6 +60,19 @@ func TestParseCodexRetryAfter(t *testing.T) {
})
}
func TestNewCodexStatusErrTreatsCapacityAsRetryableRateLimit(t *testing.T) {
body := []byte(`{"error":{"message":"Selected model is at capacity. Please try a different model."}}`)
err := newCodexStatusErr(http.StatusBadRequest, body)
if got := err.StatusCode(); got != http.StatusTooManyRequests {
t.Fatalf("status code = %d, want %d", got, http.StatusTooManyRequests)
}
if err.RetryAfter() != nil {
t.Fatalf("expected nil explicit retryAfter for capacity fallback, got %v", *err.RetryAfter())
}
}
func itoa(v int64) string {
return strconv.FormatInt(v, 10)
}

View File

@@ -0,0 +1,46 @@
package executor
import (
"context"
"net/http"
"net/http/httptest"
"testing"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
_ "github.com/router-for-me/CLIProxyAPI/v6/internal/translator"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
cliproxyexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
sdktranslator "github.com/router-for-me/CLIProxyAPI/v6/sdk/translator"
"github.com/tidwall/gjson"
)
func TestCodexExecutorExecute_EmptyStreamCompletionOutputUsesOutputItemDone(t *testing.T) {
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "text/event-stream")
_, _ = w.Write([]byte("data: {\"type\":\"response.output_item.done\",\"item\":{\"type\":\"message\",\"role\":\"assistant\",\"content\":[{\"type\":\"output_text\",\"text\":\"ok\"}]},\"output_index\":0}\n"))
_, _ = w.Write([]byte("data: {\"type\":\"response.completed\",\"response\":{\"id\":\"resp_1\",\"object\":\"response\",\"created_at\":1775555723,\"status\":\"completed\",\"model\":\"gpt-5.4-mini-2026-03-17\",\"output\":[],\"usage\":{\"input_tokens\":8,\"output_tokens\":28,\"total_tokens\":36}}}\n\n"))
}))
defer server.Close()
executor := NewCodexExecutor(&config.Config{})
auth := &cliproxyauth.Auth{Attributes: map[string]string{
"base_url": server.URL,
"api_key": "test",
}}
resp, err := executor.Execute(context.Background(), auth, cliproxyexecutor.Request{
Model: "gpt-5.4-mini",
Payload: []byte(`{"model":"gpt-5.4-mini","messages":[{"role":"user","content":"Say ok"}]}`),
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FromString("openai"),
Stream: false,
})
if err != nil {
t.Fatalf("Execute error: %v", err)
}
gotContent := gjson.GetBytes(resp.Payload, "choices.0.message.content").String()
if gotContent != "ok" {
t.Fatalf("choices.0.message.content = %q, want %q; payload=%s", gotContent, "ok", string(resp.Payload))
}
}

View File

@@ -15,10 +15,12 @@ import (
"sync"
"time"
"github.com/gin-gonic/gin"
"github.com/google/uuid"
"github.com/gorilla/websocket"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/misc"
"github.com/router-for-me/CLIProxyAPI/v6/internal/runtime/executor/helps"
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
@@ -44,10 +46,18 @@ const (
type CodexWebsocketsExecutor struct {
*CodexExecutor
sessMu sync.Mutex
store *codexWebsocketSessionStore
}
type codexWebsocketSessionStore struct {
mu sync.Mutex
sessions map[string]*codexWebsocketSession
}
var globalCodexWebsocketSessionStore = &codexWebsocketSessionStore{
sessions: make(map[string]*codexWebsocketSession),
}
type codexWebsocketSession struct {
sessionID string
@@ -71,7 +81,7 @@ type codexWebsocketSession struct {
func NewCodexWebsocketsExecutor(cfg *config.Config) *CodexWebsocketsExecutor {
return &CodexWebsocketsExecutor{
CodexExecutor: NewCodexExecutor(cfg),
sessions: make(map[string]*codexWebsocketSession),
store: globalCodexWebsocketSessionStore,
}
}
@@ -155,8 +165,8 @@ func (e *CodexWebsocketsExecutor) Execute(ctx context.Context, auth *cliproxyaut
baseURL = "https://chatgpt.com/backend-api/codex"
}
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
reporter := helps.NewUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.TrackFailure(ctx, &err)
from := opts.SourceFormat
to := sdktranslator.FromString("codex")
@@ -173,8 +183,8 @@ func (e *CodexWebsocketsExecutor) Execute(ctx context.Context, auth *cliproxyaut
return resp, err
}
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
requestedModel := helps.PayloadRequestedModel(opts, req.Model)
body = helps.ApplyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
body, _ = sjson.SetBytes(body, "stream", true)
body, _ = sjson.DeleteBytes(body, "previous_response_id")
@@ -209,7 +219,7 @@ func (e *CodexWebsocketsExecutor) Execute(ctx context.Context, auth *cliproxyaut
}
wsReqBody := buildCodexWebsocketRequestBody(body)
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
wsReqLog := helps.UpstreamRequestLog{
URL: wsURL,
Method: "WEBSOCKET",
Headers: wsHeaders.Clone(),
@@ -219,16 +229,14 @@ func (e *CodexWebsocketsExecutor) Execute(ctx context.Context, auth *cliproxyaut
AuthLabel: authLabel,
AuthType: authType,
AuthValue: authValue,
})
}
helps.RecordAPIWebsocketRequest(ctx, e.cfg, wsReqLog)
conn, respHS, errDial := e.ensureUpstreamConn(ctx, auth, sess, authID, wsURL, wsHeaders)
if respHS != nil {
recordAPIResponseMetadata(ctx, e.cfg, respHS.StatusCode, respHS.Header.Clone())
}
if errDial != nil {
bodyErr := websocketHandshakeBody(respHS)
if len(bodyErr) > 0 {
appendAPIResponseChunk(ctx, e.cfg, bodyErr)
if respHS != nil {
helps.RecordAPIWebsocketUpgradeRejection(ctx, e.cfg, websocketUpgradeRequestLog(wsReqLog), respHS.StatusCode, respHS.Header.Clone(), bodyErr)
}
if respHS != nil && respHS.StatusCode == http.StatusUpgradeRequired {
return e.CodexExecutor.Execute(ctx, auth, req, opts)
@@ -236,10 +244,10 @@ func (e *CodexWebsocketsExecutor) Execute(ctx context.Context, auth *cliproxyaut
if respHS != nil && respHS.StatusCode > 0 {
return resp, statusErr{code: respHS.StatusCode, msg: string(bodyErr)}
}
recordAPIResponseError(ctx, e.cfg, errDial)
helps.RecordAPIWebsocketError(ctx, e.cfg, "dial", errDial)
return resp, errDial
}
closeHTTPResponseBody(respHS, "codex websockets executor: close handshake response body error")
recordAPIWebsocketHandshake(ctx, e.cfg, respHS)
if sess == nil {
logCodexWebsocketConnected(executionSessionID, authID, wsURL)
defer func() {
@@ -268,10 +276,10 @@ func (e *CodexWebsocketsExecutor) Execute(ctx context.Context, auth *cliproxyaut
// Retry once with a fresh websocket connection. This is mainly to handle
// upstream closing the socket between sequential requests within the same
// execution session.
connRetry, _, errDialRetry := e.ensureUpstreamConn(ctx, auth, sess, authID, wsURL, wsHeaders)
connRetry, respHSRetry, errDialRetry := e.ensureUpstreamConn(ctx, auth, sess, authID, wsURL, wsHeaders)
if errDialRetry == nil && connRetry != nil {
wsReqBodyRetry := buildCodexWebsocketRequestBody(body)
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIWebsocketRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: wsURL,
Method: "WEBSOCKET",
Headers: wsHeaders.Clone(),
@@ -282,20 +290,22 @@ func (e *CodexWebsocketsExecutor) Execute(ctx context.Context, auth *cliproxyaut
AuthType: authType,
AuthValue: authValue,
})
recordAPIWebsocketHandshake(ctx, e.cfg, respHSRetry)
if errSendRetry := writeCodexWebsocketMessage(sess, connRetry, wsReqBodyRetry); errSendRetry == nil {
conn = connRetry
wsReqBody = wsReqBodyRetry
} else {
e.invalidateUpstreamConn(sess, connRetry, "send_error", errSendRetry)
recordAPIResponseError(ctx, e.cfg, errSendRetry)
helps.RecordAPIWebsocketError(ctx, e.cfg, "send_retry", errSendRetry)
return resp, errSendRetry
}
} else {
recordAPIResponseError(ctx, e.cfg, errDialRetry)
closeHTTPResponseBody(respHSRetry, "codex websockets executor: close handshake response body error")
helps.RecordAPIWebsocketError(ctx, e.cfg, "dial_retry", errDialRetry)
return resp, errDialRetry
}
} else {
recordAPIResponseError(ctx, e.cfg, errSend)
helps.RecordAPIWebsocketError(ctx, e.cfg, "send", errSend)
return resp, errSend
}
}
@@ -306,7 +316,7 @@ func (e *CodexWebsocketsExecutor) Execute(ctx context.Context, auth *cliproxyaut
}
msgType, payload, errRead := readCodexWebsocketMessage(ctx, sess, conn, readCh)
if errRead != nil {
recordAPIResponseError(ctx, e.cfg, errRead)
helps.RecordAPIWebsocketError(ctx, e.cfg, "read", errRead)
return resp, errRead
}
if msgType != websocket.TextMessage {
@@ -315,7 +325,7 @@ func (e *CodexWebsocketsExecutor) Execute(ctx context.Context, auth *cliproxyaut
if sess != nil {
e.invalidateUpstreamConn(sess, conn, "unexpected_binary", err)
}
recordAPIResponseError(ctx, e.cfg, err)
helps.RecordAPIWebsocketError(ctx, e.cfg, "unexpected_binary", err)
return resp, err
}
continue
@@ -325,21 +335,21 @@ func (e *CodexWebsocketsExecutor) Execute(ctx context.Context, auth *cliproxyaut
if len(payload) == 0 {
continue
}
appendAPIResponseChunk(ctx, e.cfg, payload)
helps.AppendAPIWebsocketResponse(ctx, e.cfg, payload)
if wsErr, ok := parseCodexWebsocketError(payload); ok {
if sess != nil {
e.invalidateUpstreamConn(sess, conn, "upstream_error", wsErr)
}
recordAPIResponseError(ctx, e.cfg, wsErr)
helps.RecordAPIWebsocketError(ctx, e.cfg, "upstream_error", wsErr)
return resp, wsErr
}
payload = normalizeCodexWebsocketCompletion(payload)
eventType := gjson.GetBytes(payload, "type").String()
if eventType == "response.completed" {
if detail, ok := parseCodexUsage(payload); ok {
reporter.publish(ctx, detail)
if detail, ok := helps.ParseCodexUsage(payload); ok {
reporter.Publish(ctx, detail)
}
var param any
out := sdktranslator.TranslateNonStream(ctx, to, from, req.Model, originalPayload, body, payload, &param)
@@ -364,8 +374,8 @@ func (e *CodexWebsocketsExecutor) ExecuteStream(ctx context.Context, auth *clipr
baseURL = "https://chatgpt.com/backend-api/codex"
}
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
reporter := helps.NewUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.TrackFailure(ctx, &err)
from := opts.SourceFormat
to := sdktranslator.FromString("codex")
@@ -376,8 +386,8 @@ func (e *CodexWebsocketsExecutor) ExecuteStream(ctx context.Context, auth *clipr
return nil, err
}
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, body, requestedModel)
requestedModel := helps.PayloadRequestedModel(opts, req.Model)
body = helps.ApplyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, body, requestedModel)
httpURL := strings.TrimSuffix(baseURL, "/") + "/responses"
wsURL, err := buildCodexResponsesWebsocketURL(httpURL)
@@ -403,7 +413,7 @@ func (e *CodexWebsocketsExecutor) ExecuteStream(ctx context.Context, auth *clipr
}
wsReqBody := buildCodexWebsocketRequestBody(body)
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
wsReqLog := helps.UpstreamRequestLog{
URL: wsURL,
Method: "WEBSOCKET",
Headers: wsHeaders.Clone(),
@@ -413,18 +423,18 @@ func (e *CodexWebsocketsExecutor) ExecuteStream(ctx context.Context, auth *clipr
AuthLabel: authLabel,
AuthType: authType,
AuthValue: authValue,
})
}
helps.RecordAPIWebsocketRequest(ctx, e.cfg, wsReqLog)
conn, respHS, errDial := e.ensureUpstreamConn(ctx, auth, sess, authID, wsURL, wsHeaders)
var upstreamHeaders http.Header
if respHS != nil {
upstreamHeaders = respHS.Header.Clone()
recordAPIResponseMetadata(ctx, e.cfg, respHS.StatusCode, respHS.Header.Clone())
}
if errDial != nil {
bodyErr := websocketHandshakeBody(respHS)
if len(bodyErr) > 0 {
appendAPIResponseChunk(ctx, e.cfg, bodyErr)
if respHS != nil {
helps.RecordAPIWebsocketUpgradeRejection(ctx, e.cfg, websocketUpgradeRequestLog(wsReqLog), respHS.StatusCode, respHS.Header.Clone(), bodyErr)
}
if respHS != nil && respHS.StatusCode == http.StatusUpgradeRequired {
return e.CodexExecutor.ExecuteStream(ctx, auth, req, opts)
@@ -432,13 +442,13 @@ func (e *CodexWebsocketsExecutor) ExecuteStream(ctx context.Context, auth *clipr
if respHS != nil && respHS.StatusCode > 0 {
return nil, statusErr{code: respHS.StatusCode, msg: string(bodyErr)}
}
recordAPIResponseError(ctx, e.cfg, errDial)
helps.RecordAPIWebsocketError(ctx, e.cfg, "dial", errDial)
if sess != nil {
sess.reqMu.Unlock()
}
return nil, errDial
}
closeHTTPResponseBody(respHS, "codex websockets executor: close handshake response body error")
recordAPIWebsocketHandshake(ctx, e.cfg, respHS)
if sess == nil {
logCodexWebsocketConnected(executionSessionID, authID, wsURL)
@@ -451,20 +461,21 @@ func (e *CodexWebsocketsExecutor) ExecuteStream(ctx context.Context, auth *clipr
}
if errSend := writeCodexWebsocketMessage(sess, conn, wsReqBody); errSend != nil {
recordAPIResponseError(ctx, e.cfg, errSend)
helps.RecordAPIWebsocketError(ctx, e.cfg, "send", errSend)
if sess != nil {
e.invalidateUpstreamConn(sess, conn, "send_error", errSend)
// Retry once with a new websocket connection for the same execution session.
connRetry, _, errDialRetry := e.ensureUpstreamConn(ctx, auth, sess, authID, wsURL, wsHeaders)
connRetry, respHSRetry, errDialRetry := e.ensureUpstreamConn(ctx, auth, sess, authID, wsURL, wsHeaders)
if errDialRetry != nil || connRetry == nil {
recordAPIResponseError(ctx, e.cfg, errDialRetry)
closeHTTPResponseBody(respHSRetry, "codex websockets executor: close handshake response body error")
helps.RecordAPIWebsocketError(ctx, e.cfg, "dial_retry", errDialRetry)
sess.clearActive(readCh)
sess.reqMu.Unlock()
return nil, errDialRetry
}
wsReqBodyRetry := buildCodexWebsocketRequestBody(body)
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIWebsocketRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: wsURL,
Method: "WEBSOCKET",
Headers: wsHeaders.Clone(),
@@ -475,8 +486,9 @@ func (e *CodexWebsocketsExecutor) ExecuteStream(ctx context.Context, auth *clipr
AuthType: authType,
AuthValue: authValue,
})
recordAPIWebsocketHandshake(ctx, e.cfg, respHSRetry)
if errSendRetry := writeCodexWebsocketMessage(sess, connRetry, wsReqBodyRetry); errSendRetry != nil {
recordAPIResponseError(ctx, e.cfg, errSendRetry)
helps.RecordAPIWebsocketError(ctx, e.cfg, "send_retry", errSendRetry)
e.invalidateUpstreamConn(sess, connRetry, "send_error", errSendRetry)
sess.clearActive(readCh)
sess.reqMu.Unlock()
@@ -542,8 +554,8 @@ func (e *CodexWebsocketsExecutor) ExecuteStream(ctx context.Context, auth *clipr
}
terminateReason = "read_error"
terminateErr = errRead
recordAPIResponseError(ctx, e.cfg, errRead)
reporter.publishFailure(ctx)
helps.RecordAPIWebsocketError(ctx, e.cfg, "read", errRead)
reporter.PublishFailure(ctx)
_ = send(cliproxyexecutor.StreamChunk{Err: errRead})
return
}
@@ -552,8 +564,8 @@ func (e *CodexWebsocketsExecutor) ExecuteStream(ctx context.Context, auth *clipr
err = fmt.Errorf("codex websockets executor: unexpected binary message")
terminateReason = "unexpected_binary"
terminateErr = err
recordAPIResponseError(ctx, e.cfg, err)
reporter.publishFailure(ctx)
helps.RecordAPIWebsocketError(ctx, e.cfg, "unexpected_binary", err)
reporter.PublishFailure(ctx)
if sess != nil {
e.invalidateUpstreamConn(sess, conn, "unexpected_binary", err)
}
@@ -567,13 +579,13 @@ func (e *CodexWebsocketsExecutor) ExecuteStream(ctx context.Context, auth *clipr
if len(payload) == 0 {
continue
}
appendAPIResponseChunk(ctx, e.cfg, payload)
helps.AppendAPIWebsocketResponse(ctx, e.cfg, payload)
if wsErr, ok := parseCodexWebsocketError(payload); ok {
terminateReason = "upstream_error"
terminateErr = wsErr
recordAPIResponseError(ctx, e.cfg, wsErr)
reporter.publishFailure(ctx)
helps.RecordAPIWebsocketError(ctx, e.cfg, "upstream_error", wsErr)
reporter.PublishFailure(ctx)
if sess != nil {
e.invalidateUpstreamConn(sess, conn, "upstream_error", wsErr)
}
@@ -584,8 +596,8 @@ func (e *CodexWebsocketsExecutor) ExecuteStream(ctx context.Context, auth *clipr
payload = normalizeCodexWebsocketCompletion(payload)
eventType := gjson.GetBytes(payload, "type").String()
if eventType == "response.completed" || eventType == "response.done" {
if detail, ok := parseCodexUsage(payload); ok {
reporter.publish(ctx, detail)
if detail, ok := helps.ParseCodexUsage(payload); ok {
reporter.Publish(ctx, detail)
}
}
@@ -722,7 +734,7 @@ func newProxyAwareWebsocketDialer(cfg *config.Config, auth *cliproxyauth.Auth) *
}
switch setting.URL.Scheme {
case "socks5":
case "socks5", "socks5h":
var proxyAuth *proxy.Auth
if setting.URL.User != nil {
username := setting.URL.User.Username()
@@ -767,19 +779,19 @@ func applyCodexPromptCacheHeaders(from sdktranslator.Format, req cliproxyexecuto
return rawJSON, headers
}
var cache codexCache
var cache helps.CodexCache
if from == "claude" {
userIDResult := gjson.GetBytes(req.Payload, "metadata.user_id")
if userIDResult.Exists() {
key := fmt.Sprintf("%s-%s", req.Model, userIDResult.String())
if cached, ok := getCodexCache(key); ok {
if cached, ok := helps.GetCodexCache(key); ok {
cache = cached
} else {
cache = codexCache{
cache = helps.CodexCache{
ID: uuid.New().String(),
Expire: time.Now().Add(1 * time.Hour),
}
setCodexCache(key, cache)
helps.SetCodexCache(key, cache)
}
}
} else if from == "openai-response" {
@@ -791,7 +803,6 @@ func applyCodexPromptCacheHeaders(from sdktranslator.Format, req cliproxyexecuto
if cache.ID != "" {
rawJSON, _ = sjson.SetBytes(rawJSON, "prompt_cache_key", cache.ID)
headers.Set("Conversation_id", cache.ID)
headers.Set("Session_id", cache.ID)
}
return rawJSON, headers
@@ -806,11 +817,11 @@ func applyCodexWebsocketHeaders(ctx context.Context, headers http.Header, auth *
}
var ginHeaders http.Header
if ginCtx := ginContextFrom(ctx); ginCtx != nil && ginCtx.Request != nil {
ginHeaders = ginCtx.Request.Header
if ginCtx, ok := ctx.Value("gin").(*gin.Context); ok && ginCtx != nil && ginCtx.Request != nil {
ginHeaders = ginCtx.Request.Header.Clone()
}
cfgUserAgent, cfgBetaFeatures := codexHeaderDefaults(cfg, auth)
_, cfgBetaFeatures := codexHeaderDefaults(cfg, auth)
ensureHeaderWithPriority(headers, ginHeaders, "x-codex-beta-features", cfgBetaFeatures, "")
misc.EnsureHeader(headers, ginHeaders, "x-codex-turn-state", "")
misc.EnsureHeader(headers, ginHeaders, "x-codex-turn-metadata", "")
@@ -826,8 +837,10 @@ func applyCodexWebsocketHeaders(ctx context.Context, headers http.Header, auth *
betaHeader = codexResponsesWebsocketBetaHeaderValue
}
headers.Set("OpenAI-Beta", betaHeader)
misc.EnsureHeader(headers, ginHeaders, "Session_id", uuid.NewString())
ensureHeaderWithConfigPrecedence(headers, ginHeaders, "User-Agent", cfgUserAgent, codexUserAgent)
if strings.Contains(headers.Get("User-Agent"), "Mac OS") {
misc.EnsureHeader(headers, ginHeaders, "Session_id", uuid.NewString())
}
headers.Del("User-Agent")
isAPIKey := false
if auth != nil && auth.Attributes != nil {
@@ -1011,6 +1024,32 @@ func encodeCodexWebsocketAsSSE(payload []byte) []byte {
return line
}
func websocketUpgradeRequestLog(info helps.UpstreamRequestLog) helps.UpstreamRequestLog {
upgradeInfo := info
upgradeInfo.URL = helps.WebsocketUpgradeRequestURL(info.URL)
upgradeInfo.Method = http.MethodGet
upgradeInfo.Body = nil
upgradeInfo.Headers = info.Headers.Clone()
if upgradeInfo.Headers == nil {
upgradeInfo.Headers = make(http.Header)
}
if strings.TrimSpace(upgradeInfo.Headers.Get("Connection")) == "" {
upgradeInfo.Headers.Set("Connection", "Upgrade")
}
if strings.TrimSpace(upgradeInfo.Headers.Get("Upgrade")) == "" {
upgradeInfo.Headers.Set("Upgrade", "websocket")
}
return upgradeInfo
}
func recordAPIWebsocketHandshake(ctx context.Context, cfg *config.Config, resp *http.Response) {
if resp == nil {
return
}
helps.RecordAPIWebsocketHandshake(ctx, cfg, resp.StatusCode, resp.Header.Clone())
closeHTTPResponseBody(resp, "codex websockets executor: close handshake response body error")
}
func websocketHandshakeBody(resp *http.Response) []byte {
if resp == nil || resp.Body == nil {
return nil
@@ -1055,16 +1094,23 @@ func (e *CodexWebsocketsExecutor) getOrCreateSession(sessionID string) *codexWeb
if sessionID == "" {
return nil
}
e.sessMu.Lock()
defer e.sessMu.Unlock()
if e.sessions == nil {
e.sessions = make(map[string]*codexWebsocketSession)
if e == nil {
return nil
}
if sess, ok := e.sessions[sessionID]; ok && sess != nil {
store := e.store
if store == nil {
store = globalCodexWebsocketSessionStore
}
store.mu.Lock()
defer store.mu.Unlock()
if store.sessions == nil {
store.sessions = make(map[string]*codexWebsocketSession)
}
if sess, ok := store.sessions[sessionID]; ok && sess != nil {
return sess
}
sess := &codexWebsocketSession{sessionID: sessionID}
e.sessions[sessionID] = sess
store.sessions[sessionID] = sess
return sess
}
@@ -1210,14 +1256,20 @@ func (e *CodexWebsocketsExecutor) CloseExecutionSession(sessionID string) {
return
}
if sessionID == cliproxyauth.CloseAllExecutionSessionsID {
e.closeAllExecutionSessions("executor_replaced")
// Executor replacement can happen during hot reload (config/credential changes).
// Do not force-close upstream websocket sessions here, otherwise in-flight
// downstream websocket requests get interrupted.
return
}
e.sessMu.Lock()
sess := e.sessions[sessionID]
delete(e.sessions, sessionID)
e.sessMu.Unlock()
store := e.store
if store == nil {
store = globalCodexWebsocketSessionStore
}
store.mu.Lock()
sess := store.sessions[sessionID]
delete(store.sessions, sessionID)
store.mu.Unlock()
e.closeExecutionSession(sess, "session_closed")
}
@@ -1227,15 +1279,19 @@ func (e *CodexWebsocketsExecutor) closeAllExecutionSessions(reason string) {
return
}
e.sessMu.Lock()
sessions := make([]*codexWebsocketSession, 0, len(e.sessions))
for sessionID, sess := range e.sessions {
delete(e.sessions, sessionID)
store := e.store
if store == nil {
store = globalCodexWebsocketSessionStore
}
store.mu.Lock()
sessions := make([]*codexWebsocketSession, 0, len(store.sessions))
for sessionID, sess := range store.sessions {
delete(store.sessions, sessionID)
if sess != nil {
sessions = append(sessions, sess)
}
}
e.sessMu.Unlock()
store.mu.Unlock()
for i := range sessions {
e.closeExecutionSession(sessions[i], reason)
@@ -1243,6 +1299,10 @@ func (e *CodexWebsocketsExecutor) closeAllExecutionSessions(reason string) {
}
func (e *CodexWebsocketsExecutor) closeExecutionSession(sess *codexWebsocketSession, reason string) {
closeCodexWebsocketSession(sess, reason)
}
func closeCodexWebsocketSession(sess *codexWebsocketSession, reason string) {
if sess == nil {
return
}
@@ -1283,6 +1343,69 @@ func logCodexWebsocketDisconnected(sessionID string, authID string, wsURL string
log.Infof("codex websockets: upstream disconnected session=%s auth=%s url=%s reason=%s", strings.TrimSpace(sessionID), strings.TrimSpace(authID), strings.TrimSpace(wsURL), strings.TrimSpace(reason))
}
// CloseCodexWebsocketSessionsForAuthID closes all active Codex upstream websocket sessions
// associated with the supplied auth ID.
func CloseCodexWebsocketSessionsForAuthID(authID string, reason string) {
authID = strings.TrimSpace(authID)
if authID == "" {
return
}
reason = strings.TrimSpace(reason)
if reason == "" {
reason = "auth_removed"
}
store := globalCodexWebsocketSessionStore
if store == nil {
return
}
type sessionItem struct {
sessionID string
sess *codexWebsocketSession
}
store.mu.Lock()
items := make([]sessionItem, 0, len(store.sessions))
for sessionID, sess := range store.sessions {
items = append(items, sessionItem{sessionID: sessionID, sess: sess})
}
store.mu.Unlock()
matches := make([]sessionItem, 0)
for i := range items {
sess := items[i].sess
if sess == nil {
continue
}
sess.connMu.Lock()
sessAuthID := strings.TrimSpace(sess.authID)
sess.connMu.Unlock()
if sessAuthID == authID {
matches = append(matches, items[i])
}
}
if len(matches) == 0 {
return
}
toClose := make([]*codexWebsocketSession, 0, len(matches))
store.mu.Lock()
for i := range matches {
current, ok := store.sessions[matches[i].sessionID]
if !ok || current == nil || current != matches[i].sess {
continue
}
delete(store.sessions, matches[i].sessionID)
toClose = append(toClose, current)
}
store.mu.Unlock()
for i := range toClose {
closeCodexWebsocketSession(toClose[i], reason)
}
}
// CodexAutoExecutor routes Codex requests to the websocket transport only when:
// 1. The downstream transport is websocket, and
// 2. The selected auth enables websockets.

View File

@@ -0,0 +1,48 @@
package executor
import (
"testing"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
)
func TestCodexWebsocketsExecutor_SessionStoreSurvivesExecutorReplacement(t *testing.T) {
sessionID := "test-session-store-survives-replace"
globalCodexWebsocketSessionStore.mu.Lock()
delete(globalCodexWebsocketSessionStore.sessions, sessionID)
globalCodexWebsocketSessionStore.mu.Unlock()
exec1 := NewCodexWebsocketsExecutor(nil)
sess1 := exec1.getOrCreateSession(sessionID)
if sess1 == nil {
t.Fatalf("expected session to be created")
}
exec2 := NewCodexWebsocketsExecutor(nil)
sess2 := exec2.getOrCreateSession(sessionID)
if sess2 == nil {
t.Fatalf("expected session to be available across executors")
}
if sess1 != sess2 {
t.Fatalf("expected the same session instance across executors")
}
exec1.CloseExecutionSession(cliproxyauth.CloseAllExecutionSessionsID)
globalCodexWebsocketSessionStore.mu.Lock()
_, stillPresent := globalCodexWebsocketSessionStore.sessions[sessionID]
globalCodexWebsocketSessionStore.mu.Unlock()
if !stillPresent {
t.Fatalf("expected session to remain after executor replacement close marker")
}
exec2.CloseExecutionSession(sessionID)
globalCodexWebsocketSessionStore.mu.Lock()
_, presentAfterClose := globalCodexWebsocketSessionStore.sessions[sessionID]
globalCodexWebsocketSessionStore.mu.Unlock()
if presentAfterClose {
t.Fatalf("expected session to be removed after explicit close")
}
}

View File

@@ -38,8 +38,8 @@ func TestApplyCodexWebsocketHeadersDefaultsToCurrentResponsesBeta(t *testing.T)
if got := headers.Get("OpenAI-Beta"); got != codexResponsesWebsocketBetaHeaderValue {
t.Fatalf("OpenAI-Beta = %s, want %s", got, codexResponsesWebsocketBetaHeaderValue)
}
if got := headers.Get("User-Agent"); got != codexUserAgent {
t.Fatalf("User-Agent = %s, want %s", got, codexUserAgent)
if got := headers.Get("User-Agent"); got != "" {
t.Fatalf("User-Agent = %s, want empty", got)
}
if got := headers.Get("Version"); got != "" {
t.Fatalf("Version = %q, want empty", got)
@@ -97,8 +97,8 @@ func TestApplyCodexWebsocketHeadersUsesConfigDefaultsForOAuth(t *testing.T) {
headers := applyCodexWebsocketHeaders(context.Background(), http.Header{}, auth, "", cfg)
if got := headers.Get("User-Agent"); got != "my-codex-client/1.0" {
t.Fatalf("User-Agent = %s, want %s", got, "my-codex-client/1.0")
if got := headers.Get("User-Agent"); got != "" {
t.Fatalf("User-Agent = %s, want empty", got)
}
if got := headers.Get("x-codex-beta-features"); got != "feature-a,feature-b" {
t.Fatalf("x-codex-beta-features = %s, want %s", got, "feature-a,feature-b")
@@ -129,8 +129,8 @@ func TestApplyCodexWebsocketHeadersPrefersExistingHeadersOverClientAndConfig(t *
got := applyCodexWebsocketHeaders(ctx, headers, auth, "", cfg)
if gotVal := got.Get("User-Agent"); gotVal != "existing-ua" {
t.Fatalf("User-Agent = %s, want %s", gotVal, "existing-ua")
if gotVal := got.Get("User-Agent"); gotVal != "" {
t.Fatalf("User-Agent = %s, want empty", gotVal)
}
if gotVal := got.Get("x-codex-beta-features"); gotVal != "existing-beta" {
t.Fatalf("x-codex-beta-features = %s, want %s", gotVal, "existing-beta")
@@ -155,8 +155,8 @@ func TestApplyCodexWebsocketHeadersConfigUserAgentOverridesClientHeader(t *testi
headers := applyCodexWebsocketHeaders(ctx, http.Header{}, auth, "", cfg)
if got := headers.Get("User-Agent"); got != "config-ua" {
t.Fatalf("User-Agent = %s, want %s", got, "config-ua")
if got := headers.Get("User-Agent"); got != "" {
t.Fatalf("User-Agent = %s, want empty", got)
}
if got := headers.Get("x-codex-beta-features"); got != "client-beta" {
t.Fatalf("x-codex-beta-features = %s, want %s", got, "client-beta")
@@ -177,8 +177,8 @@ func TestApplyCodexWebsocketHeadersIgnoresConfigForAPIKeyAuth(t *testing.T) {
headers := applyCodexWebsocketHeaders(context.Background(), http.Header{}, auth, "sk-test", cfg)
if got := headers.Get("User-Agent"); got != codexUserAgent {
t.Fatalf("User-Agent = %s, want %s", got, codexUserAgent)
if got := headers.Get("User-Agent"); got != "" {
t.Fatalf("User-Agent = %s, want empty", got)
}
if got := headers.Get("x-codex-beta-features"); got != "" {
t.Fatalf("x-codex-beta-features = %q, want empty", got)

View File

@@ -0,0 +1,129 @@
package executor
import (
"context"
"net/http"
"time"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/runtime/executor/helps"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
cliproxyexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
"github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/usage"
"github.com/tidwall/gjson"
"github.com/tiktoken-go/tokenizer"
)
func newProxyAwareHTTPClient(ctx context.Context, cfg *config.Config, auth *cliproxyauth.Auth, timeout time.Duration) *http.Client {
return helps.NewProxyAwareHTTPClient(ctx, cfg, auth, timeout)
}
func parseOpenAIUsage(data []byte) usage.Detail {
return helps.ParseOpenAIUsage(data)
}
func parseOpenAIStreamUsage(line []byte) (usage.Detail, bool) {
return helps.ParseOpenAIStreamUsage(line)
}
func parseOpenAIResponsesUsage(data []byte) usage.Detail {
return helps.ParseOpenAIUsage(data)
}
func parseOpenAIResponsesStreamUsage(line []byte) (usage.Detail, bool) {
return helps.ParseOpenAIStreamUsage(line)
}
func getTokenizer(model string) (tokenizer.Codec, error) {
return helps.TokenizerForModel(model)
}
func countOpenAIChatTokens(enc tokenizer.Codec, payload []byte) (int64, error) {
return helps.CountOpenAIChatTokens(enc, payload)
}
func countClaudeChatTokens(enc tokenizer.Codec, payload []byte) (int64, error) {
return helps.CountClaudeChatTokens(enc, payload)
}
func buildOpenAIUsageJSON(count int64) []byte {
return helps.BuildOpenAIUsageJSON(count)
}
type upstreamRequestLog = helps.UpstreamRequestLog
func recordAPIRequest(ctx context.Context, cfg *config.Config, info upstreamRequestLog) {
helps.RecordAPIRequest(ctx, cfg, info)
}
func recordAPIResponseMetadata(ctx context.Context, cfg *config.Config, status int, headers http.Header) {
helps.RecordAPIResponseMetadata(ctx, cfg, status, headers)
}
func recordAPIResponseError(ctx context.Context, cfg *config.Config, err error) {
helps.RecordAPIResponseError(ctx, cfg, err)
}
func appendAPIResponseChunk(ctx context.Context, cfg *config.Config, chunk []byte) {
helps.AppendAPIResponseChunk(ctx, cfg, chunk)
}
func payloadRequestedModel(opts cliproxyexecutor.Options, fallback string) string {
return helps.PayloadRequestedModel(opts, fallback)
}
func applyPayloadConfigWithRoot(cfg *config.Config, model, protocol, root string, payload, original []byte, requestedModel string) []byte {
return helps.ApplyPayloadConfigWithRoot(cfg, model, protocol, root, payload, original, requestedModel)
}
func summarizeErrorBody(contentType string, body []byte) string {
return helps.SummarizeErrorBody(contentType, body)
}
func apiKeyFromContext(ctx context.Context) string {
return helps.APIKeyFromContext(ctx)
}
func tokenizerForModel(model string) (tokenizer.Codec, error) {
return helps.TokenizerForModel(model)
}
func collectOpenAIContent(content gjson.Result, segments *[]string) {
helps.CollectOpenAIContent(content, segments)
}
type usageReporter struct {
reporter *helps.UsageReporter
}
func newUsageReporter(ctx context.Context, provider, model string, auth *cliproxyauth.Auth) *usageReporter {
return &usageReporter{reporter: helps.NewUsageReporter(ctx, provider, model, auth)}
}
func (r *usageReporter) publish(ctx context.Context, detail usage.Detail) {
if r == nil || r.reporter == nil {
return
}
r.reporter.Publish(ctx, detail)
}
func (r *usageReporter) publishFailure(ctx context.Context) {
if r == nil || r.reporter == nil {
return
}
r.reporter.PublishFailure(ctx)
}
func (r *usageReporter) trackFailure(ctx context.Context, errPtr *error) {
if r == nil || r.reporter == nil {
return
}
r.reporter.TrackFailure(ctx, errPtr)
}
func (r *usageReporter) ensurePublished(ctx context.Context) {
if r == nil || r.reporter == nil {
return
}
r.reporter.EnsurePublished(ctx)
}

File diff suppressed because it is too large Load Diff

View File

@@ -18,6 +18,7 @@ import (
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/misc"
"github.com/router-for-me/CLIProxyAPI/v6/internal/runtime/executor/helps"
"github.com/router-for-me/CLIProxyAPI/v6/internal/runtime/geminicli"
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
@@ -81,6 +82,11 @@ func (e *GeminiCLIExecutor) PrepareRequest(req *http.Request, auth *cliproxyauth
}
req.Header.Set("Authorization", "Bearer "+tok.AccessToken)
applyGeminiCLIHeaders(req, "unknown")
var attrs map[string]string
if auth != nil {
attrs = auth.Attributes
}
util.ApplyCustomHeadersFromAttrs(req, attrs)
return nil
}
@@ -112,8 +118,8 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
return resp, err
}
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
reporter := helps.NewUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.TrackFailure(ctx, &err)
from := opts.SourceFormat
to := sdktranslator.FromString("gemini-cli")
@@ -132,8 +138,8 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
}
basePayload = fixGeminiCLIImageAspectRatio(baseModel, basePayload)
requestedModel := payloadRequestedModel(opts, req.Model)
basePayload = applyPayloadConfigWithRoot(e.cfg, baseModel, "gemini", "request", basePayload, originalTranslated, requestedModel)
requestedModel := helps.PayloadRequestedModel(opts, req.Model)
basePayload = helps.ApplyPayloadConfigWithRoot(e.cfg, baseModel, "gemini", "request", basePayload, originalTranslated, requestedModel)
action := "generateContent"
if req.Metadata != nil {
@@ -190,7 +196,8 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
reqHTTP.Header.Set("Authorization", "Bearer "+tok.AccessToken)
applyGeminiCLIHeaders(reqHTTP, attemptModel)
reqHTTP.Header.Set("Accept", "application/json")
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
util.ApplyCustomHeadersFromAttrs(reqHTTP, auth.Attributes)
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: url,
Method: http.MethodPost,
Headers: reqHTTP.Header.Clone(),
@@ -204,7 +211,7 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
httpResp, errDo := httpClient.Do(reqHTTP)
if errDo != nil {
recordAPIResponseError(ctx, e.cfg, errDo)
helps.RecordAPIResponseError(ctx, e.cfg, errDo)
err = errDo
return resp, err
}
@@ -213,15 +220,15 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
if errClose := httpResp.Body.Close(); errClose != nil {
log.Errorf("gemini cli executor: close response body error: %v", errClose)
}
recordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
if errRead != nil {
recordAPIResponseError(ctx, e.cfg, errRead)
helps.RecordAPIResponseError(ctx, e.cfg, errRead)
err = errRead
return resp, err
}
appendAPIResponseChunk(ctx, e.cfg, data)
helps.AppendAPIResponseChunk(ctx, e.cfg, data)
if httpResp.StatusCode >= 200 && httpResp.StatusCode < 300 {
reporter.publish(ctx, parseGeminiCLIUsage(data))
reporter.Publish(ctx, helps.ParseGeminiCLIUsage(data))
var param any
out := sdktranslator.TranslateNonStream(respCtx, to, from, attemptModel, opts.OriginalRequest, payload, data, &param)
resp = cliproxyexecutor.Response{Payload: out, Headers: httpResp.Header.Clone()}
@@ -230,7 +237,7 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
lastStatus = httpResp.StatusCode
lastBody = append([]byte(nil), data...)
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), data))
helps.LogWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, helps.SummarizeErrorBody(httpResp.Header.Get("Content-Type"), data))
if httpResp.StatusCode == 429 {
if idx+1 < len(models) {
log.Debugf("gemini cli executor: rate limited, retrying with next model: %s", models[idx+1])
@@ -245,7 +252,7 @@ func (e *GeminiCLIExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth
}
if len(lastBody) > 0 {
appendAPIResponseChunk(ctx, e.cfg, lastBody)
helps.AppendAPIResponseChunk(ctx, e.cfg, lastBody)
}
if lastStatus == 0 {
lastStatus = 429
@@ -266,8 +273,8 @@ func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyaut
return nil, err
}
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
reporter := helps.NewUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.TrackFailure(ctx, &err)
from := opts.SourceFormat
to := sdktranslator.FromString("gemini-cli")
@@ -286,8 +293,8 @@ func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyaut
}
basePayload = fixGeminiCLIImageAspectRatio(baseModel, basePayload)
requestedModel := payloadRequestedModel(opts, req.Model)
basePayload = applyPayloadConfigWithRoot(e.cfg, baseModel, "gemini", "request", basePayload, originalTranslated, requestedModel)
requestedModel := helps.PayloadRequestedModel(opts, req.Model)
basePayload = helps.ApplyPayloadConfigWithRoot(e.cfg, baseModel, "gemini", "request", basePayload, originalTranslated, requestedModel)
projectID := resolveGeminiProjectID(auth)
@@ -335,7 +342,8 @@ func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyaut
reqHTTP.Header.Set("Authorization", "Bearer "+tok.AccessToken)
applyGeminiCLIHeaders(reqHTTP, attemptModel)
reqHTTP.Header.Set("Accept", "text/event-stream")
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
util.ApplyCustomHeadersFromAttrs(reqHTTP, auth.Attributes)
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: url,
Method: http.MethodPost,
Headers: reqHTTP.Header.Clone(),
@@ -349,25 +357,25 @@ func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyaut
httpResp, errDo := httpClient.Do(reqHTTP)
if errDo != nil {
recordAPIResponseError(ctx, e.cfg, errDo)
helps.RecordAPIResponseError(ctx, e.cfg, errDo)
err = errDo
return nil, err
}
recordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
data, errRead := io.ReadAll(httpResp.Body)
if errClose := httpResp.Body.Close(); errClose != nil {
log.Errorf("gemini cli executor: close response body error: %v", errClose)
}
if errRead != nil {
recordAPIResponseError(ctx, e.cfg, errRead)
helps.RecordAPIResponseError(ctx, e.cfg, errRead)
err = errRead
return nil, err
}
appendAPIResponseChunk(ctx, e.cfg, data)
helps.AppendAPIResponseChunk(ctx, e.cfg, data)
lastStatus = httpResp.StatusCode
lastBody = append([]byte(nil), data...)
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), data))
helps.LogWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, helps.SummarizeErrorBody(httpResp.Header.Get("Content-Type"), data))
if httpResp.StatusCode == 429 {
if idx+1 < len(models) {
log.Debugf("gemini cli executor: rate limited, retrying with next model: %s", models[idx+1])
@@ -394,9 +402,9 @@ func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyaut
var param any
for scanner.Scan() {
line := scanner.Bytes()
appendAPIResponseChunk(ctx, e.cfg, line)
if detail, ok := parseGeminiCLIStreamUsage(line); ok {
reporter.publish(ctx, detail)
helps.AppendAPIResponseChunk(ctx, e.cfg, line)
if detail, ok := helps.ParseGeminiCLIStreamUsage(line); ok {
reporter.Publish(ctx, detail)
}
if bytes.HasPrefix(line, dataTag) {
segments := sdktranslator.TranslateStream(respCtx, to, from, attemptModel, opts.OriginalRequest, reqBody, bytes.Clone(line), &param)
@@ -411,8 +419,8 @@ func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyaut
out <- cliproxyexecutor.StreamChunk{Payload: segments[i]}
}
if errScan := scanner.Err(); errScan != nil {
recordAPIResponseError(ctx, e.cfg, errScan)
reporter.publishFailure(ctx)
helps.RecordAPIResponseError(ctx, e.cfg, errScan)
reporter.PublishFailure(ctx)
out <- cliproxyexecutor.StreamChunk{Err: errScan}
}
return
@@ -420,13 +428,13 @@ func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyaut
data, errRead := io.ReadAll(resp.Body)
if errRead != nil {
recordAPIResponseError(ctx, e.cfg, errRead)
reporter.publishFailure(ctx)
helps.RecordAPIResponseError(ctx, e.cfg, errRead)
reporter.PublishFailure(ctx)
out <- cliproxyexecutor.StreamChunk{Err: errRead}
return
}
appendAPIResponseChunk(ctx, e.cfg, data)
reporter.publish(ctx, parseGeminiCLIUsage(data))
helps.AppendAPIResponseChunk(ctx, e.cfg, data)
reporter.Publish(ctx, helps.ParseGeminiCLIUsage(data))
var param any
segments := sdktranslator.TranslateStream(respCtx, to, from, attemptModel, opts.OriginalRequest, reqBody, data, &param)
for i := range segments {
@@ -443,7 +451,7 @@ func (e *GeminiCLIExecutor) ExecuteStream(ctx context.Context, auth *cliproxyaut
}
if len(lastBody) > 0 {
appendAPIResponseChunk(ctx, e.cfg, lastBody)
helps.AppendAPIResponseChunk(ctx, e.cfg, lastBody)
}
if lastStatus == 0 {
lastStatus = 429
@@ -516,7 +524,8 @@ func (e *GeminiCLIExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.
reqHTTP.Header.Set("Authorization", "Bearer "+tok.AccessToken)
applyGeminiCLIHeaders(reqHTTP, baseModel)
reqHTTP.Header.Set("Accept", "application/json")
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
util.ApplyCustomHeadersFromAttrs(reqHTTP, auth.Attributes)
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: url,
Method: http.MethodPost,
Headers: reqHTTP.Header.Clone(),
@@ -530,17 +539,19 @@ func (e *GeminiCLIExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.
resp, errDo := httpClient.Do(reqHTTP)
if errDo != nil {
recordAPIResponseError(ctx, e.cfg, errDo)
helps.RecordAPIResponseError(ctx, e.cfg, errDo)
return cliproxyexecutor.Response{}, errDo
}
data, errRead := io.ReadAll(resp.Body)
_ = resp.Body.Close()
recordAPIResponseMetadata(ctx, e.cfg, resp.StatusCode, resp.Header.Clone())
if errClose := resp.Body.Close(); errClose != nil {
helps.LogWithRequestID(ctx).Errorf("response body close error: %v", errClose)
}
helps.RecordAPIResponseMetadata(ctx, e.cfg, resp.StatusCode, resp.Header.Clone())
if errRead != nil {
recordAPIResponseError(ctx, e.cfg, errRead)
helps.RecordAPIResponseError(ctx, e.cfg, errRead)
return cliproxyexecutor.Response{}, errRead
}
appendAPIResponseChunk(ctx, e.cfg, data)
helps.AppendAPIResponseChunk(ctx, e.cfg, data)
if resp.StatusCode >= 200 && resp.StatusCode < 300 {
count := gjson.GetBytes(data, "totalTokens").Int()
translated := sdktranslator.TranslateTokenCount(respCtx, to, from, count, data)
@@ -611,7 +622,7 @@ func prepareGeminiCLITokenSource(ctx context.Context, cfg *config.Config, auth *
}
ctxToken := ctx
if httpClient := newProxyAwareHTTPClient(ctx, cfg, auth, 0); httpClient != nil {
if httpClient := helps.NewProxyAwareHTTPClient(ctx, cfg, auth, 0); httpClient != nil {
ctxToken = context.WithValue(ctxToken, oauth2.HTTPClient, httpClient)
}
@@ -707,7 +718,7 @@ func geminiOAuthMetadata(auth *cliproxyauth.Auth) map[string]any {
}
func newHTTPClient(ctx context.Context, cfg *config.Config, auth *cliproxyauth.Auth, timeout time.Duration) *http.Client {
return newProxyAwareHTTPClient(ctx, cfg, auth, timeout)
return helps.NewProxyAwareHTTPClient(ctx, cfg, auth, timeout)
}
func cloneMap(in map[string]any) map[string]any {

View File

@@ -13,6 +13,7 @@ import (
"strings"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/runtime/executor/helps"
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
@@ -85,7 +86,7 @@ func (e *GeminiExecutor) HttpRequest(ctx context.Context, auth *cliproxyauth.Aut
if err := e.PrepareRequest(httpReq, auth); err != nil {
return nil, err
}
httpClient := newProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpClient := helps.NewProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
return httpClient.Do(httpReq)
}
@@ -110,8 +111,8 @@ func (e *GeminiExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, r
apiKey, bearer := geminiCreds(auth)
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
reporter := helps.NewUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.TrackFailure(ctx, &err)
// Official Gemini API via API key or OAuth bearer
from := opts.SourceFormat
@@ -130,8 +131,8 @@ func (e *GeminiExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, r
}
body = fixGeminiImageAspectRatio(baseModel, body)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
requestedModel := helps.PayloadRequestedModel(opts, req.Model)
body = helps.ApplyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
action := "generateContent"
@@ -165,7 +166,7 @@ func (e *GeminiExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, r
authLabel = auth.Label
authType, authValue = auth.AccountInfo()
}
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: url,
Method: http.MethodPost,
Headers: httpReq.Header.Clone(),
@@ -177,10 +178,10 @@ func (e *GeminiExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, r
AuthValue: authValue,
})
httpClient := newProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpClient := helps.NewProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpResp, err := httpClient.Do(httpReq)
if err != nil {
recordAPIResponseError(ctx, e.cfg, err)
helps.RecordAPIResponseError(ctx, e.cfg, err)
return resp, err
}
defer func() {
@@ -188,21 +189,21 @@ func (e *GeminiExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, r
log.Errorf("gemini executor: close response body error: %v", errClose)
}
}()
recordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
helps.AppendAPIResponseChunk(ctx, e.cfg, b)
helps.LogWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, helps.SummarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
err = statusErr{code: httpResp.StatusCode, msg: string(b)}
return resp, err
}
data, err := io.ReadAll(httpResp.Body)
if err != nil {
recordAPIResponseError(ctx, e.cfg, err)
helps.RecordAPIResponseError(ctx, e.cfg, err)
return resp, err
}
appendAPIResponseChunk(ctx, e.cfg, data)
reporter.publish(ctx, parseGeminiUsage(data))
helps.AppendAPIResponseChunk(ctx, e.cfg, data)
reporter.Publish(ctx, helps.ParseGeminiUsage(data))
var param any
out := sdktranslator.TranslateNonStream(ctx, to, from, req.Model, opts.OriginalRequest, body, data, &param)
resp = cliproxyexecutor.Response{Payload: out, Headers: httpResp.Header.Clone()}
@@ -218,8 +219,8 @@ func (e *GeminiExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A
apiKey, bearer := geminiCreds(auth)
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
reporter := helps.NewUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.TrackFailure(ctx, &err)
from := opts.SourceFormat
to := sdktranslator.FromString("gemini")
@@ -237,8 +238,8 @@ func (e *GeminiExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A
}
body = fixGeminiImageAspectRatio(baseModel, body)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
requestedModel := helps.PayloadRequestedModel(opts, req.Model)
body = helps.ApplyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
baseURL := resolveGeminiBaseURL(auth)
@@ -268,7 +269,7 @@ func (e *GeminiExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A
authLabel = auth.Label
authType, authValue = auth.AccountInfo()
}
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: url,
Method: http.MethodPost,
Headers: httpReq.Header.Clone(),
@@ -280,17 +281,17 @@ func (e *GeminiExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A
AuthValue: authValue,
})
httpClient := newProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpClient := helps.NewProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpResp, err := httpClient.Do(httpReq)
if err != nil {
recordAPIResponseError(ctx, e.cfg, err)
helps.RecordAPIResponseError(ctx, e.cfg, err)
return nil, err
}
recordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
helps.AppendAPIResponseChunk(ctx, e.cfg, b)
helps.LogWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, helps.SummarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
if errClose := httpResp.Body.Close(); errClose != nil {
log.Errorf("gemini executor: close response body error: %v", errClose)
}
@@ -310,14 +311,14 @@ func (e *GeminiExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A
var param any
for scanner.Scan() {
line := scanner.Bytes()
appendAPIResponseChunk(ctx, e.cfg, line)
filtered := FilterSSEUsageMetadata(line)
payload := jsonPayload(filtered)
helps.AppendAPIResponseChunk(ctx, e.cfg, line)
filtered := helps.FilterSSEUsageMetadata(line)
payload := helps.JSONPayload(filtered)
if len(payload) == 0 {
continue
}
if detail, ok := parseGeminiStreamUsage(payload); ok {
reporter.publish(ctx, detail)
if detail, ok := helps.ParseGeminiStreamUsage(payload); ok {
reporter.Publish(ctx, detail)
}
lines := sdktranslator.TranslateStream(ctx, to, from, req.Model, opts.OriginalRequest, body, bytes.Clone(payload), &param)
for i := range lines {
@@ -329,8 +330,8 @@ func (e *GeminiExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A
out <- cliproxyexecutor.StreamChunk{Payload: lines[i]}
}
if errScan := scanner.Err(); errScan != nil {
recordAPIResponseError(ctx, e.cfg, errScan)
reporter.publishFailure(ctx)
helps.RecordAPIResponseError(ctx, e.cfg, errScan)
reporter.PublishFailure(ctx)
out <- cliproxyexecutor.StreamChunk{Err: errScan}
}
}()
@@ -381,7 +382,7 @@ func (e *GeminiExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.Aut
authLabel = auth.Label
authType, authValue = auth.AccountInfo()
}
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: url,
Method: http.MethodPost,
Headers: httpReq.Header.Clone(),
@@ -393,23 +394,27 @@ func (e *GeminiExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.Aut
AuthValue: authValue,
})
httpClient := newProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpClient := helps.NewProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
resp, err := httpClient.Do(httpReq)
if err != nil {
recordAPIResponseError(ctx, e.cfg, err)
helps.RecordAPIResponseError(ctx, e.cfg, err)
return cliproxyexecutor.Response{}, err
}
defer func() { _ = resp.Body.Close() }()
recordAPIResponseMetadata(ctx, e.cfg, resp.StatusCode, resp.Header.Clone())
defer func() {
if errClose := resp.Body.Close(); errClose != nil {
helps.LogWithRequestID(ctx).Errorf("response body close error: %v", errClose)
}
}()
helps.RecordAPIResponseMetadata(ctx, e.cfg, resp.StatusCode, resp.Header.Clone())
data, err := io.ReadAll(resp.Body)
if err != nil {
recordAPIResponseError(ctx, e.cfg, err)
helps.RecordAPIResponseError(ctx, e.cfg, err)
return cliproxyexecutor.Response{}, err
}
appendAPIResponseChunk(ctx, e.cfg, data)
helps.AppendAPIResponseChunk(ctx, e.cfg, data)
if resp.StatusCode < 200 || resp.StatusCode >= 300 {
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", resp.StatusCode, summarizeErrorBody(resp.Header.Get("Content-Type"), data))
helps.LogWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", resp.StatusCode, helps.SummarizeErrorBody(resp.Header.Get("Content-Type"), data))
return cliproxyexecutor.Response{}, statusErr{code: resp.StatusCode, msg: string(data)}
}

View File

@@ -16,7 +16,9 @@ import (
vertexauth "github.com/router-for-me/CLIProxyAPI/v6/internal/auth/vertex"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/runtime/executor/helps"
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
"github.com/router-for-me/CLIProxyAPI/v6/internal/util"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
cliproxyexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
sdktranslator "github.com/router-for-me/CLIProxyAPI/v6/sdk/translator"
@@ -227,7 +229,7 @@ func (e *GeminiVertexExecutor) HttpRequest(ctx context.Context, auth *cliproxyau
if err := e.PrepareRequest(httpReq, auth); err != nil {
return nil, err
}
httpClient := newProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpClient := helps.NewProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
return httpClient.Do(httpReq)
}
@@ -301,8 +303,8 @@ func (e *GeminiVertexExecutor) Refresh(_ context.Context, auth *cliproxyauth.Aut
func (e *GeminiVertexExecutor) executeWithServiceAccount(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options, projectID, location string, saJSON []byte) (resp cliproxyexecutor.Response, err error) {
baseModel := thinking.ParseSuffix(req.Model).ModelName
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
reporter := helps.NewUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.TrackFailure(ctx, &err)
var body []byte
@@ -332,8 +334,8 @@ func (e *GeminiVertexExecutor) executeWithServiceAccount(ctx context.Context, au
}
body = fixGeminiImageAspectRatio(baseModel, body)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
requestedModel := helps.PayloadRequestedModel(opts, req.Model)
body = helps.ApplyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
}
@@ -362,6 +364,11 @@ func (e *GeminiVertexExecutor) executeWithServiceAccount(ctx context.Context, au
return resp, statusErr{code: 500, msg: "internal server error"}
}
applyGeminiHeaders(httpReq, auth)
var attrs map[string]string
if auth != nil {
attrs = auth.Attributes
}
util.ApplyCustomHeadersFromAttrs(httpReq, attrs)
var authID, authLabel, authType, authValue string
if auth != nil {
@@ -369,7 +376,7 @@ func (e *GeminiVertexExecutor) executeWithServiceAccount(ctx context.Context, au
authLabel = auth.Label
authType, authValue = auth.AccountInfo()
}
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: url,
Method: http.MethodPost,
Headers: httpReq.Header.Clone(),
@@ -381,10 +388,10 @@ func (e *GeminiVertexExecutor) executeWithServiceAccount(ctx context.Context, au
AuthValue: authValue,
})
httpClient := newProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpClient := helps.NewProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpResp, errDo := httpClient.Do(httpReq)
if errDo != nil {
recordAPIResponseError(ctx, e.cfg, errDo)
helps.RecordAPIResponseError(ctx, e.cfg, errDo)
return resp, errDo
}
defer func() {
@@ -392,21 +399,21 @@ func (e *GeminiVertexExecutor) executeWithServiceAccount(ctx context.Context, au
log.Errorf("vertex executor: close response body error: %v", errClose)
}
}()
recordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
helps.AppendAPIResponseChunk(ctx, e.cfg, b)
helps.LogWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, helps.SummarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
err = statusErr{code: httpResp.StatusCode, msg: string(b)}
return resp, err
}
data, errRead := io.ReadAll(httpResp.Body)
if errRead != nil {
recordAPIResponseError(ctx, e.cfg, errRead)
helps.RecordAPIResponseError(ctx, e.cfg, errRead)
return resp, errRead
}
appendAPIResponseChunk(ctx, e.cfg, data)
reporter.publish(ctx, parseGeminiUsage(data))
helps.AppendAPIResponseChunk(ctx, e.cfg, data)
reporter.Publish(ctx, helps.ParseGeminiUsage(data))
// For Imagen models, convert response to Gemini format before translation
// This ensures Imagen responses use the same format as gemini-3-pro-image-preview
@@ -427,8 +434,8 @@ func (e *GeminiVertexExecutor) executeWithServiceAccount(ctx context.Context, au
func (e *GeminiVertexExecutor) executeWithAPIKey(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options, apiKey, baseURL string) (resp cliproxyexecutor.Response, err error) {
baseModel := thinking.ParseSuffix(req.Model).ModelName
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
reporter := helps.NewUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.TrackFailure(ctx, &err)
from := opts.SourceFormat
to := sdktranslator.FromString("gemini")
@@ -447,8 +454,8 @@ func (e *GeminiVertexExecutor) executeWithAPIKey(ctx context.Context, auth *clip
}
body = fixGeminiImageAspectRatio(baseModel, body)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
requestedModel := helps.PayloadRequestedModel(opts, req.Model)
body = helps.ApplyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
action := getVertexAction(baseModel, false)
@@ -477,6 +484,11 @@ func (e *GeminiVertexExecutor) executeWithAPIKey(ctx context.Context, auth *clip
httpReq.Header.Set("x-goog-api-key", apiKey)
}
applyGeminiHeaders(httpReq, auth)
var attrs map[string]string
if auth != nil {
attrs = auth.Attributes
}
util.ApplyCustomHeadersFromAttrs(httpReq, attrs)
var authID, authLabel, authType, authValue string
if auth != nil {
@@ -484,7 +496,7 @@ func (e *GeminiVertexExecutor) executeWithAPIKey(ctx context.Context, auth *clip
authLabel = auth.Label
authType, authValue = auth.AccountInfo()
}
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: url,
Method: http.MethodPost,
Headers: httpReq.Header.Clone(),
@@ -496,10 +508,10 @@ func (e *GeminiVertexExecutor) executeWithAPIKey(ctx context.Context, auth *clip
AuthValue: authValue,
})
httpClient := newProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpClient := helps.NewProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpResp, errDo := httpClient.Do(httpReq)
if errDo != nil {
recordAPIResponseError(ctx, e.cfg, errDo)
helps.RecordAPIResponseError(ctx, e.cfg, errDo)
return resp, errDo
}
defer func() {
@@ -507,21 +519,21 @@ func (e *GeminiVertexExecutor) executeWithAPIKey(ctx context.Context, auth *clip
log.Errorf("vertex executor: close response body error: %v", errClose)
}
}()
recordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
helps.AppendAPIResponseChunk(ctx, e.cfg, b)
helps.LogWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, helps.SummarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
err = statusErr{code: httpResp.StatusCode, msg: string(b)}
return resp, err
}
data, errRead := io.ReadAll(httpResp.Body)
if errRead != nil {
recordAPIResponseError(ctx, e.cfg, errRead)
helps.RecordAPIResponseError(ctx, e.cfg, errRead)
return resp, errRead
}
appendAPIResponseChunk(ctx, e.cfg, data)
reporter.publish(ctx, parseGeminiUsage(data))
helps.AppendAPIResponseChunk(ctx, e.cfg, data)
reporter.Publish(ctx, helps.ParseGeminiUsage(data))
var param any
out := sdktranslator.TranslateNonStream(ctx, to, from, req.Model, opts.OriginalRequest, body, data, &param)
resp = cliproxyexecutor.Response{Payload: out, Headers: httpResp.Header.Clone()}
@@ -532,8 +544,8 @@ func (e *GeminiVertexExecutor) executeWithAPIKey(ctx context.Context, auth *clip
func (e *GeminiVertexExecutor) executeStreamWithServiceAccount(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options, projectID, location string, saJSON []byte) (_ *cliproxyexecutor.StreamResult, err error) {
baseModel := thinking.ParseSuffix(req.Model).ModelName
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
reporter := helps.NewUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.TrackFailure(ctx, &err)
from := opts.SourceFormat
to := sdktranslator.FromString("gemini")
@@ -552,8 +564,8 @@ func (e *GeminiVertexExecutor) executeStreamWithServiceAccount(ctx context.Conte
}
body = fixGeminiImageAspectRatio(baseModel, body)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
requestedModel := helps.PayloadRequestedModel(opts, req.Model)
body = helps.ApplyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
action := getVertexAction(baseModel, true)
@@ -581,6 +593,11 @@ func (e *GeminiVertexExecutor) executeStreamWithServiceAccount(ctx context.Conte
return nil, statusErr{code: 500, msg: "internal server error"}
}
applyGeminiHeaders(httpReq, auth)
var attrs map[string]string
if auth != nil {
attrs = auth.Attributes
}
util.ApplyCustomHeadersFromAttrs(httpReq, attrs)
var authID, authLabel, authType, authValue string
if auth != nil {
@@ -588,7 +605,7 @@ func (e *GeminiVertexExecutor) executeStreamWithServiceAccount(ctx context.Conte
authLabel = auth.Label
authType, authValue = auth.AccountInfo()
}
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: url,
Method: http.MethodPost,
Headers: httpReq.Header.Clone(),
@@ -600,17 +617,17 @@ func (e *GeminiVertexExecutor) executeStreamWithServiceAccount(ctx context.Conte
AuthValue: authValue,
})
httpClient := newProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpClient := helps.NewProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpResp, errDo := httpClient.Do(httpReq)
if errDo != nil {
recordAPIResponseError(ctx, e.cfg, errDo)
helps.RecordAPIResponseError(ctx, e.cfg, errDo)
return nil, errDo
}
recordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
helps.AppendAPIResponseChunk(ctx, e.cfg, b)
helps.LogWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, helps.SummarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
if errClose := httpResp.Body.Close(); errClose != nil {
log.Errorf("vertex executor: close response body error: %v", errClose)
}
@@ -630,9 +647,9 @@ func (e *GeminiVertexExecutor) executeStreamWithServiceAccount(ctx context.Conte
var param any
for scanner.Scan() {
line := scanner.Bytes()
appendAPIResponseChunk(ctx, e.cfg, line)
if detail, ok := parseGeminiStreamUsage(line); ok {
reporter.publish(ctx, detail)
helps.AppendAPIResponseChunk(ctx, e.cfg, line)
if detail, ok := helps.ParseGeminiStreamUsage(line); ok {
reporter.Publish(ctx, detail)
}
lines := sdktranslator.TranslateStream(ctx, to, from, req.Model, opts.OriginalRequest, body, bytes.Clone(line), &param)
for i := range lines {
@@ -644,8 +661,8 @@ func (e *GeminiVertexExecutor) executeStreamWithServiceAccount(ctx context.Conte
out <- cliproxyexecutor.StreamChunk{Payload: lines[i]}
}
if errScan := scanner.Err(); errScan != nil {
recordAPIResponseError(ctx, e.cfg, errScan)
reporter.publishFailure(ctx)
helps.RecordAPIResponseError(ctx, e.cfg, errScan)
reporter.PublishFailure(ctx)
out <- cliproxyexecutor.StreamChunk{Err: errScan}
}
}()
@@ -656,8 +673,8 @@ func (e *GeminiVertexExecutor) executeStreamWithServiceAccount(ctx context.Conte
func (e *GeminiVertexExecutor) executeStreamWithAPIKey(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options, apiKey, baseURL string) (_ *cliproxyexecutor.StreamResult, err error) {
baseModel := thinking.ParseSuffix(req.Model).ModelName
reporter := newUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.trackFailure(ctx, &err)
reporter := helps.NewUsageReporter(ctx, e.Identifier(), baseModel, auth)
defer reporter.TrackFailure(ctx, &err)
from := opts.SourceFormat
to := sdktranslator.FromString("gemini")
@@ -676,8 +693,8 @@ func (e *GeminiVertexExecutor) executeStreamWithAPIKey(ctx context.Context, auth
}
body = fixGeminiImageAspectRatio(baseModel, body)
requestedModel := payloadRequestedModel(opts, req.Model)
body = applyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
requestedModel := helps.PayloadRequestedModel(opts, req.Model)
body = helps.ApplyPayloadConfigWithRoot(e.cfg, baseModel, to.String(), "", body, originalTranslated, requestedModel)
body, _ = sjson.SetBytes(body, "model", baseModel)
action := getVertexAction(baseModel, true)
@@ -705,6 +722,11 @@ func (e *GeminiVertexExecutor) executeStreamWithAPIKey(ctx context.Context, auth
httpReq.Header.Set("x-goog-api-key", apiKey)
}
applyGeminiHeaders(httpReq, auth)
var attrs map[string]string
if auth != nil {
attrs = auth.Attributes
}
util.ApplyCustomHeadersFromAttrs(httpReq, attrs)
var authID, authLabel, authType, authValue string
if auth != nil {
@@ -712,7 +734,7 @@ func (e *GeminiVertexExecutor) executeStreamWithAPIKey(ctx context.Context, auth
authLabel = auth.Label
authType, authValue = auth.AccountInfo()
}
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: url,
Method: http.MethodPost,
Headers: httpReq.Header.Clone(),
@@ -724,17 +746,17 @@ func (e *GeminiVertexExecutor) executeStreamWithAPIKey(ctx context.Context, auth
AuthValue: authValue,
})
httpClient := newProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpClient := helps.NewProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpResp, errDo := httpClient.Do(httpReq)
if errDo != nil {
recordAPIResponseError(ctx, e.cfg, errDo)
helps.RecordAPIResponseError(ctx, e.cfg, errDo)
return nil, errDo
}
recordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
helps.AppendAPIResponseChunk(ctx, e.cfg, b)
helps.LogWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, helps.SummarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
if errClose := httpResp.Body.Close(); errClose != nil {
log.Errorf("vertex executor: close response body error: %v", errClose)
}
@@ -754,9 +776,9 @@ func (e *GeminiVertexExecutor) executeStreamWithAPIKey(ctx context.Context, auth
var param any
for scanner.Scan() {
line := scanner.Bytes()
appendAPIResponseChunk(ctx, e.cfg, line)
if detail, ok := parseGeminiStreamUsage(line); ok {
reporter.publish(ctx, detail)
helps.AppendAPIResponseChunk(ctx, e.cfg, line)
if detail, ok := helps.ParseGeminiStreamUsage(line); ok {
reporter.Publish(ctx, detail)
}
lines := sdktranslator.TranslateStream(ctx, to, from, req.Model, opts.OriginalRequest, body, bytes.Clone(line), &param)
for i := range lines {
@@ -768,8 +790,8 @@ func (e *GeminiVertexExecutor) executeStreamWithAPIKey(ctx context.Context, auth
out <- cliproxyexecutor.StreamChunk{Payload: lines[i]}
}
if errScan := scanner.Err(); errScan != nil {
recordAPIResponseError(ctx, e.cfg, errScan)
reporter.publishFailure(ctx)
helps.RecordAPIResponseError(ctx, e.cfg, errScan)
reporter.PublishFailure(ctx)
out <- cliproxyexecutor.StreamChunk{Err: errScan}
}
}()
@@ -812,6 +834,11 @@ func (e *GeminiVertexExecutor) countTokensWithServiceAccount(ctx context.Context
return cliproxyexecutor.Response{}, statusErr{code: 500, msg: "internal server error"}
}
applyGeminiHeaders(httpReq, auth)
var attrs map[string]string
if auth != nil {
attrs = auth.Attributes
}
util.ApplyCustomHeadersFromAttrs(httpReq, attrs)
var authID, authLabel, authType, authValue string
if auth != nil {
@@ -819,7 +846,7 @@ func (e *GeminiVertexExecutor) countTokensWithServiceAccount(ctx context.Context
authLabel = auth.Label
authType, authValue = auth.AccountInfo()
}
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: url,
Method: http.MethodPost,
Headers: httpReq.Header.Clone(),
@@ -831,10 +858,10 @@ func (e *GeminiVertexExecutor) countTokensWithServiceAccount(ctx context.Context
AuthValue: authValue,
})
httpClient := newProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpClient := helps.NewProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpResp, errDo := httpClient.Do(httpReq)
if errDo != nil {
recordAPIResponseError(ctx, e.cfg, errDo)
helps.RecordAPIResponseError(ctx, e.cfg, errDo)
return cliproxyexecutor.Response{}, errDo
}
defer func() {
@@ -842,19 +869,19 @@ func (e *GeminiVertexExecutor) countTokensWithServiceAccount(ctx context.Context
log.Errorf("vertex executor: close response body error: %v", errClose)
}
}()
recordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
helps.AppendAPIResponseChunk(ctx, e.cfg, b)
helps.LogWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, helps.SummarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
return cliproxyexecutor.Response{}, statusErr{code: httpResp.StatusCode, msg: string(b)}
}
data, errRead := io.ReadAll(httpResp.Body)
if errRead != nil {
recordAPIResponseError(ctx, e.cfg, errRead)
helps.RecordAPIResponseError(ctx, e.cfg, errRead)
return cliproxyexecutor.Response{}, errRead
}
appendAPIResponseChunk(ctx, e.cfg, data)
helps.AppendAPIResponseChunk(ctx, e.cfg, data)
count := gjson.GetBytes(data, "totalTokens").Int()
out := sdktranslator.TranslateTokenCount(ctx, to, from, count, data)
return cliproxyexecutor.Response{Payload: out, Headers: httpResp.Header.Clone()}, nil
@@ -896,6 +923,11 @@ func (e *GeminiVertexExecutor) countTokensWithAPIKey(ctx context.Context, auth *
httpReq.Header.Set("x-goog-api-key", apiKey)
}
applyGeminiHeaders(httpReq, auth)
var attrs map[string]string
if auth != nil {
attrs = auth.Attributes
}
util.ApplyCustomHeadersFromAttrs(httpReq, attrs)
var authID, authLabel, authType, authValue string
if auth != nil {
@@ -903,7 +935,7 @@ func (e *GeminiVertexExecutor) countTokensWithAPIKey(ctx context.Context, auth *
authLabel = auth.Label
authType, authValue = auth.AccountInfo()
}
recordAPIRequest(ctx, e.cfg, upstreamRequestLog{
helps.RecordAPIRequest(ctx, e.cfg, helps.UpstreamRequestLog{
URL: url,
Method: http.MethodPost,
Headers: httpReq.Header.Clone(),
@@ -915,10 +947,10 @@ func (e *GeminiVertexExecutor) countTokensWithAPIKey(ctx context.Context, auth *
AuthValue: authValue,
})
httpClient := newProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpClient := helps.NewProxyAwareHTTPClient(ctx, e.cfg, auth, 0)
httpResp, errDo := httpClient.Do(httpReq)
if errDo != nil {
recordAPIResponseError(ctx, e.cfg, errDo)
helps.RecordAPIResponseError(ctx, e.cfg, errDo)
return cliproxyexecutor.Response{}, errDo
}
defer func() {
@@ -926,19 +958,19 @@ func (e *GeminiVertexExecutor) countTokensWithAPIKey(ctx context.Context, auth *
log.Errorf("vertex executor: close response body error: %v", errClose)
}
}()
recordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
helps.RecordAPIResponseMetadata(ctx, e.cfg, httpResp.StatusCode, httpResp.Header.Clone())
if httpResp.StatusCode < 200 || httpResp.StatusCode >= 300 {
b, _ := io.ReadAll(httpResp.Body)
appendAPIResponseChunk(ctx, e.cfg, b)
logWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, summarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
helps.AppendAPIResponseChunk(ctx, e.cfg, b)
helps.LogWithRequestID(ctx).Debugf("request error, error status: %d, error message: %s", httpResp.StatusCode, helps.SummarizeErrorBody(httpResp.Header.Get("Content-Type"), b))
return cliproxyexecutor.Response{}, statusErr{code: httpResp.StatusCode, msg: string(b)}
}
data, errRead := io.ReadAll(httpResp.Body)
if errRead != nil {
recordAPIResponseError(ctx, e.cfg, errRead)
helps.RecordAPIResponseError(ctx, e.cfg, errRead)
return cliproxyexecutor.Response{}, errRead
}
appendAPIResponseChunk(ctx, e.cfg, data)
helps.AppendAPIResponseChunk(ctx, e.cfg, data)
count := gjson.GetBytes(data, "totalTokens").Int()
out := sdktranslator.TranslateTokenCount(ctx, to, from, count, data)
return cliproxyexecutor.Response{Payload: out, Headers: httpResp.Header.Clone()}, nil
@@ -1012,7 +1044,7 @@ func vertexBaseURL(location string) string {
}
func vertexAccessToken(ctx context.Context, cfg *config.Config, auth *cliproxyauth.Auth, saJSON []byte) (string, error) {
if httpClient := newProxyAwareHTTPClient(ctx, cfg, auth, 0); httpClient != nil {
if httpClient := helps.NewProxyAwareHTTPClient(ctx, cfg, auth, 0); httpClient != nil {
ctx = context.WithValue(ctx, oauth2.HTTPClient, httpClient)
}
// Use cloud-platform scope for Vertex AI.

View File

@@ -7,6 +7,7 @@ import (
"fmt"
"io"
"net/http"
"slices"
"strings"
"sync"
"time"
@@ -15,6 +16,7 @@ import (
copilotauth "github.com/router-for-me/CLIProxyAPI/v6/internal/auth/copilot"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
"github.com/router-for-me/CLIProxyAPI/v6/internal/runtime/executor/helps"
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
cliproxyexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
@@ -40,7 +42,7 @@ const (
copilotEditorVersion = "vscode/1.107.0"
copilotPluginVersion = "copilot-chat/0.35.0"
copilotIntegrationID = "vscode-chat"
copilotOpenAIIntent = "conversation-panel"
copilotOpenAIIntent = "conversation-edits"
copilotGitHubAPIVer = "2025-04-01"
)
@@ -104,6 +106,12 @@ func (e *GitHubCopilotExecutor) HttpRequest(ctx context.Context, auth *cliproxya
// Execute handles non-streaming requests to GitHub Copilot.
func (e *GitHubCopilotExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (resp cliproxyexecutor.Response, err error) {
if nativeExec, nativeAuth, nativeReq, ok, errGateway := e.nativeGateway(ctx, auth, req); errGateway != nil {
return resp, errGateway
} else if ok {
return nativeExec.Execute(ctx, nativeAuth, nativeReq, opts)
}
apiToken, baseURL, errToken := e.ensureAPIToken(ctx, auth)
if errToken != nil {
return resp, errToken
@@ -126,6 +134,7 @@ func (e *GitHubCopilotExecutor) Execute(ctx context.Context, auth *cliproxyauth.
body := sdktranslator.TranslateRequest(from, to, req.Model, bytes.Clone(req.Payload), false)
body = e.normalizeModel(req.Model, body)
body = flattenAssistantContent(body)
body = stripUnsupportedBetas(body)
// Detect vision content before input normalization removes messages
hasVision := detectVisionContent(body)
@@ -142,6 +151,7 @@ func (e *GitHubCopilotExecutor) Execute(ctx context.Context, auth *cliproxyauth.
if useResponses {
body = normalizeGitHubCopilotResponsesInput(body)
body = normalizeGitHubCopilotResponsesTools(body)
body = applyGitHubCopilotResponsesDefaults(body)
} else {
body = normalizeGitHubCopilotChatTools(body)
}
@@ -225,15 +235,22 @@ func (e *GitHubCopilotExecutor) Execute(ctx context.Context, auth *cliproxyauth.
if useResponses && from.String() == "claude" {
converted = translateGitHubCopilotResponsesNonStreamToClaude(data)
} else {
data = normalizeGitHubCopilotReasoningField(data)
converted = sdktranslator.TranslateNonStream(ctx, to, from, req.Model, bytes.Clone(opts.OriginalRequest), body, data, &param)
}
resp = cliproxyexecutor.Response{Payload: converted}
resp = cliproxyexecutor.Response{Payload: converted, Headers: httpResp.Header.Clone()}
reporter.ensurePublished(ctx)
return resp, nil
}
// ExecuteStream handles streaming requests to GitHub Copilot.
func (e *GitHubCopilotExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (_ *cliproxyexecutor.StreamResult, err error) {
if nativeExec, nativeAuth, nativeReq, ok, errGateway := e.nativeGateway(ctx, auth, req); errGateway != nil {
return nil, errGateway
} else if ok {
return nativeExec.ExecuteStream(ctx, nativeAuth, nativeReq, opts)
}
apiToken, baseURL, errToken := e.ensureAPIToken(ctx, auth)
if errToken != nil {
return nil, errToken
@@ -256,6 +273,7 @@ func (e *GitHubCopilotExecutor) ExecuteStream(ctx context.Context, auth *cliprox
body := sdktranslator.TranslateRequest(from, to, req.Model, bytes.Clone(req.Payload), true)
body = e.normalizeModel(req.Model, body)
body = flattenAssistantContent(body)
body = stripUnsupportedBetas(body)
// Detect vision content before input normalization removes messages
hasVision := detectVisionContent(body)
@@ -272,6 +290,7 @@ func (e *GitHubCopilotExecutor) ExecuteStream(ctx context.Context, auth *cliprox
if useResponses {
body = normalizeGitHubCopilotResponsesInput(body)
body = normalizeGitHubCopilotResponsesTools(body)
body = applyGitHubCopilotResponsesDefaults(body)
} else {
body = normalizeGitHubCopilotChatTools(body)
}
@@ -378,7 +397,20 @@ func (e *GitHubCopilotExecutor) ExecuteStream(ctx context.Context, auth *cliprox
if useResponses && from.String() == "claude" {
chunks = translateGitHubCopilotResponsesStreamToClaude(bytes.Clone(line), &param)
} else {
chunks = sdktranslator.TranslateStream(ctx, to, from, req.Model, bytes.Clone(opts.OriginalRequest), body, bytes.Clone(line), &param)
// Strip SSE "data: " prefix before reasoning field normalization,
// since normalizeGitHubCopilotReasoningField expects pure JSON.
// Re-wrap with the prefix afterward for the translator.
normalizedLine := bytes.Clone(line)
if bytes.HasPrefix(line, dataTag) {
sseData := bytes.TrimSpace(line[len(dataTag):])
if !bytes.Equal(sseData, []byte("[DONE]")) && gjson.ValidBytes(sseData) {
normalized := normalizeGitHubCopilotReasoningField(bytes.Clone(sseData))
if !bytes.Equal(normalized, sseData) {
normalizedLine = append(append([]byte(nil), dataTag...), normalized...)
}
}
}
chunks = sdktranslator.TranslateStream(ctx, to, from, req.Model, bytes.Clone(opts.OriginalRequest), body, normalizedLine, &param)
}
for i := range chunks {
out <- cliproxyexecutor.StreamChunk{Payload: bytes.Clone(chunks[i])}
@@ -400,9 +432,34 @@ func (e *GitHubCopilotExecutor) ExecuteStream(ctx context.Context, auth *cliprox
}, nil
}
// CountTokens is not supported for GitHub Copilot.
func (e *GitHubCopilotExecutor) CountTokens(_ context.Context, _ *cliproxyauth.Auth, _ cliproxyexecutor.Request, _ cliproxyexecutor.Options) (cliproxyexecutor.Response, error) {
return cliproxyexecutor.Response{}, statusErr{code: http.StatusNotImplemented, msg: "count tokens not supported for github-copilot"}
// CountTokens estimates token count locally using tiktoken, since the GitHub
// Copilot API does not expose a dedicated token counting endpoint.
func (e *GitHubCopilotExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (cliproxyexecutor.Response, error) {
if nativeExec, nativeAuth, nativeReq, ok, errGateway := e.nativeGateway(ctx, auth, req); errGateway != nil {
return cliproxyexecutor.Response{}, errGateway
} else if ok {
return nativeExec.CountTokens(ctx, nativeAuth, nativeReq, opts)
}
baseModel := thinking.ParseSuffix(req.Model).ModelName
from := opts.SourceFormat
to := sdktranslator.FromString("openai")
translated := sdktranslator.TranslateRequest(from, to, baseModel, req.Payload, false)
enc, err := helps.TokenizerForModel(baseModel)
if err != nil {
return cliproxyexecutor.Response{}, fmt.Errorf("github copilot executor: tokenizer init failed: %w", err)
}
count, err := helps.CountOpenAIChatTokens(enc, translated)
if err != nil {
return cliproxyexecutor.Response{}, fmt.Errorf("github copilot executor: token counting failed: %w", err)
}
usageJSON := helps.BuildOpenAIUsageJSON(count)
translatedUsage := sdktranslator.TranslateTokenCount(ctx, to, from, count, usageJSON)
return cliproxyexecutor.Response{Payload: translatedUsage}, nil
}
// Refresh validates the GitHub token is still working.
@@ -428,6 +485,70 @@ func (e *GitHubCopilotExecutor) Refresh(ctx context.Context, auth *cliproxyauth.
return auth, nil
}
func (e *GitHubCopilotExecutor) nativeGateway(
ctx context.Context,
auth *cliproxyauth.Auth,
req cliproxyexecutor.Request,
) (cliproxyauth.ProviderExecutor, *cliproxyauth.Auth, cliproxyexecutor.Request, bool, error) {
if !githubCopilotUsesAnthropicGateway(req.Model) {
return nil, nil, req, false, nil
}
if auth == nil || metaStringValue(auth.Metadata, "access_token") == "" {
return nil, nil, req, false, nil
}
apiToken, baseURL, err := e.ensureAPIToken(ctx, auth)
if err != nil {
return nil, nil, req, false, err
}
nativeAuth := buildCopilotAnthropicGatewayAuth(auth, apiToken, baseURL, req.Payload)
if nativeAuth == nil {
return nil, nil, req, false, nil
}
return NewClaudeExecutor(e.cfg), nativeAuth, req, true, nil
}
func githubCopilotUsesAnthropicGateway(model string) bool {
baseModel := strings.ToLower(thinking.ParseSuffix(model).ModelName)
return strings.HasPrefix(baseModel, "claude-")
}
func buildCopilotAnthropicGatewayAuth(auth *cliproxyauth.Auth, apiToken, baseURL string, body []byte) *cliproxyauth.Auth {
apiToken = strings.TrimSpace(apiToken)
baseURL = strings.TrimRight(strings.TrimSpace(baseURL), "/")
if apiToken == "" || baseURL == "" {
return nil
}
nativeAuth := auth.Clone()
if nativeAuth == nil {
nativeAuth = &cliproxyauth.Auth{}
}
nativeAuth.Provider = "claude"
if nativeAuth.Attributes == nil {
nativeAuth.Attributes = make(map[string]string)
}
nativeAuth.Attributes["api_key"] = apiToken
nativeAuth.Attributes["base_url"] = baseURL
nativeAuth.Attributes["header:Content-Type"] = "application/json"
nativeAuth.Attributes["header:Accept"] = "application/json"
nativeAuth.Attributes["header:User-Agent"] = copilotUserAgent
nativeAuth.Attributes["header:Editor-Version"] = copilotEditorVersion
nativeAuth.Attributes["header:Editor-Plugin-Version"] = copilotPluginVersion
nativeAuth.Attributes["header:Openai-Intent"] = copilotOpenAIIntent
nativeAuth.Attributes["header:Copilot-Integration-Id"] = copilotIntegrationID
nativeAuth.Attributes["header:X-Github-Api-Version"] = copilotGitHubAPIVer
nativeAuth.Attributes["header:X-Request-Id"] = uuid.NewString()
if isAgentInitiated(body) {
nativeAuth.Attributes["header:X-Initiator"] = "agent"
} else {
nativeAuth.Attributes["header:X-Initiator"] = "user"
}
if detectVisionContent(body) {
nativeAuth.Attributes["header:Copilot-Vision-Request"] = "true"
}
return nativeAuth
}
// ensureAPIToken gets or refreshes the Copilot API token.
func (e *GitHubCopilotExecutor) ensureAPIToken(ctx context.Context, auth *cliproxyauth.Auth) (string, string, error) {
if auth == nil {
@@ -491,46 +612,127 @@ func (e *GitHubCopilotExecutor) applyHeaders(r *http.Request, apiToken string, b
r.Header.Set("X-Request-Id", uuid.NewString())
initiator := "user"
if role := detectLastConversationRole(body); role == "assistant" || role == "tool" {
if isAgentInitiated(body) {
initiator = "agent"
}
r.Header.Set("X-Initiator", initiator)
}
func detectLastConversationRole(body []byte) string {
// isAgentInitiated determines whether the current request is agent-initiated
// (tool callbacks, continuations) rather than user-initiated (new user prompt).
//
// GitHub Copilot uses the X-Initiator header for billing:
// - "user" → consumes premium request quota
// - "agent" → free (tool loops, continuations)
//
// The challenge: Claude Code sends tool results as role:"user" messages with
// content type "tool_result". After translation to OpenAI format, the tool_result
// part becomes a separate role:"tool" message, but if the original Claude message
// also contained text content (e.g. skill invocations, attachment descriptions),
// a role:"user" message is emitted AFTER the tool message, making the last message
// appear user-initiated when it's actually part of an agent tool loop.
//
// VSCode Copilot Chat solves this with explicit flags (iterationNumber,
// isContinuation, subAgentInvocationId). Since CPA doesn't have these flags,
// we infer agent status by checking whether the conversation contains prior
// assistant/tool messages — if it does, the current request is a continuation.
//
// References:
// - opencode#8030, opencode#15824: same root cause and fix approach
// - vscode-copilot-chat: toolCallingLoop.ts (iterationNumber === 0)
// - pi-ai: github-copilot-headers.ts (last message role check)
func isAgentInitiated(body []byte) bool {
if len(body) == 0 {
return ""
return false
}
// Chat Completions API: check messages array
if messages := gjson.GetBytes(body, "messages"); messages.Exists() && messages.IsArray() {
arr := messages.Array()
if len(arr) == 0 {
return false
}
lastRole := ""
for i := len(arr) - 1; i >= 0; i-- {
if role := arr[i].Get("role").String(); role != "" {
return role
if r := arr[i].Get("role").String(); r != "" {
lastRole = r
break
}
}
// If last message is assistant or tool, clearly agent-initiated.
if lastRole == "assistant" || lastRole == "tool" {
return true
}
// If last message is "user", check whether it contains tool results
// (indicating a tool-loop continuation) or if the preceding message
// is an assistant tool_use. This is more precise than checking for
// any prior assistant message, which would false-positive on genuine
// multi-turn follow-ups.
if lastRole == "user" {
// Check if the last user message contains tool_result content
lastContent := arr[len(arr)-1].Get("content")
if lastContent.Exists() && lastContent.IsArray() {
for _, part := range lastContent.Array() {
if part.Get("type").String() == "tool_result" {
return true
}
}
}
// Check if the second-to-last message is an assistant with tool_use
if len(arr) >= 2 {
prev := arr[len(arr)-2]
if prev.Get("role").String() == "assistant" {
prevContent := prev.Get("content")
if prevContent.Exists() && prevContent.IsArray() {
for _, part := range prevContent.Array() {
if part.Get("type").String() == "tool_use" {
return true
}
}
}
}
}
}
return false
}
// Responses API: check input array
if inputs := gjson.GetBytes(body, "input"); inputs.Exists() && inputs.IsArray() {
arr := inputs.Array()
for i := len(arr) - 1; i >= 0; i-- {
item := arr[i]
if len(arr) == 0 {
return false
}
// Most Responses input items carry a top-level role.
if role := item.Get("role").String(); role != "" {
return role
// Check last item
last := arr[len(arr)-1]
if role := last.Get("role").String(); role == "assistant" {
return true
}
switch last.Get("type").String() {
case "function_call", "function_call_arguments", "computer_call":
return true
case "function_call_output", "function_call_response", "tool_result", "computer_call_output":
return true
}
// If last item is user-role, check for prior non-user items
for _, item := range arr {
if role := item.Get("role").String(); role == "assistant" {
return true
}
switch item.Get("type").String() {
case "function_call", "function_call_arguments", "computer_call":
return "assistant"
case "function_call_output", "function_call_response", "tool_result", "computer_call_output":
return "tool"
case "function_call", "function_call_output", "function_call_response",
"function_call_arguments", "computer_call", "computer_call_output":
return true
}
}
}
return ""
return false
}
// detectVisionContent checks if the request body contains vision/image content.
@@ -572,6 +774,85 @@ func (e *GitHubCopilotExecutor) normalizeModel(model string, body []byte) []byte
return body
}
// copilotUnsupportedBetas lists beta headers that are Anthropic-specific and
// must not be forwarded to GitHub Copilot. The context-1m beta enables 1M
// context on Anthropic's API, but Copilot's Claude models are limited to
// ~128K-200K. Passing it through would not enable 1M on Copilot, but stripping
// it from the translated body avoids confusing downstream translators.
var copilotUnsupportedBetas = []string{
"context-1m-2025-08-07",
}
// stripUnsupportedBetas removes Anthropic-specific beta entries from the
// translated request body. In OpenAI format the betas may appear under
// "metadata.betas" or a top-level "betas" array; in Claude format they sit at
// "betas". This function checks all known locations.
func stripUnsupportedBetas(body []byte) []byte {
betaPaths := []string{"betas", "metadata.betas"}
for _, path := range betaPaths {
arr := gjson.GetBytes(body, path)
if !arr.Exists() || !arr.IsArray() {
continue
}
var filtered []string
changed := false
for _, item := range arr.Array() {
beta := item.String()
if isCopilotUnsupportedBeta(beta) {
changed = true
continue
}
filtered = append(filtered, beta)
}
if !changed {
continue
}
if len(filtered) == 0 {
body, _ = sjson.DeleteBytes(body, path)
} else {
body, _ = sjson.SetBytes(body, path, filtered)
}
}
return body
}
func isCopilotUnsupportedBeta(beta string) bool {
return slices.Contains(copilotUnsupportedBetas, beta)
}
// normalizeGitHubCopilotReasoningField maps Copilot's non-standard
// 'reasoning_text' field to the standard OpenAI 'reasoning_content' field
// that the SDK translator expects. This handles both streaming deltas
// (choices[].delta.reasoning_text) and non-streaming messages
// (choices[].message.reasoning_text). The field is only renamed when
// 'reasoning_content' is absent or null, preserving standard responses.
// All choices are processed to support n>1 requests.
func normalizeGitHubCopilotReasoningField(data []byte) []byte {
choices := gjson.GetBytes(data, "choices")
if !choices.Exists() || !choices.IsArray() {
return data
}
for i := range choices.Array() {
// Non-streaming: choices[i].message.reasoning_text
msgRT := fmt.Sprintf("choices.%d.message.reasoning_text", i)
msgRC := fmt.Sprintf("choices.%d.message.reasoning_content", i)
if rt := gjson.GetBytes(data, msgRT); rt.Exists() && rt.String() != "" {
if rc := gjson.GetBytes(data, msgRC); !rc.Exists() || rc.Type == gjson.Null || rc.String() == "" {
data, _ = sjson.SetBytes(data, msgRC, rt.String())
}
}
// Streaming: choices[i].delta.reasoning_text
deltaRT := fmt.Sprintf("choices.%d.delta.reasoning_text", i)
deltaRC := fmt.Sprintf("choices.%d.delta.reasoning_content", i)
if rt := gjson.GetBytes(data, deltaRT); rt.Exists() && rt.String() != "" {
if rc := gjson.GetBytes(data, deltaRC); !rc.Exists() || rc.Type == gjson.Null || rc.String() == "" {
data, _ = sjson.SetBytes(data, deltaRC, rt.String())
}
}
}
return data
}
func useGitHubCopilotResponsesEndpoint(sourceFormat sdktranslator.Format, model string) bool {
if sourceFormat.String() == "openai-response" {
return true
@@ -596,12 +877,7 @@ func lookupGitHubCopilotStaticModelInfo(model string) *registry.ModelInfo {
}
func containsEndpoint(endpoints []string, endpoint string) bool {
for _, item := range endpoints {
if item == endpoint {
return true
}
}
return false
return slices.Contains(endpoints, endpoint)
}
// flattenAssistantContent converts assistant message content from array format
@@ -856,6 +1132,32 @@ func stripGitHubCopilotResponsesUnsupportedFields(body []byte) []byte {
return body
}
// applyGitHubCopilotResponsesDefaults sets required fields for the Responses API
// that both vscode-copilot-chat and pi-ai always include.
//
// References:
// - vscode-copilot-chat: src/platform/endpoint/node/responsesApi.ts
// - pi-ai (badlogic/pi-mono): packages/ai/src/providers/openai-responses.ts
func applyGitHubCopilotResponsesDefaults(body []byte) []byte {
// store: false — prevents request/response storage
if !gjson.GetBytes(body, "store").Exists() {
body, _ = sjson.SetBytes(body, "store", false)
}
// include: ["reasoning.encrypted_content"] — enables reasoning content
// reuse across turns, avoiding redundant computation
if !gjson.GetBytes(body, "include").Exists() {
body, _ = sjson.SetRawBytes(body, "include", []byte(`["reasoning.encrypted_content"]`))
}
// If reasoning.effort is set but reasoning.summary is not, default to "auto"
if gjson.GetBytes(body, "reasoning.effort").Exists() && !gjson.GetBytes(body, "reasoning.summary").Exists() {
body, _ = sjson.SetBytes(body, "reasoning.summary", "auto")
}
return body
}
func normalizeGitHubCopilotResponsesTools(body []byte) []byte {
tools := gjson.GetBytes(body, "tools")
if tools.Exists() {
@@ -1406,6 +1708,21 @@ func FetchGitHubCopilotModels(ctx context.Context, auth *cliproxyauth.Auth, cfg
m.MaxCompletionTokens = defaultCopilotMaxCompletionTokens
}
// Override with real limits from the Copilot API when available.
// The API returns per-account limits (individual vs business) under
// capabilities.limits, which are more accurate than our static
// fallback values. We use max_prompt_tokens as ContextLength because
// that's the hard limit the Copilot API enforces on prompt size —
// exceeding it triggers "prompt token count exceeds the limit" errors.
if limits := entry.Limits(); limits != nil {
if limits.MaxPromptTokens > 0 {
m.ContextLength = limits.MaxPromptTokens
}
if limits.MaxOutputTokens > 0 {
m.MaxCompletionTokens = limits.MaxOutputTokens
}
}
models = append(models, m)
}

View File

@@ -1,11 +1,19 @@
package executor
import (
"context"
"io"
"net/http"
"net/http/httptest"
"strings"
"testing"
"time"
copilotauth "github.com/router-for-me/CLIProxyAPI/v6/internal/auth/copilot"
"github.com/router-for-me/CLIProxyAPI/v6/internal/config"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
cliproxyauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
cliproxyexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
sdktranslator "github.com/router-for-me/CLIProxyAPI/v6/sdk/translator"
"github.com/tidwall/gjson"
)
@@ -72,26 +80,39 @@ func TestUseGitHubCopilotResponsesEndpoint_CodexModel(t *testing.T) {
}
func TestUseGitHubCopilotResponsesEndpoint_RegistryResponsesOnlyModel(t *testing.T) {
t.Parallel()
// Not parallel: shares global model registry with DynamicRegistryWinsOverStatic.
if !useGitHubCopilotResponsesEndpoint(sdktranslator.FromString("openai"), "gpt-5.4") {
t.Fatal("expected responses-only registry model to use /responses")
}
if !useGitHubCopilotResponsesEndpoint(sdktranslator.FromString("openai"), "gpt-5.4-mini") {
t.Fatal("expected responses-only registry model to use /responses")
}
}
func TestUseGitHubCopilotResponsesEndpoint_DynamicRegistryWinsOverStatic(t *testing.T) {
t.Parallel()
// Not parallel: mutates global model registry, conflicts with RegistryResponsesOnlyModel.
reg := registry.GetGlobalRegistry()
clientID := "github-copilot-test-client"
reg.RegisterClient(clientID, "github-copilot", []*registry.ModelInfo{{
ID: "gpt-5.4",
SupportedEndpoints: []string{"/chat/completions", "/responses"},
}})
reg.RegisterClient(clientID, "github-copilot", []*registry.ModelInfo{
{
ID: "gpt-5.4",
SupportedEndpoints: []string{"/chat/completions", "/responses"},
},
{
ID: "gpt-5.4-mini",
SupportedEndpoints: []string{"/chat/completions", "/responses"},
},
})
defer reg.UnregisterClient(clientID)
if useGitHubCopilotResponsesEndpoint(sdktranslator.FromString("openai"), "gpt-5.4") {
t.Fatal("expected dynamic registry definition to take precedence over static fallback")
}
if useGitHubCopilotResponsesEndpoint(sdktranslator.FromString("openai"), "gpt-5.4-mini") {
t.Fatal("expected dynamic registry definition to take precedence over static fallback")
}
}
func TestUseGitHubCopilotResponsesEndpoint_DefaultChat(t *testing.T) {
@@ -238,14 +259,14 @@ func TestTranslateGitHubCopilotResponsesNonStreamToClaude_TextMapping(t *testing
t.Parallel()
resp := []byte(`{"id":"resp_1","model":"gpt-5-codex","output":[{"type":"message","content":[{"type":"output_text","text":"hello"}]}],"usage":{"input_tokens":3,"output_tokens":5}}`)
out := translateGitHubCopilotResponsesNonStreamToClaude(resp)
if gjson.Get(out, "type").String() != "message" {
t.Fatalf("type = %q, want message", gjson.Get(out, "type").String())
if gjson.GetBytes(out, "type").String() != "message" {
t.Fatalf("type = %q, want message", gjson.GetBytes(out, "type").String())
}
if gjson.Get(out, "content.0.type").String() != "text" {
t.Fatalf("content.0.type = %q, want text", gjson.Get(out, "content.0.type").String())
if gjson.GetBytes(out, "content.0.type").String() != "text" {
t.Fatalf("content.0.type = %q, want text", gjson.GetBytes(out, "content.0.type").String())
}
if gjson.Get(out, "content.0.text").String() != "hello" {
t.Fatalf("content.0.text = %q, want hello", gjson.Get(out, "content.0.text").String())
if gjson.GetBytes(out, "content.0.text").String() != "hello" {
t.Fatalf("content.0.text = %q, want hello", gjson.GetBytes(out, "content.0.text").String())
}
}
@@ -253,14 +274,14 @@ func TestTranslateGitHubCopilotResponsesNonStreamToClaude_ToolUseMapping(t *test
t.Parallel()
resp := []byte(`{"id":"resp_2","model":"gpt-5-codex","output":[{"type":"function_call","id":"fc_1","call_id":"call_1","name":"sum","arguments":"{\"a\":1}"}],"usage":{"input_tokens":1,"output_tokens":2}}`)
out := translateGitHubCopilotResponsesNonStreamToClaude(resp)
if gjson.Get(out, "content.0.type").String() != "tool_use" {
t.Fatalf("content.0.type = %q, want tool_use", gjson.Get(out, "content.0.type").String())
if gjson.GetBytes(out, "content.0.type").String() != "tool_use" {
t.Fatalf("content.0.type = %q, want tool_use", gjson.GetBytes(out, "content.0.type").String())
}
if gjson.Get(out, "content.0.name").String() != "sum" {
t.Fatalf("content.0.name = %q, want sum", gjson.Get(out, "content.0.name").String())
if gjson.GetBytes(out, "content.0.name").String() != "sum" {
t.Fatalf("content.0.name = %q, want sum", gjson.GetBytes(out, "content.0.name").String())
}
if gjson.Get(out, "stop_reason").String() != "tool_use" {
t.Fatalf("stop_reason = %q, want tool_use", gjson.Get(out, "stop_reason").String())
if gjson.GetBytes(out, "stop_reason").String() != "tool_use" {
t.Fatalf("stop_reason = %q, want tool_use", gjson.GetBytes(out, "stop_reason").String())
}
}
@@ -269,18 +290,24 @@ func TestTranslateGitHubCopilotResponsesStreamToClaude_TextLifecycle(t *testing.
var param any
created := translateGitHubCopilotResponsesStreamToClaude([]byte(`data: {"type":"response.created","response":{"id":"resp_1","model":"gpt-5-codex"}}`), &param)
if len(created) == 0 || !strings.Contains(created[0], "message_start") {
if len(created) == 0 || !strings.Contains(string(created[0]), "message_start") {
t.Fatalf("created events = %#v, want message_start", created)
}
delta := translateGitHubCopilotResponsesStreamToClaude([]byte(`data: {"type":"response.output_text.delta","delta":"he"}`), &param)
joinedDelta := strings.Join(delta, "")
var joinedDelta string
for _, d := range delta {
joinedDelta += string(d)
}
if !strings.Contains(joinedDelta, "content_block_start") || !strings.Contains(joinedDelta, "text_delta") {
t.Fatalf("delta events = %#v, want content_block_start + text_delta", delta)
}
completed := translateGitHubCopilotResponsesStreamToClaude([]byte(`data: {"type":"response.completed","response":{"usage":{"input_tokens":7,"output_tokens":9}}}`), &param)
joinedCompleted := strings.Join(completed, "")
var joinedCompleted string
for _, c := range completed {
joinedCompleted += string(c)
}
if !strings.Contains(joinedCompleted, "message_delta") || !strings.Contains(joinedCompleted, "message_stop") {
t.Fatalf("completed events = %#v, want message_delta + message_stop", completed)
}
@@ -299,15 +326,17 @@ func TestApplyHeaders_XInitiator_UserOnly(t *testing.T) {
}
}
func TestApplyHeaders_XInitiator_UserWhenLastRoleIsUser(t *testing.T) {
func TestApplyHeaders_XInitiator_AgentWhenLastUserButHistoryHasAssistant(t *testing.T) {
t.Parallel()
e := &GitHubCopilotExecutor{}
req, _ := http.NewRequest(http.MethodPost, "https://example.com", nil)
// Last role governs the initiator decision.
body := []byte(`{"messages":[{"role":"user","content":"hello"},{"role":"assistant","content":"I will read the file"},{"role":"user","content":"tool result here"}]}`)
// When the last role is "user" and the message contains tool_result content,
// the request is a continuation (e.g. Claude tool result translated to a
// synthetic user message). Should be "agent".
body := []byte(`{"messages":[{"role":"user","content":"hello"},{"role":"assistant","content":"I will read the file"},{"role":"user","content":[{"type":"tool_result","tool_use_id":"tu1","content":"file contents..."}]}]}`)
e.applyHeaders(req, "token", body)
if got := req.Header.Get("X-Initiator"); got != "user" {
t.Fatalf("X-Initiator = %q, want user (last role is user)", got)
if got := req.Header.Get("X-Initiator"); got != "agent" {
t.Fatalf("X-Initiator = %q, want agent (last user contains tool_result)", got)
}
}
@@ -315,10 +344,11 @@ func TestApplyHeaders_XInitiator_AgentWithToolRole(t *testing.T) {
t.Parallel()
e := &GitHubCopilotExecutor{}
req, _ := http.NewRequest(http.MethodPost, "https://example.com", nil)
// When the last message has role "tool", it's clearly agent-initiated.
body := []byte(`{"messages":[{"role":"user","content":"hello"},{"role":"tool","content":"result"}]}`)
e.applyHeaders(req, "token", body)
if got := req.Header.Get("X-Initiator"); got != "agent" {
t.Fatalf("X-Initiator = %q, want agent (tool role exists)", got)
t.Fatalf("X-Initiator = %q, want agent (last role is tool)", got)
}
}
@@ -333,14 +363,15 @@ func TestApplyHeaders_XInitiator_InputArrayLastAssistantMessage(t *testing.T) {
}
}
func TestApplyHeaders_XInitiator_InputArrayLastUserMessage(t *testing.T) {
func TestApplyHeaders_XInitiator_InputArrayAgentWhenLastUserButHistoryHasAssistant(t *testing.T) {
t.Parallel()
e := &GitHubCopilotExecutor{}
req, _ := http.NewRequest(http.MethodPost, "https://example.com", nil)
// Responses API: last item is user-role but history contains assistant → agent.
body := []byte(`{"input":[{"type":"message","role":"assistant","content":[{"type":"output_text","text":"I can help"}]},{"type":"message","role":"user","content":[{"type":"input_text","text":"Do X"}]}]}`)
e.applyHeaders(req, "token", body)
if got := req.Header.Get("X-Initiator"); got != "user" {
t.Fatalf("X-Initiator = %q, want user (last role is user)", got)
if got := req.Header.Get("X-Initiator"); got != "agent" {
t.Fatalf("X-Initiator = %q, want agent (history has assistant)", got)
}
}
@@ -355,6 +386,33 @@ func TestApplyHeaders_XInitiator_InputArrayLastFunctionCallOutput(t *testing.T)
}
}
func TestApplyHeaders_XInitiator_UserInMultiTurnNoTools(t *testing.T) {
t.Parallel()
e := &GitHubCopilotExecutor{}
req, _ := http.NewRequest(http.MethodPost, "https://example.com", nil)
// Genuine multi-turn: user → assistant (plain text) → user follow-up.
// No tool messages → should be "user" (not a false-positive).
body := []byte(`{"messages":[{"role":"user","content":"hello"},{"role":"assistant","content":"Hi there!"},{"role":"user","content":"what is 2+2?"}]}`)
e.applyHeaders(req, "token", body)
if got := req.Header.Get("X-Initiator"); got != "user" {
t.Fatalf("X-Initiator = %q, want user (genuine multi-turn, no tools)", got)
}
}
func TestApplyHeaders_XInitiator_UserFollowUpAfterToolHistory(t *testing.T) {
t.Parallel()
e := &GitHubCopilotExecutor{}
req, _ := http.NewRequest(http.MethodPost, "https://example.com", nil)
// User follow-up after a completed tool-use conversation.
// The last message is a genuine user question — should be "user", not "agent".
// This aligns with opencode's behavior: only active tool loops are agent-initiated.
body := []byte(`{"messages":[{"role":"user","content":"hello"},{"role":"assistant","content":[{"type":"tool_use","id":"tu1","name":"Read","input":{}}]},{"role":"tool","tool_call_id":"tu1","content":"file data"},{"role":"assistant","content":"I read the file."},{"role":"user","content":"What did we do so far?"}]}`)
e.applyHeaders(req, "token", body)
if got := req.Header.Get("X-Initiator"); got != "user" {
t.Fatalf("X-Initiator = %q, want user (genuine follow-up after tool history)", got)
}
}
// --- Tests for x-github-api-version header (Problem M) ---
func TestApplyHeaders_GitHubAPIVersion(t *testing.T) {
@@ -401,3 +459,502 @@ func TestDetectVisionContent_NoMessages(t *testing.T) {
t.Fatal("expected no vision content when messages field is absent")
}
}
// --- Tests for applyGitHubCopilotResponsesDefaults ---
func TestApplyGitHubCopilotResponsesDefaults_SetsAllDefaults(t *testing.T) {
t.Parallel()
body := []byte(`{"input":"hello","reasoning":{"effort":"medium"}}`)
got := applyGitHubCopilotResponsesDefaults(body)
if gjson.GetBytes(got, "store").Bool() != false {
t.Fatalf("store = %v, want false", gjson.GetBytes(got, "store").Raw)
}
inc := gjson.GetBytes(got, "include")
if !inc.IsArray() || inc.Array()[0].String() != "reasoning.encrypted_content" {
t.Fatalf("include = %s, want [\"reasoning.encrypted_content\"]", inc.Raw)
}
if gjson.GetBytes(got, "reasoning.summary").String() != "auto" {
t.Fatalf("reasoning.summary = %q, want auto", gjson.GetBytes(got, "reasoning.summary").String())
}
}
func TestApplyGitHubCopilotResponsesDefaults_DoesNotOverrideExisting(t *testing.T) {
t.Parallel()
body := []byte(`{"input":"hello","store":true,"include":["other"],"reasoning":{"effort":"high","summary":"concise"}}`)
got := applyGitHubCopilotResponsesDefaults(body)
if gjson.GetBytes(got, "store").Bool() != true {
t.Fatalf("store should not be overridden, got %s", gjson.GetBytes(got, "store").Raw)
}
if gjson.GetBytes(got, "include").Array()[0].String() != "other" {
t.Fatalf("include should not be overridden, got %s", gjson.GetBytes(got, "include").Raw)
}
if gjson.GetBytes(got, "reasoning.summary").String() != "concise" {
t.Fatalf("reasoning.summary should not be overridden, got %q", gjson.GetBytes(got, "reasoning.summary").String())
}
}
func TestApplyGitHubCopilotResponsesDefaults_NoReasoningEffort(t *testing.T) {
t.Parallel()
body := []byte(`{"input":"hello"}`)
got := applyGitHubCopilotResponsesDefaults(body)
if gjson.GetBytes(got, "store").Bool() != false {
t.Fatalf("store = %v, want false", gjson.GetBytes(got, "store").Raw)
}
// reasoning.summary should NOT be set when reasoning.effort is absent
if gjson.GetBytes(got, "reasoning.summary").Exists() {
t.Fatalf("reasoning.summary should not be set when reasoning.effort is absent, got %q", gjson.GetBytes(got, "reasoning.summary").String())
}
}
// --- Tests for normalizeGitHubCopilotReasoningField ---
func TestNormalizeReasoningField_NonStreaming(t *testing.T) {
t.Parallel()
data := []byte(`{"choices":[{"message":{"content":"hello","reasoning_text":"I think..."}}]}`)
got := normalizeGitHubCopilotReasoningField(data)
rc := gjson.GetBytes(got, "choices.0.message.reasoning_content").String()
if rc != "I think..." {
t.Fatalf("reasoning_content = %q, want %q", rc, "I think...")
}
}
func TestNormalizeReasoningField_Streaming(t *testing.T) {
t.Parallel()
data := []byte(`{"choices":[{"delta":{"reasoning_text":"thinking delta"}}]}`)
got := normalizeGitHubCopilotReasoningField(data)
rc := gjson.GetBytes(got, "choices.0.delta.reasoning_content").String()
if rc != "thinking delta" {
t.Fatalf("reasoning_content = %q, want %q", rc, "thinking delta")
}
}
func TestNormalizeReasoningField_PreservesExistingReasoningContent(t *testing.T) {
t.Parallel()
data := []byte(`{"choices":[{"message":{"reasoning_text":"old","reasoning_content":"existing"}}]}`)
got := normalizeGitHubCopilotReasoningField(data)
rc := gjson.GetBytes(got, "choices.0.message.reasoning_content").String()
if rc != "existing" {
t.Fatalf("reasoning_content = %q, want %q (should not overwrite)", rc, "existing")
}
}
func TestNormalizeReasoningField_MultiChoice(t *testing.T) {
t.Parallel()
data := []byte(`{"choices":[{"message":{"reasoning_text":"thought-0"}},{"message":{"reasoning_text":"thought-1"}}]}`)
got := normalizeGitHubCopilotReasoningField(data)
rc0 := gjson.GetBytes(got, "choices.0.message.reasoning_content").String()
rc1 := gjson.GetBytes(got, "choices.1.message.reasoning_content").String()
if rc0 != "thought-0" {
t.Fatalf("choices[0].reasoning_content = %q, want %q", rc0, "thought-0")
}
if rc1 != "thought-1" {
t.Fatalf("choices[1].reasoning_content = %q, want %q", rc1, "thought-1")
}
}
func TestNormalizeReasoningField_NoChoices(t *testing.T) {
t.Parallel()
data := []byte(`{"id":"chatcmpl-123"}`)
got := normalizeGitHubCopilotReasoningField(data)
if string(got) != string(data) {
t.Fatalf("expected no change, got %s", string(got))
}
}
func TestApplyHeaders_OpenAIIntentValue(t *testing.T) {
t.Parallel()
e := &GitHubCopilotExecutor{}
req, _ := http.NewRequest(http.MethodPost, "https://example.com", nil)
e.applyHeaders(req, "token", nil)
if got := req.Header.Get("Openai-Intent"); got != "conversation-edits" {
t.Fatalf("Openai-Intent = %q, want conversation-edits", got)
}
}
// --- Tests for CountTokens (local tiktoken estimation) ---
func TestCountTokens_ReturnsPositiveCount(t *testing.T) {
t.Parallel()
e := &GitHubCopilotExecutor{}
body := []byte(`{"model":"gpt-4o","messages":[{"role":"user","content":"Hello, world!"}]}`)
resp, err := e.CountTokens(context.Background(), nil, cliproxyexecutor.Request{
Model: "gpt-4o",
Payload: body,
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FromString("openai"),
})
if err != nil {
t.Fatalf("CountTokens() error: %v", err)
}
if len(resp.Payload) == 0 {
t.Fatal("CountTokens() returned empty payload")
}
// The response should contain a positive token count.
tokens := gjson.GetBytes(resp.Payload, "usage.prompt_tokens").Int()
if tokens <= 0 {
t.Fatalf("expected positive token count, got %d", tokens)
}
}
func TestCountTokens_ClaudeSourceFormatTranslates(t *testing.T) {
t.Parallel()
e := &GitHubCopilotExecutor{}
body := []byte(`{"model":"claude-sonnet-4","messages":[{"role":"user","content":"Tell me a joke"}],"max_tokens":1024}`)
resp, err := e.CountTokens(context.Background(), nil, cliproxyexecutor.Request{
Model: "claude-sonnet-4",
Payload: body,
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FromString("claude"),
})
if err != nil {
t.Fatalf("CountTokens() error: %v", err)
}
// Claude source format → should get input_tokens in response
inputTokens := gjson.GetBytes(resp.Payload, "input_tokens").Int()
if inputTokens <= 0 {
// Fallback: check usage.prompt_tokens (depends on translator registration)
promptTokens := gjson.GetBytes(resp.Payload, "usage.prompt_tokens").Int()
if promptTokens <= 0 {
t.Fatalf("expected positive token count, got payload: %s", resp.Payload)
}
}
}
func TestGitHubCopilotExecute_ClaudeModelUsesNativeGateway(t *testing.T) {
t.Parallel()
var gotPath string
var gotQuery string
var gotAuth string
var gotAPIVersion string
var gotEditorVersion string
var gotIntent string
var gotInitiator string
var gotBody []byte
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotPath = r.URL.Path
gotQuery = r.URL.RawQuery
gotAuth = r.Header.Get("Authorization")
gotAPIVersion = r.Header.Get("X-Github-Api-Version")
gotEditorVersion = r.Header.Get("Editor-Version")
gotIntent = r.Header.Get("Openai-Intent")
gotInitiator = r.Header.Get("X-Initiator")
gotBody, _ = io.ReadAll(r.Body)
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"id":"msg_1","type":"message","model":"claude-sonnet-4.6","role":"assistant","content":[{"type":"text","text":"ok"}],"usage":{"input_tokens":1,"output_tokens":1}}`))
}))
defer server.Close()
e := NewGitHubCopilotExecutor(&config.Config{})
e.cache["gh-access-token"] = &cachedAPIToken{
token: "copilot-api-token",
apiEndpoint: server.URL,
expiresAt: time.Now().Add(time.Hour),
}
auth := &cliproxyauth.Auth{Metadata: map[string]any{"access_token": "gh-access-token"}}
payload := []byte(`{"model":"claude-sonnet-4.6","max_tokens":256,"messages":[{"role":"user","content":"hello"}]}`)
resp, err := e.Execute(context.Background(), auth, cliproxyexecutor.Request{
Model: "claude-sonnet-4.6",
Payload: payload,
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FromString("claude"),
OriginalRequest: payload,
})
if err != nil {
t.Fatalf("Execute() error: %v", err)
}
if gotPath != "/v1/messages" {
t.Fatalf("path = %q, want %q", gotPath, "/v1/messages")
}
if gotQuery != "beta=true" {
t.Fatalf("query = %q, want %q", gotQuery, "beta=true")
}
if gotAuth != "Bearer copilot-api-token" {
t.Fatalf("Authorization = %q, want %q", gotAuth, "Bearer copilot-api-token")
}
if gotAPIVersion != copilotGitHubAPIVer {
t.Fatalf("X-Github-Api-Version = %q, want %q", gotAPIVersion, copilotGitHubAPIVer)
}
if gotEditorVersion != copilotEditorVersion {
t.Fatalf("Editor-Version = %q, want %q", gotEditorVersion, copilotEditorVersion)
}
if gotIntent != copilotOpenAIIntent {
t.Fatalf("Openai-Intent = %q, want %q", gotIntent, copilotOpenAIIntent)
}
if gotInitiator != "user" {
t.Fatalf("X-Initiator = %q, want %q", gotInitiator, "user")
}
if gjson.GetBytes(gotBody, "model").String() != "claude-sonnet-4.6" {
t.Fatalf("upstream model = %q, want %q", gjson.GetBytes(gotBody, "model").String(), "claude-sonnet-4.6")
}
if gjson.GetBytes(resp.Payload, "content.0.text").String() != "ok" {
t.Fatalf("response text = %q, want %q", gjson.GetBytes(resp.Payload, "content.0.text").String(), "ok")
}
}
func TestGitHubCopilotExecuteStream_ClaudeModelUsesNativeGateway(t *testing.T) {
t.Parallel()
var gotPath string
var gotInitiator string
var gotAPIVersion string
server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotPath = r.URL.Path
gotInitiator = r.Header.Get("X-Initiator")
gotAPIVersion = r.Header.Get("X-Github-Api-Version")
w.Header().Set("Content-Type", "text/event-stream")
_, _ = w.Write([]byte("event: message_start\ndata: {\"type\":\"message_start\",\"message\":{\"id\":\"msg_1\",\"type\":\"message\",\"role\":\"assistant\",\"model\":\"claude-sonnet-4.6\",\"content\":[],\"usage\":{\"input_tokens\":1,\"output_tokens\":0}}}\n\n"))
_, _ = w.Write([]byte("event: content_block_start\ndata: {\"type\":\"content_block_start\",\"index\":0,\"content_block\":{\"type\":\"text\",\"text\":\"\"}}\n\n"))
_, _ = w.Write([]byte("event: content_block_delta\ndata: {\"type\":\"content_block_delta\",\"index\":0,\"delta\":{\"type\":\"text_delta\",\"text\":\"ok\"}}\n\n"))
_, _ = w.Write([]byte("event: content_block_stop\ndata: {\"type\":\"content_block_stop\",\"index\":0}\n\n"))
_, _ = w.Write([]byte("event: message_delta\ndata: {\"type\":\"message_delta\",\"delta\":{\"stop_reason\":\"end_turn\"},\"usage\":{\"output_tokens\":1}}\n\n"))
_, _ = w.Write([]byte("event: message_stop\ndata: {\"type\":\"message_stop\"}\n\n"))
}))
defer server.Close()
e := NewGitHubCopilotExecutor(&config.Config{})
e.cache["gh-access-token"] = &cachedAPIToken{
token: "copilot-api-token",
apiEndpoint: server.URL,
expiresAt: time.Now().Add(time.Hour),
}
auth := &cliproxyauth.Auth{Metadata: map[string]any{"access_token": "gh-access-token"}}
payload := []byte(`{"model":"claude-sonnet-4.6","stream":true,"max_tokens":256,"messages":[{"role":"assistant","content":[{"type":"tool_use","id":"toolu_1","name":"Read","input":{"path":"notes.txt"}}]},{"role":"user","content":[{"type":"tool_result","tool_use_id":"toolu_1","content":"file contents"}]}]}`)
result, err := e.ExecuteStream(context.Background(), auth, cliproxyexecutor.Request{
Model: "claude-sonnet-4.6",
Payload: payload,
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FromString("claude"),
OriginalRequest: payload,
})
if err != nil {
t.Fatalf("ExecuteStream() error: %v", err)
}
var joined strings.Builder
for chunk := range result.Chunks {
if chunk.Err != nil {
t.Fatalf("stream chunk error: %v", chunk.Err)
}
joined.Write(chunk.Payload)
}
if gotPath != "/v1/messages" {
t.Fatalf("path = %q, want %q", gotPath, "/v1/messages")
}
if gotInitiator != "agent" {
t.Fatalf("X-Initiator = %q, want %q", gotInitiator, "agent")
}
if gotAPIVersion != copilotGitHubAPIVer {
t.Fatalf("X-Github-Api-Version = %q, want %q", gotAPIVersion, copilotGitHubAPIVer)
}
if !strings.Contains(joined.String(), "message_start") || !strings.Contains(joined.String(), "text_delta") {
t.Fatalf("stream = %q, want Claude SSE payload", joined.String())
}
}
func TestCountTokens_EmptyPayload(t *testing.T) {
t.Parallel()
e := &GitHubCopilotExecutor{}
resp, err := e.CountTokens(context.Background(), nil, cliproxyexecutor.Request{
Model: "gpt-4o",
Payload: []byte(`{"model":"gpt-4o","messages":[]}`),
}, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FromString("openai"),
})
if err != nil {
t.Fatalf("CountTokens() error: %v", err)
}
tokens := gjson.GetBytes(resp.Payload, "usage.prompt_tokens").Int()
// Empty messages should return 0 tokens.
if tokens != 0 {
t.Fatalf("expected 0 tokens for empty messages, got %d", tokens)
}
}
func TestStripUnsupportedBetas_RemovesContext1M(t *testing.T) {
t.Parallel()
body := []byte(`{"model":"claude-opus-4.6","betas":["interleaved-thinking-2025-05-14","context-1m-2025-08-07","claude-code-20250219"],"messages":[]}`)
result := stripUnsupportedBetas(body)
betas := gjson.GetBytes(result, "betas")
if !betas.Exists() {
t.Fatal("betas field should still exist after stripping")
}
for _, item := range betas.Array() {
if item.String() == "context-1m-2025-08-07" {
t.Fatal("context-1m-2025-08-07 should have been stripped")
}
}
// Other betas should be preserved
found := false
for _, item := range betas.Array() {
if item.String() == "interleaved-thinking-2025-05-14" {
found = true
}
}
if !found {
t.Fatal("other betas should be preserved")
}
}
func TestStripUnsupportedBetas_NoBetasField(t *testing.T) {
t.Parallel()
body := []byte(`{"model":"gpt-4o","messages":[]}`)
result := stripUnsupportedBetas(body)
// Should be unchanged
if string(result) != string(body) {
t.Fatalf("body should be unchanged when no betas field exists, got %s", string(result))
}
}
func TestStripUnsupportedBetas_MetadataBetas(t *testing.T) {
t.Parallel()
body := []byte(`{"model":"claude-opus-4.6","metadata":{"betas":["context-1m-2025-08-07","other-beta"]},"messages":[]}`)
result := stripUnsupportedBetas(body)
betas := gjson.GetBytes(result, "metadata.betas")
if !betas.Exists() {
t.Fatal("metadata.betas field should still exist after stripping")
}
for _, item := range betas.Array() {
if item.String() == "context-1m-2025-08-07" {
t.Fatal("context-1m-2025-08-07 should have been stripped from metadata.betas")
}
}
if betas.Array()[0].String() != "other-beta" {
t.Fatal("other betas in metadata.betas should be preserved")
}
}
func TestStripUnsupportedBetas_AllBetasStripped(t *testing.T) {
t.Parallel()
body := []byte(`{"model":"claude-opus-4.6","betas":["context-1m-2025-08-07"],"messages":[]}`)
result := stripUnsupportedBetas(body)
betas := gjson.GetBytes(result, "betas")
if betas.Exists() {
t.Fatal("betas field should be deleted when all betas are stripped")
}
}
func TestCopilotModelEntry_Limits(t *testing.T) {
t.Parallel()
tests := []struct {
name string
capabilities map[string]any
wantNil bool
wantPrompt int
wantOutput int
wantContext int
}{
{
name: "nil capabilities",
capabilities: nil,
wantNil: true,
},
{
name: "no limits key",
capabilities: map[string]any{"family": "claude-opus-4.6"},
wantNil: true,
},
{
name: "limits is not a map",
capabilities: map[string]any{"limits": "invalid"},
wantNil: true,
},
{
name: "all zero values",
capabilities: map[string]any{
"limits": map[string]any{
"max_context_window_tokens": float64(0),
"max_prompt_tokens": float64(0),
"max_output_tokens": float64(0),
},
},
wantNil: true,
},
{
name: "individual account limits (128K prompt)",
capabilities: map[string]any{
"limits": map[string]any{
"max_context_window_tokens": float64(144000),
"max_prompt_tokens": float64(128000),
"max_output_tokens": float64(64000),
},
},
wantNil: false,
wantPrompt: 128000,
wantOutput: 64000,
wantContext: 144000,
},
{
name: "business account limits (168K prompt)",
capabilities: map[string]any{
"limits": map[string]any{
"max_context_window_tokens": float64(200000),
"max_prompt_tokens": float64(168000),
"max_output_tokens": float64(32000),
},
},
wantNil: false,
wantPrompt: 168000,
wantOutput: 32000,
wantContext: 200000,
},
{
name: "partial limits (only prompt)",
capabilities: map[string]any{
"limits": map[string]any{
"max_prompt_tokens": float64(128000),
},
},
wantNil: false,
wantPrompt: 128000,
wantOutput: 0,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
t.Parallel()
entry := copilotauth.CopilotModelEntry{
ID: "claude-opus-4.6",
Capabilities: tt.capabilities,
}
limits := entry.Limits()
if tt.wantNil {
if limits != nil {
t.Fatalf("expected nil limits, got %+v", limits)
}
return
}
if limits == nil {
t.Fatal("expected non-nil limits, got nil")
}
if limits.MaxPromptTokens != tt.wantPrompt {
t.Errorf("MaxPromptTokens = %d, want %d", limits.MaxPromptTokens, tt.wantPrompt)
}
if limits.MaxOutputTokens != tt.wantOutput {
t.Errorf("MaxOutputTokens = %d, want %d", limits.MaxOutputTokens, tt.wantOutput)
}
if tt.wantContext > 0 && limits.MaxContextWindowTokens != tt.wantContext {
t.Errorf("MaxContextWindowTokens = %d, want %d", limits.MaxContextWindowTokens, tt.wantContext)
}
})
}
}

View File

@@ -30,12 +30,20 @@ const (
gitLabChatEndpoint = "/api/v4/chat/completions"
gitLabCodeSuggestionsEndpoint = "/api/v4/code_suggestions/completions"
gitLabSSEStreamingHeader = "X-Supports-Sse-Streaming"
gitLabContext1MBeta = "context-1m-2025-08-07"
gitLabNativeUserAgent = "CLIProxyAPIPlus/GitLab-Duo"
)
type GitLabExecutor struct {
cfg *config.Config
}
type gitLabCatalogModel struct {
ID string
DisplayName string
Provider string
}
type gitLabPrompt struct {
Instruction string
FileName string
@@ -53,6 +61,23 @@ type gitLabOpenAIStreamState struct {
Finished bool
}
var gitLabAgenticCatalog = []gitLabCatalogModel{
{ID: "duo-chat-gpt-5-1", DisplayName: "GitLab Duo (GPT-5.1)", Provider: "openai"},
{ID: "duo-chat-opus-4-6", DisplayName: "GitLab Duo (Claude Opus 4.6)", Provider: "anthropic"},
{ID: "duo-chat-opus-4-5", DisplayName: "GitLab Duo (Claude Opus 4.5)", Provider: "anthropic"},
{ID: "duo-chat-sonnet-4-6", DisplayName: "GitLab Duo (Claude Sonnet 4.6)", Provider: "anthropic"},
{ID: "duo-chat-sonnet-4-5", DisplayName: "GitLab Duo (Claude Sonnet 4.5)", Provider: "anthropic"},
{ID: "duo-chat-gpt-5-mini", DisplayName: "GitLab Duo (GPT-5 Mini)", Provider: "openai"},
{ID: "duo-chat-gpt-5-2", DisplayName: "GitLab Duo (GPT-5.2)", Provider: "openai"},
{ID: "duo-chat-gpt-5-2-codex", DisplayName: "GitLab Duo (GPT-5.2 Codex)", Provider: "openai"},
{ID: "duo-chat-gpt-5-codex", DisplayName: "GitLab Duo (GPT-5 Codex)", Provider: "openai"},
{ID: "duo-chat-haiku-4-5", DisplayName: "GitLab Duo (Claude Haiku 4.5)", Provider: "anthropic"},
}
var gitLabModelAliases = map[string]string{
"duo-chat-haiku-4-6": "duo-chat-haiku-4-5",
}
func NewGitLabExecutor(cfg *config.Config) *GitLabExecutor {
return &GitLabExecutor{cfg: cfg}
}
@@ -249,12 +274,12 @@ func (e *GitLabExecutor) nativeGateway(
auth *cliproxyauth.Auth,
req cliproxyexecutor.Request,
) (cliproxyauth.ProviderExecutor, *cliproxyauth.Auth, cliproxyexecutor.Request, bool) {
if nativeAuth, ok := buildGitLabAnthropicGatewayAuth(auth); ok {
if nativeAuth, ok := buildGitLabAnthropicGatewayAuth(auth, req.Model); ok {
nativeReq := req
nativeReq.Model = gitLabResolvedModel(auth, req.Model)
return NewClaudeExecutor(e.cfg), nativeAuth, nativeReq, true
}
if nativeAuth, ok := buildGitLabOpenAIGatewayAuth(auth); ok {
if nativeAuth, ok := buildGitLabOpenAIGatewayAuth(auth, req.Model); ok {
nativeReq := req
nativeReq.Model = gitLabResolvedModel(auth, req.Model)
return NewCodexExecutor(e.cfg), nativeAuth, nativeReq, true
@@ -263,10 +288,10 @@ func (e *GitLabExecutor) nativeGateway(
}
func (e *GitLabExecutor) nativeGatewayHTTP(auth *cliproxyauth.Auth) (cliproxyauth.ProviderExecutor, *cliproxyauth.Auth) {
if nativeAuth, ok := buildGitLabAnthropicGatewayAuth(auth); ok {
if nativeAuth, ok := buildGitLabAnthropicGatewayAuth(auth, ""); ok {
return NewClaudeExecutor(e.cfg), nativeAuth
}
if nativeAuth, ok := buildGitLabOpenAIGatewayAuth(auth); ok {
if nativeAuth, ok := buildGitLabOpenAIGatewayAuth(auth, ""); ok {
return NewCodexExecutor(e.cfg), nativeAuth
}
return nil, nil
@@ -664,7 +689,7 @@ func applyGitLabRequestHeaders(req *http.Request, auth *cliproxyauth.Auth) {
if auth != nil {
util.ApplyCustomHeadersFromAttrs(req, auth.Attributes)
}
for key, value := range gitLabGatewayHeaders(auth) {
for key, value := range gitLabGatewayHeaders(auth, "") {
if key == "" || value == "" {
continue
}
@@ -672,34 +697,40 @@ func applyGitLabRequestHeaders(req *http.Request, auth *cliproxyauth.Auth) {
}
}
func gitLabGatewayHeaders(auth *cliproxyauth.Auth) map[string]string {
if auth == nil || auth.Metadata == nil {
return nil
}
raw, ok := auth.Metadata["duo_gateway_headers"]
if !ok {
return nil
}
func gitLabGatewayHeaders(auth *cliproxyauth.Auth, targetProvider string) map[string]string {
out := make(map[string]string)
switch typed := raw.(type) {
case map[string]string:
for key, value := range typed {
key = strings.TrimSpace(key)
value = strings.TrimSpace(value)
if key != "" && value != "" {
out[key] = value
if auth != nil && auth.Metadata != nil {
raw, ok := auth.Metadata["duo_gateway_headers"]
if ok {
switch typed := raw.(type) {
case map[string]string:
for key, value := range typed {
key = strings.TrimSpace(key)
value = strings.TrimSpace(value)
if key != "" && value != "" {
out[key] = value
}
}
case map[string]any:
for key, value := range typed {
key = strings.TrimSpace(key)
if key == "" {
continue
}
strValue := strings.TrimSpace(fmt.Sprint(value))
if strValue != "" {
out[key] = strValue
}
}
}
}
case map[string]any:
for key, value := range typed {
key = strings.TrimSpace(key)
if key == "" {
continue
}
strValue := strings.TrimSpace(fmt.Sprint(value))
if strValue != "" {
out[key] = strValue
}
}
if _, ok := out["User-Agent"]; !ok {
out["User-Agent"] = gitLabNativeUserAgent
}
if strings.EqualFold(strings.TrimSpace(targetProvider), "openai") {
if _, ok := out["anthropic-beta"]; !ok {
out["anthropic-beta"] = gitLabContext1MBeta
}
}
if len(out) == 0 {
@@ -989,8 +1020,8 @@ func gitLabUsage(model string, translatedReq []byte, text string) (int64, int64)
return promptTokens, int64(completionCount)
}
func buildGitLabAnthropicGatewayAuth(auth *cliproxyauth.Auth) (*cliproxyauth.Auth, bool) {
if !gitLabUsesAnthropicGateway(auth) {
func buildGitLabAnthropicGatewayAuth(auth *cliproxyauth.Auth, requestedModel string) (*cliproxyauth.Auth, bool) {
if !gitLabUsesAnthropicGateway(auth, requestedModel) {
return nil, false
}
baseURL := gitLabAnthropicGatewayBaseURL(auth)
@@ -1006,7 +1037,8 @@ func buildGitLabAnthropicGatewayAuth(auth *cliproxyauth.Auth) (*cliproxyauth.Aut
}
nativeAuth.Attributes["api_key"] = token
nativeAuth.Attributes["base_url"] = baseURL
for key, value := range gitLabGatewayHeaders(auth) {
nativeAuth.Attributes["gitlab_duo_force_context_1m"] = "true"
for key, value := range gitLabGatewayHeaders(auth, "anthropic") {
if key == "" || value == "" {
continue
}
@@ -1015,8 +1047,8 @@ func buildGitLabAnthropicGatewayAuth(auth *cliproxyauth.Auth) (*cliproxyauth.Aut
return nativeAuth, true
}
func buildGitLabOpenAIGatewayAuth(auth *cliproxyauth.Auth) (*cliproxyauth.Auth, bool) {
if !gitLabUsesOpenAIGateway(auth) {
func buildGitLabOpenAIGatewayAuth(auth *cliproxyauth.Auth, requestedModel string) (*cliproxyauth.Auth, bool) {
if !gitLabUsesOpenAIGateway(auth, requestedModel) {
return nil, false
}
baseURL := gitLabOpenAIGatewayBaseURL(auth)
@@ -1032,7 +1064,7 @@ func buildGitLabOpenAIGatewayAuth(auth *cliproxyauth.Auth) (*cliproxyauth.Auth,
}
nativeAuth.Attributes["api_key"] = token
nativeAuth.Attributes["base_url"] = baseURL
for key, value := range gitLabGatewayHeaders(auth) {
for key, value := range gitLabGatewayHeaders(auth, "openai") {
if key == "" || value == "" {
continue
}
@@ -1041,34 +1073,41 @@ func buildGitLabOpenAIGatewayAuth(auth *cliproxyauth.Auth) (*cliproxyauth.Auth,
return nativeAuth, true
}
func gitLabUsesAnthropicGateway(auth *cliproxyauth.Auth) bool {
func gitLabUsesAnthropicGateway(auth *cliproxyauth.Auth, requestedModel string) bool {
if auth == nil || auth.Metadata == nil {
return false
}
provider := strings.ToLower(gitLabMetadataString(auth.Metadata, "model_provider"))
if provider == "" {
modelName := strings.ToLower(gitLabMetadataString(auth.Metadata, "model_name"))
provider = inferGitLabProviderFromModel(modelName)
}
provider := gitLabGatewayProvider(auth, requestedModel)
return provider == "anthropic" &&
gitLabMetadataString(auth.Metadata, "duo_gateway_base_url") != "" &&
gitLabMetadataString(auth.Metadata, "duo_gateway_token") != ""
}
func gitLabUsesOpenAIGateway(auth *cliproxyauth.Auth) bool {
func gitLabUsesOpenAIGateway(auth *cliproxyauth.Auth, requestedModel string) bool {
if auth == nil || auth.Metadata == nil {
return false
}
provider := strings.ToLower(gitLabMetadataString(auth.Metadata, "model_provider"))
if provider == "" {
modelName := strings.ToLower(gitLabMetadataString(auth.Metadata, "model_name"))
provider = inferGitLabProviderFromModel(modelName)
}
provider := gitLabGatewayProvider(auth, requestedModel)
return provider == "openai" &&
gitLabMetadataString(auth.Metadata, "duo_gateway_base_url") != "" &&
gitLabMetadataString(auth.Metadata, "duo_gateway_token") != ""
}
func gitLabGatewayProvider(auth *cliproxyauth.Auth, requestedModel string) string {
modelName := strings.TrimSpace(gitLabResolvedModel(auth, requestedModel))
if provider := inferGitLabProviderFromModel(modelName); provider != "" {
return provider
}
if auth == nil || auth.Metadata == nil {
return ""
}
provider := strings.ToLower(gitLabMetadataString(auth.Metadata, "model_provider"))
if provider == "" {
provider = inferGitLabProviderFromModel(gitLabMetadataString(auth.Metadata, "model_name"))
}
return provider
}
func inferGitLabProviderFromModel(model string) string {
model = strings.ToLower(strings.TrimSpace(model))
switch {
@@ -1151,6 +1190,9 @@ func gitLabBaseURL(auth *cliproxyauth.Auth) string {
func gitLabResolvedModel(auth *cliproxyauth.Auth, requested string) string {
requested = strings.TrimSpace(thinking.ParseSuffix(requested).ModelName)
if requested != "" && !strings.EqualFold(requested, "gitlab-duo") {
if mapped, ok := gitLabModelAliases[strings.ToLower(requested)]; ok && strings.TrimSpace(mapped) != "" {
return mapped
}
return requested
}
if auth != nil && auth.Metadata != nil {
@@ -1277,8 +1319,8 @@ func gitLabAuthKind(method string) string {
}
func GitLabModelsFromAuth(auth *cliproxyauth.Auth) []*registry.ModelInfo {
models := make([]*registry.ModelInfo, 0, 4)
seen := make(map[string]struct{}, 4)
models := make([]*registry.ModelInfo, 0, len(gitLabAgenticCatalog)+4)
seen := make(map[string]struct{}, len(gitLabAgenticCatalog)+4)
addModel := func(id, displayName, provider string) {
id = strings.TrimSpace(id)
if id == "" {
@@ -1302,6 +1344,18 @@ func GitLabModelsFromAuth(auth *cliproxyauth.Auth) []*registry.ModelInfo {
}
addModel("gitlab-duo", "GitLab Duo", "gitlab")
for _, model := range gitLabAgenticCatalog {
addModel(model.ID, model.DisplayName, model.Provider)
}
for alias, upstream := range gitLabModelAliases {
target := strings.TrimSpace(upstream)
displayName := "GitLab Duo Alias"
provider := strings.TrimSpace(inferGitLabProviderFromModel(target))
if provider != "" {
displayName = fmt.Sprintf("GitLab Duo Alias (%s)", provider)
}
addModel(alias, displayName, provider)
}
if auth == nil {
return models
}

View File

@@ -217,6 +217,69 @@ func TestGitLabExecutorExecuteUsesOpenAIGateway(t *testing.T) {
}
}
func TestGitLabExecutorExecuteUsesRequestedModelToSelectOpenAIGateway(t *testing.T) {
var gotAuthHeader, gotRealmHeader, gotBetaHeader, gotUserAgent string
var gotPath string
var gotModel string
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotPath = r.URL.Path
gotAuthHeader = r.Header.Get("Authorization")
gotRealmHeader = r.Header.Get("X-Gitlab-Realm")
gotBetaHeader = r.Header.Get("anthropic-beta")
gotUserAgent = r.Header.Get("User-Agent")
gotModel = gjson.GetBytes(readBody(t, r), "model").String()
w.Header().Set("Content-Type", "text/event-stream")
_, _ = w.Write([]byte("data: {\"type\":\"response.created\",\"response\":{\"id\":\"resp_1\",\"created_at\":1710000000,\"model\":\"duo-chat-gpt-5-codex\"}}\n\n"))
_, _ = w.Write([]byte("data: {\"type\":\"response.output_text.delta\",\"delta\":\"hello from explicit openai model\"}\n\n"))
_, _ = w.Write([]byte("data: {\"type\":\"response.completed\",\"response\":{\"id\":\"resp_1\",\"created_at\":1710000000,\"model\":\"duo-chat-gpt-5-codex\",\"output\":[{\"type\":\"message\",\"id\":\"msg_1\",\"role\":\"assistant\",\"content\":[{\"type\":\"output_text\",\"text\":\"hello from explicit openai model\"}]}],\"usage\":{\"input_tokens\":11,\"output_tokens\":4,\"total_tokens\":15}}}\n\n"))
}))
defer srv.Close()
exec := NewGitLabExecutor(&config.Config{})
auth := &cliproxyauth.Auth{
Provider: "gitlab",
Metadata: map[string]any{
"duo_gateway_base_url": srv.URL,
"duo_gateway_token": "gateway-token",
"duo_gateway_headers": map[string]string{"X-Gitlab-Realm": "saas"},
"model_provider": "anthropic",
"model_name": "claude-sonnet-4-5",
},
}
req := cliproxyexecutor.Request{
Model: "duo-chat-gpt-5-codex",
Payload: []byte(`{"model":"duo-chat-gpt-5-codex","messages":[{"role":"user","content":"hello"}]}`),
}
resp, err := exec.Execute(context.Background(), auth, req, cliproxyexecutor.Options{
SourceFormat: sdktranslator.FromString("openai"),
})
if err != nil {
t.Fatalf("Execute() error = %v", err)
}
if gotPath != "/v1/proxy/openai/v1/responses" {
t.Fatalf("Path = %q, want %q", gotPath, "/v1/proxy/openai/v1/responses")
}
if gotAuthHeader != "Bearer gateway-token" {
t.Fatalf("Authorization = %q, want Bearer gateway-token", gotAuthHeader)
}
if gotRealmHeader != "saas" {
t.Fatalf("X-Gitlab-Realm = %q, want saas", gotRealmHeader)
}
if gotBetaHeader != gitLabContext1MBeta {
t.Fatalf("anthropic-beta = %q, want %q", gotBetaHeader, gitLabContext1MBeta)
}
if gotUserAgent != gitLabNativeUserAgent {
t.Fatalf("User-Agent = %q, want %q", gotUserAgent, gitLabNativeUserAgent)
}
if gotModel != "duo-chat-gpt-5-codex" {
t.Fatalf("model = %q, want duo-chat-gpt-5-codex", gotModel)
}
if got := gjson.GetBytes(resp.Payload, "choices.0.message.content").String(); got != "hello from explicit openai model" {
t.Fatalf("expected explicit openai model response, got %q payload=%s", got, string(resp.Payload))
}
}
func TestGitLabExecutorRefreshUpdatesMetadata(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
switch r.URL.Path {
@@ -251,13 +314,12 @@ func TestGitLabExecutorRefreshUpdatesMetadata(t *testing.T) {
ID: "gitlab-auth.json",
Provider: "gitlab",
Metadata: map[string]any{
"base_url": srv.URL,
"access_token": "oauth-access",
"refresh_token": "oauth-refresh",
"oauth_client_id": "client-id",
"oauth_client_secret": "client-secret",
"auth_method": "oauth",
"oauth_expires_at": "2000-01-01T00:00:00Z",
"base_url": srv.URL,
"access_token": "oauth-access",
"refresh_token": "oauth-refresh",
"oauth_client_id": "client-id",
"auth_method": "oauth",
"oauth_expires_at": "2000-01-01T00:00:00Z",
},
}
@@ -397,9 +459,11 @@ func TestGitLabExecutorExecuteStreamFallsBackToSyntheticChat(t *testing.T) {
}
func TestGitLabExecutorExecuteStreamUsesAnthropicGateway(t *testing.T) {
var gotPath string
var gotPath, gotBetaHeader, gotUserAgent string
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
gotPath = r.URL.Path
gotBetaHeader = r.Header.Get("Anthropic-Beta")
gotUserAgent = r.Header.Get("User-Agent")
w.Header().Set("Content-Type", "text/event-stream")
_, _ = w.Write([]byte("event: message_start\n"))
_, _ = w.Write([]byte("data: {\"type\":\"message_start\",\"message\":{\"id\":\"msg_1\",\"type\":\"message\",\"role\":\"assistant\",\"model\":\"claude-sonnet-4-5\",\"content\":[],\"stop_reason\":null,\"stop_sequence\":null,\"usage\":{\"input_tokens\":0,\"output_tokens\":0}}}\n\n"))
@@ -441,6 +505,12 @@ func TestGitLabExecutorExecuteStreamUsesAnthropicGateway(t *testing.T) {
if gotPath != "/v1/proxy/anthropic/v1/messages" {
t.Fatalf("Path = %q, want %q", gotPath, "/v1/proxy/anthropic/v1/messages")
}
if !strings.Contains(gotBetaHeader, gitLabContext1MBeta) {
t.Fatalf("Anthropic-Beta = %q, want to contain %q", gotBetaHeader, gitLabContext1MBeta)
}
if gotUserAgent != gitLabNativeUserAgent {
t.Fatalf("User-Agent = %q, want %q", gotUserAgent, gitLabNativeUserAgent)
}
if !strings.Contains(strings.Join(lines, "\n"), "hello from gateway") {
t.Fatalf("expected anthropic gateway stream, got %q", strings.Join(lines, "\n"))
}

View File

@@ -1,11 +1,11 @@
package executor
package helps
import (
"sync"
"time"
)
type codexCache struct {
type CodexCache struct {
ID string
Expire time.Time
}
@@ -13,7 +13,7 @@ type codexCache struct {
// codexCacheMap stores prompt cache IDs keyed by model+user_id.
// Protected by codexCacheMu. Entries expire after 1 hour.
var (
codexCacheMap = make(map[string]codexCache)
codexCacheMap = make(map[string]CodexCache)
codexCacheMu sync.RWMutex
)
@@ -50,20 +50,20 @@ func purgeExpiredCodexCache() {
}
}
// getCodexCache retrieves a cached entry, returning ok=false if not found or expired.
func getCodexCache(key string) (codexCache, bool) {
// GetCodexCache retrieves a cached entry, returning ok=false if not found or expired.
func GetCodexCache(key string) (CodexCache, bool) {
codexCacheCleanupOnce.Do(startCodexCacheCleanup)
codexCacheMu.RLock()
cache, ok := codexCacheMap[key]
codexCacheMu.RUnlock()
if !ok || cache.Expire.Before(time.Now()) {
return codexCache{}, false
return CodexCache{}, false
}
return cache, true
}
// setCodexCache stores a cache entry.
func setCodexCache(key string, cache codexCache) {
// SetCodexCache stores a cache entry.
func SetCodexCache(key string, cache CodexCache) {
codexCacheCleanupOnce.Do(startCodexCacheCleanup)
codexCacheMu.Lock()
codexCacheMap[key] = cache

View File

@@ -0,0 +1,38 @@
package helps
import "github.com/tidwall/gjson"
var defaultClaudeBuiltinToolNames = []string{
"web_search",
"code_execution",
"text_editor",
"computer",
}
func newClaudeBuiltinToolRegistry() map[string]bool {
registry := make(map[string]bool, len(defaultClaudeBuiltinToolNames))
for _, name := range defaultClaudeBuiltinToolNames {
registry[name] = true
}
return registry
}
func AugmentClaudeBuiltinToolRegistry(body []byte, registry map[string]bool) map[string]bool {
if registry == nil {
registry = newClaudeBuiltinToolRegistry()
}
tools := gjson.GetBytes(body, "tools")
if !tools.Exists() || !tools.IsArray() {
return registry
}
tools.ForEach(func(_, tool gjson.Result) bool {
if tool.Get("type").String() == "" {
return true
}
if name := tool.Get("name").String(); name != "" {
registry[name] = true
}
return true
})
return registry
}

View File

@@ -0,0 +1,32 @@
package helps
import "testing"
func TestClaudeBuiltinToolRegistry_DefaultSeedFallback(t *testing.T) {
registry := AugmentClaudeBuiltinToolRegistry(nil, nil)
for _, name := range defaultClaudeBuiltinToolNames {
if !registry[name] {
t.Fatalf("default builtin %q missing from fallback registry", name)
}
}
}
func TestClaudeBuiltinToolRegistry_AugmentsTypedBuiltinsFromBody(t *testing.T) {
registry := AugmentClaudeBuiltinToolRegistry([]byte(`{
"tools": [
{"type": "web_search_20250305", "name": "web_search"},
{"type": "custom_builtin_20250401", "name": "special_builtin"},
{"name": "Read"}
]
}`), nil)
if !registry["web_search"] {
t.Fatal("expected default typed builtin web_search in registry")
}
if !registry["special_builtin"] {
t.Fatal("expected typed builtin from body to be added to registry")
}
if registry["Read"] {
t.Fatal("expected untyped custom tool to stay out of builtin registry")
}
}

Some files were not shown because too many files have changed in this diff Show More