Commit Graph

2831 Commits

Author SHA1 Message Date
Luis Pater
98509f615c Merge pull request #485 from kunish/fix/copilot-premium-request-inflation
fix(copilot): reduce premium request inflation, enable thinking, and use dynamic API limits
v6.9.13-1
2026-04-04 02:19:56 +08:00
Luis Pater
e7a66ae504 Merge branch 'router-for-me:main' into main v6.9.13-0 2026-04-04 02:18:06 +08:00
Luis Pater
754b126944 fix(executor): remove commented-out code in QwenExecutor 2026-04-04 02:14:48 +08:00
Luis Pater
ae37ccffbf Merge pull request #2520 from Arronlong/main
fix:qwen invalid_parameter_error
2026-04-04 02:13:09 +08:00
Luis Pater
42c062bb5b Merge pull request #2509 from adamhelfgott/fix-claude-thinking-temperature
Normalize Claude temperature when thinking is enabled
2026-04-03 23:55:50 +08:00
kunish
87bf0b73d5 fix(copilot): use dynamic API limits to prevent prompt token overflow
The Copilot API enforces per-account prompt token limits (128K individual,
168K business) that differ from the static 200K context length advertised
by the proxy. This mismatch caused Claude Code to accumulate context
beyond the actual limit, triggering "prompt token count exceeds the limit
of 128000" errors.

Changes:
- Extract max_prompt_tokens and max_output_tokens from the Copilot
  /models API response (capabilities.limits) and use them as the
  authoritative ContextLength and MaxCompletionTokens values
- Add CopilotModelLimits struct and Limits() helper to parse limits
  from the existing Capabilities map
- Fix GitLab Duo context-1m beta header not being set when routing
  through the Anthropic gateway (gitlab_duo_force_context_1m attr
  was set but only gin headers were checked)
- Fix flaky parallel tests that shared global model registry state
2026-04-03 23:54:17 +08:00
Luis Pater
f389667ec3 Merge pull request #2513 from lonr-6/codex/fix-ws-custom-tool-repair-v2
fix: repair responses websocket custom tool call pairing
2026-04-03 23:45:38 +08:00
Arronlong
29dba0399b Comment out system message check in Qwen executor
fix qwen invalid_parameter_error
2026-04-03 23:07:33 +08:00
Luis Pater
a824e7cd0b feat(models): add GPT-5.3, GPT-5.4, and GPT-5.4-mini with enhanced "thinking" levels 2026-04-03 23:05:10 +08:00
Luis Pater
140faef7dc Merge branch 'router-for-me:main' into main v6.9.12-0 2026-04-03 21:48:23 +08:00
Luis Pater
adb580b344 feat(security): add configuration to toggle Gemini CLI endpoint access
Closes: #2445
2026-04-03 21:46:49 +08:00
Luis Pater
06405f2129 fix(security): enforce stricter localhost validation for GeminiCLIAPIHandler
Closes: #2445
2026-04-03 21:22:03 +08:00
kunish
b849bf79d6 fix(copilot): address code review — SSE reasoning, multi-choice, agent detection
- Strip SSE `data:` prefix before normalizing reasoning_text→reasoning_content
  in streaming mode; re-wrap afterward for the translator
- Iterate all choices in normalizeGitHubCopilotReasoningField (not just
  choices[0]) to support n>1 requests
- Remove over-broad tool-role fallback in isAgentInitiated that scanned
  all messages for role:"tool", aligning with opencode's approach of only
  detecting active tool loops — genuine user follow-ups after tool use are
  no longer mis-classified as agent-initiated
- Add 5 reasoning normalization tests; update 2 X-Initiator tests to match
  refined semantics
2026-04-03 20:51:19 +08:00
kunish
59af2c57b1 fix(copilot): reduce premium request inflation and enable thinking
This commit addresses three issues with Claude Code through GitHub
Copilot:

1. **Premium request inflation**: Responses API requests were missing
   Openai-Intent headers and proper defaults, causing Copilot to bill
   each tool-loop continuation as a new premium request. Fixed by adding
   isAgentInitiated() heuristic (checks for tool_result content or
   preceding assistant tool_use), applying Responses API defaults
   (store, include, reasoning.summary), and local tiktoken-based token
   counting to avoid extra API calls.

2. **Context overflow**: Claude Code's modelSupports1M() hardcodes
   opus-4-6 as 1M-capable, but Copilot only supports ~128K-200K.
   Fixed by stripping the context-1m-2025-08-07 beta from translated
   request bodies. Also forwards response headers in non-streaming
   Execute() and registers the GET /copilot-quota management API route.

3. **Thinking not working**: Add ThinkingSupport with level-based
   reasoning to Claude models in the static definitions. Normalize
   Copilot's non-standard 'reasoning_text' response field to
   'reasoning_content' before passing to the SDK translator. Use
   caller-provided context in CountTokens instead of Background().
2026-04-03 20:24:30 +08:00
Kai Wang
d1fd2c4ad4 fix: repair websocket custom tool calls 2026-04-03 17:11:44 +08:00
Kai Wang
b6c6379bfa fix: repair websocket custom tool calls 2026-04-03 17:11:42 +08:00
Kai Wang
8f0e66b72e fix: repair websocket custom tool calls 2026-04-03 17:11:41 +08:00
Adam Helfgott
f63cf6ff7a Normalize Claude temperature for thinking 2026-04-03 03:45:51 -04:00
Luis Pater
d2419ed49d feat(executor): ensure default system message in QwenExecutor payload 2026-04-03 11:18:48 +08:00
Luis Pater
516d22c695 Merge pull request #484 from Ve-ria/main
更新CodeBuddy CN的模型列表
v6.9.10-1
2026-04-03 11:10:32 +08:00
rensumo
73cda6e836 Update CodeBuddy DeepSeek model description 2026-04-03 11:03:33 +08:00
rensumo
0805989ee5 更新CodeBuddy CN的模型列表 2026-04-03 10:59:27 +08:00
Luis Pater
75da02af55 Merge branch 'router-for-me:main' into main v6.9.10-0 2026-04-02 22:34:47 +08:00
Luis Pater
ab9ebea592 Merge PR #2474
# Conflicts:
#	internal/api/modules/amp/response_rewriter.go
#	internal/api/modules/amp/response_rewriter_test.go
2026-04-02 22:31:12 +08:00
Luis Pater
7ee37ee4b9 feat: add /healthz endpoint and test coverage for health check
Closes: #2493
2026-04-02 21:56:27 +08:00
Luis Pater
837afffb31 docs: remove README_JA.md and clean up related links from README files 2026-04-02 21:37:47 +08:00
Luis Pater
03a1bac898 Merge upstream v6.9.9 (PR #483) v6.9.9-0 2026-04-02 21:31:21 +08:00
Luis Pater
3171d524f0 docs: fix duplicated ProxyPal entry in README files 2026-04-02 21:22:40 +08:00
Luis Pater
3e78a8d500 Merge branch 'main' into dev 2026-04-02 21:21:26 +08:00
Luis Pater
fcba912cc4 Merge pull request #2492 from davidwushi1145/main
fix(responses): reassemble split SSE event/data frames before streaming
2026-04-02 21:20:31 +08:00
Luis Pater
7170eeea5f Merge pull request #2454 from buddingnewinsights/add-proxypal-to-readme
docs: add ProxyPal to "Who is with us?" section
2026-04-02 21:18:22 +08:00
Luis Pater
e3eb048c7a Merge pull request #2489 from Soein/upstream-pr
fix: 增强 Claude 反代检测对抗能力
2026-04-02 21:16:58 +08:00
Luis Pater
a59e92435b Merge pull request #2490 from router-for-me/logs
Refactor websocket logging and error handling
2026-04-02 20:47:31 +08:00
davidwushi1145
108895fc04 Harden Responses SSE framing against partial chunk boundaries
Follow-up review found two real framing hazards in the handler-layer
framer: it could flush a partial `data:` payload before the JSON was
complete, and it could inject an extra newline before chunks that
already began with `\n`/`\r\n`. This commit tightens the framer so it
only emits undelimited events when the buffered `data:` payload is
already valid JSON (or `[DONE]`), skips newline injection for chunks
that already start with a line break, and avoids the heavier
`bytes.Split` path while scanning SSE fields.

The regression suite now covers split `data:` payload chunks,
newline-prefixed chunks, and dropping incomplete trailing data on
flush, so the original Responses fix remains intact while the review
concerns are explicitly locked down.

Constraint: Keep the follow-up limited to handler-layer framing and tests
Rejected: Ignore the review and rely on current executor chunk shapes | leaves partial data payload corruption possible
Rejected: Build a fully generic SSE parser | wider change than needed for the identified risks
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Do not emit undelimited Responses SSE events unless buffered `data:` content is already complete and valid
Tested: /tmp/go1.26.1/go/bin/go test ./sdk/api/handlers/openai -count=1
Tested: /tmp/go1.26.1/go/bin/go test ./sdk/api/handlers -count=1
Tested: /tmp/go1.26.1/go/bin/go vet ./sdk/api/handlers/...
Not-tested: Full repository test suite outside sdk/api/handlers packages
2026-04-02 20:39:49 +08:00
davidwushi1145
abc293c642 Prevent malformed Responses SSE frames from breaking stream clients
Line-oriented upstream executors can emit `event:` and `data:` as
separate chunks, but the Responses handler had started terminating
each incoming chunk as a full SSE event. That split `response.created`
into an empty event plus a later data block, which broke downstream
clients like OpenClaw.

This keeps the fix in the handler layer: a small stateful framer now
buffers standalone `event:` lines until the matching `data:` arrives,
preserves already-framed events, and ignores delimiter-only leftovers.
The regression suite now covers split event/data framing, full-event
passthrough, terminal errors, and the bootstrap path that forwards
line-oriented openai-response streams from non-Codex executors too.

Constraint: Keep the fix localized to Responses handler framing instead of patching every executor
Rejected: Revert to v6.9.7 chunk writing | would reintroduce data-only framing regressions
Rejected: Patch each line-oriented executor separately | duplicates fragile SSE assembly logic
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Do not assume incoming Responses stream chunks are already complete SSE events; preserve handler-layer reassembly for split `event:`/`data:` inputs
Tested: /tmp/go1.26.1/go/bin/go test ./sdk/api/handlers/openai -count=1
Tested: /tmp/go1.26.1/go/bin/go test ./sdk/api/handlers -count=1
Tested: /tmp/go1.26.1/go test ./sdk/api/handlers/... -count=1
Tested: /tmp/go1.26.1/go/bin/go vet ./sdk/api/handlers/...
Tested: Temporary patched server on 127.0.0.1:18317 -> /v1/models 200, /v1/responses non-stream 200, /v1/responses stream emitted combined `event:` + `data:` frames
Not-tested: Full repository test suite outside sdk/api/handlers packages
2026-04-02 20:26:42 +08:00
pzy
bb44671845 fix: 修复反代检测对抗的 3 个问题
- computeFingerprint 使用 rune 索引替代字节索引,修复多字节字符指纹不匹配
- utls Chrome TLS 指纹仅对 Anthropic 官方域名生效,自定义 base_url 走标准 transport
- IPv6 地址使用 net.JoinHostPort 正确拼接端口
2026-04-02 19:12:55 +08:00
Luis Pater
09e480036a feat(auth): add support for managing custom headers in auth files
Closes #2457
2026-04-02 19:11:09 +08:00
pzy
249f969110 fix: Claude API 请求使用 utls Chrome TLS 指纹
Claude executor 的 API 请求之前使用 Go 标准库 crypto/tls,JA3 指纹
与真实 Claude Code(Bun/BoringSSL)不匹配,可被 Cloudflare 识别。

- 新增 helps/utls_client.go,封装 utls Chrome 指纹 + HTTP/2 + 代理支持
- Claude executor 的 4 处 NewProxyAwareHTTPClient 替换为 NewUtlsHTTPClient
- 其他 executor(Gemini/Codex/iFlow 等)不受影响,仍用标准 TLS
- 非 HTTPS 请求自动回退到标准 transport
2026-04-02 19:09:56 +08:00
hkfires
4f8acec2d8 refactor(logging): centralize websocket handshake recording 2026-04-02 18:39:32 +08:00
hkfires
34339f61ee Refactor websocket logging and error handling
- Introduced new logging functions for websocket requests, handshakes, errors, and responses in `logging_helpers.go`.
- Updated `CodexWebsocketsExecutor` to utilize the new logging functions for improved clarity and consistency in websocket operations.
- Modified the handling of websocket upgrade rejections to log relevant metadata.
- Changed the request body key to a timeline body key in `openai_responses_websocket.go` to better reflect its purpose.
- Enhanced tests to verify the correct logging of websocket events and responses, including disconnect events and error handling scenarios.
2026-04-02 17:30:51 +08:00
pzy
4045378cb4 fix: 增强 Claude 反代检测对抗能力
基于 Claude Code v2.1.88 源码分析,修复多个可被 Anthropic 检测的差距:

- 实现消息指纹算法(SHA256 盐值 + 字符索引),替代随机 buildHash
- billing header cc_version 从设备 profile 动态取版本号,不再硬编码
- billing header cc_entrypoint 从客户端 UA 解析,支持 cli/vscode/local-agent
- billing header 新增 cc_workload 支持(通过 X-CPA-Claude-Workload 头传入)
- 新增 X-Claude-Code-Session-Id 头(每 apiKey 缓存 UUID,TTL=1h)
- 新增 x-client-request-id 头(仅 api.anthropic.com,每请求 UUID)
- 补全 4 个缺失的 beta flags(structured-outputs/fast-mode/redact-thinking/token-efficient-tools)
- OAuth scope 对齐 Claude Code 2.1.88(移除 org:create_api_key,添加 sessions/mcp/file_upload)
- Anthropic-Dangerous-Direct-Browser-Access 仅在 API key 模式发送
- 响应头网关指纹清洗(剥离 litellm/helicone/portkey/cloudflare/kong/braintrust 前缀头)
2026-04-02 15:55:22 +08:00
Luis Pater
2df35449fe Fix executor compat helpers v6.9.8-0 2026-04-02 12:20:12 +08:00
Luis Pater
c744179645 Merge PR #479 2026-04-02 12:15:33 +08:00
Luis Pater
9720b03a6b Merge pull request #477 from ben-vargas/plus-main
fix(copilot): route Gemini preview models to chat endpoint and correct context lengths
2026-04-02 11:36:51 +08:00
Luis Pater
f2c0f3d325 Merge pull request #476 from hungthai1401/fix/ghc-gpt54mini
Fix GitHub Copilot gpt-5.4-mini endpoint routing
2026-04-02 11:36:26 +08:00
Luis Pater
4f99bc54f1 test: update codex header expectations 2026-04-02 11:19:37 +08:00
Luis Pater
913f4a9c5f test: fix executor tests after helpers refactor 2026-04-02 11:12:30 +08:00
Luis Pater
25d1c18a3f fix: scope experimental cch signing to billing header 2026-04-02 11:03:11 +08:00
Luis Pater
d09dd4d0b2 Merge commit '15c2f274ea690c9a7c9db22f9f454af869db5375' into dev 2026-04-02 10:59:54 +08:00
Luis Pater
474fb042da Merge pull request #2476 from router-for-me/cherry-pick/pr-2438-to-dev
Cherry-pick PR #2438 onto dev
2026-04-02 10:36:50 +08:00