Commit Graph

91 Commits

Author SHA1 Message Date
Luis Pater
cc8dc7f62c Merge branch 'main' into dev 2026-03-05 23:13:21 +08:00
Luis Pater
a3846ea513 Merge pull request #1870 from sususu98/fix/remove-instructions-restore
cleanup(translator): remove leftover instructions restore in codex responses
2026-03-05 23:12:31 +08:00
Luis Pater
0e6bb076e9 fix(translator): comment out service_tier removal from OpenAI response processing 2026-03-05 22:49:38 +08:00
sususu98
68a6cabf8b style: blank unused params in codex responses translator 2026-03-05 16:42:48 +08:00
sususu98
ac0e387da1 cleanup(translator): remove leftover instructions restore in codex responses
The instructions restore logic was originally needed when the proxy
injected custom instructions (per-model system prompts) into requests.
Since ac802a46 removed the injection system, the proxy no longer
modifies instructions before forwarding. The upstream response's
instructions field now matches the client's original value, making
the restore a no-op.

Also removes unused sjson import.

Closes router-for-me/CLIProxyAPI#1868
2026-03-05 16:34:55 +08:00
Luis Pater
5850492a93 Fixed: #1548
test(translator): add unit tests for fallback logic in `ConvertCodexResponseToOpenAI` model assignment
2026-03-05 12:11:54 +08:00
sususu98
d26ad8224d fix(translator): strip defer_loading from Claude tool declarations in Codex and Gemini translators
Claude's Tool Search feature (advanced-tool-use-2025-11-20 beta) adds
defer_loading field to tool definitions. When proxying Claude requests
to Codex or Gemini, this unknown field causes 400 errors upstream.

Strip defer_loading (and cache_control where missing) in all three
Claude-to-upstream translation paths:
- codex/claude: defer_loading + cache_control
- gemini-cli/claude: defer_loading
- gemini/claude: defer_loading

Fixes #1725, Fixes #1375
2026-03-04 14:21:30 +08:00
Luis Pater
9f95b31158 **fix(translator): enhance handling of mixed output content in Claude requests** 2026-03-03 21:49:41 +08:00
hkfires
ce87714ef1 feat(thinking): normalize effort levels in adaptive thinking requests to prevent validation errors 2026-03-03 15:10:47 +08:00
hkfires
d2e5857b82 feat(thinking): enhance adaptive thinking support across models and update test cases 2026-03-03 13:00:24 +08:00
hkfires
c44793789b feat(thinking): add adaptive thinking support for Claude models
Add support for Claude's "adaptive" and "auto" thinking modes using `output_config.effort`. Introduce support for new effort level "max" in adaptive thinking. Update thinking logic, validate model capabilities, and extend converters and handling to ensure compatibility with adaptive modes. Adjust static model data with supported levels and refine handling across translators and executors.
2026-03-03 09:05:31 +08:00
hkfires
914db94e79 refactor(headers): streamline User-Agent handling and introduce GeminiCLI versioning 2026-03-02 13:04:30 +08:00
hkfires
b148820c35 fix(translator): handle Claude thinking type "auto" like adaptive 2026-03-01 10:30:19 +08:00
Luis Pater
a6ce5f36e6 Fixed: #1758
fix(codex): filter billing headers from system result text and update template logic
2026-03-01 01:45:35 +08:00
Luis Pater
d24ea4ce2a Merge pull request #1664 from ciberponk/pr/responses-compaction-compat
feat: add codex responses compatibility for compaction payloads
2026-02-25 01:21:59 +08:00
Luis Pater
c3e12c5e58 Merge pull request #1654 from alexey-yanchenko/feature/pass-file-inputs
Pass file input from /chat/completions and /responses to codex and claude
2026-02-24 05:53:11 +08:00
fan
afc8a0f9be refactor: simplify context_management compatibility handling 2026-02-21 22:20:48 +08:00
ciberponk
d693d7993b feat: support responses compaction payload compatibility for codex translator 2026-02-21 12:56:10 +08:00
Alexey Yanchenko
0cbfe7f457 Pass file input from /chat/completions and /responses to codex and claude 2026-02-20 10:25:44 +07:00
Kirill Turanskiy
1cc21cc45b fix: prevent duplicate function call arguments when delta events precede done
Non-spark codex models (gpt-5.3-codex, gpt-5.2-codex) stream function call
arguments via multiple delta events followed by a done event. The done handler
unconditionally emitted the full arguments, duplicating what deltas already
streamed. This produced invalid double JSON that Claude Code couldn't parse,
causing tool calls to fail with missing parameters and infinite retry loops.

Add HasReceivedArgumentsDelta flag to track whether delta events were received.
The done handler now only emits arguments when no deltas preceded it (spark
models), while delta-based streaming continues to work for non-spark models.
2026-02-19 23:18:14 +03:00
Kirill Turanskiy
07cf616e2b fix: handle response.function_call_arguments.done in codex→claude streaming translator
Some Codex models (e.g. gpt-5.3-codex-spark) send function call arguments
in a single "done" event without preceding "delta" events. The streaming
translator only handled "delta" events, causing tool call arguments to be
lost — resulting in empty tool inputs and infinite retry loops in clients
like Claude Code.

Emit the full arguments from the "done" event as a single input_json_delta
so downstream clients receive the complete tool input.
2026-02-19 23:18:14 +03:00
Kirill Turanskiy
5fa23c7f41 fix: handle tool call argument streaming in Codex→OpenAI translator
The OpenAI Chat Completions translator was silently dropping
response.function_call_arguments.delta and
response.function_call_arguments.done Codex SSE events, meaning
tool call arguments were never streamed incrementally to clients.

Add proper handling mirroring the proven Claude translator pattern:

- response.output_item.added: announce tool call (id, name, empty args)
- response.function_call_arguments.delta: stream argument chunks
- response.function_call_arguments.done: emit full args if no deltas
- response.output_item.done: defensive fallback for backward compat

State tracking via HasReceivedArgumentsDelta and HasToolCallAnnounced
ensures no duplicate argument emission and correct behavior for models
like codex-spark that skip delta events entirely.
2026-02-18 19:09:05 +03:00
Alexey Yanchenko
63d4de5eea Pass cache usage from codex to openai chat completions 2026-02-15 12:04:15 +07:00
Luis Pater
a146c6c0aa Merge pull request #1523 from xxddff/feature/removeUserField
fix(codex): remove unsupported 'user' field from /v1/responses payload
2026-02-11 20:38:16 +08:00
xxddff
bb9fe52f1e Update internal/translator/codex/openai/responses/codex_openai-responses_request_test.go
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-02-10 18:24:58 +09:00
xxddff
afe4c1bfb7 更新internal/translator/codex/openai/responses/codex_openai-responses_request.go
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2026-02-10 18:24:26 +09:00
xxddff
865af9f19e Implement test for user field deletion
Add test to verify deletion of user field in response
2026-02-10 17:38:49 +09:00
xxddff
2b97cb98b5 Delete 'user' field from raw JSON
Remove the 'user' field from the raw JSON as requested.
2026-02-10 17:35:54 +09:00
hkfires
938a799263 feat(translator): support Claude thinking type adaptive 2026-02-10 16:20:32 +08:00
Luis Pater
80b5e79e75 fix(translator): normalize and restrict stop_reason/finish_reason usage
- Standardized the handling of `stop_reason` and `finish_reason` across Codex and Gemini responses.
- Restricted pass-through of specific reasons (`max_tokens`, `stop`) for consistency.
- Enhanced fallback logic for undefined reasons.
2026-02-07 02:07:51 +08:00
Luis Pater
a5a25dec57 refactor(translator, executor): remove redundant bytes.Clone calls for improved performance
- Replaced all instances of `bytes.Clone` with direct references to enhance efficiency.
- Simplified payload handling across executors and translators by eliminating unnecessary data duplication.
2026-02-06 03:26:29 +08:00
neavo
6c65fdf54b fix(gemini): support snake_case thinking config fields from Python SDK
Google official Gemini Python SDK sends thinking_level, thinking_budget,
and include_thoughts (snake_case) instead of thinkingLevel, thinkingBudget,
and includeThoughts (camelCase). This caused thinking configuration to be
ignored when using Python SDK.

Changes:
- Extract layer: extractGeminiConfig now reads snake_case as fallback
- Apply layer: Gemini/CLI/Antigravity appliers clean up snake_case fields
- Translator layer: Gemini->OpenAI/Claude/Codex translators support fallback
- Tests: Added 4 test cases for snake_case field coverage

Fixes #1426
2026-02-04 21:12:47 +08:00
Luis Pater
d885b81f23 Fixed: #1403
fix(translator): handle "input" field transformation for OpenAI responses
2026-02-03 21:49:30 +08:00
hkfires
354f6582b2 fix(codex): convert system role to developer for codex input 2026-02-01 15:37:37 +08:00
hkfires
fe3ebe3532 docs(translator): update Codex Claude request transform docs 2026-02-01 14:55:41 +08:00
hkfires
ac802a4646 refactor(codex): remove codex instructions injection support 2026-02-01 14:33:31 +08:00
Luis Pater
f99cddf97f fix(translator): handle stop_reason and MAX_TOKENS for Claude responses 2026-01-31 04:03:01 +08:00
hkfires
cf9daf470c feat(translator): report cached token usage in Claude output 2026-01-19 11:23:44 +08:00
hkfires
d5ef4a6d15 refactor(translator): remove registry model lookups from thinking config conversions 2026-01-18 10:30:14 +08:00
Luis Pater
65b4e1ec6c feat(codex): enable instruction toggling and update role terminology
- Added conditional logic for Codex instruction injection based on configuration.
- Updated role terminology from "user" to "developer" for better alignment with context.
2026-01-17 04:12:29 +08:00
Luis Pater
6600d58ba2 feat(codex): enhance input transformation and remove unused safety_identifier field
- Added logic to transform `inputResults` into structured JSON for improved processing.
- Removed redundant `safety_identifier` field in executor payload to streamline requests.
2026-01-16 19:59:01 +08:00
hkfires
ed8b0f25ee fix(thinking): use LookupModelInfo for model data 2026-01-15 13:06:41 +08:00
hkfires
0b06d637e7 refactor: improve thinking logic 2026-01-15 13:06:39 +08:00
hkfires
220ca45f74 fix(codex): only override instructions when upstream provides them 2026-01-11 15:52:21 +08:00
hkfires
70a82d80ac fix(codex): only override instructions in responses for OpenCode UA 2026-01-11 15:19:37 +08:00
hkfires
ac626111ac feat(codex): add OpenCode instructions based on user agent 2026-01-11 13:36:35 +08:00
Luis Pater
d47b7dc79a refactor(response): enhance parameter handling for Codex to Claude conversion 2026-01-09 05:20:19 +08:00
Luis Pater
3d01b3cfe8 Merge pull request #553 from XInTheDark/fix/builtin-tools-web-search
fix(translator): preserve built-in tools (web_search) to Responses API
2026-01-09 04:40:13 +08:00
Luis Pater
a86d501dc2 refactor: replace json.Marshal and json.Unmarshal with sjson and gjson
Optimized the handling of JSON serialization and deserialization by replacing redundant `json.Marshal` and `json.Unmarshal` calls with `sjson` and `gjson`. Introduced a `marshalJSONValue` utility for compact JSON encoding, improving performance and code simplicity. Removed unused `encoding/json` imports.
2025-12-22 11:44:06 +08:00
hkfires
28a428ae2f fix(thinking): align budget effort mapping across translators
Unify thinking budget-to-effort conversion in a shared helper, handle disabled/default thinking cases in translators, adjust zero-budget mapping, and drop the old OpenAI-specific helper with updated tests.
2025-12-16 18:34:43 +08:00