CLIProxyAPIPlus

mirror of https://github.com/router-for-me/CLIProxyAPIPlus.git synced 2026-04-12 17:24:13 +00:00

Author	SHA1	Message	Date
VooDisss	511b8a992e	fix(codex): restore prompt cache continuity for Codex requests Prompt caching on Codex was not reliably reusable through the proxy because repeated chat-completions requests could reach the upstream without the same continuity envelope. In practice this showed up most clearly with OpenCode, where cache reads worked in the reference client but not through CLIProxyAPI, although the root cause is broader than OpenCode itself. The proxy was breaking continuity in several ways: executor-layer Codex request preparation stripped prompt_cache_retention, chat-completions translation did not preserve that field, continuity headers used a different shape than the working client behavior, and OpenAI-style Codex requests could be sent without a stable prompt_cache_key. When that happened, session_id fell back to a fresh random value per request, so upstream Codex treated repeated requests as unrelated turns instead of as part of the same cacheable context. This change fixes that by preserving caller-provided prompt_cache_retention on Codex execution paths, preserving prompt_cache_retention when translating OpenAI chat-completions requests to Codex, aligning Codex continuity headers to session_id, and introducing an explicit Codex continuity policy that derives a stable continuity key from the best available signal. The resolution order prefers an explicit prompt_cache_key, then execution session metadata, then an explicit idempotency key, then stable request-affinity metadata, then a stable client-principal hash, and finally a stable auth-ID hash when no better continuity signal exists. The same continuity key is applied to both prompt_cache_key in the request body and session_id in the request headers so repeated requests reuse the same upstream cache/session identity. The auth manager also keeps auth selection sticky for repeated request sequences, preventing otherwise-equivalent Codex requests from drifting across different upstream auth contexts and accidentally breaking cache reuse. To keep the implementation maintainable, the continuity resolution and diagnostics are centralized in a dedicated Codex continuity helper instead of being scattered across executor flow code. Regression coverage now verifies retention preservation, continuity-key precedence, stable auth-ID fallback, websocket parity, translator preservation, and auth-affinity behavior. Manual validation confirmed prompt cache reads now occur through CLIProxyAPI when using Codex via OpenCode, and the fix should also benefit other clients that rely on stable repeated Codex request continuity.	2026-03-27 17:49:29 +02:00
Luis Pater	d475aaba96	Fixed: #2274 fix(translator): omit null content fields in Codex OpenAI tool call responses	2026-03-24 01:00:57 +08:00
Luis Pater	97c0487add	Merge pull request #2223 from cnrpman/fix/codex-responses-web-search-preview-compat fix: normalize web_search_preview for codex responses	2026-03-24 00:25:37 +08:00
Junyi Du	d1df70d02f	chore: add codex builtin tool normalization logging	2026-03-20 14:08:37 +08:00
Luis Pater	2bd646ad70	refactor: replace `sjson.Set` usage with `sjson.SetBytes` to optimize mutable JSON transformations	2026-03-19 17:58:54 +08:00
Junyi Du	793840cdb4	fix: cover dated and nested codex web search aliases	2026-03-19 03:41:12 +08:00
Junyi Du	8f421de532	fix: handle sjson errors in codex tool normalization	2026-03-19 03:36:06 +08:00
Junyi Du	be2dd60ee7	fix: normalize web_search_preview for codex responses	2026-03-19 03:23:14 +08:00
Darley	9c6c3612a8	fix(claude): read disable_parallel_tool_use from tool_choice	2026-03-17 19:35:41 +08:00
Darley	19e1a4447a	fix(claude): honor disable_parallel_tool_use	2026-03-17 19:17:41 +08:00
Muran-prog	0b94d36c4a	test: use exact match for tool name assertion Address review feedback - drop function.name fallback and strings.Contains in favor of direct == comparison.	2026-03-14 21:45:28 +02:00
Muran-prog	c8cee6a209	fix: skip empty assistant message in tool call translation (#2132 ) When assistant has tool_calls but no text content, the translator emitted an empty message into the Responses API input array before function_call items. The API then couldn't match function_call_output to its function_call by call_id, returning: No tool output found for function call ... Only emit assistant messages that have content parts. Tool-call-only messages now produce function_call items directly. Added 9 tests for tool calling translation covering single/parallel calls, multi-turn conversations, name shortening, empty content edge cases, and call_id integrity.	2026-03-14 21:01:01 +02:00
Luis Pater	7b7b258c38	Fixed: #2022 test(translator): add tests for handling Claude system messages as string and array	2026-03-11 10:47:33 +08:00
Luis Pater	38277c1ea6	Merge pull request #1875 from woqiqishi/fix/tool-use-id-sanitize fix: sanitize tool_use.id to comply with Claude API regex ^[a-zA-Z0-9_-]+$	2026-03-07 22:06:36 +08:00
Luis Pater	2695a99623	fix(translator): conditionally remove `service_tier` from OpenAI response processing	2026-03-06 11:07:22 +08:00
Xu Hong	553d6f50ea	fix: sanitize tool_use.id to comply with Claude API regex ^[a-zA-Z0-9_-]+$ Add util.SanitizeClaudeToolID() to replace non-conforming characters in tool_use.id fields across all five response translators (gemini, codex, openai, antigravity, gemini-cli). Upstream tool names may contain dots or other special characters (e.g. "fs.readFile") that violate Claude's ID validation regex. The sanitizer replaces such characters with underscores and provides a generated fallback for empty IDs. Fixes #1872, Fixes #1849 Made-with: Cursor	2026-03-06 00:10:09 +08:00
Luis Pater	cc8dc7f62c	Merge branch 'main' into dev	2026-03-05 23:13:21 +08:00
Luis Pater	a3846ea513	Merge pull request #1870 from sususu98/fix/remove-instructions-restore cleanup(translator): remove leftover instructions restore in codex responses	2026-03-05 23:12:31 +08:00
Luis Pater	0e6bb076e9	fix(translator): comment out `service_tier` removal from OpenAI response processing	2026-03-05 22:49:38 +08:00
sususu98	68a6cabf8b	style: blank unused params in codex responses translator	2026-03-05 16:42:48 +08:00
sususu98	ac0e387da1	cleanup(translator): remove leftover instructions restore in codex responses The instructions restore logic was originally needed when the proxy injected custom instructions (per-model system prompts) into requests. Since `ac802a46` removed the injection system, the proxy no longer modifies instructions before forwarding. The upstream response's instructions field now matches the client's original value, making the restore a no-op. Also removes unused sjson import. Closes router-for-me/CLIProxyAPI#1868	2026-03-05 16:34:55 +08:00
Luis Pater	5850492a93	Fixed: #1548 test(translator): add unit tests for fallback logic in `ConvertCodexResponseToOpenAI` model assignment	2026-03-05 12:11:54 +08:00
sususu98	d26ad8224d	fix(translator): strip defer_loading from Claude tool declarations in Codex and Gemini translators Claude's Tool Search feature (advanced-tool-use-2025-11-20 beta) adds defer_loading field to tool definitions. When proxying Claude requests to Codex or Gemini, this unknown field causes 400 errors upstream. Strip defer_loading (and cache_control where missing) in all three Claude-to-upstream translation paths: - codex/claude: defer_loading + cache_control - gemini-cli/claude: defer_loading - gemini/claude: defer_loading Fixes #1725, Fixes #1375	2026-03-04 14:21:30 +08:00
Luis Pater	9f95b31158	fix(translator): enhance handling of mixed output content in Claude requests	2026-03-03 21:49:41 +08:00
hkfires	ce87714ef1	feat(thinking): normalize effort levels in adaptive thinking requests to prevent validation errors	2026-03-03 15:10:47 +08:00
hkfires	d2e5857b82	feat(thinking): enhance adaptive thinking support across models and update test cases	2026-03-03 13:00:24 +08:00
hkfires	c44793789b	feat(thinking): add adaptive thinking support for Claude models Add support for Claude's "adaptive" and "auto" thinking modes using `output_config.effort`. Introduce support for new effort level "max" in adaptive thinking. Update thinking logic, validate model capabilities, and extend converters and handling to ensure compatibility with adaptive modes. Adjust static model data with supported levels and refine handling across translators and executors.	2026-03-03 09:05:31 +08:00
hkfires	914db94e79	refactor(headers): streamline User-Agent handling and introduce GeminiCLI versioning	2026-03-02 13:04:30 +08:00
hkfires	b148820c35	fix(translator): handle Claude thinking type "auto" like adaptive	2026-03-01 10:30:19 +08:00
Luis Pater	a6ce5f36e6	Fixed: #1758 fix(codex): filter billing headers from system result text and update template logic	2026-03-01 01:45:35 +08:00
Luis Pater	d24ea4ce2a	Merge pull request #1664 from ciberponk/pr/responses-compaction-compat feat: add codex responses compatibility for compaction payloads	2026-02-25 01:21:59 +08:00
Luis Pater	c3e12c5e58	Merge pull request #1654 from alexey-yanchenko/feature/pass-file-inputs Pass file input from /chat/completions and /responses to codex and claude	2026-02-24 05:53:11 +08:00
fan	afc8a0f9be	refactor: simplify context_management compatibility handling	2026-02-21 22:20:48 +08:00
ciberponk	d693d7993b	feat: support responses compaction payload compatibility for codex translator	2026-02-21 12:56:10 +08:00
Alexey Yanchenko	0cbfe7f457	Pass file input from /chat/completions and /responses to codex and claude	2026-02-20 10:25:44 +07:00
Kirill Turanskiy	1cc21cc45b	fix: prevent duplicate function call arguments when delta events precede done Non-spark codex models (gpt-5.3-codex, gpt-5.2-codex) stream function call arguments via multiple delta events followed by a done event. The done handler unconditionally emitted the full arguments, duplicating what deltas already streamed. This produced invalid double JSON that Claude Code couldn't parse, causing tool calls to fail with missing parameters and infinite retry loops. Add HasReceivedArgumentsDelta flag to track whether delta events were received. The done handler now only emits arguments when no deltas preceded it (spark models), while delta-based streaming continues to work for non-spark models.	2026-02-19 23:18:14 +03:00
Kirill Turanskiy	07cf616e2b	fix: handle response.function_call_arguments.done in codex→claude streaming translator Some Codex models (e.g. gpt-5.3-codex-spark) send function call arguments in a single "done" event without preceding "delta" events. The streaming translator only handled "delta" events, causing tool call arguments to be lost — resulting in empty tool inputs and infinite retry loops in clients like Claude Code. Emit the full arguments from the "done" event as a single input_json_delta so downstream clients receive the complete tool input.	2026-02-19 23:18:14 +03:00
Kirill Turanskiy	5fa23c7f41	fix: handle tool call argument streaming in Codex→OpenAI translator The OpenAI Chat Completions translator was silently dropping response.function_call_arguments.delta and response.function_call_arguments.done Codex SSE events, meaning tool call arguments were never streamed incrementally to clients. Add proper handling mirroring the proven Claude translator pattern: - response.output_item.added: announce tool call (id, name, empty args) - response.function_call_arguments.delta: stream argument chunks - response.function_call_arguments.done: emit full args if no deltas - response.output_item.done: defensive fallback for backward compat State tracking via HasReceivedArgumentsDelta and HasToolCallAnnounced ensures no duplicate argument emission and correct behavior for models like codex-spark that skip delta events entirely.	2026-02-18 19:09:05 +03:00
Alexey Yanchenko	63d4de5eea	Pass cache usage from codex to openai chat completions	2026-02-15 12:04:15 +07:00
Luis Pater	a146c6c0aa	Merge pull request #1523 from xxddff/feature/removeUserField fix(codex): remove unsupported 'user' field from /v1/responses payload	2026-02-11 20:38:16 +08:00
xxddff	bb9fe52f1e	Update internal/translator/codex/openai/responses/codex_openai-responses_request_test.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-02-10 18:24:58 +09:00
xxddff	afe4c1bfb7	更新internal/translator/codex/openai/responses/codex_openai-responses_request.go Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>	2026-02-10 18:24:26 +09:00
xxddff	865af9f19e	Implement test for user field deletion Add test to verify deletion of user field in response	2026-02-10 17:38:49 +09:00
xxddff	2b97cb98b5	Delete 'user' field from raw JSON Remove the 'user' field from the raw JSON as requested.	2026-02-10 17:35:54 +09:00
hkfires	938a799263	feat(translator): support Claude thinking type adaptive	2026-02-10 16:20:32 +08:00
Luis Pater	80b5e79e75	fix(translator): normalize and restrict `stop_reason`/`finish_reason` usage - Standardized the handling of `stop_reason` and `finish_reason` across Codex and Gemini responses. - Restricted pass-through of specific reasons (`max_tokens`, `stop`) for consistency. - Enhanced fallback logic for undefined reasons.	2026-02-07 02:07:51 +08:00
Luis Pater	a5a25dec57	refactor(translator, executor): remove redundant `bytes.Clone` calls for improved performance - Replaced all instances of `bytes.Clone` with direct references to enhance efficiency. - Simplified payload handling across executors and translators by eliminating unnecessary data duplication.	2026-02-06 03:26:29 +08:00
neavo	6c65fdf54b	fix(gemini): support snake_case thinking config fields from Python SDK Google official Gemini Python SDK sends thinking_level, thinking_budget, and include_thoughts (snake_case) instead of thinkingLevel, thinkingBudget, and includeThoughts (camelCase). This caused thinking configuration to be ignored when using Python SDK. Changes: - Extract layer: extractGeminiConfig now reads snake_case as fallback - Apply layer: Gemini/CLI/Antigravity appliers clean up snake_case fields - Translator layer: Gemini->OpenAI/Claude/Codex translators support fallback - Tests: Added 4 test cases for snake_case field coverage Fixes #1426	2026-02-04 21:12:47 +08:00
Luis Pater	d885b81f23	Fixed: #1403 fix(translator): handle "input" field transformation for OpenAI responses	2026-02-03 21:49:30 +08:00
hkfires	354f6582b2	fix(codex): convert system role to developer for codex input	2026-02-01 15:37:37 +08:00

1 2 3

107 Commits