Commit Graph

419 Commits

Author SHA1 Message Date
Luis Pater
d2c7e4e96a refactor(runtime): move executor utilities to helps package and update references 2026-04-01 03:08:20 +08:00
Luis Pater
51fd58d74f fix(codex): use normalizeCodexInstructions to set default instructions 2026-03-31 12:16:57 +08:00
Luis Pater
faae9c2f7c Merge pull request #2422 from MonsterQiu/fix/codex-compact-instructions
fix(codex): add default instructions for /responses/compact
2026-03-31 12:14:20 +08:00
Luis Pater
bc3a6e4646 Merge pull request #2434 from MonsterQiu/fix/codex-responses-null-instructions
fix(codex): normalize null instructions for /responses requests
2026-03-31 12:01:21 +08:00
MonsterQiu
39b9a38fbc fix(codex): normalize null instructions across responses paths 2026-03-31 10:32:39 +08:00
MonsterQiu
bd855abec9 fix(codex): normalize null instructions for responses requests 2026-03-31 10:29:02 +08:00
xixiwenxuanhe
a0bf33eca6 fix(antigravity): preserve fallback and honor config gate 2026-03-31 00:14:05 +08:00
xixiwenxuanhe
88dd9c715d feat(antigravity): add AI credits quota fallback 2026-03-30 23:58:12 +08:00
MonsterQiu
d3b94c9241 fix(codex): normalize null instructions for compact requests 2026-03-30 22:58:05 +08:00
MonsterQiu
d11936f292 fix(codex): add default instructions for /responses/compact 2026-03-30 22:44:46 +08:00
Luis Pater
13aa5b3375 Revert "fix(codex): restore prompt cache continuity for Codex requests" 2026-03-29 22:18:14 +08:00
Luis Pater
1587ff5e74 Merge pull request #2389 from router-for-me/claude
fix(claude): add default max_tokens for models
2026-03-29 13:03:20 +08:00
hkfires
f033d3a6df fix(claude): enhance ensureModelMaxTokens to use registered max_completion_tokens and fallback to default 2026-03-29 13:00:43 +08:00
hkfires
145e0e0b5d fix(claude): add default max_tokens for models 2026-03-29 12:46:00 +08:00
Luis Pater
55271403fb Merge pull request #2374 from VooDisss/codex-cache-clean
fix(codex): restore prompt cache continuity for Codex requests
2026-03-28 21:16:51 +08:00
Luis Pater
b9b127a7ea Merge pull request #2347 from edlsh/fix/codex-strip-stream-options
fix(codex): strip stream_options from Responses API requests
2026-03-28 21:03:01 +08:00
VooDisss
e5d3541b5a refactor(codex): remove stale affinity cleanup leftovers
Drop the last affinity-related executor artifacts so the PR stays focused on the minimal Codex continuity fix set: stable prompt cache identity, stable session_id, and the executor-only behavior that was validated to restore cache reads.
2026-03-27 20:40:26 +02:00
VooDisss
26eca8b6ba fix(codex): preserve continuity and safe affinity fallback
Restore Claude continuity after the continuity refactor, keep auth-affinity keys out of upstream Codex session identifiers, and only persist affinity after successful execution so retries can still rotate to healthy credentials when the first auth fails.
2026-03-27 18:27:33 +02:00
VooDisss
62b17f40a1 refactor(codex): align continuity helpers with review feedback
Align websocket continuity resolution with the HTTP Codex path, make auth-affinity principal keys use a stable string representation, and extract small helpers that remove duplicated continuity and affinity logic without changing the validated cache-hit behavior.
2026-03-27 18:11:57 +02:00
VooDisss
511b8a992e fix(codex): restore prompt cache continuity for Codex requests
Prompt caching on Codex was not reliably reusable through the proxy because repeated chat-completions requests could reach the upstream without the same continuity envelope. In practice this showed up most clearly with OpenCode, where cache reads worked in the reference client but not through CLIProxyAPI, although the root cause is broader than OpenCode itself.

The proxy was breaking continuity in several ways: executor-layer Codex request preparation stripped prompt_cache_retention, chat-completions translation did not preserve that field, continuity headers used a different shape than the working client behavior, and OpenAI-style Codex requests could be sent without a stable prompt_cache_key. When that happened, session_id fell back to a fresh random value per request, so upstream Codex treated repeated requests as unrelated turns instead of as part of the same cacheable context.

This change fixes that by preserving caller-provided prompt_cache_retention on Codex execution paths, preserving prompt_cache_retention when translating OpenAI chat-completions requests to Codex, aligning Codex continuity headers to session_id, and introducing an explicit Codex continuity policy that derives a stable continuity key from the best available signal. The resolution order prefers an explicit prompt_cache_key, then execution session metadata, then an explicit idempotency key, then stable request-affinity metadata, then a stable client-principal hash, and finally a stable auth-ID hash when no better continuity signal exists.

The same continuity key is applied to both prompt_cache_key in the request body and session_id in the request headers so repeated requests reuse the same upstream cache/session identity. The auth manager also keeps auth selection sticky for repeated request sequences, preventing otherwise-equivalent Codex requests from drifting across different upstream auth contexts and accidentally breaking cache reuse.

To keep the implementation maintainable, the continuity resolution and diagnostics are centralized in a dedicated Codex continuity helper instead of being scattered across executor flow code. Regression coverage now verifies retention preservation, continuity-key precedence, stable auth-ID fallback, websocket parity, translator preservation, and auth-affinity behavior. Manual validation confirmed prompt cache reads now occur through CLIProxyAPI when using Codex via OpenCode, and the fix should also benefit other clients that rely on stable repeated Codex request continuity.
2026-03-27 17:49:29 +02:00
edlsh
754f3bcbc3 fix(codex): strip stream_options from Responses API requests
The Codex/OpenAI Responses API does not support the stream_options
parameter. When clients (e.g. Amp CLI) include stream_options in their
requests, CLIProxyAPI forwards it as-is, causing a 400 error:

  {"detail":"Unsupported parameter: stream_options"}

Strip stream_options alongside the other unsupported parameters
(previous_response_id, prompt_cache_retention, safety_identifier)
in Execute, ExecuteStream, and CountTokens.
2026-03-25 11:58:36 -04:00
pjpj
36973d4a6f Handle Codex capacity errors as retryable 2026-03-25 23:25:31 +08:00
hkfires
528b1a2307 feat(codex): pass through codex client identity headers 2026-03-25 08:48:18 +08:00
Luis Pater
0906aeca87 Merge pull request #2254 from clcc2019/main
refactor: streamline usage reporting by consolidating record publishi…
2026-03-24 00:39:31 +08:00
Luis Pater
a000eb523d Merge pull request #2213 from TTTPOB/ua-fix
feat(claude): stabilize device fingerprint across mixed Claude Code and cloaked clients
2026-03-23 22:53:51 +08:00
dslife2025
0ed2d16596 Merge branch 'router-for-me:main' into main 2026-03-23 09:50:43 +08:00
clcc2019
c1bf298216 refactor: streamline usage reporting by consolidating record publishing logic
- Introduced a new method `buildRecord` in `usageReporter` to encapsulate record creation, improving code readability and maintainability.
- Added latency tracking to usage records, ensuring accurate reporting of request latencies.
- Updated tests to validate the inclusion of latency in usage records and ensure proper functionality of the new reporting structure.
2026-03-20 19:44:26 +08:00
Luis Pater
2bd646ad70 refactor: replace sjson.Set usage with sjson.SetBytes to optimize mutable JSON transformations 2026-03-19 17:58:54 +08:00
tpob
52c1fa025e fix(claude): learn official fingerprints after custom baselines 2026-03-19 13:59:41 +08:00
tpob
680105f84d fix(claude): refresh cached fingerprint after baseline upgrades 2026-03-19 13:28:58 +08:00
tpob
f7069e9548 fix(claude): pin stabilized OS arch to baseline 2026-03-19 13:07:16 +08:00
tpob
8179d5a8a4 fix(claude): avoid racy fingerprint downgrades 2026-03-19 01:03:41 +08:00
tpob
6fa7abe434 fix(claude): keep configured baseline above older fingerprints 2026-03-19 01:02:04 +08:00
tpob
dd64adbeeb fix(claude): preserve legacy user agent overrides 2026-03-19 00:03:09 +08:00
tpob
616d41c06a fix(claude): restore legacy runtime OS arch fallback 2026-03-19 00:01:50 +08:00
tpob
e0e337aeb9 feat(claude): add switch for device profile stabilization 2026-03-18 19:31:59 +08:00
tpob
d52839fced fix: stabilize claude device fingerprint 2026-03-18 18:46:54 +08:00
Zhenyu Qi
aec65e3be3 fix(openai_compat): add stream_options.include_usage for streaming usage tracking 2026-03-13 00:48:17 -07:00
Luis Pater
817cebb321 Merge pull request #2082 from router-for-me/antigravity
Refactor Antigravity model handling and improve logging
2026-03-12 10:39:13 +08:00
hkfires
dea3e74d35 feat(antigravity): refactor model handling and remove unused code 2026-03-12 09:24:45 +08:00
Luis Pater
89d7be9525 Merge branch 'dev' into codex/custom-useragent-request 2026-03-11 22:55:50 +08:00
lang-911
70988d387b Add Codex websocket header defaults 2026-03-11 00:34:57 -07:00
Luis Pater
ddaa9d2436 Fixed: #2034
feat(proxy): centralize proxy handling with `proxyutil` package and enhance test coverage

- Added `proxyutil` package to simplify proxy handling across the codebase.
- Refactored various components (`executor`, `cliproxy`, `auth`, etc.) to use `proxyutil` for consistent and reusable proxy logic.
- Introduced support for "direct" proxy mode to explicitly bypass all proxies.
- Updated tests to validate proxy behavior (e.g., `direct`, HTTP/HTTPS, and SOCKS5).
- Enhanced YAML configuration documentation for proxy options.
2026-03-11 11:08:02 +08:00
hkfires
d1e3195e6f feat(codex): register models by plan tier 2026-03-10 11:20:37 +08:00
Dominic Robinson
a1e0fa0f39 test(executor): cover string system prompt handling in checkSystemInstructionsWithMode 2026-03-09 12:40:27 +00:00
Dominic Robinson
5c9997cdac fix: Preserve system prompt when sent as a string instead of content block array 2026-03-09 07:38:11 +00:00
hkfires
424711b718 fix(executor): use aiplatform base url for vertex api key calls 2026-03-08 20:13:12 +08:00
Luis Pater
1042489f85 Merge pull request #1893 from thebtf/fix/normalize-ttl-byte-preservation-mainline
fix: preserve original JSON bytes in normalizeCacheControlTTL
2026-03-07 22:14:13 +08:00
Luis Pater
5ebc58fab4 refactor(executor): remove legacy connCreateSent logic and standardize response.create usage for all websocket events
- Simplified connection logic by removing `connCreateSent` and related state handling.
- Updated `buildCodexWebsocketRequestBody` to always use `response.create`.
- Added unit tests to validate `response.create` behavior and beta header preservation.
- Dropped unsupported `response.append` and outdated `response.done` event types.
2026-03-07 09:07:23 +08:00
Kirill Turanskiy
97fdd2e088 fix: preserve original JSON bytes in normalizeCacheControlTTL when no TTL change needed
normalizeCacheControlTTL unconditionally re-serializes the entire request
body through json.Unmarshal/json.Marshal even when no TTL normalization
is needed. Go's json.Marshal randomizes map key order and HTML-escapes
<, >, & characters (to \u003c, \u003e, \u0026), producing different raw
bytes on every call.

Anthropic's prompt caching uses byte-prefix matching, so any byte-level
difference causes a cache miss. This means the ~119K system prompt and
tools are re-processed on every request when routed through CPA.

The fix adds a bool return to normalizeTTLForBlock to indicate whether
it actually modified anything, and skips the marshal step in
normalizeCacheControlTTL when no blocks were changed.
2026-03-05 22:28:01 +03:00