CLIProxyAPIPlus

mirror of https://github.com/router-for-me/CLIProxyAPIPlus.git synced 2026-04-11 16:53:34 +00:00

Author	SHA1	Message	Date
Thai Nguyen Hung	bd09c0bf09	feat(registry): add gpt-5.4-mini model to GitHub Copilot registry	2026-04-01 10:04:38 +07:00
Luis Pater	d5930f4e44	Merge branch 'main' into plus	2026-03-29 12:40:17 +08:00
Luis Pater	55271403fb	Merge pull request #2374 from VooDisss/codex-cache-clean fix(codex): restore prompt cache continuity for Codex requests	2026-03-28 21:16:51 +08:00
Luis Pater	b9b127a7ea	Merge pull request #2347 from edlsh/fix/codex-strip-stream-options fix(codex): strip stream_options from Responses API requests	2026-03-28 21:03:01 +08:00
Luis Pater	b8b89f34f4	Merge pull request #442 from LuxVTZ/feat/gitlab-duo-panel-parity Improve GitLab Duo gateway compatibility Restore internal/runtime/executor/claude_executor.go to main during merge.	2026-03-28 05:06:41 +08:00
VooDisss	e5d3541b5a	refactor(codex): remove stale affinity cleanup leftovers Drop the last affinity-related executor artifacts so the PR stays focused on the minimal Codex continuity fix set: stable prompt cache identity, stable session_id, and the executor-only behavior that was validated to restore cache reads.	2026-03-27 20:40:26 +02:00
VooDisss	26eca8b6ba	fix(codex): preserve continuity and safe affinity fallback Restore Claude continuity after the continuity refactor, keep auth-affinity keys out of upstream Codex session identifiers, and only persist affinity after successful execution so retries can still rotate to healthy credentials when the first auth fails.	2026-03-27 18:27:33 +02:00
VooDisss	62b17f40a1	refactor(codex): align continuity helpers with review feedback Align websocket continuity resolution with the HTTP Codex path, make auth-affinity principal keys use a stable string representation, and extract small helpers that remove duplicated continuity and affinity logic without changing the validated cache-hit behavior.	2026-03-27 18:11:57 +02:00
VooDisss	511b8a992e	fix(codex): restore prompt cache continuity for Codex requests Prompt caching on Codex was not reliably reusable through the proxy because repeated chat-completions requests could reach the upstream without the same continuity envelope. In practice this showed up most clearly with OpenCode, where cache reads worked in the reference client but not through CLIProxyAPI, although the root cause is broader than OpenCode itself. The proxy was breaking continuity in several ways: executor-layer Codex request preparation stripped prompt_cache_retention, chat-completions translation did not preserve that field, continuity headers used a different shape than the working client behavior, and OpenAI-style Codex requests could be sent without a stable prompt_cache_key. When that happened, session_id fell back to a fresh random value per request, so upstream Codex treated repeated requests as unrelated turns instead of as part of the same cacheable context. This change fixes that by preserving caller-provided prompt_cache_retention on Codex execution paths, preserving prompt_cache_retention when translating OpenAI chat-completions requests to Codex, aligning Codex continuity headers to session_id, and introducing an explicit Codex continuity policy that derives a stable continuity key from the best available signal. The resolution order prefers an explicit prompt_cache_key, then execution session metadata, then an explicit idempotency key, then stable request-affinity metadata, then a stable client-principal hash, and finally a stable auth-ID hash when no better continuity signal exists. The same continuity key is applied to both prompt_cache_key in the request body and session_id in the request headers so repeated requests reuse the same upstream cache/session identity. The auth manager also keeps auth selection sticky for repeated request sequences, preventing otherwise-equivalent Codex requests from drifting across different upstream auth contexts and accidentally breaking cache reuse. To keep the implementation maintainable, the continuity resolution and diagnostics are centralized in a dedicated Codex continuity helper instead of being scattered across executor flow code. Regression coverage now verifies retention preservation, continuity-key precedence, stable auth-ID fallback, websocket parity, translator preservation, and auth-affinity behavior. Manual validation confirmed prompt cache reads now occur through CLIProxyAPI when using Codex via OpenCode, and the fix should also benefit other clients that rely on stable repeated Codex request continuity.	2026-03-27 17:49:29 +02:00
MrHuangJser	1b7447b682	feat(cursor): implement StatusError for conductor cooldown integration Cursor executor errors were plain fmt.Errorf — the conductor couldn't extract HTTP status codes, so exhausted accounts never entered cooldown. Changes: - Add ConnectError struct to proto/connect.go: ParseConnectEndStream now returns *ConnectError with Code/Message fields for precise matching - Add cursorStatusErr implementing StatusError + RetryAfter interfaces - Add classifyCursorError() with two-layer classification: Layer 1: exact match on ConnectError.Code (gRPC standard codes) resource_exhausted → 429, unauthenticated → 401, permission_denied → 403, unavailable → 503, internal → 500 Layer 2: fuzzy string match for H2 errors (RST_STREAM → 502) - Log all ConnectError code/message pairs for observing real server error codes (we have no samples yet) - Wrap Execute and ExecuteStream error returns with classifyCursorError Now the conductor properly marks Cursor auths as cooldown on quota errors, enabling exponential backoff and round-robin failover. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 11:42:22 +08:00
MrHuangJser	40dee4453a	feat(cursor): auto-migrate sessions to healthy account on quota exhaustion When a Cursor account's quota is exhausted, sessions bound to it can now seamlessly continue on a different account: Layer 1 — Checkpoint decoupling: Key checkpoints by conversationId (not authID:conversationId). Store authID inside savedCheckpoint. On lookup, if auth changed, discard the stale checkpoint and flatten conversation history into userText. Layer 2 — Cross-account session cleanup: When a request arrives for a conversation whose session belongs to a different (now-exhausted) auth, close the old H2 stream and remove the stale session to free resources. Layer 3 — H2Stream.Err() exposure: New Err() method on H2Stream so callers can inspect RST_STREAM, GOAWAY, or other stream-level errors after closure. Layer 4 — processH2SessionFrames error propagation: Returns error instead of bare return. Connect EndStream errors (quota, rate limit) are now propagated instead of being logged and swallowed. Layer 5 — Pre-response transparent retry: If the stream fails before any data is sent to the client, return an error to the conductor so it retries with a different auth — fully transparent to the client. Layer 6 — Post-response error logging: If the stream fails after data was already sent, log a warning. The conductor's existing cooldown mechanism ensures the next request routes to a healthy account. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-27 10:50:32 +08:00
黄姜恒	de5fe71478	feat(cursor): multi-account routing with round-robin and session isolation - Add cursor/filename.go for multi-account credential file naming - Include auth.ID in session and checkpoint keys for per-account isolation - Record authID in cursorSession, validate on resume to prevent cross-account access - Management API /cursor-auth-url supports ?label= for creating named accounts - Leverages existing conductor round-robin + failover framework Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 11:27:49 +08:00
黄姜恒	c95620f90e	feat(cursor): conversation checkpoint + session_id for multi-turn context - Capture conversation_checkpoint_update from Cursor server (was ignored) - Store checkpoint per conversationId, replay as conversation_state on next request - Use protowire to embed raw checkpoint bytes directly (no deserialization) - Extract session_id from Claude Code metadata for stable conversationId across resume - Flatten conversation history into userText as fallback when no checkpoint available - Use conversationId as session key for reliable tool call resume - Add checkpoint TTL cleanup (30min) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 10:51:47 +08:00
edlsh	754f3bcbc3	fix(codex): strip stream_options from Responses API requests The Codex/OpenAI Responses API does not support the stream_options parameter. When clients (e.g. Amp CLI) include stream_options in their requests, CLIProxyAPI forwards it as-is, causing a 400 error: {"detail":"Unsupported parameter: stream_options"} Strip stream_options alongside the other unsupported parameters (previous_response_id, prompt_cache_retention, safety_identifier) in Execute, ExecuteStream, and CountTokens.	2026-03-25 11:58:36 -04:00
pjpj	36973d4a6f	Handle Codex capacity errors as retryable	2026-03-25 23:25:31 +08:00
黄姜恒	9613f0b3f9	feat(cursor): deterministic conversation_id from Claude Code session cch Extract the cch hash from Claude Code's billing header in the system prompt (x-anthropic-billing-header: ...cch=XXXXX;) and use it to derive a deterministic conversation_id instead of generating a random UUID. Same Claude Code session → same cch → same conversation_id → Cursor server can reuse conversation state across multiple turns, preserving tool call results and other context without re-encoding history. Also cleans up temporary debug logging from previous iterations. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 20:29:49 +08:00
黄姜恒	274f29e26b	fix(cursor): improve session key uniqueness for multi-session safety Include system prompt prefix (first 200 chars) in session key derivation. Claude Code sessions have unique system prompts containing cwd, session_id, file paths, etc., making collisions between concurrent sessions from the same user virtually impossible. Session key now = SHA256(apiKey + model + systemPrompt[:200] + firstUserMsg) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 17:24:37 +08:00
黄姜恒	c8e79c3787	fix(cursor): prevent session key collision across users Include client API key in session key derivation to prevent different users sharing the same proxy from accidentally resuming each other's H2 streams when they send identical first messages with the same model. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 17:19:11 +08:00
黄姜恒	8afef43887	fix(cursor): preserve tool call context in multi-turn conversations When an assistant message appears after tool results without a pending user message, append it to the last turn's assistant text instead of dropping it. Also add bakeToolResultsIntoTurns() to merge tool results into turn context when no active H2 session exists for resume, ensuring the model sees the full tool interaction history in follow-up requests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 17:15:24 +08:00
黄姜恒	c1083cbfc6	fix(cursor): MCP tool call resume, H2 flow control, and token usage - Rewrite tool call mechanism from interrupt-resume to inline-wait mode: processH2SessionFrames no longer exits on mcpArgs; instead blocks on toolResultCh while continuing to handle KV/heartbeat messages, then sends MCP result and continues processing text in the same goroutine. Fixes the issue where server stopped generating text after resume. - Add switchable output channel (outMu/currentOut) so first HTTP response closes after tool_calls+[DONE], and resumed text goes to a new channel returned by resumeWithToolResults. Reset streamParam on switch so Translator produces fresh message_start/content_block_start events. - Implement send-side H2 flow control: track server's initial window size and WINDOW_UPDATE increments; Write() blocks when window exhausted. Fixes RST_STREAM FLOW_CONTROL_ERROR on large requests (178KB+). - Decode new InteractionUpdate fields: TurnEndedUpdate (field 14) as stream termination signal, HeartbeatUpdate (field 13) silently ignored, TokenDeltaUpdate (field 8) for token usage tracking. - Include token usage in final stop chunk (prompt_tokens estimated from payload size, completion_tokens from accumulated TokenDeltaUpdate deltas) so Claude CLI status bar shows non-zero token counts. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-25 17:03:14 +08:00
黄姜恒	19c52bcb60	feat: stash code	2026-03-25 10:14:14 +08:00
Luis Pater	7fa527193c	Merge pull request #453 from HeCHieh/fix/github-copilot-gpt54-responses Fix GitHub Copilot gpt-5.4 endpoint routing	2026-03-25 09:45:23 +08:00
Luis Pater	ed0eb51b4d	Merge pull request #450 from lwiles692/feature/add-codebuddy-support feat(auth): add CodeBuddy-CN browser OAuth authentication support	2026-03-25 09:43:52 +08:00
Luis Pater	0e4f669c8b	Merge branch 'router-for-me:main' into main	2026-03-25 09:38:34 +08:00
hkfires	528b1a2307	feat(codex): pass through codex client identity headers	2026-03-25 08:48:18 +08:00
Luis Pater	1315f710f5	Merge branch 'main' into plus	2026-03-24 00:43:26 +08:00
Luis Pater	0906aeca87	Merge pull request #2254 from clcc2019/main refactor: streamline usage reporting by consolidating record publishi…	2026-03-24 00:39:31 +08:00
Luis Pater	a000eb523d	Merge pull request #2213 from TTTPOB/ua-fix feat(claude): stabilize device fingerprint across mixed Claude Code and cloaked clients	2026-03-23 22:53:51 +08:00
dslife2025	0ed2d16596	Merge branch 'router-for-me:main' into main	2026-03-23 09:50:43 +08:00
Luis Pater	f3c59165d7	Merge branch 'pr-454' # Conflicts: # cmd/server/main.go # internal/translator/claude/openai/chat-completions/claude_openai_response.go	2026-03-22 22:52:46 +08:00
hechieh	e6690cb447	Refine GitHub Copilot endpoint selection Amp-Thread-ID: https://ampcode.com/threads/T-019d14cd-bc90-70ce-b1ae-87bc97332650 Co-authored-by: Amp <amp@ampcode.com>	2026-03-22 19:43:35 +08:00
hechieh	35907416b8	Fix GitHub Copilot gpt-5.4 endpoint routing Amp-Thread-ID: https://ampcode.com/threads/T-019d14cd-bc90-70ce-b1ae-87bc97332650 Co-authored-by: Amp <amp@ampcode.com>	2026-03-22 19:05:44 +08:00
clcc2019	c1bf298216	refactor: streamline usage reporting by consolidating record publishing logic - Introduced a new method `buildRecord` in `usageReporter` to encapsulate record creation, improving code readability and maintainability. - Added latency tracking to usage records, ensuring accurate reporting of request latencies. - Updated tests to validate the inclusion of latency in usage records and ensure proper functionality of the new reporting structure.	2026-03-20 19:44:26 +08:00
Luis Pater	2bd646ad70	refactor: replace `sjson.Set` usage with `sjson.SetBytes` to optimize mutable JSON transformations	2026-03-19 17:58:54 +08:00
tpob	52c1fa025e	fix(claude): learn official fingerprints after custom baselines	2026-03-19 13:59:41 +08:00
tpob	680105f84d	fix(claude): refresh cached fingerprint after baseline upgrades	2026-03-19 13:28:58 +08:00
tpob	f7069e9548	fix(claude): pin stabilized OS arch to baseline	2026-03-19 13:07:16 +08:00
tpob	8179d5a8a4	fix(claude): avoid racy fingerprint downgrades	2026-03-19 01:03:41 +08:00
tpob	6fa7abe434	fix(claude): keep configured baseline above older fingerprints	2026-03-19 01:02:04 +08:00
tpob	dd64adbeeb	fix(claude): preserve legacy user agent overrides	2026-03-19 00:03:09 +08:00
tpob	616d41c06a	fix(claude): restore legacy runtime OS arch fallback	2026-03-19 00:01:50 +08:00
tpob	e0e337aeb9	feat(claude): add switch for device profile stabilization	2026-03-18 19:31:59 +08:00
tpob	d52839fced	fix: stabilize claude device fingerprint	2026-03-18 18:46:54 +08:00
Wei Lee	4022e69651	feat(auth): add CodeBuddy-CN browser OAuth authentication support	2026-03-18 17:50:12 +08:00
Luis Pater	c6cb24039d	Merge branch 'main' into plus	2026-03-15 01:50:32 +08:00
luxvtz	5da0decef6	Improve GitLab Duo gateway compatibility	2026-03-14 03:18:43 -07:00
Zhenyu Qi	aec65e3be3	fix(openai_compat): add stream_options.include_usage for streaming usage tracking	2026-03-13 00:48:17 -07:00
Luis Pater	34c8ccb961	Fixed: #437 feat(runtime): strip `service_tier` in GitHub Copilot response normalization	2026-03-13 11:50:21 +08:00
Luis Pater	d08e164af3	chore(runtime): remove unused `FetchAntigravityModels` function from executor	2026-03-13 11:38:44 +08:00
Luis Pater	86d5db472a	Merge branch 'main' into plus	2026-03-13 11:28:52 +08:00
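
Commit 511b8a992e describes a continuity-key resolution order for Codex requests: explicit prompt_cache_key, then execution session metadata, then an idempotency key, then request-affinity metadata, then a client-principal hash, and finally an auth-ID hash. The following is a minimal sketch of that precedence only, not the repository's implementation. The type continuitySignals, its field names, the helpers stableHash and resolveContinuityKey, and the choice of a truncated SHA-256 digest are all assumptions made for illustration.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// continuitySignals is a hypothetical container for the signals named in the
// commit message. The real field names in the repository may differ.
type continuitySignals struct {
	PromptCacheKey   string // explicit prompt_cache_key from the request body
	ExecutionSession string // execution session metadata
	IdempotencyKey   string // explicit idempotency key
	RequestAffinity  string // stable request-affinity metadata
	ClientPrincipal  string // identifies the calling client (hashed before use)
	AuthID           string // upstream auth identifier (hashed before use)
}

// stableHash returns a deterministic, shortened SHA-256 digest. The hash and
// the 16-byte truncation are assumptions; the commit only says "stable hash".
func stableHash(prefix, value string) string {
	sum := sha256.Sum256([]byte(prefix + ":" + value))
	return hex.EncodeToString(sum[:16])
}

// resolveContinuityKey walks the signals in the order given in the commit
// message and returns the first one available, plus a label for diagnostics.
func resolveContinuityKey(s continuitySignals) (key, source string) {
	switch {
	case s.PromptCacheKey != "":
		return s.PromptCacheKey, "prompt_cache_key"
	case s.ExecutionSession != "":
		return s.ExecutionSession, "execution_session"
	case s.IdempotencyKey != "":
		return s.IdempotencyKey, "idempotency_key"
	case s.RequestAffinity != "":
		return s.RequestAffinity, "request_affinity"
	case s.ClientPrincipal != "":
		return stableHash("principal", s.ClientPrincipal), "client_principal"
	case s.AuthID != "":
		return stableHash("auth", s.AuthID), "auth_id"
	}
	return "", "none"
}

func main() {
	key, source := resolveContinuityKey(continuitySignals{
		IdempotencyKey: "idem-123",
		AuthID:         "auth-1",
	})
	// The same key would then be written to both prompt_cache_key (body)
	// and session_id (header), as the commit message describes.
	fmt.Println(key, source) // prints "idem-123 idempotency_key"
}
```

Because every branch is deterministic, two requests carrying the same signals resolve to the same key, which is the property the commit relies on for cache reuse.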

657 Commits