Compare commits

...

11 Commits

Author SHA1 Message Date
Luis Pater
84e9793e61 Merge pull request #399 from router-for-me/plus
v6.8.34
2026-03-01 02:44:36 +08:00
Luis Pater
32e64dacfd Merge branch 'main' into plus 2026-03-01 02:44:26 +08:00
Luis Pater
cc1d8f6629 Fixed: #1747
feat(auth): add configurable max-retry-credentials for finer control over cross-credential retries
2026-03-01 02:42:36 +08:00
Luis Pater
5446cd2b02 Merge pull request #1761 from margbug01/fix/thinking-chain-display
fix: support thinking.type=auto from Amp client and decouple thinking translation from unsigned history
2026-03-01 02:30:42 +08:00
margbug01
8de0885b7d fix: support thinking.type="auto" from Amp client for Antigravity Claude models
## Problem

When using Antigravity Claude models through CLIProxyAPI, the thinking
chain (reasoning content) does not display in the Amp client.

## Root Cause

The Amp client sends `thinking: {"type": "auto"}` in its requests,
but `ConvertClaudeRequestToAntigravity` only handled `"enabled"` and
`"adaptive"` types in its switch statement. The `"auto"` type was
silently ignored, resulting in no `thinkingConfig` being set in the
translated Gemini request. Without `thinkingConfig`, the Antigravity
API returns responses without any thinking content.

Additionally, the Antigravity API for Claude models does not support
`thinkingBudget: -1` (auto mode sentinel). It requires a concrete
positive budget value. The fix uses 128000 as the budget for "auto"
mode, which `ApplyThinking` will then normalize to stay within the
model's actual limits (e.g., capped to `maxOutputTokens - 1`).

## Changes

### internal/translator/antigravity/claude/antigravity_claude_request.go

1. **Add "auto" case** to the thinking type switch statement.
   Sets `thinkingBudget: 128000` and `includeThoughts: true`.
   The budget is subsequently normalized by `ApplyThinking` based
   on model-specific limits.

2. **Add "auto" to hasThinking check** so that interleaved thinking
   hints are injected for tool-use scenarios when Amp sends
   `thinking.type="auto"`.

### internal/registry/model_definitions_static_data.go

3. **Add Thinking configuration** for `claude-sonnet-4-6`,
   `claude-sonnet-4-5`, and `claude-opus-4-6` in
   `GetAntigravityModelConfig()` -- these were previously missing,
   causing `ApplyThinking` to skip thinking config entirely.

## Testing

- Deployed to Railway test instance (cpa-thinking-test)
- Verified via debug logging that:
  - Amp sends `thinking: {"type": "auto"}`
  - CPA now translates this to `thinkingConfig: {thinkingBudget: 128000, includeThoughts: true}`
  - `ApplyThinking` normalizes the budget to model-specific limits
  - Antigravity API receives the correct thinkingConfig

Amp-Thread-ID: https://ampcode.com/threads/T-019ca511-710d-776d-a07c-4b750f871a93
Co-authored-by: Amp <amp@ampcode.com>
2026-03-01 02:18:43 +08:00
Luis Pater
16243f18fd Merge branch 'router-for-me:main' into main 2026-03-01 01:46:23 +08:00
Luis Pater
a6ce5f36e6 Fixed: #1758
fix(codex): filter billing headers from system result text and update template logic
2026-03-01 01:45:35 +08:00
Luis Pater
e73cf42e28 Merge pull request #1750 from tpm2dot0/fix/claude-code-request-fingerprint-alignment
fix(cloak): align outgoing requests with real Claude Code 2.1.63
2026-03-01 01:27:28 +08:00
exe.dev user
b45343e812 fix(cloak): align outgoing requests with real Claude Code 2.1.63 fingerprint
Captured and compared outgoing requests from CLIProxyAPI against real
Claude Code 2.1.63 and fixed all detectable differences:

Headers:
- Update anthropic-beta to match 2.1.63: replace fine-grained-tool-streaming
  and prompt-caching-2024-07-31 with context-management-2025-06-27 and
  prompt-caching-scope-2026-01-05
- Remove X-Stainless-Helper-Method header (real Claude Code does not send it)
- Update default User-Agent from "claude-cli/2.1.44 (external, sdk-cli)" to
  "claude-cli/2.1.63 (external, cli)"
- Force Claude Code User-Agent for non-Claude clients to avoid leaking
  real client identity (e.g. curl, OpenAI SDKs) during cloaking

Body:
- Inject x-anthropic-billing-header as system[0] (matches real format)
- Change system prompt identifier from "You are Claude Code..." to
  "You are a Claude agent, built on Anthropic's Claude Agent SDK."
- Add cache_control with ttl:"1h" to match real request format
- Fix user_id format: user_[64hex]_account_[uuid]_session_[uuid]
  (was missing account UUID)
- Disable tool name prefix (set claudeToolPrefix to empty string)

TLS:
- Switch utls fingerprint from HelloFirefox_Auto to HelloChrome_Auto
  (closer to Node.js/OpenSSL used by real Claude Code)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 09:19:06 +00:00
Luis Pater
8599b1560e Fixed: #1716
feat(kimi): add support for explicit disabled thinking and reasoning effort handling
2026-02-28 05:29:07 +08:00
Luis Pater
8bde8c37c0 Fixed: #1711
fix(server): use resolved log directory for request logger initialization and test fallback logic
2026-02-28 05:21:01 +08:00
21 changed files with 586 additions and 104 deletions

View File

@@ -80,6 +80,10 @@ passthrough-headers: false
# Number of times to retry a request. Retries will occur if the HTTP response code is 403, 408, 500, 502, 503, or 504.
request-retry: 3
# Maximum number of different credentials to try for one failed request.
# Set to 0 to keep legacy behavior (try all available credentials).
max-retry-credentials: 0
# Maximum wait time in seconds for a cooled-down credential before triggering a retry.
max-retry-interval: 30

View File

@@ -60,10 +60,8 @@ type ServerOption func(*serverOptionConfig)
func defaultRequestLoggerFactory(cfg *config.Config, configPath string) logging.RequestLogger {
configDir := filepath.Dir(configPath)
if base := util.WritablePath(); base != "" {
return logging.NewFileRequestLogger(cfg.RequestLog, filepath.Join(base, "logs"), configDir, cfg.ErrorLogsMaxFiles)
}
return logging.NewFileRequestLogger(cfg.RequestLog, "logs", configDir, cfg.ErrorLogsMaxFiles)
logsDir := logging.ResolveLogDirectory(cfg)
return logging.NewFileRequestLogger(cfg.RequestLog, logsDir, configDir, cfg.ErrorLogsMaxFiles)
}
// WithMiddleware appends additional Gin middleware during server construction.
@@ -260,7 +258,7 @@ func NewServer(cfg *config.Config, authManager *auth.Manager, accessManager *sdk
s.oldConfigYaml, _ = yaml.Marshal(cfg)
s.applyAccessConfig(nil, cfg)
if authManager != nil {
authManager.SetRetryConfig(cfg.RequestRetry, time.Duration(cfg.MaxRetryInterval)*time.Second)
authManager.SetRetryConfig(cfg.RequestRetry, time.Duration(cfg.MaxRetryInterval)*time.Second, cfg.MaxRetryCredentials)
}
managementasset.SetCurrentConfig(cfg)
auth.SetQuotaCooldownDisabled(cfg.DisableCooling)
@@ -946,7 +944,7 @@ func (s *Server) UpdateClients(cfg *config.Config) {
}
if s.handlers != nil && s.handlers.AuthManager != nil {
s.handlers.AuthManager.SetRetryConfig(cfg.RequestRetry, time.Duration(cfg.MaxRetryInterval)*time.Second)
s.handlers.AuthManager.SetRetryConfig(cfg.RequestRetry, time.Duration(cfg.MaxRetryInterval)*time.Second, cfg.MaxRetryCredentials)
}
// Update log level dynamically when debug flag changes

View File

@@ -7,9 +7,11 @@ import (
"path/filepath"
"strings"
"testing"
"time"
gin "github.com/gin-gonic/gin"
proxyconfig "github.com/router-for-me/CLIProxyAPI/v6/internal/config"
internallogging "github.com/router-for-me/CLIProxyAPI/v6/internal/logging"
sdkaccess "github.com/router-for-me/CLIProxyAPI/v6/sdk/access"
"github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
sdkconfig "github.com/router-for-me/CLIProxyAPI/v6/sdk/config"
@@ -109,3 +111,100 @@ func TestAmpProviderModelRoutes(t *testing.T) {
})
}
}
func TestDefaultRequestLoggerFactory_UsesResolvedLogDirectory(t *testing.T) {
t.Setenv("WRITABLE_PATH", "")
t.Setenv("writable_path", "")
originalWD, errGetwd := os.Getwd()
if errGetwd != nil {
t.Fatalf("failed to get current working directory: %v", errGetwd)
}
tmpDir := t.TempDir()
if errChdir := os.Chdir(tmpDir); errChdir != nil {
t.Fatalf("failed to switch working directory: %v", errChdir)
}
defer func() {
if errChdirBack := os.Chdir(originalWD); errChdirBack != nil {
t.Fatalf("failed to restore working directory: %v", errChdirBack)
}
}()
// Force ResolveLogDirectory to fallback to auth-dir/logs by making ./logs not a writable directory.
if errWriteFile := os.WriteFile(filepath.Join(tmpDir, "logs"), []byte("not-a-directory"), 0o644); errWriteFile != nil {
t.Fatalf("failed to create blocking logs file: %v", errWriteFile)
}
configDir := filepath.Join(tmpDir, "config")
if errMkdirConfig := os.MkdirAll(configDir, 0o755); errMkdirConfig != nil {
t.Fatalf("failed to create config dir: %v", errMkdirConfig)
}
configPath := filepath.Join(configDir, "config.yaml")
authDir := filepath.Join(tmpDir, "auth")
if errMkdirAuth := os.MkdirAll(authDir, 0o700); errMkdirAuth != nil {
t.Fatalf("failed to create auth dir: %v", errMkdirAuth)
}
cfg := &proxyconfig.Config{
SDKConfig: proxyconfig.SDKConfig{
RequestLog: false,
},
AuthDir: authDir,
ErrorLogsMaxFiles: 10,
}
logger := defaultRequestLoggerFactory(cfg, configPath)
fileLogger, ok := logger.(*internallogging.FileRequestLogger)
if !ok {
t.Fatalf("expected *FileRequestLogger, got %T", logger)
}
errLog := fileLogger.LogRequestWithOptions(
"/v1/chat/completions",
http.MethodPost,
map[string][]string{"Content-Type": []string{"application/json"}},
[]byte(`{"input":"hello"}`),
http.StatusBadGateway,
map[string][]string{"Content-Type": []string{"application/json"}},
[]byte(`{"error":"upstream failure"}`),
nil,
nil,
nil,
true,
"issue-1711",
time.Now(),
time.Now(),
)
if errLog != nil {
t.Fatalf("failed to write forced error request log: %v", errLog)
}
authLogsDir := filepath.Join(authDir, "logs")
authEntries, errReadAuthDir := os.ReadDir(authLogsDir)
if errReadAuthDir != nil {
t.Fatalf("failed to read auth logs dir %s: %v", authLogsDir, errReadAuthDir)
}
foundErrorLogInAuthDir := false
for _, entry := range authEntries {
if strings.HasPrefix(entry.Name(), "error-") && strings.HasSuffix(entry.Name(), ".log") {
foundErrorLogInAuthDir = true
break
}
}
if !foundErrorLogInAuthDir {
t.Fatalf("expected forced error log in auth fallback dir %s, got entries: %+v", authLogsDir, authEntries)
}
configLogsDir := filepath.Join(configDir, "logs")
configEntries, errReadConfigDir := os.ReadDir(configLogsDir)
if errReadConfigDir != nil && !os.IsNotExist(errReadConfigDir) {
t.Fatalf("failed to inspect config logs dir %s: %v", configLogsDir, errReadConfigDir)
}
for _, entry := range configEntries {
if strings.HasPrefix(entry.Name(), "error-") && strings.HasSuffix(entry.Name(), ".log") {
t.Fatalf("unexpected forced error log in config dir %s", configLogsDir)
}
}
}

View File

@@ -15,7 +15,7 @@ import (
"golang.org/x/net/proxy"
)
// utlsRoundTripper implements http.RoundTripper using utls with Firefox fingerprint
// utlsRoundTripper implements http.RoundTripper using utls with Chrome fingerprint
// to bypass Cloudflare's TLS fingerprinting on Anthropic domains.
type utlsRoundTripper struct {
// mu protects the connections map and pending map
@@ -100,7 +100,9 @@ func (t *utlsRoundTripper) getOrCreateConnection(host, addr string) (*http2.Clie
return h2Conn, nil
}
// createConnection creates a new HTTP/2 connection with Firefox TLS fingerprint
// createConnection creates a new HTTP/2 connection with Chrome TLS fingerprint.
// Chrome's TLS fingerprint is closer to Node.js/OpenSSL (which real Claude Code uses)
// than Firefox, reducing the mismatch between TLS layer and HTTP headers.
func (t *utlsRoundTripper) createConnection(host, addr string) (*http2.ClientConn, error) {
conn, err := t.dialer.Dial("tcp", addr)
if err != nil {
@@ -108,7 +110,7 @@ func (t *utlsRoundTripper) createConnection(host, addr string) (*http2.ClientCon
}
tlsConfig := &tls.Config{ServerName: host}
tlsConn := tls.UClient(conn, tlsConfig, tls.HelloFirefox_Auto)
tlsConn := tls.UClient(conn, tlsConfig, tls.HelloChrome_Auto)
if err := tlsConn.Handshake(); err != nil {
conn.Close()
@@ -156,7 +158,7 @@ func (t *utlsRoundTripper) RoundTrip(req *http.Request) (*http.Response, error)
}
// NewAnthropicHttpClient creates an HTTP client that bypasses TLS fingerprinting
// for Anthropic domains by using utls with Firefox fingerprint.
// for Anthropic domains by using utls with Chrome fingerprint.
// It accepts optional SDK configuration for proxy settings.
func NewAnthropicHttpClient(cfg *config.SDKConfig) *http.Client {
return &http.Client{

View File

@@ -69,6 +69,9 @@ type Config struct {
// RequestRetry defines the retry times when the request failed.
RequestRetry int `yaml:"request-retry" json:"request-retry"`
// MaxRetryCredentials defines the maximum number of credentials to try for a failed request.
// Set to 0 or a negative value to keep trying all available credentials (legacy behavior).
MaxRetryCredentials int `yaml:"max-retry-credentials" json:"max-retry-credentials"`
// MaxRetryInterval defines the maximum wait time in seconds before retrying a cooled-down credential.
MaxRetryInterval int `yaml:"max-retry-interval" json:"max-retry-interval"`
@@ -673,6 +676,10 @@ func LoadConfigOptional(configFile string, optional bool) (*Config, error) {
cfg.ErrorLogsMaxFiles = 10
}
if cfg.MaxRetryCredentials < 0 {
cfg.MaxRetryCredentials = 0
}
// Sanitize Gemini API key configuration and migrate legacy entries.
cfg.SanitizeGeminiKeys()

View File

@@ -1 +1 @@
[{"type":"text","text":"You are Claude Code, Anthropic's official CLI for Claude.","cache_control":{"type":"ephemeral"}}]
[{"type":"text","text":"You are a Claude agent, built on Anthropic's Claude Agent SDK.","cache_control":{"type":"ephemeral","ttl":"1h"}}]

View File

@@ -969,9 +969,10 @@ func GetAntigravityModelConfig() map[string]*AntigravityModelConfig {
"gemini-3-flash": {Thinking: &ThinkingSupport{Min: 128, Max: 32768, ZeroAllowed: false, DynamicAllowed: true, Levels: []string{"minimal", "low", "medium", "high"}}},
"claude-opus-4-5-thinking": {Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: true, DynamicAllowed: true}, MaxCompletionTokens: 64000},
"claude-opus-4-6-thinking": {Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: true, DynamicAllowed: true}, MaxCompletionTokens: 64000},
"claude-sonnet-4-5": {MaxCompletionTokens: 64000},
"claude-opus-4-6": {Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: true, DynamicAllowed: true}, MaxCompletionTokens: 128000},
"claude-sonnet-4-5": {Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: true, DynamicAllowed: true}, MaxCompletionTokens: 64000},
"claude-sonnet-4-6": {Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: true, DynamicAllowed: true}, MaxCompletionTokens: 64000},
"claude-sonnet-4-5-thinking": {Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: true, DynamicAllowed: true}, MaxCompletionTokens: 64000},
"claude-sonnet-4-6": {MaxCompletionTokens: 64000},
"claude-sonnet-4-6-thinking": {Thinking: &ThinkingSupport{Min: 1024, Max: 128000, ZeroAllowed: true, DynamicAllowed: true}, MaxCompletionTokens: 64000},
"gpt-oss-120b-medium": {},
"tab_flash_lite_preview": {},

View File

@@ -6,6 +6,9 @@ import (
"compress/flate"
"compress/gzip"
"context"
"crypto/rand"
"crypto/sha256"
"encoding/hex"
"fmt"
"io"
"net/http"
@@ -36,7 +39,9 @@ type ClaudeExecutor struct {
cfg *config.Config
}
const claudeToolPrefix = "proxy_"
// claudeToolPrefix is empty to match real Claude Code behavior (no tool name prefix).
// Previously "proxy_" was used but this is a detectable fingerprint difference.
const claudeToolPrefix = ""
func NewClaudeExecutor(cfg *config.Config) *ClaudeExecutor { return &ClaudeExecutor{cfg: cfg} }
@@ -696,17 +701,13 @@ func applyClaudeHeaders(r *http.Request, auth *cliproxyauth.Auth, apiKey string,
ginHeaders = ginCtx.Request.Header
}
promptCachingBeta := "prompt-caching-2024-07-31"
baseBetas := "claude-code-20250219,oauth-2025-04-20,interleaved-thinking-2025-05-14,fine-grained-tool-streaming-2025-05-14," + promptCachingBeta
baseBetas := "claude-code-20250219,oauth-2025-04-20,interleaved-thinking-2025-05-14,context-management-2025-06-27,prompt-caching-scope-2026-01-05"
if val := strings.TrimSpace(ginHeaders.Get("Anthropic-Beta")); val != "" {
baseBetas = val
if !strings.Contains(val, "oauth") {
baseBetas += ",oauth-2025-04-20"
}
}
if !strings.Contains(baseBetas, promptCachingBeta) {
baseBetas += "," + promptCachingBeta
}
// Merge extra betas from request body
if len(extraBetas) > 0 {
@@ -727,8 +728,7 @@ func applyClaudeHeaders(r *http.Request, auth *cliproxyauth.Auth, apiKey string,
misc.EnsureHeader(r.Header, ginHeaders, "Anthropic-Version", "2023-06-01")
misc.EnsureHeader(r.Header, ginHeaders, "Anthropic-Dangerous-Direct-Browser-Access", "true")
misc.EnsureHeader(r.Header, ginHeaders, "X-App", "cli")
// Values below match Claude Code 2.1.44 / @anthropic-ai/sdk 0.74.0 (captured 2026-02-17).
misc.EnsureHeader(r.Header, ginHeaders, "X-Stainless-Helper-Method", "stream")
// Values below match Claude Code 2.1.63 / @anthropic-ai/sdk 0.74.0 (updated 2026-02-28).
misc.EnsureHeader(r.Header, ginHeaders, "X-Stainless-Retry-Count", "0")
misc.EnsureHeader(r.Header, ginHeaders, "X-Stainless-Runtime-Version", hdrDefault(hd.RuntimeVersion, "v24.3.0"))
misc.EnsureHeader(r.Header, ginHeaders, "X-Stainless-Package-Version", hdrDefault(hd.PackageVersion, "0.74.0"))
@@ -737,7 +737,18 @@ func applyClaudeHeaders(r *http.Request, auth *cliproxyauth.Auth, apiKey string,
misc.EnsureHeader(r.Header, ginHeaders, "X-Stainless-Arch", mapStainlessArch())
misc.EnsureHeader(r.Header, ginHeaders, "X-Stainless-Os", mapStainlessOS())
misc.EnsureHeader(r.Header, ginHeaders, "X-Stainless-Timeout", hdrDefault(hd.Timeout, "600"))
misc.EnsureHeader(r.Header, ginHeaders, "User-Agent", hdrDefault(hd.UserAgent, "claude-cli/2.1.44 (external, sdk-cli)"))
// For User-Agent, only forward the client's header if it's already a Claude Code client.
// Non-Claude-Code clients (e.g. curl, OpenAI SDKs) get the default Claude Code User-Agent
// to avoid leaking the real client identity during cloaking.
clientUA := ""
if ginHeaders != nil {
clientUA = ginHeaders.Get("User-Agent")
}
if isClaudeCodeClient(clientUA) {
r.Header.Set("User-Agent", clientUA)
} else {
r.Header.Set("User-Agent", hdrDefault(hd.UserAgent, "claude-cli/2.1.63 (external, cli)"))
}
r.Header.Set("Connection", "keep-alive")
r.Header.Set("Accept-Encoding", "gzip, deflate, br, zstd")
if stream {
@@ -771,22 +782,7 @@ func claudeCreds(a *cliproxyauth.Auth) (apiKey, baseURL string) {
}
func checkSystemInstructions(payload []byte) []byte {
system := gjson.GetBytes(payload, "system")
claudeCodeInstructions := `[{"type":"text","text":"You are Claude Code, Anthropic's official CLI for Claude."}]`
if system.IsArray() {
if gjson.GetBytes(payload, "system.0.text").String() != "You are Claude Code, Anthropic's official CLI for Claude." {
system.ForEach(func(_, part gjson.Result) bool {
if part.Get("type").String() == "text" {
claudeCodeInstructions, _ = sjson.SetRaw(claudeCodeInstructions, "-1", part.Raw)
}
return true
})
payload, _ = sjson.SetRawBytes(payload, "system", []byte(claudeCodeInstructions))
}
} else {
payload, _ = sjson.SetRawBytes(payload, "system", []byte(claudeCodeInstructions))
}
return payload
return checkSystemInstructionsWithMode(payload, false)
}
func isClaudeOAuthToken(apiKey string) bool {
@@ -1060,33 +1056,67 @@ func injectFakeUserID(payload []byte, apiKey string, useCache bool) []byte {
return payload
}
// checkSystemInstructionsWithMode injects Claude Code system prompt.
// In strict mode, it replaces all user system messages.
// In non-strict mode (default), it prepends to existing system messages.
// generateBillingHeader creates the x-anthropic-billing-header text block that
// real Claude Code prepends to every system prompt array.
// Format: x-anthropic-billing-header: cc_version=<ver>.<build>; cc_entrypoint=cli; cch=<hash>;
func generateBillingHeader(payload []byte) string {
// Generate a deterministic cch hash from the payload content (system + messages + tools).
// Real Claude Code uses a 5-char hex hash that varies per request.
h := sha256.Sum256(payload)
cch := hex.EncodeToString(h[:])[:5]
// Build hash: 3-char hex, matches the pattern seen in real requests (e.g. "a43")
buildBytes := make([]byte, 2)
_, _ = rand.Read(buildBytes)
buildHash := hex.EncodeToString(buildBytes)[:3]
return fmt.Sprintf("x-anthropic-billing-header: cc_version=2.1.63.%s; cc_entrypoint=cli; cch=%s;", buildHash, cch)
}
// checkSystemInstructionsWithMode injects Claude Code system prompt to match
// the real Claude Code request format:
// system[0]: billing header (no cache_control)
// system[1]: "You are a Claude agent, built on Anthropic's Claude Agent SDK." (with cache_control)
// system[2..]: user's system messages (with cache_control on last)
func checkSystemInstructionsWithMode(payload []byte, strictMode bool) []byte {
system := gjson.GetBytes(payload, "system")
claudeCodeInstructions := `[{"type":"text","text":"You are Claude Code, Anthropic's official CLI for Claude."}]`
billingText := generateBillingHeader(payload)
billingBlock := fmt.Sprintf(`{"type":"text","text":"%s"}`, billingText)
agentBlock := `{"type":"text","text":"You are a Claude agent, built on Anthropic's Claude Agent SDK.","cache_control":{"type":"ephemeral","ttl":"1h"}}`
if strictMode {
// Strict mode: replace all system messages with Claude Code prompt only
payload, _ = sjson.SetRawBytes(payload, "system", []byte(claudeCodeInstructions))
// Strict mode: billing header + agent identifier only
result := "[" + billingBlock + "," + agentBlock + "]"
payload, _ = sjson.SetRawBytes(payload, "system", []byte(result))
return payload
}
// Non-strict mode (default): prepend Claude Code prompt to existing system messages
if system.IsArray() {
if gjson.GetBytes(payload, "system.0.text").String() != "You are Claude Code, Anthropic's official CLI for Claude." {
system.ForEach(func(_, part gjson.Result) bool {
if part.Get("type").String() == "text" {
claudeCodeInstructions, _ = sjson.SetRaw(claudeCodeInstructions, "-1", part.Raw)
}
return true
})
payload, _ = sjson.SetRawBytes(payload, "system", []byte(claudeCodeInstructions))
}
} else {
payload, _ = sjson.SetRawBytes(payload, "system", []byte(claudeCodeInstructions))
// Non-strict mode: billing header + agent identifier + user system messages
// Skip if already injected
firstText := gjson.GetBytes(payload, "system.0.text").String()
if strings.HasPrefix(firstText, "x-anthropic-billing-header:") {
return payload
}
result := "[" + billingBlock + "," + agentBlock
if system.IsArray() {
system.ForEach(func(_, part gjson.Result) bool {
if part.Get("type").String() == "text" {
// Add cache_control with ttl to user system messages if not present
partJSON := part.Raw
if !part.Get("cache_control").Exists() {
partJSON, _ = sjson.Set(partJSON, "cache_control.type", "ephemeral")
partJSON, _ = sjson.Set(partJSON, "cache_control.ttl", "1h")
}
result += "," + partJSON
}
return true
})
}
result += "]"
payload, _ = sjson.SetRawBytes(payload, "system", []byte(result))
return payload
}

View File

@@ -9,17 +9,18 @@ import (
"github.com/google/uuid"
)
// userIDPattern matches Claude Code format: user_[64-hex]_account__session_[uuid-v4]
var userIDPattern = regexp.MustCompile(`^user_[a-fA-F0-9]{64}_account__session_[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$`)
// userIDPattern matches Claude Code format: user_[64-hex]_account_[uuid]_session_[uuid]
var userIDPattern = regexp.MustCompile(`^user_[a-fA-F0-9]{64}_account_[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}_session_[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$`)
// generateFakeUserID generates a fake user ID in Claude Code format.
// Format: user_[64-hex-chars]_account__session_[UUID-v4]
// Format: user_[64-hex-chars]_account_[UUID-v4]_session_[UUID-v4]
func generateFakeUserID() string {
hexBytes := make([]byte, 32)
_, _ = rand.Read(hexBytes)
hexPart := hex.EncodeToString(hexBytes)
uuidPart := uuid.New().String()
return "user_" + hexPart + "_account__session_" + uuidPart
accountUUID := uuid.New().String()
sessionUUID := uuid.New().String()
return "user_" + hexPart + "_account_" + accountUUID + "_session_" + sessionUUID
}
// isValidUserID checks if a user ID matches Claude Code format.

View File

@@ -1,8 +1,7 @@
// Package kimi implements thinking configuration for Kimi (Moonshot AI) models.
//
// Kimi models use the OpenAI-compatible reasoning_effort format with discrete levels
// (low/medium/high). The provider strips any existing thinking config and applies
// the unified ThinkingConfig in OpenAI format.
// Kimi models use the OpenAI-compatible reasoning_effort format for enabled thinking
// levels, but use thinking.type=disabled when thinking is explicitly turned off.
package kimi
import (
@@ -17,8 +16,8 @@ import (
// Applier implements thinking.ProviderApplier for Kimi models.
//
// Kimi-specific behavior:
// - Output format: reasoning_effort (string: low/medium/high)
// - Uses OpenAI-compatible format
// - Enabled thinking: reasoning_effort (string levels)
// - Disabled thinking: thinking.type="disabled"
// - Supports budget-to-level conversion
type Applier struct{}
@@ -35,11 +34,19 @@ func init() {
// Apply applies thinking configuration to Kimi request body.
//
// Expected output format:
// Expected output format (enabled):
//
// {
// "reasoning_effort": "high"
// }
//
// Expected output format (disabled):
//
// {
// "thinking": {
// "type": "disabled"
// }
// }
func (a *Applier) Apply(body []byte, config thinking.ThinkingConfig, modelInfo *registry.ModelInfo) ([]byte, error) {
if thinking.IsUserDefinedModel(modelInfo) {
return applyCompatibleKimi(body, config)
@@ -60,8 +67,13 @@ func (a *Applier) Apply(body []byte, config thinking.ThinkingConfig, modelInfo *
}
effort = string(config.Level)
case thinking.ModeNone:
// Kimi uses "none" to disable thinking
effort = string(thinking.LevelNone)
// Respect clamped fallback level for models that cannot disable thinking.
if config.Level != "" && config.Level != thinking.LevelNone {
effort = string(config.Level)
break
}
// Kimi requires explicit disabled thinking object.
return applyDisabledThinking(body)
case thinking.ModeBudget:
// Convert budget to level using threshold mapping
level, ok := thinking.ConvertBudgetToLevel(config.Budget)
@@ -79,12 +91,7 @@ func (a *Applier) Apply(body []byte, config thinking.ThinkingConfig, modelInfo *
if effort == "" {
return body, nil
}
result, err := sjson.SetBytes(body, "reasoning_effort", effort)
if err != nil {
return body, fmt.Errorf("kimi thinking: failed to set reasoning_effort: %w", err)
}
return result, nil
return applyReasoningEffort(body, effort)
}
// applyCompatibleKimi applies thinking config for user-defined Kimi models.
@@ -101,7 +108,9 @@ func applyCompatibleKimi(body []byte, config thinking.ThinkingConfig) ([]byte, e
}
effort = string(config.Level)
case thinking.ModeNone:
effort = string(thinking.LevelNone)
if config.Level == "" || config.Level == thinking.LevelNone {
return applyDisabledThinking(body)
}
if config.Level != "" {
effort = string(config.Level)
}
@@ -118,9 +127,33 @@ func applyCompatibleKimi(body []byte, config thinking.ThinkingConfig) ([]byte, e
return body, nil
}
result, err := sjson.SetBytes(body, "reasoning_effort", effort)
if err != nil {
return body, fmt.Errorf("kimi thinking: failed to set reasoning_effort: %w", err)
return applyReasoningEffort(body, effort)
}
func applyReasoningEffort(body []byte, effort string) ([]byte, error) {
result, errDeleteThinking := sjson.DeleteBytes(body, "thinking")
if errDeleteThinking != nil {
return body, fmt.Errorf("kimi thinking: failed to clear thinking object: %w", errDeleteThinking)
}
result, errSetEffort := sjson.SetBytes(result, "reasoning_effort", effort)
if errSetEffort != nil {
return body, fmt.Errorf("kimi thinking: failed to set reasoning_effort: %w", errSetEffort)
}
return result, nil
}
func applyDisabledThinking(body []byte) ([]byte, error) {
result, errDeleteThinking := sjson.DeleteBytes(body, "thinking")
if errDeleteThinking != nil {
return body, fmt.Errorf("kimi thinking: failed to clear thinking object: %w", errDeleteThinking)
}
result, errDeleteEffort := sjson.DeleteBytes(result, "reasoning_effort")
if errDeleteEffort != nil {
return body, fmt.Errorf("kimi thinking: failed to clear reasoning_effort: %w", errDeleteEffort)
}
result, errSetType := sjson.SetBytes(result, "thinking.type", "disabled")
if errSetType != nil {
return body, fmt.Errorf("kimi thinking: failed to set thinking.type: %w", errSetType)
}
return result, nil
}

View File

@@ -0,0 +1,72 @@
package kimi
import (
"testing"
"github.com/router-for-me/CLIProxyAPI/v6/internal/registry"
"github.com/router-for-me/CLIProxyAPI/v6/internal/thinking"
"github.com/tidwall/gjson"
)
func TestApply_ModeNone_UsesDisabledThinking(t *testing.T) {
applier := NewApplier()
modelInfo := &registry.ModelInfo{
ID: "kimi-k2.5",
Thinking: &registry.ThinkingSupport{Min: 1024, Max: 32000, ZeroAllowed: true, DynamicAllowed: true},
}
body := []byte(`{"model":"kimi-k2.5","reasoning_effort":"none","thinking":{"type":"enabled","budget_tokens":2048}}`)
out, errApply := applier.Apply(body, thinking.ThinkingConfig{Mode: thinking.ModeNone}, modelInfo)
if errApply != nil {
t.Fatalf("Apply() error = %v", errApply)
}
if got := gjson.GetBytes(out, "thinking.type").String(); got != "disabled" {
t.Fatalf("thinking.type = %q, want %q, body=%s", got, "disabled", string(out))
}
if gjson.GetBytes(out, "thinking.budget_tokens").Exists() {
t.Fatalf("thinking.budget_tokens should be removed, body=%s", string(out))
}
if gjson.GetBytes(out, "reasoning_effort").Exists() {
t.Fatalf("reasoning_effort should be removed in ModeNone, body=%s", string(out))
}
}
func TestApply_ModeLevel_UsesReasoningEffort(t *testing.T) {
applier := NewApplier()
modelInfo := &registry.ModelInfo{
ID: "kimi-k2.5",
Thinking: &registry.ThinkingSupport{Min: 1024, Max: 32000, ZeroAllowed: true, DynamicAllowed: true},
}
body := []byte(`{"model":"kimi-k2.5","thinking":{"type":"disabled"}}`)
out, errApply := applier.Apply(body, thinking.ThinkingConfig{Mode: thinking.ModeLevel, Level: thinking.LevelHigh}, modelInfo)
if errApply != nil {
t.Fatalf("Apply() error = %v", errApply)
}
if got := gjson.GetBytes(out, "reasoning_effort").String(); got != "high" {
t.Fatalf("reasoning_effort = %q, want %q, body=%s", got, "high", string(out))
}
if gjson.GetBytes(out, "thinking").Exists() {
t.Fatalf("thinking should be removed when reasoning_effort is used, body=%s", string(out))
}
}
func TestApply_UserDefinedModeNone_UsesDisabledThinking(t *testing.T) {
applier := NewApplier()
modelInfo := &registry.ModelInfo{
ID: "custom-kimi-model",
UserDefined: true,
}
body := []byte(`{"model":"custom-kimi-model","reasoning_effort":"none"}`)
out, errApply := applier.Apply(body, thinking.ThinkingConfig{Mode: thinking.ModeNone}, modelInfo)
if errApply != nil {
t.Fatalf("Apply() error = %v", errApply)
}
if got := gjson.GetBytes(out, "thinking.type").String(); got != "disabled" {
t.Fatalf("thinking.type = %q, want %q, body=%s", got, "disabled", string(out))
}
if gjson.GetBytes(out, "reasoning_effort").Exists() {
t.Fatalf("reasoning_effort should be removed in ModeNone, body=%s", string(out))
}
}

View File

@@ -37,6 +37,11 @@ func StripThinkingConfig(body []byte, provider string) []byte {
paths = []string{"request.generationConfig.thinkingConfig"}
case "openai":
paths = []string{"reasoning_effort"}
case "kimi":
paths = []string{
"reasoning_effort",
"thinking",
}
case "codex":
paths = []string{"reasoning.effort"}
case "iflow":

View File

@@ -400,7 +400,7 @@ func ConvertClaudeRequestToAntigravity(modelName string, inputRawJSON []byte, _
hasTools := toolDeclCount > 0
thinkingResult := gjson.GetBytes(rawJSON, "thinking")
thinkingType := thinkingResult.Get("type").String()
hasThinking := thinkingResult.Exists() && thinkingResult.IsObject() && (thinkingType == "enabled" || thinkingType == "adaptive")
hasThinking := thinkingResult.Exists() && thinkingResult.IsObject() && (thinkingType == "enabled" || thinkingType == "adaptive" || thinkingType == "auto")
isClaudeThinking := util.IsClaudeThinkingModel(modelName)
if hasTools && hasThinking && isClaudeThinking {
@@ -440,6 +440,12 @@ func ConvertClaudeRequestToAntigravity(modelName string, inputRawJSON []byte, _
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", budget)
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.includeThoughts", true)
}
case "auto":
// Amp sends thinking.type="auto" — use max budget from model config
// Antigravity API for Claude models requires a concrete positive budget,
// not -1. Use a high default that ApplyThinking will cap to model max.
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.thinkingBudget", 128000)
out, _ = sjson.Set(out, "request.generationConfig.thinkingConfig.includeThoughts", true)
case "adaptive":
// Keep adaptive as a high level sentinel; ApplyThinking resolves it
// to model-specific max capability.

View File

@@ -46,15 +46,23 @@ func ConvertClaudeRequestToCodex(modelName string, inputRawJSON []byte, _ bool)
if systemsResult.IsArray() {
systemResults := systemsResult.Array()
message := `{"type":"message","role":"developer","content":[]}`
contentIndex := 0
for i := 0; i < len(systemResults); i++ {
systemResult := systemResults[i]
systemTypeResult := systemResult.Get("type")
if systemTypeResult.String() == "text" {
message, _ = sjson.Set(message, fmt.Sprintf("content.%d.type", i), "input_text")
message, _ = sjson.Set(message, fmt.Sprintf("content.%d.text", i), systemResult.Get("text").String())
text := systemResult.Get("text").String()
if strings.HasPrefix(text, "x-anthropic-billing-header: ") {
continue
}
message, _ = sjson.Set(message, fmt.Sprintf("content.%d.type", contentIndex), "input_text")
message, _ = sjson.Set(message, fmt.Sprintf("content.%d.text", contentIndex), text)
contentIndex++
}
}
template, _ = sjson.SetRaw(template, "input.-1", message)
if contentIndex > 0 {
template, _ = sjson.SetRaw(template, "input.-1", message)
}
}
// Process messages and transform their contents to appropriate formats.

View File

@@ -127,7 +127,8 @@ func (w *Watcher) reloadConfig() bool {
}
authDirChanged := oldConfig == nil || oldConfig.AuthDir != newConfig.AuthDir
forceAuthRefresh := oldConfig != nil && (oldConfig.ForceModelPrefix != newConfig.ForceModelPrefix || !reflect.DeepEqual(oldConfig.OAuthModelAlias, newConfig.OAuthModelAlias))
retryConfigChanged := oldConfig != nil && (oldConfig.RequestRetry != newConfig.RequestRetry || oldConfig.MaxRetryInterval != newConfig.MaxRetryInterval || oldConfig.MaxRetryCredentials != newConfig.MaxRetryCredentials)
forceAuthRefresh := oldConfig != nil && (oldConfig.ForceModelPrefix != newConfig.ForceModelPrefix || !reflect.DeepEqual(oldConfig.OAuthModelAlias, newConfig.OAuthModelAlias) || retryConfigChanged)
log.Infof("config successfully reloaded, triggering client reload")
w.reloadClients(authDirChanged, affectedOAuthProviders, forceAuthRefresh)

View File

@@ -54,6 +54,9 @@ func BuildConfigChangeDetails(oldCfg, newCfg *config.Config) []string {
if oldCfg.RequestRetry != newCfg.RequestRetry {
changes = append(changes, fmt.Sprintf("request-retry: %d -> %d", oldCfg.RequestRetry, newCfg.RequestRetry))
}
if oldCfg.MaxRetryCredentials != newCfg.MaxRetryCredentials {
changes = append(changes, fmt.Sprintf("max-retry-credentials: %d -> %d", oldCfg.MaxRetryCredentials, newCfg.MaxRetryCredentials))
}
if oldCfg.MaxRetryInterval != newCfg.MaxRetryInterval {
changes = append(changes, fmt.Sprintf("max-retry-interval: %d -> %d", oldCfg.MaxRetryInterval, newCfg.MaxRetryInterval))
}

View File

@@ -223,6 +223,7 @@ func TestBuildConfigChangeDetails_FlagsAndKeys(t *testing.T) {
UsageStatisticsEnabled: false,
DisableCooling: false,
RequestRetry: 1,
MaxRetryCredentials: 1,
MaxRetryInterval: 1,
WebsocketAuth: false,
QuotaExceeded: config.QuotaExceeded{SwitchProject: false, SwitchPreviewModel: false},
@@ -246,6 +247,7 @@ func TestBuildConfigChangeDetails_FlagsAndKeys(t *testing.T) {
UsageStatisticsEnabled: true,
DisableCooling: true,
RequestRetry: 2,
MaxRetryCredentials: 3,
MaxRetryInterval: 3,
WebsocketAuth: true,
QuotaExceeded: config.QuotaExceeded{SwitchProject: true, SwitchPreviewModel: true},
@@ -283,6 +285,7 @@ func TestBuildConfigChangeDetails_FlagsAndKeys(t *testing.T) {
expectContains(t, details, "disable-cooling: false -> true")
expectContains(t, details, "request-log: false -> true")
expectContains(t, details, "request-retry: 1 -> 2")
expectContains(t, details, "max-retry-credentials: 1 -> 3")
expectContains(t, details, "max-retry-interval: 1 -> 3")
expectContains(t, details, "proxy-url: http://old-proxy -> http://new-proxy")
expectContains(t, details, "ws-auth: false -> true")
@@ -309,6 +312,7 @@ func TestBuildConfigChangeDetails_AllBranches(t *testing.T) {
UsageStatisticsEnabled: false,
DisableCooling: false,
RequestRetry: 1,
MaxRetryCredentials: 1,
MaxRetryInterval: 1,
WebsocketAuth: false,
QuotaExceeded: config.QuotaExceeded{SwitchProject: false, SwitchPreviewModel: false},
@@ -361,6 +365,7 @@ func TestBuildConfigChangeDetails_AllBranches(t *testing.T) {
UsageStatisticsEnabled: true,
DisableCooling: true,
RequestRetry: 2,
MaxRetryCredentials: 3,
MaxRetryInterval: 3,
WebsocketAuth: true,
QuotaExceeded: config.QuotaExceeded{SwitchProject: true, SwitchPreviewModel: true},
@@ -419,6 +424,7 @@ func TestBuildConfigChangeDetails_AllBranches(t *testing.T) {
expectContains(t, changes, "usage-statistics-enabled: false -> true")
expectContains(t, changes, "disable-cooling: false -> true")
expectContains(t, changes, "request-retry: 1 -> 2")
expectContains(t, changes, "max-retry-credentials: 1 -> 3")
expectContains(t, changes, "max-retry-interval: 1 -> 3")
expectContains(t, changes, "proxy-url: http://old-proxy -> http://new-proxy")
expectContains(t, changes, "ws-auth: false -> true")

View File

@@ -1239,6 +1239,67 @@ func TestReloadConfigFiltersAffectedOAuthProviders(t *testing.T) {
}
}
func TestReloadConfigTriggersCallbackForMaxRetryCredentialsChange(t *testing.T) {
tmpDir := t.TempDir()
authDir := filepath.Join(tmpDir, "auth")
if err := os.MkdirAll(authDir, 0o755); err != nil {
t.Fatalf("failed to create auth dir: %v", err)
}
configPath := filepath.Join(tmpDir, "config.yaml")
oldCfg := &config.Config{
AuthDir: authDir,
MaxRetryCredentials: 0,
RequestRetry: 1,
MaxRetryInterval: 5,
}
newCfg := &config.Config{
AuthDir: authDir,
MaxRetryCredentials: 2,
RequestRetry: 1,
MaxRetryInterval: 5,
}
data, errMarshal := yaml.Marshal(newCfg)
if errMarshal != nil {
t.Fatalf("failed to marshal config: %v", errMarshal)
}
if errWrite := os.WriteFile(configPath, data, 0o644); errWrite != nil {
t.Fatalf("failed to write config: %v", errWrite)
}
callbackCalls := 0
callbackMaxRetryCredentials := -1
w := &Watcher{
configPath: configPath,
authDir: authDir,
lastAuthHashes: make(map[string]string),
reloadCallback: func(cfg *config.Config) {
callbackCalls++
if cfg != nil {
callbackMaxRetryCredentials = cfg.MaxRetryCredentials
}
},
}
w.SetConfig(oldCfg)
if ok := w.reloadConfig(); !ok {
t.Fatal("expected reloadConfig to succeed")
}
if callbackCalls != 1 {
t.Fatalf("expected reload callback to be called once, got %d", callbackCalls)
}
if callbackMaxRetryCredentials != 2 {
t.Fatalf("expected callback MaxRetryCredentials=2, got %d", callbackMaxRetryCredentials)
}
w.clientsMutex.RLock()
defer w.clientsMutex.RUnlock()
if w.config == nil || w.config.MaxRetryCredentials != 2 {
t.Fatalf("expected watcher config MaxRetryCredentials=2, got %+v", w.config)
}
}
func TestStartFailsWhenAuthDirMissing(t *testing.T) {
tmpDir := t.TempDir()
configPath := filepath.Join(tmpDir, "config.yaml")

View File

@@ -138,8 +138,9 @@ type Manager struct {
providerOffsets map[string]int
// Retry controls request retry behavior.
requestRetry atomic.Int32
maxRetryInterval atomic.Int64
requestRetry atomic.Int32
maxRetryCredentials atomic.Int32
maxRetryInterval atomic.Int64
// oauthModelAlias stores global OAuth model alias mappings (alias -> upstream name) keyed by channel.
oauthModelAlias atomic.Value
@@ -384,18 +385,22 @@ func compileAPIKeyModelAliasForModels[T interface {
}
}
// SetRetryConfig updates retry attempts and cooldown wait interval.
func (m *Manager) SetRetryConfig(retry int, maxRetryInterval time.Duration) {
// SetRetryConfig updates retry attempts, credential retry limit and cooldown wait interval.
func (m *Manager) SetRetryConfig(retry int, maxRetryInterval time.Duration, maxRetryCredentials int) {
if m == nil {
return
}
if retry < 0 {
retry = 0
}
if maxRetryCredentials < 0 {
maxRetryCredentials = 0
}
if maxRetryInterval < 0 {
maxRetryInterval = 0
}
m.requestRetry.Store(int32(retry))
m.maxRetryCredentials.Store(int32(maxRetryCredentials))
m.maxRetryInterval.Store(maxRetryInterval.Nanoseconds())
}
@@ -506,11 +511,11 @@ func (m *Manager) Execute(ctx context.Context, providers []string, req cliproxye
return cliproxyexecutor.Response{}, &Error{Code: "provider_not_found", Message: "no provider supplied"}
}
_, maxWait := m.retrySettings()
_, maxRetryCredentials, maxWait := m.retrySettings()
var lastErr error
for attempt := 0; ; attempt++ {
resp, errExec := m.executeMixedOnce(ctx, normalized, req, opts)
resp, errExec := m.executeMixedOnce(ctx, normalized, req, opts, maxRetryCredentials)
if errExec == nil {
return resp, nil
}
@@ -537,11 +542,11 @@ func (m *Manager) ExecuteCount(ctx context.Context, providers []string, req clip
return cliproxyexecutor.Response{}, &Error{Code: "provider_not_found", Message: "no provider supplied"}
}
_, maxWait := m.retrySettings()
_, maxRetryCredentials, maxWait := m.retrySettings()
var lastErr error
for attempt := 0; ; attempt++ {
resp, errExec := m.executeCountMixedOnce(ctx, normalized, req, opts)
resp, errExec := m.executeCountMixedOnce(ctx, normalized, req, opts, maxRetryCredentials)
if errExec == nil {
return resp, nil
}
@@ -568,11 +573,11 @@ func (m *Manager) ExecuteStream(ctx context.Context, providers []string, req cli
return nil, &Error{Code: "provider_not_found", Message: "no provider supplied"}
}
_, maxWait := m.retrySettings()
_, maxRetryCredentials, maxWait := m.retrySettings()
var lastErr error
for attempt := 0; ; attempt++ {
result, errStream := m.executeStreamMixedOnce(ctx, normalized, req, opts)
result, errStream := m.executeStreamMixedOnce(ctx, normalized, req, opts, maxRetryCredentials)
if errStream == nil {
return result, nil
}
@@ -591,7 +596,7 @@ func (m *Manager) ExecuteStream(ctx context.Context, providers []string, req cli
return nil, &Error{Code: "auth_not_found", Message: "no auth available"}
}
func (m *Manager) executeMixedOnce(ctx context.Context, providers []string, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (cliproxyexecutor.Response, error) {
func (m *Manager) executeMixedOnce(ctx context.Context, providers []string, req cliproxyexecutor.Request, opts cliproxyexecutor.Options, maxRetryCredentials int) (cliproxyexecutor.Response, error) {
if len(providers) == 0 {
return cliproxyexecutor.Response{}, &Error{Code: "provider_not_found", Message: "no provider supplied"}
}
@@ -600,6 +605,12 @@ func (m *Manager) executeMixedOnce(ctx context.Context, providers []string, req
tried := make(map[string]struct{})
var lastErr error
for {
if maxRetryCredentials > 0 && len(tried) >= maxRetryCredentials {
if lastErr != nil {
return cliproxyexecutor.Response{}, lastErr
}
return cliproxyexecutor.Response{}, &Error{Code: "auth_not_found", Message: "no auth available"}
}
auth, executor, provider, errPick := m.pickNextMixed(ctx, providers, routeModel, opts, tried)
if errPick != nil {
if lastErr != nil {
@@ -647,7 +658,7 @@ func (m *Manager) executeMixedOnce(ctx context.Context, providers []string, req
}
}
func (m *Manager) executeCountMixedOnce(ctx context.Context, providers []string, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (cliproxyexecutor.Response, error) {
func (m *Manager) executeCountMixedOnce(ctx context.Context, providers []string, req cliproxyexecutor.Request, opts cliproxyexecutor.Options, maxRetryCredentials int) (cliproxyexecutor.Response, error) {
if len(providers) == 0 {
return cliproxyexecutor.Response{}, &Error{Code: "provider_not_found", Message: "no provider supplied"}
}
@@ -656,6 +667,12 @@ func (m *Manager) executeCountMixedOnce(ctx context.Context, providers []string,
tried := make(map[string]struct{})
var lastErr error
for {
if maxRetryCredentials > 0 && len(tried) >= maxRetryCredentials {
if lastErr != nil {
return cliproxyexecutor.Response{}, lastErr
}
return cliproxyexecutor.Response{}, &Error{Code: "auth_not_found", Message: "no auth available"}
}
auth, executor, provider, errPick := m.pickNextMixed(ctx, providers, routeModel, opts, tried)
if errPick != nil {
if lastErr != nil {
@@ -703,7 +720,7 @@ func (m *Manager) executeCountMixedOnce(ctx context.Context, providers []string,
}
}
func (m *Manager) executeStreamMixedOnce(ctx context.Context, providers []string, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (*cliproxyexecutor.StreamResult, error) {
func (m *Manager) executeStreamMixedOnce(ctx context.Context, providers []string, req cliproxyexecutor.Request, opts cliproxyexecutor.Options, maxRetryCredentials int) (*cliproxyexecutor.StreamResult, error) {
if len(providers) == 0 {
return nil, &Error{Code: "provider_not_found", Message: "no provider supplied"}
}
@@ -712,6 +729,12 @@ func (m *Manager) executeStreamMixedOnce(ctx context.Context, providers []string
tried := make(map[string]struct{})
var lastErr error
for {
if maxRetryCredentials > 0 && len(tried) >= maxRetryCredentials {
if lastErr != nil {
return nil, lastErr
}
return nil, &Error{Code: "auth_not_found", Message: "no auth available"}
}
auth, executor, provider, errPick := m.pickNextMixed(ctx, providers, routeModel, opts, tried)
if errPick != nil {
if lastErr != nil {
@@ -1108,11 +1131,11 @@ func (m *Manager) normalizeProviders(providers []string) []string {
return result
}
func (m *Manager) retrySettings() (int, time.Duration) {
func (m *Manager) retrySettings() (int, int, time.Duration) {
if m == nil {
return 0, 0
return 0, 0, 0
}
return int(m.requestRetry.Load()), time.Duration(m.maxRetryInterval.Load())
return int(m.requestRetry.Load()), int(m.maxRetryCredentials.Load()), time.Duration(m.maxRetryInterval.Load())
}
func (m *Manager) closestCooldownWait(providers []string, model string, attempt int) (time.Duration, bool) {

View File

@@ -2,13 +2,17 @@ package auth
import (
"context"
"net/http"
"sync"
"testing"
"time"
cliproxyexecutor "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/executor"
)
func TestManager_ShouldRetryAfterError_RespectsAuthRequestRetryOverride(t *testing.T) {
m := NewManager(nil, nil, nil)
m.SetRetryConfig(3, 30*time.Second)
m.SetRetryConfig(3, 30*time.Second, 0)
model := "test-model"
next := time.Now().Add(5 * time.Second)
@@ -31,7 +35,7 @@ func TestManager_ShouldRetryAfterError_RespectsAuthRequestRetryOverride(t *testi
t.Fatalf("register auth: %v", errRegister)
}
_, maxWait := m.retrySettings()
_, _, maxWait := m.retrySettings()
wait, shouldRetry := m.shouldRetryAfterError(&Error{HTTPStatus: 500, Message: "boom"}, 0, []string{"claude"}, model, maxWait)
if shouldRetry {
t.Fatalf("expected shouldRetry=false for request_retry=0, got true (wait=%v)", wait)
@@ -56,6 +60,124 @@ func TestManager_ShouldRetryAfterError_RespectsAuthRequestRetryOverride(t *testi
}
}
type credentialRetryLimitExecutor struct {
id string
mu sync.Mutex
calls int
}
func (e *credentialRetryLimitExecutor) Identifier() string {
return e.id
}
func (e *credentialRetryLimitExecutor) Execute(context.Context, *Auth, cliproxyexecutor.Request, cliproxyexecutor.Options) (cliproxyexecutor.Response, error) {
e.recordCall()
return cliproxyexecutor.Response{}, &Error{HTTPStatus: 500, Message: "boom"}
}
func (e *credentialRetryLimitExecutor) ExecuteStream(context.Context, *Auth, cliproxyexecutor.Request, cliproxyexecutor.Options) (*cliproxyexecutor.StreamResult, error) {
e.recordCall()
return nil, &Error{HTTPStatus: 500, Message: "boom"}
}
func (e *credentialRetryLimitExecutor) Refresh(_ context.Context, auth *Auth) (*Auth, error) {
return auth, nil
}
func (e *credentialRetryLimitExecutor) CountTokens(context.Context, *Auth, cliproxyexecutor.Request, cliproxyexecutor.Options) (cliproxyexecutor.Response, error) {
e.recordCall()
return cliproxyexecutor.Response{}, &Error{HTTPStatus: 500, Message: "boom"}
}
func (e *credentialRetryLimitExecutor) HttpRequest(context.Context, *Auth, *http.Request) (*http.Response, error) {
return nil, nil
}
func (e *credentialRetryLimitExecutor) recordCall() {
e.mu.Lock()
defer e.mu.Unlock()
e.calls++
}
func (e *credentialRetryLimitExecutor) Calls() int {
e.mu.Lock()
defer e.mu.Unlock()
return e.calls
}
func newCredentialRetryLimitTestManager(t *testing.T, maxRetryCredentials int) (*Manager, *credentialRetryLimitExecutor) {
t.Helper()
m := NewManager(nil, nil, nil)
m.SetRetryConfig(0, 0, maxRetryCredentials)
executor := &credentialRetryLimitExecutor{id: "claude"}
m.RegisterExecutor(executor)
auth1 := &Auth{ID: "auth-1", Provider: "claude"}
auth2 := &Auth{ID: "auth-2", Provider: "claude"}
if _, errRegister := m.Register(context.Background(), auth1); errRegister != nil {
t.Fatalf("register auth1: %v", errRegister)
}
if _, errRegister := m.Register(context.Background(), auth2); errRegister != nil {
t.Fatalf("register auth2: %v", errRegister)
}
return m, executor
}
func TestManager_MaxRetryCredentials_LimitsCrossCredentialRetries(t *testing.T) {
request := cliproxyexecutor.Request{Model: "test-model"}
testCases := []struct {
name string
invoke func(*Manager) error
}{
{
name: "execute",
invoke: func(m *Manager) error {
_, errExecute := m.Execute(context.Background(), []string{"claude"}, request, cliproxyexecutor.Options{})
return errExecute
},
},
{
name: "execute_count",
invoke: func(m *Manager) error {
_, errExecute := m.ExecuteCount(context.Background(), []string{"claude"}, request, cliproxyexecutor.Options{})
return errExecute
},
},
{
name: "execute_stream",
invoke: func(m *Manager) error {
_, errExecute := m.ExecuteStream(context.Background(), []string{"claude"}, request, cliproxyexecutor.Options{})
return errExecute
},
},
}
for _, tc := range testCases {
tc := tc
t.Run(tc.name, func(t *testing.T) {
limitedManager, limitedExecutor := newCredentialRetryLimitTestManager(t, 1)
if errInvoke := tc.invoke(limitedManager); errInvoke == nil {
t.Fatalf("expected error for limited retry execution")
}
if calls := limitedExecutor.Calls(); calls != 1 {
t.Fatalf("expected 1 call with max-retry-credentials=1, got %d", calls)
}
unlimitedManager, unlimitedExecutor := newCredentialRetryLimitTestManager(t, 0)
if errInvoke := tc.invoke(unlimitedManager); errInvoke == nil {
t.Fatalf("expected error for unlimited retry execution")
}
if calls := unlimitedExecutor.Calls(); calls != 2 {
t.Fatalf("expected 2 calls with max-retry-credentials=0, got %d", calls)
}
})
}
}
func TestManager_MarkResult_RespectsAuthDisableCoolingOverride(t *testing.T) {
prev := quotaCooldownDisabled.Load()
quotaCooldownDisabled.Store(false)

View File

@@ -347,7 +347,7 @@ func (s *Service) applyRetryConfig(cfg *config.Config) {
return
}
maxInterval := time.Duration(cfg.MaxRetryInterval) * time.Second
s.coreManager.SetRetryConfig(cfg.RequestRetry, maxInterval)
s.coreManager.SetRetryConfig(cfg.RequestRetry, maxInterval, cfg.MaxRetryCredentials)
}
func openAICompatInfoFromAuth(a *coreauth.Auth) (providerKey string, compatName string, ok bool) {