Merge branch 'router-for-me:main' into main

Merge pull request #757 from ben-vargas/fix-thinking-toolchoice-conflict
Fix: disable thinking when tool_choice forces tool use
2026-04-24 05:21:11 +00:00 · 2025-12-28 15:09:33 +08:00 · 2025-12-28 14:04:30 +08:00 · 2025-12-27 16:31:37 -07:00 · 2025-12-28 04:40:32 +08:00 · 2025-12-28 03:07:58 +08:00
24 changed files with 1013 additions and 105 deletions
--- a/.dockerignore
+++ b/.dockerignore
@@ -13,8 +13,6 @@ Dockerfile
 docs/*
 README.md
 README_CN.md
 MANAGEMENT_API.md
 MANAGEMENT_API_CN.md
 LICENSE
 # Runtime data folders (should be mounted as volumes)
@@ -32,3 +30,4 @@ bin/*
 .agent/*
 .bmad/*
 _bmad/*
 _bmad-output/*
--- a/.github/ISSUE_TEMPLATE/bug_report.md
+++ b/.github/ISSUE_TEMPLATE/bug_report.md
@@ -7,6 +7,13 @@ assignees: ''
 ---
 **Is it a request payload issue?**
 [  ] Yes, this is a request payload issue. I am using a client/cURL to send a request payload, but I received an unexpected error.
 [  ] No, it's another issue.
 **If it's a request payload issue, you MUST know**
 Our team doesn't have any GODs or ORACLEs or MIND READERs. Please make sure to attach the request log or curl payload.
 **Describe the bug**
 A clear and concise description of what the bug is.
--- a/.gitignore
+++ b/.gitignore
@@ -12,11 +12,15 @@ bin/*
 logs/*
 conv/*
 temp/*
 refs/*
 # Storage backends
 pgstore/*
 gitstore/*
 objectstore/*
 # Static assets
 static/*
 refs/*
 # Authentication data
 auths/*
@@ -36,6 +40,7 @@ GEMINI.md
 .agent/*
 .bmad/*
 _bmad/*
 _bmad-output/*
 .mcp/cache/
 # macOS
--- a/config.example.yaml
+++ b/config.example.yaml
@@ -109,6 +109,9 @@ ws-auth: false
 #     headers:
 #       X-Custom-Header: "custom-value"
 #     proxy-url: "socks5://proxy.example.com:1080" # optional: per-key proxy override
 #     models:
 #       - name: "gpt-5-codex" # upstream model name
 #         alias: "codex-latest" # client alias mapped to the upstream model
 #     excluded-models:
 #       - "gpt-5.1"         # exclude specific models (exact match)
 #       - "gpt-5-*"         # wildcard matching prefix (e.g. gpt-5-medium, gpt-5-codex)
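Review note: the `models` entries above map a client-facing alias to an upstream model name. A minimal sketch of how such a lookup could behave, assuming case-insensitive matching on alias first and then on name; `modelMapping` and `resolveUpstream` are hypothetical names used only for illustration, not the project's API.

```go
package main

import (
	"fmt"
	"strings"
)

// modelMapping mirrors one entry of the "models" list (hypothetical type).
type modelMapping struct {
	Name  string // upstream model name
	Alias string // client-facing alias
}

// resolveUpstream returns the upstream name for a requested model.
// It matches the alias first, then the name, ignoring case.
// An empty string means no entry matched.
func resolveUpstream(models []modelMapping, requested string) string {
	requested = strings.TrimSpace(requested)
	if requested == "" {
		return ""
	}
	for _, m := range models {
		name := strings.TrimSpace(m.Name)
		alias := strings.TrimSpace(m.Alias)
		if alias != "" && strings.EqualFold(alias, requested) {
			if name != "" {
				return name
			}
			// Alias without a name: fall back to what the client asked for.
			return requested
		}
		if name != "" && strings.EqualFold(name, requested) {
			return name
		}
	}
	return ""
}

func main() {
	models := []modelMapping{{Name: "gpt-5-codex", Alias: "codex-latest"}}
	fmt.Println(resolveUpstream(models, "codex-latest"))
	fmt.Println(resolveUpstream(models, "unknown") == "")
}
```

Returning an empty string for "no match" lets the caller keep the originally requested model name unchanged.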
--- a/internal/api/handlers/management/api_tools.go
+++ b/internal/api/handlers/management/api_tools.go
@@ -0,0 +1,538 @@
 package management
 import (
 	"context"
 	"encoding/json"
 	"fmt"
 	"io"
 	"net"
 	"net/http"
 	"net/url"
 	"strings"
 	"time"
 	"github.com/gin-gonic/gin"
 	"github.com/router-for-me/CLIProxyAPI/v6/internal/runtime/geminicli"
 	coreauth "github.com/router-for-me/CLIProxyAPI/v6/sdk/cliproxy/auth"
 	log "github.com/sirupsen/logrus"
 	"golang.org/x/net/proxy"
 	"golang.org/x/oauth2"
 	"golang.org/x/oauth2/google"
 )
 const defaultAPICallTimeout = 60 * time.Second
 const (
 	geminiOAuthClientID     = "681255809395-oo8ft2oprdrnp9e3aqf6av3hmdib135j.apps.googleusercontent.com"
 	geminiOAuthClientSecret = "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"
 )
 var geminiOAuthScopes = []string{
 	"https://www.googleapis.com/auth/cloud-platform",
 	"https://www.googleapis.com/auth/userinfo.email",
 	"https://www.googleapis.com/auth/userinfo.profile",
 }
 type apiCallRequest struct {
 	AuthIndexSnake  *string           `json:"auth_index"`
 	AuthIndexCamel  *string           `json:"authIndex"`
 	AuthIndexPascal *string           `json:"AuthIndex"`
 	Method          string            `json:"method"`
 	URL             string            `json:"url"`
 	Header          map[string]string `json:"header"`
 	Data            string            `json:"data"`
 }
 type apiCallResponse struct {
 	StatusCode int                 `json:"status_code"`
 	Header     map[string][]string `json:"header"`
 	Body       string              `json:"body"`
 }
 // APICall makes a generic HTTP request on behalf of the management API caller.
 // It is protected by the management middleware.
 //
 // Endpoint:
 //
 //	POST /v0/management/api-call
 //
 // Authentication:
 //
 //	Same as other management APIs (requires a management key and remote-management rules).
 //	You can provide the key via:
 //	- Authorization: Bearer <key>
 //	- X-Management-Key: <key>
 //
 // Request JSON:
 //   - auth_index / authIndex / AuthIndex (optional):
 //     The credential "auth_index" from GET /v0/management/auth-files (or other endpoints returning it).
 //     If omitted or not found, credential-specific proxy/token substitution is skipped.
 //   - method (required): HTTP method, e.g. GET, POST, PUT, PATCH, DELETE.
 //   - url (required): Absolute URL including scheme and host, e.g. "https://api.example.com/v1/ping".
 //   - header (optional): Request headers map.
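 //     Supports magic variable "$TOKEN$" which is replaced using the selected credential:
 //     1) metadata.accessToken / metadata.access_token
 //     2) metadata.token / metadata.id_token / metadata.cookie
 //     3) attributes.api_key
 //     4) the same metadata keys on a shared gemini-cli runtime credential
 //     For provider "gemini-cli" the OAuth access token is used instead and refreshed if expired.
 //     Example: {"Authorization":"Bearer $TOKEN$"}.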
 //     Note: if you need to override the HTTP Host header, set header["Host"].
 //   - data (optional): Raw request body as string (useful for POST/PUT/PATCH).
 //
 // Proxy selection (highest priority first):
 //  1. Selected credential proxy_url
 //  2. Global config proxy-url
 //  3. Direct connect (environment proxies are not used)
 //
 // Response JSON (returned with HTTP 200 when the APICall itself succeeds):
 //   - status_code: Upstream HTTP status code.
 //   - header: Upstream response headers.
 //   - body: Upstream response body as string.
 //
 // Example:
 //
 //	curl -sS -X POST "http://127.0.0.1:8317/v0/management/api-call" \
 //	  -H "Authorization: Bearer <MANAGEMENT_KEY>" \
 //	  -H "Content-Type: application/json" \
 //	  -d '{"auth_index":"<AUTH_INDEX>","method":"GET","url":"https://api.example.com/v1/ping","header":{"Authorization":"Bearer $TOKEN$"}}'
 //
 //	curl -sS -X POST "http://127.0.0.1:8317/v0/management/api-call" \
 //	  -H "Authorization: Bearer <MANAGEMENT_KEY>" \
 //	  -H "Content-Type: application/json" \
 //	  -d '{"auth_index":"<AUTH_INDEX>","method":"POST","url":"https://api.example.com/v1/fetchAvailableModels","header":{"Authorization":"Bearer $TOKEN$","Content-Type":"application/json","User-Agent":"cliproxyapi"},"data":"{}"}'
 func (h *Handler) APICall(c *gin.Context) {
 	var body apiCallRequest
 	if errBindJSON := c.ShouldBindJSON(&body); errBindJSON != nil {
 		c.JSON(http.StatusBadRequest, gin.H{"error": "invalid body"})
 		return
 	}
 	method := strings.ToUpper(strings.TrimSpace(body.Method))
 	if method == "" {
 		c.JSON(http.StatusBadRequest, gin.H{"error": "missing method"})
 		return
 	}
 	urlStr := strings.TrimSpace(body.URL)
 	if urlStr == "" {
 		c.JSON(http.StatusBadRequest, gin.H{"error": "missing url"})
 		return
 	}
 	parsedURL, errParseURL := url.Parse(urlStr)
 	if errParseURL != nil || parsedURL.Scheme == "" || parsedURL.Host == "" {
 		c.JSON(http.StatusBadRequest, gin.H{"error": "invalid url"})
 		return
 	}
 	authIndex := firstNonEmptyString(body.AuthIndexSnake, body.AuthIndexCamel, body.AuthIndexPascal)
 	auth := h.authByIndex(authIndex)
 	reqHeaders := body.Header
 	if reqHeaders == nil {
 		reqHeaders = map[string]string{}
 	}
 	var hostOverride string
 	var token string
 	var tokenResolved bool
 	var tokenErr error
 	for key, value := range reqHeaders {
 		if !strings.Contains(value, "$TOKEN$") {
 			continue
 		}
 		if !tokenResolved {
 			token, tokenErr = h.resolveTokenForAuth(c.Request.Context(), auth)
 			tokenResolved = true
 		}
 		if auth != nil && token == "" {
 			if tokenErr != nil {
 				c.JSON(http.StatusBadRequest, gin.H{"error": "auth token refresh failed"})
 				return
 			}
 			c.JSON(http.StatusBadRequest, gin.H{"error": "auth token not found"})
 			return
 		}
 		if token == "" {
 			continue
 		}
 		reqHeaders[key] = strings.ReplaceAll(value, "$TOKEN$", token)
 	}
 	var requestBody io.Reader
 	if body.Data != "" {
 		requestBody = strings.NewReader(body.Data)
 	}
 	req, errNewRequest := http.NewRequestWithContext(c.Request.Context(), method, urlStr, requestBody)
 	if errNewRequest != nil {
 		c.JSON(http.StatusBadRequest, gin.H{"error": "failed to build request"})
 		return
 	}
 	for key, value := range reqHeaders {
 		if strings.EqualFold(key, "host") {
 			hostOverride = strings.TrimSpace(value)
 			continue
 		}
 		req.Header.Set(key, value)
 	}
 	if hostOverride != "" {
 		req.Host = hostOverride
 	}
 	httpClient := &http.Client{
 		Timeout: defaultAPICallTimeout,
 	}
 	httpClient.Transport = h.apiCallTransport(auth)
 	resp, errDo := httpClient.Do(req)
 	if errDo != nil {
 		log.WithError(errDo).Debug("management APICall request failed")
 		c.JSON(http.StatusBadGateway, gin.H{"error": "request failed"})
 		return
 	}
 	defer func() {
 		if errClose := resp.Body.Close(); errClose != nil {
 			log.Errorf("response body close error: %v", errClose)
 		}
 	}()
 	respBody, errReadAll := io.ReadAll(resp.Body)
 	if errReadAll != nil {
 		c.JSON(http.StatusBadGateway, gin.H{"error": "failed to read response"})
 		return
 	}
 	c.JSON(http.StatusOK, apiCallResponse{
 		StatusCode: resp.StatusCode,
 		Header:     resp.Header,
 		Body:       string(respBody),
 	})
 }
 func firstNonEmptyString(values ...*string) string {
 	for _, v := range values {
 		if v == nil {
 			continue
 		}
 		if out := strings.TrimSpace(*v); out != "" {
 			return out
 		}
 	}
 	return ""
 }
 func tokenValueForAuth(auth *coreauth.Auth) string {
 	if auth == nil {
 		return ""
 	}
 	if v := tokenValueFromMetadata(auth.Metadata); v != "" {
 		return v
 	}
 	if auth.Attributes != nil {
 		if v := strings.TrimSpace(auth.Attributes["api_key"]); v != "" {
 			return v
 		}
 	}
 	if shared := geminicli.ResolveSharedCredential(auth.Runtime); shared != nil {
 		if v := tokenValueFromMetadata(shared.MetadataSnapshot()); v != "" {
 			return v
 		}
 	}
 	return ""
 }
 func (h *Handler) resolveTokenForAuth(ctx context.Context, auth *coreauth.Auth) (string, error) {
 	if auth == nil {
 		return "", nil
 	}
 	provider := strings.ToLower(strings.TrimSpace(auth.Provider))
 	if provider == "gemini-cli" {
 		token, errToken := h.refreshGeminiOAuthAccessToken(ctx, auth)
 		return token, errToken
 	}
 	return tokenValueForAuth(auth), nil
 }
 func (h *Handler) refreshGeminiOAuthAccessToken(ctx context.Context, auth *coreauth.Auth) (string, error) {
 	if ctx == nil {
 		ctx = context.Background()
 	}
 	if auth == nil {
 		return "", nil
 	}
 	metadata, updater := geminiOAuthMetadata(auth)
 	if len(metadata) == 0 {
 		return "", fmt.Errorf("gemini oauth metadata missing")
 	}
 	base := make(map[string]any)
 	if tokenRaw, ok := metadata["token"].(map[string]any); ok && tokenRaw != nil {
 		base = cloneMap(tokenRaw)
 	}
 	var token oauth2.Token
 	if len(base) > 0 {
 		if raw, errMarshal := json.Marshal(base); errMarshal == nil {
 			_ = json.Unmarshal(raw, &token)
 		}
 	}
 	if token.AccessToken == "" {
 		token.AccessToken = stringValue(metadata, "access_token")
 	}
 	if token.RefreshToken == "" {
 		token.RefreshToken = stringValue(metadata, "refresh_token")
 	}
 	if token.TokenType == "" {
 		token.TokenType = stringValue(metadata, "token_type")
 	}
 	if token.Expiry.IsZero() {
 		if expiry := stringValue(metadata, "expiry"); expiry != "" {
 			if ts, errParseTime := time.Parse(time.RFC3339, expiry); errParseTime == nil {
 				token.Expiry = ts
 			}
 		}
 	}
 	conf := &oauth2.Config{
 		ClientID:     geminiOAuthClientID,
 		ClientSecret: geminiOAuthClientSecret,
 		Scopes:       geminiOAuthScopes,
 		Endpoint:     google.Endpoint,
 	}
 	ctxToken := ctx
 	httpClient := &http.Client{
 		Timeout:   defaultAPICallTimeout,
 		Transport: h.apiCallTransport(auth),
 	}
 	ctxToken = context.WithValue(ctxToken, oauth2.HTTPClient, httpClient)
 	src := conf.TokenSource(ctxToken, &token)
 	currentToken, errToken := src.Token()
 	if errToken != nil {
 		return "", errToken
 	}
 	merged := buildOAuthTokenMap(base, currentToken)
 	fields := buildOAuthTokenFields(currentToken, merged)
 	if updater != nil {
 		updater(fields)
 	}
 	return strings.TrimSpace(currentToken.AccessToken), nil
 }
 func geminiOAuthMetadata(auth *coreauth.Auth) (map[string]any, func(map[string]any)) {
 	if auth == nil {
 		return nil, nil
 	}
 	if shared := geminicli.ResolveSharedCredential(auth.Runtime); shared != nil {
 		snapshot := shared.MetadataSnapshot()
 		return snapshot, func(fields map[string]any) { shared.MergeMetadata(fields) }
 	}
 	return auth.Metadata, func(fields map[string]any) {
 		if auth.Metadata == nil {
 			auth.Metadata = make(map[string]any)
 		}
 		for k, v := range fields {
 			auth.Metadata[k] = v
 		}
 	}
 }
 func stringValue(metadata map[string]any, key string) string {
 	if len(metadata) == 0 || key == "" {
 		return ""
 	}
 	if v, ok := metadata[key].(string); ok {
 		return strings.TrimSpace(v)
 	}
 	return ""
 }
 func cloneMap(in map[string]any) map[string]any {
 	if len(in) == 0 {
 		return nil
 	}
 	out := make(map[string]any, len(in))
 	for k, v := range in {
 		out[k] = v
 	}
 	return out
 }
 func buildOAuthTokenMap(base map[string]any, tok *oauth2.Token) map[string]any {
 	merged := cloneMap(base)
 	if merged == nil {
 		merged = make(map[string]any)
 	}
 	if tok == nil {
 		return merged
 	}
 	if raw, errMarshal := json.Marshal(tok); errMarshal == nil {
 		var tokenMap map[string]any
 		if errUnmarshal := json.Unmarshal(raw, &tokenMap); errUnmarshal == nil {
 			for k, v := range tokenMap {
 				merged[k] = v
 			}
 		}
 	}
 	return merged
 }
 func buildOAuthTokenFields(tok *oauth2.Token, merged map[string]any) map[string]any {
 	fields := make(map[string]any, 5)
 	if tok != nil && tok.AccessToken != "" {
 		fields["access_token"] = tok.AccessToken
 	}
 	if tok != nil && tok.TokenType != "" {
 		fields["token_type"] = tok.TokenType
 	}
 	if tok != nil && tok.RefreshToken != "" {
 		fields["refresh_token"] = tok.RefreshToken
 	}
 	if tok != nil && !tok.Expiry.IsZero() {
 		fields["expiry"] = tok.Expiry.Format(time.RFC3339)
 	}
 	if len(merged) > 0 {
 		fields["token"] = cloneMap(merged)
 	}
 	return fields
 }
 func tokenValueFromMetadata(metadata map[string]any) string {
 	if len(metadata) == 0 {
 		return ""
 	}
 	if v, ok := metadata["accessToken"].(string); ok && strings.TrimSpace(v) != "" {
 		return strings.TrimSpace(v)
 	}
 	if v, ok := metadata["access_token"].(string); ok && strings.TrimSpace(v) != "" {
 		return strings.TrimSpace(v)
 	}
 	if tokenRaw, ok := metadata["token"]; ok && tokenRaw != nil {
 		switch typed := tokenRaw.(type) {
 		case string:
 			if v := strings.TrimSpace(typed); v != "" {
 				return v
 			}
 		case map[string]any:
 			if v, ok := typed["access_token"].(string); ok && strings.TrimSpace(v) != "" {
 				return strings.TrimSpace(v)
 			}
 			if v, ok := typed["accessToken"].(string); ok && strings.TrimSpace(v) != "" {
 				return strings.TrimSpace(v)
 			}
 		case map[string]string:
 			if v := strings.TrimSpace(typed["access_token"]); v != "" {
 				return v
 			}
 			if v := strings.TrimSpace(typed["accessToken"]); v != "" {
 				return v
 			}
 		}
 	}
 	if v, ok := metadata["token"].(string); ok && strings.TrimSpace(v) != "" {
 		return strings.TrimSpace(v)
 	}
 	if v, ok := metadata["id_token"].(string); ok && strings.TrimSpace(v) != "" {
 		return strings.TrimSpace(v)
 	}
 	if v, ok := metadata["cookie"].(string); ok && strings.TrimSpace(v) != "" {
 		return strings.TrimSpace(v)
 	}
 	return ""
 }
 func (h *Handler) authByIndex(authIndex string) *coreauth.Auth {
 	authIndex = strings.TrimSpace(authIndex)
 	if authIndex == "" || h == nil || h.authManager == nil {
 		return nil
 	}
 	auths := h.authManager.List()
 	for _, auth := range auths {
 		if auth == nil {
 			continue
 		}
 		auth.EnsureIndex()
 		if auth.Index == authIndex {
 			return auth
 		}
 	}
 	return nil
 }
 func (h *Handler) apiCallTransport(auth *coreauth.Auth) http.RoundTripper {
 	var proxyCandidates []string
 	if auth != nil {
 		if proxyStr := strings.TrimSpace(auth.ProxyURL); proxyStr != "" {
 			proxyCandidates = append(proxyCandidates, proxyStr)
 		}
 	}
 	if h != nil && h.cfg != nil {
 		if proxyStr := strings.TrimSpace(h.cfg.ProxyURL); proxyStr != "" {
 			proxyCandidates = append(proxyCandidates, proxyStr)
 		}
 	}
 	for _, proxyStr := range proxyCandidates {
 		if transport := buildProxyTransport(proxyStr); transport != nil {
 			return transport
 		}
 	}
 	transport, ok := http.DefaultTransport.(*http.Transport)
 	if !ok || transport == nil {
 		return &http.Transport{Proxy: nil}
 	}
 	clone := transport.Clone()
 	clone.Proxy = nil
 	return clone
 }
 func buildProxyTransport(proxyStr string) *http.Transport {
 	proxyStr = strings.TrimSpace(proxyStr)
 	if proxyStr == "" {
 		return nil
 	}
 	proxyURL, errParse := url.Parse(proxyStr)
 	if errParse != nil {
 		log.WithError(errParse).Debug("parse proxy URL failed")
 		return nil
 	}
 	if proxyURL.Scheme == "" || proxyURL.Host == "" {
 		log.Debug("proxy URL missing scheme/host")
 		return nil
 	}
 	if proxyURL.Scheme == "socks5" {
 		var proxyAuth *proxy.Auth
 		if proxyURL.User != nil {
 			username := proxyURL.User.Username()
 			password, _ := proxyURL.User.Password()
 			proxyAuth = &proxy.Auth{User: username, Password: password}
 		}
 		dialer, errSOCKS5 := proxy.SOCKS5("tcp", proxyURL.Host, proxyAuth, proxy.Direct)
 		if errSOCKS5 != nil {
 			log.WithError(errSOCKS5).Debug("create SOCKS5 dialer failed")
 			return nil
 		}
 		return &http.Transport{
 			Proxy: nil,
 			DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
 				return dialer.Dial(network, addr)
 			},
 		}
 	}
 	if proxyURL.Scheme == "http" || proxyURL.Scheme == "https" {
 		return &http.Transport{Proxy: http.ProxyURL(proxyURL)}
 	}
 	log.Debugf("unsupported proxy scheme: %s", proxyURL.Scheme)
 	return nil
 }
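Review note: the `$TOKEN$` header substitution in `APICall` can be sketched in isolation as below. This is a simplified illustration, not the handler's code: `substituteToken` is a hypothetical helper, and it copies the map rather than mutating it in place. As in the handler's no-credential path, an empty token leaves the placeholder untouched.

```go
package main

import (
	"fmt"
	"strings"
)

// tokenPlaceholder is the magic variable documented for the api-call endpoint.
const tokenPlaceholder = "$TOKEN$"

// substituteToken returns a copy of headers in which every occurrence of
// "$TOKEN$" is replaced by token. When token is empty, values are copied as-is.
func substituteToken(headers map[string]string, token string) map[string]string {
	out := make(map[string]string, len(headers))
	for key, value := range headers {
		if token != "" && strings.Contains(value, tokenPlaceholder) {
			value = strings.ReplaceAll(value, tokenPlaceholder, token)
		}
		out[key] = value
	}
	return out
}

func main() {
	headers := map[string]string{
		"Authorization": "Bearer $TOKEN$",
		"User-Agent":    "cliproxyapi",
	}
	resolved := substituteToken(headers, "example-token")
	fmt.Println(resolved["Authorization"])
	fmt.Println(resolved["User-Agent"])
}
```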
--- a/internal/api/handlers/management/config_lists.go
+++ b/internal/api/handlers/management/config_lists.go
@@ -597,11 +597,7 @@ func (h *Handler) PutCodexKeys(c *gin.Context) {
 	filtered := make([]config.CodexKey, 0, len(arr))
 	for i := range arr {
 		entry := arr[i]
-		entry.APIKey = strings.TrimSpace(entry.APIKey)
-		entry.BaseURL = strings.TrimSpace(entry.BaseURL)
-		entry.ProxyURL = strings.TrimSpace(entry.ProxyURL)
-		entry.Headers = config.NormalizeHeaders(entry.Headers)
-		entry.ExcludedModels = config.NormalizeExcludedModels(entry.ExcludedModels)
+		normalizeCodexKey(&entry)
 		if entry.BaseURL == "" {
 			continue
 		}
@@ -613,12 +609,13 @@ func (h *Handler) PutCodexKeys(c *gin.Context) {
 }
 func (h *Handler) PatchCodexKey(c *gin.Context) {
 	type codexKeyPatch struct {
-		APIKey         *string            `json:"api-key"`
-		Prefix         *string            `json:"prefix"`
-		BaseURL        *string            `json:"base-url"`
-		ProxyURL       *string            `json:"proxy-url"`
-		Headers        *map[string]string `json:"headers"`
-		ExcludedModels *[]string          `json:"excluded-models"`
+		APIKey         *string              `json:"api-key"`
+		Prefix         *string              `json:"prefix"`
+		BaseURL        *string              `json:"base-url"`
+		ProxyURL       *string              `json:"proxy-url"`
+		Models         *[]config.CodexModel `json:"models"`
+		Headers        *map[string]string   `json:"headers"`
+		ExcludedModels *[]string            `json:"excluded-models"`
 	}
 	var body struct {
 		Index *int           `json:"index"`
@@ -667,12 +664,16 @@ func (h *Handler) PatchCodexKey(c *gin.Context) {
 	if body.Value.ProxyURL != nil {
 		entry.ProxyURL = strings.TrimSpace(*body.Value.ProxyURL)
 	}
 	if body.Value.Models != nil {
 		entry.Models = append([]config.CodexModel(nil), (*body.Value.Models)...)
 	}
 	if body.Value.Headers != nil {
 		entry.Headers = config.NormalizeHeaders(*body.Value.Headers)
 	}
 	if body.Value.ExcludedModels != nil {
 		entry.ExcludedModels = config.NormalizeExcludedModels(*body.Value.ExcludedModels)
 	}
 	normalizeCodexKey(&entry)
 	h.cfg.CodexKey[targetIndex] = entry
 	h.cfg.SanitizeCodexKeys()
 	h.persist(c)
@@ -762,6 +763,32 @@ func normalizeClaudeKey(entry *config.ClaudeKey) {
 	entry.Models = normalized
 }
 func normalizeCodexKey(entry *config.CodexKey) {
 	if entry == nil {
 		return
 	}
 	entry.APIKey = strings.TrimSpace(entry.APIKey)
 	entry.Prefix = strings.TrimSpace(entry.Prefix)
 	entry.BaseURL = strings.TrimSpace(entry.BaseURL)
 	entry.ProxyURL = strings.TrimSpace(entry.ProxyURL)
 	entry.Headers = config.NormalizeHeaders(entry.Headers)
 	entry.ExcludedModels = config.NormalizeExcludedModels(entry.ExcludedModels)
 	if len(entry.Models) == 0 {
 		return
 	}
 	normalized := make([]config.CodexModel, 0, len(entry.Models))
 	for i := range entry.Models {
 		model := entry.Models[i]
 		model.Name = strings.TrimSpace(model.Name)
 		model.Alias = strings.TrimSpace(model.Alias)
 		if model.Name == "" && model.Alias == "" {
 			continue
 		}
 		normalized = append(normalized, model)
 	}
 	entry.Models = normalized
 }
 // GetAmpCode returns the complete ampcode configuration.
 func (h *Handler) GetAmpCode(c *gin.Context) {
 	if h == nil || h.cfg == nil {
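Review note: the model-list cleanup performed by `normalizeCodexKey` can be sketched on its own as follows. The `codexModel` type and `normalizeModels` function are stand-ins written for this note (assumptions, not the project's identifiers); the sketch trims both fields and drops entries where both are empty.

```go
package main

import (
	"fmt"
	"strings"
)

// codexModel is a stand-in for config.CodexModel (hypothetical copy).
type codexModel struct {
	Name  string
	Alias string
}

// normalizeModels trims whitespace from every entry and removes entries
// whose name and alias are both empty after trimming.
func normalizeModels(in []codexModel) []codexModel {
	if len(in) == 0 {
		return in
	}
	out := make([]codexModel, 0, len(in))
	for _, m := range in {
		m.Name = strings.TrimSpace(m.Name)
		m.Alias = strings.TrimSpace(m.Alias)
		if m.Name == "" && m.Alias == "" {
			continue
		}
		out = append(out, m)
	}
	return out
}

func main() {
	models := []codexModel{
		{Name: " gpt-5-codex ", Alias: "codex-latest"},
		{Name: "  ", Alias: ""},
	}
	fmt.Printf("%+v\n", normalizeModels(models))
}
```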
--- a/internal/api/server.go
+++ b/internal/api/server.go
@@ -520,6 +520,8 @@ func (s *Server) registerManagementRoutes() {
 		mgmt.PATCH("/proxy-url", s.mgmt.PutProxyURL)
 		mgmt.DELETE("/proxy-url", s.mgmt.DeleteProxyURL)
 		mgmt.POST("/api-call", s.mgmt.APICall)
 		mgmt.GET("/quota-exceeded/switch-project", s.mgmt.GetSwitchProject)
 		mgmt.PUT("/quota-exceeded/switch-project", s.mgmt.PutSwitchProject)
 		mgmt.PATCH("/quota-exceeded/switch-project", s.mgmt.PutSwitchProject)
--- a/internal/config/config.go
+++ b/internal/config/config.go
@@ -265,6 +265,9 @@ type CodexKey struct {
 	// ProxyURL overrides the global proxy setting for this API key if provided.
 	ProxyURL string `yaml:"proxy-url" json:"proxy-url"`
 	// Models defines upstream model names and aliases for request routing.
 	Models []CodexModel `yaml:"models" json:"models"`
 	// Headers optionally adds extra HTTP headers for requests sent with this key.
 	Headers map[string]string `yaml:"headers,omitempty" json:"headers,omitempty"`
@@ -272,6 +275,15 @@ type CodexKey struct {
 	ExcludedModels []string `yaml:"excluded-models,omitempty" json:"excluded-models,omitempty"`
 }
 // CodexModel describes a mapping between an alias and the actual upstream model name.
 type CodexModel struct {
 	// Name is the upstream model identifier used when issuing requests.
 	Name string `yaml:"name" json:"name"`
 	// Alias is the client-facing model name that maps to Name.
 	Alias string `yaml:"alias" json:"alias"`
 }
 // GeminiKey represents the configuration for a Gemini API key,
 // including optional overrides for upstream base URL, proxy routing, and headers.
 type GeminiKey struct {
@@ -879,8 +891,8 @@ func getOrCreateMapValue(mapNode *yaml.Node, key string) *yaml.Node {
 }
 // mergeMappingPreserve merges keys from src into dst mapping node while preserving
-// key order and comments of existing keys in dst. Unknown keys from src are appended
+// key order and comments of existing keys in dst. New keys are only added if their
-// to dst at the end, copying their node structure from src.
+// value is non-zero to avoid polluting the config with defaults.
 func mergeMappingPreserve(dst, src *yaml.Node) {
 	if dst == nil || src == nil {
 		return
@@ -891,20 +903,19 @@ func mergeMappingPreserve(dst, src *yaml.Node) {
 		copyNodeShallow(dst, src)
 		return
 	}
 	// Build a lookup of existing keys in dst
 	for i := 0; i+1 < len(src.Content); i += 2 {
 		sk := src.Content[i]
 		sv := src.Content[i+1]
 		idx := findMapKeyIndex(dst, sk.Value)
 		if idx >= 0 {
-			// Merge into existing value node
+			// Merge into existing value node (always update, even to zero values)
 			dv := dst.Content[idx+1]
 			mergeNodePreserve(dv, sv)
 		} else {
-			if shouldSkipEmptyCollectionOnPersist(sk.Value, sv) {
+			// New key: only add if value is non-zero to avoid polluting config with defaults
+			if isZeroValueNode(sv) {
 				continue
 			}
 			// Append new key/value pair by deep-copying from src
 			dst.Content = append(dst.Content, deepCopyNode(sk), deepCopyNode(sv))
 		}
 	}
@@ -987,32 +998,49 @@ func findMapKeyIndex(mapNode *yaml.Node, key string) int {
 	return -1
 }
-func shouldSkipEmptyCollectionOnPersist(key string, node *yaml.Node) bool {
-	switch key {
-	case "generative-language-api-key",
-		"gemini-api-key",
-		"vertex-api-key",
-		"claude-api-key",
-		"codex-api-key",
-		"openai-compatibility":
-		return isEmptyCollectionNode(node)
-	default:
-		return false
-	}
-}
-func isEmptyCollectionNode(node *yaml.Node) bool {
+// isZeroValueNode returns true if the YAML node represents a zero/default value
+// that should not be written as a new key to preserve config cleanliness.
+// For mappings and sequences, recursively checks if all children are zero values.
+func isZeroValueNode(node *yaml.Node) bool {
 	if node == nil {
 		return true
 	}
 	switch node.Kind {
-	case yaml.SequenceNode:
-		return len(node.Content) == 0
 	case yaml.ScalarNode:
-		return node.Tag == "!!null"
-	default:
-		return false
+		switch node.Tag {
+		case "!!bool":
+			return node.Value == "false"
+		case "!!int", "!!float":
+			return node.Value == "0" || node.Value == "0.0"
+		case "!!str":
+			return node.Value == ""
+		case "!!null":
+			return true
+		}
+	case yaml.SequenceNode:
+		if len(node.Content) == 0 {
+			return true
+		}
+		// Check if all elements are zero values
+		for _, child := range node.Content {
+			if !isZeroValueNode(child) {
+				return false
+			}
+		}
+		return true
+	case yaml.MappingNode:
+		if len(node.Content) == 0 {
+			return true
+		}
+		// Check if all values are zero values (values are at odd indices)
+		for i := 1; i < len(node.Content); i += 2 {
+			if !isZeroValueNode(node.Content[i]) {
+				return false
+			}
+		}
+		return true
 	}
+	return false
 }
 // deepCopyNode creates a deep copy of a yaml.Node graph.
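Review note: the recursive zero-value check can be tried without the YAML library using a simplified stand-in for `yaml.Node`. The `node` type and `kind` constants below are assumptions made for this sketch; only the tag strings and the "values at odd indices" layout are taken from the diff.

```go
package main

import "fmt"

type kind int

const (
	scalar kind = iota
	sequence
	mapping
)

// node is a simplified stand-in for yaml.Node (hypothetical, illustration only).
// For mappings, Content alternates key, value, key, value.
type node struct {
	Kind    kind
	Tag     string
	Value   string
	Content []*node
}

// isZero reports whether n holds only zero/default values.
func isZero(n *node) bool {
	if n == nil {
		return true
	}
	switch n.Kind {
	case scalar:
		switch n.Tag {
		case "!!bool":
			return n.Value == "false"
		case "!!int", "!!float":
			return n.Value == "0" || n.Value == "0.0"
		case "!!str":
			return n.Value == ""
		case "!!null":
			return true
		}
	case sequence:
		for _, child := range n.Content {
			if !isZero(child) {
				return false
			}
		}
		return true
	case mapping:
		// Values sit at odd indices; keys are skipped.
		for i := 1; i < len(n.Content); i += 2 {
			if !isZero(n.Content[i]) {
				return false
			}
		}
		return true
	}
	return false
}

func main() {
	key := func(s string) *node { return &node{Kind: scalar, Tag: "!!str", Value: s} }
	num := func(s string) *node { return &node{Kind: scalar, Tag: "!!int", Value: s} }
	streaming := &node{Kind: mapping, Content: []*node{key("keepalive-seconds"), num("0")}}
	fmt.Println(isZero(streaming))
	streaming.Content[1] = num("15")
	fmt.Println(isZero(streaming))
}
```

Under this rule a new mapping whose values are all defaults is not appended to the persisted config, while any non-zero value causes the whole key to be written.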
--- a/internal/config/sdk_config.go
+++ b/internal/config/sdk_config.go
@@ -30,13 +30,13 @@ type SDKConfig struct {
 // StreamingConfig holds server streaming behavior configuration.
 type StreamingConfig struct {
 	// KeepAliveSeconds controls how often the server emits SSE heartbeats (": keep-alive\n\n").
-	// nil means default (15 seconds). <= 0 disables keep-alives.
+	// <= 0 disables keep-alives. Default is 0.
-	KeepAliveSeconds *int `yaml:"keepalive-seconds,omitempty" json:"keepalive-seconds,omitempty"`
+	KeepAliveSeconds int `yaml:"keepalive-seconds,omitempty" json:"keepalive-seconds,omitempty"`
 	// BootstrapRetries controls how many times the server may retry a streaming request before any bytes are sent,
 	// to allow auth rotation / transient recovery.
-	// nil means default (2). 0 disables bootstrap retries.
+	// <= 0 disables bootstrap retries. Default is 0.
-	BootstrapRetries *int `yaml:"bootstrap-retries,omitempty" json:"bootstrap-retries,omitempty"`
+	BootstrapRetries int `yaml:"bootstrap-retries,omitempty" json:"bootstrap-retries,omitempty"`
 }
 // AccessConfig groups request authentication providers.
--- a/internal/registry/model_definitions.go
+++ b/internal/registry/model_definitions.go
@@ -741,7 +741,7 @@ func GetIFlowModels() []*ModelInfo {
 		{ID: "qwen3-235b-a22b-instruct", DisplayName: "Qwen3-235B-A22B-Instruct", Description: "Qwen3 235B A22B Instruct", Created: 1753401600},
 		{ID: "qwen3-235b", DisplayName: "Qwen3-235B-A22B", Description: "Qwen3 235B A22B", Created: 1753401600},
 		{ID: "minimax-m2", DisplayName: "MiniMax-M2", Description: "MiniMax M2", Created: 1758672000},
-		{ID: "minimax-m2.1", DisplayName: "MiniMax-M2.1", Description: "MiniMax M2.1", Created: 1766448000},
+		{ID: "minimax-m2.1", DisplayName: "MiniMax-M2.1", Description: "MiniMax M2.1", Created: 1766448000, Thinking: iFlowThinkingSupport},
 	}
 	models := make([]*ModelInfo, 0, len(entries))
 	for _, entry := range entries {
--- a/internal/runtime/executor/claude_executor.go
+++ b/internal/runtime/executor/claude_executor.go
@@ -74,6 +74,9 @@ func (e *ClaudeExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, r
 	}
 	body = applyPayloadConfig(e.cfg, req.Model, body)
 	// Disable thinking if tool_choice forces tool use (Anthropic API constraint)
 	body = disableThinkingIfToolChoiceForced(body)
 	// Ensure max_tokens > thinking.budget_tokens when thinking is enabled
 	body = ensureMaxTokensForThinking(req.Model, body)
@@ -185,6 +188,9 @@ func (e *ClaudeExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.A
 	body = checkSystemInstructions(body)
 	body = applyPayloadConfig(e.cfg, req.Model, body)
 	// Disable thinking if tool_choice forces tool use (Anthropic API constraint)
 	body = disableThinkingIfToolChoiceForced(body)
 	// Ensure max_tokens > thinking.budget_tokens when thinking is enabled
 	body = ensureMaxTokensForThinking(req.Model, body)
@@ -461,6 +467,19 @@ func (e *ClaudeExecutor) injectThinkingConfig(modelName string, metadata map[str
 	return util.ApplyClaudeThinkingConfig(body, budget)
 }
 // disableThinkingIfToolChoiceForced checks if tool_choice forces tool use and disables thinking.
 // Anthropic API does not allow thinking when tool_choice is set to "any" or a specific tool.
 // See: https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking#important-considerations
 func disableThinkingIfToolChoiceForced(body []byte) []byte {
 	toolChoiceType := gjson.GetBytes(body, "tool_choice.type").String()
 	// "auto" is allowed with thinking, but "any" or "tool" (specific tool) are not
 	if toolChoiceType == "any" || toolChoiceType == "tool" {
 		// Remove thinking configuration entirely to avoid API error
 		body, _ = sjson.DeleteBytes(body, "thinking")
 	}
 	return body
 }
 // ensureMaxTokensForThinking ensures max_tokens > thinking.budget_tokens when thinking is enabled.
 // Anthropic API requires this constraint; violating it returns a 400 error.
 // This function should be called after all thinking configuration is finalized.
--- a/internal/runtime/executor/codex_executor.go
+++ b/internal/runtime/executor/codex_executor.go
@@ -50,6 +50,16 @@ func (e *CodexExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, re
 	defer reporter.trackFailure(ctx, &err)
 	upstreamModel := util.ResolveOriginalModel(req.Model, req.Metadata)
 	if upstreamModel == "" {
 		upstreamModel = req.Model
 	}
 	if modelOverride := e.resolveUpstreamModel(upstreamModel, auth); modelOverride != "" {
 		upstreamModel = modelOverride
 	} else if !strings.EqualFold(upstreamModel, req.Model) {
 		if modelOverride := e.resolveUpstreamModel(req.Model, auth); modelOverride != "" {
 			upstreamModel = modelOverride
 		}
 	}
 	from := opts.SourceFormat
 	to := sdktranslator.FromString("codex")
@@ -147,6 +157,16 @@ func (e *CodexExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
 	defer reporter.trackFailure(ctx, &err)
 	upstreamModel := util.ResolveOriginalModel(req.Model, req.Metadata)
 	if upstreamModel == "" {
 		upstreamModel = req.Model
 	}
 	if modelOverride := e.resolveUpstreamModel(upstreamModel, auth); modelOverride != "" {
 		upstreamModel = modelOverride
 	} else if !strings.EqualFold(upstreamModel, req.Model) {
 		if modelOverride := e.resolveUpstreamModel(req.Model, auth); modelOverride != "" {
 			upstreamModel = modelOverride
 		}
 	}
 	from := opts.SourceFormat
 	to := sdktranslator.FromString("codex")
@@ -247,12 +267,22 @@ func (e *CodexExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
 func (e *CodexExecutor) CountTokens(ctx context.Context, auth *cliproxyauth.Auth, req cliproxyexecutor.Request, opts cliproxyexecutor.Options) (cliproxyexecutor.Response, error) {
 	upstreamModel := util.ResolveOriginalModel(req.Model, req.Metadata)
 	if upstreamModel == "" {
 		upstreamModel = req.Model
 	}
 	if modelOverride := e.resolveUpstreamModel(upstreamModel, auth); modelOverride != "" {
 		upstreamModel = modelOverride
 	} else if !strings.EqualFold(upstreamModel, req.Model) {
 		if modelOverride := e.resolveUpstreamModel(req.Model, auth); modelOverride != "" {
 			upstreamModel = modelOverride
 		}
 	}
 	from := opts.SourceFormat
 	to := sdktranslator.FromString("codex")
 	body := sdktranslator.TranslateRequest(from, to, req.Model, bytes.Clone(req.Payload), false)
-	modelForCounting := req.Model
+	modelForCounting := upstreamModel
 	body = ApplyReasoningEffortMetadata(body, req.Metadata, req.Model, "reasoning.effort", false)
 	body, _ = sjson.SetBytes(body, "model", upstreamModel)
@@ -520,3 +550,87 @@ func codexCreds(a *cliproxyauth.Auth) (apiKey, baseURL string) {
 	}
 	return
 }
 func (e *CodexExecutor) resolveUpstreamModel(alias string, auth *cliproxyauth.Auth) string {
 	trimmed := strings.TrimSpace(alias)
 	if trimmed == "" {
 		return ""
 	}
 	entry := e.resolveCodexConfig(auth)
 	if entry == nil {
 		return ""
 	}
 	normalizedModel, metadata := util.NormalizeThinkingModel(trimmed)
 	// Candidate names to match against configured aliases/names.
 	candidates := []string{strings.TrimSpace(normalizedModel)}
 	if !strings.EqualFold(normalizedModel, trimmed) {
 		candidates = append(candidates, trimmed)
 	}
 	if original := util.ResolveOriginalModel(normalizedModel, metadata); original != "" && !strings.EqualFold(original, normalizedModel) {
 		candidates = append(candidates, original)
 	}
 	for i := range entry.Models {
 		model := entry.Models[i]
 		name := strings.TrimSpace(model.Name)
 		modelAlias := strings.TrimSpace(model.Alias)
 		for _, candidate := range candidates {
 			if candidate == "" {
 				continue
 			}
 			if modelAlias != "" && strings.EqualFold(modelAlias, candidate) {
 				if name != "" {
 					return name
 				}
 				return candidate
 			}
 			if name != "" && strings.EqualFold(name, candidate) {
 				return name
 			}
 		}
 	}
 	return ""
 }
 func (e *CodexExecutor) resolveCodexConfig(auth *cliproxyauth.Auth) *config.CodexKey {
 	if auth == nil || e.cfg == nil {
 		return nil
 	}
 	var attrKey, attrBase string
 	if auth.Attributes != nil {
 		attrKey = strings.TrimSpace(auth.Attributes["api_key"])
 		attrBase = strings.TrimSpace(auth.Attributes["base_url"])
 	}
 	for i := range e.cfg.CodexKey {
 		entry := &e.cfg.CodexKey[i]
 		cfgKey := strings.TrimSpace(entry.APIKey)
 		cfgBase := strings.TrimSpace(entry.BaseURL)
 		if attrKey != "" && attrBase != "" {
 			if strings.EqualFold(cfgKey, attrKey) && strings.EqualFold(cfgBase, attrBase) {
 				return entry
 			}
 			continue
 		}
 		if attrKey != "" && strings.EqualFold(cfgKey, attrKey) {
 			if cfgBase == "" || strings.EqualFold(cfgBase, attrBase) {
 				return entry
 			}
 		}
 		if attrKey == "" && attrBase != "" && strings.EqualFold(cfgBase, attrBase) {
 			return entry
 		}
 	}
 	if attrKey != "" {
 		for i := range e.cfg.CodexKey {
 			entry := &e.cfg.CodexKey[i]
 			if strings.EqualFold(strings.TrimSpace(entry.APIKey), attrKey) {
 				return entry
 			}
 		}
 	}
 	return nil
 }
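The alias lookup in `resolveUpstreamModel` above can be sketched in isolation as follows. This is a minimal sketch, not the executor's code: `modelEntry` and `resolveUpstream` are hypothetical names, and the thinking-suffix normalization done by `util.NormalizeThinkingModel` is deliberately left out. The model name and alias are the ones from the `config.example.yaml` hunk.

```go
package main

import (
	"fmt"
	"strings"
)

// modelEntry is a hypothetical stand-in for one configured Codex model:
// an upstream name plus an optional client-facing alias.
type modelEntry struct {
	Name  string
	Alias string
}

// resolveUpstream returns the upstream name of the first entry whose alias or
// name matches requested case-insensitively, or "" when nothing matches.
func resolveUpstream(entries []modelEntry, requested string) string {
	candidate := strings.TrimSpace(requested)
	if candidate == "" {
		return ""
	}
	for _, e := range entries {
		name := strings.TrimSpace(e.Name)
		alias := strings.TrimSpace(e.Alias)
		if alias != "" && strings.EqualFold(alias, candidate) {
			if name != "" {
				return name
			}
			// Alias configured without a name: fall back to the candidate.
			return candidate
		}
		if name != "" && strings.EqualFold(name, candidate) {
			return name
		}
	}
	return ""
}

func main() {
	entries := []modelEntry{{Name: "gpt-5-codex", Alias: "codex-latest"}}
	fmt.Println(resolveUpstream(entries, "Codex-Latest"))
	fmt.Println(resolveUpstream(entries, "unknown-model") == "")
}
```

An empty return value means "no override", which is why the callers in the diff keep the previously resolved `upstreamModel` in that case.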
--- a/internal/runtime/executor/iflow_executor.go
+++ b/internal/runtime/executor/iflow_executor.go
@@ -67,6 +67,7 @@ func (e *IFlowExecutor) Execute(ctx context.Context, auth *cliproxyauth.Auth, re
 		return resp, errValidate
 	}
 	body = applyIFlowThinkingConfig(body)
 	body = preserveReasoningContentInMessages(body)
 	body = applyPayloadConfig(e.cfg, req.Model, body)
 	endpoint := strings.TrimSuffix(baseURL, "/") + iflowDefaultEndpoint
@@ -159,6 +160,7 @@ func (e *IFlowExecutor) ExecuteStream(ctx context.Context, auth *cliproxyauth.Au
 		return nil, errValidate
 	}
 	body = applyIFlowThinkingConfig(body)
 	body = preserveReasoningContentInMessages(body)
 	// Ensure tools array exists to avoid provider quirks similar to Qwen's behaviour.
 	toolsResult := gjson.GetBytes(body, "tools")
 	if toolsResult.Exists() && toolsResult.IsArray() && len(toolsResult.Array()) == 0 {
@@ -445,20 +447,98 @@ func ensureToolsArray(body []byte) []byte {
 	return updated
 }
-// applyIFlowThinkingConfig converts normalized reasoning_effort to iFlow chat_template_kwargs.enable_thinking.
+// preserveReasoningContentInMessages ensures reasoning_content from assistant messages in the
-// This should be called after NormalizeThinkingConfig has processed the payload.
+// conversation history is preserved when sending to iFlow models that support thinking.
-// iFlow only supports boolean enable_thinking, so any non-"none" effort enables thinking.
+// This is critical for multi-turn conversations where the model needs to see its previous
-func applyIFlowThinkingConfig(body []byte) []byte {
+// reasoning to maintain coherent thought chains across tool calls and conversation turns.
-	effort := gjson.GetBytes(body, "reasoning_effort")
+//
-	if !effort.Exists() {
+// For GLM-4.7 and MiniMax-M2.1, the full assistant response (including reasoning) must be
 // appended back into message history before the next call.
 func preserveReasoningContentInMessages(body []byte) []byte {
 	model := strings.ToLower(gjson.GetBytes(body, "model").String())
 	// Only apply to models that support thinking with history preservation
 	needsPreservation := strings.HasPrefix(model, "glm-4.7") ||
 		strings.HasPrefix(model, "glm-4-7") ||
 		strings.HasPrefix(model, "minimax-m2.1") ||
 		strings.HasPrefix(model, "minimax-m2-1")
 	if !needsPreservation {
 		return body
 	}
-	val := strings.ToLower(strings.TrimSpace(effort.String()))
+	messages := gjson.GetBytes(body, "messages")
-	enableThinking := val != "none" && val != ""
+	if !messages.Exists() || !messages.IsArray() {
 		return body
 	}
-	body, _ = sjson.DeleteBytes(body, "reasoning_effort")
+	// Check if any assistant message already has reasoning_content preserved
-	body, _ = sjson.SetBytes(body, "chat_template_kwargs.enable_thinking", enableThinking)
+	hasReasoningContent := false
 	messages.ForEach(func(_, msg gjson.Result) bool {
 		role := msg.Get("role").String()
 		if role == "assistant" {
 			rc := msg.Get("reasoning_content")
 			if rc.Exists() && rc.String() != "" {
 				hasReasoningContent = true
 				return false // stop iteration
 			}
 		}
 		return true
 	})
 	// If reasoning content is already present, the messages are properly formatted
 	// No need to modify - the client has correctly preserved reasoning in history
 	if hasReasoningContent {
 		log.Debugf("iflow executor: reasoning_content found in message history for %s", model)
 	}
 	return body
 }
 // applyIFlowThinkingConfig converts normalized reasoning_effort to model-specific thinking configurations.
 // This should be called after NormalizeThinkingConfig has processed the payload.
 //
 // Model-specific handling:
 //   - GLM-4.7: Uses extra_body={"thinking": {"type": "enabled"}, "clear_thinking": false}
 //   - MiniMax-M2.1: Uses reasoning_split=true for OpenAI-style reasoning separation
 //   - Other iFlow models: Uses chat_template_kwargs.enable_thinking (boolean)
 func applyIFlowThinkingConfig(body []byte) []byte {
 	effort := gjson.GetBytes(body, "reasoning_effort")
 	model := strings.ToLower(gjson.GetBytes(body, "model").String())
 	// Check if thinking should be enabled
 	val := ""
 	if effort.Exists() {
 		val = strings.ToLower(strings.TrimSpace(effort.String()))
 	}
 	enableThinking := effort.Exists() && val != "none" && val != ""
 	// Remove reasoning_effort as we'll convert to model-specific format
 	if effort.Exists() {
 		body, _ = sjson.DeleteBytes(body, "reasoning_effort")
 	}
 	// GLM-4.7: Use extra_body with thinking config and clear_thinking: false
 	if strings.HasPrefix(model, "glm-4.7") || strings.HasPrefix(model, "glm-4-7") {
 		if enableThinking {
 			body, _ = sjson.SetBytes(body, "extra_body.thinking.type", "enabled")
 			body, _ = sjson.SetBytes(body, "extra_body.clear_thinking", false)
 		}
 		return body
 	}
 	// MiniMax-M2.1: Use reasoning_split=true for interleaved thinking
 	if strings.HasPrefix(model, "minimax-m2.1") || strings.HasPrefix(model, "minimax-m2-1") {
 		if enableThinking {
 			body, _ = sjson.SetBytes(body, "reasoning_split", true)
 		}
 		return body
 	}
 	// Other iFlow models (including GLM-4.6): Use chat_template_kwargs.enable_thinking
 	if effort.Exists() {
 		body, _ = sjson.SetBytes(body, "chat_template_kwargs.enable_thinking", enableThinking)
 	}
 	return body
 }
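The model-specific branching in `applyIFlowThinkingConfig` can be illustrated with the standard library alone. The sketch below is an approximation under stated assumptions: it works on a decoded `map[string]any` instead of raw bytes with `gjson`/`sjson`, and `applyThinking`, `hasAnyPrefix`, and `child` are hypothetical helper names that do not appear in the diff.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// hasAnyPrefix reports whether s starts with any of the given prefixes.
func hasAnyPrefix(s string, prefixes ...string) bool {
	for _, p := range prefixes {
		if strings.HasPrefix(s, p) {
			return true
		}
	}
	return false
}

// child returns the nested object stored under key, creating it if needed.
func child(m map[string]any, key string) map[string]any {
	if c, ok := m[key].(map[string]any); ok {
		return c
	}
	c := map[string]any{}
	m[key] = c
	return c
}

// applyThinking converts reasoning_effort into the per-model fields
// described in the diff's doc comment. It mutates and returns body.
func applyThinking(body map[string]any) map[string]any {
	model, _ := body["model"].(string)
	model = strings.ToLower(model)

	raw, exists := body["reasoning_effort"]
	effort, _ := raw.(string)
	effort = strings.ToLower(strings.TrimSpace(effort))
	enable := exists && effort != "none" && effort != ""
	delete(body, "reasoning_effort")

	switch {
	case hasAnyPrefix(model, "glm-4.7", "glm-4-7"):
		if enable {
			extra := child(body, "extra_body")
			child(extra, "thinking")["type"] = "enabled"
			extra["clear_thinking"] = false
		}
	case hasAnyPrefix(model, "minimax-m2.1", "minimax-m2-1"):
		if enable {
			body["reasoning_split"] = true
		}
	default:
		// Other models only get the boolean when an effort was supplied.
		if exists {
			child(body, "chat_template_kwargs")["enable_thinking"] = enable
		}
	}
	return body
}

func main() {
	req := map[string]any{"model": "glm-4.7", "reasoning_effort": "high"}
	out, _ := json.Marshal(applyThinking(req))
	fmt.Println(string(out))
}
```

As in the diff, `reasoning_effort` is always removed, and the GLM-4.7 and MiniMax-M2.1 branches add nothing when thinking is disabled.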
--- a/internal/runtime/executor/usage_helpers.go
+++ b/internal/runtime/executor/usage_helpers.go
@@ -482,12 +482,16 @@ func StripUsageMetadataFromJSON(rawJSON []byte) ([]byte, bool) {
 	cleaned := jsonBytes
 	var changed bool
-	if gjson.GetBytes(cleaned, "usageMetadata").Exists() {
+	if usageMetadata = gjson.GetBytes(cleaned, "usageMetadata"); usageMetadata.Exists() {
 		// Rename usageMetadata to cpaUsageMetadata in the message_start event of Claude
 		cleaned, _ = sjson.SetRawBytes(cleaned, "cpaUsageMetadata", []byte(usageMetadata.Raw))
 		cleaned, _ = sjson.DeleteBytes(cleaned, "usageMetadata")
 		changed = true
 	}
-	if gjson.GetBytes(cleaned, "response.usageMetadata").Exists() {
+	if usageMetadata = gjson.GetBytes(cleaned, "response.usageMetadata"); usageMetadata.Exists() {
 		// Rename usageMetadata to cpaUsageMetadata in the message_start event of Claude
 		cleaned, _ = sjson.SetRawBytes(cleaned, "response.cpaUsageMetadata", []byte(usageMetadata.Raw))
 		cleaned, _ = sjson.DeleteBytes(cleaned, "response.usageMetadata")
 		changed = true
 	}
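The rename performed in `StripUsageMetadataFromJSON` can be sketched with `encoding/json` only. This is an illustration, not the repository's implementation: `renameUsageMetadata` is a hypothetical name, and it handles just the top-level key, whereas the diff also renames `response.usageMetadata`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// renameUsageMetadata moves a top-level "usageMetadata" value to
// "cpaUsageMetadata". The boolean reports whether anything changed.
func renameUsageMetadata(raw []byte) ([]byte, bool) {
	var doc map[string]json.RawMessage
	if err := json.Unmarshal(raw, &doc); err != nil {
		return raw, false
	}
	value, ok := doc["usageMetadata"]
	if !ok {
		return raw, false
	}
	doc["cpaUsageMetadata"] = value
	delete(doc, "usageMetadata")
	out, err := json.Marshal(doc)
	if err != nil {
		return raw, false
	}
	return out, true
}

func main() {
	in := []byte(`{"usageMetadata":{"promptTokenCount":7}}`)
	out, changed := renameUsageMetadata(in)
	fmt.Println(string(out), changed)
}
```

Keeping the value under a new key, rather than deleting it, is what lets the `message_start` handling in `antigravity_claude_response.go` read `cpaUsageMetadata.promptTokenCount` later on.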
--- a/internal/translator/antigravity/claude/antigravity_claude_response.go
+++ b/internal/translator/antigravity/claude/antigravity_claude_response.go
@@ -99,6 +99,14 @@ func ConvertAntigravityResponseToClaude(_ context.Context, _ string, originalReq
 		// This follows the Claude Code API specification for streaming message initialization
 		messageStartTemplate := `{"type": "message_start", "message": {"id": "msg_1nZdL29xx5MUA1yADyHTEsnR8uuvGzszyY", "type": "message", "role": "assistant", "content": [], "model": "claude-3-5-sonnet-20241022", "stop_reason": null, "stop_sequence": null, "usage": {"input_tokens": 0, "output_tokens": 0}}}`
 		// Use cpaUsageMetadata within the message_start event for Claude.
 		if promptTokenCount := gjson.GetBytes(rawJSON, "response.cpaUsageMetadata.promptTokenCount"); promptTokenCount.Exists() {
 			messageStartTemplate, _ = sjson.Set(messageStartTemplate, "message.usage.input_tokens", promptTokenCount.Int())
 		}
 		if candidatesTokenCount := gjson.GetBytes(rawJSON, "response.cpaUsageMetadata.candidatesTokenCount"); candidatesTokenCount.Exists() {
 			messageStartTemplate, _ = sjson.Set(messageStartTemplate, "message.usage.output_tokens", candidatesTokenCount.Int())
 		}
 		// Override default values with actual response metadata if available from the Gemini CLI response
 		if modelVersionResult := gjson.GetBytes(rawJSON, "response.modelVersion"); modelVersionResult.Exists() {
 			messageStartTemplate, _ = sjson.Set(messageStartTemplate, "message.model", modelVersionResult.String())
--- a/internal/translator/antigravity/openai/chat-completions/antigravity_openai_request.go
+++ b/internal/translator/antigravity/openai/chat-completions/antigravity_openai_request.go
@@ -247,7 +247,7 @@ func ConvertOpenAIRequestToAntigravity(modelName string, inputRawJSON []byte, _
 			} else if role == "assistant" {
 				node := []byte(`{"role":"model","parts":[]}`)
 				p := 0
-				if content.Type == gjson.String {
+				if content.Type == gjson.String && content.String() != "" {
 					node, _ = sjson.SetBytes(node, "parts.-1.text", content.String())
 					p++
 				} else if content.IsArray() {
--- a/internal/translator/claude/openai/chat-completions/claude_openai_response.go
+++ b/internal/translator/claude/openai/chat-completions/claude_openai_response.go
@@ -209,9 +209,12 @@ func ConvertClaudeResponseToOpenAI(_ context.Context, modelName string, original
 		if usage := root.Get("usage"); usage.Exists() {
 			inputTokens := usage.Get("input_tokens").Int()
 			outputTokens := usage.Get("output_tokens").Int()
-			template, _ = sjson.Set(template, "usage.prompt_tokens", inputTokens)
+			cacheReadInputTokens := usage.Get("cache_read_input_tokens").Int()
 			cacheCreationInputTokens := usage.Get("cache_creation_input_tokens").Int()
 			template, _ = sjson.Set(template, "usage.prompt_tokens", inputTokens+cacheCreationInputTokens)
 			template, _ = sjson.Set(template, "usage.completion_tokens", outputTokens)
 			template, _ = sjson.Set(template, "usage.total_tokens", inputTokens+outputTokens)
 			template, _ = sjson.Set(template, "usage.prompt_tokens_details.cached_tokens", cacheReadInputTokens)
 		}
 		return []string{template}
@@ -285,8 +288,6 @@ func ConvertClaudeResponseToOpenAINonStream(_ context.Context, _ string, origina
 	var messageID string
 	var model string
 	var createdAt int64
 	var inputTokens, outputTokens int64
 	var reasoningTokens int64
 	var stopReason string
 	var contentParts []string
 	var reasoningParts []string
@@ -303,9 +304,6 @@ func ConvertClaudeResponseToOpenAINonStream(_ context.Context, _ string, origina
 				messageID = message.Get("id").String()
 				model = message.Get("model").String()
 				createdAt = time.Now().Unix()
 				if usage := message.Get("usage"); usage.Exists() {
 					inputTokens = usage.Get("input_tokens").Int()
 				}
 			}
 		case "content_block_start":
@@ -368,11 +366,14 @@ func ConvertClaudeResponseToOpenAINonStream(_ context.Context, _ string, origina
 				}
 			}
 			if usage := root.Get("usage"); usage.Exists() {
-				outputTokens = usage.Get("output_tokens").Int()
+				inputTokens := usage.Get("input_tokens").Int()
-				// Estimate reasoning tokens from accumulated thinking content
+				outputTokens := usage.Get("output_tokens").Int()
-				if len(reasoningParts) > 0 {
+				cacheReadInputTokens := usage.Get("cache_read_input_tokens").Int()
-					reasoningTokens = int64(len(strings.Join(reasoningParts, "")) / 4) // Rough estimation
+				cacheCreationInputTokens := usage.Get("cache_creation_input_tokens").Int()
-				}
+				out, _ = sjson.Set(out, "usage.prompt_tokens", inputTokens+cacheCreationInputTokens)
 				out, _ = sjson.Set(out, "usage.completion_tokens", outputTokens)
 				out, _ = sjson.Set(out, "usage.total_tokens", inputTokens+outputTokens)
 				out, _ = sjson.Set(out, "usage.prompt_tokens_details.cached_tokens", cacheReadInputTokens)
 			}
 		}
 	}
@@ -431,16 +432,5 @@ func ConvertClaudeResponseToOpenAINonStream(_ context.Context, _ string, origina
 		out, _ = sjson.Set(out, "choices.0.finish_reason", mapAnthropicStopReasonToOpenAI(stopReason))
 	}
 	// Set usage information including prompt tokens, completion tokens, and total tokens
 	totalTokens := inputTokens + outputTokens
 	out, _ = sjson.Set(out, "usage.prompt_tokens", inputTokens)
 	out, _ = sjson.Set(out, "usage.completion_tokens", outputTokens)
 	out, _ = sjson.Set(out, "usage.total_tokens", totalTokens)
 	// Add reasoning tokens to usage details if any reasoning content was processed
 	if reasoningTokens > 0 {
 		out, _ = sjson.Set(out, "usage.completion_tokens_details.reasoning_tokens", reasoningTokens)
 	}
 	return out
 }
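The usage arithmetic introduced in both converters can be checked with a small worked example. The sketch below mirrors the four `sjson.Set` calls in the diff; the struct types, the function name `toOpenAIUsage`, and the token counts are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// claudeUsage holds the Claude usage fields read in the diff.
type claudeUsage struct {
	InputTokens              int64
	OutputTokens             int64
	CacheReadInputTokens     int64
	CacheCreationInputTokens int64
}

// openAIUsage holds the OpenAI-style fields written in the diff.
type openAIUsage struct {
	PromptTokens     int64
	CompletionTokens int64
	TotalTokens      int64
	CachedTokens     int64 // usage.prompt_tokens_details.cached_tokens
}

// toOpenAIUsage applies the same sums as the changed converters.
func toOpenAIUsage(u claudeUsage) openAIUsage {
	return openAIUsage{
		PromptTokens:     u.InputTokens + u.CacheCreationInputTokens,
		CompletionTokens: u.OutputTokens,
		TotalTokens:      u.InputTokens + u.OutputTokens,
		CachedTokens:     u.CacheReadInputTokens,
	}
}

func main() {
	u := claudeUsage{
		InputTokens:              100,
		OutputTokens:             20,
		CacheReadInputTokens:     50,
		CacheCreationInputTokens: 30,
	}
	fmt.Printf("%+v\n", toOpenAIUsage(u))
}
```

With these sample numbers the prompt count is 130 and the total is 120: as written in the diff, `usage.total_tokens` is computed from `inputTokens+outputTokens` and does not include the cache-creation tokens that are added to `usage.prompt_tokens`.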
--- a/internal/watcher/diff/model_hash.go
+++ b/internal/watcher/diff/model_hash.go
@@ -56,6 +56,21 @@ func ComputeClaudeModelsHash(models []config.ClaudeModel) string {
 	return hashJoined(keys)
 }
 // ComputeCodexModelsHash returns a stable hash for Codex model aliases.
 func ComputeCodexModelsHash(models []config.CodexModel) string {
 	keys := normalizeModelPairs(func(out func(key string)) {
 		for _, model := range models {
 			name := strings.TrimSpace(model.Name)
 			alias := strings.TrimSpace(model.Alias)
 			if name == "" && alias == "" {
 				continue
 			}
 			out(strings.ToLower(name) + "|" + strings.ToLower(alias))
 		}
 	})
 	return hashJoined(keys)
 }
 // ComputeExcludedModelsHash returns a normalized hash for excluded model lists.
 func ComputeExcludedModelsHash(excluded []string) string {
 	if len(excluded) == 0 {
--- a/internal/watcher/diff/model_hash_test.go
+++ b/internal/watcher/diff/model_hash_test.go
@@ -81,6 +81,15 @@ func TestComputeClaudeModelsHash_Empty(t *testing.T) {
 	}
 }
 func TestComputeCodexModelsHash_Empty(t *testing.T) {
 	if got := ComputeCodexModelsHash(nil); got != "" {
 		t.Fatalf("expected empty hash for nil models, got %q", got)
 	}
 	if got := ComputeCodexModelsHash([]config.CodexModel{}); got != "" {
 		t.Fatalf("expected empty hash for empty slice, got %q", got)
 	}
 }
 func TestComputeClaudeModelsHash_IgnoresBlankAndDedup(t *testing.T) {
 	a := []config.ClaudeModel{
 		{Name: "m1", Alias: "a1"},
@@ -95,6 +104,20 @@ func TestComputeClaudeModelsHash_IgnoresBlankAndDedup(t *testing.T) {
 	}
 }
 func TestComputeCodexModelsHash_IgnoresBlankAndDedup(t *testing.T) {
 	a := []config.CodexModel{
 		{Name: "m1", Alias: "a1"},
 		{Name: " "},
 		{Name: "M1", Alias: "A1"},
 	}
 	b := []config.CodexModel{
 		{Name: "m1", Alias: "a1"},
 	}
 	if h1, h2 := ComputeCodexModelsHash(a), ComputeCodexModelsHash(b); h1 == "" || h1 != h2 {
 		t.Fatalf("expected same hash ignoring blanks/dupes, got %q / %q", h1, h2)
 	}
 }
 func TestComputeExcludedModelsHash_Normalizes(t *testing.T) {
 	hash1 := ComputeExcludedModelsHash([]string{" A ", "b", "a"})
 	hash2 := ComputeExcludedModelsHash([]string{"a", " b", "A"})
@@ -157,3 +180,15 @@ func TestComputeClaudeModelsHash_Deterministic(t *testing.T) {
 		t.Fatalf("expected different hash when models change, got %s", h3)
 	}
 }
 func TestComputeCodexModelsHash_Deterministic(t *testing.T) {
 	models := []config.CodexModel{{Name: "a", Alias: "A"}, {Name: "b"}}
 	h1 := ComputeCodexModelsHash(models)
 	h2 := ComputeCodexModelsHash(models)
 	if h1 == "" || h1 != h2 {
 		t.Fatalf("expected deterministic hash, got %s / %s", h1, h2)
 	}
 	if h3 := ComputeCodexModelsHash([]config.CodexModel{{Name: "a"}}); h3 == h1 {
 		t.Fatalf("expected different hash when models change, got %s", h3)
 	}
 }
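The properties exercised by these tests (blank entries ignored, case-insensitive deduplication, empty input yielding an empty hash) can be reproduced with a standalone sketch. The bodies of `normalizeModelPairs` and `hashJoined` are not part of this diff, so the sorting, the newline join, and the choice of SHA-256 below are assumptions, and `codexModel` and `modelsHash` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
	"strings"
)

// codexModel is a hypothetical stand-in for config.CodexModel.
type codexModel struct {
	Name  string
	Alias string
}

// modelsHash builds lower-cased "name|alias" keys, drops blanks and
// duplicates, sorts them, and hashes the joined result.
// Assumption: the real helpers may normalize or hash differently.
func modelsHash(models []codexModel) string {
	seen := make(map[string]struct{}, len(models))
	keys := make([]string, 0, len(models))
	for _, m := range models {
		name := strings.TrimSpace(m.Name)
		alias := strings.TrimSpace(m.Alias)
		if name == "" && alias == "" {
			continue
		}
		key := strings.ToLower(name) + "|" + strings.ToLower(alias)
		if _, dup := seen[key]; dup {
			continue
		}
		seen[key] = struct{}{}
		keys = append(keys, key)
	}
	if len(keys) == 0 {
		return ""
	}
	sort.Strings(keys)
	sum := sha256.Sum256([]byte(strings.Join(keys, "\n")))
	return hex.EncodeToString(sum[:])
}

func main() {
	a := []codexModel{{Name: "m1", Alias: "a1"}, {Name: " "}, {Name: "M1", Alias: "A1"}}
	b := []codexModel{{Name: "m1", Alias: "a1"}}
	fmt.Println(modelsHash(a) == modelsHash(b))
	fmt.Println(modelsHash(nil) == "")
}
```

Sorting before hashing makes the result independent of the order in which models are listed in the config, which is one plausible way to get the stable hash the doc comment describes.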
--- a/internal/watcher/synthesizer/config.go
+++ b/internal/watcher/synthesizer/config.go
@@ -151,6 +151,9 @@ func (s *ConfigSynthesizer) synthesizeCodexKeys(ctx *SynthesisContext) []*coreau
 		if ck.BaseURL != "" {
 			attrs["base_url"] = ck.BaseURL
 		}
 		if hash := diff.ComputeCodexModelsHash(ck.Models); hash != "" {
 			attrs["models_hash"] = hash
 		}
 		addConfigHeadersToAttrs(ck.Headers, attrs)
 		proxyURL := strings.TrimSpace(ck.ProxyURL)
 		a := &coreauth.Auth{
--- a/sdk/api/handlers/handlers.go
+++ b/sdk/api/handlers/handlers.go
@@ -104,8 +104,8 @@ func BuildErrorResponseBody(status int, errText string) []byte {
 // Returning 0 disables keep-alives (default when unset).
 func StreamingKeepAliveInterval(cfg *config.SDKConfig) time.Duration {
 	seconds := defaultStreamingKeepAliveSeconds
-	if cfg != nil && cfg.Streaming.KeepAliveSeconds != nil {
+	if cfg != nil {
-		seconds = *cfg.Streaming.KeepAliveSeconds
+		seconds = cfg.Streaming.KeepAliveSeconds
 	}
 	if seconds <= 0 {
 		return 0
@@ -116,8 +116,8 @@ func StreamingKeepAliveInterval(cfg *config.SDKConfig) time.Duration {
 // StreamingBootstrapRetries returns how many times a streaming request may be retried before any bytes are sent.
 func StreamingBootstrapRetries(cfg *config.SDKConfig) int {
 	retries := defaultStreamingBootstrapRetries
-	if cfg != nil && cfg.Streaming.BootstrapRetries != nil {
+	if cfg != nil {
-		retries = *cfg.Streaming.BootstrapRetries
+		retries = cfg.Streaming.BootstrapRetries
 	}
 	if retries < 0 {
 		retries = 0
--- a/sdk/api/handlers/handlers_stream_bootstrap_test.go
+++ b/sdk/api/handlers/handlers_stream_bootstrap_test.go
@@ -94,10 +94,9 @@ func TestExecuteStreamWithAuthManager_RetriesBeforeFirstByte(t *testing.T) {
 		registry.GetGlobalRegistry().UnregisterClient(auth2.ID)
 	})
 	bootstrapRetries := 1
 	handler := NewBaseAPIHandlers(&sdkconfig.SDKConfig{
 		Streaming: sdkconfig.StreamingConfig{
-			BootstrapRetries: &bootstrapRetries,
+			BootstrapRetries: 1,
 		},
 	}, manager)
 	dataChan, errChan := handler.ExecuteStreamWithAuthManager(context.Background(), "openai", "test-model", []byte(`{"model":"test-model"}`), "")
--- a/sdk/cliproxy/auth/conductor.go
+++ b/sdk/cliproxy/auth/conductor.go
@@ -263,7 +263,6 @@ func (m *Manager) Execute(ctx context.Context, providers []string, req cliproxye
 		return cliproxyexecutor.Response{}, &Error{Code: "provider_not_found", Message: "no provider supplied"}
 	}
 	rotated := m.rotateProviders(req.Model, normalized)
 	defer m.advanceProviderCursor(req.Model, normalized)
 	retryTimes, maxWait := m.retrySettings()
 	attempts := retryTimes + 1
@@ -302,7 +301,6 @@ func (m *Manager) ExecuteCount(ctx context.Context, providers []string, req clip
 		return cliproxyexecutor.Response{}, &Error{Code: "provider_not_found", Message: "no provider supplied"}
 	}
 	rotated := m.rotateProviders(req.Model, normalized)
 	defer m.advanceProviderCursor(req.Model, normalized)
 	retryTimes, maxWait := m.retrySettings()
 	attempts := retryTimes + 1
@@ -341,7 +339,6 @@ func (m *Manager) ExecuteStream(ctx context.Context, providers []string, req cli
 		return nil, &Error{Code: "provider_not_found", Message: "no provider supplied"}
 	}
 	rotated := m.rotateProviders(req.Model, normalized)
 	defer m.advanceProviderCursor(req.Model, normalized)
 	retryTimes, maxWait := m.retrySettings()
 	attempts := retryTimes + 1
@@ -640,13 +637,20 @@ func (m *Manager) normalizeProviders(providers []string) []string {
 	return result
 }
 // rotateProviders returns a rotated view of the providers list starting from the
 // current offset for the model, and atomically increments the offset for the next call.
 // This ensures concurrent requests get different starting providers.
 func (m *Manager) rotateProviders(model string, providers []string) []string {
 	if len(providers) == 0 {
 		return nil
 	}
-	m.mu.RLock()
+
 	// Atomic read-and-increment: get current offset and advance cursor in one lock
 	m.mu.Lock()
 	offset := m.providerOffsets[model]
-	m.mu.RUnlock()
+	m.providerOffsets[model] = (offset + 1) % len(providers)
 	m.mu.Unlock()
 	if len(providers) > 0 {
 		offset %= len(providers)
 	}
@@ -662,19 +666,6 @@ func (m *Manager) rotateProviders(model string, providers []string) []string {
 	return rotated
 }
 func (m *Manager) advanceProviderCursor(model string, providers []string) {
 	if len(providers) == 0 {
 		m.mu.Lock()
 		delete(m.providerOffsets, model)
 		m.mu.Unlock()
 		return
 	}
 	m.mu.Lock()
 	current := m.providerOffsets[model]
 	m.providerOffsets[model] = (current + 1) % len(providers)
 	m.mu.Unlock()
 }
 func (m *Manager) retrySettings() (int, time.Duration) {
 	if m == nil {
 		return 0, 0
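The point of the `rotateProviders` change is that reading the offset and advancing it now happen under one lock, so two concurrent requests cannot observe the same starting provider. A self-contained sketch of that pattern, with a hypothetical `rotator` type and hypothetical provider names, might look like this:

```go
package main

import (
	"fmt"
	"sync"
)

// rotator is a hypothetical stand-in for the Manager's provider cursor.
type rotator struct {
	mu      sync.Mutex
	offsets map[string]int
}

func newRotator() *rotator {
	return &rotator{offsets: make(map[string]int)}
}

// rotate returns providers rotated to start at the current offset for
// model, and advances that offset in the same critical section.
func (r *rotator) rotate(model string, providers []string) []string {
	if len(providers) == 0 {
		return nil
	}
	r.mu.Lock()
	offset := r.offsets[model] % len(providers)
	r.offsets[model] = (offset + 1) % len(providers)
	r.mu.Unlock()

	rotated := make([]string, 0, len(providers))
	rotated = append(rotated, providers[offset:]...)
	rotated = append(rotated, providers[:offset]...)
	return rotated
}

func main() {
	r := newRotator()
	providers := []string{"codex", "claude", "gemini"}

	var wg sync.WaitGroup
	var countMu sync.Mutex
	starts := make(map[string]int)
	for i := 0; i < 300; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			first := r.rotate("some-model", providers)[0]
			countMu.Lock()
			starts[first]++
			countMu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println(starts)
}
```

Because every call takes a distinct step of the cursor, the 300 concurrent calls in `main` spread evenly, with each provider chosen as the starting entry 100 times. With the previous read-then-deferred-advance scheme, requests arriving between the read and the deferred advance could all start from the same provider.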
--- a/sdk/cliproxy/service.go
+++ b/sdk/cliproxy/service.go
@@ -745,6 +745,9 @@ func (s *Service) registerModelsForAuth(a *coreauth.Auth) {
 	case "codex":
 		models = registry.GetOpenAIModels()
 		if entry := s.resolveConfigCodexKey(a); entry != nil {
 			if len(entry.Models) > 0 {
 				models = buildCodexConfigModels(entry)
 			}
 			if authKind == "apikey" {
 				excluded = entry.ExcludedModels
 			}
@@ -1188,3 +1191,41 @@ func buildClaudeConfigModels(entry *config.ClaudeKey) []*ModelInfo {
 	}
 	return out
 }
 func buildCodexConfigModels(entry *config.CodexKey) []*ModelInfo {
 	if entry == nil || len(entry.Models) == 0 {
 		return nil
 	}
 	now := time.Now().Unix()
 	out := make([]*ModelInfo, 0, len(entry.Models))
 	seen := make(map[string]struct{}, len(entry.Models))
 	for i := range entry.Models {
 		model := entry.Models[i]
 		name := strings.TrimSpace(model.Name)
 		alias := strings.TrimSpace(model.Alias)
 		if alias == "" {
 			alias = name
 		}
 		if alias == "" {
 			continue
 		}
 		key := strings.ToLower(alias)
 		if _, exists := seen[key]; exists {
 			continue
 		}
 		seen[key] = struct{}{}
 		display := name
 		if display == "" {
 			display = alias
 		}
 		out = append(out, &ModelInfo{
 			ID:          alias,
 			Object:      "model",
 			Created:     now,
 			OwnedBy:     "openai",
 			Type:        "openai",
 			DisplayName: display,
 		})
 	}
 	return out
 }
Author	SHA1	Message	Date
Luis Pater	e3d8d726e6	Merge branch 'router-for-me:main' into main	2025-12-28 15:09:33 +08:00
Luis Pater	457924828a	Merge pull request #757 from ben-vargas/fix-thinking-toolchoice-conflict Fix: disable thinking when tool_choice forces tool use	2025-12-28 14:04:30 +08:00
Ben Vargas	aca2ef6359	Fix: disable thinking when tool_choice forces tool use Anthropic API does not allow extended thinking when tool_choice is set to "any" or a specific tool. This was causing 400 errors when using features like Amp's /handoff command which forces tool_choice. Added disableThinkingIfToolChoiceForced() that removes thinking config when incompatible tool_choice is detected, applied to both streaming and non-streaming paths. Fixes router-for-me/CLIProxyAPI#630	2025-12-27 16:31:37 -07:00
Luis Pater	ade7194792	feat(management): add generic API call handler to management endpoints	2025-12-28 04:40:32 +08:00
Luis Pater	0f51e73baa	Merge branch 'router-for-me:main' into main	2025-12-28 03:07:58 +08:00
Luis Pater	3a436e116a	feat(cliproxy): implement model aliasing and hashing for Codex configurations, enhance request routing logic, and normalize Codex model entries	2025-12-28 03:06:51 +08:00
Luis Pater	d06e2dc83c	Merge branch 'router-for-me:main' into main	2025-12-28 02:10:16 +08:00
Luis Pater	336867853b	Merge pull request #756 from leaph/check-ai-thinking-settings feat(iflow): add model-specific thinking configs for GLM-4.7 and Mini…	2025-12-28 02:08:27 +08:00
leaph	6403ff4ec4	feat(iflow): add model-specific thinking configs for GLM-4.7 and MiniMax-M2.1 - GLM-4.7: Uses extra_body={"thinking": {"type": "enabled"}, "clear_thinking": false} - MiniMax-M2.1: Uses reasoning_split=true for OpenAI-style reasoning separation - Added preserveReasoningContentInMessages() to support re-injection of reasoning content in assistant message history for multi-turn conversations - Added ThinkingSupport to MiniMax-M2.1 model definition	2025-12-27 18:39:15 +01:00
Luis Pater	d222469b44	Update issue templates	2025-12-28 01:22:42 +08:00
Luis Pater	790a17ce98	Merge pull request #70 from router-for-me/plus v6.6.60	2025-12-28 00:57:14 +08:00
Luis Pater	d473c952fb	Merge branch 'main' into plus	2025-12-28 00:56:04 +08:00
Luis Pater	7646a2b877	Fixed: #749 fix(translators): ensure `gjson.String` content is non-empty before setting `parts` in OpenAI request logic	2025-12-28 00:54:26 +08:00
Luis Pater	62090f2568	Merge pull request #750 from router-for-me/config fix(config): preserve original config structure and avoid default value pollution	2025-12-27 22:10:01 +08:00
Luis Pater	d35152bbef	Merge branch 'router-for-me:main' into main	2025-12-27 22:03:50 +08:00
Luis Pater	c281f4cbaf	Fixed: #747 fix(translators): rename and integrate `usageMetadata` as `cpaUsageMetadata` in Claude processing logic	2025-12-27 22:02:11 +08:00
hkfires	09455f9e85	fix(config): make streaming keepalive and retries ints	2025-12-27 20:56:47 +08:00
hkfires	c8e72ba0dc	fix(config): smart merge writes non-default new keys only	2025-12-27 20:28:54 +08:00
hkfires	375ef252ab	docs(config): clarify merge mapping behavior	2025-12-27 19:30:21 +08:00
hkfires	ee552f8720	chore(config): update ignore patterns	2025-12-27 19:13:14 +08:00
hkfires	2e88c4858e	fix(config): avoid adding new keys when merging	2025-12-27 19:00:47 +08:00
Luis Pater	3f50da85c1	Merge pull request #745 from router-for-me/auth fix(auth): make provider rotation atomic	2025-12-27 13:01:22 +08:00
hkfires	8be06255f7	fix(auth): make provider rotation atomic	2025-12-27 12:56:48 +08:00
Luis Pater	60936b5185	Merge branch 'router-for-me:main' into main	2025-12-27 03:57:03 +08:00
Luis Pater	72274099aa	Fixed: #738 fix(translators): refine prompt token calculation by incorporating cached tokens in Claude response handling	2025-12-27 03:56:11 +08:00