chore(deps): bump golang.org/x/term to v0.37.0

Merge pull request #32 from Ravens2121/master
feat: enhance thinking mode support for Kiro translator
2026-03-12 08:43:58 +00:00 · 2025-12-17 00:19:15 +08:00 · 2025-12-17 00:00:57 +08:00 · 2025-12-16 23:06:06 +08:00 · 2025-12-16 13:19:42 +08:00 · 2025-12-16 13:19:32 +08:00
28 changed files with 488 additions and 1054 deletions
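
Note on the `WriteString` override added to `ResponseWriterWrapper` in `internal/api/middleware/response_writer.go`: `io.WriteString` calls a writer's own `WriteString` method when it has one, so a wrapper that overrides only `Write` never sees those bytes. The following is a minimal standalone sketch of that effect, not the project's code; `inner`, `writeOnly`, `withWriteString`, and the two helper functions are hypothetical names used only for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// inner is a hypothetical stand-in for the wrapped ResponseWriter.
// It offers both Write and WriteString.
type inner struct{ out bytes.Buffer }

func (i *inner) Write(p []byte) (int, error)       { return i.out.Write(p) }
func (i *inner) WriteString(s string) (int, error) { return i.out.WriteString(s) }

// writeOnly overrides Write alone; WriteString is promoted from the embedded *inner.
type writeOnly struct {
	*inner
	logged bytes.Buffer // what the wrapper managed to capture for logging
}

func (w *writeOnly) Write(p []byte) (int, error) {
	n, err := w.inner.Write(p) // write to the client first
	w.logged.Write(p)          // then capture for logging
	return n, err
}

// withWriteString adds a WriteString override on top of writeOnly.
type withWriteString struct{ writeOnly }

func (w *withWriteString) WriteString(s string) (int, error) {
	n, err := w.inner.WriteString(s)
	w.logged.WriteString(s)
	return n, err
}

// loggedWithoutOverride returns what a Write-only wrapper captures
// when the caller writes through io.WriteString.
func loggedWithoutOverride(s string) string {
	w := &writeOnly{inner: &inner{}}
	_, _ = io.WriteString(w, s) // dispatches to the promoted inner.WriteString
	return w.logged.String()
}

// loggedWithOverride does the same with the WriteString override in place.
func loggedWithOverride(s string) string {
	w := &withWriteString{writeOnly{inner: &inner{}}}
	_, _ = io.WriteString(w, s) // dispatches to the wrapper's WriteString
	return w.logged.String()
}

func main() {
	fmt.Printf("without override: %q\n", loggedWithoutOverride("hello"))
	fmt.Printf("with override:    %q\n", loggedWithOverride("hello"))
}
```

In the sketch the client still receives the bytes in both cases; only the captured copy differs, which matches the "missing from request logs" wording in the new method's comment.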
--- a/.dockerignore
+++ b/.dockerignore
@@ -28,3 +28,4 @@ bin/*
 .claude/*
 .vscode/*
 .serena/*
+.bmad/*
--- a/.gitignore
+++ b/.gitignore
@@ -32,6 +32,7 @@ GEMINI.md
 .vscode/*
 .claude/*
 .serena/*
+.bmad/*
 .mcp/cache/

 # macOS
--- a/config.example.yaml
+++ b/config.example.yaml
@@ -25,6 +25,9 @@ remote-management:
  # Disable the bundled management control panel asset download and HTTP route when true.
  disable-control-panel: false

+  # GitHub repository for the management control panel. Accepts a repository URL or releases API URL.
+  panel-github-repository: "https://github.com/router-for-me/Cli-Proxy-API-Management-Center"
+
 # Authentication directory (supports ~ for home directory)
 auth-dir: "~/.cli-proxy-api"

@@ -151,8 +154,8 @@ ws-auth: false
 #   upstream-url: "https://ampcode.com"
 #   # Optional: Override API key for Amp upstream (otherwise uses env or file)
 #   upstream-api-key: ""
-#   # Restrict Amp management routes (/api/auth, /api/user, etc.) to localhost only (recommended)
-#   restrict-management-to-localhost: true
+#   # Restrict Amp management routes (/api/auth, /api/user, etc.) to localhost only (default: false)
+#   restrict-management-to-localhost: false
 #   # Force model mappings to run before checking local API keys (default: false)
 #   force-model-mappings: false
 #   # Amp Model Mappings
--- a/docs/amp-cli-integration.md
+++ b/docs/amp-cli-integration.md
@@ -1,443 +0,0 @@
-# Amp CLI Integration Guide
-
-This guide explains how to use CLIProxyAPI with Amp CLI and Amp IDE extensions, enabling you to use your existing Google/ChatGPT/Claude subscriptions (via OAuth) with Amp's CLI.
-
-## Table of Contents
-
-- [Overview](#overview)
-  - [Which Providers Should You Authenticate?](#which-providers-should-you-authenticate)
-- [Architecture](#architecture)
-- [Configuration](#configuration)
-  - [Model Mapping Configuration](#model-mapping-configuration)
-- [Setup](#setup)
-- [Usage](#usage)
-- [Troubleshooting](#troubleshooting)
-
-## Overview
-
-The Amp CLI integration adds specialized routing to support Amp's API patterns while maintaining full compatibility with all existing CLIProxyAPI features. This allows you to use both traditional CLIProxyAPI features and Amp CLI with the same proxy server.
-
-### Key Features
-
-- **Provider route aliases**: Maps Amp's `/api/provider/{provider}/v1...` patterns to CLIProxyAPI handlers
-- **Management proxy**: Forwards OAuth and account management requests to Amp's control plane
-- **Smart fallback**: Automatically routes unconfigured models to ampcode.com
-- **Model mapping**: Route unavailable models to alternatives you have access to (e.g., `claude-opus-4.5` → `claude-sonnet-4`)
-- **Secret management**: Configurable precedence (config > env > file) with 5-minute caching
-- **Security-first**: Management routes restricted to localhost by default
-- **Automatic gzip handling**: Decompresses responses from Amp upstream
-
-### What You Can Do
-
-- Use Amp CLI with your Google account (Gemini 3 Pro Preview, Gemini 2.5 Pro, Gemini 2.5 Flash)
-- Use Amp CLI with your ChatGPT Plus/Pro subscription (GPT-5, GPT-5 Codex models)
-- Use Amp CLI with your Claude Pro/Max subscription (Claude Sonnet 4.5, Opus 4.1)
-- Use Amp IDE extensions (VS Code, Cursor, Windsurf, etc.) with the same proxy
-- Run multiple CLI tools (Factory + Amp) through one proxy server
-- Route unconfigured models automatically through ampcode.com
-
-### Which Providers Should You Authenticate?
-
-**Important**: The providers you need to authenticate depend on which models and features your installed version of Amp currently uses. Amp employs different providers for various agent modes and specialized subagents:
-
-- **Smart mode**: Uses Google/Gemini models (Gemini 3 Pro)
-- **Rush mode**: Uses Anthropic/Claude models (Claude Haiku 4.5)
-- **Oracle subagent**: Uses OpenAI/GPT models (GPT-5 medium reasoning)
-- **Librarian subagent**: Uses Anthropic/Claude models (Claude Sonnet 4.5)
-- **Search subagent**: Uses Anthropic/Claude models (Claude Haiku 4.5)
-- **Review feature**: Uses Google/Gemini models (Gemini 2.5 Flash-Lite)
-
-For the most current information about which models Amp uses, see the **[Amp Models Documentation](https://ampcode.com/models)**.
-
-#### Fallback Behavior
-
-CLIProxyAPI uses a smart fallback system:
-
-1. **Provider authenticated locally** (`--login`, `--codex-login`, `--claude-login`):
-   - Requests use **your OAuth subscription** (ChatGPT Plus/Pro, Claude Pro/Max, Google account)
-   - You benefit from your subscription's included usage quotas
-   - No Amp credits consumed
-
-2. **Provider NOT authenticated locally**:
-   - Requests automatically forward to **ampcode.com**
-   - Uses Amp's backend provider connections
-   - **Requires Amp credits** if the provider is paid (OpenAI, Anthropic paid tiers)
-   - May result in errors if Amp credit balance is insufficient
-
-**Recommendation**: Authenticate all providers you have subscriptions for to maximize value and minimize Amp credit usage. If you don't have subscriptions to all providers Amp uses, ensure you have sufficient Amp credits available for fallback requests.
-
-## Architecture
-
-### Request Flow
-
-```
-Amp CLI/IDE
-  ↓
-  ├─ Provider API requests (/api/provider/{provider}/v1/...)
-  │   ↓
-  │   ├─ Model configured locally?
-  │   │   YES → Use local OAuth tokens (OpenAI/Claude/Gemini handlers)
-  │   │   NO  ↓
-  │   │       ├─ Model mapping configured?
-  │   │       │   YES → Rewrite model → Use local handler (free)
-  │   │       │   NO  → Forward to ampcode.com (uses Amp credits)
-  │   ↓
-  │   Response
-  │
-  └─ Management requests (/api/auth, /api/user, /api/threads, ...)
-      ↓
-      ├─ Localhost check (security)
-      ↓
-      └─ Reverse proxy to ampcode.com
-          ↓
-          Response (auto-decompressed if gzipped)
-```
-
-### Components
-
-The Amp integration is implemented as a modular routing module (`internal/api/modules/amp/`) with these components:
-
-1. **Route Aliases** (`routes.go`): Maps Amp-style paths to standard handlers
-2. **Reverse Proxy** (`proxy.go`): Forwards management requests to ampcode.com
-3. **Fallback Handler** (`fallback_handlers.go`): Routes unconfigured models to ampcode.com
-4. **Secret Management** (`secret.go`): Multi-source API key resolution with caching
-5. **Main Module** (`amp.go`): Orchestrates registration and configuration
-
-## Configuration
-
-### Basic Configuration
-
-Add these fields to your `config.yaml`:
-
-```yaml
-# Amp upstream control plane (required for management routes)
-amp-upstream-url: "https://ampcode.com"
-
-# Optional: Override API key (otherwise uses env or file)
-# amp-upstream-api-key: "your-amp-api-key"
-
-# Security: restrict management routes to localhost (recommended)
-amp-restrict-management-to-localhost: true
-```
-
-### Model Mapping Configuration
-
-When Amp CLI requests a model that you don't have access to, you can configure mappings to route those requests to alternative models that you DO have available. This avoids consuming Amp credits for models you could handle locally.
-
-```yaml
-# Route unavailable models to alternatives
-amp-model-mappings:
-  # Example: Route Claude Opus 4.5 requests to Claude Sonnet 4
-  - from: "claude-opus-4.5"
-    to: "claude-sonnet-4"
-  
-  # Example: Route GPT-5 requests to Gemini 2.5 Pro
-  - from: "gpt-5"
-    to: "gemini-2.5-pro"
-  
-  # Example: Map older model names to newer versions
-  - from: "claude-3-opus-20240229"
-    to: "claude-3-5-sonnet-20241022"
-```
-
-**How it works:**
-
-1. Amp CLI requests a model (e.g., `claude-opus-4.5`)
-2. CLIProxyAPI checks if a local provider is available for that model
-3. If not available, it checks the model mappings
-4. If a mapping exists, the request is rewritten to use the target model
-5. The request is then handled locally (free, using your OAuth subscription)
-
-**Benefits:**
-- **Save Amp credits**: Use your local subscriptions instead of forwarding to ampcode.com
-- **Hot-reload**: Mappings can be updated without restarting the proxy
-- **Structured logging**: Clear logs show when mappings are applied
-
-**Routing Decision Logs:**
-
-The proxy logs each routing decision with structured fields:
-
-```
-[AMP] Using local provider for model: gemini-2.5-pro          # Local provider (free)
-[AMP] Model mapped: claude-opus-4.5 -> claude-sonnet-4        # Mapping applied (free)
-[AMP] Forwarding to ampcode.com (uses Amp credits) - model_id: gpt-5  # Fallback (costs credits)
-```
-
-### Secret Resolution Precedence
-
-The Amp module resolves API keys using this precedence order:
-
-| Source | Key | Priority | Cache |
-|--------|-----|----------|-------|
-| Config file | `amp-upstream-api-key` | High | No |
-| Environment | `AMP_API_KEY` | Medium | No |
-| Amp secrets file | `~/.local/share/amp/secrets.json` | Low | 5 min |
-
-**Recommendation**: Use the Amp secrets file (lowest precedence) for normal usage. This file is automatically managed by `amp login`.
-
-### Security Settings
-
-**`amp-restrict-management-to-localhost`** (default: `true`)
-
-When enabled, management routes (`/api/auth`, `/api/user`, `/api/threads`, etc.) only accept connections from localhost (127.0.0.1, ::1). This prevents:
-- Drive-by browser attacks
-- Remote access to management endpoints
-- CORS-based attacks
-- Header spoofing attacks (e.g., `X-Forwarded-For: 127.0.0.1`)
-
-#### How It Works
-
-This restriction uses the **actual TCP connection address** (`RemoteAddr`), not HTTP headers like `X-Forwarded-For`. This prevents header spoofing attacks but has important implications:
-
-- ✅ **Works for direct connections**: Running CLIProxyAPI directly on your machine or server
-- ⚠️ **May not work behind reverse proxies**: If deploying behind nginx, Cloudflare, or other proxies, the connection will appear to come from the proxy's IP, not localhost
-
-#### Reverse Proxy Deployments
-
-If you need to run CLIProxyAPI behind a reverse proxy (nginx, Caddy, Cloudflare Tunnel, etc.):
-
-1. **Disable the localhost restriction**:
-   ```yaml
-   amp-restrict-management-to-localhost: false
-   ```
-
-2. **Use alternative security measures**:
-   - Firewall rules restricting access to management routes
-   - Proxy-level authentication (HTTP Basic Auth, OAuth)
-   - Network-level isolation (VPN, Tailscale, Cloudflare Access)
-   - Bind CLIProxyAPI to `127.0.0.1` only and access via SSH tunnel
-
-3. **Example nginx configuration** (blocks external access to management routes):
-   ```nginx
-   location /api/auth { deny all; }
-   location /api/user { deny all; }
-   location /api/threads { deny all; }
-   location /api/internal { deny all; }
-   ```
-
-**Important**: Only disable `amp-restrict-management-to-localhost` if you understand the security implications and have other protections in place.
-
-## Setup
-
-### 1. Configure CLIProxyAPI
-
-Create or edit `config.yaml`:
-
-```yaml
-port: 8317
-auth-dir: "~/.cli-proxy-api"
-
-# Amp integration
-amp-upstream-url: "https://ampcode.com"
-amp-restrict-management-to-localhost: true
-
-# Other standard settings...
-debug: false
-logging-to-file: true
-```
-
-### 2. Authenticate with Providers
-
-Run OAuth login for the providers you want to use:
-
-**Google Account (Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 3 Pro Preview):**
-```bash
-./cli-proxy-api --login
-```
-
-**ChatGPT Plus/Pro (GPT-5, GPT-5 Codex):**
-```bash
-./cli-proxy-api --codex-login
-```
-
-**Claude Pro/Max (Claude Sonnet 4.5, Opus 4.1):**
-```bash
-./cli-proxy-api --claude-login
-```
-
-Tokens are saved to:
-- Gemini: `~/.cli-proxy-api/gemini-<email>.json`
-- OpenAI Codex: `~/.cli-proxy-api/codex-<email>.json`
-- Claude: `~/.cli-proxy-api/claude-<email>.json`
-
-### 3. Start the Proxy
-
-```bash
-./cli-proxy-api --config config.yaml
-```
-
-Or run in background with tmux (recommended for remote servers):
-
-```bash
-tmux new-session -d -s proxy "./cli-proxy-api --config config.yaml"
-```
-
-### 4. Configure Amp CLI
-
-#### Option A: Settings File
-
-Edit `~/.config/amp/settings.json`:
-
-```json
-{
-  "amp.url": "http://localhost:8317"
-}
-```
-
-#### Option B: Environment Variable
-
-```bash
-export AMP_URL=http://localhost:8317
-```
-
-### 5. Login and Use Amp
-
-Login through the proxy (proxied to ampcode.com):
-
-```bash
-amp login
-```
-
-Use Amp as normal:
-
-```bash
-amp "Write a hello world program in Python"
-```
-
-### 6. (Optional) Configure Amp IDE Extension
-
-The proxy also works with Amp IDE extensions for VS Code, Cursor, Windsurf, etc.
-
-1. Open Amp extension settings in your IDE
-2. Set **Amp URL** to `http://localhost:8317`
-3. Login with your Amp account
-4. Start using Amp in your IDE
-
-Both CLI and IDE can use the proxy simultaneously.
-
-## Usage
-
-### Supported Routes
-
-#### Provider Aliases (Always Available)
-
-These routes work even without `amp-upstream-url` configured:
-
-- `/api/provider/openai/v1/chat/completions`
-- `/api/provider/openai/v1/responses`
-- `/api/provider/anthropic/v1/messages`
-- `/api/provider/google/v1beta/models/:action`
-
-Amp CLI calls these routes with your OAuth-authenticated models configured in CLIProxyAPI.
-
-#### Management Routes (Require `amp-upstream-url`)
-
-These routes are proxied to ampcode.com:
-
-- `/api/auth` - Authentication
-- `/api/user` - User profile
-- `/api/meta` - Metadata
-- `/api/threads` - Conversation threads
-- `/api/telemetry` - Usage telemetry
-- `/api/internal` - Internal APIs
-
-**Security**: Restricted to localhost by default.
-
-### Model Fallback Behavior
-
-When Amp requests a model:
-
-1. **Check local configuration**: Does CLIProxyAPI have OAuth tokens for this model's provider?
-2. **If YES**: Route to local handler (use your OAuth subscription)
-3. **If NO**: Check if a model mapping exists
-4. **If mapping exists**: Rewrite request to mapped model → Route to local handler (free)
-5. **If no mapping**: Forward to ampcode.com (uses Amp credits)
-
-This enables seamless mixed usage:
-- Models you've configured (Gemini, ChatGPT, Claude) → Your OAuth subscriptions
-- Models with mappings configured → Routed to alternative local models (free)
-- Models you haven't configured and have no mapping → Amp's default providers (uses credits)
-
-### Example API Calls
-
-**Chat completion with local OAuth:**
-```bash
-curl http://localhost:8317/api/provider/openai/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "gpt-5",
-    "messages": [{"role": "user", "content": "Hello"}]
-  }'
-```
-
-**Management endpoint (localhost only):**
-```bash
-curl http://localhost:8317/api/user
-```
-
-## Troubleshooting
-
-### Common Issues
-
-| Symptom | Likely Cause | Fix |
-|---------|--------------|-----|
-| 404 on `/api/provider/...` | Incorrect route path | Ensure exact path: `/api/provider/{provider}/v1...` |
-| 403 on `/api/user` | Non-localhost request | Run from same machine or disable `amp-restrict-management-to-localhost` (not recommended) |
-| 401/403 from provider | Missing/expired OAuth | Re-run `--codex-login` or `--claude-login` |
-| Amp gzip errors | Response decompression issue | Update to latest build; auto-decompression should handle this |
-| Models not using proxy | Wrong Amp URL | Verify `amp.url` setting or `AMP_URL` environment variable |
-| CORS errors | Protected management endpoint | Use CLI/terminal, not browser |
-
-### Diagnostics
-
-**Check proxy logs:**
-```bash
-# If logging-to-file: true
-tail -f logs/requests.log
-
-# If running in tmux
-tmux attach-session -t proxy
-```
-
-**Enable debug mode** (temporarily):
-```yaml
-debug: true
-```
-
-**Test basic connectivity:**
-```bash
-# Check if proxy is running
-curl http://localhost:8317/v1/models
-
-# Check Amp-specific route
-curl http://localhost:8317/api/provider/openai/v1/models
-```
-
-**Verify Amp configuration:**
-```bash
-# Check if Amp is using proxy
-amp config get amp.url
-
-# Or check environment
-echo $AMP_URL
-```
-
-### Security Checklist
-
-- ✅ Keep `amp-restrict-management-to-localhost: true` (default)
-- ✅ Don't expose proxy publicly (bind to localhost or use firewall/VPN)
-- ✅ Use the Amp secrets file (`~/.local/share/amp/secrets.json`) managed by `amp login`
-- ✅ Rotate OAuth tokens periodically by re-running login commands
-- ✅ Store config and auth-dir on encrypted disk if handling sensitive data
-- ✅ Keep proxy binary up to date for security fixes
-
-## Additional Resources
-
-- [CLIProxyAPI Main Documentation](https://help.router-for.me/)
-- [Amp CLI Official Manual](https://ampcode.com/manual)
-- [Management API Reference](https://help.router-for.me/management/api)
-- [SDK Documentation](sdk-usage.md)
-
-## Disclaimer
-
-This integration is for personal/educational use. Using reverse proxies or alternate API bases may violate provider Terms of Service. You are solely responsible for how you use this software. Accounts may be rate-limited, locked, or banned. No warranties. Use at your own risk.
--- a/docs/amp-cli-integration_CN.md
+++ b/docs/amp-cli-integration_CN.md
@@ -1,392 +0,0 @@
-# Amp CLI 集成指南
-
-本指南说明如何在 Amp CLI 和 Amp IDE 扩展中使用 CLIProxyAPI，通过 OAuth 让你能够把已有的 Google/ChatGPT/Claude 订阅与 Amp 的 CLI 一起使用。
-
-## 目录
-
-- [概述](#概述)
-  - [应该认证哪些服务提供商？](#应该认证哪些服务提供商)
-- [架构](#架构)
-- [配置](#配置)
-- [设置](#设置)
-- [用法](#用法)
-- [故障排查](#故障排查)
-
-## 概述
-
-Amp CLI 集成为 Amp 的 API 模式添加了专用路由，同时保持与现有 CLIProxyAPI 功能的完全兼容。这样你可以在同一个代理服务器上同时使用传统 CLIProxyAPI 功能和 Amp CLI。
-
-### 主要特性
-
-- **提供者路由别名**：将 Amp 的 `/api/provider/{provider}/v1...` 路径映射到 CLIProxyAPI 处理器
-- **管理代理**：将 OAuth 和账号管理请求转发到 Amp 控制平面
-- **智能回退**：自动将未配置的模型路由到 ampcode.com
-- **密钥管理**：可配置优先级（配置 > 环境变量 > 文件），缓存 5 分钟
-- **安全优先**：管理路由默认限制为 localhost
-- **自动 gzip 处理**：自动解压来自 Amp 上游的响应
-
-### 你可以做什么
-
-- 使用 Amp CLI 搭配你的 Google 账号（Gemini 3 Pro Preview、Gemini 2.5 Pro、Gemini 2.5 Flash）
-- 使用 Amp CLI 搭配你的 ChatGPT Plus/Pro 订阅（GPT-5、GPT-5 Codex 模型）
-- 使用 Amp CLI 搭配你的 Claude Pro/Max 订阅（Claude Sonnet 4.5、Opus 4.1）
-- 将 Amp IDE 扩展（VS Code、Cursor、Windsurf 等）与同一个代理一起使用
-- 通过一个代理同时运行多个 CLI 工具（Factory + Amp）
-- 将未配置的模型自动路由到 ampcode.com
-
-### 应该认证哪些服务提供商？
-
-**重要**：需要认证的提供商取决于你安装的 Amp 版本当前使用的模型和功能。Amp 的不同智能模式和子代理会使用不同的提供商：
-
-- **Smart 模式**：使用 Google/Gemini 模型（Gemini 3 Pro）
-- **Rush 模式**：使用 Anthropic/Claude 模型（Claude Haiku 4.5）
-- **Oracle 子代理**：使用 OpenAI/GPT 模型（GPT-5 medium reasoning）
-- **Librarian 子代理**：使用 Anthropic/Claude 模型（Claude Sonnet 4.5）
-- **Search 子代理**：使用 Anthropic/Claude 模型（Claude Haiku 4.5）
-- **Review 功能**：使用 Google/Gemini 模型（Gemini 2.5 Flash-Lite）
-
-有关 Amp 当前使用哪些模型的最新信息，请参阅 **[Amp 模型文档](https://ampcode.com/models)**。
-
-#### 回退行为
-
-CLIProxyAPI 采用智能回退机制：
-
-1. **本地已认证提供商**（`--login`、`--codex-login`、`--claude-login`）：
-   - 请求使用**你的 OAuth 订阅**（ChatGPT Plus/Pro、Claude Pro/Max、Google 账号）
-   - 享受订阅自带的额度
-   - 不消耗 Amp 额度
-
-2. **本地未认证提供商**：
-   - 请求自动转发到 **ampcode.com**
-   - 使用 Amp 的后端提供商连接
-   - 如果提供商是付费的（OpenAI、Anthropic 付费档），**需要消耗 Amp 额度**
-   - 若 Amp 额度不足，可能产生错误
-
-**建议**：对你有订阅的所有提供商都进行认证，以最大化价值并尽量减少 Amp 额度消耗。如果没有覆盖 Amp 使用的全部提供商，请确保为回退请求准备足够的 Amp 额度。
-
-## 架构
-
-### 请求流
-
-```
-Amp CLI/IDE
-  ↓
-  ├─ Provider API requests (/api/provider/{provider}/v1/...)
-  │   ↓
-  │   ├─ Model configured locally?
-  │   │   YES → Use local OAuth tokens (OpenAI/Claude/Gemini handlers)
-  │   │   NO  → Forward to ampcode.com (reverse proxy)
-  │   ↓
-  │   Response
-  │
-  └─ Management requests (/api/auth, /api/user, /api/threads, ...)
-      ↓
-      ├─ Localhost check (security)
-      ↓
-      └─ Reverse proxy to ampcode.com
-          ↓
-          Response (auto-decompressed if gzipped)
-```
-
-### 组件
-
-Amp 集成以模块化路由模块（`internal/api/modules/amp/`）实现，包含以下组件：
-
-1. **路由别名**（`routes.go`）：将 Amp 风格的路径映射到标准处理器
-2. **反向代理**（`proxy.go`）：将管理请求转发到 ampcode.com
-3. **回退处理器**（`fallback_handlers.go`）：将未配置的模型路由到 ampcode.com
-4. **密钥管理**（`secret.go`）：多来源 API 密钥解析并带缓存
-5. **主模块**（`amp.go`）：负责注册和配置
-
-## 配置
-
-### 基础配置
-
-在 `config.yaml` 中新增以下字段：
-
-```yaml
-# Amp 上游控制平面（管理路由必需）
-amp-upstream-url: "https://ampcode.com"
-
-# 可选：覆盖 API key（否则使用环境变量或文件）
-# amp-upstream-api-key: "your-amp-api-key"
-
-# 安全性：将管理路由限制为 localhost（推荐）
-amp-restrict-management-to-localhost: true
-```
-
-### 密钥解析优先级
-
-Amp 模块以如下优先级解析 API key：
-
-| 来源 | 键名 | 优先级 | 缓存 |
-|------|------|--------|------|
-| 配置文件 | `amp-upstream-api-key` | 高 | 无 |
-| 环境变量 | `AMP_API_KEY` | 中 | 无 |
-| Amp 密钥文件 | `~/.local/share/amp/secrets.json` | 低 | 5 分钟 |
-
-**建议**：日常使用时采用 Amp 密钥文件（最低优先级）。该文件由 `amp login` 自动管理。
-
-### 安全设置
-
-**`amp-restrict-management-to-localhost`**（默认：`true`）
-
-启用后，管理路由（`/api/auth`、`/api/user`、`/api/threads` 等）只接受来自 localhost（127.0.0.1、::1）的连接，可防止：
-- 浏览器探测式攻击
-- 对管理端点的远程访问
-- 基于 CORS 的攻击
-- 伪造头攻击（例如 `X-Forwarded-For: 127.0.0.1`）
-
-#### 工作原理
-
-此限制使用**实际的 TCP 连接地址**（`RemoteAddr`），而非 `X-Forwarded-For` 等 HTTP 头，能防止头部伪造，但有重要影响：
-
-- ✅ **直接连接可用**：在本机或服务器直接运行 CLIProxyAPI 时适用
-- ⚠️ **可能不适用于反向代理场景**：部署在 nginx、Cloudflare 等代理后，请求源会显示为代理 IP 而非 localhost
-
-#### 反向代理部署
-
-若需要在反向代理（nginx、Caddy、Cloudflare Tunnel 等）后运行 CLIProxyAPI：
-
-1. **关闭 localhost 限制**：
-   ```yaml
-   amp-restrict-management-to-localhost: false
-   ```
-
-2. **使用替代安全措施**：
-   - 防火墙规则限制管理路由访问
-   - 代理层认证（HTTP Basic Auth、OAuth）
-   - 网络隔离（VPN、Tailscale、Cloudflare Access）
-   - 将 CLIProxyAPI 仅绑定 `127.0.0.1`，并通过 SSH 隧道访问
-
-3. **nginx 示例配置**（阻止外部访问管理路由）：
-   ```nginx
-   location /api/auth { deny all; }
-   location /api/user { deny all; }
-   location /api/threads { deny all; }
-   location /api/internal { deny all; }
-   ```
-
-**重要**：只有在理解安全影响并已采取其他防护措施时，才关闭 `amp-restrict-management-to-localhost`。
-
-## 设置
-
-### 1. 配置 CLIProxyAPI
-
-创建或编辑 `config.yaml`：
-
-```yaml
-port: 8317
-auth-dir: "~/.cli-proxy-api"
-
-# Amp 集成
-amp-upstream-url: "https://ampcode.com"
-amp-restrict-management-to-localhost: true
-
-# 其他常规设置...
-debug: false
-logging-to-file: true
-```
-
-### 2. 认证提供商
-
-为要使用的提供商执行 OAuth 登录：
-
-**Google 账号（Gemini 2.5 Pro、Gemini 2.5 Flash、Gemini 3 Pro Preview）：**
-```bash
-./cli-proxy-api --login
-```
-
-**ChatGPT Plus/Pro（GPT-5、GPT-5 Codex）：**
-```bash
-./cli-proxy-api --codex-login
-```
-
-**Claude Pro/Max（Claude Sonnet 4.5、Opus 4.1）：**
-```bash
-./cli-proxy-api --claude-login
-```
-
-令牌会保存到：
-- Gemini: `~/.cli-proxy-api/gemini-<email>.json`
-- OpenAI Codex: `~/.cli-proxy-api/codex-<email>.json`
-- Claude: `~/.cli-proxy-api/claude-<email>.json`
-
-### 3. 启动代理
-
-```bash
-./cli-proxy-api --config config.yaml
-```
-
-或使用 tmux 在后台运行（推荐用于远程服务器）：
-
-```bash
-tmux new-session -d -s proxy "./cli-proxy-api --config config.yaml"
-```
-
-### 4. 配置 Amp CLI
-
-#### 方案 A：配置文件
-
-编辑 `~/.config/amp/settings.json`：
-
-```json
-{
-  "amp.url": "http://localhost:8317"
-}
-```
-
-#### 方案 B：环境变量
-
-```bash
-export AMP_URL=http://localhost:8317
-```
-
-### 5. 登录并使用 Amp
-
-通过代理登录（请求会被代理到 ampcode.com）：
-
-```bash
-amp login
-```
-
-像平常一样使用 Amp：
-
-```bash
-amp "Write a hello world program in Python"
-```
-
-### 6. （可选）配置 Amp IDE 扩展
-
-该代理同样适用于 VS Code、Cursor、Windsurf 等 Amp IDE 扩展。
-
-1. 在 IDE 中打开 Amp 扩展设置
-2. 将 **Amp URL** 设置为 `http://localhost:8317`
-3. 用你的 Amp 账号登录
-4. 在 IDE 中开始使用 Amp
-
-CLI 和 IDE 可同时使用该代理。
-
-## 用法
-
-### 支持的路由
-
-#### 提供商别名（始终可用）
-
-这些路由即使未配置 `amp-upstream-url` 也可使用：
-
-- `/api/provider/openai/v1/chat/completions`
-- `/api/provider/openai/v1/responses`
-- `/api/provider/anthropic/v1/messages`
-- `/api/provider/google/v1beta/models/:action`
-
-Amp CLI 会使用你在 CLIProxyAPI 中通过 OAuth 认证的模型来调用这些路由。
-
-#### 管理路由（需要 `amp-upstream-url`）
-
-这些路由会被代理到 ampcode.com：
-
-- `/api/auth` - 认证
-- `/api/user` - 用户资料
-- `/api/meta` - 元数据
-- `/api/threads` - 会话线程
-- `/api/telemetry` - 使用遥测
-- `/api/internal` - 内部 API
-
-**安全性**：默认限制为 localhost。
-
-### 模型回退行为
-
-当 Amp 请求模型时：
-
-1. **检查本地配置**：CLIProxyAPI 是否有该模型提供商的 OAuth 令牌？
-2. **如果有**：路由到本地处理器（使用你的 OAuth 订阅）
-3. **如果没有**：转发到 ampcode.com（使用 Amp 的默认路由）
-
-这实现了无缝混用：
-- 你已配置的模型（Gemini、ChatGPT、Claude）→ 你的 OAuth 订阅
-- 未配置的模型 → Amp 的默认提供商
-
-### 示例 API 调用
-
-**使用本地 OAuth 的聊天补全：**
-```bash
-curl http://localhost:8317/api/provider/openai/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -d '{
-    "model": "gpt-5",
-    "messages": [{"role": "user", "content": "Hello"}]
-  }'
-```
-
-**管理端点（仅限 localhost）：**
-```bash
-curl http://localhost:8317/api/user
-```
-
-## 故障排查
-
-### 常见问题
-
-| 症状 | 可能原因 | 解决方案 |
-|------|----------|----------|
-| `/api/provider/...` 返回 404 | 路径错误 | 确保路径准确：`/api/provider/{provider}/v1...` |
-| `/api/user` 返回 403 | 非 localhost 请求 | 在同一机器上访问，或关闭 `amp-restrict-management-to-localhost`（不推荐） |
-| 提供商返回 401/403 | OAuth 缺失或过期 | 重新运行 `--codex-login` 或 `--claude-login` |
-| Amp gzip 错误 | 响应解压问题 | 更新到最新构建；自动解压应能处理 |
-| 模型未走代理 | Amp URL 设置错误 | 检查 `amp.url` 设置或 `AMP_URL` 环境变量 |
-| CORS 错误 | 受保护的管理端点 | 使用 CLI/终端而非浏览器 |
-
-### 诊断
-
-**查看代理日志：**
-```bash
-# 若 logging-to-file: true
-tail -f logs/requests.log
-
-# 若运行在 tmux 中
-tmux attach-session -t proxy
-```
-
-**临时开启调试模式：**
-```yaml
-debug: true
-```
-
-**测试基础连通性：**
-```bash
-# 检查代理是否运行
-curl http://localhost:8317/v1/models
-
-# 检查 Amp 特定路由
-curl http://localhost:8317/api/provider/openai/v1/models
-```
-
-**验证 Amp 配置：**
-```bash
-# 检查 Amp 是否使用代理
-amp config get amp.url
-
-# 或检查环境变量
-echo $AMP_URL
-```
-
-### 安全清单
-
-- ✅ 保持 `amp-restrict-management-to-localhost: true`（默认）
-- ✅ 不要将代理暴露到公共网络（绑定到 localhost 或使用防火墙/VPN）
-- ✅ 使用 `amp login` 管理的 Amp 密钥文件（`~/.local/share/amp/secrets.json`）
-- ✅ 定期重新登录轮换 OAuth 令牌
-- ✅ 若处理敏感数据，使用加密磁盘存储配置和 auth-dir
-- ✅ 保持代理二进制为最新版本以获取安全修复
-
-## 其他资源
-
-- [CLIProxyAPI 主文档](https://help.router-for.me/)
-- [Amp CLI 官方手册](https://ampcode.com/manual)
-- [管理 API 参考](https://help.router-for.me/management/api)
-- [SDK 文档](sdk-usage.md)
-
-## 免责声明
-
-此集成仅用于个人或教育用途。使用反向代理或替代 API 基址可能违反提供商的服务条款。你需要对自己的使用方式负责。账号可能会被限速、锁定或封禁。软件不附带任何保证，使用风险自负。
--- a/go.mod
+++ b/go.mod
@@ -18,10 +18,10 @@ require (
 	github.com/tidwall/gjson v1.18.0
 	github.com/tidwall/sjson v1.2.5
 	github.com/tiktoken-go/tokenizer v0.7.0
-	golang.org/x/crypto v0.43.0
-	golang.org/x/net v0.46.0
+	golang.org/x/crypto v0.45.0
+	golang.org/x/net v0.47.0
 	golang.org/x/oauth2 v0.30.0
-	golang.org/x/term v0.36.0
+	golang.org/x/term v0.37.0
 	gopkg.in/natefinch/lumberjack.v2 v2.2.1
 	gopkg.in/yaml.v3 v3.0.1
 )
@@ -69,9 +69,9 @@ require (
 	github.com/twitchyliquid64/golang-asm v0.15.1 // indirect
 	github.com/ugorji/go/codec v1.2.12 // indirect
 	golang.org/x/arch v0.8.0 // indirect
-	golang.org/x/sync v0.17.0 // indirect
-	golang.org/x/sys v0.37.0 // indirect
-	golang.org/x/text v0.30.0 // indirect
+	golang.org/x/sync v0.18.0 // indirect
+	golang.org/x/sys v0.38.0 // indirect
+	golang.org/x/text v0.31.0 // indirect
 	google.golang.org/protobuf v1.34.1 // indirect
 	gopkg.in/ini.v1 v1.67.0 // indirect
 )
--- a/go.sum
+++ b/go.sum
@@ -160,23 +160,23 @@ github.com/ugorji/go/codec v1.2.12/go.mod h1:UNopzCgEMSXjBc6AOMqYvWC1ktqTAfzJZUZ
 golang.org/x/arch v0.0.0-20210923205945-b76863e36670/go.mod h1:5om86z9Hs0C8fWVUuoMHwpExlXzs5Tkyp9hOrfG7pp8=
 golang.org/x/arch v0.8.0 h1:3wRIsP3pM4yUptoR96otTUOXI367OS0+c9eeRi9doIc=
 golang.org/x/arch v0.8.0/go.mod h1:FEVrYAQjsQXMVJ1nsMoVVXPZg6p2JE2mx8psSWTDQys=
-golang.org/x/crypto v0.43.0 h1:dduJYIi3A3KOfdGOHX8AVZ/jGiyPa3IbBozJ5kNuE04=
-golang.org/x/crypto v0.43.0/go.mod h1:BFbav4mRNlXJL4wNeejLpWxB7wMbc79PdRGhWKncxR0=
-golang.org/x/net v0.46.0 h1:giFlY12I07fugqwPuWJi68oOnpfqFnJIJzaIIm2JVV4=
-golang.org/x/net v0.46.0/go.mod h1:Q9BGdFy1y4nkUwiLvT5qtyhAnEHgnQ/zd8PfU6nc210=
+golang.org/x/crypto v0.45.0 h1:jMBrvKuj23MTlT0bQEOBcAE0mjg8mK9RXFhRH6nyF3Q=
+golang.org/x/crypto v0.45.0/go.mod h1:XTGrrkGJve7CYK7J8PEww4aY7gM3qMCElcJQ8n8JdX4=
+golang.org/x/net v0.47.0 h1:Mx+4dIFzqraBXUugkia1OOvlD6LemFo1ALMHjrXDOhY=
+golang.org/x/net v0.47.0/go.mod h1:/jNxtkgq5yWUGYkaZGqo27cfGZ1c5Nen03aYrrKpVRU=
 golang.org/x/oauth2 v0.30.0 h1:dnDm7JmhM45NNpd8FDDeLhK6FwqbOf4MLCM9zb1BOHI=
 golang.org/x/oauth2 v0.30.0/go.mod h1:B++QgG3ZKulg6sRPGD/mqlHQs5rB3Ml9erfeDY7xKlU=
-golang.org/x/sync v0.17.0 h1:l60nONMj9l5drqw6jlhIELNv9I0A4OFgRsG9k2oT9Ug=
-golang.org/x/sync v0.17.0/go.mod h1:9KTHXmSnoGruLpwFjVSX0lNNA75CykiMECbovNTZqGI=
+golang.org/x/sync v0.18.0 h1:kr88TuHDroi+UVf+0hZnirlk8o8T+4MrK6mr60WkH/I=
+golang.org/x/sync v0.18.0/go.mod h1:9KTHXmSnoGruLpwFjVSX0lNNA75CykiMECbovNTZqGI=
 golang.org/x/sys v0.0.0-20220715151400-c0bba94af5f8/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
 golang.org/x/sys v0.1.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
 golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
-golang.org/x/sys v0.37.0 h1:fdNQudmxPjkdUTPnLn5mdQv7Zwvbvpaxqs831goi9kQ=
-golang.org/x/sys v0.37.0/go.mod h1:OgkHotnGiDImocRcuBABYBEXf8A9a87e/uXjp9XT3ks=
-golang.org/x/term v0.36.0 h1:zMPR+aF8gfksFprF/Nc/rd1wRS1EI6nDBGyWAvDzx2Q=
-golang.org/x/term v0.36.0/go.mod h1:Qu394IJq6V6dCBRgwqshf3mPF85AqzYEzofzRdZkWss=
-golang.org/x/text v0.30.0 h1:yznKA/E9zq54KzlzBEAWn1NXSQ8DIp/NYMy88xJjl4k=
-golang.org/x/text v0.30.0/go.mod h1:yDdHFIX9t+tORqspjENWgzaCVXgk0yYnYuSZ8UzzBVM=
+golang.org/x/sys v0.38.0 h1:3yZWxaJjBmCWXqhN1qh02AkOnCQ1poK6oF+a7xWL6Gc=
+golang.org/x/sys v0.38.0/go.mod h1:OgkHotnGiDImocRcuBABYBEXf8A9a87e/uXjp9XT3ks=
+golang.org/x/term v0.37.0 h1:8EGAD0qCmHYZg6J17DvsMy9/wJ7/D/4pV/wfnld5lTU=
+golang.org/x/term v0.37.0/go.mod h1:5pB4lxRNYYVZuTLmy8oR2BH8dflOR+IbTYFD8fi3254=
+golang.org/x/text v0.31.0 h1:aC8ghyu4JhP8VojJ2lEHBnochRno1sgL6nEi9WGFGMM=
+golang.org/x/text v0.31.0/go.mod h1:tKRAlv61yKIjGGHX/4tP1LTbc13YSec1pxVEWXzfoeM=
 golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543 h1:E7g+9GITq07hpfrRu66IVDexMakfv52eLZ2CXBWiKr4=
 golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
 google.golang.org/protobuf v1.34.1 h1:9ddQBjfCyZPOHPUiPxpYESBLc+T8P3E+Vo4IbKZgFWg=
--- a/internal/api/middleware/response_writer.go
+++ b/internal/api/middleware/response_writer.go
@@ -71,22 +71,64 @@ func (w *ResponseWriterWrapper) Write(data []byte) (int, error) {
 	n, err := w.ResponseWriter.Write(data)

 	// THEN: Handle logging based on response type
-	if w.isStreaming {
+	if w.isStreaming && w.chunkChannel != nil {
 		// For streaming responses: Send to async logging channel (non-blocking)
-		if w.chunkChannel != nil {
-			select {
-			case w.chunkChannel <- append([]byte(nil), data...): // Non-blocking send with copy
-			default: // Channel full, skip logging to avoid blocking
-			}
+		select {
+		case w.chunkChannel <- append([]byte(nil), data...): // Non-blocking send with copy
+		default: // Channel full, skip logging to avoid blocking
 		}
-	} else {
-		// For non-streaming responses: Buffer complete response
+		return n, err
+	}
+
+	if w.shouldBufferResponseBody() {
 		w.body.Write(data)
 	}

 	return n, err
 }

+func (w *ResponseWriterWrapper) shouldBufferResponseBody() bool {
+	if w.logger != nil && w.logger.IsEnabled() {
+		return true
+	}
+	if !w.logOnErrorOnly {
+		return false
+	}
+	status := w.statusCode
+	if status == 0 {
+		if statusWriter, ok := w.ResponseWriter.(interface{ Status() int }); ok && statusWriter != nil {
+			status = statusWriter.Status()
+		} else {
+			status = http.StatusOK
+		}
+	}
+	return status >= http.StatusBadRequest
+}
+
+// WriteString wraps the underlying ResponseWriter's WriteString method to capture response data.
+// Some handlers (and fmt/io helpers) write via io.StringWriter; without this override, those writes
+// bypass Write() and would be missing from request logs.
+func (w *ResponseWriterWrapper) WriteString(data string) (int, error) {
+	w.ensureHeadersCaptured()
+
+	// CRITICAL: Write to client first (zero latency)
+	n, err := w.ResponseWriter.WriteString(data)
+
+	// THEN: Capture for logging
+	if w.isStreaming && w.chunkChannel != nil {
+		select {
+		case w.chunkChannel <- []byte(data):
+		default:
+		}
+		return n, err
+	}
+
+	if w.shouldBufferResponseBody() {
+		w.body.WriteString(data)
+	}
+	return n, err
+}
+
 // WriteHeader wraps the underlying ResponseWriter's WriteHeader method.
 // It captures the status code, detects if the response is streaming based on the Content-Type header,
 // and initializes the appropriate logging mechanism (standard or streaming).
@@ -160,12 +202,16 @@ func (w *ResponseWriterWrapper) detectStreaming(contentType string) bool {
 		return true
 	}

-	// Check request body for streaming indicators
-	if w.requestInfo.Body != nil {
+	// If a concrete Content-Type is already set (e.g., application/json for error responses),
+	// treat it as non-streaming instead of inferring from the request payload.
+	if strings.TrimSpace(contentType) != "" {
+		return false
+	}
+
+	// Only fall back to request payload hints when Content-Type is not set yet.
+	if w.requestInfo != nil && len(w.requestInfo.Body) > 0 {
 		bodyStr := string(w.requestInfo.Body)
-		if strings.Contains(bodyStr, `"stream": true`) || strings.Contains(bodyStr, `"stream":true`) {
-			return true
-		}
+		return strings.Contains(bodyStr, `"stream": true`) || strings.Contains(bodyStr, `"stream":true`)
 	}

 	return false
@@ -221,7 +267,7 @@ func (w *ResponseWriterWrapper) Finalize(c *gin.Context) error {
 		return nil
 	}

-	if w.isStreaming {
+	if w.isStreaming && w.streamWriter != nil {
 		if w.chunkChannel != nil {
 			close(w.chunkChannel)
 			w.chunkChannel = nil
@@ -233,24 +279,19 @@ func (w *ResponseWriterWrapper) Finalize(c *gin.Context) error {
 		}

 		// Write API Request and Response to the streaming log before closing
-		if w.streamWriter != nil {
-			apiRequest := w.extractAPIRequest(c)
-			if len(apiRequest) > 0 {
-				_ = w.streamWriter.WriteAPIRequest(apiRequest)
-			}
-			apiResponse := w.extractAPIResponse(c)
-			if len(apiResponse) > 0 {
-				_ = w.streamWriter.WriteAPIResponse(apiResponse)
-			}
-			if err := w.streamWriter.Close(); err != nil {
-				w.streamWriter = nil
-				return err
-			}
+		apiRequest := w.extractAPIRequest(c)
+		if len(apiRequest) > 0 {
+			_ = w.streamWriter.WriteAPIRequest(apiRequest)
+		}
+		apiResponse := w.extractAPIResponse(c)
+		if len(apiResponse) > 0 {
+			_ = w.streamWriter.WriteAPIResponse(apiResponse)
+		}
+		if err := w.streamWriter.Close(); err != nil {
 			w.streamWriter = nil
+			return err
 		}
-		if forceLog {
-			return w.logRequest(finalStatusCode, w.cloneHeaders(), w.body.Bytes(), w.extractAPIRequest(c), w.extractAPIResponse(c), slicesAPIResponseError, forceLog)
-		}
+		w.streamWriter = nil
 		return nil
 	}

@@ -335,26 +376,3 @@ func (w *ResponseWriterWrapper) logRequest(statusCode int, headers map[string][]
 		apiResponseErrors,
 	)
 }
-
-// Status returns the HTTP response status code captured by the wrapper.
-// It defaults to 200 if WriteHeader has not been called.
-func (w *ResponseWriterWrapper) Status() int {
-	if w.statusCode == 0 {
-		return 200 // Default status code
-	}
-	return w.statusCode
-}
-
-// Size returns the size of the response body in bytes for non-streaming responses.
-// For streaming responses, it returns -1, as the total size is unknown.
-func (w *ResponseWriterWrapper) Size() int {
-	if w.isStreaming {
-		return -1 // Unknown size for streaming responses
-	}
-	return w.body.Len()
-}
-
-// Written returns true if the response header has been written (i.e., a status code has been set).
-func (w *ResponseWriterWrapper) Written() bool {
-	return w.statusCode != 0
-}
--- a/internal/api/modules/amp/amp.go
+++ b/internal/api/modules/amp/amp.go
@@ -137,7 +137,8 @@ func (m *AmpModule) Register(ctx modules.Context) error {
 		m.registerProviderAliases(ctx.Engine, ctx.BaseHandler, auth)

 		// Register management proxy routes once; middleware will gate access when upstream is unavailable.
-		m.registerManagementRoutes(ctx.Engine, ctx.BaseHandler)
+		// Pass auth middleware to require valid API key for all management routes.
+		m.registerManagementRoutes(ctx.Engine, ctx.BaseHandler, auth)

 		// If no upstream URL, skip proxy routes but provider aliases are still available
 		if upstreamURL == "" {
@@ -187,9 +188,6 @@ func (m *AmpModule) OnConfigUpdated(cfg *config.Config) error {

 	if oldSettings != nil && oldSettings.RestrictManagementToLocalhost != newSettings.RestrictManagementToLocalhost {
 		m.setRestrictToLocalhost(newSettings.RestrictManagementToLocalhost)
-		if !newSettings.RestrictManagementToLocalhost {
-			log.Warnf("amp management routes now accessible from any IP - this is insecure!")
-		}
 	}

 	newUpstreamURL := strings.TrimSpace(newSettings.UpstreamURL)
--- a/internal/api/modules/amp/fallback_handlers.go
+++ b/internal/api/modules/amp/fallback_handlers.go
@@ -64,7 +64,7 @@ func logAmpRouting(routeType AmpRouteType, requestedModel, resolvedModel, provid
 		fields["cost"] = "amp_credits"
 		fields["source"] = "ampcode.com"
 		fields["model_id"] = requestedModel // Explicit model_id for easy config reference
-		log.WithFields(fields).Warnf("forwarding to ampcode.com (uses amp credits) - model_id: %s | To use local proxy, add to config: amp-model-mappings: [{from: \"%s\", to: \"<your-local-model>\"}]", requestedModel, requestedModel)
+		log.WithFields(fields).Warnf("forwarding to ampcode.com (uses amp credits) - model_id: %s | To use local provider, add to config: ampcode.model-mappings: [{from: \"%s\", to: \"<your-local-model>\"}]", requestedModel, requestedModel)

 	case RouteTypeNoProvider:
 		fields["cost"] = "none"
--- a/internal/api/modules/amp/proxy.go
+++ b/internal/api/modules/amp/proxy.go
@@ -44,6 +44,11 @@ func createReverseProxy(upstreamURL string, secretSource SecretSource) (*httputi
 		originalDirector(req)
 		req.Host = parsed.Host

+		// Remove the client's Authorization and X-Api-Key headers - they were only used for CLI Proxy API authentication.
+		// The upstream credentials are set below using the configured upstream-api-key.
+		req.Header.Del("Authorization")
+		req.Header.Del("X-Api-Key")
+
 		// Preserve correlation headers for debugging
 		if req.Header.Get("X-Request-ID") == "" {
 			// Could generate one here if needed
@@ -53,7 +58,7 @@ func createReverseProxy(upstreamURL string, secretSource SecretSource) (*httputi
 		// Users going through ampcode.com proxy are paying for the service and should get all features
 		// including 1M context window (context-1m-2025-08-07)

-		// Inject API key from secret source (precedence: config > env > file)
+		// Inject API key from secret source (only uses upstream-api-key from config)
 		if key, err := secretSource.Get(req.Context()); err == nil && key != "" {
 			req.Header.Set("X-Api-Key", key)
 			req.Header.Set("Authorization", fmt.Sprintf("Bearer %s", key))
--- a/internal/api/modules/amp/routes.go
+++ b/internal/api/modules/amp/routes.go
@@ -98,7 +98,8 @@ func (m *AmpModule) managementAvailabilityMiddleware() gin.HandlerFunc {
 // registerManagementRoutes registers Amp management proxy routes
 // These routes proxy through to the Amp control plane for OAuth, user management, etc.
 // Uses dynamic middleware and proxy getter for hot-reload support.
-func (m *AmpModule) registerManagementRoutes(engine *gin.Engine, baseHandler *handlers.BaseAPIHandler) {
+// The auth middleware validates the Authorization header against the configured API keys.
+func (m *AmpModule) registerManagementRoutes(engine *gin.Engine, baseHandler *handlers.BaseAPIHandler, auth gin.HandlerFunc) {
 	ampAPI := engine.Group("/api")

 	// Always disable CORS for management routes to prevent browser-based attacks
@@ -107,8 +108,9 @@ func (m *AmpModule) registerManagementRoutes(engine *gin.Engine, baseHandler *ha
 	// Apply dynamic localhost-only restriction (hot-reloadable via m.IsRestrictedToLocalhost())
 	ampAPI.Use(m.localhostOnlyMiddleware())

-	if !m.IsRestrictedToLocalhost() {
-		log.Warn("amp management routes are NOT restricted to localhost - this is insecure!")
+	// Apply authentication middleware - requires a valid API key in the Authorization header
+	if auth != nil {
+		ampAPI.Use(auth)
 	}

 	// Dynamic proxy handler that uses m.getProxy() for hot-reload support
@@ -154,6 +156,9 @@ func (m *AmpModule) registerManagementRoutes(engine *gin.Engine, baseHandler *ha
 	// Root-level routes that AMP CLI expects without /api prefix
 	// These need the same security middleware as the /api/* routes (dynamic for hot-reload)
 	rootMiddleware := []gin.HandlerFunc{m.managementAvailabilityMiddleware(), noCORSMiddleware(), m.localhostOnlyMiddleware()}
+	if auth != nil {
+		rootMiddleware = append(rootMiddleware, auth)
+	}
 	engine.GET("/threads/*path", append(rootMiddleware, proxyHandler)...)
 	engine.GET("/threads.rss", append(rootMiddleware, proxyHandler)...)
 	engine.GET("/news.rss", append(rootMiddleware, proxyHandler)...)
--- a/internal/api/server.go
+++ b/internal/api/server.go
@@ -628,7 +628,7 @@ func (s *Server) serveManagementControlPanel(c *gin.Context) {

 	if _, err := os.Stat(filePath); err != nil {
 		if os.IsNotExist(err) {
-			go managementasset.EnsureLatestManagementHTML(context.Background(), managementasset.StaticDir(s.configFilePath), cfg.ProxyURL)
+			go managementasset.EnsureLatestManagementHTML(context.Background(), managementasset.StaticDir(s.configFilePath), cfg.ProxyURL, cfg.RemoteManagement.PanelGitHubRepository)
 			c.AbortWithStatus(http.StatusNotFound)
 			return
 		}
@@ -948,7 +948,7 @@ func (s *Server) UpdateClients(cfg *config.Config) {

 	if !cfg.RemoteManagement.DisableControlPanel {
 		staticDir := managementasset.StaticDir(s.configFilePath)
-		go managementasset.EnsureLatestManagementHTML(context.Background(), staticDir, cfg.ProxyURL)
+		go managementasset.EnsureLatestManagementHTML(context.Background(), staticDir, cfg.ProxyURL, cfg.RemoteManagement.PanelGitHubRepository)
 	}
 	if s.mgmt != nil {
 		s.mgmt.SetConfig(cfg)
--- a/internal/auth/kiro/protocol_handler.go
+++ b/internal/auth/kiro/protocol_handler.go
@@ -471,7 +471,7 @@ foreach ($port in $ports) {

 	// Create batch wrapper
 	batchPath := filepath.Join(scriptDir, "kiro-oauth-handler.bat")
-	batchContent := fmt.Sprintf("@echo off\npowershell -ExecutionPolicy Bypass -File \"%s\" \"%%1\"\n", scriptPath)
+	batchContent := fmt.Sprintf("@echo off\npowershell -ExecutionPolicy Bypass -File \"%s\" %%1\n", scriptPath)

 	if err := os.WriteFile(batchPath, []byte(batchContent), 0644); err != nil {
 		return fmt.Errorf("failed to write batch wrapper: %w", err)
--- a/internal/config/config.go
+++ b/internal/config/config.go
@@ -17,6 +17,8 @@ import (
 	"gopkg.in/yaml.v3"
 )

+const DefaultPanelGitHubRepository = "https://github.com/router-for-me/Cli-Proxy-API-Management-Center"
+
 // Config represents the application's configuration, loaded from a YAML file.
 type Config struct {
 	config.SDKConfig `yaml:",inline"`
@@ -116,6 +118,9 @@ type RemoteManagement struct {
 	SecretKey string `yaml:"secret-key"`
 	// DisableControlPanel skips serving and syncing the bundled management UI when true.
 	DisableControlPanel bool `yaml:"disable-control-panel"`
+	// PanelGitHubRepository overrides the GitHub repository used to fetch the management panel asset.
+	// Accepts either a repository URL (https://github.com/org/repo) or an API releases endpoint.
+	PanelGitHubRepository string `yaml:"panel-github-repository"`
 }

 // QuotaExceeded defines the behavior when API quota limits are exceeded.
@@ -151,7 +156,7 @@ type AmpCode struct {

 	// RestrictManagementToLocalhost restricts Amp management routes (/api/user, /api/threads, etc.)
 	// to only accept connections from localhost (127.0.0.1, ::1). When true, prevents drive-by
-	// browser attacks and remote access to management endpoints. Default: true (recommended).
+	// browser attacks and remote access to management endpoints. Default: false (API key auth is sufficient).
 	RestrictManagementToLocalhost bool `yaml:"restrict-management-to-localhost" json:"restrict-management-to-localhost"`

 	// ModelMappings defines model name mappings for Amp CLI requests.
@@ -368,7 +373,8 @@ func LoadConfigOptional(configFile string, optional bool) (*Config, error) {
 	cfg.LoggingToFile = false
 	cfg.UsageStatisticsEnabled = false
 	cfg.DisableCooling = false
-	cfg.AmpCode.RestrictManagementToLocalhost = true // Default to secure: only localhost access
+	cfg.AmpCode.RestrictManagementToLocalhost = false // Default to false: API key auth is sufficient
+	cfg.RemoteManagement.PanelGitHubRepository = DefaultPanelGitHubRepository
 	cfg.IncognitoBrowser = false                     // Default to normal browser (AWS uses incognito by force)
 	if err = yaml.Unmarshal(data, &cfg); err != nil {
 		if optional {
@@ -405,6 +411,11 @@ func LoadConfigOptional(configFile string, optional bool) (*Config, error) {
 		_ = SaveConfigPreserveCommentsUpdateNestedScalar(configFile, []string{"remote-management", "secret-key"}, hashed)
 	}

+	cfg.RemoteManagement.PanelGitHubRepository = strings.TrimSpace(cfg.RemoteManagement.PanelGitHubRepository)
+	if cfg.RemoteManagement.PanelGitHubRepository == "" {
+		cfg.RemoteManagement.PanelGitHubRepository = DefaultPanelGitHubRepository
+	}
+
 	// Sync request authentication providers with inline API keys for backwards compatibility.
 	syncInlineAccessProvider(&cfg)

--- a/internal/managementasset/updater.go
+++ b/internal/managementasset/updater.go
@@ -9,6 +9,7 @@ import (
 	"fmt"
 	"io"
 	"net/http"
+	"net/url"
 	"os"
 	"path/filepath"
 	"strings"
@@ -23,10 +24,10 @@ import (
 )

 const (
-	managementReleaseURL = "https://api.github.com/repos/router-for-me/Cli-Proxy-API-Management-Center/releases/latest"
-	managementAssetName  = "management.html"
-	httpUserAgent        = "CLIProxyAPI-management-updater"
-	updateCheckInterval  = 3 * time.Hour
+	defaultManagementReleaseURL = "https://api.github.com/repos/router-for-me/Cli-Proxy-API-Management-Center/releases/latest"
+	managementAssetName         = "management.html"
+	httpUserAgent               = "CLIProxyAPI-management-updater"
+	updateCheckInterval         = 3 * time.Hour
 )

 // ManagementFileName exposes the control panel asset filename.
@@ -97,7 +98,7 @@ func runAutoUpdater(ctx context.Context) {

 		configPath, _ := schedulerConfigPath.Load().(string)
 		staticDir := StaticDir(configPath)
-		EnsureLatestManagementHTML(ctx, staticDir, cfg.ProxyURL)
+		EnsureLatestManagementHTML(ctx, staticDir, cfg.ProxyURL, cfg.RemoteManagement.PanelGitHubRepository)
 	}

 	runOnce()
@@ -181,7 +182,7 @@ func FilePath(configFilePath string) string {
 // EnsureLatestManagementHTML checks the latest management.html asset and updates the local copy when needed.
 // The function is designed to run in a background goroutine and will never panic.
 // It enforces a 3-hour rate limit to avoid frequent checks on config/auth file changes.
-func EnsureLatestManagementHTML(ctx context.Context, staticDir string, proxyURL string) {
+func EnsureLatestManagementHTML(ctx context.Context, staticDir string, proxyURL string, panelRepository string) {
 	if ctx == nil {
 		ctx = context.Background()
 	}
@@ -214,6 +215,7 @@ func EnsureLatestManagementHTML(ctx context.Context, staticDir string, proxyURL
 		return
 	}

+	releaseURL := resolveReleaseURL(panelRepository)
 	client := newHTTPClient(proxyURL)

 	localPath := filepath.Join(staticDir, managementAssetName)
@@ -225,7 +227,7 @@ func EnsureLatestManagementHTML(ctx context.Context, staticDir string, proxyURL
 		localHash = ""
 	}

-	asset, remoteHash, err := fetchLatestAsset(ctx, client)
+	asset, remoteHash, err := fetchLatestAsset(ctx, client, releaseURL)
 	if err != nil {
 		log.WithError(err).Warn("failed to fetch latest management release information")
 		return
@@ -254,8 +256,44 @@ func EnsureLatestManagementHTML(ctx context.Context, staticDir string, proxyURL
 	log.Infof("management asset updated successfully (hash=%s)", downloadedHash)
 }

-func fetchLatestAsset(ctx context.Context, client *http.Client) (*releaseAsset, string, error) {
-	req, err := http.NewRequestWithContext(ctx, http.MethodGet, managementReleaseURL, nil)
+func resolveReleaseURL(repo string) string {
+	repo = strings.TrimSpace(repo)
+	if repo == "" {
+		return defaultManagementReleaseURL
+	}
+
+	parsed, err := url.Parse(repo)
+	if err != nil || parsed.Host == "" {
+		return defaultManagementReleaseURL
+	}
+
+	host := strings.ToLower(parsed.Host)
+	parsed.Path = strings.TrimSuffix(parsed.Path, "/")
+
+	if host == "api.github.com" {
+		if !strings.HasSuffix(strings.ToLower(parsed.Path), "/releases/latest") {
+			parsed.Path = parsed.Path + "/releases/latest"
+		}
+		return parsed.String()
+	}
+
+	if host == "github.com" {
+		parts := strings.Split(strings.Trim(parsed.Path, "/"), "/")
+		if len(parts) >= 2 && parts[0] != "" && parts[1] != "" {
+			repoName := strings.TrimSuffix(parts[1], ".git")
+			return fmt.Sprintf("https://api.github.com/repos/%s/%s/releases/latest", parts[0], repoName)
+		}
+	}
+
+	return defaultManagementReleaseURL
+}
+
+func fetchLatestAsset(ctx context.Context, client *http.Client, releaseURL string) (*releaseAsset, string, error) {
+	if strings.TrimSpace(releaseURL) == "" {
+		releaseURL = defaultManagementReleaseURL
+	}
+
+	req, err := http.NewRequestWithContext(ctx, http.MethodGet, releaseURL, nil)
 	if err != nil {
 		return nil, "", fmt.Errorf("create release request: %w", err)
 	}
--- a/internal/runtime/executor/kiro_executor.go
+++ b/internal/runtime/executor/kiro_executor.go
@@ -166,16 +166,17 @@ type KiroExecutor struct {
 // This is critical because OpenAI and Claude formats have different tool structures:
 // - OpenAI: tools[].function.name, tools[].function.description
 // - Claude: tools[].name, tools[].description
+// headers parameter allows checking Anthropic-Beta header for thinking mode detection.
 // Returns the serialized JSON payload and a boolean indicating whether thinking mode was injected.
-func buildKiroPayloadForFormat(body []byte, modelID, profileArn, origin string, isAgentic, isChatOnly bool, sourceFormat sdktranslator.Format) ([]byte, bool) {
+func buildKiroPayloadForFormat(body []byte, modelID, profileArn, origin string, isAgentic, isChatOnly bool, sourceFormat sdktranslator.Format, headers http.Header) ([]byte, bool) {
 	switch sourceFormat.String() {
 	case "openai":
 		log.Debugf("kiro: using OpenAI payload builder for source format: %s", sourceFormat.String())
-		return kiroopenai.BuildKiroPayloadFromOpenAI(body, modelID, profileArn, origin, isAgentic, isChatOnly)
+		return kiroopenai.BuildKiroPayloadFromOpenAI(body, modelID, profileArn, origin, isAgentic, isChatOnly, headers, nil)
 	default:
 		// Default to Claude format (also handles "claude", "kiro", etc.)
 		log.Debugf("kiro: using Claude payload builder for source format: %s", sourceFormat.String())
-		return kiroclaude.BuildKiroPayload(body, modelID, profileArn, origin, isAgentic, isChatOnly)
+		return kiroclaude.BuildKiroPayload(body, modelID, profileArn, origin, isAgentic, isChatOnly, headers, nil)
 	}
 }

@@ -249,7 +250,7 @@ func (e *KiroExecutor) executeWithRetry(ctx context.Context, auth *cliproxyauth.

 		// Rebuild payload with the correct origin for this endpoint
 		// Each endpoint requires its matching Origin value in the request body
-		kiroPayload, _ = buildKiroPayloadForFormat(body, kiroModelID, profileArn, currentOrigin, isAgentic, isChatOnly, from)
+		kiroPayload, _ = buildKiroPayloadForFormat(body, kiroModelID, profileArn, currentOrigin, isAgentic, isChatOnly, from, opts.Headers)

 		log.Debugf("kiro: trying endpoint %d/%d: %s (Name: %s, Origin: %s)",
 			endpointIdx+1, len(endpointConfigs), url, endpointConfig.Name, currentOrigin)
@@ -359,7 +360,7 @@ func (e *KiroExecutor) executeWithRetry(ctx context.Context, auth *cliproxyauth.
 						auth = refreshedAuth
 						accessToken, profileArn = kiroCredentials(auth)
 						// Rebuild payload with new profile ARN if changed
-						kiroPayload, _ = buildKiroPayloadForFormat(body, kiroModelID, profileArn, currentOrigin, isAgentic, isChatOnly, from)
+						kiroPayload, _ = buildKiroPayloadForFormat(body, kiroModelID, profileArn, currentOrigin, isAgentic, isChatOnly, from, opts.Headers)
 						log.Infof("kiro: token refreshed successfully, retrying request")
 						continue
 					}
@@ -416,7 +417,7 @@ func (e *KiroExecutor) executeWithRetry(ctx context.Context, auth *cliproxyauth.
 					if refreshedAuth != nil {
 						auth = refreshedAuth
 						accessToken, profileArn = kiroCredentials(auth)
-						kiroPayload, _ = buildKiroPayloadForFormat(body, kiroModelID, profileArn, currentOrigin, isAgentic, isChatOnly, from)
+						kiroPayload, _ = buildKiroPayloadForFormat(body, kiroModelID, profileArn, currentOrigin, isAgentic, isChatOnly, from, opts.Headers)
 						log.Infof("kiro: token refreshed for 403, retrying request")
 						continue
 					}
@@ -555,10 +556,7 @@ func (e *KiroExecutor) executeStreamWithRetry(ctx context.Context, auth *cliprox

 		// Rebuild payload with the correct origin for this endpoint
 		// Each endpoint requires its matching Origin value in the request body
-		kiroPayload, _ = buildKiroPayloadForFormat(body, kiroModelID, profileArn, currentOrigin, isAgentic, isChatOnly, from)
-		// Kiro API always returns <thinking> tags regardless of whether thinking mode was requested
-		// So we always enable thinking parsing for Kiro responses
-		thinkingEnabled := true
+		kiroPayload, thinkingEnabled := buildKiroPayloadForFormat(body, kiroModelID, profileArn, currentOrigin, isAgentic, isChatOnly, from, opts.Headers)

 		log.Debugf("kiro: stream trying endpoint %d/%d: %s (Name: %s, Origin: %s)",
 			endpointIdx+1, len(endpointConfigs), url, endpointConfig.Name, currentOrigin)
@@ -681,7 +679,7 @@ func (e *KiroExecutor) executeStreamWithRetry(ctx context.Context, auth *cliprox
 						auth = refreshedAuth
 						accessToken, profileArn = kiroCredentials(auth)
 						// Rebuild payload with new profile ARN if changed
-						kiroPayload, _ = buildKiroPayloadForFormat(body, kiroModelID, profileArn, currentOrigin, isAgentic, isChatOnly, from)
+						kiroPayload, _ = buildKiroPayloadForFormat(body, kiroModelID, profileArn, currentOrigin, isAgentic, isChatOnly, from, opts.Headers)
 						log.Infof("kiro: token refreshed successfully, retrying stream request")
 						continue
 					}
@@ -738,7 +736,7 @@ func (e *KiroExecutor) executeStreamWithRetry(ctx context.Context, auth *cliprox
 					if refreshedAuth != nil {
 						auth = refreshedAuth
 						accessToken, profileArn = kiroCredentials(auth)
-						kiroPayload, _ = buildKiroPayloadForFormat(body, kiroModelID, profileArn, currentOrigin, isAgentic, isChatOnly, from)
+						kiroPayload, _ = buildKiroPayloadForFormat(body, kiroModelID, profileArn, currentOrigin, isAgentic, isChatOnly, from, opts.Headers)
 						log.Infof("kiro: token refreshed for 403, retrying stream request")
 						continue
 					}
@@ -1702,6 +1700,7 @@ func (e *KiroExecutor) streamToChannel(ctx context.Context, body io.Reader, out
 	pendingEndTagChars := 0      // Number of chars that might be start of </thinking>
 	isThinkingBlockOpen := false // Track if thinking content block is open
 	thinkingBlockIndex := -1     // Index of the thinking content block
+	var accumulatedThinkingContent strings.Builder // Accumulate thinking content for signature generation

 	// Code block state tracking for heuristic thinking tag parsing
 	// When inside a markdown code block, <thinking> tags should NOT be parsed
@@ -1847,6 +1846,8 @@ func (e *KiroExecutor) streamToChannel(ctx context.Context, body io.Reader, out
 							out <- cliproxyexecutor.StreamChunk{Payload: []byte(chunk + "\n\n")}
 						}
 					}
+					// Accumulate thinking content for signature generation
+					accumulatedThinkingContent.WriteString(pendingText)
 				} else {
 					// Output as regular text
 					if !isTextBlockOpen {
@@ -2390,6 +2391,8 @@ func (e *KiroExecutor) streamToChannel(ctx context.Context, body io.Reader, out
 										out <- cliproxyexecutor.StreamChunk{Payload: []byte(chunk + "\n\n")}
 									}
 								}
+								// Accumulate thinking content for signature generation
+								accumulatedThinkingContent.WriteString(thinkContent)
 							}

 							// Note: Partial tag handling is done via pendingEndTagChars
@@ -2397,7 +2400,7 @@ func (e *KiroExecutor) streamToChannel(ctx context.Context, body io.Reader, out

 							// Close thinking block
 							if isThinkingBlockOpen {
-								blockStop := kiroclaude.BuildClaudeContentBlockStopEvent(thinkingBlockIndex)
+								blockStop := kiroclaude.BuildClaudeThinkingBlockStopEvent(thinkingBlockIndex)
 								sseData := sdktranslator.TranslateStream(ctx, sdktranslator.FromString("kiro"), targetFormat, model, originalReq, claudeBody, blockStop, &translatorParam)
 								for _, chunk := range sseData {
 									if chunk != "" {
@@ -2405,6 +2408,7 @@ func (e *KiroExecutor) streamToChannel(ctx context.Context, body io.Reader, out
 									}
 								}
 								isThinkingBlockOpen = false
+								accumulatedThinkingContent.Reset() // Reset for potential next thinking block
 							}

 							inThinkBlock = false
@@ -2450,6 +2454,8 @@ func (e *KiroExecutor) streamToChannel(ctx context.Context, body io.Reader, out
 										out <- cliproxyexecutor.StreamChunk{Payload: []byte(chunk + "\n\n")}
 									}
 								}
+								// Accumulate thinking content for signature generation
+								accumulatedThinkingContent.WriteString(contentToEmit)
 							}

 							remaining = ""
@@ -2592,6 +2598,7 @@ func (e *KiroExecutor) streamToChannel(ctx context.Context, body io.Reader, out
 			// Handle tool uses in response (with deduplication)
 			for _, tu := range toolUses {
 				toolUseID := kirocommon.GetString(tu, "toolUseId")
+				toolName := kirocommon.GetString(tu, "name")

 				// Check for duplicate
 				if processedIDs[toolUseID] {
@@ -2615,7 +2622,6 @@ func (e *KiroExecutor) streamToChannel(ctx context.Context, body io.Reader, out

 				// Emit tool_use content block
 				contentBlockIndex++
-				toolName := kirocommon.GetString(tu, "name")

 				blockStart := kiroclaude.BuildClaudeContentBlockStartEvent(contentBlockIndex, "tool_use", toolUseID, toolName)
 				sseData := sdktranslator.TranslateStream(ctx, sdktranslator.FromString("kiro"), targetFormat, model, originalReq, claudeBody, blockStart, &translatorParam)
@@ -2888,7 +2894,7 @@ func (e *KiroExecutor) streamToChannel(ctx context.Context, body io.Reader, out
 		if calculatedInputTokens > 0 {
 			localEstimate := totalUsage.InputTokens
 			totalUsage.InputTokens = calculatedInputTokens
-			log.Infof("kiro: using contextUsagePercentage (%.2f%%) to calculate input tokens: %d (local estimate was: %d)",
+			log.Debugf("kiro: using contextUsagePercentage (%.2f%%) to calculate input tokens: %d (local estimate was: %d)",
 				upstreamContextPercentage, calculatedInputTokens, localEstimate)
 		}
 	}
@@ -2897,7 +2903,7 @@ func (e *KiroExecutor) streamToChannel(ctx context.Context, body io.Reader, out

 	// Log upstream usage information if received
 	if hasUpstreamUsage {
-		log.Infof("kiro: upstream usage - credits: %.4f, context: %.2f%%, final tokens - input: %d, output: %d, total: %d",
+		log.Debugf("kiro: upstream usage - credits: %.4f, context: %.2f%%, final tokens - input: %d, output: %d, total: %d",
 			upstreamCreditUsage, upstreamContextPercentage,
 			totalUsage.InputTokens, totalUsage.OutputTokens, totalUsage.TotalTokens)
 	}
--- a/internal/translator/kiro/claude/kiro_claude_request.go
+++ b/internal/translator/kiro/claude/kiro_claude_request.go
@@ -6,6 +6,7 @@ package claude
 import (
 	"encoding/json"
 	"fmt"
+	"net/http"
 	"strings"
 	"time"
 	"unicode/utf8"
@@ -33,6 +34,7 @@ type KiroInferenceConfig struct {
 	TopP        float64 `json:"topP,omitempty"`
 }

+
 // KiroConversationState holds the conversation context
 type KiroConversationState struct {
 	ChatTriggerType string               `json:"chatTriggerType"` // Required: "MANUAL" - must be first field
@@ -134,9 +136,11 @@ func ConvertClaudeRequestToKiro(modelName string, inputRawJSON []byte, stream bo
 // origin parameter determines which quota to use: "CLI" for Amazon Q, "AI_EDITOR" for Kiro IDE.
 // isAgentic parameter enables chunked write optimization prompt for -agentic model variants.
 // isChatOnly parameter disables tool calling for -chat model variants (pure conversation mode).
-// Supports thinking mode - when Claude API thinking parameter is present, injects thinkingHint.
+// headers parameter allows checking Anthropic-Beta header for thinking mode detection.
+// metadata parameter is kept for API compatibility but no longer used for thinking configuration.
+// Supports thinking mode - when enabled, injects thinking tags into system prompt.
 // Returns the payload and a boolean indicating whether thinking mode was injected.
-func BuildKiroPayload(claudeBody []byte, modelID, profileArn, origin string, isAgentic, isChatOnly bool) ([]byte, bool) {
+func BuildKiroPayload(claudeBody []byte, modelID, profileArn, origin string, isAgentic, isChatOnly bool, headers http.Header, metadata map[string]any) ([]byte, bool) {
 	// Extract max_tokens for potential use in inferenceConfig
 	// Handle -1 as "use maximum" (Kiro max output is ~32000 tokens)
 	const kiroMaxOutputTokens = 32000
@@ -181,26 +185,9 @@ func BuildKiroPayload(claudeBody []byte, modelID, profileArn, origin string, isA
 	// Extract system prompt
 	systemPrompt := extractSystemPrompt(claudeBody)

-	// Check for thinking mode using the comprehensive IsThinkingEnabled function
-	// This supports Claude API format, OpenAI reasoning_effort, and AMP/Cursor format
-	thinkingEnabled := IsThinkingEnabled(claudeBody)
-	_, budgetTokens := checkThinkingMode(claudeBody) // Get budget tokens from Claude format if available
-	if budgetTokens <= 0 {
-		// Calculate budgetTokens based on max_tokens if available
-		// Use 50% of max_tokens for thinking, with min 8000 and max 24000
-		if maxTokens > 0 {
-			budgetTokens = maxTokens / 2
-			if budgetTokens < 8000 {
-				budgetTokens = 8000
-			}
-			if budgetTokens > 24000 {
-				budgetTokens = 24000
-			}
-			log.Debugf("kiro: budgetTokens calculated from max_tokens: %d (max_tokens=%d)", budgetTokens, maxTokens)
-		} else {
-			budgetTokens = 16000 // Default budget tokens
-		}
-	}
+	// Check for thinking mode using the comprehensive IsThinkingEnabledWithHeaders function
+	// This supports Claude API format, OpenAI reasoning_effort, AMP/Cursor format, and Anthropic-Beta header
+	thinkingEnabled := IsThinkingEnabledWithHeaders(claudeBody, headers)

 	// Inject timestamp context
 	timestamp := time.Now().Format("2006-01-02 15:04:05 MST")
@@ -231,19 +218,26 @@ func BuildKiroPayload(claudeBody []byte, modelID, profileArn, origin string, isA
 		log.Debugf("kiro: injected tool_choice hint into system prompt")
 	}

-	// Inject thinking hint when thinking mode is enabled
-	if thinkingEnabled {
-		if systemPrompt != "" {
-			systemPrompt += "\n"
-		}
-		dynamicThinkingHint := fmt.Sprintf("<thinking_mode>interleaved</thinking_mode><max_thinking_length>%d</max_thinking_length>", budgetTokens)
-		systemPrompt += dynamicThinkingHint
-		log.Debugf("kiro: injected dynamic thinking hint into system prompt, max_thinking_length: %d", budgetTokens)
-	}
-
 	// Convert Claude tools to Kiro format
 	kiroTools := convertClaudeToolsToKiro(tools)

+	// Thinking mode implementation:
+	// Kiro API doesn't accept max_tokens for thinking. Instead, thinking mode is enabled
+	// by injecting <thinking_mode> and <max_thinking_length> tags into the system prompt.
+	// We use a fixed max_thinking_length value since Kiro handles the actual budget internally.
+	if thinkingEnabled {
+		thinkingHint := `<thinking_mode>interleaved</thinking_mode>
+<max_thinking_length>200000</max_thinking_length>
+
+IMPORTANT: You MUST use <thinking>...</thinking> tags to show your reasoning process before providing your final response. Think step by step inside the thinking tags.`
+		if systemPrompt != "" {
+			systemPrompt = thinkingHint + "\n\n" + systemPrompt
+		} else {
+			systemPrompt = thinkingHint
+		}
+		log.Infof("kiro: injected thinking prompt, has_tools: %v", len(kiroTools) > 0)
+	}
+
 	// Process messages and build history
 	history, currentUserMsg, currentToolResults := processMessages(messages, modelID, origin)

@@ -280,6 +274,7 @@ func BuildKiroPayload(claudeBody []byte, modelID, profileArn, origin string, isA
 	}

 	// Build inferenceConfig if we have any inference parameters
+	// Note: Kiro API doesn't actually use max_tokens for thinking budget
 	var inferenceConfig *KiroInferenceConfig
 	if maxTokens > 0 || hasTemperature || hasTopP {
 		inferenceConfig = &KiroInferenceConfig{}
@@ -350,7 +345,7 @@ func extractSystemPrompt(claudeBody []byte) string {
 // checkThinkingMode checks if thinking mode is enabled in the Claude request
 func checkThinkingMode(claudeBody []byte) (bool, int64) {
 	thinkingEnabled := false
-	var budgetTokens int64 = 16000
+	var budgetTokens int64 = 24000

 	thinkingField := gjson.GetBytes(claudeBody, "thinking")
 	if thinkingField.Exists() {
@@ -373,6 +368,32 @@ func checkThinkingMode(claudeBody []byte) (bool, int64) {
 	return thinkingEnabled, budgetTokens
 }

+// hasThinkingTagInBody checks if the request body already contains thinking configuration tags.
+// This is used to prevent duplicate injection when client (e.g., AMP/Cursor) already includes thinking config.
+func hasThinkingTagInBody(body []byte) bool {
+	bodyStr := string(body)
+	return strings.Contains(bodyStr, "<thinking_mode>") || strings.Contains(bodyStr, "<max_thinking_length>")
+}
+
+
+// IsThinkingEnabledFromHeader checks if thinking mode is enabled via Anthropic-Beta header.
+// Claude CLI uses "Anthropic-Beta: interleaved-thinking-2025-05-14" to enable thinking.
+func IsThinkingEnabledFromHeader(headers http.Header) bool {
+	if headers == nil {
+		return false
+	}
+	betaHeader := headers.Get("Anthropic-Beta")
+	if betaHeader == "" {
+		return false
+	}
+	// Check for interleaved-thinking beta feature
+	if strings.Contains(betaHeader, "interleaved-thinking") {
+		log.Debugf("kiro: thinking mode enabled via Anthropic-Beta header: %s", betaHeader)
+		return true
+	}
+	return false
+}
+
 // IsThinkingEnabled is a public wrapper to check if thinking mode is enabled.
 // This is used by the executor to determine whether to parse <thinking> tags in responses.
 // When thinking is NOT enabled in the request, <thinking> tags in responses should be
@@ -383,6 +404,21 @@ func checkThinkingMode(claudeBody []byte) (bool, int64) {
 // - OpenAI format: reasoning_effort parameter
 // - AMP/Cursor format: <thinking_mode>interleaved</thinking_mode> in system prompt
 func IsThinkingEnabled(body []byte) bool {
+	return IsThinkingEnabledWithHeaders(body, nil)
+}
+
+// IsThinkingEnabledWithHeaders checks if thinking mode is enabled from body or headers.
+// This is the comprehensive check that supports all thinking detection methods:
+// - Claude API format: thinking.type = "enabled"
+// - OpenAI format: reasoning_effort parameter
+// - AMP/Cursor format: <thinking_mode>interleaved</thinking_mode> in system prompt
+// - Anthropic-Beta header: interleaved-thinking-2025-05-14
+func IsThinkingEnabledWithHeaders(body []byte, headers http.Header) bool {
+	// Check Anthropic-Beta header first (Claude Code uses this)
+	if IsThinkingEnabledFromHeader(headers) {
+		return true
+	}
+
 	// Check Claude API format first (thinking.type = "enabled")
 	enabled, _ := checkThinkingMode(body)
 	if enabled {
@@ -771,4 +807,4 @@ func BuildAssistantMessageStruct(msg gjson.Result) KiroAssistantResponseMessage
 		Content:  contentBuilder.String(),
 		ToolUses: toolUses,
 	}
-}
+}
--- a/internal/translator/kiro/claude/kiro_claude_response.go
+++ b/internal/translator/kiro/claude/kiro_claude_response.go
@@ -4,6 +4,8 @@
 package claude

 import (
+	"crypto/sha256"
+	"encoding/base64"
 	"encoding/json"
 	"strings"

@@ -14,6 +16,18 @@ import (
 	kirocommon "github.com/router-for-me/CLIProxyAPI/v6/internal/translator/kiro/common"
 )

+// generateThinkingSignature generates a signature for thinking content.
+// This is required by Claude API for thinking blocks in non-streaming responses.
+// The signature is a base64-encoded SHA-256 hash of the thinking content.
+func generateThinkingSignature(thinkingContent string) string {
+	if thinkingContent == "" {
+		return ""
+	}
+	// Generate a deterministic signature based on content hash
+	hash := sha256.Sum256([]byte(thinkingContent))
+	return base64.StdEncoding.EncodeToString(hash[:])
+}
+
 // Local references to kirocommon constants for thinking block parsing
 var (
 	thinkingStartTag = kirocommon.ThinkingStartTag
@@ -149,9 +163,12 @@ func ExtractThinkingFromContent(content string) []map[string]interface{} {
 		if endIdx == -1 {
 			// No closing tag found, treat rest as thinking content (incomplete response)
 			if strings.TrimSpace(remaining) != "" {
+				// Generate signature for thinking content (required by Claude API)
+				signature := generateThinkingSignature(remaining)
 				blocks = append(blocks, map[string]interface{}{
-					"type":     "thinking",
-					"thinking": remaining,
+					"type":      "thinking",
+					"thinking":  remaining,
+					"signature": signature,
 				})
 				log.Warnf("kiro: extractThinkingFromContent - missing closing </thinking> tag")
 			}
@@ -161,9 +178,12 @@ func ExtractThinkingFromContent(content string) []map[string]interface{} {
 		// Extract thinking content between tags
 		thinkContent := remaining[:endIdx]
 		if strings.TrimSpace(thinkContent) != "" {
+			// Generate signature for thinking content (required by Claude API)
+			signature := generateThinkingSignature(thinkContent)
 			blocks = append(blocks, map[string]interface{}{
-				"type":     "thinking",
-				"thinking": thinkContent,
+				"type":      "thinking",
+				"thinking":  thinkContent,
+				"signature": signature,
 			})
 			log.Debugf("kiro: extractThinkingFromContent - extracted thinking block (len: %d)", len(thinkContent))
 		}
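
Review note: the signature attached to thinking blocks above is a digest derived only from the content, so any party can recompute it; it does not authenticate the block. A minimal sketch of the same idea, assuming a hypothetical function name `thinkingSignature`:

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// thinkingSignature mirrors the idea of generateThinkingSignature in the diff:
// base64 of the SHA-256 digest of the thinking text, or "" for empty input.
func thinkingSignature(thinking string) string {
	if thinking == "" {
		return ""
	}
	sum := sha256.Sum256([]byte(thinking))
	return base64.StdEncoding.EncodeToString(sum[:])
}

func main() {
	a := thinkingSignature("step 1: read the request")
	b := thinkingSignature("step 1: read the request")
	fmt.Println(a == b) // identical input yields an identical signature
	fmt.Println(len(a)) // a 32-byte digest encodes to 44 base64 characters
}
```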
--- a/internal/translator/kiro/claude/kiro_claude_stream.go
+++ b/internal/translator/kiro/claude/kiro_claude_stream.go
@@ -99,6 +99,16 @@ func BuildClaudeContentBlockStopEvent(index int) []byte {
 	return []byte("event: content_block_stop\ndata: " + string(result))
 }

+// BuildClaudeThinkingBlockStopEvent creates a content_block_stop SSE event for thinking blocks.
+func BuildClaudeThinkingBlockStopEvent(index int) []byte {
+	event := map[string]interface{}{
+		"type":  "content_block_stop",
+		"index": index,
+	}
+	result, _ := json.Marshal(event)
+	return []byte("event: content_block_stop\ndata: " + string(result))
+}
+
 // BuildClaudeMessageDeltaEvent creates the message_delta event with stop_reason and usage
 func BuildClaudeMessageDeltaEvent(stopReason string, usageInfo usage.Detail) []byte {
 	deltaEvent := map[string]interface{}{
--- a/internal/translator/kiro/openai/kiro_openai.go
+++ b/internal/translator/kiro/openai/kiro_openai.go
@@ -187,6 +187,7 @@ func ConvertKiroNonStreamToOpenAI(ctx context.Context, model string, originalReq

 	// Extract content
 	var content string
+	var reasoningContent string
 	var toolUses []KiroToolUse
 	var stopReason string

@@ -202,7 +203,8 @@ func ConvertKiroNonStreamToOpenAI(ctx context.Context, model string, originalReq
 			case "text":
 				content += block.Get("text").String()
 			case "thinking":
-				// Skip thinking blocks for OpenAI format (or convert to reasoning_content if needed)
+				// Convert thinking blocks to reasoning_content for OpenAI format
+				reasoningContent += block.Get("thinking").String()
 			case "tool_use":
 				toolUseID := block.Get("id").String()
 				toolName := block.Get("name").String()
@@ -233,8 +235,8 @@ func ConvertKiroNonStreamToOpenAI(ctx context.Context, model string, originalReq
 	}
 	usageInfo.TotalTokens = usageInfo.InputTokens + usageInfo.OutputTokens

-	// Build OpenAI response
-	openaiResponse := BuildOpenAIResponse(content, toolUses, model, usageInfo, stopReason)
+	// Build OpenAI response with reasoning_content support
+	openaiResponse := BuildOpenAIResponseWithReasoning(content, reasoningContent, toolUses, model, usageInfo, stopReason)
 	return string(openaiResponse)
 }

--- a/internal/translator/kiro/openai/kiro_openai_request.go
+++ b/internal/translator/kiro/openai/kiro_openai_request.go
@@ -6,11 +6,13 @@ package openai
 import (
 	"encoding/json"
 	"fmt"
+	"net/http"
 	"strings"
 	"time"
 	"unicode/utf8"

 	"github.com/google/uuid"
+	kiroclaude "github.com/router-for-me/CLIProxyAPI/v6/internal/translator/kiro/claude"
 	kirocommon "github.com/router-for-me/CLIProxyAPI/v6/internal/translator/kiro/common"
 	log "github.com/sirupsen/logrus"
 	"github.com/tidwall/gjson"
@@ -133,8 +135,10 @@ func ConvertOpenAIRequestToKiro(modelName string, inputRawJSON []byte, stream bo
 // origin parameter determines which quota to use: "CLI" for Amazon Q, "AI_EDITOR" for Kiro IDE.
 // isAgentic parameter enables chunked write optimization prompt for -agentic model variants.
 // isChatOnly parameter disables tool calling for -chat model variants (pure conversation mode).
+// headers parameter allows checking Anthropic-Beta header for thinking mode detection.
+// metadata parameter is kept for API compatibility but no longer used for thinking configuration.
 // Returns the payload and a boolean indicating whether thinking mode was injected.
-func BuildKiroPayloadFromOpenAI(openaiBody []byte, modelID, profileArn, origin string, isAgentic, isChatOnly bool) ([]byte, bool) {
+func BuildKiroPayloadFromOpenAI(openaiBody []byte, modelID, profileArn, origin string, isAgentic, isChatOnly bool, headers http.Header, metadata map[string]any) ([]byte, bool) {
 	// Extract max_tokens for potential use in inferenceConfig
 	// Handle -1 as "use maximum" (Kiro max output is ~32000 tokens)
 	const kiroMaxOutputTokens = 32000
@@ -219,35 +223,30 @@ func BuildKiroPayloadFromOpenAI(openaiBody []byte, modelID, profileArn, origin s
 		log.Debugf("kiro-openai: injected response_format hint into system prompt")
 	}

-	// Check for thinking mode and inject thinking hint
-	// Supports OpenAI reasoning_effort parameter and model name hints
-	thinkingEnabled, budgetTokens := checkThinkingModeFromOpenAI(openaiBody)
-	if thinkingEnabled {
-		// Adjust budgetTokens based on max_tokens if not explicitly set by reasoning_effort
-		// Use 50% of max_tokens for thinking, with min 8000 and max 24000
-		if maxTokens > 0 && budgetTokens == 16000 { // 16000 is the default, meaning not explicitly set
-			calculatedBudget := maxTokens / 2
-			if calculatedBudget < 8000 {
-				calculatedBudget = 8000
-			}
-			if calculatedBudget > 24000 {
-				calculatedBudget = 24000
-			}
-			budgetTokens = calculatedBudget
-			log.Debugf("kiro-openai: budgetTokens calculated from max_tokens: %d (max_tokens=%d)", budgetTokens, maxTokens)
-		}
-
-		if systemPrompt != "" {
-			systemPrompt += "\n"
-		}
-		dynamicThinkingHint := fmt.Sprintf("<thinking_mode>interleaved</thinking_mode><max_thinking_length>%d</max_thinking_length>", budgetTokens)
-		systemPrompt += dynamicThinkingHint
-		log.Debugf("kiro-openai: injected dynamic thinking hint into system prompt, max_thinking_length: %d", budgetTokens)
-	}
+	// Check for thinking mode
+	// Supports OpenAI reasoning_effort parameter, model name hints, and Anthropic-Beta header
+	thinkingEnabled := checkThinkingModeFromOpenAIWithHeaders(openaiBody, headers)

 	// Convert OpenAI tools to Kiro format
 	kiroTools := convertOpenAIToolsToKiro(tools)

+	// Thinking mode implementation:
+	// Kiro API doesn't accept max_tokens for thinking. Instead, thinking mode is enabled
+	// by injecting <thinking_mode> and <max_thinking_length> tags into the system prompt.
+	// We use a fixed max_thinking_length value since Kiro handles the actual budget internally.
+	if thinkingEnabled {
+		thinkingHint := `<thinking_mode>interleaved</thinking_mode>
+<max_thinking_length>200000</max_thinking_length>
+
+IMPORTANT: You MUST use <thinking>...</thinking> tags to show your reasoning process before providing your final response. Think step by step inside the thinking tags.`
+		if systemPrompt != "" {
+			systemPrompt = thinkingHint + "\n\n" + systemPrompt
+		} else {
+			systemPrompt = thinkingHint
+		}
+		log.Debugf("kiro-openai: injected thinking prompt")
+	}
+
 	// Process messages and build history
 	history, currentUserMsg, currentToolResults := processOpenAIMessages(messages, modelID, origin)

@@ -284,6 +283,7 @@ func BuildKiroPayloadFromOpenAI(openaiBody []byte, modelID, profileArn, origin s
 	}

 	// Build inferenceConfig if we have any inference parameters
+	// Note: Kiro API doesn't actually use max_tokens for thinking budget
 	var inferenceConfig *KiroInferenceConfig
 	if maxTokens > 0 || hasTemperature || hasTopP {
 		inferenceConfig = &KiroInferenceConfig{}
@@ -682,13 +682,28 @@ func buildFinalContent(content, systemPrompt string, toolResults []KiroToolResul
 }

 // checkThinkingModeFromOpenAI checks if thinking mode is enabled in the OpenAI request.
-// Returns (thinkingEnabled, budgetTokens).
+// Returns thinkingEnabled.
 // Supports:
 // - reasoning_effort parameter (low/medium/high/auto)
 // - Model name containing "thinking" or "reason"
 // - <thinking_mode> tag in system prompt (AMP/Cursor format)
-func checkThinkingModeFromOpenAI(openaiBody []byte) (bool, int64) {
-	var budgetTokens int64 = 16000 // Default budget
+func checkThinkingModeFromOpenAI(openaiBody []byte) bool {
+	return checkThinkingModeFromOpenAIWithHeaders(openaiBody, nil)
+}
+
+// checkThinkingModeFromOpenAIWithHeaders checks if thinking mode is enabled in the OpenAI request.
+// Returns thinkingEnabled.
+// Supports:
+// - Anthropic-Beta header with interleaved-thinking (Claude CLI)
+// - reasoning_effort parameter (low/medium/high/auto)
+// - Model name containing "thinking" or "-reason"
+// - <thinking_mode> tag in system prompt (AMP/Cursor format)
+func checkThinkingModeFromOpenAIWithHeaders(openaiBody []byte, headers http.Header) bool {
+	// Check Anthropic-Beta header first (Claude CLI uses this)
+	if kiroclaude.IsThinkingEnabledFromHeader(headers) {
+		log.Debugf("kiro-openai: thinking mode enabled via Anthropic-Beta header")
+		return true
+	}

 	// Check OpenAI format: reasoning_effort parameter
 	// Valid values: "low", "medium", "high", "auto" (not "none")
@@ -697,18 +712,7 @@ func checkThinkingModeFromOpenAI(openaiBody []byte) (bool, int64) {
 		effort := reasoningEffort.String()
 		if effort != "" && effort != "none" {
 			log.Debugf("kiro-openai: thinking mode enabled via reasoning_effort: %s", effort)
-			// Adjust budget based on effort level
-			switch effort {
-			case "low":
-				budgetTokens = 8000
-			case "medium":
-				budgetTokens = 16000
-			case "high":
-				budgetTokens = 32000
-			case "auto":
-				budgetTokens = 16000
-			}
-			return true, budgetTokens
+			return true
 		}
 	}

@@ -725,17 +729,7 @@ func checkThinkingModeFromOpenAI(openaiBody []byte) (bool, int64) {
 				thinkingMode := bodyStr[startIdx : startIdx+endIdx]
 				if thinkingMode == "interleaved" || thinkingMode == "enabled" {
 					log.Debugf("kiro-openai: thinking mode enabled via AMP/Cursor format: %s", thinkingMode)
-					// Try to extract max_thinking_length if present
-					if maxLenStart := strings.Index(bodyStr, "<max_thinking_length>"); maxLenStart >= 0 {
-						maxLenStart += len("<max_thinking_length>")
-						if maxLenEnd := strings.Index(bodyStr[maxLenStart:], "</max_thinking_length>"); maxLenEnd >= 0 {
-							maxLenStr := bodyStr[maxLenStart : maxLenStart+maxLenEnd]
-							if parsed, err := fmt.Sscanf(maxLenStr, "%d", &budgetTokens); err == nil && parsed == 1 {
-								log.Debugf("kiro-openai: extracted max_thinking_length: %d", budgetTokens)
-							}
-						}
-					}
-					return true, budgetTokens
+					return true
 				}
 			}
 		}
@@ -746,13 +740,21 @@ func checkThinkingModeFromOpenAI(openaiBody []byte) (bool, int64) {
 	modelLower := strings.ToLower(model)
 	if strings.Contains(modelLower, "thinking") || strings.Contains(modelLower, "-reason") {
 		log.Debugf("kiro-openai: thinking mode enabled via model name hint: %s", model)
-		return true, budgetTokens
+		return true
 	}

 	log.Debugf("kiro-openai: no thinking mode detected in OpenAI request")
-	return false, budgetTokens
+	return false
 }

+// hasThinkingTagInBody checks if the request body already contains thinking configuration tags.
+// This is used to prevent duplicate injection when client (e.g., AMP/Cursor) already includes thinking config.
+func hasThinkingTagInBody(body []byte) bool {
+	bodyStr := string(body)
+	return strings.Contains(bodyStr, "<thinking_mode>") || strings.Contains(bodyStr, "<max_thinking_length>")
+}
+
+
 // extractToolChoiceHint extracts tool_choice from OpenAI request and returns a system prompt hint.
 // OpenAI tool_choice values:
 // - "none": Don't use any tools
@@ -845,4 +847,4 @@ func deduplicateToolResults(toolResults []KiroToolResult) []KiroToolResult {
 		}
 	}
 	return unique
-}
+}
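
Review note: the old code appended a budget-dependent hint to the system prompt, while the new code prepends a fixed hint. A minimal sketch of the new placement, assuming a hypothetical helper `withThinkingHint` and omitting the trailing "IMPORTANT" instruction text for brevity:

```go
package main

import "fmt"

// thinkingHint reproduces the two tags from the diff; the instruction
// sentence that follows them in the real prompt is left out here.
const thinkingHint = "<thinking_mode>interleaved</thinking_mode>\n" +
	"<max_thinking_length>200000</max_thinking_length>"

// withThinkingHint (hypothetical name) puts the hint ahead of any
// existing system prompt, separated by a blank line.
func withThinkingHint(systemPrompt string) string {
	if systemPrompt == "" {
		return thinkingHint
	}
	return thinkingHint + "\n\n" + systemPrompt
}

func main() {
	fmt.Println(withThinkingHint("Answer tersely."))
}
```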
--- a/internal/translator/kiro/openai/kiro_openai_response.go
+++ b/internal/translator/kiro/openai/kiro_openai_response.go
@@ -21,12 +21,25 @@ var functionCallIDCounter uint64
 // Supports tool_calls when tools are present in the response.
 // stopReason is passed from upstream; fallback logic applied if empty.
 func BuildOpenAIResponse(content string, toolUses []KiroToolUse, model string, usageInfo usage.Detail, stopReason string) []byte {
+	return BuildOpenAIResponseWithReasoning(content, "", toolUses, model, usageInfo, stopReason)
+}
+
+// BuildOpenAIResponseWithReasoning constructs an OpenAI Chat Completions-compatible response with reasoning_content support.
+// Supports tool_calls when tools are present in the response.
+// reasoningContent is included as reasoning_content field in the message when present.
+// stopReason is passed from upstream; fallback logic applied if empty.
+func BuildOpenAIResponseWithReasoning(content, reasoningContent string, toolUses []KiroToolUse, model string, usageInfo usage.Detail, stopReason string) []byte {
 	// Build the message object
 	message := map[string]interface{}{
 		"role":    "assistant",
 		"content": content,
 	}

+	// Add reasoning_content if present (for thinking/reasoning models)
+	if reasoningContent != "" {
+		message["reasoning_content"] = reasoningContent
+	}
+
 	// Add tool_calls if present
 	if len(toolUses) > 0 {
 		var toolCalls []map[string]interface{}
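
Review note: the `reasoning_content` handling can be illustrated in isolation. This is a sketch only; `buildMessage` is a hypothetical reduction of `BuildOpenAIResponseWithReasoning` to the message object, without tool calls or usage.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildMessage (hypothetical) builds the assistant message and adds
// reasoning_content only when there is reasoning text to report.
func buildMessage(content, reasoning string) map[string]interface{} {
	msg := map[string]interface{}{
		"role":    "assistant",
		"content": content,
	}
	if reasoning != "" {
		msg["reasoning_content"] = reasoning
	}
	return msg
}

func main() {
	out, err := json.Marshal(buildMessage("42", "6 times 7"))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Omitting the key when it is empty keeps responses for non-thinking requests identical to what `BuildOpenAIResponse` produced before this change.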
--- a/internal/translator/openai/openai/responses/openai_openai-responses_request.go
+++ b/internal/translator/openai/openai/responses/openai_openai-responses_request.go
@@ -65,7 +65,7 @@ func ConvertOpenAIResponsesRequestToOpenAIChatCompletions(modelName string, inpu
 			}

 			switch itemType {
-			case "message":
+			case "message", "":
 				// Handle regular message conversion
 				role := item.Get("role").String()
 				message := `{"role":"","content":""}`
@@ -107,6 +107,8 @@ func ConvertOpenAIResponsesRequestToOpenAIChatCompletions(modelName string, inpu
 					if len(toolCalls) > 0 {
 						message, _ = sjson.Set(message, "tool_calls", toolCalls)
 					}
+				} else if content.Type == gjson.String {
+					message, _ = sjson.Set(message, "content", content.String())
 				}

 				out, _ = sjson.SetRaw(out, "messages.-1", message)
--- a/internal/watcher/watcher.go
+++ b/internal/watcher/watcher.go
@@ -2091,6 +2091,11 @@ func buildConfigChangeDetails(oldCfg, newCfg *config.Config) []string {
 	if oldCfg.RemoteManagement.DisableControlPanel != newCfg.RemoteManagement.DisableControlPanel {
 		changes = append(changes, fmt.Sprintf("remote-management.disable-control-panel: %t -> %t", oldCfg.RemoteManagement.DisableControlPanel, newCfg.RemoteManagement.DisableControlPanel))
 	}
+	oldPanelRepo := strings.TrimSpace(oldCfg.RemoteManagement.PanelGitHubRepository)
+	newPanelRepo := strings.TrimSpace(newCfg.RemoteManagement.PanelGitHubRepository)
+	if oldPanelRepo != newPanelRepo {
+		changes = append(changes, fmt.Sprintf("remote-management.panel-github-repository: %s -> %s", oldPanelRepo, newPanelRepo))
+	}
 	if oldCfg.RemoteManagement.SecretKey != newCfg.RemoteManagement.SecretKey {
 		switch {
 		case oldCfg.RemoteManagement.SecretKey == "" && newCfg.RemoteManagement.SecretKey != "":
--- a/sdk/api/handlers/handlers.go
+++ b/sdk/api/handlers/handlers.go
@@ -5,6 +5,7 @@ package handlers

 import (
 	"bytes"
+	"encoding/json"
 	"fmt"
 	"net/http"
 	"strings"
@@ -137,6 +138,16 @@ func (h *BaseAPIHandler) GetContextWithCancel(handler interfaces.APIHandler, c *
 	newCtx = context.WithValue(newCtx, "handler", handler)
 	return newCtx, func(params ...interface{}) {
 		if h.Cfg.RequestLog && len(params) == 1 {
+			if existing, exists := c.Get("API_RESPONSE"); exists {
+				if existingBytes, ok := existing.([]byte); ok && len(bytes.TrimSpace(existingBytes)) > 0 {
+					switch params[0].(type) {
+					case error, string:
+						cancel()
+						return
+					}
+				}
+			}
+
 			var payload []byte
 			switch data := params[0].(type) {
 			case []byte:
@@ -457,12 +468,53 @@ func (h *BaseAPIHandler) WriteErrorResponse(c *gin.Context, msg *interfaces.Erro
 			}
 		}
 	}
-	c.Status(status)
+
+	errText := http.StatusText(status)
 	if msg != nil && msg.Error != nil {
-		_, _ = c.Writer.Write([]byte(msg.Error.Error()))
-	} else {
-		_, _ = c.Writer.Write([]byte(http.StatusText(status)))
+		if v := strings.TrimSpace(msg.Error.Error()); v != "" {
+			errText = v
+		}
 	}
+
+	// Prefer preserving upstream JSON error bodies when possible.
+	buildJSONBody := func() []byte {
+		trimmed := strings.TrimSpace(errText)
+		if trimmed != "" && json.Valid([]byte(trimmed)) {
+			return []byte(trimmed)
+		}
+		errType := "invalid_request_error"
+		switch status {
+		case http.StatusUnauthorized:
+			errType = "authentication_error"
+		case http.StatusForbidden:
+			errType = "permission_error"
+		case http.StatusTooManyRequests:
+			errType = "rate_limit_error"
+		default:
+			if status >= http.StatusInternalServerError {
+				errType = "server_error"
+			}
+		}
+		payload, err := json.Marshal(ErrorResponse{
+			Error: ErrorDetail{
+				Message: errText,
+				Type:    errType,
+			},
+		})
+		if err != nil {
+			return []byte(fmt.Sprintf(`{"error":{"message":%q,"type":"server_error"}}`, errText))
+		}
+		return payload
+	}
+
+	body := buildJSONBody()
+	c.Set("API_RESPONSE", bytes.Clone(body))
+
+	if !c.Writer.Written() {
+		c.Writer.Header().Set("Content-Type", "application/json")
+	}
+	c.Status(status)
+	_, _ = c.Writer.Write(body)
 }

 func (h *BaseAPIHandler) LoggingAPIResponseError(ctx context.Context, err *interfaces.ErrorMessage) {
--- a/sdk/cliproxy/auth/manager.go
+++ b/sdk/cliproxy/auth/manager.go
@@ -375,10 +375,19 @@ func (m *Manager) executeWithProvider(ctx context.Context, provider string, req
 		}

 		accountType, accountInfo := auth.AccountInfo()
+		proxyInfo := auth.ProxyInfo()
 		if accountType == "api_key" {
-			log.Debugf("Use API key %s for model %s", util.HideAPIKey(accountInfo), req.Model)
+			if proxyInfo != "" {
+				log.Debugf("Use API key %s for model %s %s", util.HideAPIKey(accountInfo), req.Model, proxyInfo)
+			} else {
+				log.Debugf("Use API key %s for model %s", util.HideAPIKey(accountInfo), req.Model)
+			}
 		} else if accountType == "oauth" {
-			log.Debugf("Use OAuth %s for model %s", accountInfo, req.Model)
+			if proxyInfo != "" {
+				log.Debugf("Use OAuth %s for model %s %s", accountInfo, req.Model, proxyInfo)
+			} else {
+				log.Debugf("Use OAuth %s for model %s", accountInfo, req.Model)
+			}
 		}

 		tried[auth.ID] = struct{}{}
@@ -423,10 +432,19 @@ func (m *Manager) executeCountWithProvider(ctx context.Context, provider string,
 		}

 		accountType, accountInfo := auth.AccountInfo()
+		proxyInfo := auth.ProxyInfo()
 		if accountType == "api_key" {
-			log.Debugf("Use API key %s for model %s", util.HideAPIKey(accountInfo), req.Model)
+			if proxyInfo != "" {
+				log.Debugf("Use API key %s for model %s %s", util.HideAPIKey(accountInfo), req.Model, proxyInfo)
+			} else {
+				log.Debugf("Use API key %s for model %s", util.HideAPIKey(accountInfo), req.Model)
+			}
 		} else if accountType == "oauth" {
-			log.Debugf("Use OAuth %s for model %s", accountInfo, req.Model)
+			if proxyInfo != "" {
+				log.Debugf("Use OAuth %s for model %s %s", accountInfo, req.Model, proxyInfo)
+			} else {
+				log.Debugf("Use OAuth %s for model %s", accountInfo, req.Model)
+			}
 		}

 		tried[auth.ID] = struct{}{}
@@ -471,10 +489,19 @@ func (m *Manager) executeStreamWithProvider(ctx context.Context, provider string
 		}

 		accountType, accountInfo := auth.AccountInfo()
+		proxyInfo := auth.ProxyInfo()
 		if accountType == "api_key" {
-			log.Debugf("Use API key %s for model %s", util.HideAPIKey(accountInfo), req.Model)
+			if proxyInfo != "" {
+				log.Debugf("Use API key %s for model %s %s", util.HideAPIKey(accountInfo), req.Model, proxyInfo)
+			} else {
+				log.Debugf("Use API key %s for model %s", util.HideAPIKey(accountInfo), req.Model)
+			}
 		} else if accountType == "oauth" {
-			log.Debugf("Use OAuth %s for model %s", accountInfo, req.Model)
+			if proxyInfo != "" {
+				log.Debugf("Use OAuth %s for model %s %s", accountInfo, req.Model, proxyInfo)
+			} else {
+				log.Debugf("Use OAuth %s for model %s", accountInfo, req.Model)
+			}
 		}

 		tried[auth.ID] = struct{}{}
--- a/sdk/cliproxy/auth/types.go
+++ b/sdk/cliproxy/auth/types.go
@@ -157,6 +157,20 @@ func (m *ModelState) Clone() *ModelState {
 	return &copyState
 }

+func (a *Auth) ProxyInfo() string {
+	if a == nil {
+		return ""
+	}
+	proxyStr := strings.TrimSpace(a.ProxyURL)
+	if proxyStr == "" {
+		return ""
+	}
+	if idx := strings.Index(proxyStr, "://"); idx > 0 {
+		return "via " + proxyStr[:idx] + " proxy"
+	}
+	return "via proxy"
+}
+
 func (a *Auth) AccountInfo() (string, string) {
 	if a == nil {
 		return "", ""
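
Review note: `ProxyInfo` reduces a proxy URL to its scheme before it reaches the debug log, which has the side effect of keeping any credentials or host details in the URL out of the log line. A minimal sketch, assuming a hypothetical free function `proxyLabel` in place of the method on `Auth`:

```go
package main

import (
	"fmt"
	"strings"
)

// proxyLabel is a hypothetical free-function version of Auth.ProxyInfo.
// It returns "" when no proxy is configured, "via <scheme> proxy" when the
// URL has a scheme, and "via proxy" otherwise.
func proxyLabel(proxyURL string) string {
	p := strings.TrimSpace(proxyURL)
	if p == "" {
		return ""
	}
	if idx := strings.Index(p, "://"); idx > 0 {
		return "via " + p[:idx] + " proxy"
	}
	return "via proxy"
}

func main() {
	inputs := []string{"socks5://user:pass@127.0.0.1:1080", "127.0.0.1:8080", "  "}
	for _, u := range inputs {
		fmt.Printf("%q -> %q\n", u, proxyLabel(u))
	}
}
```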
Author	SHA1	Message	Date
Luis Pater	f957b8948c	chore(deps): bump `golang.org/x/term` to v0.37.0	2025-12-17 00:19:15 +08:00
Luis Pater	cd0b14dd2d	Merge pull request #32 from Ravens2121/master feat: enhance thinking mode support for Kiro translator	2025-12-17 00:00:57 +08:00
Ravens2121	894703a484	Merge branch 'router-for-me:main' into master	2025-12-16 23:06:06 +08:00
Luis Pater	d08a2453f7	Merge pull request #33 from router-for-me/plus v6.6.18	2025-12-16 13:19:42 +08:00
Luis Pater	3f53eea1e0	Merge branch 'main' into plus	2025-12-16 13:19:32 +08:00
Luis Pater	5a812a1e93	feat(remote-management): add support for custom GitHub repository for panel updates Introduce `panel-github-repository` in the configuration to allow specifying a custom repository for management panel assets. Update dependency versions and enhance asset URL resolution logic to support overrides.	2025-12-16 13:09:26 +08:00
Chén Mù	5e624cc7b1	Merge pull request #558 from router-for-me/worker chore: ignore .bmad directory	2025-12-16 09:24:32 +08:00
Ravens2121	f3d1cc8dc1	chore: change debug logs from INFO to DEBUG level	2025-12-16 05:32:03 +08:00
Ravens2121	e889efeda7	fix: add signature field to thinking blocks for non-streaming mode - Add generateThinkingSignature() function in kiro_claude_response.go	2025-12-16 05:21:49 +08:00
Ravens2121	0a3a95521c	feat: enhance thinking mode support for Kiro translator Changes:	2025-12-16 05:01:40 +08:00
Luis Pater	4ebaf6f7a9	Merge pull request #31 from router-for-me/plus v6.6.17	2025-12-15 23:54:30 +08:00
Luis Pater	59ac1a3f60	Merge branch 'main' into plus	2025-12-15 23:53:23 +08:00
Luis Pater	3af24597ee	docs: remove Amp CLI integration guides and update references	2025-12-15 23:50:56 +08:00
hkfires	e0be6c5786	chore: ignore .bmad directory	2025-12-15 20:53:43 +08:00
Luis Pater	88b101ebf5	Merge pull request #549 from router-for-me/log Improve Request Logging Efficiency and Standardize Error Responses	2025-12-15 20:43:12 +08:00
Luis Pater	923a5d6efb	Merge branch 'router-for-me:main' into main	2025-12-15 20:40:23 +08:00
Luis Pater	734b7e42ad	Merge pull request #28 from tsln1998/main fix(kiro): remove the extra quotation marks from the protocol handler	2025-12-15 20:40:07 +08:00
Luis Pater	d9a65745df	fix(translator): handle empty item type and string content in OpenAI response parser	2025-12-15 20:35:52 +08:00
hkfires	97ab623d42	fix(api): prevent double logging for streaming responses	2025-12-15 18:00:32 +08:00
hkfires	14aa6cc7e8	fix(api): ensure all response writes are captured for logging The response writer wrapper has been refactored to more reliably capture response bodies for logging, fixing several edge cases. - Implements `WriteString` to capture writes from `io.StringWriter`, which were previously missed by the `Write` method override. - A new `shouldBufferResponseBody` helper centralizes the logic to ensure the body is buffered only when logging is active or for errors when `logOnErrorOnly` is enabled. - Streaming detection is now more robust. It correctly handles non-streaming error responses (e.g., `application/json`) that are generated for a request that was intended to be streaming. BREAKING CHANGE: The public methods `Status()`, `Size()`, and `Written()` have been removed from the `ResponseWriterWrapper` as they are no longer required by the new implementation.	2025-12-15 17:45:16 +08:00
Luis Pater	10e77fcf24	Merge pull request #29 from router-for-me/plus v6.6.15	2025-12-15 16:36:23 +08:00
Luis Pater	bbb21d7c2b	Merge branch 'main' into plus	2025-12-15 16:36:11 +08:00
hkfires	3bc489254b	fix(api): prevent double logging for error responses The WriteErrorResponse function now caches the error response body in the gin context. The deferred request logger checks for this cached response. If an error response is found, it bypasses the standard response logging. This prevents scenarios where an error is logged twice or an empty payload log overwrites the original, more detailed error log.	2025-12-15 16:36:01 +08:00
hkfires	4c07ea41c3	feat(api): return structured JSON error responses The API error handling is updated to return a structured JSON payload instead of a plain text message. This provides more context and allows clients to programmatically handle different error types. The new error response has the following structure: { "error": { "message": "...", "type": "..." } } The `type` field is determined by the HTTP status code, such as `authentication_error`, `rate_limit_error`, or `server_error`. If the underlying error message from an upstream service is already a valid JSON string, it will be preserved and returned directly. BREAKING CHANGE: API error responses are now in a structured JSON format instead of plain text. Clients expecting plain text error messages will need to be updated to parse the new JSON body.	2025-12-15 16:19:52 +08:00
Luis Pater	f6720f8dfa	Merge pull request #547 from router-for-me/amp feat(amp): require API key authentication for management routes	2025-12-15 16:14:49 +08:00
Chén Mù	e19ab3a066	Merge pull request #543 from router-for-me/log feat(auth): add proxy information to debug logs	2025-12-15 15:59:16 +08:00
Tsln	c46099c5d7	fix(kiro): remove the extra quotation marks from the protocol handler	2025-12-15 15:53:25 +08:00
hkfires	8f1dd69e72	feat(amp): require API key authentication for management routes All Amp management endpoints (e.g., /api/user, /threads) are now protected by the standard API key authentication middleware. This ensures that all management operations require a valid API key, significantly improving security. As a result of this change: - The `restrict-management-to-localhost` setting now defaults to `false`. API key authentication provides a stronger and more flexible security control than IP-based restrictions, improving usability in containerized environments. - The reverse proxy logic now strips the client's `Authorization` header after authenticating the initial request. It then injects the configured `upstream-api-key` for the request to the upstream Amp service. BREAKING CHANGE: Amp management endpoints now require a valid API key for authentication. Requests without a valid API key in the `Authorization` header will be rejected with a 401 Unauthorized error.	2025-12-15 13:24:53 +08:00
hkfires	f26da24a2f	feat(auth): add proxy information to debug logs	2025-12-15 13:14:55 +08:00