Three changes to avoid Anthropic's content-based system prompt validation:
1. Fix identity prefix: Use 'You are Claude Code, Anthropic's official CLI
for Claude.' instead of the SDK agent prefix, matching real Claude Code.
2. Move user system instructions to user message: Only keep billing header +
identity prefix in system[] array. User system instructions are prepended
to the first user message as <system-reminder> blocks.
3. Enable cch signing for OAuth tokens by default: The xxHash64 cch integrity
check was previously gated behind experimentalCCHSigning config flag.
Now automatically enabled when using OAuth tokens.
Related: router-for-me/CLIProxyAPI#2599
Added comprehensive tests to ensure key order is maintained when modifying payloads in `normalizeCacheControlTTL` and `enforceCacheControlLimit` functions. Removed unused helper functions and refactored implementations for better readability and efficiency.
The review asked for the builtin tool registry helper to live with the rest
of executor support utilities. This moves the registry code into the helps
package, exports the minimal surface executor needs, and keeps behavior tests
with the executor while leaving registry-focused checks with the helper.
Constraint: Requested layout keeps executor helper utilities centralized under internal/runtime/executor/helps
Rejected: Keep the files in executor and reply with rationale | conflicts with requested package organization
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep executor behavior tests near applyClaudeToolPrefix and keep pure registry tests in helps
Tested: go test ./internal/runtime/executor/helps ./internal/runtime/executor -run 'Claude|Builtin|Tool'; go test ./test/...; go test ./...
Not-tested: End-to-end Claude Code direct-connect/session runtime behavior
This change stops short of broader Claude Code runtime alignment and instead
hardens two safe edges: builtin tool prefix handling and source-informed
sentinel coverage for future drift checks.
Constraint: Must preserve existing default behavior for current users
Rejected: Implement control-plane/session alignment now | too much runtime risk for a first slice
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Treat the new fixtures as compatibility sentinels, not a full Claude Code schema contract
Tested: go test ./test/...; go test ./sdk/translator/...; go test ./internal/runtime/executor -run 'Claude|Builtin|Tool'; go test ./...
Not-tested: End-to-end Claude Code direct-connect/session runtime behavior
1. Always include interleaved-thinking-2025-05-14 beta header so that
thinking blocks are returned correctly for all Claude models.
2. Remove status-code guard in AMP reverse proxy ModifyResponse so that
error responses (4xx/5xx) with hidden gzip encoding are decoded
properly — prevents garbled error messages reaching the client.
3. In normalizeClaudeBudget, when the adjusted budget falls below the
model minimum, set max_tokens = budgetTokens+1 instead of leaving
the request unchanged (which causes a 400 from the API).
normalizeCacheControlTTL unconditionally re-serializes the entire request
body through json.Unmarshal/json.Marshal even when no TTL normalization
is needed. Go's json.Marshal randomizes map key order and HTML-escapes
<, >, & characters (to \u003c, \u003e, \u0026), producing different raw
bytes on every call.
Anthropic's prompt caching uses byte-prefix matching, so any byte-level
difference causes a cache miss. This means the ~119K system prompt and
tools are re-processed on every request when routed through CPA.
The fix adds a bool return to normalizeTTLForBlock to indicate whether
it actually modified anything, and skips the marshal step in
normalizeCacheControlTTL when no blocks were changed.
- Set Accept-Encoding: identity for SSE streams; upstream must not compress
line-delimited SSE bodies that bufio.Scanner reads directly
- Re-enforce identity after ApplyCustomHeadersFromAttrs to prevent auth
attribute injection from re-enabling compression on the stream path
- Add peekableBody type wrapping bufio.Reader for non-consuming magic-byte
inspection of the first 4 bytes without affecting downstream readers
- Detect gzip (0x1f 0x8b) and zstd (0x28 0xb5 0x2f 0xfd) by magic bytes
when Content-Encoding header is absent, covering misbehaving upstreams
- Remove if-Content-Encoding guard on all three error paths (Execute,
ExecuteStream, CountTokens); unconditionally delegate to decodeResponseBody
so magic-byte detection applies consistently to all response paths
- Add 10 tests covering stream identity enforcement, compressed success bodies,
magic-byte detection without headers, error path decoding, and
auth attribute override prevention
Add support for Claude's "adaptive" and "auto" thinking modes using `output_config.effort`. Introduce support for new effort level "max" in adaptive thinking. Update thinking logic, validate model capabilities, and extend converters and handling to ensure compatibility with adaptive modes. Adjust static model data with supported levels and refine handling across translators and executors.
Captured and compared outgoing requests from CLIProxyAPI against real
Claude Code 2.1.63 and fixed all detectable differences:
Headers:
- Update anthropic-beta to match 2.1.63: replace fine-grained-tool-streaming
and prompt-caching-2024-07-31 with context-management-2025-06-27 and
prompt-caching-scope-2026-01-05
- Remove X-Stainless-Helper-Method header (real Claude Code does not send it)
- Update default User-Agent from "claude-cli/2.1.44 (external, sdk-cli)" to
"claude-cli/2.1.63 (external, cli)"
- Force Claude Code User-Agent for non-Claude clients to avoid leaking
real client identity (e.g. curl, OpenAI SDKs) during cloaking
Body:
- Inject x-anthropic-billing-header as system[0] (matches real format)
- Change system prompt identifier from "You are Claude Code..." to
"You are a Claude agent, built on Anthropic's Claude Agent SDK."
- Add cache_control with ttl:"1h" to match real request format
- Fix user_id format: user_[64hex]_account_[uuid]_session_[uuid]
(was missing account UUID)
- Disable tool name prefix (set claudeToolPrefix to empty string)
TLS:
- Switch utls fingerprint from HelloFirefox_Auto to HelloChrome_Auto
(closer to Node.js/OpenSSL used by real Claude Code)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Default to generating a fresh random user_id per request instead of
reusing cached IDs. Add cache-user-id config option to opt in to the
previous caching behavior.
- Add CacheUserID field to CloakConfig
- Extract user_id cache logic to dedicated file
- Generate fresh user_id by default, cache only when enabled
- Add tests for both paths
Update hardcoded X-Stainless-* and User-Agent defaults to match
Claude Code 2.1.44 / @anthropic-ai/sdk 0.74.0 (verified via
diagnostic proxy capture 2026-02-17).
Changes:
- X-Stainless-Os/Arch: dynamic via runtime.GOOS/GOARCH
- X-Stainless-Package-Version: 0.55.1 → 0.74.0
- X-Stainless-Timeout: 60 → 600
- User-Agent: claude-cli/1.0.83 (external, cli) → claude-cli/2.1.44 (external, sdk-cli)
Add claude-header-defaults config section so values can be updated
without recompilation when Claude Code releases new versions.
Allow disabling the proxy_ tool name prefix on a per-account basis.
Users who route their own Anthropic account through CPA can set
"tool_prefix_disabled": true in their OAuth auth JSON to send tool
names unchanged to Anthropic.
Default behavior is fully preserved — prefix is applied unless
explicitly disabled.
Changes:
- Add ToolPrefixDisabled() accessor to Auth (reads metadata key
"tool_prefix_disabled" or "tool-prefix-disabled")
- Gate all 6 prefix apply/strip points with the new flag
- Add unit tests for the accessor
The proxy_ prefix logic correctly skips built-in tools (those with a
non-empty "type" field) in tools[] definitions but does not skip them
in messages[].content[] tool_use blocks or tool_choice. This causes
web_search in conversation history to become proxy_web_search, which
Anthropic does not recognize.
Fix: collect built-in tool names from tools[] into a set and also
maintain a hardcoded fallback set (web_search, code_execution,
text_editor, computer) for cases where the built-in tool appears in
history but not in the current request's tools[] array. Skip prefixing
in messages and tool_choice when name matches a built-in.
- Collect built-in tool names (those with a "type" field like
web_search, code_execution) and skip prefixing tool_reference
blocks that reference them, preventing name mismatch.
- Refactor if-else if chains to switch statements in all three
prefix functions for idiomatic Go style.
tool_reference blocks can appear nested inside tool_result.content[]
arrays, not just at the top level of messages[].content[]. The prefix
logic now iterates into tool_result blocks with array content to find
and prefix/strip nested tool_reference.tool_name fields.
applyClaudeToolPrefix, stripClaudeToolPrefixFromResponse, and
stripClaudeToolPrefixFromStreamLine now handle "tool_reference" blocks
(field "tool_name") in addition to "tool_use" blocks (field "name").
Without this fix, tool_reference blocks in conversation history retain
their original unprefixed names while tool definitions carry the proxy_
prefix, causing Anthropic API 400 errors: "Tool reference 'X' not found
in available tools."
Co-authored-by: Kirill Turanskiy <kt@novamedia.ru>
- Replaced all instances of `bytes.Clone` with direct references to enhance efficiency.
- Simplified payload handling across executors and translators by eliminating unnecessary data duplication.