The Copilot API enforces per-account prompt token limits (128K individual,
168K business) that are lower than the total context window (200K). When
the dynamic /models API fetch fails or returns no capabilities.limits,
the static fallback of 200K exceeds the real enforced limit, causing
intermittent "prompt token count exceeds the limit" errors.
Two complementary fixes:
1. Lower static Copilot Claude model ContextLength from 200000 to 128000
(the conservative default matching defaultCopilotContextLength). Dynamic
API limits override this when available.
2. Add context_length and max_completion_tokens to Claude-format model
responses so Claude Code CLI can learn the actual Copilot limit instead
of relying on its built-in 1M context configuration.
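The fallback order described above can be sketched as follows. This is a minimal illustration, not the proxy's actual code: the `Limits` struct, `conservativeContextLength`, and `effectiveContextLength` are hypothetical names, and only the 128000 value and the "dynamic overrides static" rule come from this commit.

```go
package main

// Limits mirrors the two values read from capabilities.limits.
// Hypothetical type: the proxy's real struct may differ.
type Limits struct {
	MaxPromptTokens int
	MaxOutputTokens int
}

// conservativeContextLength is the static fallback (128K), chosen because it
// is the lowest per-account prompt limit mentioned in this commit.
const conservativeContextLength = 128000

// effectiveContextLength prefers the dynamic limit when the /models fetch
// produced one, and otherwise falls back to the conservative static value.
func effectiveContextLength(dynamic *Limits) int {
	if dynamic != nil && dynamic.MaxPromptTokens > 0 {
		return dynamic.MaxPromptTokens
	}
	return conservativeContextLength
}
```

With this ordering, a business account whose fetch succeeds would see 168000, while any failed or empty fetch stays at 128000 rather than 200000.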
ensureAmpSignature injects signature:"" into tool_use blocks so the
Amp TUI does not crash on P.signature.length. However, when Amp sends the
conversation back, Claude rejects the extra field with a 400 error:
tool_use.signature: Extra inputs are not permitted
Strip the proxy-injected signature from tool_use blocks in
SanitizeAmpRequestBody before forwarding to the upstream API.
Added comprehensive tests to ensure key order is maintained when modifying payloads in `normalizeCacheControlTTL` and `enforceCacheControlLimit` functions. Removed unused helper functions and refactored implementations for better readability and efficiency.
Added comprehensive support for resolving proxy URLs from configuration based on API key and provider attributes. Introduced new helper functions and extended the test suite to validate fallback mechanisms and compatibility cases.
Introduce `StartAntigravityVersionUpdater` to periodically refresh the cached Antigravity version using a non-blocking background process. Updated main server flow to initialize the updater.
Fetch the latest version from the antigravity auto-updater releases
endpoint and cache it for 6 hours. Falls back to 1.21.9 if the API
is unreachable or returns unexpected data.
The Copilot API enforces per-account prompt token limits (128K individual,
168K business) that differ from the static 200K context length advertised
by the proxy. This mismatch caused Claude Code to accumulate context
beyond the actual limit, triggering "prompt token count exceeds the limit
of 128000" errors.
Changes:
- Extract max_prompt_tokens and max_output_tokens from the Copilot
/models API response (capabilities.limits) and use them as the
authoritative ContextLength and MaxCompletionTokens values
- Add CopilotModelLimits struct and Limits() helper to parse limits
from the existing Capabilities map
- Fix GitLab Duo context-1m beta header not being set when routing
through the Anthropic gateway (gitlab_duo_force_context_1m attr
was set but only gin headers were checked)
- Fix flaky parallel tests that shared global model registry state
- Strip SSE `data:` prefix before normalizing reasoning_text→reasoning_content
in streaming mode; re-wrap afterward for the translator
- Iterate all choices in normalizeGitHubCopilotReasoningField (not just
choices[0]) to support n>1 requests
- Remove over-broad tool-role fallback in isAgentInitiated that scanned
all messages for role:"tool", aligning with opencode's approach of only
detecting active tool loops — genuine user follow-ups after tool use are
no longer mis-classified as agent-initiated
- Add 5 reasoning normalization tests; update 2 X-Initiator tests to match
refined semantics