The Copilot API enforces per-account prompt token limits (128K individual,
168K business) that differ from the static 200K context length advertised
by the proxy. This mismatch caused Claude Code to accumulate context
beyond the actual limit, triggering "prompt token count exceeds the limit
of 128000" errors.
Changes:
- Extract max_prompt_tokens and max_output_tokens from the Copilot
/models API response (capabilities.limits) and use them as the
authoritative ContextLength and MaxCompletionTokens values
- Add CopilotModelLimits struct and Limits() helper to parse limits
from the existing Capabilities map
- Fix GitLab Duo context-1m beta header not being set when routing
through the Anthropic gateway (gitlab_duo_force_context_1m attr
was set but only gin headers were checked)
- Fix flaky parallel tests that shared global model registry state
- Strip SSE `data:` prefix before normalizing reasoning_text→reasoning_content
in streaming mode; re-wrap afterward for the translator
- Iterate all choices in normalizeGitHubCopilotReasoningField (not just
choices[0]) to support n>1 requests
- Remove over-broad tool-role fallback in isAgentInitiated that scanned
all messages for role:"tool", aligning with opencode's approach of only
detecting active tool loops — genuine user follow-ups after tool use are
no longer mis-classified as agent-initiated
- Add 5 reasoning normalization tests; update 2 X-Initiator tests to match
refined semantics
This commit addresses three issues with Claude Code through GitHub
Copilot:
1. **Premium request inflation**: Responses API requests were missing
Openai-Intent headers and proper defaults, causing Copilot to bill
each tool-loop continuation as a new premium request. Fixed by adding
isAgentInitiated() heuristic (checks for tool_result content or
preceding assistant tool_use), applying Responses API defaults
(store, include, reasoning.summary), and local tiktoken-based token
counting to avoid extra API calls.
2. **Context overflow**: Claude Code's modelSupports1M() hardcodes
opus-4-6 as 1M-capable, but Copilot only supports ~128K-200K.
Fixed by stripping the context-1m-2025-08-07 beta from translated
request bodies. Also forwards response headers in non-streaming
Execute() and registers the GET /copilot-quota management API route.
3. **Thinking not working**: Add ThinkingSupport with level-based
reasoning to Claude models in the static definitions. Normalize
Copilot's non-standard 'reasoning_text' response field to
'reasoning_content' before passing to the SDK translator. Use
caller-provided context in CountTokens instead of Background().
Keep the OpenAI Responses computer tool intact when normalizing requests for the GitHub Copilot executor.
This change preserves built-in computer tool definitions instead of dropping them as non-function tools, keeps explicit computer tool_choice selections unchanged, and classifies computer_call / computer_call_output items as assistant and tool turns when deriving the initiator header.
Together these adjustments allow Responses requests that use the computer tool to reach the upstream executor without losing tool metadata or switching turn ownership unexpectedly.
- Fix SSRF: validate API endpoint host against allowlist before use
- Limit /models response body to 2MB to prevent memory exhaustion (DoS)
- Use MakeAuthenticatedRequest for consistent headers across API calls
- Trim trailing slash on API endpoint to prevent double-slash URLs
- Use ListModelsWithGitHubToken to simplify token exchange + listing
- Deduplicate model IDs to prevent incorrect registry reference counting
- Remove dead capabilities enrichment code block
- Remove unused ModelExtra field with misleading json:"-" tag
- Extract magic numbers to named constants (defaultCopilotContextLength)
- Remove redundant hyphenID == id check (already filtered by Contains)
- Use defer cancel() for context timeout in service.go
- Add ListModels/ListModelsWithGitHubToken to CopilotAuth for querying
the /models endpoint at api.githubcopilot.com
- Add FetchGitHubCopilotModels in executor with static fallback on failure
- Update service.go to use dynamic fetching (15s timeout) instead of
hardcoded GetGitHubCopilotModels()
- Add GitHubCopilotAliasesFromModels for auto-generating dot-to-hyphen
model aliases from dynamic model lists
- Updated `ExecuteStream` functions in executors to use `StreamResult` instead of channels.
- Enhanced upstream header handling in OpenAI handlers.
- Improved maintainability and alignment across executors and handlers.
- Fix X-Initiator detection: check for any assistant/tool role
in messages instead of only the last message role, matching
the correct agent detection for multi-turn tool conversations
- Add x-github-api-version: 2025-04-01 header for API compatibility
- Support Business/Enterprise accounts by using Endpoints.API from
the Copilot token response instead of hardcoded base URL
- Fix Responses API vision detection: detect vision content before
input normalization removes the messages array
- Add 8 test cases covering the above fixes
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Change Openai-Intent header from "conversation-edits" to
"conversation-panel" to avoid triggering GitHub's premium
execution path, which caused included models (0x multiplier)
to be billed as premium requests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add response.function_call_arguments.delta handler for tool call parameters
- Rewrite normalizeGitHubCopilotResponsesInput to produce structured input
array (message/function_call/function_call_output) instead of flattened
text, fixing infinite loop in multi-turn tool-use conversations
- Skip flattenAssistantContent for messages containing tool_use blocks,
preventing function_call items from being destroyed
- Add reasoning/thinking stream & non-stream support
- Fix stop_reason mapping (max_tokens/stop) and cached token reporting
- Update test to match new array-based input format
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The normalizeGitHubCopilotResponsesTools filter required type="function",
which dropped Claude-format tools (no type field, uses input_schema).
Relax the filter to accept tools without a type field and map input_schema
to parameters so tools are correctly sent to the upstream API.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
GitHub Copilot API rejects model names with suffixes (e.g. claude-opus-4.6(medium)).
The OAuthModelAlias resolution correctly maps aliases like 'opus(medium)' to
'claude-opus-4.6(medium)' preserving the suffix, but the executor must strip the
suffix before sending to the upstream API since Copilot only accepts bare model names.
Update normalizeModel in github_copilot_executor to strip suffixes using
thinking.ParseSuffix, matching the pattern used by other executors.
Also add test coverage for:
- OAuthModelAliasChannel github-copilot and kiro channel resolution
- Suffix preservation in alias resolution for github-copilot
- normalizeModel suffix stripping in github_copilot_executor
> Copilot Premium usage significantly amplified when using amp
- Add X-Initiator header (user/agent) based on last message role to
prevent Copilot from billing all requests as premium user-initiated
- Add flattenAssistantContent() to convert assistant content from array
to string, preventing Claude from re-answering all previous prompts
- Align Copilot headers (User-Agent, Editor-Version, Openai-Intent) with
pi-ai reference implementation
Closes#113
Amp-Thread-ID: https://ampcode.com/threads/T-019c392b-736e-7489-a06b-f94f7c75f7c0
Co-authored-by: Amp <amp@ampcode.com>
**Problem:**
GitHub Copilot API returns 400 error "missing required Copilot-Vision-Request
header for vision requests" when requests contain image content blocks, even
though the requests are valid Claude API calls.
**Root Cause:**
The GitHub Copilot executor was not detecting vision content in requests and
did not add the required `Copilot-Vision-Request: true` header.
**Solution:**
- Added `detectVisionContent()` function to check for image_url/image content blocks
- Automatically add `Copilot-Vision-Request: true` header when vision content is detected
- Applied fix to both `Execute()` and `ExecuteStream()` methods
**Testing:**
- Tested with Claude Code IDE requests containing code context screenshots
- Vision requests now succeed instead of failing with 400 errors
- Non-vision requests remain unchanged
Fixes issue where GitHub Copilot executor fails all vision-enabled requests,
causing unnecessary fallback to other providers and 0% utilization.
Co-Authored-By: Claude (claude-sonnet-4.5) <noreply@anthropic.com>
Enhance model definitions by including supported API endpoints for each model. This allows for better integration and usage tracking with the GitHub Copilot API.
Updated `applyPayloadConfig` to `applyPayloadConfigWithRoot` across payload translation logic, enabling validation against the original request payload when available. Added support for improved model normalization and translation consistency.
Add complete GitHub Copilot support including:
- Device flow OAuth authentication via GitHub's official client ID
- Token management with automatic caching (25 min TTL)
- OpenAI-compatible API executor for api.githubcopilot.com
- 16 model definitions (GPT-5 variants, Claude variants, Gemini, Grok, Raptor)
- CLI login command via -github-copilot-login flag
- SDK authenticator and refresh registry integration
Enables users to authenticate with their GitHub Copilot subscription and
use it as a backend provider alongside existing providers.