Kiro API endpoints only exist in us-east-1, but OIDC region can vary
by Enterprise user location (e.g., ap-northeast-2 for Korean users).
Previously, when ProfileARN was not available, the code fell back to
using OIDC region for API calls, causing DNS resolution failures:
lookup codewhisperer.ap-northeast-2.amazonaws.com: no such host
This fix removes the OIDC region fallback for API endpoints.
The region priority is now:
1. api_region (explicit override)
2. ProfileARN region
3. us-east-1 (default)
Fixes: Issue #253 (200-400x slower response times due to DNS failures)
Switch from CodeWhisperer endpoint to Amazon Q endpoint for all auth types:
- Use q.{region}.amazonaws.com/generateAssistantResponse as primary endpoint
- Works across AWS regions (CodeWhisperer only exists in us-east-1)
- Use application/json Content-Type instead of application/x-amz-json-1.0
- Remove X-Amz-Target header for Q endpoint (not required)
- Add x-amzn-kiro-agent-mode: vibe header
- Add x-amzn-codewhisperer-optout: true header
- Keep CodeWhisperer endpoint as fallback for compatibility
This change aligns with Amazon's consolidation of services under the Q branding
and provides better multi-region support for Enterprise/IDC users.
Introduce a custom HTTP client utilizing utls with Firefox TLS fingerprinting to bypass Cloudflare fingerprinting on Anthropic domains. Includes support for proxy configuration and enhanced connection management for HTTP/2.
Address @Xm798's feedback: OIDC region may differ from API region in some
Enterprise setups (e.g., OIDC in us-east-2, API in us-east-1).
Region priority (highest to lowest):
1. api_region - explicit override for API endpoint region
2. ProfileARN - extract region from arn:aws:service:REGION:account:resource
3. region - OIDC/Identity region (fallback)
4. us-east-1 - default
Changes:
- Add extractRegionFromProfileARN() to parse region from ARN
- Update getKiroEndpointConfigs() with 4-level region priority
- Add regionSource logging for debugging
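The four-level priority above can be sketched roughly as follows. Only extractRegionFromProfileARN() is named in this change; the resolveRegion helper, its signature, and the source labels it returns are assumptions made for illustration and may not match the real getKiroEndpointConfigs().

```go
package main

import "strings"

const defaultRegion = "us-east-1"

// extractRegionFromProfileARN returns the region component of an ARN shaped
// like arn:aws:service:REGION:account:resource, or "" if the input does not
// look like an ARN.
func extractRegionFromProfileARN(arn string) string {
	parts := strings.SplitN(arn, ":", 6)
	if len(parts) < 6 || parts[0] != "arn" {
		return ""
	}
	return parts[3]
}

// resolveRegion is a hypothetical helper: it applies the four-level priority
// and also reports which source won, so the caller can log it.
func resolveRegion(apiRegion, profileARN, oidcRegion string) (region, source string) {
	if apiRegion != "" {
		return apiRegion, "api_region"
	}
	if r := extractRegionFromProfileARN(profileARN); r != "" {
		return r, "profile_arn"
	}
	if oidcRegion != "" {
		return oidcRegion, "region"
	}
	return defaultRegion, "default"
}
```

Returning the source alongside the region keeps the regionSource log line next to the decision that produced it.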
Revert the Amazon Q endpoint path to root '/' instead of '/generateAssistantResponse'.
The '/generateAssistantResponse' path applies only to the CodeWhisperer endpoint with
the 'GenerateAssistantResponse' target. The Amazon Q endpoint uses the 'SendMessage'
target, which requires the root path.
Thanks to @gemini-code-assist for catching this copy-paste error.
## Problem
- Kiro API endpoints were hardcoded to the us-east-1 region
- Enterprise users in other regions (e.g., ap-northeast-2) experienced
significant latency (200-400x slower) due to cross-region API calls
- This is the API endpoint counterpart to quotio PR #241, which fixed
token refresh endpoints
## Solution
- Add buildKiroEndpointConfigs(region) function for dynamic endpoint generation
- Extract region from auth.Metadata["region"] field
- Fall back to us-east-1 for backward compatibility
- Use case-insensitive authMethod comparison (consistent with quotio PR #252)
## Changes
- Add kiroDefaultRegion constant
- Convert hardcoded endpoint URLs to dynamic fmt.Sprintf with region
- Update getKiroEndpointConfigs to extract and use region from auth
- Fix isIDCAuth to use case-insensitive comparison
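A minimal sketch of the dynamic endpoint generation described above. The names kiroDefaultRegion and buildKiroEndpointConfigs come from this change; the endpointConfig type, the regionFromMetadata helper, and the single-entry result are simplifications assumed here, not the actual implementation.

```go
package main

import "fmt"

const kiroDefaultRegion = "us-east-1"

// endpointConfig is a hypothetical, simplified stand-in for the real config type.
type endpointConfig struct {
	Name string
	URL  string
}

// regionFromMetadata (hypothetical helper) reads the "region" field from auth
// metadata and falls back to the default when it is missing or empty.
func regionFromMetadata(meta map[string]any) string {
	if r, ok := meta["region"].(string); ok && r != "" {
		return r
	}
	return kiroDefaultRegion
}

// buildKiroEndpointConfigs formats the endpoint URL for the given region
// instead of using a hardcoded us-east-1 URL.
func buildKiroEndpointConfigs(region string) []endpointConfig {
	if region == "" {
		region = kiroDefaultRegion
	}
	return []endpointConfig{
		{
			Name: "CodeWhisperer",
			URL:  fmt.Sprintf("https://codewhisperer.%s.amazonaws.com/generateAssistantResponse", region),
		},
	}
}
```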
## Testing
- Backward compatible: defaults to us-east-1 when no region specified
- Enterprise users can now use their local region endpoints
Related:
- quotio PR #241: Dynamic region for token refresh (merged)
- quotio PR #252: authMethod case-insensitive fix
- quotio Issue #253: Performance issue report
- Add firstChunkTimestamp field to ResponseWriterWrapper for sync capture
- Capture TTFB in Write() and WriteString() before async channel send
- Add SetFirstChunkTimestamp() to StreamingLogWriter interface
- Make requestTimestamp/apiResponseTimestamp required in LogRequest()
- Remove timestamp capture from WriteAPIResponse() (now via setter)
- Fix Gemini handler to set API_RESPONSE_TIMESTAMP before writing response
This ensures accurate TTFB measurement for all streaming API formats
(OpenAI, Gemini, Claude) by capturing timestamp synchronously when
the first response chunk arrives, not when the stream finalizes.
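The idea of capturing the timestamp synchronously, before the chunk is handed to the async channel, might look like the sketch below. ttfbRecorder is a hypothetical, stripped-down stand-in for ResponseWriterWrapper; the drop-when-full behavior is an assumption of this example, not something this change specifies.

```go
package main

import (
	"sync"
	"time"
)

// ttfbRecorder is a hypothetical stand-in for ResponseWriterWrapper.
type ttfbRecorder struct {
	mu                  sync.Mutex
	firstChunkTimestamp time.Time
	chunks              chan []byte // consumed asynchronously by a log writer
}

// Write records the time of the first chunk before the async send, so the
// measurement does not depend on when the log writer drains the channel.
func (r *ttfbRecorder) Write(p []byte) (int, error) {
	r.mu.Lock()
	if r.firstChunkTimestamp.IsZero() {
		r.firstChunkTimestamp = time.Now()
	}
	r.mu.Unlock()

	buf := append([]byte(nil), p...) // copy: the caller may reuse p
	select {
	case r.chunks <- buf:
	default: // assumption: drop the log copy rather than block the response
	}
	return len(p), nil
}

// FirstChunkTimestamp returns the recorded time, or the zero time if no
// chunk has been written yet.
func (r *ttfbRecorder) FirstChunkTimestamp() time.Time {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.firstChunkTimestamp
}
```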
Previously:
- REQUEST INFO timestamp was captured at log write time (not request arrival)
- API RESPONSE had NO timestamp at all
This fix:
- Captures REQUEST INFO timestamp when request first arrives
- Adds API RESPONSE timestamp when upstream response arrives
Changes:
- Add Timestamp field to RequestInfo, set at middleware initialization
- Set API_RESPONSE_TIMESTAMP in appendAPIResponse() and gemini handler
- Pass timestamps through logging chain to writeNonStreamingLog()
- Add timestamp output to API RESPONSE section
This enables accurate measurement of backend response latency in error logs.
When the Kiro/AWS CodeWhisperer API receives a Write tool request with content
that exceeds transmission limits, it truncates the tool input. This can result in:
- Empty input buffer (no input transmitted at all)
- Missing 'content' field in the parsed JSON
- Incomplete JSON that fails to parse
This fix detects these truncation scenarios and converts them to Bash tool calls
that echo an error message. Claude Code executes the Bash command and sees the
error output, so the agent can retry with smaller chunks.
Changes:
- kiro_claude_tools.go: Detect three truncation scenarios in ProcessToolUseEvent:
  1. Empty input buffer (no input transmitted)
  2. JSON parse failure with file_path but no content field
  3. Successfully parsed JSON missing content field
  When detected, emit a special '__truncated_write__' marker tool use
- kiro_executor.go: Handle '__truncated_write__' markers in streamToChannel:
  1. Extract file_path from the marker for context
  2. Create a Bash tool_use that echoes an error message
  3. Include retry guidance (700-line chunks recommended)
  4. Set hasToolUses=true to ensure stop_reason='tool_use' for agent continuation
This ensures the agent continues and can retry with smaller file chunks instead
of failing silently or showing errors to the user.
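The three truncation checks could be approximated as shown here. detectTruncatedWrite is a hypothetical helper written for illustration; the actual logic lives in ProcessToolUseEvent and may inspect the input differently (the substring checks in particular are an assumption).

```go
package main

import (
	"encoding/json"
	"strings"
)

// detectTruncatedWrite reports whether raw Write tool input looks truncated,
// along with a short reason.
func detectTruncatedWrite(rawInput string) (truncated bool, reason string) {
	// Scenario 1: nothing was transmitted at all.
	if strings.TrimSpace(rawInput) == "" {
		return true, "empty input"
	}

	var parsed map[string]json.RawMessage
	if err := json.Unmarshal([]byte(rawInput), &parsed); err != nil {
		// Scenario 2: the JSON was cut off after file_path, before content.
		if strings.Contains(rawInput, `"file_path"`) && !strings.Contains(rawInput, `"content"`) {
			return true, "unparseable JSON with file_path but no content"
		}
		return false, ""
	}

	// Scenario 3: the JSON parsed, but the content field is absent.
	if _, ok := parsed["content"]; !ok {
		return true, "missing content field"
	}
	return false, ""
}
```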
When using Gemini API format with Antigravity backend, the executor
renames usageMetadata to cpaUsageMetadata in non-terminal chunks.
The Gemini translator was returning this internal field name directly
to clients instead of the standard usageMetadata field.
Add restoreUsageMetadata() to rename cpaUsageMetadata back to
usageMetadata before returning responses to clients.
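A minimal sketch of the rename, assuming the field sits at the top level of each JSON chunk and using only encoding/json. The real restoreUsageMetadata() may have a different signature, use a different JSON library, or handle nested payloads.

```go
package main

import "encoding/json"

// restoreUsageMetadata renames the internal cpaUsageMetadata key back to the
// standard usageMetadata key. Chunks without the internal key are returned
// unchanged.
func restoreUsageMetadata(chunk []byte) ([]byte, error) {
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(chunk, &obj); err != nil {
		return nil, err
	}
	usage, ok := obj["cpaUsageMetadata"]
	if !ok {
		return chunk, nil
	}
	delete(obj, "cpaUsageMetadata")
	obj["usageMetadata"] = usage
	return json.Marshal(obj)
}
```

Note that re-marshaling a map emits keys in sorted order, so key order in the output can differ from the input.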
Optimize channel operations by introducing reusable context-aware send functions (`send` and `sendErr`) across `wsrelay`, `handlers`, and `cliproxy`. Ensure graceful handling of canceled contexts during stream operations.
**Problem:**
GitHub Copilot API returns 400 error "missing required Copilot-Vision-Request
header for vision requests" when requests contain image content blocks, even
though the requests are valid Claude API calls.
**Root Cause:**
The GitHub Copilot executor was not detecting vision content in requests and
did not add the required `Copilot-Vision-Request: true` header.
**Solution:**
- Add `detectVisionContent()` function to check for image_url/image content blocks
- Automatically add `Copilot-Vision-Request: true` header when vision content is detected
- Apply the fix to both `Execute()` and `ExecuteStream()` methods
**Testing:**
- Tested with Claude Code IDE requests containing code context screenshots
- Vision requests now succeed instead of failing with 400 errors
- Non-vision requests remain unchanged
Fixes an issue where the GitHub Copilot executor fails all vision-enabled requests,
causing unnecessary fallback to other providers and 0% utilization.
Co-Authored-By: Claude (claude-sonnet-4.5) <noreply@anthropic.com>
The background refresher was skipping token files with auth_method values
like 'IdC' or 'IDC' because the comparison was case-sensitive and only
matched lowercase 'idc'.
This fix normalizes the auth_method to lowercase before comparison in:
- token_repository.go: readTokenFile() when filtering tokens to refresh
- background_refresh.go: refreshSingle() when selecting refresh method
Fixes the issue where 'IdC' != 'idc' caused tokens to be skipped entirely.
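The normalization amounts to something like the following. isIDCAuthMethod is a hypothetical helper name used for illustration, and trimming whitespace is an extra assumption; the actual checks are inline in readTokenFile() and refreshSingle().

```go
package main

import "strings"

// isIDCAuthMethod lowercases the stored auth_method before comparing, so
// "IdC", "IDC" and "idc" are all treated as the same method.
func isIDCAuthMethod(authMethod string) bool {
	return strings.ToLower(strings.TrimSpace(authMethod)) == "idc"
}
```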
Implement `request_retry` and `disable_cooling` metadata overrides for authentication management. Update retry and cooling logic accordingly across `Manager`, Antigravity executor, and file synthesizer. Add tests to validate new behaviors.