Address @Xm798's feedback: OIDC region may differ from API region in some
Enterprise setups (e.g., OIDC in us-east-2, API in us-east-1).
Region priority (highest to lowest):
1. api_region - explicit override for API endpoint region
2. ProfileARN - extract region from arn:aws:service:REGION:account:resource
3. region - OIDC/Identity region (fallback)
4. us-east-1 - default
Changes:
- Add extractRegionFromProfileARN() to parse region from ARN
- Update getKiroEndpointConfigs() with 4-level region priority
- Add regionSource logging for debugging
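
The priority order above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation: `resolveRegion` and the `source` strings are hypothetical names, and only `extractRegionFromProfileARN` and the four priority levels come from this message.

```go
package main

import (
	"fmt"
	"strings"
)

const kiroDefaultRegion = "us-east-1"

// extractRegionFromProfileARN returns the region field of an ARN shaped like
// arn:partition:service:region:account:resource, or "" if it cannot be parsed.
func extractRegionFromProfileARN(arn string) string {
	parts := strings.SplitN(arn, ":", 6)
	if len(parts) < 6 || parts[0] != "arn" {
		return ""
	}
	return strings.TrimSpace(parts[3])
}

// resolveRegion is a hypothetical helper that applies the 4-level priority
// and also reports which level supplied the value (useful for logging).
func resolveRegion(apiRegion, profileARN, oidcRegion string) (region, source string) {
	if r := strings.TrimSpace(apiRegion); r != "" {
		return r, "api_region"
	}
	if r := extractRegionFromProfileARN(profileARN); r != "" {
		return r, "profile_arn"
	}
	if r := strings.TrimSpace(oidcRegion); r != "" {
		return r, "region"
	}
	return kiroDefaultRegion, "default"
}

func main() {
	// OIDC lives in us-east-2, but the profile ARN points at us-east-1.
	region, source := resolveRegion("",
		"arn:aws:codewhisperer:us-east-1:123456789012:profile/EXAMPLE", "us-east-2")
	fmt.Printf("region=%s regionSource=%s\n", region, source)
}
```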
Revert the Amazon Q endpoint path to root '/' instead of '/generateAssistantResponse'.
The '/generateAssistantResponse' path is only for CodeWhisperer endpoint with
'GenerateAssistantResponse' target. Amazon Q endpoint uses 'SendMessage' target
which requires the root path.
Thanks to @gemini-code-assist for catching this copy-paste error.
## Problem
- Kiro API endpoints were hardcoded to us-east-1 region
- Enterprise users in other regions (e.g., ap-northeast-2) experienced
  significant latency (200-400x slower) due to cross-region API calls
- This is the API endpoint counterpart to quotio PR #241 which fixed
  token refresh endpoints
## Solution
- Add buildKiroEndpointConfigs(region) function for dynamic endpoint generation
- Extract region from auth.Metadata["region"] field
- Fall back to us-east-1 for backward compatibility
- Use case-insensitive authMethod comparison (consistent with quotio PR #252)
## Changes
- Add kiroDefaultRegion constant
- Convert hardcoded endpoint URLs to dynamic fmt.Sprintf with region
- Update getKiroEndpointConfigs to extract and use region from auth
- Fix isIDCAuth to use case-insensitive comparison
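
A rough sketch of the changes listed above, under stated assumptions: the `endpointConfig` type, the host name patterns, the `regionFromMetadata` helper, and the `"idc"` auth method value are illustrative guesses, not taken from the actual code. Only the function names `buildKiroEndpointConfigs` and `isIDCAuth`, the `kiroDefaultRegion` constant, and the `region` metadata key come from this description.

```go
package main

import (
	"fmt"
	"strings"
)

const kiroDefaultRegion = "us-east-1"

// endpointConfig is a hypothetical, trimmed-down stand-in for the real type.
type endpointConfig struct {
	Name string
	URL  string
}

// buildKiroEndpointConfigs builds endpoint URLs for the given region and
// falls back to kiroDefaultRegion when the region is empty.
// The host name patterns below are assumptions for illustration.
func buildKiroEndpointConfigs(region string) []endpointConfig {
	region = strings.TrimSpace(region)
	if region == "" {
		region = kiroDefaultRegion
	}
	return []endpointConfig{
		{Name: "codewhisperer", URL: fmt.Sprintf("https://codewhisperer.%s.amazonaws.com/generateAssistantResponse", region)},
		{Name: "amazonq", URL: fmt.Sprintf("https://q.%s.amazonaws.com/", region)},
	}
}

// regionFromMetadata reads the "region" field; a missing key, a nil map,
// or a non-string value all yield "".
func regionFromMetadata(metadata map[string]any) string {
	if r, ok := metadata["region"].(string); ok {
		return strings.TrimSpace(r)
	}
	return ""
}

// isIDCAuth compares the auth method case-insensitively.
// The literal "idc" is an assumed value.
func isIDCAuth(authMethod string) bool {
	return strings.EqualFold(strings.TrimSpace(authMethod), "idc")
}

func main() {
	meta := map[string]any{"region": "ap-northeast-2"}
	for _, cfg := range buildKiroEndpointConfigs(regionFromMetadata(meta)) {
		fmt.Println(cfg.Name, cfg.URL)
	}
	fmt.Println(isIDCAuth("IdC"))
}
```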
## Testing
- Backward compatible: defaults to us-east-1 when no region specified
- Enterprise users can now use their local region endpoints
Related:
- quotio PR #241: Dynamic region for token refresh (merged)
- quotio PR #252: authMethod case-insensitive fix
- quotio Issue #253: Performance issue report
Implement `request_retry` and `disable_cooling` metadata overrides for authentication management. Update retry and cooling logic accordingly across `Manager`, Antigravity executor, and file synthesizer. Add tests to validate new behaviors.
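
One way such overrides could be read is sketched below. It assumes `request_retry` is a per-auth retry count and `disable_cooling` a per-auth boolean, each taking precedence over a global setting. The helper names and the accepted value types are hypothetical.

```go
package main

import (
	"fmt"
	"strconv"
)

// retryOverride returns the per-auth "request_retry" value when it is present
// and non-negative, otherwise the global retry count.
func retryOverride(metadata map[string]any, globalRetry int) int {
	switch v := metadata["request_retry"].(type) {
	case int:
		if v >= 0 {
			return v
		}
	case float64: // JSON numbers decode to float64
		if v >= 0 {
			return int(v)
		}
	case string:
		if n, err := strconv.Atoi(v); err == nil && n >= 0 {
			return n
		}
	}
	return globalRetry
}

// coolingDisabled returns the per-auth "disable_cooling" flag when present,
// otherwise the global setting.
func coolingDisabled(metadata map[string]any, globalDisabled bool) bool {
	switch v := metadata["disable_cooling"].(type) {
	case bool:
		return v
	case string:
		if b, err := strconv.ParseBool(v); err == nil {
			return b
		}
	}
	return globalDisabled
}

func main() {
	meta := map[string]any{"request_retry": float64(0), "disable_cooling": "true"}
	fmt.Println(retryOverride(meta, 3), coolingDisabled(meta, false))
}
```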
Refactor 401 error handling in both executeWithRetry and
executeStreamWithRetry to always attempt token refresh regardless of
remaining retry attempts. Previously, token refresh was only attempted
when retries remained, which could leave valid refreshed tokens unused.
Also add auth directory resolution in RefreshManager.Initialize to
properly resolve the base directory path before creating the token
repository.
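
The 401 handling described above might look roughly like this simplified loop. The `doFunc` and `refreshFunc` types are hypothetical, the 5xx retry rule is an assumption, and the sketch further assumes the refreshed token gets one more attempt that does not count against the retry budget.

```go
package main

import (
	"fmt"
	"net/http"
)

// Hypothetical callback types standing in for the real request and refresh logic.
type doFunc func(token string) (status int, err error)
type refreshFunc func() (token string, err error)

// executeWithRetry refreshes the token on a 401 regardless of how many
// retries remain, then reissues the request with the new token.
func executeWithRetry(token string, maxRetries int, do doFunc, refresh refreshFunc) (int, error) {
	attempt := 0
	refreshed := false
	for {
		status, err := do(token)
		if err != nil {
			return status, err
		}
		if status == http.StatusUnauthorized && !refreshed {
			newToken, rerr := refresh()
			if rerr != nil {
				return status, fmt.Errorf("token refresh failed: %w", rerr)
			}
			token, refreshed = newToken, true
			continue // not gated on attempt < maxRetries
		}
		if status >= 500 && attempt < maxRetries {
			attempt++
			continue
		}
		return status, nil
	}
}

func main() {
	do := func(token string) (int, error) {
		if token == "stale" {
			return http.StatusUnauthorized, nil
		}
		return http.StatusOK, nil
	}
	refresh := func() (string, error) { return "fresh", nil }

	// maxRetries is 0, yet the refreshed token is still used.
	status, err := executeWithRetry("stale", 0, do, refresh)
	fmt.Println(status, err)
}
```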
Previously, the GeminiModels handler unconditionally overwrote displayName
and description with the model name, losing the original values defined
in the model definitions (e.g., 'Gemini 3 Pro Preview').
Now these fields are set only as a fallback when they are missing or empty.
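
A minimal sketch of the fallback behavior, assuming the model entry is handled as a generic map. The helper name `applyNameFallbacks` is hypothetical.

```go
package main

import "fmt"

// applyNameFallbacks fills displayName and description with the model name
// only when the field is missing, not a string, or empty.
func applyNameFallbacks(model map[string]any, name string) {
	for _, key := range []string{"displayName", "description"} {
		if s, ok := model[key].(string); !ok || s == "" {
			model[key] = name
		}
	}
}

func main() {
	model := map[string]any{"displayName": "Gemini 3 Pro Preview"}
	applyNameFallbacks(model, "gemini-3-pro-preview")
	// displayName is preserved; description falls back to the model name.
	fmt.Println(model["displayName"], "|", model["description"])
}
```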
Add support for Imagen 3.0 and 4.0 image generation models in Vertex AI:
- Add 5 Imagen model definitions (4.0, 4.0-ultra, 4.0-fast, 3.0, 3.0-fast)
- Implement :predict action routing for Imagen models
- Convert Imagen request/response formats to match the Gemini structure used by models such as gemini-3-pro-image
- Transform prompts to Imagen's instances/parameters format
- Convert base64 image responses to Gemini-compatible inline data
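
The request and response conversion can be sketched as below. The struct and function names are hypothetical. The field names (`instances[].prompt`, `parameters.sampleCount`, `predictions[].bytesBase64Encoded`, `inlineData`) follow the public Vertex AI and Gemini JSON shapes as understood here, and the `image/png` default is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type imagenInstance struct {
	Prompt string `json:"prompt"`
}
type imagenParameters struct {
	SampleCount int `json:"sampleCount"`
}
type imagenRequest struct {
	Instances  []imagenInstance `json:"instances"`
	Parameters imagenParameters `json:"parameters"`
}
type imagenResponse struct {
	Predictions []struct {
		BytesBase64Encoded string `json:"bytesBase64Encoded"`
		MimeType           string `json:"mimeType"`
	} `json:"predictions"`
}

type inlineData struct {
	MimeType string `json:"mimeType"`
	Data     string `json:"data"`
}
type part struct {
	InlineData inlineData `json:"inlineData"`
}
type content struct {
	Role  string `json:"role"`
	Parts []part `json:"parts"`
}
type candidate struct {
	Content content `json:"content"`
}
type geminiResponse struct {
	Candidates []candidate `json:"candidates"`
}

// buildImagenRequest wraps a prompt in the instances/parameters body
// sent to the :predict action.
func buildImagenRequest(prompt string, sampleCount int) ([]byte, error) {
	if sampleCount < 1 {
		sampleCount = 1
	}
	return json.Marshal(imagenRequest{
		Instances:  []imagenInstance{{Prompt: prompt}},
		Parameters: imagenParameters{SampleCount: sampleCount},
	})
}

// imagenToGemini converts base64 predictions into Gemini-style inline data parts.
func imagenToGemini(raw []byte) ([]byte, error) {
	var in imagenResponse
	if err := json.Unmarshal(raw, &in); err != nil {
		return nil, fmt.Errorf("decode imagen response: %w", err)
	}
	parts := make([]part, 0, len(in.Predictions))
	for _, p := range in.Predictions {
		mime := p.MimeType
		if mime == "" {
			mime = "image/png" // assumed default
		}
		parts = append(parts, part{InlineData: inlineData{MimeType: mime, Data: p.BytesBase64Encoded}})
	}
	return json.Marshal(geminiResponse{
		Candidates: []candidate{{Content: content{Role: "model", Parts: parts}}},
	})
}

func main() {
	req, _ := buildImagenRequest("a red fox in the snow", 1)
	fmt.Println(string(req))
	resp, _ := imagenToGemini([]byte(`{"predictions":[{"bytesBase64Encoded":"QUJD","mimeType":"image/png"}]}`))
	fmt.Println(string(resp))
}
```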
Update ApplyThinking signature to accept fromFormat and toFormat parameters
instead of a single provider string. This enables:
- Proper level-to-budget conversion when source is level-based (openai/codex)
  and target is budget-based (gemini/claude)
- Strict budget range validation when source and target formats match
- Level clamping to nearest supported level for cross-format requests
- Format alias resolution in SDK translator registry for codex/openai-response
Also adds ErrBudgetOutOfRange error code and improves iflow config extraction
to fall back to openai format when iflow-specific config is not present.
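
The three behaviors listed above (level clamping, level-to-budget conversion, strict range validation) can be illustrated with the sketch below. The level ordering, the budget numbers, the helper names, and the choice of a sentinel `error` for `ErrBudgetOutOfRange` are all assumptions. The real `ApplyThinking` signature is not reproduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrBudgetOutOfRange is modeled as a sentinel error (assumed shape).
var ErrBudgetOutOfRange = errors.New("thinking budget out of range")

// Hypothetical level ordering and level-to-budget table.
var levelOrder = []string{"minimal", "low", "medium", "high"}
var levelBudgets = map[string]int{"minimal": 512, "low": 1024, "medium": 8192, "high": 24576}

type budgetRange struct{ Min, Max int }

func indexOf(levels []string, level string) int {
	for i, l := range levels {
		if l == level {
			return i
		}
	}
	return -1
}

// clampLevel picks the supported level closest to the requested one.
// On a tie, the level listed first in supported wins.
func clampLevel(level string, supported []string) (string, bool) {
	want := indexOf(levelOrder, level)
	if want < 0 {
		return "", false
	}
	best, bestDist := "", len(levelOrder)+1
	for _, s := range supported {
		i := indexOf(levelOrder, s)
		if i < 0 {
			continue
		}
		d := want - i
		if d < 0 {
			d = -d
		}
		if d < bestDist {
			best, bestDist = s, d
		}
	}
	return best, best != ""
}

// levelToBudget converts a level-based setting to a token budget.
func levelToBudget(level string) (int, bool) {
	b, ok := levelBudgets[level]
	return b, ok
}

// validateBudget enforces the strict range check used when the source and
// target formats match.
func validateBudget(budget int, r budgetRange) error {
	if budget < r.Min || budget > r.Max {
		return fmt.Errorf("%w: %d not in [%d, %d]", ErrBudgetOutOfRange, budget, r.Min, r.Max)
	}
	return nil
}

func main() {
	level, _ := clampLevel("minimal", []string{"low", "high"})
	budget, _ := levelToBudget(level)
	fmt.Println(level, budget)

	err := validateBudget(99999, budgetRange{Min: 128, Max: 32768})
	fmt.Println(errors.Is(err, ErrBudgetOutOfRange))
}
```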
- Introduced `payloadModelAliases` and `payloadModelCandidates` functions to support model aliases for improved flexibility.
- Updated rule matching logic to handle multiple model candidates.
- Refactored variable naming in executor to improve code clarity and consistency.
- Added logic to transform `inputResults` into structured JSON for improved processing.
- Removed redundant `safety_identifier` field in executor payload to streamline requests.
- Add x-amzn-kiro-agent-mode: vibe for non-IDC auth (Social, Builder ID);
  IDC auth continues to use "spec" mode
- Add x-amzn-codewhisperer-optout: true for all auth types;
  this opts out of data sharing for service improvement (privacy)
These changes align with other Kiro implementations (kiro.rs, KiroGate,
kiro-gateway, AIClient-2-API) and make requests more similar to real
Kiro IDE clients.
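
A small sketch of the header selection described above. The `kiroHeaders` helper is a hypothetical name. The header names and values are the ones listed in this message.

```go
package main

import (
	"fmt"
	"net/http"
)

// kiroHeaders returns the two headers for a request: agent mode depends on
// the auth type, while the opt-out header is set for every auth type.
func kiroHeaders(isIDC bool) http.Header {
	mode := "vibe" // Social, Builder ID
	if isIDC {
		mode = "spec"
	}
	h := make(http.Header)
	h.Set("x-amzn-kiro-agent-mode", mode)
	h.Set("x-amzn-codewhisperer-optout", "true")
	return h
}

func main() {
	for _, isIDC := range []bool{false, true} {
		h := kiroHeaders(isIDC)
		fmt.Println(isIDC, h.Get("x-amzn-kiro-agent-mode"), h.Get("x-amzn-codewhisperer-optout"))
	}
}
```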
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>