Refactor 401 error handling in both executeWithRetry and
executeStreamWithRetry to always attempt token refresh regardless of
remaining retry attempts. Previously, token refresh was only attempted
when retries remained, which could leave valid refreshed tokens unused.
Also add auth directory resolution in RefreshManager.Initialize to
properly resolve the base directory path before creating the token
repository.
- Add x-amzn-kiro-agent-mode: vibe for non-IDC auth (Social, Builder ID)
IDC auth continues to use "spec" mode
- Add x-amzn-codewhisperer-optout: true for all auth types
This opts out of data sharing for service improvement (privacy)
These changes align with other Kiro implementations (kiro.rs, KiroGate,
kiro-gateway, AIClient-2-API) and make requests more similar to real
Kiro IDE clients.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The Q endpoint was using `/` which caused all requests to fail with
400 or UnknownOperationException. Changed to `/generateAssistantResponse`
which is the correct path for the Q endpoint.
This fix restores the Q endpoint failover functionality.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add persistRefreshedAuth function to write refreshed tokens back to the
auth file after inline token refresh. This prevents repeated token
refreshes on every request when the token expires.
Changes:
- Add persistRefreshedAuth() to kiro_executor.go
- Call persist after all token refresh paths (401, 403, pre-request)
- Remove unused log import from sdk/auth/kiro.go
- **Thinking Support**:
- Enabled thinking support for all Kiro Claude models, including Haiku 4.5 and agentic variants.
- Updated `model_definitions.go` with thinking configuration (Min: 1024, Max: 32000, ZeroAllowed: true).
- Fixed `extended_thinking` field names in `model_registry.go` (from `min_budget`/`max_budget` to `min`/`max`) to comply with Claude API specs, enabling thinking control in clients like Claude Code.
- **Kiro Executor Fixes**:
- Fixed `budget_tokens` handling: explicitly disable thinking when budget is 0 or negative.
- Removed aggressive duplicate content filtering logic that caused truncation/data loss.
- Enhanced thinking tag parsing with `extractThinkingFromContent` to correctly handle interleaved thinking/text blocks.
- Added EOF handling to flush pending thinking tag characters, preventing data loss at stream end.
- **Performance**:
- Optimized Claude stream handler (v6.2) with reduced buffer size (4KB) and faster flush interval (50ms) to minimize latency and prevent timeouts.
## Changes Overview
This commit includes multiple improvements to the Kiro executor for better
stability, API compatibility, and code quality.
## Detailed Changes
### 1. Output Token Calculation Improvement (lines 317-330)
- Replace simple len(content)/4 estimation with tiktoken-based calculation
- Add fallback to character count estimation if tiktoken fails
- Improves token counting accuracy for usage tracking
### 2. Stream Handler Panic Recovery (lines 528-533)
- Add defer/recover block in streamToChannel goroutine
- Prevents single request crashes from affecting the entire service
### 3. Struct Field Reordering (lines 670-673)
- Reorder kiroToolResult struct fields: Content, Status, ToolUseID
- Ensures consistency with API expectations
### 4. Message Merging Function (lines 778-780, 2356-2483)
- Add mergeAdjacentMessages() to combine consecutive messages with same role
- Add helper functions: mergeMessageContent(), blockToMap(), createMergedMessage()
- Required by Kiro API which doesn't allow adjacent messages from same role
### 5. Empty Content Handling (lines 791-800)
- Add default content for empty history messages
- User messages with tool results: "Tool results provided."
- User messages without tool results: "Continue"
### 6. Assistant Last Message Handling (lines 811-830)
- Detect when last message is from assistant
- Create synthetic "Continue" user message to satisfy Kiro API requirements
- Kiro API requires currentMessage to be userInputMessage type
### 7. Duplicate Content Event Detection (lines 1650-1660)
- Track lastContentEvent to detect duplicate streaming events
- Skip redundant events to prevent duplicate content in responses
- Based on AIClient-2-API implementation for Kiro
### 8. Streaming Token Calculation Enhancement (lines 1785-1817)
- Add accumulatedContent buffer for streaming token calculation
- Use tiktoken for accurate output token counting during streaming
- Add fallback to character count estimation with proper logging
### 9. JSON Repair Enhancement (lines 2665-2818)
- Implement conservative JSON repair strategy
- First try to parse JSON directly - if valid, return unchanged
- Add bracket balancing detection and repair
- Only repair when necessary to avoid corrupting valid JSON
- Validate repaired JSON before returning
### 10. HELIOS_CHK Filtering Removal (lines 2500-2504, 3004-3039)
- Remove filterHeliosDebugInfo function
- Remove heliosDebugPattern regex
- HELIOS_CHK fields now handled by client-side processing
### 11. Comment Translation
- Translate Chinese comments to English for code consistency
- Affected areas: token calculation, buffer handling, message processing
- Add --kiro-aws-login flag for AWS Builder ID device code flow
- Add DoKiroAWSLogin function for AWS SSO OIDC authentication
- Complete Kiro integration with AWS, Google OAuth, and social auth
- Add kiro executor, translator, and SDK components
- Update browser support for Kiro authentication flows