- **Thinking Support**:
- Enabled thinking support for all Kiro Claude models, including Haiku 4.5 and agentic variants.
- Updated `model_definitions.go` with thinking configuration (Min: 1024, Max: 32000, ZeroAllowed: true).
- Fixed `extended_thinking` field names in `model_registry.go` (from `min_budget`/`max_budget` to `min`/`max`) to comply with Claude API specs, enabling thinking control in clients like Claude Code.
- **Kiro Executor Fixes**:
- Fixed `budget_tokens` handling: explicitly disable thinking when budget is 0 or negative.
- Removed aggressive duplicate content filtering logic that caused truncation/data loss.
- Enhanced thinking tag parsing with `extractThinkingFromContent` to correctly handle interleaved thinking/text blocks.
- Added EOF handling to flush pending thinking tag characters, preventing data loss at stream end.
- **Performance**:
- Optimized Claude stream handler (v6.2) with reduced buffer size (4KB) and faster flush interval (50ms) to minimize latency and prevent timeouts.
NormalizeThinkingModel now checks ModelSupportsThinking before removing
"-thinking" or "-thinking-<ver>", avoiding accidental parsing of model
names where the suffix is part of the official id (e.g., kimi-k2-thinking,
qwen3-235b-a22b-thinking-2507).
The registry adds ThinkingSupport metadata for several models and
propagates it via ModelInfo (e.g., kimi-k2-thinking, deepseek-r1,
qwen3-235b-a22b-thinking-2507, minimax-m2), enabling accurate detection
of thinking-capable models and correcting base model inference.
- Added support for parsing and normalizing dynamic thinking model suffixes.
- Centralized budget resolution across executors and payload helpers.
- Retired legacy Gemini-specific thinking handlers in favor of unified logic.
- Updated executors to use metadata-based thinking configuration.
- Added `ResolveOriginalModel` utility for resolving normalized upstream models using request metadata.
- Updated executors (Gemini, Codex, iFlow, OpenAI, Qwen) to incorporate upstream model resolution and substitute model values in payloads and request URLs.
- Ensured fallbacks handle cases with missing or malformed metadata to derive models robustly.
- Refactored upstream model resolution to dynamically incorporate metadata for selecting and normalizing models.
- Improved handling of thinking configurations and model overrides in executors.
- Removed hardcoded thinking model entries and migrated logic to metadata-based resolution.
- Updated payload mutations to always include the resolved model.
* feat(api): add comprehensive ampcode management endpoints
Add new REST API endpoints under /v0/management/ampcode for managing
ampcode configuration including upstream URL, API key, localhost
restriction, model mappings, and force model mappings settings.
- Move force-model-mappings from config_basic to config_lists
- Add GET/PUT/PATCH/DELETE endpoints for all ampcode settings
- Support model mapping CRUD with upsert (PATCH) capability
- Add comprehensive test coverage for all ampcode endpoints
* refactor(api): simplify request body parsing in ampcode handlers
* feat(logging): add upstream API request/response capture to streaming logs
* style(logging): remove redundant separator line from response section
* feat(antigravity): enforce thinking budget limits for Claude models
* refactor(logging): remove unused variable in `ensureAttempt` and redundant function call
---------
Co-authored-by: hkfires <10558748+hkfires@users.noreply.github.com>
- Add --kiro-aws-login flag for AWS Builder ID device code flow
- Add DoKiroAWSLogin function for AWS SSO OIDC authentication
- Complete Kiro integration with AWS, Google OAuth, and social auth
- Add kiro executor, translator, and SDK components
- Update browser support for Kiro authentication flows
Add complete GitHub Copilot support including:
- Device flow OAuth authentication via GitHub's official client ID
- Token management with automatic caching (25 min TTL)
- OpenAI-compatible API executor for api.githubcopilot.com
- 16 model definitions (GPT-5 variants, Claude variants, Gemini, Grok, Raptor)
- CLI login command via -github-copilot-login flag
- SDK authenticator and refresh registry integration
Enables users to authenticate with their GitHub Copilot subscription and
use it as a backend provider alongside existing providers.
- Added `ContextLength` field with a value of 200,000 to all applicable Claude model definitions.
- Standardized `MaxCompletionTokens` values across models for consistency and alignment.
Fixes an issue where Claude thinking models would return 400 errors when
the thinking.budget_tokens was greater than or equal to max_tokens.
Changes:
- Add MaxCompletionTokens: 128000 to all Claude thinking model definitions
- Add ensureMaxTokensForThinking() function in claude_executor.go that:
- Checks if thinking is enabled with a budget_tokens value
- Looks up the model's MaxCompletionTokens from the registry
- Ensures max_tokens is set to at least the model's MaxCompletionTokens
- Falls back to budget_tokens + 4000 buffer if registry lookup fails
This ensures Anthropic API constraint (max_tokens > thinking.budget_tokens)
is always satisfied when using extended thinking features.
Fixes: #339🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Added new `Gemini 3 Pro Image Preview` model with detailed metadata and configuration.
- Removed outdated `Claude Sonnet 4.5 Thinking` model definition for cleanup and relevance.