Commit Graph

55 Commits

Author SHA1 Message Date
Luis Pater
cf369d4684 Merge branch 'router-for-me:main' into main 2025-12-29 22:41:44 +08:00
Chén Mù
f39a460487 Merge pull request #761 from router-for-me/log
fix(logging): improve request/response capture
2025-12-28 16:13:10 -08:00
Luis Pater
ee171bc563 feat(api): add ManagementTokenRequester interface for management token request endpoints 2025-12-29 02:42:29 +08:00
hkfires
a95428f204 fix(handlers): preserve upstream response logs before duplicate detection 2025-12-28 22:35:36 +08:00
hkfires
3ca5fb1046 fix(handlers): match raw error text before JSON body for duplicate detection 2025-12-28 19:35:36 +08:00
hkfires
a091d12f4e fix(logging): improve request/response capture 2025-12-28 19:04:31 +08:00
Luis Pater
d473c952fb Merge branch 'main' into plus 2025-12-28 00:56:04 +08:00
hkfires
09455f9e85 fix(config): make streaming keepalive and retries ints 2025-12-27 20:56:47 +08:00
Luis Pater
9fe6a215e6 Merge branch 'router-for-me:main' into main 2025-12-26 05:10:19 +08:00
Thai Nguyen Hung
54f71aa273 fix(test): remove extra argument from ExecuteStreamWithAuthManager call 2025-12-25 21:55:35 +07:00
Luis Pater
8fee16aecd Merge PR #59: v6.6.50 (resolve conflicts) 2025-12-24 11:06:10 +08:00
hkfires
e76ba0ede9 feat(logging): implement request ID tracking and propagation 2025-12-24 08:32:17 +08:00
Luis Pater
e592a57458 Merge branch 'router-for-me:main' into main 2025-12-24 04:25:06 +08:00
Luis Pater
f413feec61 refactor(handlers): streamline error and data channel handling in streaming logic
Improved consistency across OpenAI, Claude, and Gemini handlers by replacing initial `select` statement with a `for` loop for better readability and error-handling robustness.
2025-12-24 04:07:24 +08:00
gwizz
5bf89dd757 fix: keep streaming defaults legacy-safe 2025-12-23 00:53:18 +11:00
gwizz
4442574e53 fix: stop streaming loop on context cancel 2025-12-23 00:37:55 +11:00
gwizz
71a6dffbb6 fix: improve streaming bootstrap and forwarding 2025-12-22 23:34:23 +11:00
Luis Pater
5f65dd5bb4 Merge branch 'router-for-me:main' into main 2025-12-22 16:58:26 +08:00
moxi
830fd8eac2 Fix responses-format handling for chat completions 2025-12-22 13:54:02 +08:00
Luis Pater
40d78908ed Merge branch 'router-for-me:main' into main 2025-12-20 04:50:39 +08:00
Luis Pater
8a5db02165 Fixed: #607
refactor(config): re-export internal configuration types for SDK consumers
2025-12-20 04:49:02 +08:00
Luis Pater
344066fd11 refactor(api): remove unused OpenAI compatibility provider logic
Simplify handler logic by removing OpenAI compatibility provider management, including related mutex handling and configuration updates.
2025-12-17 02:58:14 +08:00
Luis Pater
1efade8bdb Merge branch 'main' into plus 2025-12-17 02:50:14 +08:00
Luis Pater
670685139a fix(api): update route patterns to support wildcards for Gemini actions
Normalize action handling by accommodating wildcard patterns in route definitions for Gemini endpoints. Adjust `request.Action` parsing logic to correctly process routes with prefixed actions.
2025-12-17 01:17:02 +08:00
Luis Pater
52b6306388 feat(config): add support for model prefixes and prefix normalization
Refactor model management to include an optional `prefix` field for model credentials, enabling better namespace handling. Update affected configuration files, APIs, and handlers to support prefix normalization and routing. Remove unused OpenAI compatibility provider logic to simplify processing.
2025-12-17 01:07:26 +08:00
Luis Pater
59ac1a3f60 Merge branch 'main' into plus 2025-12-15 23:53:23 +08:00
hkfires
3bc489254b fix(api): prevent double logging for error responses
The WriteErrorResponse function now caches the error response body in the gin context.

The deferred request logger checks for this cached response. If an error response is found, it bypasses the standard response logging. This prevents scenarios where an error is logged twice or an empty payload log overwrites the original, more detailed error log.
2025-12-15 16:36:01 +08:00
hkfires
4c07ea41c3 feat(api): return structured JSON error responses
The API error handling is updated to return a structured JSON payload
instead of a plain text message. This provides more context and allows
clients to programmatically handle different error types.

The new error response has the following structure:
{
  "error": {
    "message": "...",
    "type": "..."
  }
}

The `type` field is determined by the HTTP status code, such as
`authentication_error`, `rate_limit_error`, or `server_error`.

If the underlying error message from an upstream service is already a
valid JSON string, it will be preserved and returned directly.

BREAKING CHANGE: API error responses are now in a structured JSON
format instead of plain text. Clients expecting plain text error
messages will need to be updated to parse the new JSON body.
2025-12-15 16:19:52 +08:00
Luis Pater
79033aee34 Merge branch 'main' into plus 2025-12-14 00:07:46 +08:00
Ravens2121
db80b20bc2 feat(kiro): enhance thinking support and fix truncation issues
- **Thinking Support**:
    - Enabled thinking support for all Kiro Claude models, including Haiku 4.5 and agentic variants.
    - Updated `model_definitions.go` with thinking configuration (Min: 1024, Max: 32000, ZeroAllowed: true).
    - Fixed `extended_thinking` field names in `model_registry.go` (from `min_budget`/`max_budget` to `min`/`max`) to comply with Claude API specs, enabling thinking control in clients like Claude Code.

- **Kiro Executor Fixes**:
    - Fixed `budget_tokens` handling: explicitly disable thinking when budget is 0 or negative.
    - Removed aggressive duplicate content filtering logic that caused truncation/data loss.
    - Enhanced thinking tag parsing with `extractThinkingFromContent` to correctly handle interleaved thinking/text blocks.
    - Added EOF handling to flush pending thinking tag characters, preventing data loss at stream end.

- **Performance**:
    - Optimized Claude stream handler (v6.2) with reduced buffer size (4KB) and faster flush interval (50ms) to minimize latency and prevent timeouts.
2025-12-13 03:57:13 +08:00
teeverc
5ab3032335 Update sdk/api/handlers/claude/code_handlers.go
thank you gemini

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
2025-12-12 00:26:01 -08:00
teeverc
1215c635a0 fix: flush Claude SSE chunks immediately to match OpenAI behavior
- Write each SSE chunk directly to c.Writer and flush immediately
- Remove buffered writer and ticker-based flushing that caused delayed output
- Add 500ms timeout case for consistency with OpenAI/Gemini handlers
- Clean up unused bufio import

This fixes the 'not streaming' issue where small responses were held
in the buffer until timeout/threshold was reached.

Amp-Thread-ID: https://ampcode.com/threads/T-019b1186-164e-740c-96ab-856f64ee6bee
Co-authored-by: Amp <amp@ampcode.com>
2025-12-12 00:14:19 -08:00
Luis Pater
35fdd7bc05 Merge branch 'router-for-me:main' into main 2025-12-12 08:54:36 +08:00
Luis Pater
6e2306a5f2 refactor(handlers): improve request logging and payload handling 2025-12-12 08:52:52 +08:00
Luis Pater
4ce7c61a17 Merge branch 'main' into plus 2025-12-11 21:33:49 +08:00
hkfires
88bdd25f06 fix(amp): set status on claude stream errors 2025-12-11 20:12:06 +08:00
Luis Pater
4360ed8a7b Merge branch 'router-for-me:main' into main 2025-12-11 03:17:55 +08:00
Luis Pater
423ce97665 feat(util): implement dynamic thinking suffix normalization and refactor budget resolution logic
- Added support for parsing and normalizing dynamic thinking model suffixes.
- Centralized budget resolution across executors and payload helpers.
- Retired legacy Gemini-specific thinking handlers in favor of unified logic.
- Updated executors to use metadata-based thinking configuration.
- Added `ResolveOriginalModel` utility for resolving normalized upstream models using request metadata.
- Updated executors (Gemini, Codex, iFlow, OpenAI, Qwen) to incorporate upstream model resolution and substitute model values in payloads and request URLs.
- Ensured fallbacks handle cases with missing or malformed metadata to derive models robustly.
- Refactored upstream model resolution to dynamically incorporate metadata for selecting and normalizing models.
- Improved handling of thinking configurations and model overrides in executors.
- Removed hardcoded thinking model entries and migrated logic to metadata-based resolution.
- Updated payload mutations to always include the resolved model.
2025-12-11 03:10:50 +08:00
Luis Pater
1fd1ccca17 Merge branch 'router-for-me:main' into main 2025-12-09 21:13:08 +08:00
hkfires
da23ddb061 fix(gemini): normalize model listing output 2025-12-09 17:34:15 +08:00
Mansi
02d8a1cfec feat(kiro): add AWS Builder ID authentication support
- Add --kiro-aws-login flag for AWS Builder ID device code flow
- Add DoKiroAWSLogin function for AWS SSO OIDC authentication
- Complete Kiro integration with AWS, Google OAuth, and social auth
- Add kiro executor, translator, and SDK components
- Update browser support for Kiro authentication flows
2025-12-05 22:46:24 +03:00
hkfires
6c17dbc4da style(amp): tidy whitespace in proxy module and tests 2025-11-26 18:57:26 +08:00
Luis Pater
a4a26d978e Fixed: #339
**feat(handlers, executor): add Gemini 3 Pro Preview support and refine Claude system instructions**

- Added support for the new "Gemini 3 Pro Preview" action in Gemini handlers, including detailed metadata and configuration.
- Removed redundant `cache_control` field from Claude system instructions for cleaner payload structure.
2025-11-26 11:42:57 +08:00
Luis Pater
506f1117dd **fix(handlers): refactor API response capture to append data safely**
- Introduced `appendAPIResponse` helper to preserve and append data to existing API responses.
- Ensured newline inclusion when appending, if necessary.
- Improved `nil` and data type checks for response handling.
- Updated middleware to skip request logging for `GET` requests.
2025-11-25 11:37:02 +08:00
Ben Vargas
70ee4e0aa0 chore: remove unused httpx sdk package 2025-11-19 21:17:52 -07:00
Ben Vargas
8193392bfe Add AMP fallback proxy and shared Gemini normalization
- add fallback handler that forwards Amp provider requests to ampcode.com when the provider isn’t configured locally
- wrap AMP provider routes with the fallback so requests always have a handler
- share Gemini thinking model normalization helper between core handlers and AMP fallback
2025-11-19 18:23:17 -07:00
Ben Vargas
9ad0f3f91e feat: Add Amp CLI integration with comprehensive documentation
Add full Amp CLI support to enable routing AI model requests through the proxy
while maintaining Amp-specific features like thread management, user info, and
telemetry. Includes complete documentation and pull bot configuration.

Features:
- Modular architecture with RouteModule interface for clean integration
- Reverse proxy for Amp management routes (thread/user/meta/ads/telemetry)
- Provider-specific route aliases (/api/provider/{provider}/*)
- Secret management with precedence: config > env > file
- 5-minute secret caching to reduce file I/O
- Automatic gzip decompression for responses
- Proper connection cleanup to prevent leaks
- Localhost-only restriction for management routes (configurable)
- CORS protection for management endpoints

Documentation:
- Complete setup guide (USING_WITH_FACTORY_AND_AMP.md)
- OAuth setup for OpenAI (ChatGPT Plus/Pro) and Anthropic (Claude Pro/Max)
- Factory CLI config examples with all model variants
- Amp CLI/IDE configuration examples
- tmux setup for remote server deployment
- Screenshots and diagrams

Configuration:
- Pull bot disabled for this repo (manual rebase workflow)
- Config fields: AmpUpstreamURL, AmpUpstreamAPIKey, AmpRestrictManagementToLocalhost
- Compatible with upstream DisableCooling and other features

Technical details:
- internal/api/modules/amp/: Complete Amp routing module
- sdk/api/httpx/: HTTP utilities for gzip/transport
- 94.6% test coverage with 34 comprehensive test cases
- Clean integration minimizes merge conflict risk

Security:
- Management routes restricted to localhost by default
- Configurable via amp-restrict-management-to-localhost
- Prevents drive-by browser attacks on user data

This provides a production-ready foundation for Amp CLI integration while
maintaining clean separation from upstream code for easy rebasing.

Amp-Thread-ID: https://ampcode.com/threads/T-9e2befc5-f969-41c6-890c-5b779d58cf18
2025-11-19 18:23:17 -07:00
TUGOhost
92f4278039 feat: add auto model resolution and model creation timestamp tracking
- Add 'created' field to model registry for tracking model creation time
- Implement GetFirstAvailableModel() to find the first available model by newest creation timestamp
- Add ResolveAutoModel() utility function to resolve "auto" model name to actual available model
- Update request handler to resolve "auto" model before processing requests
- Ensures automatic model selection when "auto" is specified as model name

This enables dynamic model selection based on availability and creation time, improving the user experience when no specific model is requested.
2025-11-11 20:30:09 +08:00
tobwen
e5ed2cba4a Add support for dynamic model providers
Implements functionality to parse model names with provider information in the format "provider://model" This allows dynamic provider selection rather than relying only on predefined mappings.

The change affects all execution methods to properly handle these dynamic model specifications while maintaining compatibility with the existing approach for standard model names.
2025-10-28 01:41:54 +01:00
Luis Pater
d225558dae feat: improve error handling with added status codes and headers
- Updated Execute methods to include enhanced error handling via `StatusCode` and `Headers` extraction.
- Introduced structured error responses for cooling down scenarios, providing additional metadata and retry suggestions.
- Refined quota management, allowing for differentiation between cool-down, disabled, and other block reasons.
- Improved model filtering logic based on client availability and suspension criteria.
2025-10-22 09:01:11 +08:00