When adjustedBudget < minBudget, the previous fix blindly set
max_tokens = budgetTokens+1, which could exceed MaxCompletionTokens.
Now: cap max_tokens at MaxCompletionTokens, recalculate the budget, and
disable thinking entirely if the constraints are unsatisfiable.
Add unit tests covering raise, clamp, disable, and no-op scenarios.
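
The decision order described above can be sketched roughly as follows. This is a minimal illustration, not the actual normalizeClaudeBudget code: the thinkingRequest type and the normalizeBudget function are hypothetical. The sketch assumes the API requires budget_tokens < max_tokens, and that a request whose thinking is disabled keeps its original max_tokens.

```go
package main

// thinkingRequest is a hypothetical, simplified view of the fields involved.
type thinkingRequest struct {
	MaxTokens    int
	BudgetTokens int
	Thinking     bool
}

// normalizeBudget sketches the order of decisions from the commit message.
// Assumption: the API requires BudgetTokens < MaxTokens.
func normalizeBudget(req thinkingRequest, minBudget, maxCompletionTokens int) thinkingRequest {
	// No-op: thinking is off, or the budget already fits under max_tokens.
	if !req.Thinking || req.BudgetTokens < req.MaxTokens {
		return req
	}

	// First choice: lower the budget so it fits under the current max_tokens.
	adjusted := req.MaxTokens - 1
	if adjusted >= minBudget {
		req.BudgetTokens = adjusted
		return req
	}

	// The lowered budget would fall below the model minimum, so raise
	// max_tokens instead, as long as the model limit allows it.
	needed := req.BudgetTokens + 1
	if needed <= maxCompletionTokens {
		req.MaxTokens = needed
		return req
	}

	// Cap max_tokens at the model limit and recalculate the budget.
	budget := maxCompletionTokens - 1
	if budget < minBudget {
		// Unsatisfiable: no budget is both >= minBudget and < the capped
		// max_tokens, so disable thinking and leave max_tokens untouched.
		req.Thinking = false
		req.BudgetTokens = 0
		return req
	}
	req.MaxTokens = maxCompletionTokens
	req.BudgetTokens = budget
	return req
}
```

For example, with minBudget = 1024 and maxCompletionTokens = 1000, a request with max_tokens = 512 and budget_tokens = 2000 cannot be satisfied, so thinking is disabled rather than sending a request the API would reject.
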
1. Always include interleaved-thinking-2025-05-14 beta header so that
thinking blocks are returned correctly for all Claude models.
2. Remove status-code guard in AMP reverse proxy ModifyResponse so that
error responses (4xx/5xx) with hidden gzip encoding are decoded
properly. This prevents garbled error messages from reaching the client.
3. In normalizeClaudeBudget, when the adjusted budget falls below the
model minimum, set max_tokens = budgetTokens+1 instead of leaving
the request unchanged (which causes a 400 from the API).
Add support for Claude's "adaptive" and "auto" thinking modes using `output_config.effort`, and for the new "max" effort level in adaptive thinking. Update the thinking logic, validate model capabilities, and extend the converters for compatibility with adaptive modes. Record the supported effort levels in the static model data and adjust the translators and executors to match.
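
As a rough illustration of the capability check, the sketch below validates a requested effort level against a model's supported levels before writing it into the request. The helper name applyAdaptiveThinking, the level list passed by the caller, and the exact payload layout (a `thinking` object with `type: "adaptive"` next to `output_config.effort`) are assumptions made for this example, not code from this change.

```go
package main

import "fmt"

// applyAdaptiveThinking is a hypothetical helper. It sets adaptive thinking
// on a request payload only if the model supports the requested effort level.
// Assumed payload layout: {"thinking": {"type": "adaptive"},
// "output_config": {"effort": "<level>"}}.
func applyAdaptiveThinking(payload map[string]any, effort string, supported []string) error {
	ok := false
	for _, level := range supported {
		if level == effort {
			ok = true
			break
		}
	}
	if !ok {
		// Leave the payload untouched when the level is not supported.
		return fmt.Errorf("effort level %q is not supported by this model", effort)
	}

	// Preserve any other output_config keys the caller already set.
	cfg, isMap := payload["output_config"].(map[string]any)
	if !isMap {
		cfg = map[string]any{}
	}
	cfg["effort"] = effort
	payload["output_config"] = cfg
	payload["thinking"] = map[string]any{"type": "adaptive"}
	return nil
}
```

A caller would pass the supported levels from the static model data, for example `[]string{"low", "medium", "high", "max"}` for a model that accepts "max", and fall back to another thinking configuration when the function returns an error.
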