When an assistant message appears after tool results without a pending
user message, append it to the last turn's assistant text instead of
dropping it. Also add bakeToolResultsIntoTurns() to merge tool results
into turn context when no active H2 session exists to resume, so that
the model sees the full tool interaction history in follow-up requests.
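
A minimal sketch of the append-instead-of-drop rule described above. The
Message and Turn types and the buildTurns helper are hypothetical names
chosen for illustration, not the types used in this change; tool messages
are simply skipped here.

```go
package main

// Message is a hypothetical flattened chat message (assumption).
type Message struct {
	Role string // "user", "assistant", or "tool"
	Text string
}

// Turn is a hypothetical user/assistant pair (assumption).
type Turn struct {
	User      string
	Assistant string
}

// buildTurns pairs user and assistant messages into turns. An assistant
// message that arrives with no pending user message (for example, after
// tool results) is appended to the last turn instead of being dropped.
func buildTurns(msgs []Message) []Turn {
	var turns []Turn
	pendingUser, hasPending := "", false
	for _, m := range msgs {
		switch m.Role {
		case "user":
			if hasPending {
				// Previous user message never got a reply; keep it anyway.
				turns = append(turns, Turn{User: pendingUser})
			}
			pendingUser, hasPending = m.Text, true
		case "assistant":
			switch {
			case hasPending:
				turns = append(turns, Turn{User: pendingUser, Assistant: m.Text})
				hasPending = false
			case len(turns) > 0:
				// No pending user message: extend the last turn's text.
				last := &turns[len(turns)-1]
				if last.Assistant != "" {
					last.Assistant += "\n"
				}
				last.Assistant += m.Text
			default:
				// No earlier turn exists; keep the text in a turn of its own.
				turns = append(turns, Turn{Assistant: m.Text})
			}
		}
	}
	if hasPending {
		turns = append(turns, Turn{User: pendingUser})
	}
	return turns
}
```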
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rewrite the tool call mechanism from interrupt-resume to inline-wait mode:
processH2SessionFrames no longer exits on mcpArgs; instead it blocks on
toolResultCh while continuing to handle KV/heartbeat messages, then
sends the MCP result and continues processing text in the same goroutine.
Fixes the issue where the server stopped generating text after resume.
- Add a switchable output channel (outMu/currentOut) so the first HTTP
response closes after tool_calls+[DONE], and resumed text goes to a new
channel returned by resumeWithToolResults. Reset streamParam on switch so
the Translator produces fresh message_start/content_block_start events.
- Implement send-side H2 flow control: track the server's initial window
size and WINDOW_UPDATE increments; Write() blocks when the window is
exhausted. Fixes RST_STREAM FLOW_CONTROL_ERROR on large requests (178KB+).
- Decode new InteractionUpdate fields: TurnEndedUpdate (field 14) as the
stream termination signal, HeartbeatUpdate (field 13), which is silently
ignored, and TokenDeltaUpdate (field 8) for token usage tracking.
- Include token usage in the final stop chunk (prompt_tokens estimated from
payload size, completion_tokens from accumulated TokenDeltaUpdate deltas)
so the Claude CLI status bar shows non-zero token counts.
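
A rough sketch of the send-side flow control item above, assuming a
condition-variable design. sendWindow and its methods are hypothetical
names for illustration and are not the code in this change: a writer
reserves bytes before sending DATA, and the frame reader credits the
window whenever a WINDOW_UPDATE arrives.

```go
package main

import (
	"errors"
	"sync"
)

// sendWindow tracks how many bytes the peer currently allows us to send.
// Hypothetical helper (assumption), not the type used in the commit.
type sendWindow struct {
	mu     sync.Mutex
	cond   *sync.Cond
	avail  int64
	closed bool
}

// newSendWindow starts from the peer's initial window size.
func newSendWindow(initial int64) *sendWindow {
	w := &sendWindow{avail: initial}
	w.cond = sync.NewCond(&w.mu)
	return w
}

// Add credits the window, as a WINDOW_UPDATE increment would.
func (w *sendWindow) Add(n int64) {
	w.mu.Lock()
	w.avail += n
	w.mu.Unlock()
	w.cond.Broadcast()
}

// Close wakes any blocked writer so it can fail instead of hanging.
func (w *sendWindow) Close() {
	w.mu.Lock()
	w.closed = true
	w.mu.Unlock()
	w.cond.Broadcast()
}

// Acquire blocks while the window is exhausted, then reserves up to want
// bytes and returns how many were reserved. The caller sends that many
// bytes and calls Acquire again for the remainder.
func (w *sendWindow) Acquire(want int64) (int64, error) {
	w.mu.Lock()
	defer w.mu.Unlock()
	for w.avail <= 0 && !w.closed {
		w.cond.Wait()
	}
	if w.closed {
		return 0, errors.New("send window closed")
	}
	n := want
	if n > w.avail {
		n = w.avail
	}
	w.avail -= n
	return n, nil
}
```

In this sketch a Write() loop would call Acquire for the remaining payload
length, emit one DATA frame of the returned size, and repeat until the
payload is fully sent.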
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>