Cursor executor errors were plain fmt.Errorf values, so the conductor
couldn't extract HTTP status codes and exhausted accounts never entered
cooldown.
Changes:
- Add ConnectError struct to proto/connect.go: ParseConnectEndStream now
returns *ConnectError with Code/Message fields for precise matching
- Add cursorStatusErr implementing StatusError + RetryAfter interfaces
- Add classifyCursorError() with two-layer classification:
    Layer 1: exact match on ConnectError.Code (gRPC standard codes)
      resource_exhausted → 429, unauthenticated → 401,
      permission_denied → 403, unavailable → 503, internal → 500
    Layer 2: fuzzy string match for H2 errors (RST_STREAM → 502)
- Log all ConnectError code/message pairs for observing real server
error codes (we have no samples yet)
- Wrap Execute and ExecuteStream error returns with classifyCursorError
The conductor now puts Cursor auths into cooldown on quota errors,
enabling exponential backoff and round-robin failover.
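
The two-layer classification described above can be sketched roughly as
follows. This is an illustration, not the committed code: the function name
classifyCursorErrorSketch, the statusErr type, and its StatusCode() method
are hypothetical stand-ins (the real StatusError interface is not shown in
this message); only the ConnectError Code/Message fields and the code-to-status
table come from the text above.

```go
package main

import (
	"errors"
	"net/http"
	"strings"
)

// ConnectError carries the Code/Message fields named in the commit message.
type ConnectError struct {
	Code    string
	Message string
}

func (e *ConnectError) Error() string { return e.Code + ": " + e.Message }

// statusErr is a hypothetical stand-in for cursorStatusErr.
type statusErr struct {
	status int
	cause  error
}

func (e *statusErr) Error() string   { return e.cause.Error() }
func (e *statusErr) Unwrap() error   { return e.cause }
func (e *statusErr) StatusCode() int { return e.status } // assumed method name

// codeToStatus is the Layer 1 table from the commit message.
var codeToStatus = map[string]int{
	"resource_exhausted": http.StatusTooManyRequests,
	"unauthenticated":    http.StatusUnauthorized,
	"permission_denied":  http.StatusForbidden,
	"unavailable":        http.StatusServiceUnavailable,
	"internal":           http.StatusInternalServerError,
}

// classifyCursorErrorSketch wraps err with an HTTP status when one can be
// inferred, and returns err unchanged otherwise.
func classifyCursorErrorSketch(err error) error {
	if err == nil {
		return nil
	}
	// Layer 1: exact match on the Connect error code.
	var ce *ConnectError
	if errors.As(err, &ce) {
		if status, ok := codeToStatus[ce.Code]; ok {
			return &statusErr{status: status, cause: err}
		}
	}
	// Layer 2: fuzzy match on the error text for H2-level failures.
	if strings.Contains(err.Error(), "RST_STREAM") {
		return &statusErr{status: http.StatusBadGateway, cause: err}
	}
	return err
}
```

Keeping the original error reachable through Unwrap means callers can still
use errors.As to recover the ConnectError after classification.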
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a Cursor account's quota is exhausted, sessions bound to it can now
seamlessly continue on a different account:
Layer 1 — Checkpoint decoupling:
Key checkpoints by conversationId (not authID:conversationId) and store
authID inside savedCheckpoint. On lookup, if the auth has changed, discard
the stale checkpoint and flatten the conversation history into userText.
Layer 2 — Cross-account session cleanup:
When a request arrives for a conversation whose session belongs to a
different (now-exhausted) auth, close the old H2 stream and remove
the stale session to free resources.
Layer 3 — H2Stream.Err() exposure:
New Err() method on H2Stream so callers can inspect RST_STREAM,
GOAWAY, or other stream-level errors after closure.
Layer 4 — processH2SessionFrames error propagation:
Returns an error instead of a bare return. Connect EndStream errors
(quota, rate limit) are now propagated instead of being logged and
swallowed.
Layer 5 — Pre-response transparent retry:
If the stream fails before any data is sent to the client, return an
error to the conductor so it retries with a different auth — fully
transparent to the client.
Layer 6 — Post-response error logging:
If the stream fails after data was already sent, log a warning. The
conductor's existing cooldown mechanism ensures the next request routes
to a healthy account.
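
As an illustration of Layer 1 (checkpoint decoupling), a minimal sketch is
given below. It assumes an in-memory map guarded by a mutex; the type
checkpointStore and its methods are hypothetical names, and only the
savedCheckpoint/authID/conversationId terms come from the description above.

```go
package main

import "sync"

// savedCheckpoint stores the auth that produced the checkpoint next to
// the opaque checkpoint bytes.
type savedCheckpoint struct {
	authID string
	data   []byte
}

// checkpointStore is a hypothetical container keyed by conversationId only.
type checkpointStore struct {
	mu    sync.Mutex
	items map[string]savedCheckpoint
}

func newCheckpointStore() *checkpointStore {
	return &checkpointStore{items: make(map[string]savedCheckpoint)}
}

func (s *checkpointStore) save(conversationID, authID string, data []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.items[conversationID] = savedCheckpoint{authID: authID, data: data}
}

// lookup returns the checkpoint only when it was saved under the same
// auth. A checkpoint from a different auth is treated as stale and
// deleted, so the caller falls back to flattening history into userText.
func (s *checkpointStore) lookup(conversationID, authID string) ([]byte, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	cp, ok := s.items[conversationID]
	if !ok {
		return nil, false
	}
	if cp.authID != authID {
		delete(s.items, conversationID)
		return nil, false
	}
	return cp.data, true
}
```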
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add cursor/filename.go for multi-account credential file naming
- Include auth.ID in session and checkpoint keys for per-account isolation
- Record authID in cursorSession, validate on resume to prevent cross-account access
- Management API /cursor-auth-url supports ?label= for creating named accounts
- Leverages existing conductor round-robin + failover framework
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Capture conversation_checkpoint_update from Cursor server (was ignored)
- Store checkpoint per conversationId, replay as conversation_state on next request
- Use protowire to embed raw checkpoint bytes directly (no deserialization)
- Extract session_id from Claude Code metadata for stable conversationId across resume
- Flatten conversation history into userText as fallback when no checkpoint available
- Use conversationId as session key for reliable tool call resume
- Add checkpoint TTL cleanup (30min)
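
The 30-minute TTL cleanup could look roughly like the sketch below. The
names ttlStore, checkpointEntry, and sketchEpoch are hypothetical, and the
current time is passed in explicitly so the behaviour is easy to check; a
real implementation would more likely call cleanup from a ticker goroutine.

```go
package main

import (
	"sync"
	"time"
)

// checkpointTTL matches the 30min figure from the commit message.
const checkpointTTL = 30 * time.Minute

// sketchEpoch is a fixed reference time used only for trying the sketch out.
var sketchEpoch = time.Unix(0, 0)

type checkpointEntry struct {
	data    []byte
	savedAt time.Time
}

// ttlStore is a hypothetical per-conversationId checkpoint store.
type ttlStore struct {
	mu      sync.Mutex
	entries map[string]checkpointEntry
}

func newTTLStore() *ttlStore {
	return &ttlStore{entries: make(map[string]checkpointEntry)}
}

func (s *ttlStore) put(conversationID string, data []byte, now time.Time) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.entries[conversationID] = checkpointEntry{data: data, savedAt: now}
}

// cleanup removes entries older than checkpointTTL and reports how many
// were dropped.
func (s *ttlStore) cleanup(now time.Time) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	removed := 0
	for id, e := range s.entries {
		if now.Sub(e.savedAt) > checkpointTTL {
			delete(s.entries, id)
			removed++
		}
	}
	return removed
}

func (s *ttlStore) size() int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return len(s.entries)
}
```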
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract the cch hash from Claude Code's billing header in the system
prompt (x-anthropic-billing-header: ...cch=XXXXX;) and use it to derive
a deterministic conversation_id instead of generating a random UUID.
Same Claude Code session → same cch → same conversation_id → Cursor
server can reuse conversation state across multiple turns, preserving
tool call results and other context without re-encoding history.
Also cleans up temporary debug logging from previous iterations.
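
A minimal sketch of the cch extraction and deterministic ID derivation is
shown below. The regular expression is inferred from the header shape quoted
above (cch=XXXXX;), and the derivation (SHA-256 of a prefixed cch, formatted
in a UUID-like layout) is an assumption; the commit does not state how the
conversation_id is actually computed. Function names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"regexp"
)

// cchPattern matches "cch=<value>;" as it appears in the billing header.
var cchPattern = regexp.MustCompile(`cch=([^;\s]+);`)

// extractCCH returns the cch value found in the system prompt, if any.
func extractCCH(systemPrompt string) (string, bool) {
	m := cchPattern.FindStringSubmatch(systemPrompt)
	if m == nil {
		return "", false
	}
	return m[1], true
}

// conversationIDFromCCH maps a cch value to a stable, UUID-shaped string.
// The "cursor-conversation:" prefix is an arbitrary namespace chosen for
// this sketch.
func conversationIDFromCCH(cch string) string {
	sum := sha256.Sum256([]byte("cursor-conversation:" + cch))
	return fmt.Sprintf("%x-%x-%x-%x-%x",
		sum[0:4], sum[4:6], sum[6:8], sum[8:10], sum[10:16])
}
```

When no cch is present, a caller would presumably fall back to generating a
random ID as before.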
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Include system prompt prefix (first 200 chars) in session key derivation.
Claude Code sessions have unique system prompts containing cwd, session_id,
file paths, etc., making collisions between concurrent sessions from the
same user virtually impossible.
Session key now = SHA256(apiKey + model + systemPrompt[:200] + firstUserMsg)
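
The formula above can be sketched as follows. Two details are assumptions
of this sketch rather than facts from the commit: the prompt is truncated by
bytes with a length guard (a bare systemPrompt[:200] would panic on shorter
prompts), and a zero byte is written between fields so that adjacent fields
cannot run together. The function name deriveSessionKey is hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
)

// deriveSessionKey hashes the API key, model, a system prompt prefix, and
// the first user message into a hex-encoded SHA-256 digest.
func deriveSessionKey(apiKey, model, systemPrompt, firstUserMsg string) string {
	prefix := systemPrompt
	if len(prefix) > 200 {
		prefix = prefix[:200] // byte-based truncation, guarded against short prompts
	}
	h := sha256.New()
	for _, part := range []string{apiKey, model, prefix, firstUserMsg} {
		h.Write([]byte(part))
		h.Write([]byte{0}) // field separator (assumption of this sketch)
	}
	return hex.EncodeToString(h.Sum(nil))
}
```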
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Include client API key in session key derivation to prevent different
users sharing the same proxy from accidentally resuming each other's
H2 streams when they send identical first messages with the same model.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When an assistant message appears after tool results without a pending
user message, append it to the last turn's assistant text instead of
dropping it. Also add bakeToolResultsIntoTurns() to merge tool results
into turn context when no active H2 session exists for resume, ensuring
the model sees the full tool interaction history in follow-up requests.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Rewrite tool call mechanism from interrupt-resume to inline-wait mode:
processH2SessionFrames no longer exits on mcpArgs; instead it blocks on
toolResultCh while continuing to handle KV/heartbeat messages, then
sends the MCP result and continues processing text in the same goroutine.
Fixes the issue where the server stopped generating text after resume.
- Add switchable output channel (outMu/currentOut) so the first HTTP response
closes after tool_calls+[DONE], and resumed text goes to a new channel
returned by resumeWithToolResults. Reset streamParam on switch so the
Translator produces fresh message_start/content_block_start events.
- Implement send-side H2 flow control: track the server's initial window size
and WINDOW_UPDATE increments; Write() blocks when the window is exhausted.
Fixes RST_STREAM FLOW_CONTROL_ERROR on large requests (178KB+).
- Decode new InteractionUpdate fields: TurnEndedUpdate (field 14) as
stream termination signal, HeartbeatUpdate (field 13) silently ignored,
TokenDeltaUpdate (field 8) for token usage tracking.
- Include token usage in the final stop chunk (prompt_tokens estimated from
payload size, completion_tokens from accumulated TokenDeltaUpdate deltas)
so the Claude CLI status bar shows non-zero token counts.
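
The send-side flow control item above can be illustrated with a small
sketch built on sync.Cond. It is a simplification under stated assumptions:
it tracks a single window, whereas HTTP/2 has both a connection-level and a
per-stream window, and it ignores mid-connection SETTINGS changes. The
sendWindow type and its method names are hypothetical.

```go
package main

import "sync"

// sendWindow tracks how many bytes the peer currently allows us to send.
type sendWindow struct {
	mu        sync.Mutex
	cond      *sync.Cond
	available int64
}

// newSendWindow starts with the peer's initial window size.
func newSendWindow(initial int64) *sendWindow {
	w := &sendWindow{available: initial}
	w.cond = sync.NewCond(&w.mu)
	return w
}

// reserve blocks while the window is exhausted, then claims up to want
// bytes and returns how many were granted. A writer would send that many
// bytes in a DATA frame and call reserve again for the remainder.
func (w *sendWindow) reserve(want int64) int64 {
	w.mu.Lock()
	defer w.mu.Unlock()
	for w.available <= 0 {
		w.cond.Wait()
	}
	n := want
	if n > w.available {
		n = w.available
	}
	w.available -= n
	return n
}

// add applies a WINDOW_UPDATE increment and wakes any blocked writers.
func (w *sendWindow) add(increment int64) {
	w.mu.Lock()
	w.available += increment
	w.mu.Unlock()
	w.cond.Broadcast()
}
```

Granting partial reservations lets a large request be split across several
DATA frames instead of waiting for the window to cover the whole payload.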
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>