fix: add "audio" to openai provider capabilities

The openai provider implements transcribeAudio via
transcribeOpenAiCompatibleAudio (Whisper API), but its capabilities
array only declared ["image"]. This caused the media-understanding
runner to skip the openai provider when processing inbound audio
messages, resulting in raw audio files being passed to agents
instead of transcribed text.

Fix: Add "audio" to the capabilities array so the runner correctly
selects the openai provider for audio transcription.

Co-authored-by: Cursor <cursoragent@cursor.com>
Author: openjay
Date: 2026-02-09 22:51:19 +08:00
Committed-by: Peter Steinberger
Commit: 76d6514ff5 (parent 6a425d189e)

@@ -4,7 +4,7 @@ import { transcribeOpenAiCompatibleAudio } from "./audio.js";
 export const openaiProvider: MediaUnderstandingProvider = {
   id: "openai",
-  capabilities: ["image"],
+  capabilities: ["image", "audio"],
   describeImage: describeImageWithModel,
   transcribeAudio: transcribeOpenAiCompatibleAudio,
 };
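
The selection behavior the commit message describes can be sketched as follows. This is a hypothetical simplification, not the actual runner code: the `MediaCapability` type, the `selectProvider` helper, and the stripped-down `MediaUnderstandingProvider` shape are assumptions made for illustration, capturing only that the runner matches on the declared `capabilities` array rather than on which methods a provider implements.

```typescript
// Hypothetical sketch of capability-based provider selection.
type MediaCapability = "image" | "audio";

interface MediaUnderstandingProvider {
  id: string;
  capabilities: MediaCapability[];
}

function selectProvider(
  providers: MediaUnderstandingProvider[],
  needed: MediaCapability,
): MediaUnderstandingProvider | undefined {
  // A provider is eligible only if it declares the capability;
  // implementing transcribeAudio alone is not enough.
  return providers.find((p) => p.capabilities.includes(needed));
}

// Before the fix: only ["image"] declared, so audio lookups miss the provider.
const before = selectProvider(
  [{ id: "openai", capabilities: ["image"] }],
  "audio",
);

// After the fix: ["image", "audio"] makes the provider eligible for audio.
const after = selectProvider(
  [{ id: "openai", capabilities: ["image", "audio"] }],
  "audio",
);

console.log(before?.id, after?.id); // undefined "openai"
```

With the capability missing, `selectProvider` returns `undefined` and the runner falls through, which is why inbound audio reached agents untranscribed.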