Conversation

@akshatvishu akshatvishu commented Nov 4, 2025

Closes #8996

Description:

The dspy.Audio.from_file (and from_url) method relies on Python's mimetypes.guess_type() to determine the audio format. On some operating systems, this function can return non-standard MIME types, such as audio/x-wav for .wav files.

These non-standard format strings, often prefixed with x- (like x-wav or x-m4a), are then passed to the LLM API (e.g., OpenAI). This can cause a 400 BadRequestError, as the API typically only accepts compliant formats (e.g., wav, m4a).

This patch adds a check to from_file, from_url, and the data URI branch of encode_audio to normalize these formats by removing any x- prefix, ensuring an API-compliant format is always sent.
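A minimal sketch of the normalization described above, not the actual diff. The helper names `normalize_audio_format`, `format_from_mime_type`, and `format_from_data_uri` are hypothetical and used only for illustration; the real change lives inside `from_file`, `from_url`, and `encode_audio`.

```python
import mimetypes


def normalize_audio_format(audio_format: str) -> str:
    """Strip a leading 'x-' so 'x-wav' becomes 'wav' (hypothetical helper)."""
    return audio_format.removeprefix("x-")  # requires Python 3.9+


def format_from_mime_type(mime_type: str) -> str:
    """Extract and normalize the subtype of an audio MIME type."""
    if not mime_type.startswith("audio/"):
        raise ValueError(f"Unsupported MIME type for audio: {mime_type}")
    subtype = mime_type.split("/", 1)[1]
    return normalize_audio_format(subtype)


def format_from_path(path: str) -> str:
    """Guess the format from a file name; the guessed type is platform-dependent."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None:
        raise ValueError(f"Could not guess a MIME type for: {path}")
    return format_from_mime_type(mime_type)


def format_from_data_uri(data_uri: str) -> str:
    """Read the format from a URI such as 'data:audio/x-wav;base64,...'."""
    header, _, _ = data_uri.partition(",")
    mime_type = header.removeprefix("data:").split(";", 1)[0]
    return format_from_mime_type(mime_type)
```

With this sketch, `format_from_mime_type("audio/x-wav")` and `format_from_mime_type("audio/wav")` both yield `wav`, so the value sent to the API no longer depends on which MIME table the operating system provides.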

@TomeHirata
Collaborator

Thanks @akshatvishu, can you add a unit test?

@akshatvishu
Author

akshatvishu commented Nov 6, 2025

@TomeHirata Added a unit test. I also changed the logic slightly to use removeprefix() instead of replace(), so that only a leading "x-" is stripped from the audio format string and any "x-" that appears elsewhere in the format is left untouched.
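To illustrate the difference, here is a small sketch. The format string `x-flac-x-test` is made up for the example and is not a real MIME subtype.

```python
# Hypothetical format string containing "x-" both as a prefix and in the middle.
fmt = "x-flac-x-test"

# replace() removes every occurrence, including the one in the middle.
replaced = fmt.replace("x-", "")

# removeprefix() (Python 3.9+) removes only the leading occurrence.
stripped = fmt.removeprefix("x-")

print(replaced)  # flac-test
print(stripped)  # flac-x-test

# Strings without the prefix pass through unchanged.
print("wav".removeprefix("x-"))  # wav
```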

Development

Successfully merging this pull request may close these issues.

[Bug] LiteLLM exception when using OpenAI Speech-To-Text models