jarodise commented Jan 8, 2026

This PR integrates full support for the Chatterbox TTS model, including backend enhancements and a complete UI overhaul for voice cloning features.

Changes

1. Server API

  • Added exaggeration (emotion control) and cfg_weight (guidance) to the schema.
  • Added proper handling for base64 ref_audio data (decodes to a temp file).
  • Fixed the default lang_code (set to 'en' for Chatterbox) to resolve pronunciation issues.

2. Web UI

  • Added conditional controls that only appear for Chatterbox models:
    • Reference Audio Upload: File picker for voice cloning.
    • Language Selector: Supports 23 languages (with English as default).
    • Sliders: Controls for Emotion Exaggeration and Guidance Weight.
  • Audio Player: Now displays the uploaded filename instead of the default voice name.
  • Download Fix: Fixed issue where downloads failed by properly handling audio blobs.
  • Improved UX: Added 5-minute timeout for slower generation and removed confusing static labels.

3. Documentation

  • Updated MLX_AUDIO_GUIDE.md with a comprehensive 'Web UI Usage' section.

This integration allows users to fully utilize Chatterbox's voice cloning and multilingual capabilities directly from the web interface.
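
As a rough illustration of the server-side changes listed under "Server API", the sketch below shows one way a request could carry the new parameters and how a base64 ref_audio value might be decoded to a temp file. Only the field names exaggeration, cfg_weight, ref_audio, and lang_code come from this PR; the SpeechRequest class, the resolve_ref_audio helper, the default values of 0.5, the data-URL handling, and the path fallback are all assumptions made for illustration, not the code in this PR.

```python
import base64
import binascii
import os
import tempfile
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpeechRequest:
    """Hypothetical request shape; only the four field names below come from the PR."""
    text: str
    lang_code: str = "en"             # the PR sets 'en' as the default for Chatterbox
    exaggeration: float = 0.5         # emotion control; default value is an assumption
    cfg_weight: float = 0.5           # guidance weight; default value is an assumption
    ref_audio: Optional[str] = None   # assumed: a file path or base64-encoded audio


def resolve_ref_audio(ref_audio: Optional[str], suffix: str = ".wav") -> Optional[str]:
    """Return a filesystem path for ref_audio, decoding base64 to a temp file if needed."""
    if ref_audio is None:
        return None
    # Assumption: a value that names an existing file is used as-is.
    if os.path.exists(ref_audio):
        return ref_audio
    # Assumption: browsers may send a data URL such as "data:audio/wav;base64,AAAA...".
    if ref_audio.startswith("data:") and "," in ref_audio:
        ref_audio = ref_audio.split(",", 1)[1]
    try:
        raw = base64.b64decode(ref_audio, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("ref_audio is neither an existing path nor valid base64") from exc
    # delete=False keeps the file on disk after the handle closes;
    # the caller is responsible for removing it once generation is done.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(raw)
        return tmp.name
```

In a sketch like this, the caller would pass the returned path to the model and delete the temp file afterwards, for example in a try/finally block around generation.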

- server: Add exaggeration and cfg_weight params
- server: Handle base64 ref_audio decoding
- ui: Add conditional controls for Chatterbox (Ref Audio, Language, Sliders)
- ui: Fix download button and audio blob handling
- ui: Set correct lang_code for Chatterbox
- ui: Show reference filename in audio player instead of default voice name
- ui: Fix download button to correctly save blob as MP3
- ui: Remove confusing 'English-detected' static dropdown
Owner

Blaizzy commented Jan 8, 2026

Hey @jarodise, thanks for the awesome contribution!
The server changes look great. For the guide and hardcoded frontend variables, here are a couple of suggestions:

  • Dynamic language support – We could add a languages or supported_languages attribute to multilingual models, allowing the frontend to retrieve this from the server dynamically rather than relying on static values.
  • Documentation – We can include the guide as a README in the model folder, though I've already documented this on the model card on HF.
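
To make the dynamic language suggestion concrete, here is a minimal sketch of how a server could read a languages or supported_languages attribute from a loaded model. The two model classes, the get_supported_languages helper, and the three language codes are hypothetical placeholders; none of them are taken from the project's code.

```python
from typing import List, Optional


class MultilingualModel:
    """Hypothetical stand-in for a multilingual TTS model class."""
    # Illustrative subset only, not the model's real language list.
    supported_languages: List[str] = ["en", "fr", "de"]


class MonolingualModel:
    """Hypothetical model that declares no language list."""


def get_supported_languages(model: object) -> Optional[List[str]]:
    """Return the model's language codes, or None if it declares none."""
    languages = getattr(model, "supported_languages", None)
    if languages is None:
        # Fall back to the alternative attribute name mentioned in the suggestion.
        languages = getattr(model, "languages", None)
    return list(languages) if languages else None
```

A server endpoint could return this value so the frontend fills its language selector from the response and hides it when the result is None.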

Let me know what you think!

If you agree, we can revert the other changes, keep only the server changes in this PR, and open a couple of small follow-up PRs addressing the above.

Author

jarodise commented Jan 8, 2026

Sure, sounds good! I don't have much experience in coding and this might be my very first code contribution to an open source project. :)

If I understood correctly, I should revert change for UI and documentation from my side?

Owner

Blaizzy commented Jan 8, 2026

Amazing, that makes it even more special!

I'm here to help you with all your contributions

If it's complex, you can also open issues detailing the problem or vision; the community and I routinely pick those up.

Owner

Blaizzy commented Jan 12, 2026

> If I understood correctly, I should revert change for UI and documentation from my side?

Yes :)


mrbeals commented Jan 24, 2026

I think implementing these changes for chatterbox is a really good idea.
