feat(stt): add support for MedASR (Lasr architecture) #376
This commit adds full support for the MedASR model (Lasr architecture) to mlx-audio.
Changes:
- Added implementation (LasrEncoder, LasrForCTC).
- Updated and to support model type.
- Fixed model loading logic in to correctly handle mapped model types.
- Added for weight conversion from Hugging Face.
- Added examples:
  - : File-based transcription.
  - : Live transcription with TEN-VAD and ring buffer.
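For readers unfamiliar with the ring buffer mentioned for the live transcription example, here is a minimal sketch of the idea: keep only the most recent audio samples in a fixed-size window so a recognizer can be run on it repeatedly. The class name `AudioRingBuffer` and its methods are hypothetical and are not taken from this PR; the VAD step is left out because its API is not shown here.

```python
from collections import deque
from typing import Iterable, List


class AudioRingBuffer:
    """Hypothetical fixed-capacity buffer that keeps the newest samples."""

    def __init__(self, capacity: int) -> None:
        # A deque with maxlen silently drops the oldest items once full,
        # which is the ring buffer behaviour wanted for streaming audio.
        self._samples = deque(maxlen=capacity)

    def write(self, chunk: Iterable[float]) -> None:
        """Append a chunk of samples, evicting the oldest if needed."""
        self._samples.extend(chunk)

    def read(self) -> List[float]:
        """Return the buffered samples, oldest first."""
        return list(self._samples)


# Usage: a 4-sample window fed with two chunks.
buf = AudioRingBuffer(capacity=4)
buf.write([0.1, 0.2, 0.3])
buf.write([0.4, 0.5])
window = buf.read()  # the oldest sample (0.1) has been evicted
```

In a live setup the capacity would typically be a few seconds of audio at the model's sample rate, with transcription triggered only on chunks the VAD marks as speech.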
Blaizzy
left a comment
Thanks for the feedback, King! The torch and mps references in the example scripts were there to roughly compare the speed gain with mlx; I've removed them from the example scripts.
I removed another unnecessary file and recommitted. Sorry for the mess.
Blaizzy
left a comment
Almost there. A few final nits and we are ready to merge.
…e method, update example
LGTM, thanks!
WARNING! This is my first PR, sorry if it's crappy.
I've gotta come clean here; Antigravity did everything, but I made sure it works!
This PR adds support for Google's MedASR model (Lasr architecture).
Changes:
- Added mlx_audio/stt/models/lasr/ implementation (LasrEncoder, LasrForCTC).
- Updated utils.py to support lasr_ctc model type.
- utils.py: Added fallback to mapped model type in get_model_class to correctly resolve lasr_ctc -> lasr when model name doesn't match a directory.
- Added examples/medasr_transcribe.py.
Verification:
- google/medasr.
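The fallback described for get_model_class can be sketched roughly as follows. This is an illustration only, not the code in utils.py: the function name `resolve_model_type`, the `MODEL_REMAPPING` table, and the list of available directories are assumptions made for the example.

```python
from typing import Iterable, Mapping

# Hypothetical remapping table: config "model_type" -> package directory name.
MODEL_REMAPPING = {"lasr_ctc": "lasr"}


def resolve_model_type(
    model_type: str,
    model_name_parts: Iterable[str],
    available: Iterable[str],
    remapping: Mapping[str, str] = MODEL_REMAPPING,
) -> str:
    """Pick the model package directory to import.

    1. Prefer a component of the model name that matches a directory.
    2. Otherwise fall back to the (remapped) model_type from the config.
    """
    available = set(available)
    for part in model_name_parts:
        if part in available:
            return part
    mapped = remapping.get(model_type, model_type)
    if mapped in available:
        return mapped
    raise ValueError(f"Model type {model_type!r} is not supported.")


# Usage: "google/medasr" matches no directory, so the mapped type is used.
dirs = ["whisper", "lasr"]  # illustrative directory names
name_parts = "google/medasr".lower().split("/")
print(resolve_model_type("lasr_ctc", name_parts, dirs))  # prints "lasr"
```

The point of the second step is that a repository name such as google/medasr carries no hint of the architecture, so the config's model type has to decide.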