Add callback to monitor progress in whisper transcription #37483
Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the |
Example code:
Will output progress as:
|
Why not!
Happy to add this if you can also provide an example of a useful progress monitor and a tad bit of doc! 🤗
https://colab.research.google.com/drive/1wEIr1m7-D-EN9M_ygo388bjF8dMK1Oan?usp=sharing demonstrates the callback through a tqdm progress bar in a notebook. As Colab can be sluggish, here is a real-time screen recording showing the same notebook doing a transcription of a 10 minute audio file on a Macbook M4 Max using mps: https://www.dropbox.com/scl/fi/6dzdh1konw5aj7iufr6b6/whisper_progress_monitor.mov?rlkey=kycpt3o4h84e6pzhbp7as8eft&st=ulwz502l&dl=0 This new monitor callback would also be very useful for a web app that allows running various ML tasks via UI that I am currently developing for my PhD thesis. |
cc @ebezzam ! |
@poke1024 thanks for the contribution! Indeed, it's a nice feature for keeping track of progress during long transcriptions 👏 Could you resync with main? Your current snippet didn't work for me; I had to pass the callback directly through generate_kwargs:

    generate_kwargs = {
        "monitor_progress": monitor_progress,
    }
    result = pipe(sample, return_timestamps=True, generate_kwargs=generate_kwargs)
716a2c8 to 14b399d
@ebezzam Synced with main, updated the Colab example code to your version (the |
thanks @poke1024! Yes the failing tests are unrelated.
I've also updated the docstrings to show an example of your new feature.
@ArthurZucker LGTM for merging 👍
Thanks both!
[For maintainers] Suggested jobs to run (before merge) run-slow: whisper |
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
This PR adds a callback to the generate function of WhisperGenerationMixin to give callers the ability to monitor progress for Whisper transcriptions. This is useful when transcription happens in a notebook or UI and callers want to give users a progress bar or similar feedback on long-running calls (e.g. >1 minute).
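To illustrate the callback pattern described above, here is a minimal, self-contained sketch. It does not call transformers at all: simulated_generate is a hypothetical stand-in for a long transcription loop, and the shape of the progress argument (one (frames done, total frames) pair per batch item) is an assumption made for this example, not something confirmed in this thread. The helper make_percent_reporter is likewise hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Assumed progress payload: one (frames done, total frames) pair per batch item.
Progress = Sequence[Tuple[int, int]]


def make_percent_reporter(sink: List[str]) -> Callable[[Progress], None]:
    """Build a callback that appends one percentage string per update."""

    def monitor_progress(progress: Progress) -> None:
        done = sum(cur for cur, _ in progress)
        total = sum(tot for _, tot in progress)
        pct = 100.0 * done / total if total else 100.0
        sink.append(f"{pct:.0f}%")

    return monitor_progress


def simulated_generate(
    total_frames: Sequence[int],
    step: int,
    monitor_progress: Callable[[Progress], None],
) -> None:
    """Hypothetical stand-in for a long-running transcription loop.

    NOT the transformers API: it only advances a frame counter per batch
    item and invokes the callback after every step.
    """
    position = [0] * len(total_frames)
    while any(p < t for p, t in zip(position, total_frames)):
        position = [min(p + step, t) for p, t in zip(position, total_frames)]
        monitor_progress(list(zip(position, total_frames)))


lines: List[str] = []
simulated_generate([3000, 1500], step=1500, monitor_progress=make_percent_reporter(lines))
print(lines)
```

In a notebook or UI, the same callback body could update a progress bar instead of appending to a list; the point of the pattern is that the caller owns the presentation while the generation loop only reports positions.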
Reviewer suggestion: @eustlb