Description
System Info
Hello,
Description:
I'm experiencing issues with the Whisper-Large-v3-turbo model when using it for transcription tasks with the Transformers library (version 4.38.3).
Problems:
Incorrect word timestamps: The word-level timestamps generated by the model are often inaccurate.

Word repetitions: The model also repeats words in the transcription output. Setting repetition_penalty to 1.2 reduced the repetitions but did not eliminate them.
Best regards
Who can help?
No response
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Load the Whisper-Large-v3-turbo model using the Transformers library (version 4.38.3).
Use the model to transcribe an audio file.
Observe the word timestamps and transcription output.
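The steps above might look roughly like the following sketch. It is not the reporter's actual script: the Hub id `openai/whisper-large-v3-turbo`, the function names, and the device/dtype choices are assumptions made for illustration. The `repetition_penalty` value of 1.2 is the one mentioned in the description.

```python
def transcribe_with_word_timestamps(audio_path, repetition_penalty=1.2):
    """Transcribe one audio file and request word-level timestamps."""
    # Imported lazily so the formatting helper below works without these packages.
    import torch
    from transformers import pipeline

    use_cuda = torch.cuda.is_available()
    asr = pipeline(
        "automatic-speech-recognition",
        model="openai/whisper-large-v3-turbo",  # assumed Hub id for the model
        torch_dtype=torch.float16 if use_cuda else torch.float32,
        device="cuda:0" if use_cuda else "cpu",
    )
    # return_timestamps="word" asks the pipeline for per-word (start, end) pairs.
    return asr(
        audio_path,
        return_timestamps="word",
        generate_kwargs={"repetition_penalty": repetition_penalty},
    )


def format_chunks(chunks):
    """Render the pipeline's word chunks as 'start -> end: word' lines."""
    lines = []
    for chunk in chunks:
        start, end = chunk["timestamp"]
        lines.append(f"{start} -> {end}: {chunk['text'].strip()}")
    return lines
```

Calling `transcribe_with_word_timestamps("sample.wav")` (the file name is a placeholder) should return a dict with a `"text"` string and a `"chunks"` list, and passing `result["chunks"]` to `format_chunks` gives one line per word to inspect.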
Expected behavior
Accurate word timestamps.
No word repetitions in the transcription output.
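To make the two expectations above easier to check, here is a small sketch that scans word chunks for suspicious output. It assumes the chunk layout the ASR pipeline returns for word timestamps (a list of dicts with `"text"` and a `(start, end)` `"timestamp"`); both helper names are hypothetical. It only checks internal consistency, so it cannot tell whether a timestamp matches the audio, and it will also flag legitimate repeats such as "had had".

```python
def find_timestamp_problems(chunks):
    """Return indices of chunks whose timestamps are missing, inverted, or go backwards."""
    problems = []
    previous_end = 0.0
    for i, chunk in enumerate(chunks):
        start, end = chunk["timestamp"]
        if start is None or end is None or end < start or start < previous_end:
            problems.append(i)
        if end is not None:
            previous_end = max(previous_end, end)
    return problems


def find_repeated_words(chunks):
    """Return indices of words identical to the word immediately before them."""
    words = [chunk["text"].strip().lower() for chunk in chunks]
    return [i for i in range(1, len(words)) if words[i] and words[i] == words[i - 1]]
```

Reporting the flagged indices together with the audio clip would give maintainers something concrete to reproduce.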