fix unexpected kws of input_ids when setup no speech detection of whisper #36809
Hey @zhaozhenyu-newsbreak 👋 Thank you for opening the PR! Could you share a short reproducible script this PR is meant to fix? I confess I'm not seeing the issue :) |
Of course! I followed the whisper_large_v3 instructions with transformers version 4.49.0,
but I encountered an issue.
I think _setup_no_speech_detection adds an "input_ids" keyword argument, but the forward function of WhisperForConditionalGeneration does not support this argument. Adding **kwargs to the forward function fixes the issue, but I'm not sure it is an elegant fix.
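To make the suspected mismatch concrete, here is a minimal sketch that does not use transformers at all. `toy_forward` and `toy_setup_inputs` are hypothetical stand-ins (not the library's real functions); the only assumption carried over from the comment above is that the prepared inputs contain an "input_ids" key while the forward signature has no such parameter and no **kwargs.

```python
import inspect


def toy_forward(input_features=None, decoder_input_ids=None, labels=None):
    """Hypothetical stand-in for a forward() with a fixed signature and no **kwargs."""
    return {"input_features": input_features, "decoder_input_ids": decoder_input_ids}


def toy_setup_inputs(tokens):
    """Hypothetical stand-in for input preparation that emits an "input_ids" key."""
    return {"input_features": [0.0], "input_ids": tokens}


def call_forward(inputs):
    """Call toy_forward and return the TypeError message, or None on success."""
    try:
        toy_forward(**inputs)
        return None
    except TypeError as exc:
        return str(exc)


prepared = toy_setup_inputs([1, 2, 3])
error_message = call_forward(prepared)

# Compare the prepared keys against the parameters the callee actually accepts.
accepted = set(inspect.signature(toy_forward).parameters)
unexpected = set(prepared) - accepted

print(error_message)
print(unexpected)
```

Adding **kwargs to the callee would silence the error, but the extra key would then be dropped silently, which is presumably why it does not feel like an elegant fix.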
It's expecting |
Yeah, adding I would gladly accept a PR that corrects the input preparation, rather than changing the signature of (P.S.: I've edited the PR header, to avoid pinging everyone :) ) |
Actually, changing "input_ids" to "decoder_input_ids" also causes a problem, because GenerationMixin.prepare_inputs_for_generation() requires a positional argument named "input_ids". Fixing the issue elegantly may be complex, and I need more time to dig into it. Thanks!
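A toy sketch of why a plain rename is not enough, assuming only what the comment above states: the preparation step takes "input_ids" as a required positional argument. `toy_prepare_inputs_for_generation` is a hypothetical stand-in, not the real GenerationMixin method.

```python
def toy_prepare_inputs_for_generation(input_ids, **kwargs):
    """Hypothetical stand-in: requires input_ids, passes everything else through."""
    return {"input_ids": input_ids, **kwargs}


def try_prepare(**call_kwargs):
    """Return (result, None) on success or (None, error message) on TypeError."""
    try:
        return toy_prepare_inputs_for_generation(**call_kwargs), None
    except TypeError as exc:
        return None, str(exc)


# Renaming the key upstream leaves the required argument unfilled.
renamed_result, renamed_error = try_prepare(decoder_input_ids=[1, 2, 3])

# The original key satisfies the signature.
original_result, original_error = try_prepare(input_ids=[1, 2, 3])

print(renamed_error)
print(original_result)
```

So the key would need to be translated at the boundary between input preparation and forward, rather than renamed everywhere.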
I'm also experiencing this issue as i attempt to upgrade |
cc @ebezzam if you can have a look |
What does this PR do?
This PR fixes the unexpected keyword argument error for input_ids that occurs when using WhisperForConditionalGeneration, or a pipeline, for an ASR job with no_speech_threshold set. The error is similar to, but not the same as, the one in https://discuss.huggingface.co/t/unexpected-keywork-argument/91356
Fixes # (issue)
The root cause is that _setup_no_speech_detection adds an "input_ids" keyword argument, but the forward function of WhisperForConditionalGeneration does not support this argument. I think changing input_ids to decoder_input_ids will fix this issue.