-
Notifications
You must be signed in to change notification settings - Fork 29.7k
Fix Qwen2AudioForConditionalGeneration.forward()
#39503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for opening the PR 🙏
We'll need to pass cache_position
to self.language_model
as well. The non-legacy
processing branch in Qwen2AudioForConditionalGeneration.forward
only prepares inputs_embeds
(and not position-based tensors, like position_ids
or attention_mask
), so further processing of cache_position
probably isn't needed. I'm assuming that before this fix, we were relying on the language_model
's default values for cache_position
, so in practice this change won't have an impact on the outputs from interfaces like generate
.
Let's make sure the slow tests are passing after the changes!
@ebezzam after your changes, remove this line as well. If this was the only cache-related issue, then CI will be happy :D (the line I linked prevents |
This is such an important patch. Thankyou so much, was having so many issues in just inference. |
Thank you @gante for the pointers! I can confirm that slow tests are passing: # RUN_SLOW=1 pytest tests/models/qwen2_audio/test_modeling_qwen2_audio.py
...
============= 15 passed, 33 skipped in 45.03s ============= UPDATE after seeing failing CII tried different combinations of:
# COMMAND: RUN_SLOW=1 pytest tests/models/qwen2_audio
# -- without `all_generative_model_classes = ()` and without `GenerationTesterMixin`
# -- similar to https://app.circleci.com/pipelines/github/huggingface/transformers/138686/workflows/39334258-3551-4727-9629-adcf04e95a67/jobs/1838253
FAILED tests/models/qwen2_audio/test_modeling_qwen2_audio.py::Qwen2AudioForConditionalGenerationModelTest::test_generation_tester_mixin_inheritance - AssertionError: False is not true : This model can call `generate` from `GenerationMixin`, so one of two things must happen: 1) the tester must inherit from `GenerationTesterMixi...
==== 1 failed, 93 passed, 77 skipped, 4 warnings in 164.27s (0:02:44) =========
# -- without `all_generative_model_classes = ()` and with `GenerationTesterMixin`
FAILED tests/models/qwen2_audio/test_modeling_qwen2_audio.py::Qwen2AudioForConditionalGenerationModelTest::test_assisted_decoding_matches_greedy_search_0_random - RuntimeError: The size of tensor a (2) must match the size of tensor b (26) at non-singleton dimension 1
FAILED tests/models/qwen2_audio/test_modeling_qwen2_audio.py::Qwen2AudioForConditionalGenerationModelTest::test_assisted_decoding_matches_greedy_search_1_same - RuntimeError: The size of tensor a (2) must match the size of tensor b (27) at non-singleton dimension 1
FAILED tests/models/qwen2_audio/test_modeling_qwen2_audio.py::Qwen2AudioForConditionalGenerationModelTest::test_assisted_decoding_sample - RuntimeError: The size of tensor a (2) must match the size of tensor b (27) at non-singleton dimension 1
==== 3 failed, 124 passed, 87 skipped, 4 warnings in 179.77s (0:02:59) ======
# with `all_generative_model_classes = ()` and without `GenerationTesterMixin`
==== 95 passed, 76 skipped, 4 warnings in 165.05s (0:02:45) ======= |
remaining failing tests should be fixed adding this method to ensure we don't trigger merging input features when not in prefil stage |
[For maintainers] Suggested jobs to run (before merge) run-slow: qwen2_audio |
cache_position
to Qwen2AudioForConditionalGeneration.forward()
Qwen2AudioForConditionalGeneration.forward()
What does this PR do?
Fixes breaking Qwen2Audio tests: https://github.com/huggingface/transformers/actions/runs/16361063842/job/46229139255
Errors don't display in above Model CI tests, but this is what I got:
In short,
cache_position
was missing inforward
ofQwen2AudioForConditionalGeneration
and adding it resolves the tests 👍However, it probably needs to be processed in some way? Like in
Qwen2VLForConditionalGeneration
?If it's similar as in
Qwen2VLForConditionalGeneration
happy to do it!cc @eustlb, @gante as it seems you were aware of cache related issues here