-
Notifications
You must be signed in to change notification settings - Fork 29.8k
Fix Bark failing tests #39478
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix Bark failing tests #39478
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
if speaker_embeddings is not None: | ||
if "repo_or_path" in speaker_embeddings: | ||
speaker_embeddings["repo_or_path"] = pretrained_processor_name_or_path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because Suno models are badly configured to get speaker embedding from Yoach's checkpoints (see repo_or_path
):
- suno: https://huggingface.co/suno/bark/blob/main/speaker_embeddings_path.json
- suno-small: https://huggingface.co/suno/bark-small/blob/main/speaker_embeddings_path.json
So when used from_pretrained
, models get speaker embedding from Yoach's checkpoint.
Best is to probably open PRs on the Hub to fix the repo_or_path
entry and remove these lines?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've asked Suno team to merge these two PRs but still waiting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But if Suno doesn't merge, do we still keep pulling from Yoach and remove these lines?
cc @eustlb
"semantic_prompt": 1, # 1D array of shape (X,) | ||
"coarse_prompt": 2, # 2D array of shape (2,X) | ||
"fine_prompt": 2, # 2D array of shape (8,X) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comments for clarity
for prompt_key in self.speaker_embeddings: | ||
if prompt_key != "repo_or_path": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Main change is I added a property to easily get available voice presets. Rest is indenting inwards since we don't need prompt_key != "repo_or_path"
anymore
|
||
So for testing purposes, we will remove the unavailable speaker embeddings before saving. | ||
""" | ||
processor._verify_speaker_embeddings(remove_unavailable=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TLDR need to remove speaker embedding which couldn't download properly. Maybe because too many? (~700)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Format nit: let's use single-line comments (#...
), and add an HF team member (@eustlb ?) in the TODO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure! and I'm a new HF team member working with @eustlb 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you for the PR 🤗
I'd like the approval from someone from the audio side as well, before merging (perhaps @eustlb ?)
|
||
So for testing purposes, we will remove the unavailable speaker embeddings before saving. | ||
""" | ||
processor._verify_speaker_embeddings(remove_unavailable=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Format nit: let's use single-line comments (#...
), and add an HF team member (@eustlb ?) in the TODO
UPDATE after merging with main, running slow tests leads to two new errors: # RUN_SLOW=1 pytest tests/models/bark
======================= short test summary info =============================================
FAILED tests/models/bark/test_modeling_bark.py::BarkSemanticModelTest::test_eager_padding_matches_padding_free_with_position_ids - KeyError: 'eager'
FAILED tests/models/bark/test_modeling_bark.py::BarkCoarseModelTest::test_eager_padding_matches_padding_free_with_position_ids - KeyError: 'eager'
======================= 2 failed, 217 passed, 259 skipped, 3 warnings in 1101.26s (0:18:21) ============= probably from #39447? @zucchini-nlp let me know if there is something I can try! |
Ah I see, I marked them as slow in purpose and I will be fixing them tomorrow. They are failing in any case for some special model including Bark, so feel free to ignore. I don't think current PR affected this test :) Edit: submitted a PR in #39582 |
run-slow: bark |
This comment contains run-slow, running the specified jobs: models: ['models/bark'] |
[For maintainers] Suggested jobs to run (before merge) run-slow: bark |
Goal of this PR
There are three failing tests for Bark, see here for modeling ones.
1)
tests/models/bark/test_modeling_bark.py::BarkModelIntegrationTests::test_generate_end_to_end_with_args
Fails because the vocab size is wrongly configured. By default it takes this value, but this is the input vocab size for Bark (instead of output vocab size), which causes failure when reshaping here.
So a new condition should be added to extract the correct value.
@gante I see a TODO about standardizing the special cases. Is proposed solution for now?
2) RESOLVED
tests/models/bark/test_modeling_bark.py::BarkModelIntegrationTests::test_generate_batching
Related to #34634 and fixed by #38985
3)
tests/models/bark/test_processor_bark.py::BarkProcessorTest::test_save_load_pretrained_additional_features
This one is very subtle. Fails on first run with no cached models:
But may pass after multiple tries (maybe because missing files are downloaded).
It can explain issues like OP of #34634 where a voice preset may not be (first) available.
Also this error helped me realize the official checkpoints are still pointing to
ylacombe
's checkpoints. I opened PRs on the hub to fix this.