Fix Bark failing tests #39478

Open · ebezzam wants to merge 8 commits into main
Conversation

@ebezzam (Contributor, author) commented Jul 17, 2025

Goal of this PR

There are three failing tests for Bark; see here for the modeling ones.


1) tests/models/bark/test_modeling_bark.py::BarkModelIntegrationTests::test_generate_end_to_end_with_args

```
FAILED tests/models/bark/test_modeling_bark.py::BarkModelIntegrationTests::test_generate_end_to_end_with_args - RuntimeError: shape '[1, 518400]' is invalid for input of size 40192
```

Fails because the vocab size is wrongly configured. By default it takes this value, but that is Bark's input vocab size (rather than its output vocab size), which causes the failure when reshaping here.

So a new condition should be added to extract the correct value, as sketched below.
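For illustration, something along these lines (Bark's sub-model configs expose both `input_vocab_size` and `output_vocab_size`; the helper name is hypothetical):

```python
def _get_generation_vocab_size(config):
    # Bark-style configs distinguish the vocabulary the model consumes
    # (input) from the one it predicts over (output); prefer the output
    # side when it is available, otherwise fall back to `vocab_size`.
    if getattr(config, "output_vocab_size", None) is not None:
        return config.output_vocab_size
    return config.vocab_size
```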

@gante I see a TODO about standardizing the special cases. Is the proposed solution OK for now?

2) RESOLVED tests/models/bark/test_modeling_bark.py::BarkModelIntegrationTests::test_generate_batching

```
FAILED tests/models/bark/test_modeling_bark.py::BarkModelIntegrationTests::test_generate_batching - RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)
```

Related to #34634 and fixed by #38985

3) tests/models/bark/test_processor_bark.py::BarkProcessorTest::test_save_load_pretrained_additional_features

This one is very subtle. It fails on the first run, when no models are cached:

```
src/transformers/models/bark/processing_bark.py:208: ValueError
---------------------------------- Captured stderr call -------------------------------
HTTP Error 429 thrown while requesting HEAD https://huggingface.co/ylacombe/bark-small/resolve/main/speaker_embeddings/pl_speaker_3_semantic_prompt.npy
Retrying in 1s [Retry 1/5].
HTTP Error 429 thrown while requesting HEAD https://huggingface.co/ylacombe/bark-small/resolve/main/speaker_embeddings/pl_speaker_3_semantic_prompt.npy
Retrying in 2s [Retry 2/5].
HTTP Error 429 thrown while requesting HEAD https://huggingface.co/ylacombe/bark-small/resolve/main/speaker_embeddings/pl_speaker_3_semantic_prompt.npy
Retrying in 4s [Retry 3/5].
HTTP Error 429 thrown while requesting HEAD https://huggingface.co/ylacombe/bark-small/resolve/main/speaker_embeddings/pl_speaker_3_semantic_prompt.npy
Retrying in 8s [Retry 4/5].
HTTP Error 429 thrown while requesting HEAD https://huggingface.co/ylacombe/bark-small/resolve/main/speaker_embeddings/pl_speaker_3_semantic_prompt.npy
Retrying in 8s [Retry 5/5].
HTTP Error 429 thrown while requesting HEAD https://huggingface.co/ylacombe/bark-small/resolve/main/speaker_embeddings/pl_speaker_3_semantic_prompt.npy
=============== short test summary info ===============
FAILED tests/models/bark/test_processor_bark.py::BarkProcessorTest::test_save_load_pretrained_additional_features - ValueError: `ylacombe/bark-small/speaker_embeddings/pl_speaker_3_semantic_prompt.npy` does not exists
```

But it may pass after multiple tries (presumably because the missing files eventually get downloaded and cached).

This could explain issues like the one reported in #34634, where a voice preset may not be available on the first attempt.

This error also made me realize that the official checkpoints still point to ylacombe's checkpoints. I've opened PRs on the Hub to fix this.


@ebezzam ebezzam marked this pull request as draft July 18, 2025 07:57
Comment on lines +117 to +119
```python
if speaker_embeddings is not None:
    if "repo_or_path" in speaker_embeddings:
        # Fetch speaker embeddings from the repo the processor was actually
        # loaded from, not the repo hard-coded in the config.
        speaker_embeddings["repo_or_path"] = pretrained_processor_name_or_path
```
@ebezzam (Contributor, author) commented Jul 18, 2025
This is because the Suno models are badly configured: they fetch speaker embeddings from Yoach's checkpoints (see `repo_or_path`). So when loaded with `from_pretrained`, the models pull their speaker embeddings from Yoach's checkpoint.
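For reference, the processor's speaker embeddings config is shaped roughly like this (illustrative entries; the exact preset names and paths are assumptions):

```python
speaker_embeddings = {
    # Hard-coded to Yoach's repo instead of the official suno/bark-small.
    "repo_or_path": "ylacombe/bark-small",
    "v2/en_speaker_0": {
        "semantic_prompt": "speaker_embeddings/v2/en_speaker_0_semantic_prompt.npy",
        "coarse_prompt": "speaker_embeddings/v2/en_speaker_0_coarse_prompt.npy",
        "fine_prompt": "speaker_embeddings/v2/en_speaker_0_fine_prompt.npy",
    },
    # ... roughly 700 more presets
}
```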

The best fix is probably to open PRs on the Hub to fix the `repo_or_path` entry and then remove these lines?

@ebezzam (Contributor, author)

I've asked the Suno team to merge these two PRs, but I'm still waiting.

@ebezzam (Contributor, author)

But if Suno doesn't merge them, do we keep pulling from Yoach's repo and remove these lines?

cc @eustlb

Comment on lines +55 to +57
"semantic_prompt": 1, # 1D array of shape (X,)
"coarse_prompt": 2, # 2D array of shape (2,X)
"fine_prompt": 2, # 2D array of shape (8,X)
@ebezzam (Contributor, author)

Added comments for clarity.

Comment on lines -156 to -158
```python
for prompt_key in self.speaker_embeddings:
    if prompt_key != "repo_or_path":
```
@ebezzam (Contributor, author)

The main change is that I added a property to easily get the available voice presets (see the sketch below). The rest is just de-indenting, since we no longer need the `prompt_key != "repo_or_path"` check.
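Roughly, the new property could look like this (a sketch; the exact name in the PR may differ):

```python
@property
def available_voice_presets(self):
    # Every key except the repo pointer names a voice preset.
    return [key for key in self.speaker_embeddings if key != "repo_or_path"]
```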


```python
So for testing purposes, we will remove the unavailable speaker embeddings before saving.
"""
processor._verify_speaker_embeddings(remove_unavailable=True)
```
@ebezzam (Contributor, author)

TL;DR: we need to remove speaker embeddings that couldn't be downloaded properly, possibly because there are so many of them (~700). A rough sketch of what the helper could do is below.
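A minimal sketch of such a verification helper, as a method on the processor (the download logic is an assumption about the implementation; only the method name and `remove_unavailable` argument come from the call above):

```python
from huggingface_hub import hf_hub_download

def _verify_speaker_embeddings(self, remove_unavailable=False):
    # `self.speaker_embeddings` maps preset names to prompt-file paths,
    # plus a special "repo_or_path" key pointing at the source repo.
    repo_id = self.speaker_embeddings["repo_or_path"]
    unavailable = []
    for preset, prompts in self.speaker_embeddings.items():
        if preset == "repo_or_path":
            continue
        for file_path in prompts.values():
            try:
                hf_hub_download(repo_id=repo_id, filename=file_path)
            except Exception:  # e.g. HTTP 429 or a missing file
                unavailable.append(preset)
                break
    if remove_unavailable:
        for preset in unavailable:
            del self.speaker_embeddings[preset]
    return unavailable
```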

@gante (Member)

Format nit: let's use single-line comments (#...), and add an HF team member (@eustlb ?) in the TODO

@ebezzam (Contributor, author)

Sure! And I'm a new HF team member working with @eustlb 😉

@ebezzam ebezzam marked this pull request as ready for review July 18, 2025 11:50
@Rocketknight1 (Member) commented:

cc @eustlb @ydshieh

@gante (Member) left a comment:

LGTM, thank you for the PR 🤗

I'd like approval from someone on the audio side as well before merging (perhaps @eustlb?)



@ebezzam (Contributor, author) commented Jul 21, 2025

UPDATE: after merging with main, running the slow tests leads to two new errors:

```
# RUN_SLOW=1 pytest tests/models/bark
======================= short test summary info =============================================
FAILED tests/models/bark/test_modeling_bark.py::BarkSemanticModelTest::test_eager_padding_matches_padding_free_with_position_ids - KeyError: 'eager'
FAILED tests/models/bark/test_modeling_bark.py::BarkCoarseModelTest::test_eager_padding_matches_padding_free_with_position_ids - KeyError: 'eager'
======================= 2 failed, 217 passed, 259 skipped, 3 warnings in 1101.26s (0:18:21) =============
```

Probably from #39447? @zucchini-nlp, let me know if there is something I can try!

@zucchini-nlp (Member) commented Jul 21, 2025

> Probably from #39447? @zucchini-nlp, let me know if there is something I can try!

Ah I see, I marked them as slow on purpose and will be fixing them tomorrow. They are failing in any case for some special models, including Bark, so feel free to ignore. I don't think the current PR affected this test :)

Edit: submitted a PR in #39582

@ebezzam ebezzam requested a review from eustlb July 22, 2025 13:49
@ebezzam ebezzam added the Audio label Jul 22, 2025
@eustlb (Contributor) commented Jul 24, 2025

run-slow: bark


This comment contains run-slow, running the specified jobs:

```
models: ['models/bark']
quantizations: [] ...
```


[For maintainers] Suggested jobs to run (before merge)

run-slow: bark
