[generate] `PromptLookupCandidateGenerator` won't generate forbidden tokens #40726
Conversation
Super!!!!!
Great, thanks! One question: should we raise a warning that the prompt decoder will not filter candidates by logits, so we can't support all generation config params?
"qwen2_5omni", # the file is named `qwen2_5_omni`, but the model class is `Qwen2_5Omni`, | ||
# All models below: shouldn't suggest audio tokens. Can be fixed by passing `suppress_ids` to candidate generator: @joaa @raushan | ||
"voxtral", | ||
"qwen2audio", |
yaaaaaay 💃🏻
@zucchini-nlp great comment. We can indeed make it work with ANY logits processor that blocks tokens, not just `bad_words_ids`. The logic is slightly more complex, but it is added in the latest commit. In a nutshell, we simulate running the logits processor with fake input logits, using the same processors as we use for the main model, and any selected token whose processed logit comes out as `-inf` is treated as forbidden, so the candidate sequence is cropped at that point.

Regarding long-term maintenance: if we're keeping assisted generation (we are for now), we're also keeping this one. This is the candidate-based generation strategy with the fewest requirements.
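Roughly, the check works like this (a simplified sketch of the mechanism described above, not the exact merged code; `chosen_ids` holds the ngram-matched candidate tokens):

```python
import torch

def filter_forbidden_candidates(input_ids, chosen_ids, logits_processor, vocab_size):
    # Feed the processors uniform fake logits; blocking processors set
    # forbidden tokens to -inf, which we can detect without a model forward.
    fake_input_logits = torch.ones((input_ids.shape[0], vocab_size), device=input_ids.device)
    sequence_with_candidate = input_ids
    for candidate_idx, new_candidate_token in enumerate(chosen_ids):
        fake_logits = logits_processor(sequence_with_candidate, fake_input_logits)
        if fake_logits[0, new_candidate_token] == -float("inf"):
            return chosen_ids[:candidate_idx]  # crop at the first forbidden token
        sequence_with_candidate = torch.cat(
            (sequence_with_candidate, new_candidate_token[None, None]), dim=-1
        )
    return chosen_ids
```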
Title changed: `PromptLookupCandidateGenerator` accepts `bad_words_ids` -> `PromptLookupCandidateGenerator` won't generate forbidden tokens
max_matching_ngram_size (`int`):
    The maximum ngram size to be considered for matching in the prompt
num_output_tokens (`int`):
eos_token_id (`torch.Tensor`, *optional*):
(docstring args were out of order, and some were missing)
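For reference, roughly how the class is constructed with these args (a sketch; the defaults shown are assumptions and the merged signature may differ):

```python
import torch
from transformers.generation.candidate_generator import PromptLookupCandidateGenerator

candidate_generator = PromptLookupCandidateGenerator(
    eos_token_id=torch.tensor([2]),  # stop extending candidates at EOS
    num_output_tokens=10,            # number of candidate tokens to propose
    max_matching_ngram_size=2,       # longest prompt ngram to match against
)
```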
Nice, happy that it worked well! Thanks for aligning the prompt decoder with the common generation API. Left one tiny question, otherwise LGTM.
if self.logits_processor is not None:
    sequence_with_candidate = input_ids
    for candidate_idx, new_candidate_token in enumerate(chosen_ids):
        fake_input_logits = torch.ones((bsz, self.vocab_size), device=input_ids.device)
Prob we can create the fake logits tensor once, since it's always the same. Can be helpful with models like Gemma, which have a huge vocab size.
I thought so at first, but then considered that in-place ops (in custom logits processors) might behave poorly. Double-checked it now: it seems resilient to in-place ops, so I changed it.
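For illustration, the hazard being weighed here: a hypothetical processor that edits `scores` in place would dirty a reused fake-logits buffer (the class below is made up):

```python
import torch

class InPlaceBlocker:
    def __call__(self, input_ids, scores):
        scores[:, 3] = -float("inf")  # mutates the caller's tensor in place
        return scores

fake_input_logits = torch.ones((1, 8))
InPlaceBlocker()(None, fake_input_logits)
print(fake_input_logits[0, 3])  # tensor(-inf): the shared buffer now carries a stale -inf entry
```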
[For maintainers] Suggested jobs to run (before merge): run-slow: idefics2, idefics3, qwen2_5_vl, smolvlm
What does this PR do?
See title.
In the process, this deflakes a lot of tests :) (prompt lookup was generating forbidden tokens in tests, like image tokens in VLMs; now we can specify which sequences are forbidden)
py.test tests/models/voxtral/test_modeling_voxtral.py -k test_prompt_lookup_decoding_matches_greedy_search --flake-finder --flake-runs 1000

now runs without problems.
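For example, something along these lines now respects the forbidden tokens during candidate generation (a minimal sketch; `gpt2` is just a placeholder checkpoint, and the `bad_words_ids` value is a toy example):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The quick brown fox jumps over the quick brown", return_tensors="pt")
outputs = model.generate(
    **inputs,
    prompt_lookup_num_tokens=10,               # enables prompt lookup candidate generation
    bad_words_ids=[[tokenizer.eos_token_id]],  # forbidden sequence, toy example
    max_new_tokens=20,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```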