Conversation

russellb (Member) commented:
Improve the performance of this test by creating the tokenizer only
once instead of hundreds of times. Previously, each creation was also
serialized because it happened while holding a semaphore.

The previous code was also frequently rate-limited by HuggingFace for
requesting https://huggingface.co/openai/whisper-large-v3/resolve/main/tokenizer_config.json
too many times, which sometimes caused the test to fail.

On my laptop, here is the time difference:

Before:

  • 5m3.389s

After:

  • 2m5.471s

This is a piece split out from #21088.

Signed-off-by: Russell Bryant [email protected]
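The pattern behind this fix can be sketched as follows. This is a hedged illustration, not the actual vLLM test code: `get_tokenizer` and the sample scoring are hypothetical stand-ins (in the real test, tokenizer creation downloads config files from the HuggingFace Hub, which is both slow and rate-limited). The point is that the expensive call is hoisted out of the per-sample worker, so it runs once instead of once per sample serialized under the semaphore.

```python
import asyncio


def get_tokenizer(model: str) -> object:
    # Hypothetical stand-in for an expensive call such as
    # transformers.AutoTokenizer.from_pretrained(model), which fetches
    # tokenizer_config.json from the HuggingFace Hub on first use.
    return object()


async def run_all(samples, concurrency: int = 4):
    sem = asyncio.Semaphore(concurrency)

    # After the fix: create the tokenizer ONCE, outside the semaphore,
    # so the slow (and rate-limited) setup happens a single time.
    tokenizer = get_tokenizer("openai/whisper-large-v3")

    async def worker(sample):
        async with sem:
            # Before the fix, get_tokenizer(...) was called here, i.e.
            # once per sample and serialized by the semaphore.
            assert tokenizer is not None  # shared instance is reused
            return len(str(sample))  # placeholder for per-sample scoring

    return await asyncio.gather(*(worker(s) for s in samples))


results = asyncio.run(run_all(range(8)))
```

Hoisting the shared, read-only resource out of the critical section removes both the repeated network round-trips and the serialization penalty, which matches the roughly 2.4x speedup reported above.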

gemini-code-assist bot (Contributor) commented:
Code Review

This pull request provides a significant and well-executed optimization for the test_transcription_api_correctness test. By initializing the tokenizer once outside the main processing loop, it correctly addresses a performance bottleneck and a source of test flakiness caused by repeated network requests to HuggingFace. The code changes are clear, logical, and effectively improve both the speed and reliability of the test suite. No issues were found in the implementation.

DarkLight1337 (Member) commented:

Thanks!
@DarkLight1337 DarkLight1337 enabled auto-merge (squash) August 29, 2025 02:26
@DarkLight1337 DarkLight1337 added the ready ONLY add when PR is ready to merge/full CI is needed label Aug 29, 2025
@DarkLight1337 DarkLight1337 merged commit c8b3b29 into vllm-project:main Aug 29, 2025
18 of 21 checks passed
nopperl pushed a commit to pfnet/vllm that referenced this pull request Sep 3, 2025
MatthewBonanni pushed a commit to MatthewBonanni/vllm that referenced this pull request Sep 3, 2025
842974287 pushed a commit to 842974287/vllm that referenced this pull request Sep 3, 2025
lengrongfu pushed a commit to lengrongfu/vllm that referenced this pull request Sep 4, 2025