[tests] Improve speed and reliability of test_transcription_api_correctness #23854
Conversation
Improve the performance of this test by creating the tokenizer only once, instead of hundreds of times, serialized because it was done while holding a semaphore. The previous code would also frequently get rate limited by HuggingFace from requesting https://huggingface.co/openai/whisper-large-v3/resolve/main/tokenizer_config.json too many times, which would sometimes cause the test to fail.

On my laptop, here is the time difference:

Before: 5m3.389s
After: 2m5.471s

This is a piece split out from vllm-project#21088.

Signed-off-by: Russell Bryant <[email protected]>
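The shape of the fix can be sketched as follows. This is a minimal illustrative example, not the actual vLLM test code: `load_tokenizer`, `transcribe_all`, and `worker` are hypothetical names, and the tokenizer load is stubbed out rather than calling the real HuggingFace API. The point is that the expensive, network-bound construction moves outside the semaphore-guarded worker, so it runs once instead of once per task.

```python
import asyncio

def load_tokenizer(model: str):
    # Stand-in for a real tokenizer load (e.g. from the HuggingFace Hub),
    # which fetches tokenizer_config.json over the network.
    return {"model": model}

async def transcribe_all(prompts, model="openai/whisper-large-v3", limit=4):
    # Fixed version: create the tokenizer once, before any workers start.
    tokenizer = load_tokenizer(model)
    sem = asyncio.Semaphore(limit)

    async def worker(prompt):
        async with sem:
            # Previously the tokenizer was created here, inside the
            # semaphore: serialized across hundreds of tasks, and each
            # creation re-requested the config from HuggingFace.
            return (tokenizer["model"], prompt)

    return await asyncio.gather(*(worker(p) for p in prompts))

results = asyncio.run(transcribe_all(["a", "b", "c"]))
print(len(results))  # prints 3
```

Besides the raw speedup, hoisting the load out of the worker removes the repeated Hub requests that triggered rate limiting, which is what made the test flaky.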
Code Review
This pull request provides a significant, well-executed optimization for the test_transcription_api_correctness test. By initializing the tokenizer once, outside the main processing loop, it correctly addresses a performance bottleneck and a source of test flakiness caused by repeated network requests to HuggingFace. The code changes are clear and logical, and they effectively improve both the speed and reliability of the test suite. No issues were found in the implementation.
Thanks!
…ctness (vllm-project#23854) Signed-off-by: Russell Bryant <[email protected]>
…ctness (vllm-project#23854) Signed-off-by: Russell Bryant <[email protected]> Signed-off-by: Matthew Bonanni <[email protected]>
…ctness (vllm-project#23854) Signed-off-by: Russell Bryant <[email protected]>
…ctness (vllm-project#23854) Signed-off-by: Russell Bryant <[email protected]> Signed-off-by: Shiyan Deng <[email protected]>
…ctness (vllm-project#23854) Signed-off-by: Russell Bryant <[email protected]>