adjust HPU warmup: use dummy inputs with shape more close to real scenario #689

kaixuanliu · 2025-07-29T09:56:13Z

In the original implementation, warmup uses dummy inputs with shapes such as [1,128], [1,256], [2,128], and [2,256], with the aim of generating the recipe cache during the warmup stage. In the real serving scenario, we pad input_ids/attention_masks to the shapes cached during warmup. However, we found a precision issue with reranker models when following the TEI docs. We suspect that the wrong graphs/recipes were used during the replay stage. This PR therefore adjusts the create_warmup_batch function so that the dummy inputs are closer to the real scenario: during warmup these inputs are also padded in the Python backend, which generates the correct recipe cache/graphs, the same ones used in the serving stage. We ran several rounds of experiments, and the wrong-output issue disappears with this PR.
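To illustrate the idea, here is a minimal sketch in plain Python. It is not the code from this PR: `round_up`, `pad_batch`, and `make_warmup_batch` are hypothetical helpers, and the bucket size of 128 and the choice of "slightly shorter than the bucket" dummy lengths are assumptions made for the example. The point it shows is that warmup batches go through the same padding step as real requests, so both end up with the same padded shapes and masks.

```python
from typing import List, Tuple


def round_up(value: int, multiple: int) -> int:
    """Round value up to the nearest multiple (the assumed bucket size)."""
    return ((value + multiple - 1) // multiple) * multiple


def pad_batch(
    input_ids: List[List[int]], seq_multiple: int = 128, pad_token_id: int = 0
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad every sequence to the bucket length and build attention masks.

    Hypothetical stand-in for the padding done in the Python backend.
    """
    longest = max(len(seq) for seq in input_ids)
    target = round_up(longest, seq_multiple)
    padded = [seq + [pad_token_id] * (target - len(seq)) for seq in input_ids]
    masks = [[1] * len(seq) + [0] * (target - len(seq)) for seq in input_ids]
    return padded, masks


def make_warmup_batch(
    batch_size: int, seq_len: int, token_id: int = 1
) -> List[List[int]]:
    """Build dummy sequences slightly shorter than the bucket length.

    Assumption for this sketch: unpadded, uneven lengths force the warmup
    batch through the padding path, as a real request would be.
    """
    return [[token_id] * max(1, seq_len - 1 - i) for i in range(batch_size)]


# Warm up each (batch_size, seq_len) bucket through the serving-time padding path.
for batch_size, seq_len in [(1, 128), (1, 256), (2, 128), (2, 256)]:
    ids, masks = pad_batch(make_warmup_batch(batch_size, seq_len))
    # A real backend would run the model here on the padded ids and masks.
    print(batch_size, len(ids[0]), [sum(m) for m in masks])
```

With this setup the padded shapes still land on the cached buckets, but the attention masks contain real padding positions during warmup instead of being all ones.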

kaixuanliu · 2025-07-29T09:59:10Z

@regisss , pls help review, thx!

regisss

LGTM

regisss · 2025-08-08T08:59:31Z

cc @Narsil
The methods warmup_hpu and create_warmup_batch are only used on HPU

adjust HPU warmup: use dummy inputs with shape more close to real sce…

eb26ddf

…nario to avoid wrong output from reranker model Signed-off-by: Liu, Kaixuan <[email protected]>

regisss approved these changes Aug 8, 2025

regisss merged commit c8ff435 into huggingface:main Aug 8, 2025
2 of 13 checks passed
