Qualcomm AI Engine Direct - fix LPBQ implementation #12663
Conversation
- fix LPBQ and make test case more general
Hi @cccclai, I made a mistake when implementing LPBQ. I've tested the current version against Llama 3.2 1B and get better results now. I will update the related change to llama.py in another PR.
Thanks! Mind sharing a bit more detail about the mistake? I'm trying to follow the code but it's not super clear.
Yes, currently HTP uses 4 bits to store the quantized scales. I didn't clip the values to the correct range (before: 0-255 / after: 1-16). I also double-checked against AIMET's implementation and compared the MSE between per-channel and per-block quantization.
Ah, that makes sense.
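For readers following the discussion, the sketch below illustrates the kind of range fix described above. It is a minimal illustration only: the function name, tensor shapes, and the exact scale decomposition are assumptions made for this example and are not taken from the PR's actual diff, which lives in the Qualcomm quantizer code.

```python
import torch


def quantize_block_scales(
    block_scales: torch.Tensor, bitwidth: int = 4
) -> tuple[torch.Tensor, torch.Tensor]:
    """Illustrative LPBQ-style scale decomposition (hypothetical helper).

    Splits positive per-block scales, shaped (channels, blocks_per_channel),
    into a per-channel float scale and low-precision integer multipliers.
    """
    levels = 2 ** bitwidth  # 16 levels when scales are stored in 4 bits
    # Per-channel scale chosen so the largest block scale maps to the top level.
    per_channel_scale = block_scales.amax(dim=1, keepdim=True) / levels
    # The fix discussed above: clamp to [1, 16] (4-bit range) rather than
    # [0, 255] (8-bit range). A quantized scale of 0 would zero out the block,
    # so the minimum is 1.
    quantized = torch.clamp(
        torch.round(block_scales / per_channel_scale), min=1, max=levels
    )
    return per_channel_scale, quantized
```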
Thanks for the fix
### Summary
- fix LPBQ and make test case more general

### Test plan
```bash
python backends/qualcomm/tests/test_qnn_delegate.py TestQNNQuantizedOperator.test_qnn_backend_conv2d_block -b build-android -s $DEVICE -m SM8750
```