[CPU][float8] Add QEmbeddingbag kernel #2686
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2686
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 0e10992 with merge base 7dbc816. This comment was automatically generated by Dr. CI and updates every 15 minutes.
test/quantization/test_quant_api.py (Outdated)
"CPU" not in torch._C._dispatch_dump("torchao::qembeddingbag"), | ||
reason="cpp kernels not built", | ||
) | ||
def test_embeddingbag_cpu(self): |
the test should be added here I think: https://github.com/pytorch/ao/blob/main/test/test_ops.py
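For illustration, here is a minimal sketch of what a relocated test in test/test_ops.py might look like. The class name, shapes, scale value, and expected output shape are assumptions for this sketch, and passing mode=0 assumes ATen's integer encoding for "sum".

```python
import unittest

import torch
from torch.testing._internal.common_utils import TestCase, run_tests


class TestQEmbeddingBag(TestCase):
    @unittest.skipIf(
        "CPU" not in torch._C._dispatch_dump("torchao::qembeddingbag"),
        reason="cpp kernels not built",
    )
    def test_embeddingbag_cpu(self):
        # Illustrative float8 table: 10 embeddings of dimension 8.
        qweight = torch.randn(10, 8).to(torch.float8_e4m3fn)
        weight_scale = torch.tensor([0.1])
        indices = torch.tensor([0, 1, 2, 3])
        # With include_last_offset=True the final offset closes the last bag,
        # so these offsets describe two bags of two indices each.
        offsets = torch.tensor([0, 2, 4])
        out = torch.ops.torchao.qembeddingbag(
            qweight, indices, offsets, weight_scale, 1.0, 0, True
        )
        self.assertEqual(out.shape, torch.Size([2, 8]))


if __name__ == "__main__":
    run_tests()
```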
@pytorchbot label "topic: new feature"
LGTM. Have you run any benchmarks to make sure it's not too slow?
@jerryzh168 Could you help review this PR?
@@ -70,6 +70,9 @@
lib.define(
    "da8w4_linear_cpu(Tensor input, Tensor input_scales, Tensor input_qzeros, Tensor weight, Tensor weight_scales, Tensor weight_qzeros, Tensor compensation, Tensor? bias, ScalarType output_dtype) -> Tensor"
)
lib.define(
    "qembeddingbag(Tensor qweight, Tensor indices, Tensor offsets, Tensor weight_scale, float o_scale, int mode, bool include_last_offset) -> Tensor"
)
Is this the same as https://github.com/pytorch/pytorch/blob/371eacb2ae4ecdabc52ea4634ed21558df2f3bab/aten/src/ATen/native/native_functions.yaml#L2368C1-L2369C1, with the only difference being that qweight is float8?
@jerryzh168 Thanks for reviewing. Yes, I think so, except that the implementation in this PR has limited functionality so far.
This operator is used for inference, so I did not add any gradient-related parameters, including scale_grad_by_freq, sparse, per_sample_weights, and padding_idx.
I think we should add this to PyTorch directly if that's the case. float8 is a native dtype in PyTorch, so it makes the most sense to just add the functionality there; we can error out in the op if some argument combination is not supported or invalid for float8.
Intel platforms have FP8 instructions, and when we are ready we hope to update this kernel to use them. As far as I know, the latest GCC is required. Would that be difficult to support in PyTorch?
I'm not sure; can you open an issue for this in pytorch/pytorch?
Implemented FP8 QEmbeddingBag on CPU, currently supporting:
- include_last_offset=True
- mode="sum"
Next steps