add webgpu support for GatherBlockQuantized #25413

guschmue · 2025-07-15T22:59:35Z

add webgpu support for GatherBlockQuantized

Copilot

Pull Request Overview

This PR adds WebGPU support for the GatherBlockQuantized operation by implementing a complete WebGPU kernel. The implementation includes shader code generation, tensor handling for quantized data, and proper integration with the WebGPU execution provider.

Implements WebGPU kernel for GatherBlockQuantized operation with support for 4-bit and 8-bit quantized data
Adds comprehensive test coverage with WebGPU-specific test execution paths
Fixes shader variable naming conflict to prevent compilation issues
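To make the 4-bit case concrete, here is a minimal CPU-side sketch of what a block-quantized gather computes. It is not the WebGPU shader from this PR. It assumes 2-D data, gathering along axis 0, quantization blocks along the last axis, two unsigned 4-bit values packed per byte with the low nibble first, and a single zero point supplied by the caller. The names `UnpackUint4` and `GatherDequantize4Bit` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not from the PR): read the i-th unsigned 4-bit value,
// assuming the low nibble of each byte holds the even-indexed element.
inline uint8_t UnpackUint4(const std::vector<uint8_t>& packed, size_t i) {
  const uint8_t byte = packed[i / 2];
  return static_cast<uint8_t>((i % 2 == 0) ? (byte & 0x0F) : (byte >> 4));
}

// Gather rows of a [rows, cols] 4-bit quantized matrix and dequantize them.
// scales is assumed to have shape [rows, cols / block_size].
std::vector<float> GatherDequantize4Bit(const std::vector<uint8_t>& packed,
                                        const std::vector<float>& scales,
                                        const std::vector<int64_t>& indices,
                                        size_t cols, size_t block_size,
                                        int zero_point) {
  assert(block_size > 0 && cols % block_size == 0);
  const size_t blocks_per_row = cols / block_size;
  std::vector<float> out;
  out.reserve(indices.size() * cols);
  for (int64_t row : indices) {
    const size_t r = static_cast<size_t>(row);
    for (size_t c = 0; c < cols; ++c) {
      // Each block of block_size consecutive elements shares one scale.
      const float scale = scales[r * blocks_per_row + c / block_size];
      const int q = UnpackUint4(packed, r * cols + c);
      out.push_back(static_cast<float>(q - zero_point) * scale);
    }
  }
  return out;
}
```

An 8-bit variant under the same assumptions would skip the nibble unpacking and read one byte per element.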

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

File	Description
gather_block_quantized_op_test.cc	Updates test framework to support WebGPU execution and adds device data testing
shader_variable.cc	Fixes parameter naming conflict in shader function generation
webgpu_contrib_kernels.cc	Registers the new GatherBlockQuantized kernel with WebGPU provider
gather_block_quantized.h	Defines the WebGPU kernel class and shader program interface
gather_block_quantized.cc	Implements the complete WebGPU kernel with shader generation and computation logic

Comments suppressed due to low confidence (2)

onnxruntime/contrib_ops/webgpu/quantization/gather_block_quantized.cc:114

[nitpick] The function name 'splice' is ambiguous and doesn't clearly indicate its purpose of modifying tensor shape vectors. Consider renaming to 'ModifyTensorShape' or 'InsertTensorDimensions' to better reflect its functionality.

TensorShapeVector splice(TensorShapeVector vec, size_t start, size_t deleteCount, const TensorShapeVector toInsert = {}) {
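For readers unfamiliar with the pattern, a helper with this signature could behave like JavaScript's `Array.prototype.splice`, except that it returns the modified vector instead of the removed elements. The sketch below is an assumption about that behavior, not the PR's implementation. It uses `std::vector<int64_t>` as a stand-in for `TensorShapeVector`, and the name `SpliceShape` is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for TensorShapeVector (assumption, for illustration only).
using ShapeVector = std::vector<int64_t>;

// Remove delete_count dimensions starting at start, insert to_insert in their
// place, and return the resulting shape. Out-of-range arguments are clamped.
ShapeVector SpliceShape(ShapeVector vec, size_t start, size_t delete_count,
                        const ShapeVector& to_insert = {}) {
  start = std::min(start, vec.size());
  delete_count = std::min(delete_count, vec.size() - start);
  const auto first = vec.begin() + static_cast<std::ptrdiff_t>(start);
  const auto last = first + static_cast<std::ptrdiff_t>(delete_count);
  const auto pos = vec.erase(first, last);
  vec.insert(pos, to_insert.begin(), to_insert.end());
  return vec;
}
```

In a gather, this kind of helper is typically used to build the output shape by replacing the gathered axis of the data shape with the shape of the indices, for example `SpliceShape({2, 3, 4}, 1, 1, {5, 6})` yields `{2, 5, 6, 4}`.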

onnxruntime/contrib_ops/webgpu/quantization/gather_block_quantized.cc:140

The variable name 'is_int8' is misleading since it is true for both INT8 and UINT8 types. Consider renaming to 'is_8bit' or 'is_byte_type' to accurately reflect that it checks for 8-bit data types regardless of signedness.

  bool is_int8 = x_dtype == ONNX_TENSOR_ELEMENT_DATA_TYPE_INT8 || x_dtype == ONNX_TENSOR_ELEMENT_DATA_TYPE_UINT8;

onnxruntime/contrib_ops/webgpu/quantization/gather_block_quantized.cc

github-actions

You can commit the suggested changes from lintrunner.

onnxruntime/contrib_ops/webgpu/quantization/gather_block_quantized.cc

snnn · 2025-07-25T16:34:18Z

Hi there! We haven't cut the release branch for this version yet, so I'm removing the release:1.23.0 label for now to keep things tidy. Thanks so much for your contribution! We'll make sure this gets included when the release is prepared. 🤖

guschmue added 3 commits July 15, 2025 10:39

webgpu support for GatherBlockQuantized

1a6dbe8

add 8bit quantization

64d5b7e

cleanup

b7c7148

guschmue added the ep:WebGPU (ort-web webgpu provider) label Jul 15, 2025

guschmue requested a review from Copilot July 15, 2025 22:59

Copilot AI reviewed Jul 15, 2025

onnxruntime/contrib_ops/webgpu/quantization/gather_block_quantized.cc (outdated, resolved)

github-actions bot reviewed Jul 15, 2025

onnxruntime/contrib_ops/webgpu/quantization/gather_block_quantized.cc (resolved)

onnxruntime/contrib_ops/webgpu/quantization/gather_block_quantized.cc (resolved)

guschmue added 2 commits July 15, 2025 16:05

copilot feedback

5fb5e39

lintrunner

5e2010b

github-advanced-security bot found potential problems Jul 15, 2025

onnxruntime/contrib_ops/webgpu/quantization/gather_block_quantized.cc (fixed)

guschmue added 3 commits July 15, 2025 16:44

move some code to 4bit case

6ce7e05

Merge branch 'main' into gs/GatherBlockQuantized

9a05ad6

fix issue with memory going out of scope

0d106c9

prathikr approved these changes Jul 18, 2025

fs-eire approved these changes Jul 18, 2025

guschmue merged commit f190e70 into main Jul 18, 2025
93 checks passed

guschmue deleted the gs/GatherBlockQuantized branch July 18, 2025 22:45

guschmue added the release:1.23.0 label Jul 21, 2025

snnn removed the release:1.23.0 label Jul 25, 2025

qti-yuduo pushed a commit to CodeLinaro/onnxruntime that referenced this pull request Aug 8, 2025

add webgpu support for GatherBlockQuantized (microsoft#25413)

a02bb9c

add webgpu support for GatherBlockQuantized

sanketkaleoss pushed a commit to sanketkaleoss/onnxruntime that referenced this pull request Aug 11, 2025

add webgpu support for GatherBlockQuantized (microsoft#25413)

1ff0edb

add webgpu support for GatherBlockQuantized
