[GPU] Add SliceScatter-15 op support #34085
wilson-seok wants to merge 9 commits into openvinotoolkit:master
Pull request overview
This PR adds support for the SliceScatter-15 operation to the Intel GPU Plugin, implementing the complete operation stack from plugin layer through kernel execution, with comprehensive test coverage.
Changes:
- Implements SliceScatter-15 operation with reference and optimized OpenCL kernels
- Adds comprehensive unit and functional tests covering static/dynamic shapes and edge cases
- Includes performance benchmarks for large tensor operations
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/plugins/intel_gpu/tests/unit/test_cases/slice_scatter_gpu_test.cpp | Unit tests for SliceScatter covering static/dynamic shapes, edge cases, and performance benchmarks |
| src/plugins/intel_gpu/tests/functional/shared_tests_instances/single_layer_tests/slice_scatter.cpp | Functional test instantiations for SliceScatter layer |
| src/plugins/intel_gpu/src/plugin/ops/slice_scatter.cpp | Plugin operation registration and primitive creation |
| src/plugins/intel_gpu/src/kernel_selector/kernels/slice_scatter/*.{h,cpp} | Kernel selector and implementations (reference and optimized) |
| src/plugins/intel_gpu/src/kernel_selector/common_types.h | Added SLICE_SCATTER kernel type enum |
| src/plugins/intel_gpu/src/kernel_selector/cl_kernels/slice_scatter_*.cl | OpenCL kernel implementations (reference and optimized) |
| src/plugins/intel_gpu/src/graph/slice_scatter.cpp | Graph-level primitive implementation and shape inference |
| src/plugins/intel_gpu/src/graph/include/slice_scatter_inst.h | Primitive instance header with helper classes |
| src/plugins/intel_gpu/src/graph/impls/ocl/slice_scatter.cpp | OCL implementation with kernel parameter preparation |
| src/plugins/intel_gpu/src/graph/impls/ocl/register.{hpp,cpp} | Registration of SliceScatter OCL implementation |
| src/plugins/intel_gpu/src/graph/registry/registry.hpp | Registry entry for SliceScatter primitive |
| src/plugins/intel_gpu/include/intel_gpu/primitives/slice_scatter.hpp | Public primitive header |
src/plugins/intel_gpu/src/kernel_selector/kernels/slice_scatter/slice_scatter_kernel_ref.cpp (outdated, resolved)
c6e8d4f to 7a9c887
src/plugins/intel_gpu/src/kernel_selector/cl_kernels/slice_scatter_opt.cl (outdated, resolved)
src/plugins/intel_gpu/src/kernel_selector/cl_kernels/slice_scatter_ref.cl (outdated, resolved)
src/plugins/intel_gpu/tests/unit/test_cases/slice_scatter_gpu_test.cpp (outdated, resolved)
namespace {
template <typename T, class = typename std::enable_if<std::is_integral<T>::value>::type>
std::vector<std::int64_t> extractIntegerData(const data_node& node, const stream& stream) {
    mem_lock<T> lock{node.get_attached_memory_ptr(), stream};
Please use mem_lock<T, mem_lock_type::read> here, because no write happens.
Updated! Thanks!
@@ -0,0 +1,208 @@
// Copyright (C) 2018-2026 Intel Corporation
(Random spot) Could you add this to ocl_v2 instead of ocl?
Updated! Thanks!
REGISTER_DEFAULT_IMPLS(roll, OCL_S);
REGISTER_DEFAULT_IMPLS(shuffle_channels, OCL_S);
REGISTER_DEFAULT_IMPLS(slice, OCL_S, OCL_D);
REGISTER_IMPLS(slice_scatter);
For educational purposes, could you tell me the difference between register_impls and register_default_impls?
By the way, those two categories are placed separately. If it is register_impls, it should be placed near the other register_impls entries.
Maybe REGISTER_DEFAULT_IMPLS(slice_scatter, OCL_S); would work with less code?
REGISTER_IMPLS() is a better fit for the current implementation, because a custom validate_impl() is required, and using shape_types::any keeps the code simpler since the slice_scatter impl supports both static and dynamic shapes.
5c2bdb5 to a55e7ea
- Add primitive definition (slice_scatter.hpp)
- Add graph instance (slice_scatter_inst.h, slice_scatter.cpp)
- Add ref OpenCL kernel with per-element scatter
- Add opt OpenCL kernel with vectorized vstore8 for step=1 cases
- Add kernel selector (ref + opt with priority-based selection)
- Add OCL implementation bridge
- Add plugin ops translation from ov::op::v15::SliceScatter
- Register in common_types, register.hpp/cpp, registry.hpp
- Add unit tests (84 tests: 3 data types x 28 cases)
  - Covers: basic bfyx, axes, step>1, negative step, clamping, INT_MAX stop, negative axes, spec examples, 1D/2D/5D tensors, dynamic shapes, caching
- Add functional tests (60 tests: static, dynamic, precision)
- Verified on GPU.0 (UHD 770 iGPU) and GPU.1 (Arc A770 dGPU)
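For readers unfamiliar with the op, the cases listed above (step>1, negative step, clamping, INT_MAX stop) can be illustrated with a host-side sketch for a single 1D axis. This is not the PR's kernel code: `slice_scatter_1d` and `normalize_bound` are hypothetical names, and the bound normalization assumes Python-style slice semantics (negative bounds count from the end, out-of-range bounds are clamped).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: resolve a negative bound relative to the end,
// then clamp it into [lower, upper]. The limits depend on the sign of step.
static std::int64_t normalize_bound(std::int64_t v, std::int64_t dim,
                                    std::int64_t lower, std::int64_t upper) {
    if (v < 0) v += dim;
    return std::clamp(v, lower, upper);
}

// Hypothetical 1D reference: returns a copy of `data` in which the elements
// selected by (start, stop, step) are overwritten with `updates`, in order.
// Assumes |step| is small enough that `i + step` cannot overflow.
std::vector<float> slice_scatter_1d(std::vector<float> data,
                                    const std::vector<float>& updates,
                                    std::int64_t start, std::int64_t stop,
                                    std::int64_t step) {
    if (step == 0) throw std::invalid_argument("step must be non-zero");
    const std::int64_t dim = static_cast<std::int64_t>(data.size());
    const bool fwd = step > 0;
    const std::int64_t lower = fwd ? 0 : -1;
    const std::int64_t upper = fwd ? dim : dim - 1;
    const std::int64_t begin = normalize_bound(start, dim, lower, upper);
    const std::int64_t end = normalize_bound(stop, dim, lower, upper);

    std::size_t u = 0;
    for (std::int64_t i = begin; fwd ? i < end : i > end; i += step) {
        // .at() throws if `updates` has fewer elements than the slice selects.
        data.at(static_cast<std::size_t>(i)) = updates.at(u++);
    }
    return data;
}
```

With `data = {0,1,2,3,4,5,6,7}`, `updates = {10,20,30}`, `start=1`, `stop=7`, `step=2`, the sketch writes to indices 1, 3 and 5. An N-D version would apply the same normalization per axis listed in `axes` and leave the other axes untouched.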
…ter_kernel_utils.h
Deduplicate identical addJitConstantsForParam function and associated constants (MAX_SUPPORTED_DIM, JIT_AXES_BUFF_SIZE_NAME) from slice_scatter_kernel_ref.cpp and slice_scatter_kernel_opt.cpp into a shared header slice_scatter_kernel_utils.h.
…tion, disable perf tests, use read-only mem_lock
- Fix ceil division in slice_scatter_opt.cl bounds check for 4D/5D cases
- Add axis range validation in slice_scatter_ref.cl and slice_scatter_opt.cl
- Disable benchmark tests (DISABLED_ prefix) to avoid CI overhead
- Use mem_lock<T, mem_lock_type::read> for read-only memory access
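Why ceil division matters for an 8-wide vectorized path can be shown with a small host-side C++ sketch. It does not reproduce slice_scatter_opt.cl; `kVec`, `ceil_div` and `scatter_contiguous` are hypothetical names, and the loop over `wi` only stands in for GPU work items, each handling one chunk of up to 8 contiguous elements.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Chunk width, chosen to mirror the vstore8 width mentioned in the PR description.
constexpr std::size_t kVec = 8;

// Round-up integer division: number of chunks of size d needed to cover n items.
constexpr std::size_t ceil_div(std::size_t n, std::size_t d) {
    return (n + d - 1) / d;
}

// Hypothetical step=1 scatter: copy `src` into `dst` starting at `offset`,
// one chunk of up to kVec elements per simulated work item.
void scatter_contiguous(std::vector<float>& dst, const std::vector<float>& src,
                        std::size_t offset) {
    const std::size_t n = src.size();
    // Floor division (n / kVec) would silently skip the partial tail chunk.
    const std::size_t work_items = ceil_div(n, kVec);
    for (std::size_t wi = 0; wi < work_items; ++wi) {
        const std::size_t base = wi * kVec;
        // Bounds check: the last chunk may hold fewer than kVec elements.
        const std::size_t len = std::min(kVec, n - base);
        for (std::size_t lane = 0; lane < len; ++lane) {
            dst.at(offset + base + lane) = src[base + lane];
        }
    }
}
```

For 11 updates, `ceil_div(11, 8)` yields 2 work items: the first writes 8 elements and the second writes the remaining 3, whereas `11 / 8` would launch a single work item and leave the tail unwritten.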
