SYCL: implement GGML_OP_SET + fix debug src count + align F16 assert #1
Summary
This PR adds a SYCL backend implementation for GGML_OP_SET, fixes the debug source count for the operator (2 sources: src0, src1), and aligns the type assert to allow F16 when GGML_SYCL_F16 is enabled.
Motivation
GGML_OP_SET is frequently used as a two-step operation: src0 → dst, then src1 with offset/strides.
The debug print reported num_src=1, which is misleading.
device_supports_op() already whitelists F16 (when enabled), so the implementation assert should match that contract.
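The two-step behaviour described above can be sketched on the host with plain `std::memcpy`. This is a minimal illustration, not the code in this PR: the function name `set_sketch_2d` and all of its parameters are hypothetical, it handles only a 2D block, it assumes `src0` and `dst` have the same byte size and `src1` is contiguous, and it does not touch a SYCL queue. The element size is passed in explicitly to mirror the size-aware copy idea, so the same routine works for F32, I32, or F16 data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical host-side sketch of a SET-style operation (not the PR's code).
// Step 1: dst receives a full copy of src0.
// Step 2: a rows x row_elems block from src1 is written into dst, starting at
//         offset_bytes and advancing dst by dst_row_stride_bytes per row.
void set_sketch_2d(const void * src0, const void * src1, void * dst,
                   std::size_t dst_bytes,            // size of src0 and dst in bytes
                   std::size_t elem_size,            // bytes per element (e.g. 4 for F32)
                   std::size_t rows,                 // rows in src1
                   std::size_t row_elems,            // elements per src1 row
                   std::size_t dst_row_stride_bytes, // byte stride between dst rows
                   std::size_t offset_bytes) {       // byte offset of the block in dst
    // Step 1: src0 -> dst
    std::memcpy(dst, src0, dst_bytes);

    if (rows == 0 || row_elems == 0) {
        return; // nothing to overwrite
    }

    const std::size_t row_bytes = row_elems * elem_size;

    // The last byte written in step 2 must stay inside dst.
    assert(offset_bytes + (rows - 1) * dst_row_stride_bytes + row_bytes <= dst_bytes);

    // Step 2: src1 -> sub-block of dst, one row at a time
    unsigned char *       d = static_cast<unsigned char *>(dst) + offset_bytes;
    const unsigned char * s = static_cast<const unsigned char *>(src1);
    for (std::size_t r = 0; r < rows; ++r) {
        std::memcpy(d + r * dst_row_stride_bytes, s + r * row_bytes, row_bytes);
    }
}
```

For example, with a 4x4 F32 destination, a 2x2 `src1`, a row stride of `4 * sizeof(float)` and an offset of `5 * sizeof(float)`, step 2 overwrites elements 5, 6, 9 and 10 of `dst` and leaves the rest equal to `src0`. Because the routine only deals in bytes, it never needs to know the element type, which is why a size-aware copy can accept F16 safely.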
What’s in this PR
GGML_OP_SET: memcpy/queue copy).
scope_op_debug_print(..., /*num_src=*/2).
F32 and I32.
GGML_SYCL_F16: allow F16 as well (copy is size-aware, so safe).
Implementation Notes
ggml_type_size(dst->type) to do raw-size-aware copies.
SYCL queue). If desired, a submit barrier can be added as a conservative measure.
device_supports_op().
Tests
-DLLAMA_SYCL=ON and ran:
./bin/test-backend-ops test -b CPU -o SET ✅
./bin/test-backend-ops test -b SYCL0 -o SET ✅ (OpenCL GPU)
GGML_SYCL_F16 to validate the F16 path.
Performance
Checklist
GGML_SYCL_F16
Thanks for reviewing!