@GittyBurstein (Owner)

Summary

This PR adds a SYCL backend implementation for GGML_OP_SET, fixes the debug
source-count for the operator (2 sources: src0, src1), and aligns the
type-assert to allow F16 when GGML_SYCL_F16 is enabled.

Motivation

  • GGML_OP_SET is effectively a two-step operation:
    1. base copy of src0 → dst
    2. overlay of src1 onto dst at the given offset/strides.
  • The previous debug trace declared num_src=1, which is misleading because
    the operator reads two sources (src0 and src1).
  • device_supports_op() already whitelists F16 (when enabled), so the
    implementation assert should match that contract.
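
As an illustration of the two-step view described above, here is a minimal host-side sketch in plain C++. It is not the SYCL code from this PR: the function name `set_two_phase`, its parameters, and the restriction to a 2D float region are assumptions made for the example. Strides and the offset are expressed in bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host-side reference for the two-phase idea (not the PR's code).
// dst and src0 hold n_dst floats; src1 is a contiguous ne0 x ne1 block.
// nb1 is the byte stride between rows of the target region inside dst,
// offset is the byte offset of the region's first element inside dst.
void set_two_phase(const float* src0, std::size_t n_dst,
                   const float* src1, std::size_t ne0, std::size_t ne1,
                   std::size_t nb1, std::size_t offset, float* dst) {
    // Phase 1: base copy of src0 into dst.
    std::memcpy(dst, src0, n_dst * sizeof(float));
    if (ne0 == 0 || ne1 == 0) {
        return;  // nothing to overlay
    }
    // The last written byte must stay inside dst.
    assert(offset + (ne1 - 1) * nb1 + ne0 * sizeof(float) <= n_dst * sizeof(float));

    // Phase 2: overlay src1 row by row at the requested offset/stride.
    char* base = reinterpret_cast<char*>(dst) + offset;
    for (std::size_t i1 = 0; i1 < ne1; ++i1) {
        std::memcpy(base + i1 * nb1, src1 + i1 * ne0, ne0 * sizeof(float));
    }
}
```

For a 4x4 destination, writing a 2x2 block starting at row 1, column 1 corresponds to `nb1 = 4 * sizeof(float)` and `offset = 5 * sizeof(float)`.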

What’s in this PR

  • New/complete SYCL path for GGML_OP_SET:
    • Fast path for contiguous buffers (memcpy/queue copy).
    • Generic kernel for strided / non-contiguous layouts.
    • Two-phase logic (base copy, then overlay) on the same queue.
  • Debug fix: scope_op_debug_print(..., /*num_src=*/2).
  • Type assert alignment:
    • Default: F32 and I32.
    • With GGML_SYCL_F16: F16 is allowed as well (the copy is sized by
      element width, so this is safe).
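
One way to keep the type assert and the support check from drifting apart is to route both through a single predicate. The sketch below is an assumption about how that could look, not the code in this PR: `ElemType`, `set_supports_type`, and `set_check_type` are hypothetical names, and the `f16_enabled` flag stands in for the compile-time GGML_SYCL_F16 switch so the example can be exercised at run time.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the real type enum; Other models any unsupported type.
enum class ElemType { F32, I32, F16, Other };

// Single source of truth for which element types the SET path accepts.
// f16_enabled models whether the build was configured with GGML_SYCL_F16.
inline bool set_supports_type(ElemType t, bool f16_enabled) {
    switch (t) {
        case ElemType::F32:
        case ElemType::I32:
            return true;
        case ElemType::F16:
            return f16_enabled;
        default:
            return false;
    }
}

// The implementation-side check reuses the same predicate, so the assert
// cannot reject a type that the support query advertises.
inline void set_check_type(ElemType t, bool f16_enabled) {
    assert(set_supports_type(t, f16_enabled));
    (void) t;
    (void) f16_enabled;  // silence unused warnings when NDEBUG is defined
}
```

With this shape, enabling F16 changes one function and both the support query and the assert follow.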

Implementation Notes

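  • The code uses ggml_type_size(dst->type) so that copy sizes are computed in
    bytes from the element size.
  • Both phases are submitted to the same SYCL queue; when that queue is
    in-order, the overlay cannot start before the base copy finishes, so no
    explicit barrier is required. If desired, a submit barrier can be added as
    a conservative measure.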
  • Kept behavior consistent with other element-wise ops and device_supports_op().
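
The size-aware, strided copy mentioned in these notes can be sketched on the host as follows. This is an illustrative assumption rather than the PR's kernel: `overlay_element` and `overlay_strided` are hypothetical names, the region is limited to 2D, and the plain loop stands in for what would be one work-item per flat index in a SYCL kernel. All strides and the offset are in bytes, and the element width is passed in the way ggml_type_size(dst->type) would supply it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical per-element body: copies one element of elem_size bytes from
// the (possibly non-contiguous) source to the strided destination region.
inline void overlay_element(std::size_t i, std::size_t ne0, std::size_t elem_size,
                            const unsigned char* src, std::size_t src_nb0, std::size_t src_nb1,
                            unsigned char* dst, std::size_t dst_nb0, std::size_t dst_nb1,
                            std::size_t dst_offset) {
    const std::size_t i0 = i % ne0;  // index along the fastest dimension
    const std::size_t i1 = i / ne0;  // row index
    std::memcpy(dst + dst_offset + i0 * dst_nb0 + i1 * dst_nb1,
                src + i0 * src_nb0 + i1 * src_nb1,
                elem_size);
}

// Host loop standing in for a parallel launch over ne0 * ne1 work-items.
inline void overlay_strided(std::size_t ne0, std::size_t ne1, std::size_t elem_size,
                            const unsigned char* src, std::size_t src_nb0, std::size_t src_nb1,
                            unsigned char* dst, std::size_t dst_nb0, std::size_t dst_nb1,
                            std::size_t dst_offset) {
    assert(elem_size > 0);
    for (std::size_t i = 0; i < ne0 * ne1; ++i) {
        overlay_element(i, ne0, elem_size, src, src_nb0, src_nb1,
                        dst, dst_nb0, dst_nb1, dst_offset);
    }
}
```

Because only byte counts are involved, the same routine handles 2-byte and 4-byte elements without interpreting their contents, which is the property that makes admitting F16 safe for a pure copy.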

Tests

  • Built with -DLLAMA_SYCL=ON and ran:
    • ./bin/test-backend-ops test -b CPU -o SET
    • ./bin/test-backend-ops test -b SYCL0 -o SET ✅ (OpenCL GPU)
  • Optional: built with GGML_SYCL_F16 to validate the F16 path.

Performance

  • Contiguous paths use direct copies for best throughput.
  • Strided paths use a simple parallel kernel; no regressions were observed
    relative to the CPU backend.

Checklist

  • Compiles with and without GGML_SYCL_F16
  • Tests pass on CPU and SYCL0
  • Style/formatting applied (clang-format)
  • No public API changes

Thanks for reviewing!

@github-actions bot added the documentation, SYCL, and ggml labels on Sep 15, 2025