This repository was archived by the owner on Jan 26, 2024. It is now read-only.

@jchlanda jchlanda commented Oct 6, 2021

Please see https://www.khronos.org/registry/OpenCL/extensions/intel/cl_intel_subgroups.html for the details of the shuffles.

This was uncovered while writing libclc's Intel subgroup shuffles (https://github.com/intel/llvm/pull/4664/files), which use the same bpermute built-in and were failing tests from llvm-test-suite (among others: https://github.com/intel/llvm-test-suite/blob/intel/SYCL/SubGroup/shuffle.hpp#L88).

Contributor

b-sumner commented Feb 7, 2022

@jchlanda, (self & ~(width-1)) is the lowest lane in the group of width lanes that includes self. If index, the source lane, is below that value, then the shuffle up operation for lane self is a no-op. I do not believe the proposed patch is correct.

Suppose width = 4, self = 2, and lane_delta = 5. Then (self & ~(width-1)) = 0. The first assignment of index results in 2 - 5 = -3, which is below 0, the lowest lane of the group. The proposed patch incorrectly keeps the index at -3, whereas the current code replaces index with 2 (that is, self), which is correct.
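
The arithmetic in this example can be checked with a small host-side sketch. It is reconstructed only from the description in this comment and is not copied from the library source; the function name `shuffle_up_source_lane`, the use of plain `int` for every argument, and the assumption that `width` is a power of two are all illustrative.

```c
/* Hypothetical helper: returns the lane that lane `self` would read from
 * in a shuffle-up by `lane_delta` within groups of `width` lanes.
 * Assumes `width` is a power of two, so ~(width - 1) masks off the
 * position of the lane inside its group. */
int shuffle_up_source_lane(int self, int lane_delta, int width)
{
    /* Lowest lane in the group of `width` lanes that contains `self`. */
    int group_base = self & ~(width - 1);

    /* First assignment: the lane `lane_delta` positions below `self`. */
    int index = self - lane_delta;

    /* If the source lane falls below the group, the shuffle is a no-op
     * for this lane, so it reads its own value. */
    if (index < group_base)
        index = self;

    return index;
}
```

With width = 4, self = 2, and lane_delta = 5, the helper computes a group base of 0 and a first index of -3, then falls back to lane 2, matching the behaviour described above for the current code.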
