Description
Specialized shuffles to zip or unzip lanes. #28 proposes interleave and concat, which roughly correspond to Arm's zip and unzip instructions. In short, interleave/zip takes the odd or even lanes from two vectors and interleaves them in the output. Concat/unzip is the reverse operation: the odd or even lanes from each source are placed together in the destination.
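As a rough illustration of the lane movement described above, here is a minimal Python sketch that models vectors as plain lists. The function names are hypothetical and chosen only for this example; they are not necessarily the operation names used in #28, and the sketch follows the wording in this description rather than any particular instruction encoding.

```python
def interleave_even(a, b):
    """Alternate the even-indexed lanes of a and b: a0, b0, a2, b2, ..."""
    assert len(a) == len(b)
    out = []
    for i in range(0, len(a), 2):
        out += [a[i], b[i]]
    return out


def interleave_odd(a, b):
    """Alternate the odd-indexed lanes of a and b: a1, b1, a3, b3, ..."""
    assert len(a) == len(b)
    out = []
    for i in range(1, len(a), 2):
        out += [a[i], b[i]]
    return out


def concat_even(a, b):
    """Even-indexed lanes of a, followed by even-indexed lanes of b."""
    assert len(a) == len(b)
    return a[0::2] + b[0::2]


def concat_odd(a, b):
    """Odd-indexed lanes of a, followed by odd-indexed lanes of b."""
    assert len(a) == len(b)
    return a[1::2] + b[1::2]


a = [0, 1, 2, 3]
b = [10, 11, 12, 13]
print(interleave_even(a, b))  # [0, 10, 2, 12]
print(interleave_odd(a, b))   # [1, 11, 3, 13]
print(concat_even(a, b))      # [0, 2, 10, 12]
print(concat_odd(a, b))       # [1, 3, 11, 13]
```

In this model every output has the same lane count as each input, so an even/odd pair of calls is needed to cover all lanes of both sources.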
The closest x86 equivalent to zip/interleave is unpack, but it takes adjacent lanes from the source operands instead of odd or even ones. It is called unpack because, when used with a vector of zeros, it is the opposite of "pack", which reduces lane sizes with signed or unsigned saturation.
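To make the unpack behavior concrete, the following Python sketch models a single 128-bit vector as a list of lanes. The helper names are made up for this example, and the model assumes little-endian lane order and one 128-bit vector only. It shows unpack taking adjacent lanes from one half of each source, and how unpacking with zeros widens lanes in a way that a saturating pack can undo.

```python
def unpack_low(a, b):
    """Interleave adjacent lanes from the low halves: a0, b0, a1, b1, ..."""
    assert len(a) == len(b)
    half = len(a) // 2
    out = []
    for x, y in zip(a[:half], b[:half]):
        out += [x, y]
    return out


def unpack_high(a, b):
    """Interleave adjacent lanes from the high halves of a and b."""
    assert len(a) == len(b)
    half = len(a) // 2
    out = []
    for x, y in zip(a[half:], b[half:]):
        out += [x, y]
    return out


def widen_low_u8(a):
    """Zero-extend the low half of 8-bit lanes by unpacking with zeros."""
    unpacked = unpack_low(a, [0] * len(a))
    # Little-endian: each pair is (low byte, zero high byte) of a 16-bit lane.
    return [lo | (hi << 8) for lo, hi in zip(unpacked[0::2], unpacked[1::2])]


def pack_u8_saturate(a, b):
    """Narrow signed 16-bit lanes of a, then b, to 8 bits with unsigned saturation."""
    return [min(max(x, 0), 255) for x in a + b]


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [11, 12, 13, 14, 15, 16, 17, 18]
print(unpack_low(a, b))   # [1, 11, 2, 12, 3, 13, 4, 14]
print(unpack_high(a, b))  # [5, 15, 6, 16, 7, 17, 8, 18]

wide = widen_low_u8([200, 1, 2, 3, 4, 5, 6, 7])
print(wide)                                       # [200, 1, 2, 3]
print(pack_u8_saturate(wide, [300, -5, 0, 255]))  # [200, 1, 2, 3, 255, 0, 0, 255]
```

Packing the widened lanes returns the original 8-bit values, which is the sense in which unpack with zeros reverses pack. Out-of-range values such as 300 and -5 are clamped instead of wrapped.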
The obvious takeaway is that these operations exist, but they are quite different on the two major platforms. The less obvious question is how to marry the two approaches.