Proposed in #28 (originally #27). It differs from the existing splat in that it broadcasts a lane of the input vector rather than a scalar, and it takes an index that selects which lane to broadcast:
Gets a single lane from the vector and broadcasts it to every lane of the result. idx is interpreted modulo the cardinal of the vector.
vec.v8.splat_lane(v: vec.v8, idx: i32) -> vec.v8
vec.v16.splat_lane(v: vec.v16, idx: i32) -> vec.v16
vec.v32.splat_lane(v: vec.v32, idx: i32) -> vec.v32
vec.v64.splat_lane(v: vec.v64, idx: i32) -> vec.v64
vec.v128.splat_lane(v: vec.v128, idx: i32) -> vec.v128
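The intended semantics can be illustrated with a small scalar model. This is only a sketch: it represents a vector as a Python list of lanes, the function name `splat_lane` mirrors the proposed operation but is not part of any real API, and it assumes a non-negative `idx` (the proposal does not say here whether `idx` is treated as signed or unsigned).

```python
def splat_lane(v, idx):
    """Scalar model: broadcast lane (idx mod len(v)) of v to every lane.

    Assumption: v is a non-empty list of lanes and idx is non-negative.
    """
    n = len(v)
    if n == 0:
        raise ValueError("vector must have at least one lane")
    lane = v[idx % n]  # idx wraps around the number of lanes
    return [lane] * n


if __name__ == "__main__":
    v = [10, 20, 30, 40]
    print(splat_lane(v, 2))  # lane 2 is broadcast
    print(splat_lane(v, 6))  # 6 mod 4 == 2, so lane 2 again
```

Because the index wraps, an out-of-range `idx` never traps in this model; it simply selects `idx mod n`.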
On x86, broadcast instructions first appear in AVX (32-bit floating-point elements; AVX2 for integers). However, the x86 variants don't take an index and only broadcast the first element of the source. A general-purpose shuffle would be needed to emulate this on SSE, which is not great (definitely slower than the specialized version). Also, taking an index would turn this into a general-purpose shuffle on AVX+ as well.
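To make the lowering concern concrete, here is a rough scalar model of the two strategies. It does not use real instructions: `shuffle`, `splat_lane_via_shuffle`, and `broadcast_first` are hypothetical helper names chosen for illustration, and vectors are plain Python lists.

```python
def shuffle(v, indices):
    """Model of a general-purpose shuffle: result[i] = v[indices[i]]."""
    return [v[i] for i in indices]


def splat_lane_via_shuffle(v, idx):
    """Emulate splat_lane with a shuffle whose control selects one lane everywhere."""
    n = len(v)
    control = [idx % n] * n  # every output lane reads the same source lane
    return shuffle(v, control)


def broadcast_first(v):
    """Model of a broadcast that can only use lane 0 as its source."""
    return [v[0]] * len(v)


if __name__ == "__main__":
    v = [1, 2, 3, 4]
    print(splat_lane_via_shuffle(v, 3))
    print(broadcast_first(v))
```

In this model the two only coincide when `idx mod n` is 0; any other index has to go through the shuffle path, which is the cost discussed above.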