From 7f9018cb1859c38360d64fa824b1959c1973a441 Mon Sep 17 00:00:00 2001 From: David Wood Date: Wed, 18 May 2022 12:38:23 +0100 Subject: [PATCH 01/25] introduce `repr(scalable)` Co-authored-by: Jamie Cunliffe --- text/0000-repr-scalable.md | 326 +++++++++++++++++++++++++++++++++++++ 1 file changed, 326 insertions(+) create mode 100644 text/0000-repr-scalable.md diff --git a/text/0000-repr-scalable.md b/text/0000-repr-scalable.md new file mode 100644 index 00000000000..d39f9566472 --- /dev/null +++ b/text/0000-repr-scalable.md @@ -0,0 +1,326 @@ +- Feature Name: `repr_scalable` +- Start Date: 2025-07-07 +- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) + +# Summary +[summary]: #summary + +Extends Rust's existing SIMD infrastructure, `#[repr(simd)]`, with a +complementary scalable representation, `#[repr(scalable)]`, to support scalable +vector types, such as Arm's Scalable Vector Extension (SVE), or RISC-V's Vector +Extension (RVV). + +Like the existing `repr(simd)` representation, `repr(scalable)` is internal +compiler infrastructure that will be used only in the standard library to +introduce scalable vector types which can then be stablised. Only the +infrastructure to define these types are introduced in this RFC, not the types +or intrinsics that use it. + +This RFC builds on Rust's existing SIMD infrastructure, introduced in +[rfcs#1199: SIMD Infrastructure][rfcs#1199]. It depends on +[rfcs#3729: Hierarchy of Sized traits][rfcs#3729]. + +SVE is used in examples throughout this RFC, but the proposed features should be +sufficient to enable support for similar extensions in other architectures, such +as RISC-V's V Extension. + +# Motivation +[motivation]: #motivation + +SIMD types and instructions are a crucial element of high-performance Rust +applications and allow for operating on multiple values in a single instruction. +Many processors have SIMD registers of a known fixed length and provide +intrinsics which operate on these registers. For example, Arm's Neon extension +is well-supported by Rust and provides 128-bit registers and a wide range of +intrinsics. + +Instead of releasing more extensions with ever increasing register bit widths, +AArch64 has introduced a Scalable Vector Extension (SVE). Similarly, RISC-V has +a Vector Extension (RVV). These extensions have vector registers whose width +depends on the CPU implementation and bit-width-agnostic intrinsics for +operating on these registers. By using scalale vectors, code won't need to be +re-written using new architecture extensions with larger registers, new types +and intrinsics, but instead will work on newer processors with different vector +register lengths and performance characteristics. + +Scalable vectors have interesting and challenging implications for Rust, +introducing value types with sizes that can only be known at runtime, requiring +significant changes to the language's notion of sizedness - this support is +being proposed in the [rfcs#3729]. + +Hardware is generally available with SVE, and key Rust stakeholders want to be +able to use these architecture features from Rust. In a [recent discussion on +SVE, Amanieu, co-lead of the library team, said][quote_amanieu]: + +> I've talked with several people in Google, Huawei and Microsoft, all of whom +> have expressed a rather urgent desire for the ability to use SVE intrinsics in +> Rust code, especially now that SVE hardware is generally available. + +Without support in the compiler, leveraging the +[*Hierarchy of Sized traits*][rfcs#3729] proposal, it is not possible to +introduce intrinsics and types exposing the scalable vector support in hardware. + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +None of the infrastructure proposed in this RFC is intended to be used directly +by Rust users. + +`repr(scalable)` as described later in +[*Reference-level explanation*][reference-level-explanation] is perma-unstable +and exists only enables scalable vector types to be defined in the standard +library. The specific vector types are intended to eventually be stabilised, but +none are proposed in this RFC. + +## Using scalable vectors +[using-scalable-vectors]: #using-scalable-vectors + +Scalable vector types correspond to vector registers in hardware with unknown +size at compile time. However, it will be a known and fixed size at runtime. +Additional properties could be known during compilation, depending on the +architecture, such as a minimum or maximum size or that the size must be a +multiple of some factor. + +As previously described, users will not define their own scalable vector types +and instead use intrinsics from `std::arch`, and this RFC is not proposing any +such intrinsics, just the infrastructure. However, to illustrate how the types +and intrinsics that this infrastructure will enable can be used, consider the +following example that sums two input vectors: + +```rust +fn sve_add(in_a: Vec, in_b: Vec, out_c: &mut Vec) { + let len = in_a.len(); + unsafe { + // `svcntw` returns the actual number of elements that are in a 32-bit + // element vector + let step = svcntw() as usize; + for i in (0..len).step_by(step) { + let a = in_a.as_ptr().add(i); + let b = in_b.as_ptr().add(i); + let c = out_c as *mut f32; + let c = c.add(i); + + // `svwhilelt_b32` generates a mask based on comparing the current + // index against the `len` + let pred = svwhilelt_b32(i as _, len as _); + + // `svld1_f32` loads a vector register with the data from address + // `a`, zeroing any elements in the vector that are masked out + let sva = svld1_f32(pred, a); + let svb = svld1_f32(pred, b); + + // `svadd_f32_m` adds `a` and `b`, any lanes that are masked out will + // take the keep value of `a` + let svc = svadd_f32_m(pred, sva, svb); + + // `svst1_f32` will store the result without accessing any memory + // locations that are masked out + svst1_f32(svc, pred, c); + } + } +} +``` + +From a user's perspective, writing code for scalable vectors isn't too different +from when writing code with a fixed sized vector. + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +Types annotated with the `#[repr(simd)]` attribute contains either an array +field or multiple fields to indicate the intended size of the SIMD vector that +the type represents. + +Similarly, a `scalable` repr is introduced to define a scalable vector type. +`scalable` accepts an integer to determine the minimum number of elements the +vector contains. For example: + +```rust +#[repr(simd, scalable(4))] +pub struct svfloat32_t { _ty: [f32], } +``` + +As with the existing `repr(simd)`, `_ty` is purely a type marker, used to get +the element type for the codegen backend. + +## Properties of scalable vectors +[properties-of-scalable-vector-types]: #properties-of-scalable-vectors + +Scalable vectors are necessarily non-`const Sized` (from [rfcs#3729]) as they +behave like value types but the exact size cannot be known at compilation time. + +[rfcs#3729] allows these types to implement `Clone` (and consequently `Copy`) as +`Clone` only requires an implementation of `Sized`, irrespective of constness. + +Scalable vector types have some further restrictions due to limitations of the +codegen backend: + +- Cannot be stored in compound types (structs, enums, etc) + - Including coroutines, so these types cannot be held across + an await boundary in async functions. +- Cannot be used in arrays +- Cannot be the type of a static variable. + +Some of these limitations may be able to be lifted in future depending on what +is supported by rustc's codegen backends. + +## ABI +[abi]: #abi + +Rust currently always passes SIMD vectors on the stack to avoid ABI mismatches +between functions annotated with `target_feature` - where the relevant vector +register is guaranteed to be present - and those without - where the relevant +vector register might not be present. + +However, this approach will not work for scalable vector types as the relevant +target feature must to be present to use the instruction that can allocate the +correct size on the stack for the scalable vector. + +Therefore, there is an additional restriction that these types cannot be used in +the argument or return types of functions unless those functions are annotated +with the relevant target feature. + +## Target features +[target-features]: #target-features + +Similarly to the issues with the ABI of scalable vectors, without the relevant +target features, few operations can actually be performed on scalable vectors - +causing issues for the use of scalable vectors in generic code and with traits. +For example, implementations of traits like `Clone` would not be able to +actually perform a clone, and generic functions that are instantiated with +scalable vectors would during instruction selection in the codegen backend. + +When a scalable vector is instantiated into a generic function during +monomorphisation, or a trait method is being implemented for a scalable vector, +then the relevant target feature will be added to the function. + +For example, when instantiating `std::mem::size_of_val` with a scalable vector +during monomorphisation, the relevant target feature will be added to `size_of_val` +for codegen. + +## Implementing `repr(scalable)` +[implementing-repr-scalable]: #implementing-reprscalable + +Implementing `repr(scalable)` largely involves lowering scalable vectors to the +appropriate type in the codegen backend. LLVM has robust support for scalable +vectors and is the default backend, so this section will focus on implementation +in the LLVM codegen backend. Other codegen backends can implement support when +scalable vectors are supported by the backend. + +Most of the complexity of SVE is handled by LLVM: lowering Rust's scalable +vectors to the correct type in LLVM and the `vscale` modifier that is applied to +LLVM's vector types. + +LLVM's scalable vector type is of the form ``: + +- `elements` multiplied by `size_of::<$ty>` gives the smallest allowed register + size and the increment size +- `vscale` is a runtime constant that is used to determine the actual vector + register size + +For example, with SVE, the scalable vector register (`Z` register) size has to +be a multiple of 128 bits and a power of 2. Only the value of `elements` can be +chosen by compiler. For `f32`, `elements` must always be four, as with the +minimum `vscale` of one, `1 * 4 * sizeof(f32)` is the 128-bit minimum register +size. + +At runtime `vscale` could then be any power of two which would result in +register sizes of 128, 256, 512, 1024 and 2048. `vscale` could be any value +providing it gives a legal vector register size for the architecture. + +`repr(scalable)` expects the number of `elements` to be provided rather than +calculating it. This avoids needing to teach the compiler how to calculate the +required `element` count, particularly as some of these scalable types can have +different element counts. For instance, the predicates used in SVE have +different element counts depending on the types they are a predicate for. + +While it is possible to change the vector length at runtime using a +[`prctl()`][prctl] call to the kernel, this would require that `vscale` change, +which is unsupported. As Rust cannot prevent users from doing this, it will be +documented as undefined behaviour, consistent with C and C++. + +# Drawbacks +[drawbacks]: #drawbacks + +- `repr(scalable)` is inherently additional complexity to the language, despite + being largely hidden from users. + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +Without support for scalable vectors in the language and compiler, it is not +possible to leverage hardware with scalable vectors from Rust. As extensions +with scalable vectors are available in architectures as either the only or +recommended way to do SIMD, lack of support in Rust would severely limit Rust's +suitability on these architectures compared to other systems programming +languages. + +By aligning with the approach taken by C (discussed in the +[*Prior art*][prior-art] below), most of the documentation that already exists +for scalable vector intrinsics in C should still be applicable to Rust. + +# Prior art +[prior-art]: #prior-art + +There are not many languages with support for scalable vectors: + +- SVE in C takes a similar approach as this proposal by using sizeless + incomplete types to represent scalable vectors. However, sizeless types are + not part of the C specification and Arm's C Language Extensions (ACLE) provide + [an edit to the C standard][acle_sizeless] which formally define "sizeless + types". +- [.NET 9 has experimental support for SVE][dotnet], but as a managed language, + the design and implementation considerations in .NET are quite different to + Rust. + +[rfcs#3268] was a previous iteration of this RFC. + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +There are currently no unresolved questions. + +# Future possibilities +[future-possibilities]: #future-possibilities + +There are a handful of future possibilities enabled by this RFC: + +## General mechanism for target-feature-affected types +[general-mechanism-target-feature-types]: #general-mechanism-for-target-feature-affected-types + +A more general mechanism for enforcing that SIMD types are only used in +`target_feature`-annotated functions would be useful, as this would enable SVE +types to have fewer distinct restrictions than other SIMD types, and would +enable SIMD vectors to be passed by-register, a performance improvement. + +Such a mechanism would need be introduced gradually to existing SIMD types with +a forward compatibility lint. This will be addressed in a forthcoming RFC. + +## Relaxed restrictions +[relaxed-restrictions]: #relaxed-restrictions + +Some of the restrictions on these types (e.g. use in compound types) could be +relaxed at a later time either by extending rustc's codegen or leveraging newly +added support in LLVM. + +However, as C also has restriction and scalable vectors are nevertheless used in +production code, it is unlikely there will be much demand for those restrictions +to be relaxed. + +## Portable SIMD +[portable-simd]: #portable-simd + +Given that there are significant differences between scalable vectors and +fixed-length vectors, and that `std::simd` is unstable, it is worth +experimenting with architecture-specific support and implementation initially. +Later, there are a variety of approaches that could be taken to incorporate +support for scalable vectors into Portable SIMD. + +[acle_sizeless]: https://arm-software.github.io/acle/main/acle.html#formal-definition-of-sizeless-types +[dotnet]: https://github.com/dotnet/runtime/issues/93095 +[prctl]: https://www.kernel.org/doc/Documentation/arm64/sve.txt +[rfcs#1199]: https://rust-lang.github.io/rfcs/1199-simd-infrastructure.html +[rfcs#3268]: https://github.com/rust-lang/rfcs/pull/3268 +[rfcs#3729]: https://github.com/rust-lang/rfcs/pull/3729 +[quote_amanieu]: https://github.com/rust-lang/rust/pull/118917#issuecomment-2202256754 From 48c8407f261092224f48afbb432cea2a5affde2f Mon Sep 17 00:00:00 2001 From: David Wood Date: Mon, 14 Jul 2025 10:16:37 +0000 Subject: [PATCH 02/25] mention prctl use for child processes --- text/0000-repr-scalable.md | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/text/0000-repr-scalable.md b/text/0000-repr-scalable.md index d39f9566472..c2ef08c3621 100644 --- a/text/0000-repr-scalable.md +++ b/text/0000-repr-scalable.md @@ -237,8 +237,10 @@ different element counts depending on the types they are a predicate for. While it is possible to change the vector length at runtime using a [`prctl()`][prctl] call to the kernel, this would require that `vscale` change, -which is unsupported. As Rust cannot prevent users from doing this, it will be -documented as undefined behaviour, consistent with C and C++. +which is unsupported. `prctl` must only be used to set up the vector length for +child processes, not to change the vector length of the current process. As Rust +cannot prevent users from doing this, it will be documented as undefined +behaviour, consistent with C and C++. # Drawbacks [drawbacks]: #drawbacks From 55f532e0912fbe463ff7e71987e7a248ff611b7a Mon Sep 17 00:00:00 2001 From: David Wood Date: Mon, 14 Jul 2025 11:14:19 +0000 Subject: [PATCH 03/25] rename to rfcs#3838 --- text/{0000-repr-scalable.md => 3838-repr-scalable.md} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename text/{0000-repr-scalable.md => 3838-repr-scalable.md} (99%) diff --git a/text/0000-repr-scalable.md b/text/3838-repr-scalable.md similarity index 99% rename from text/0000-repr-scalable.md rename to text/3838-repr-scalable.md index c2ef08c3621..4ff6b80d65d 100644 --- a/text/0000-repr-scalable.md +++ b/text/3838-repr-scalable.md @@ -1,6 +1,6 @@ - Feature Name: `repr_scalable` - Start Date: 2025-07-07 -- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) +- RFC PR: [rust-lang/rfcs#3838](https://github.com/rust-lang/rfcs/pull/3838) - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) # Summary From 6afaff7d5f15a0c3ec6c0e4307225c0ab7d36106 Mon Sep 17 00:00:00 2001 From: David Wood Date: Mon, 14 Jul 2025 14:53:04 +0000 Subject: [PATCH 04/25] permit `repr(transparent)` newtypes --- text/3838-repr-scalable.md | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/text/3838-repr-scalable.md b/text/3838-repr-scalable.md index 4ff6b80d65d..227e804bebb 100644 --- a/text/3838-repr-scalable.md +++ b/text/3838-repr-scalable.md @@ -157,9 +157,14 @@ Scalable vector types have some further restrictions due to limitations of the codegen backend: - Cannot be stored in compound types (structs, enums, etc) - - Including coroutines, so these types cannot be held across - an await boundary in async functions. + + - Including coroutines, so these types cannot be held across an await + boundary in async functions + + - `repr(transparent)` newtypes could be permitted with scalable vectors + - Cannot be used in arrays + - Cannot be the type of a static variable. Some of these limitations may be able to be lifted in future depending on what From 35fd341d85b0ab7d46636ed99bb02bc1edd7323e Mon Sep 17 00:00:00 2001 From: David Wood Date: Mon, 14 Jul 2025 14:53:32 +0000 Subject: [PATCH 05/25] elaborate on choosing `N` --- text/3838-repr-scalable.md | 113 +++++++++++++++++++++++++++---------- 1 file changed, 83 insertions(+), 30 deletions(-) diff --git a/text/3838-repr-scalable.md b/text/3838-repr-scalable.md index 227e804bebb..7646153a303 100644 --- a/text/3838-repr-scalable.md +++ b/text/3838-repr-scalable.md @@ -7,11 +7,11 @@ [summary]: #summary Extends Rust's existing SIMD infrastructure, `#[repr(simd)]`, with a -complementary scalable representation, `#[repr(scalable)]`, to support scalable -vector types, such as Arm's Scalable Vector Extension (SVE), or RISC-V's Vector -Extension (RVV). +complementary scalable representation, `#[repr(scalable(N)]`, to support +scalable vector types, such as Arm's Scalable Vector Extension (SVE), or +RISC-V's Vector Extension (RVV). -Like the existing `repr(simd)` representation, `repr(scalable)` is internal +Like the existing `repr(simd)` representation, `repr(scalable(N))` is internal compiler infrastructure that will be used only in the standard library to introduce scalable vector types which can then be stablised. Only the infrastructure to define these types are introduced in this RFC, not the types @@ -39,7 +39,7 @@ Instead of releasing more extensions with ever increasing register bit widths, AArch64 has introduced a Scalable Vector Extension (SVE). Similarly, RISC-V has a Vector Extension (RVV). These extensions have vector registers whose width depends on the CPU implementation and bit-width-agnostic intrinsics for -operating on these registers. By using scalale vectors, code won't need to be +operating on these registers. By using scalable vectors, code won't need to be re-written using new architecture extensions with larger registers, new types and intrinsics, but instead will work on newer processors with different vector register lengths and performance characteristics. @@ -132,9 +132,9 @@ Types annotated with the `#[repr(simd)]` attribute contains either an array field or multiple fields to indicate the intended size of the SIMD vector that the type represents. -Similarly, a `scalable` repr is introduced to define a scalable vector type. -`scalable` accepts an integer to determine the minimum number of elements the -vector contains. For example: +Similarly, a `scalable(N)` representation is introduced to define a scalable +vector type. `scalable(N)` accepts an integer to determine the minimum number of +elements the vector contains. For example: ```rust #[repr(simd, scalable(4))] @@ -144,6 +144,37 @@ pub struct svfloat32_t { _ty: [f32], } As with the existing `repr(simd)`, `_ty` is purely a type marker, used to get the element type for the codegen backend. +`svfloat32_t` is a scalable vector with a minimum of four `f32` elements and +potentially more depending on the length of the vector register at runtime. + +## Choosing `N` +[choosing-n]: #choosing-n + +Many intrinsics using scalable vectors accept both a predicate vector argument +and data vector arguments. Predicate vectors determine whether a lane is on or +off for the operation performed by any given intrinsic. Predicate vectors may +use different registers of sizes to the vectors containing data. +`repr(scalable)` is used to define vectors containing both data and predicates. + +As `repr(scalable(N))` is intended to be a permanently unstable attribute, any +value of `N` is accepted by the attribute and it is the responsibility of +whomever is defining the type to provide a valid value. A correct value for `N` +depends on the purpose of the specific scalable vector type and the +architecture. + +For example, with SVE, the scalable vector register length is a minimum of 128 +bits, must be a multiple of 128 bits and a power of 2; and predicate registers +have one bit for each byte in the vector registers. So, for `svfloat32_t` +defined shown above, an `f32` is 32-bits and with `N=4`, the entire minimum +register length of 128 bits is used (4 x 32 = 128). An intrinsic that takes a +`svfloat32_t` may also want to accept as an argument a predicate vector with a +matching four elements (`N=4`), which would only use 4 bits of the predicate +register rather than the full 16 bits. + +See +[*Manually-chosen or compiler-calculated element count*][manual-or-calculated-element-count] +for a discussion on why `N` is not calculated by the compiler. + ## Properties of scalable vectors [properties-of-scalable-vector-types]: #properties-of-scalable-vectors @@ -217,28 +248,17 @@ Most of the complexity of SVE is handled by LLVM: lowering Rust's scalable vectors to the correct type in LLVM and the `vscale` modifier that is applied to LLVM's vector types. -LLVM's scalable vector type is of the form ``: - -- `elements` multiplied by `size_of::<$ty>` gives the smallest allowed register - size and the increment size -- `vscale` is a runtime constant that is used to determine the actual vector - register size +LLVM's scalable vector type is of the form ``. +`vscale` is the scaling factor determined by the hardware at runtime, it can be +any value providing it gives a legal vector register size for the architecture. -For example, with SVE, the scalable vector register (`Z` register) size has to -be a multiple of 128 bits and a power of 2. Only the value of `elements` can be -chosen by compiler. For `f32`, `elements` must always be four, as with the -minimum `vscale` of one, `1 * 4 * sizeof(f32)` is the 128-bit minimum register -size. +For example, a `` is a scalable vector with a minimum of four +`f32` elements and with SVE, `vscale` could then be any power of two which would +result in register sizes of 128, 256, 512, 1024 or 2048 and 4, 8, 16, 32, or 64 +`f32` elements respectively. -At runtime `vscale` could then be any power of two which would result in -register sizes of 128, 256, 512, 1024 and 2048. `vscale` could be any value -providing it gives a legal vector register size for the architecture. - -`repr(scalable)` expects the number of `elements` to be provided rather than -calculating it. This avoids needing to teach the compiler how to calculate the -required `element` count, particularly as some of these scalable types can have -different element counts. For instance, the predicates used in SVE have -different element counts depending on the types they are a predicate for. +The `N` in the `#[repr(scalable(N))]` determines the `element_count` used in the +LLVM type for a scalable vector. While it is possible to change the vector length at runtime using a [`prctl()`][prctl] call to the kernel, this would require that `vscale` change, @@ -250,8 +270,8 @@ behaviour, consistent with C and C++. # Drawbacks [drawbacks]: #drawbacks -- `repr(scalable)` is inherently additional complexity to the language, despite - being largely hidden from users. +- `repr(scalable(N))` is inherently additional complexity to the language, + despite being largely hidden from users. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives @@ -267,6 +287,39 @@ By aligning with the approach taken by C (discussed in the [*Prior art*][prior-art] below), most of the documentation that already exists for scalable vector intrinsics in C should still be applicable to Rust. +## Manually-chosen or compiler-calculated element count +[manual-or-calculated-element-count]: #manually-chosen-or-compiler-calculated-element-count + +`repr(scalable(N))` expects `N` to be provided rather than calculating it. This +avoids needing to teach the compiler how to calculate the required `element` +count, which isn't always trivial. + +Many of the intrinsics which accept scalable vectors as an argument also accept +a predicate vector. Predicate vectors decide which lanes are on or off for an +operation (i.e. which elements in the vector are operated on). Predicate vectors +can be in different and smaller registers than the data. For example, +`` could be the predicate vector for a `` +vector + +For non-predicate scalable vectors, it will be typical that `N` will be +`$minimum_register_length / $type_size` (e.g. `4` for `f32` or `8` for `f16` +with a minimum 128-bit register length). In this circumstance, `N` could be +trivially calculated by the compiler. + +For predicate vectors, it is desirable to be able to to define types where `N` +matches the number of elements in the non-predicate vector, i.e. a +`` to match a ``, `` to +match ``, or `` to match +``. In this circumstance, it might still be possible to give +rustc all of the relevant information such that it could compute `N`, but it +would add extra complexity. + +This RFC takes the position that the additional complexity required to have the +compiler always be able to calculate `N` isn't justified given the permanently +unstable nature of the `repr(scalable(N))` attribute and the scalable vector +types defined in `std::arch` are likely to be few in number, automatically +generated and well-tested. + # Prior art [prior-art]: #prior-art From f514a3ef7e9699d9065dd84c5b19a1e173d7b6c0 Mon Sep 17 00:00:00 2001 From: David Wood Date: Mon, 28 Jul 2025 06:02:00 +0000 Subject: [PATCH 06/25] rename to `rustc_scalable_vector` --- ...r-scalable.md => 3838-scalable-vectors.md} | 73 +++++++++---------- 1 file changed, 36 insertions(+), 37 deletions(-) rename text/{3838-repr-scalable.md => 3838-scalable-vectors.md} (87%) diff --git a/text/3838-repr-scalable.md b/text/3838-scalable-vectors.md similarity index 87% rename from text/3838-repr-scalable.md rename to text/3838-scalable-vectors.md index 7646153a303..bd731b3938d 100644 --- a/text/3838-repr-scalable.md +++ b/text/3838-scalable-vectors.md @@ -1,4 +1,4 @@ -- Feature Name: `repr_scalable` +- Feature Name: `rustc_scalable_vector` - Start Date: 2025-07-07 - RFC PR: [rust-lang/rfcs#3838](https://github.com/rust-lang/rfcs/pull/3838) - Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) @@ -6,16 +6,14 @@ # Summary [summary]: #summary -Extends Rust's existing SIMD infrastructure, `#[repr(simd)]`, with a -complementary scalable representation, `#[repr(scalable(N)]`, to support -scalable vector types, such as Arm's Scalable Vector Extension (SVE), or -RISC-V's Vector Extension (RVV). +Introduces a new attribute, `#[rustc_scalable_vector(N)]`, which can be used to +define new scalable vector types, such as those in Arm's Scalable Vector +Extension (SVE), or RISC-V's Vector Extension (RVV). -Like the existing `repr(simd)` representation, `repr(scalable(N))` is internal -compiler infrastructure that will be used only in the standard library to -introduce scalable vector types which can then be stablised. Only the -infrastructure to define these types are introduced in this RFC, not the types -or intrinsics that use it. +`rustc_scalable_vector(N)` is internal compiler infrastructure that will be used +only in the standard library to introduce scalable vector types which can then +be stablised. Only the infrastructure to define these types are introduced in +this RFC, not the types or intrinsics that use it. This RFC builds on Rust's existing SIMD infrastructure, introduced in [rfcs#1199: SIMD Infrastructure][rfcs#1199]. It depends on @@ -67,7 +65,7 @@ introduce intrinsics and types exposing the scalable vector support in hardware. None of the infrastructure proposed in this RFC is intended to be used directly by Rust users. -`repr(scalable)` as described later in +`rustc_scalable_vector` as described later in [*Reference-level explanation*][reference-level-explanation] is perma-unstable and exists only enables scalable vector types to be defined in the standard library. The specific vector types are intended to eventually be stabilised, but @@ -132,12 +130,12 @@ Types annotated with the `#[repr(simd)]` attribute contains either an array field or multiple fields to indicate the intended size of the SIMD vector that the type represents. -Similarly, a `scalable(N)` representation is introduced to define a scalable -vector type. `scalable(N)` accepts an integer to determine the minimum number of -elements the vector contains. For example: +Similarly, a `rustc_scalable_vector(N)` representation is introduced to define a +scalable vector type. `rustc_scalable_vector(N)` accepts an integer to determine +the minimum number of elements the vector contains. For example: ```rust -#[repr(simd, scalable(4))] +#[rustc_scalable_vector(4)] pub struct svfloat32_t { _ty: [f32], } ``` @@ -154,13 +152,14 @@ Many intrinsics using scalable vectors accept both a predicate vector argument and data vector arguments. Predicate vectors determine whether a lane is on or off for the operation performed by any given intrinsic. Predicate vectors may use different registers of sizes to the vectors containing data. -`repr(scalable)` is used to define vectors containing both data and predicates. +`rustc_scalable_vector` is used to define vectors containing both data and +predicates. -As `repr(scalable(N))` is intended to be a permanently unstable attribute, any -value of `N` is accepted by the attribute and it is the responsibility of -whomever is defining the type to provide a valid value. A correct value for `N` -depends on the purpose of the specific scalable vector type and the -architecture. +As `rustc_scalable_vector(N)` is intended to be a permanently unstable +attribute, any value of `N` is accepted by the attribute and it is the +responsibility of whomever is defining the type to provide a valid value. A +correct value for `N` depends on the purpose of the specific scalable vector +type and the architecture. For example, with SVE, the scalable vector register length is a minimum of 128 bits, must be a multiple of 128 bits and a power of 2; and predicate registers @@ -235,14 +234,14 @@ For example, when instantiating `std::mem::size_of_val` with a scalable vector during monomorphisation, the relevant target feature will be added to `size_of_val` for codegen. -## Implementing `repr(scalable)` +## Implementing `rustc_scalable_vector` [implementing-repr-scalable]: #implementing-reprscalable -Implementing `repr(scalable)` largely involves lowering scalable vectors to the -appropriate type in the codegen backend. LLVM has robust support for scalable -vectors and is the default backend, so this section will focus on implementation -in the LLVM codegen backend. Other codegen backends can implement support when -scalable vectors are supported by the backend. +Implementing `rustc_scalable_vector` largely involves lowering scalable vectors +to the appropriate type in the codegen backend. LLVM has robust support for +scalable vectors and is the default backend, so this section will focus on +implementation in the LLVM codegen backend. Other codegen backends can implement +support when scalable vectors are supported by the backend. Most of the complexity of SVE is handled by LLVM: lowering Rust's scalable vectors to the correct type in LLVM and the `vscale` modifier that is applied to @@ -257,8 +256,8 @@ For example, a `` is a scalable vector with a minimum of four result in register sizes of 128, 256, 512, 1024 or 2048 and 4, 8, 16, 32, or 64 `f32` elements respectively. -The `N` in the `#[repr(scalable(N))]` determines the `element_count` used in the -LLVM type for a scalable vector. +The `N` in the `#[rustc_scalable_vector(N)]` determines the `element_count` used +in the LLVM type for a scalable vector. While it is possible to change the vector length at runtime using a [`prctl()`][prctl] call to the kernel, this would require that `vscale` change, @@ -270,8 +269,8 @@ behaviour, consistent with C and C++. # Drawbacks [drawbacks]: #drawbacks -- `repr(scalable(N))` is inherently additional complexity to the language, - despite being largely hidden from users. +- `rustc_scalable_vector(N)` is inherently additional complexity to the + language, despite being largely hidden from users. # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives @@ -290,9 +289,9 @@ for scalable vector intrinsics in C should still be applicable to Rust. ## Manually-chosen or compiler-calculated element count [manual-or-calculated-element-count]: #manually-chosen-or-compiler-calculated-element-count -`repr(scalable(N))` expects `N` to be provided rather than calculating it. This -avoids needing to teach the compiler how to calculate the required `element` -count, which isn't always trivial. +`rustc_scalable_vector(N)` expects `N` to be provided rather than calculating +it. This avoids needing to teach the compiler how to calculate the required +`element` count, which isn't always trivial. Many of the intrinsics which accept scalable vectors as an argument also accept a predicate vector. Predicate vectors decide which lanes are on or off for an @@ -316,9 +315,9 @@ would add extra complexity. This RFC takes the position that the additional complexity required to have the compiler always be able to calculate `N` isn't justified given the permanently -unstable nature of the `repr(scalable(N))` attribute and the scalable vector -types defined in `std::arch` are likely to be few in number, automatically -generated and well-tested. +unstable nature of the `rustc_scalable_vector(N)` attribute and the scalable +vector types defined in `std::arch` are likely to be few in number, +automatically generated and well-tested. # Prior art [prior-art]: #prior-art From 2a7705a531d7fe4292b6cecfbadafd849750d323 Mon Sep 17 00:00:00 2001 From: David Wood Date: Mon, 28 Jul 2025 06:06:15 +0000 Subject: [PATCH 07/25] add asserts into sve example --- text/3838-scalable-vectors.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index bd731b3938d..f56710ec27b 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -88,6 +88,8 @@ following example that sums two input vectors: ```rust fn sve_add(in_a: Vec, in_b: Vec, out_c: &mut Vec) { + assert_eq!(in_a.len(), in_b.len()); + assert_eq!(in_a.len(), out_c.len()); let len = in_a.len(); unsafe { // `svcntw` returns the actual number of elements that are in a 32-bit From 8709d8ca40eebf49a157f4e4d1bd1419bf1dc8e2 Mon Sep 17 00:00:00 2001 From: David Wood Date: Mon, 28 Jul 2025 06:07:31 +0000 Subject: [PATCH 08/25] add missing words --- text/3838-scalable-vectors.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index f56710ec27b..5554a87363a 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -67,7 +67,7 @@ by Rust users. `rustc_scalable_vector` as described later in [*Reference-level explanation*][reference-level-explanation] is perma-unstable -and exists only enables scalable vector types to be defined in the standard +and exists only to enable scalable vector types to be defined in the standard library. The specific vector types are intended to eventually be stabilised, but none are proposed in this RFC. @@ -355,8 +355,8 @@ A more general mechanism for enforcing that SIMD types are only used in types to have fewer distinct restrictions than other SIMD types, and would enable SIMD vectors to be passed by-register, a performance improvement. -Such a mechanism would need be introduced gradually to existing SIMD types with -a forward compatibility lint. This will be addressed in a forthcoming RFC. +Such a mechanism would need to be introduced gradually to existing SIMD types +with a forward compatibility lint. This will be addressed in a forthcoming RFC. ## Relaxed restrictions [relaxed-restrictions]: #relaxed-restrictions From b96e2b7d7e0ca22f41676acaf1332f4d70574031 Mon Sep 17 00:00:00 2001 From: David Wood Date: Mon, 28 Jul 2025 06:18:50 +0000 Subject: [PATCH 09/25] clarify details of code example --- text/3838-scalable-vectors.md | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index 5554a87363a..0ef39cf03dc 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -101,12 +101,16 @@ fn sve_add(in_a: Vec, in_b: Vec, out_c: &mut Vec) { let c = out_c as *mut f32; let c = c.add(i); - // `svwhilelt_b32` generates a mask based on comparing the current - // index against the `len` + // `svwhilelt_b32` generates a predicate vector that deals with + // the tail of the iteration - it enables the operations which + // follow for the first `len` elements overall, but disables + // the last `len % step` elements in the last iteration let pred = svwhilelt_b32(i as _, len as _); // `svld1_f32` loads a vector register with the data from address // `a`, zeroing any elements in the vector that are masked out + // + // Does not access memory for inactive elements let sva = svld1_f32(pred, a); let svb = svld1_f32(pred, b); From 83d0b93580b97eb7879d411e45a3c490db654d7c Mon Sep 17 00:00:00 2001 From: David Wood Date: Mon, 28 Jul 2025 06:19:01 +0000 Subject: [PATCH 10/25] clarify details of type marker --- text/3838-scalable-vectors.md | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index 0ef39cf03dc..f0749d9bac5 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -142,7 +142,7 @@ the minimum number of elements the vector contains. For example: ```rust #[rustc_scalable_vector(4)] -pub struct svfloat32_t { _ty: [f32], } +pub struct svfloat32_t { _ty: &[f32], } ``` As with the existing `repr(simd)`, `_ty` is purely a type marker, used to get @@ -151,6 +151,17 @@ the element type for the codegen backend. `svfloat32_t` is a scalable vector with a minimum of four `f32` elements and potentially more depending on the length of the vector register at runtime. +`_ty` is purely a type marker, used to get the element type for the codegen +backend. It must be an array slice containing one of the following types: + +- `u8`, `u16`, `u32` or `u64` +- `i8`, `i16`, `i32` or `i64` +- `f16`, `f32` or `f64` +- `bool` + +It is not permitted to project into scalable vector types and access the type +marker field. + ## Choosing `N` [choosing-n]: #choosing-n From dcb29b4175bd13db4c07c8e7f0ea8f3a354d8e40 Mon Sep 17 00:00:00 2001 From: David Wood Date: Mon, 28 Jul 2025 06:33:51 +0000 Subject: [PATCH 11/25] avoid introducing scalable vectors in guide-level --- text/3838-scalable-vectors.md | 21 ++++++++------------- 1 file changed, 8 insertions(+), 13 deletions(-) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index f0749d9bac5..6bdf909083d 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -74,19 +74,17 @@ none are proposed in this RFC. ## Using scalable vectors [using-scalable-vectors]: #using-scalable-vectors -Scalable vector types correspond to vector registers in hardware with unknown -size at compile time. However, it will be a known and fixed size at runtime. -Additional properties could be known during compilation, depending on the -architecture, such as a minimum or maximum size or that the size must be a -multiple of some factor. - -As previously described, users will not define their own scalable vector types -and instead use intrinsics from `std::arch`, and this RFC is not proposing any -such intrinsics, just the infrastructure. However, to illustrate how the types -and intrinsics that this infrastructure will enable can be used, consider the +From a user's perspective, writing code for scalable vectors isn't too different +from when writing code with a fixed sized vector. To illustrate how the types +and intrinsics that this infrastructure will enable could be used, consider the following example that sums two input vectors: ```rust +use std::arch::aarch64::{ + // These intrinsics and types are not proposed by this RFC + svcntw, svwhilelt_b32, svld1_f32, svadd_f32_m, svst1_f32 +}; + fn sve_add(in_a: Vec, in_b: Vec, out_c: &mut Vec) { assert_eq!(in_a.len(), in_b.len()); assert_eq!(in_a.len(), out_c.len()); @@ -126,9 +124,6 @@ fn sve_add(in_a: Vec, in_b: Vec, out_c: &mut Vec) { } ``` -From a user's perspective, writing code for scalable vectors isn't too different -from when writing code with a fixed sized vector. - # Reference-level explanation [reference-level-explanation]: #reference-level-explanation From 4123a3a2b2da690fe9732556923ac12e95969b16 Mon Sep 17 00:00:00 2001 From: David Wood Date: Mon, 28 Jul 2025 06:55:45 +0000 Subject: [PATCH 12/25] s/stablised/stabilised --- text/3838-scalable-vectors.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index 6bdf909083d..8faebe235f8 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -12,7 +12,7 @@ Extension (SVE), or RISC-V's Vector Extension (RVV). `rustc_scalable_vector(N)` is internal compiler infrastructure that will be used only in the standard library to introduce scalable vector types which can then -be stablised. Only the infrastructure to define these types are introduced in +be stabilised. Only the infrastructure to define these types are introduced in this RFC, not the types or intrinsics that use it. This RFC builds on Rust's existing SIMD infrastructure, introduced in From e5b546c6aef456d4e9740f458cccc275d7beb8c6 Mon Sep 17 00:00:00 2001 From: David Wood Date: Mon, 4 Aug 2025 16:30:42 +0000 Subject: [PATCH 13/25] remove trailing whitespaces --- text/3838-scalable-vectors.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index 8faebe235f8..a4f5d13a830 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -198,7 +198,7 @@ behave like value types but the exact size cannot be known at compilation time. Scalable vector types have some further restrictions due to limitations of the codegen backend: -- Cannot be stored in compound types (structs, enums, etc) +- Cannot be stored in compound types (structs, enums, etc) - Including coroutines, so these types cannot be held across an await boundary in async functions @@ -259,11 +259,11 @@ Most of the complexity of SVE is handled by LLVM: lowering Rust's scalable vectors to the correct type in LLVM and the `vscale` modifier that is applied to LLVM's vector types. -LLVM's scalable vector type is of the form ``. +LLVM's scalable vector type is of the form ``. `vscale` is the scaling factor determined by the hardware at runtime, it can be any value providing it gives a legal vector register size for the architecture. -For example, a `` is a scalable vector with a minimum of four +For example, a `` is a scalable vector with a minimum of four `f32` elements and with SVE, `vscale` could then be any power of two which would result in register sizes of 128, 256, 512, 1024 or 2048 and 4, 8, 16, 32, or 64 `f32` elements respectively. From 388170ada8c455c6b3d588603fedb0648b54ad40 Mon Sep 17 00:00:00 2001 From: David Wood Date: Mon, 4 Aug 2025 16:31:10 +0000 Subject: [PATCH 14/25] improve explanation and justification for manual N --- text/3838-scalable-vectors.md | 463 ++++++++++++++++++++++++++++------ 1 file changed, 391 insertions(+), 72 deletions(-) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index a4f5d13a830..1f7592cc36b 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -127,27 +127,84 @@ fn sve_add(in_a: Vec, in_b: Vec, out_c: &mut Vec) { # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -Types annotated with the `#[repr(simd)]` attribute contains either an array -field or multiple fields to indicate the intended size of the SIMD vector that -the type represents. +Scalable vectors are similar to fixed length vectors already supported by Rust, +enabling operations to be performed on multiple values at once in a single +instruction. Unlike fixed length vectors, the length of scalable vectors is not +fixed, and intrinsics which operate on scalable vectors are length-agnostic. + +Scalable vectors types supported by the `rustc_scalable_vector(N)` attribute can +be thought of as having the form `vscale × N × ty`, where `vscale` is a single, +global, fixed-for-the-runtime-of-the-program "scaling factor" of the CPU. + +Vector registers are of the length `vscale × vunit`, with `vunit` being an +architecture-specific value: + +- ARM SVE: `vunit` is the minimum length of the vector register in the + architecture - [128 bits][sve_minlength]. +- RISC-V V: `vunit` is the least common multiple of the supported element + widths - [64 bits][rvv_bitsperblock]. + +> [!TIP] +> +> While the `vscale` terminology is borrowed from LLVM, `vunit` is invented for +> the purposes of aiding this explanation. + +`N` in `rustc_scalable_vector(N)` defines the value of `N` in a scalable vector +type. Any value of `N` is accepted by the attribute and it is the responsibility +of whomever is defining the type to provide a valid value. A correct value for +`N` depends on the purpose of the specific scalable vector type and the +architecture. See +[*Manually-chosen or compiler-calculated element count*][manual-or-calculated-element-count] +for rationale. + +In the simplest case, a scalable vector register could be depicted as follows: + +```text + ◁────── vscale x vunit ──────▷ ◁─── vunit ───▷ + ◁────── vscale x ty x N ─────▷ ◁─── ty x N ──▷ +┌──────────────────────────────┬───────────────┐ +│ ... │ ty │ ty │ ... │ ← a vector register +└──────────────────────────────┴───────────────┘ +``` -Similarly, a `rustc_scalable_vector(N)` representation is introduced to define a -scalable vector type. `rustc_scalable_vector(N)` accepts an integer to determine -the minimum number of elements the vector contains. For example: +Scalable vector types contain a single field which is used to determine `ty`: ```rust #[rustc_scalable_vector(4)] -pub struct svfloat32_t { _ty: &[f32], } +pub struct svfloat32_t(f32); ``` -As with the existing `repr(simd)`, `_ty` is purely a type marker, used to get -the element type for the codegen backend. +In the example above, `svfloat32_t` is a scalable vector with a minimum of four +`f32` elements when `vscale = 1` and more when `vscale > 1`. `svfloat32_t` could +be depicted as.. + +```text + ◁───────────── vscale x f32 x 4 ─────────────▷ ◁────── f32 x 4 ──────▷ +┌──────────────────────────────────────────────┬───────────────────────┐ +│ ... │ f32 │ f32 │ f32 │ f32 │ ← `svfloat32_t` +└──────────────────────────────────────────────┴───────────────────────┘ +``` -`svfloat32_t` is a scalable vector with a minimum of four `f32` elements and -potentially more depending on the length of the vector register at runtime. +..and when running on hardware with `vscale=2`.. + +```text + ◁──── 2 x f32 x 4 ────▷ ◁────── f32 x 4 ──────▷ +┌───────────────────────┬───────────────────────┐ +│ f32 │ f32 │ f32 │ f32 │ f32 │ f32 │ f32 │ f32 │ ← `svfloat32_t` +└───────────────────────┴───────────────────────┘ +``` + +..or `vscale=3`: + +```text + ◁──────────────── 3 x f32 x 4 ────────────────▷ ◁────── f32 x 4 ──────▷ +┌───────────────────────┬───────────────────────┬───────────────────────┐ +│ f32 │ f32 │ f32 │ f32 │ f32 │ f32 │ f32 │ f32 │ f32 │ f32 │ f32 │ f32 │ ← `svfloat32_t` +└───────────────────────┴───────────────────────┴───────────────────────┘ +``` -`_ty` is purely a type marker, used to get the element type for the codegen -backend. It must be an array slice containing one of the following types: +The type marker field is solely used to get the element type for the codegen +backend. It must one of the following types: - `u8`, `u16`, `u32` or `u64` - `i8`, `i16`, `i32` or `i64` @@ -157,35 +214,6 @@ backend. It must be an array slice containing one of the following types: It is not permitted to project into scalable vector types and access the type marker field. -## Choosing `N` -[choosing-n]: #choosing-n - -Many intrinsics using scalable vectors accept both a predicate vector argument -and data vector arguments. Predicate vectors determine whether a lane is on or -off for the operation performed by any given intrinsic. Predicate vectors may -use different registers of sizes to the vectors containing data. -`rustc_scalable_vector` is used to define vectors containing both data and -predicates. - -As `rustc_scalable_vector(N)` is intended to be a permanently unstable -attribute, any value of `N` is accepted by the attribute and it is the -responsibility of whomever is defining the type to provide a valid value. A -correct value for `N` depends on the purpose of the specific scalable vector -type and the architecture. - -For example, with SVE, the scalable vector register length is a minimum of 128 -bits, must be a multiple of 128 bits and a power of 2; and predicate registers -have one bit for each byte in the vector registers. So, for `svfloat32_t` -defined shown above, an `f32` is 32-bits and with `N=4`, the entire minimum -register length of 128 bits is used (4 x 32 = 128). An intrinsic that takes a -`svfloat32_t` may also want to accept as an argument a predicate vector with a -matching four elements (`N=4`), which would only use 4 bits of the predicate -register rather than the full 16 bits. - -See -[*Manually-chosen or compiler-calculated element count*][manual-or-calculated-element-count] -for a discussion on why `N` is not calculated by the compiler. - ## Properties of scalable vectors [properties-of-scalable-vector-types]: #properties-of-scalable-vectors @@ -247,7 +275,7 @@ during monomorphisation, the relevant target feature will be added to `size_of_v for codegen. ## Implementing `rustc_scalable_vector` -[implementing-repr-scalable]: #implementing-reprscalable +[implementing-rustc_scalable_vector]: #implementing-rustc_scalable_vector Implementing `rustc_scalable_vector` largely involves lowering scalable vectors to the appropriate type in the codegen backend. LLVM has robust support for @@ -278,6 +306,11 @@ child processes, not to change the vector length of the current process. As Rust cannot prevent users from doing this, it will be documented as undefined behaviour, consistent with C and C++. +Tuples in RISC-V's V Extension lower to target-specific types in LLVM rather +than generic scalable vector types, so `rustc_scalable_vector` will not +initially support RVV tuples (see +[*RISC-V Vector Extension's tuple types*][rvv-tuples]). + # Drawbacks [drawbacks]: #drawbacks @@ -294,6 +327,10 @@ recommended way to do SIMD, lack of support in Rust would severely limit Rust's suitability on these architectures compared to other systems programming languages. +`rustc_scalable_vector` is preferred over a `repr(scalable)` attribute as there +is existing dissatisfaction with fixed-length vectors being defined using the +`repr(simd)` attribute ([rust#63633]). + By aligning with the approach taken by C (discussed in the [*Prior art*][prior-art] below), most of the documentation that already exists for scalable vector intrinsics in C should still be applicable to Rust. @@ -302,34 +339,298 @@ for scalable vector intrinsics in C should still be applicable to Rust. [manual-or-calculated-element-count]: #manually-chosen-or-compiler-calculated-element-count `rustc_scalable_vector(N)` expects `N` to be provided rather than calculating -it. This avoids needing to teach the compiler how to calculate the required -`element` count, which isn't always trivial. - -Many of the intrinsics which accept scalable vectors as an argument also accept -a predicate vector. Predicate vectors decide which lanes are on or off for an -operation (i.e. which elements in the vector are operated on). Predicate vectors -can be in different and smaller registers than the data. For example, -`` could be the predicate vector for a `` -vector - -For non-predicate scalable vectors, it will be typical that `N` will be -`$minimum_register_length / $type_size` (e.g. `4` for `f32` or `8` for `f16` -with a minimum 128-bit register length). In this circumstance, `N` could be -trivially calculated by the compiler. - -For predicate vectors, it is desirable to be able to to define types where `N` -matches the number of elements in the non-predicate vector, i.e. a -`` to match a ``, `` to -match ``, or `` to match -``. In this circumstance, it might still be possible to give -rustc all of the relevant information such that it could compute `N`, but it -would add extra complexity. - -This RFC takes the position that the additional complexity required to have the -compiler always be able to calculate `N` isn't justified given the permanently -unstable nature of the `rustc_scalable_vector(N)` attribute and the scalable -vector types defined in `std::arch` are likely to be few in number, -automatically generated and well-tested. +it. Calculating `N` would make this attribute more robust and decrease the +likelihood of it being used incorrectly - even for permanently unstable internal +attributes like `rustc_scalable_vector`, this would be worthwhile if feasible. + +In the simplest case, calculating `N` is a simple division: `vunit` (as +previously above) divided by the `element_size`. For example, with ARM SVE, +`vunit=128` so with an `f32` element, `N = 128/32 = 4`; and with RISC-V RVV, +`vunit=64` so with an `f32` element, `N = 64/32 = 2` (assuming `LMUL=1`, but +more on that later). + +There are more complicated scalable vector definitions than those presented in +the [*Reference-level explanation*][reference-level-explanation], which +`rustc_scalable_vector` can support, but that would require more complicated +calculations for `N` or architecture-specific knowledge: + +1. With Arm SVE, each intrinsic that takes a predicate takes a `svbool_t` + (`vscale x i1 x 16`). `svbool_t` could have its `N` calculated as above with + simple division. + + `svbool_t` is used even when the data arguments have fewer elements, e.g. an + `svfloat32_t` (`vscale x f32 x 4`). `svbool_t` has predicates for sixteen + lanes but there are only four lanes in the `svfloat32_t` arguments to enable + or disable. This is slightly unintuitive but matches the definitions of the + intrinsics in the Arm ACLE. + + Within the definition of those intrinsics, the `svbool_t` is cast to a + private `svboolN_t` type which has a number of lanes matching the data + argument (e.g. a `svbool4_t`/`vscale x i1 x 4` for + `svfloat32_t`/`vscale x f32 x 4`). + + ```text + ├──────── vscale x i32 x 4 ───────┤ ├──────────── i32 x 4 ────────────┤ + ┌───────────────────────────────────┬───────────────────────────────────┐ + │ ... │ 0x0000 │ 0x0000 │ 0x0000 │ 0x0000 │ + └───────────────────────────────────┴───────────────────────────────────┘ + △ △ △ △ △ + │ ┌───────────┘ │ │ │ + │ │ ┌──────────────┘ │ │ + │ │ │ ┌─────────────────┘ │ + ┌─────┘ │ │ │ ┌─────────────────────┘ + ┌───────────────────────┬───────────────────────┐ + │ ... │ 0x0 │ 0x0 │ 0x0 │ 0x0 │ + unused space for 12x `i1`s + └───────────────────────┴───────────────────────┘ + ├── vscale x i1 x 4 ──┤ ├─────── i1 x 4 ──────┤ + ``` + + Defining a `svboolN_t` is more complicated than trivial division, requiring + the attribute accept either arbitrary specification of `N` or a type to + calculate `N` with, for example: + + ```rust + // alternative: user-provided arbitrary `N` + #[rustc_scalable_vector(4)] + struct svbool4_t(bool); + + // alternative: add `predicate_of` to attribute + #[rustc_scalable_vector(predicate_of = "u32")] + struct svbool4_t(bool); + + // alternative: use another field to separate element type and size to use for `N` + #[rustc_scalable_vector] + struct svbool4_t(bool, u32); + ``` + +2. Similarly, with Arm SVE, the sign extending intrinsics will internally use + LLVM intrinsics which return vectors with fewer elements than + `vunit / element_size` (similar to `svboolN_t` but for other types). + + For example, the `svldnt1sb_gather_s64offset_s64` intrinsic wraps the + `llvm.aarch64.sve.ldnt1.gather.nxv2i8` intrinsic in LLVM. It returns + `nxv2i8`, which is a `vscale x i8 x 2` that is then cast to `svint64_t`. + + Like in the previous case, `vscale x i8 x 2` cannot be defined without the + attribute accepting arbitrary specification of `N` or a type to calculate `N` + with. + +3. Also with Arm SVE, some load and store intrinsics take tuples of vectors, + such as `svfloat32x2_t`: + + ```text + ◁───────────── vscale x f32 x 4 ─────────────▷ ◁────── f32 x 4 ──────▷ + ┌──────────────────────────────────────────────┬───────────────────────┐ ┐ + │ ... │ f32 │ f32 │ f32 │ f32 │ │ + └──────────────────────────────────────────────┴───────────────────────┘ ├─ svfloat32x2_t + ┌──────────────────────────────────────────────┬───────────────────────┐ │ vscale x f32 x 8 + │ ... │ f32 │ f32 │ f32 │ f32 │ │ + └──────────────────────────────────────────────┴───────────────────────┘ ┘ + ``` + + These types are the opposite of the previous complicating case, containing + more elements than `vunit / element_size`. These use two or more registers to + represent the vector. + + `vscale x f32 x 8` cannot be defined without the attribute accepting + arbitrary specification of `N` or an argument to the attribute to specify the + number of registers used: + + ```rust + // alternative: user-provided arbitrary `N` + #[rustc_scalable_vector(8)] + struct svfloat32x2_t(f32); + + // alternative: add `tuple_of` to attribute + #[rustc_scalable_vector(tuple_of = "2")] // either `1` (default), `2`, `3` or `4` + struct svfloat32x2_t(f32); + ``` + +4. RISC-V RVV's scalable vectors are quite different from Arm's SVE, while + sharing the same underlying infrastructure in LLVM. + + SVE's scalable vector types map directly onto LLVM scalable vector types, and + all of the dynamic parts of the vectors are abstracted by `vscale`: + + ```text + ├───────── vscale x 128 ────────┤ ├── 128 ──┤ + ┌─────────────────────────────────┬───────────┐ + │ ... │ │ + └─────────────────────────────────┴───────────┘ + ``` + + RVV's scalable vector types have an extra dimensions of flexibility, the + "register grouping factor" or `LMUL`, and `SEW`: + + - `SEW` is the "selected element width", and corresponds to the size of the + element type of the vector (`element_size`). + + RVV uses the least common multiple of the supported element types as + `vunit` so that the overall vector length is `VLEN = vscale * vunit`, which + is a constant, rather than `VLEN = vscale * element_size`, which is not a + constant. + + - `LMUL` configures how many vector registers are grouped together to form a + larger logical vector register. `LMUL` can be 1/8, 1/4, 1/2, 1, 2, 4, or 8. + Not all `LMUL` values are valid for each type. + + `LMUL` and `SEW` are part of the processor state and are changed by + compiler-inserted `vsetli` instructions depending on the vector types being + used. + + `LMUL` is distinct from tuple types, which are a separate variable named + `NFIELD` (which is not part of the processor state, as with SVE tuples). + `NFIELD` can be 1, 2, 3, 4, 5, 6, 7 or 8. Not all `NFIELD` values are valid + for each type. + + Scalable vector types which vary in both `LMUL` and `NFIELD` could be exposed + to the user (see [RVV Type System Documentation][rvv_typesystem]). For + example, consider the following types: + + - `vint8mf2_t` has `NFIELD=1`, `LMUL=1/2` and `ty=i8` + - `vuint32m4_t` has `NFIELD=1`, `LMUL=4` and `ty=i32` + - `vint16mf4x6_t` has `NFIELD=6`, `LMUL=1/4` and `ty=i16` + - `vint64m2x3_t` has `NFIELD=3`, `LMUL=2` and `ty=i64` + + This can include types which have different representation but have the same + `N`: + + - `vint32m1x2_t` has `NFIELD=2`, `LMUL=1` and `ty=i32` (`N=4` elements) + - `vint32m2_t` has `NFIELD=1`, `LMUL=2` and `ty=i32` (`N=4` elements) + + When `NFIELD=1`, `LMUL=4` and `ty=i64`, four registers are grouped together + to form a logical vector register, and this has the type + ``: + + ```text + ├────────────────────── VLEN ────────────────────┤ + ├── vscale x 64 bits ──┤ ├────── 64 bits ──────┤ + ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ + ┌──────────────────────╋─┬───────────────────────┐ ┃ ┬ + │ ... ┃ │ i64 │ ┃ │ + └──────────────────────╋─┴───────────────────────┘ ┃ │ + ┌──────────────────────╋─┬───────────────────────┐ ┃ │ + │ ... ┃ │ i64 │ ┃ │ + └──────────────────────╋─┴───────────────────────┘ ┃ │ LMUL=4 (vint64m4_t) + ┌──────────────────────╋─┬───────────────────────┐ ┃ │ vscale x 4 x i64 + │ ... ┃ │ i64 │ ┃ │ + └──────────────────────╋─┴───────────────────────┘ ┃ │ + ┌──────────────────────╋─┬───────────────────────┐ ┃ │ + │ ... ┃ │ i64 │ ┃ │ + └──────────────────────╋─┴───────────────────────┘ ┃ ┴ + ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ + ``` + + Similarly, when `NFIELD=1`, `LMUL=4` and `ty=i32`, the smaller element type + results in each register containing more elements to add up to `vunit` and + this is repeated across all four registers: + + ```text + ├────────────────────── VLEN ────────────────────┤ + ├── vscale x 64 bits ──┤ ├────── 64 bits ──────┤ + ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ + ┌──────────────────────╋─┬───────────┬───────────┐ ┃ ┬ + │ ... ┃ │ i32 │ i32 │ ┃ │ + └──────────────────────╋─┴───────────┴───────────┘ ┃ │ + ┌──────────────────────╋─┬───────────┬───────────┐ ┃ │ + │ ... ┃ │ i32 │ i32 │ ┃ │ + └──────────────────────╋─┴───────────┴───────────┘ ┃ │ LMUL=4 (vint32m4_t) + ┌──────────────────────╋─┬───────────┬───────────┐ ┃ │ vscale x 8 x i32 + │ ... ┃ │ i32 │ i32 │ ┃ │ + └──────────────────────╋─┴───────────┴───────────┘ ┃ │ + ┌──────────────────────╋─┬───────────┬───────────┐ ┃ │ + │ ... ┃ │ i32 │ i32 │ ┃ │ + └──────────────────────╋─┴───────────┴───────────┘ ┃ ┴ + ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ + ``` + + It is possible for different scalable vector types with RVV to have the same + value of `N`, consider `NFIELD=1`, `LMUL=2` and `ty=i32`.. + + ```text + ├────────────────────── VLEN ────────────────────┤ + ├── vscale x 64 bits ──┤ ├────── 64 bits ──────┤ + ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ + ┌──────────────────────╋─┬───────────┬───────────┐ ┃ ┬ + │ ... ┃ │ i32 │ i32 │ ┃ │ + └──────────────────────╋─┴───────────┴───────────┘ ┃ │ LMUL=2 (vint32m2_t) + ┌──────────────────────╋─┬───────────┬───────────┐ ┃ │ vscale x 4 x i32 + │ ... ┃ │ i32 │ i32 │ ┃ │ + └──────────────────────╋─┴───────────┴───────────┘ ┃ ┴ + ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ + ``` + + ..and `NFIELD=2`, `LMUL=1` and `ty=i32`: + + ```text + ├────────────────────── VLEN ────────────────────┤ + ├── vscale x 64 bits ──┤ ├────── 64 bits ──────┤ + ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ ┐ + ┌──────────────────────╋─┬───────────┬───────────┐ ┃ ┬ │ + │ ... ┃ │ i32 │ i32 │ ┃ │ LMUL=1 │ + └──────────────────────╋─┴───────────┴───────────┘ ┃ ┴ │ + ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ ├─ vint32m1x2_t + ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ │ vscale x 4 x i32 + ┌──────────────────────╋─┬───────────┬───────────┐ ┃ ┬ │ + │ ... ┃ │ i32 │ i32 │ ┃ │ LMUL=1 │ + └──────────────────────╋─┴───────────┴───────────┘ ┃ ┴ │ + ┗━━━━━━━━━━━━━━━━━━━━━━━━━━━┛ ┘ + ``` + + Despite `vint32m2_t` and `vint32m1x2_t` having the same value of `N`, and + hence same LLVM type, `vscale x 4 x i32`, the value of `LMUL` in the + processor state will be different when each are used. In practice, RVV tuples + lower to target-specific types in LLVM rather than generic scalable vector + types, so `rustc_scalable_vector` will not initially support RVV tuples (see + [*RISC-V Vector Extension's tuple types*][rvv-tuples]). + + RVV's scalable vectors cannot be defined without the attribute accepting + arbitrary specification of `N` or an argument to the attribute to specify the + `lmul` used: + + ```rust + // alternative: user-provided arbitrary `N` + #[rustc_scalable_vector(4)] + struct vint32m2_t(i32); + + #[rustc_scalable_vector(1)] + struct vint16mf4_t(i16); + + // alternative: user-provided `LMUL` + #[rustc_scalable_vector(lmul = "2")] + struct vint32m2_t(i32); + + #[rustc_scalable_vector(lmul = "1/4")] + struct vint16mf4_t(i16); + ``` + +It is technically possible to calculate `N`, requiring lots of additional +machinery in the `rustc_scalable_vector` attribute, much of which would be +mutually exclusive or produce invalid types with many values of the parameters. + +There will be a fixed number of scalable vector types that will be defined in +the standard library alongside their intrinsics and well-tested. It is very +likely that their implementations will be automatically generated. For example, +while not being proposed in this RFC, it is expected Arm SVE will define 55 +scalable vector types, exhaustively covering the possible vector types enabled +by the architecture extension (it is assumed that the same will be true for +RISC-V RVV): + +- `svbool_t` +- `sv{int8,uint8}{,x2,x3,x4}_t` +- `sv{int16,uint16}{,x2,x3,x4}_t` +- `sv{float32,int32,uint32}{,x2,x3,x4}_t` +- `sv{float64,int64,uint64}{,x2,x3,x4}_t` +- `svbool{2,4,8}_t` for internal use +- `nxv{2,4,8}{i8,u8}` for internal use +- `nxv{2,4}{i16,u16}` for internal use +- `nxv2{i32,u32}` for internal use + +Given the complexity required in `rustc_scalable_vector` to be able to calculate +`N`, that it would still be possible to use the attribute incorrectly, that the +attribute is permanently unstable, and the low risk of misuse given the intended +use, this proposal argues that allowing arbitrary specification of `N` is +reasonable. # Prior art [prior-art]: #prior-art @@ -388,10 +689,28 @@ experimenting with architecture-specific support and implementation initially. Later, there are a variety of approaches that could be taken to incorporate support for scalable vectors into Portable SIMD. +## RISC-V Vector Extension's tuple types +[rvv-tuples]: #risc-v-vector-extensions-tuple-types + +As explained in +[*Manually-chosen or compiler-calculated element count*][manual-or-calculated-element-count], +there is a distinction in RVV between vectors which would have the same `N` in a +scalable vector type, but which vary in `LMUL` and `NFIELD`. + +For example, `vint32m2_t` and `vint32m1x2_t`, if lowered to scalable vector +types in LLVM, would both be ``. + +RVV's tuple types need to be lowered to target-specific types in the backend +which is out-of-scope of this general infrastructure for scalable vectors. + [acle_sizeless]: https://arm-software.github.io/acle/main/acle.html#formal-definition-of-sizeless-types [dotnet]: https://github.com/dotnet/runtime/issues/93095 [prctl]: https://www.kernel.org/doc/Documentation/arm64/sve.txt +[quote_amanieu]: https://github.com/rust-lang/rust/pull/118917#issuecomment-2202256754 [rfcs#1199]: https://rust-lang.github.io/rfcs/1199-simd-infrastructure.html [rfcs#3268]: https://github.com/rust-lang/rfcs/pull/3268 [rfcs#3729]: https://github.com/rust-lang/rfcs/pull/3729 -[quote_amanieu]: https://github.com/rust-lang/rust/pull/118917#issuecomment-2202256754 +[rust#63633]: https://github.com/rust-lang/rust/issues/63633 +[rvv_bitsperblock]: https://github.com/llvm/llvm-project/blob/837b2d464ff16fe0d892dcf2827747c97dd5465e/llvm/include/llvm/TargetParser/RISCVTargetParser.h#L51 +[rvv_typesystem]: https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/main/doc/rvv-intrinsic-spec.adoc#type-system +[sve_minlength]: https://developer.arm.com/documentation/102476/0101/Introducing-SVE#:~:text=a%20minimum%20of%20128%20bits From 16bb4ebaad14dcc681dd434552c67cad66e852a8 Mon Sep 17 00:00:00 2001 From: David Wood Date: Tue, 5 Aug 2025 05:34:40 +0000 Subject: [PATCH 15/25] prohibit generic instantiation/trait impls --- text/3838-scalable-vectors.md | 103 +++++++++++++++++++++------------- 1 file changed, 65 insertions(+), 38 deletions(-) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index 1f7592cc36b..79fd783877f 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -226,6 +226,9 @@ behave like value types but the exact size cannot be known at compilation time. Scalable vector types have some further restrictions due to limitations of the codegen backend: +- Can only be in the signature of a function if it is annotated with the + appropriate target feature (see [*ABI*][abi]) + - Cannot be stored in compound types (structs, enums, etc) - Including coroutines, so these types cannot be held across an await @@ -235,12 +238,20 @@ codegen backend: - Cannot be used in arrays -- Cannot be the type of a static variable. +- Cannot be the type of a static variable + +- Cannot be instantiated into generic functions (see + [*Target features*][target-features]) + +- Cannot have trait implementations (see [*Target features*][target-features]) + + - Including blanket implementations (i.e. `impl Foo for T` is not a valid + candidate for a scalable vector) Some of these limitations may be able to be lifted in future depending on what -is supported by rustc's codegen backends. +is supported by rustc's codegen backends or with evolution of the language. -## ABI +### ABI [abi]: #abi Rust currently always passes SIMD vectors on the stack to avoid ABI mismatches @@ -256,23 +267,24 @@ Therefore, there is an additional restriction that these types cannot be used in the argument or return types of functions unless those functions are annotated with the relevant target feature. -## Target features +### Target features [target-features]: #target-features -Similarly to the issues with the ABI of scalable vectors, without the relevant -target features, few operations can actually be performed on scalable vectors - -causing issues for the use of scalable vectors in generic code and with traits. +Similarly to the challenges with the ABI of scalable vectors, without the +relevant target features, few operations can actually be performed on scalable +vectors - causing issues for the use of scalable vectors in generic code and +with traits implementations. + For example, implementations of traits like `Clone` would not be able to actually perform a clone, and generic functions that are instantiated with scalable vectors would during instruction selection in the codegen backend. -When a scalable vector is instantiated into a generic function during -monomorphisation, or a trait method is being implemented for a scalable vector, -then the relevant target feature will be added to the function. +Without a mechanism for a generic function to be able to inherit target features +from its instantiated types or for trait methods to have target features, it is +not possible for these types to be used with generic functions or traits. -For example, when instantiating `std::mem::size_of_val` with a scalable vector -during monomorphisation, the relevant target feature will be added to `size_of_val` -for codegen. +See +[*Trait implementations and generic instantiation*][trait-implementations-and-generic-instantiation]. ## Implementing `rustc_scalable_vector` [implementing-rustc_scalable_vector]: #implementing-rustc_scalable_vector @@ -651,43 +663,47 @@ There are not many languages with support for scalable vectors: # Unresolved questions [unresolved-questions]: #unresolved-questions -There are currently no unresolved questions. +There is one outstanding unresolved question for scalable vectors: + +- How to support trait implementations and generic instantiation for scalable vectors? + + - See [*Target features*][target-features] and + [*Trait implementations and generic instantiation*][trait-implementations-and-generic-instantiation] # Future possibilities [future-possibilities]: #future-possibilities -There are a handful of future possibilities enabled by this RFC: +There are a handful of future possibilities enabled by this RFC - relaxing +restrictions, architecture-agnostic use or extending the feature to support more +features of the architecture extensions: -## General mechanism for target-feature-affected types -[general-mechanism-target-feature-types]: #general-mechanism-for-target-feature-affected-types +## Trait implementations and generic instantiation +[trait-implementations-and-generic-instantiation]: #trait-implementations-and-generic-instantiation -A more general mechanism for enforcing that SIMD types are only used in -`target_feature`-annotated functions would be useful, as this would enable SVE -types to have fewer distinct restrictions than other SIMD types, and would -enable SIMD vectors to be passed by-register, a performance improvement. +Improvements to the language's `target_feature` infrastructure could enable the +restrictions on trait implementations and generic instantiation to be lifted: -Such a mechanism would need to be introduced gradually to existing SIMD types -with a forward compatibility lint. This will be addressed in a forthcoming RFC. +- Some variety of [rfcs#3820: `target_feature_traits`][rfcs#3280] could help + traits be implemented on scalable vectors -## Relaxed restrictions -[relaxed-restrictions]: #relaxed-restrictions +- Efforts to integrate target features with the effect system ([rust#143352]) + may help enable generic instantiation of scalable vectors -Some of the restrictions on these types (e.g. use in compound types) could be + - Any mechanism that could be applied to scalable vector types could also be + used to enforce that existing SIMD types are only used in + `target_feature`-annotated functions, which would enable fixed-length + vectors to be passed by-register, improving performance + +## Compound types +[compound-types]: #compound-types + +The restriction that scalable vectors cannot be used in compound types could be relaxed at a later time either by extending rustc's codegen or leveraging newly added support in LLVM. -However, as C also has restriction and scalable vectors are nevertheless used in -production code, it is unlikely there will be much demand for those restrictions -to be relaxed. - -## Portable SIMD -[portable-simd]: #portable-simd - -Given that there are significant differences between scalable vectors and -fixed-length vectors, and that `std::simd` is unstable, it is worth -experimenting with architecture-specific support and implementation initially. -Later, there are a variety of approaches that could be taken to incorporate -support for scalable vectors into Portable SIMD. +However, as C also has thus restriction and scalable vectors are nevertheless +used in production code, it is unlikely there will be much demand for those +restrictions to be relaxed in LLVM. ## RISC-V Vector Extension's tuple types [rvv-tuples]: #risc-v-vector-extensions-tuple-types @@ -703,6 +719,15 @@ types in LLVM, would both be ``. RVV's tuple types need to be lowered to target-specific types in the backend which is out-of-scope of this general infrastructure for scalable vectors. +## Portable SIMD +[portable-simd]: #portable-simd + +Given that there are significant differences between scalable vectors and +fixed-length vectors, and that `std::simd` is unstable, it is worth +experimenting with architecture-specific support and implementation initially. +Later, there are a variety of approaches that could be taken to incorporate +support for scalable vectors into Portable SIMD. + [acle_sizeless]: https://arm-software.github.io/acle/main/acle.html#formal-definition-of-sizeless-types [dotnet]: https://github.com/dotnet/runtime/issues/93095 [prctl]: https://www.kernel.org/doc/Documentation/arm64/sve.txt @@ -710,7 +735,9 @@ which is out-of-scope of this general infrastructure for scalable vectors. [rfcs#1199]: https://rust-lang.github.io/rfcs/1199-simd-infrastructure.html [rfcs#3268]: https://github.com/rust-lang/rfcs/pull/3268 [rfcs#3729]: https://github.com/rust-lang/rfcs/pull/3729 +[rfcs#3280]: https://github.com/rust-lang/rfcs/pull/3280 [rust#63633]: https://github.com/rust-lang/rust/issues/63633 +[rust#143352]: https://github.com/rust-lang/rust/issues/143352 [rvv_bitsperblock]: https://github.com/llvm/llvm-project/blob/837b2d464ff16fe0d892dcf2827747c97dd5465e/llvm/include/llvm/TargetParser/RISCVTargetParser.h#L51 [rvv_typesystem]: https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/main/doc/rvv-intrinsic-spec.adoc#type-system [sve_minlength]: https://developer.arm.com/documentation/102476/0101/Introducing-SVE#:~:text=a%20minimum%20of%20128%20bits From ca4aa06f89eb725e76b258cd703b94e237b38549 Mon Sep 17 00:00:00 2001 From: David Wood Date: Tue, 5 Aug 2025 06:10:27 +0000 Subject: [PATCH 16/25] mention ffi-safety --- text/3838-scalable-vectors.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index 79fd783877f..7bd30ee4beb 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -267,6 +267,9 @@ Therefore, there is an additional restriction that these types cannot be used in the argument or return types of functions unless those functions are annotated with the relevant target feature. +As scalable vectors will always be passed as immediates, they will therefore +have the same ABI as in C, so should be considered FFI-safe. + ### Target features [target-features]: #target-features From 5011dc692f2a9315f0f17c952cd05f6ccdd51310 Mon Sep 17 00:00:00 2001 From: David Wood Date: Wed, 6 Aug 2025 05:58:17 +0000 Subject: [PATCH 17/25] add much more prior art --- text/3838-scalable-vectors.md | 495 +++++++++++++++++++++++++++++++++- 1 file changed, 491 insertions(+), 4 deletions(-) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index 7bd30ee4beb..b0194169cef 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -650,6 +650,11 @@ reasonable. # Prior art [prior-art]: #prior-art +[rfcs#3268] was a previous iteration of this RFC. + +## Other languages +[prior-art-inother-languages]: #other-languages + There are not many languages with support for scalable vectors: - SVE in C takes a similar approach as this proposal by using sizeless @@ -661,7 +666,437 @@ There are not many languages with support for scalable vectors: the design and implementation considerations in .NET are quite different to Rust. -[rfcs#3268] was a previous iteration of this RFC. +## `repr(simd)` and `target_feature` +[prior-art-in-rust]: #reprsimd-and-target_feature + +Both `repr(simd)` and `target_feature` attributes were initially proposed in +RFCs: + +- **[rfcs#2045]: `target_feature`** + - Original accepted RFC for `#[target_feature(enable = "..")]`. + - Of relevance to ABI-affecting target features, there was various discussion + around the RFC which led to discussion of ABI issues being + [relegated to an unresolved question][rfcs#2045-abi] + - At the time of writing the RFC, Portable SIMD types were the only types + that were considered as potentially having ABI issues, rather than types + to be used with vendor intrinsics + - > However, there are types that we might want to add to the language at + > some point, like portable vector types, for which this \[a lack of ABI + > changes] is not the case. + > + > The behaviour of `#[target_feature]` for those types should be + > specified in the RFC that proposes to stabilize those types, and this RFC + > should be amended as necessary. + - It does not appear that this has been considered for any intrinsic types + that were later stabilised (e.g. + [Neon intrinsics on AArch64][rust#90972]) and as such that these types + can exist in featureless functions is an accident of history + +- **[rust#44839]: Tracking issue for RFC 2045: improving `#[target_feature]`** + - Tracking issue for the `#[target_feature]` parts of RFC 2045 + - This issue hasn't been well-maintained and the description is out-of-date. + It aims to track both the addition of intrinsics for various architectures + which use `#[target_feature]` as well as improvements to the + `#[target_feature]` attribute itself + - It was last triaged by the language team in Mar 2022, concluding that the + issue needed an owner + +- **[rfcs#2396]: `#[target_feature]` 1.1** + - Allows specifying `#[target_feature]` functions without making them unsafe, + still requiring calls to be in unsafe blocks unless the calling function + also has the target features enabled + +- **[rfcs#1199]: `repr_simd`** + - Proposed `repr(simd)` attribute, applied to structs with multiple fields, + one for each element in the corresponding vector + - `repr(simd)` has since changed it was proposed in this RFC + - It is used for both portable SIMD types and non-portable types, and now + contains an array (i.e. `[f32; 4]` instead of `(f32, f32, f32, f32)`) + - Largely focused on portable SIMD, rather than non-portable intrinsics + - Proposed intrinsics be declared in `extern "platform-intrinsic"` blocks and + that platform detection be available (though this part was later subsumed by + [rfcs#2045]) + +- **[rust#27731]: Tracking issue for SIMD support** + - Initially tracked implementation of [rfcs#1199], eventually ended up + tracking `simd_ffi` ([rust#53346]), `repr(simd)` and `core::arch` intrinsics + - It was later closed and split up into tracking issues for each + architecture's intrinsics + +There are many existing issues and RFCs related to `repr(simd)`; interactions +between SIMD types and target features; and ABI incompatibilities with SIMD +types, surveyed in the sections below. + +Many of these issues were related to specific intrinsics on specific platforms +(adding, stabilising or fixing bugs with them), these have been omitted and only +issues that affect generic infrastructure are included. + +### Projections into `repr(simd)` +[prior-art-projections]: #projections-into-reprsimd + +A handful of issues are related to projections into `repr(simd)` types being +initially permitted.. + +- **[rust#105439]: ICE due to generating LLVM bitcast vec -> array** + - Accessing the field of a `repr(simd)` type causes an ICE + - Fixed by [rust#105583], changing codegen to remove the illegal operation + - Later addressed holistically by [compiler-team#838] which will ban projecting into `repr(simd)` types + - Landed in [rust#143833] + +- **[rust#137108]: Projecting into non-power-of-two-lanes `repr(simd)` types does the wrong thing** + - Repeat of [rust#105439]: + - Accessing the field of a `repr(simd)` type misbehaves + - Similarly addressed by [compiler-team#838] + +- **[rust#113465]: transmute + tuple access + eq on `repr(simd)`'s inner value seems UB to valgrind** + - Repeat of [rust#105439], except with Portable SIMD type ([portable-simd#339]) + - Accessing the field of a `repr(simd)` type misbehaves + - Similarly addressed by [compiler-team#838] + +These issues are informative for scalable vectors and projection into scalable +vectors will not be supported, as described in +[*Reference-level explanation*][reference-level-explanation]. + +### Inheritance of `target_feature` +[prior-art-target-feat-inheritance]: #inheritance-of-target_feature + +Other issues discussed confusion related to inheritance of `target_feature` to +nested functions and closures... + +- **[rust#58729]: target_feature doesn't trickle down to closures and internal fns** + - `target_feature` attribute doesn't apply to nested functions and closures + - Interaction with nested functions is expected, these never inherit from their parent + - Interaction with closures was a bug + - Prior to [rfcs#2396], closures would have required the ability to be marked as unsafe to support `target_feature` + - After [rfcs#2396], closures inheriting target features was accepted in [rust#73631] (then implemented in [rust#78231]) + - Interactions with `inline(always)` fixed in [rust#111836] + - `target_feature` attributes are ignored from `inline(always)`-annotated closures + +- **[rust#108338]: closure doesn't seem to inherit the target attributes for codegen purposes** + - Basically a dupe of [rust#58729] with same resolution + +- **[rust#111836]: Fix #\[inline(always)] on closures with target feature 1.1** + - Allows `#[inline(always)]` to be used with `#[target_feature]` on closures, + assuming that target features only affect codegen + +Scalable vectors will inherit the behaviour described above. + +### `repr(simd)` syntax +[prior-art-syntax]: #reprsimd-syntax + +There are issues related to how well `repr(simd)` syntax works with other +representation hints and whether a language item would be better: + +- **[rust#47103]: What to do about repr(C, simd)?** + - Unclear what the behaviour of `repr(C, simd)` should be + - When submitted, a warning of incompatible representation hints was emitted + - When omitted, a FFI unsafety warning was emitted when SIMD types used in + FFI + - Passing vectors as immediates is trickier, later resolved in [rfcs#2574], so + discussion focused on passing vectors indirectly over the FFI boundary + - Discussion fizzled out, but with [rust#116558], it may be possible to allow + `repr(C, simd)` + +- **[rust#130402]: When a type is `#[repr(simd)]`, `#[repr(align(N))]` annotations are ignored** + - `repr(align)` ignored when `repr(simd)` is present + - Intended to be fixed after [rust#137256] which refactored layout logic + within the compiler + - Unclear if the fix happened, but the code from the bug report still has + the unexpected alignment + +- **[rust#63633]: Remove repr(simd) attribute and use a lang-item instead** + - Suggests using a language item for the `Simd` type (part of Portable SIMD) + instead of using `repr(simd)` + - Doesn't address what would happen for non-portable intrinsics that also use + this infrastructure + - Various other issues cited as motivation: + - [rust#18147] used Portable SIMD `f64x2` and found that constant + initialisers weren't optimised with `-Copt-level=0` + - Not clear that this applies to types intended for use with + architecture-specific intrinsics + - [rust#47103] + - See above + - [rust#53346] + - See below + - [rust#77529] + - See below + - [rust#77866] defines its own `repr(simd)` type and then passes it to an + LLVM intrinsic binding that has been declared incorrectly + - [rust#81931] defines its own `repr(simd)` type and finds that it is + misaligned according to recommendations for achieving best performance + +On account of these concerns, scalable vectors use `rustc_scalable_vector` +instead. + +### Portable SIMD-specific +[prior-art-portable-simd]: #portable-simd-specific + +A handful of architecture-agnostic issues only relate to Portable SIMD: + +- **[rust#126217]: What should SIMD bitmasks look like?** + - Design discussion related to Portable SIMD + `simd_bitmask`/`simd_bitmask_select` intrinsics + - Not relevant to architecture-specific scalable vector intrinsics + +- **[rust#99211]: fn where clause "constrains trait impls" or something** + - Writing extension traits with const generics which. apply to Portable SIMD + types can run into tricky compiler errors related to the type system + - Only applies to Portable SIMD + +- **[rust#77529]: Invalid monomorphisation when `-Clink-dead-code` is used** + - `repr(simd)` types w/ generics (i.e. Portable SIMD or hand-rolled + equivalents) can have invalid instantiations with `-Clink-dead-code` + +These issues don't apply to scalable vectors. + +### Const-initialisation of vectors +[prior-art-const-init]: #const-initialisation-of-vectors + +There was a single issue related to const-initialisation of non-portable vector +types: + +- **[rust#48745]: Provide a way to const-initialise vendor-specific vector types** + - Initialisation of fixed length vectors was not possible in a const context + for non-portable SIMD + - `mem::transmute` being made constant has addressed this issue + +This issue doesn't apply to scalable vectors as they are inherently non-const. + +### `target_feature` ABI + +There have been well-documented issues with the ABI of fixed-length SIMD +vectors, many of which apply to scalable vectors too, but are harder to resolve: + +- **[rust#44367]: `repr(simd)` is unsound** + - `repr(simd)` types in functions with different target features enabled can + have different ABIs + - Fixed by passing SIMD types indirectly in [rust#47743] + +- **[rust#53346]: `repr(simd)` is unsound in C FFI** + - Same issue as in [rust#44367] but only with `extern "C"` functions where the + Rust ABI does not apply + - Later fixed by [rust#116558] + +- **[rust#87438]: future-incompat: use of SIMD types aren't gated properly** + - Calling a `extern "C"` function with an SIMD vector type in a `repr(C)` or + `repr(transparent)` struct doesn't error + - Later fixed by [rust#116558] + +- **[rfcs#2574]: `simd_ffi`** + - Permits calls to `extern "C"` functions with SIMD types so long as those + functions have the appropriate `target_feature` attribute + - Never fully implemented until [rust#116558] effectively did so + +- **[rust#131800]: Figure out which target features are required for which SIMD size** + - As part of [rust#116558], solicited input in determining which target + features were required for a given vector length so that the lint could + check for those + +- **[rust#133146]: How should we handle dynamic vector ABIs?** + - Follow-up to [rust#131800]: + - Existing ABI compatibility checks rely on the length of the vector and the + architecture to identify an appropriate target feature that must be + enabled, but this approach does not scale to scalable vectors + +- **[rust#133144]: How should we handle matrix ABIs?** + - Follow-up to [rust#131800]: + - Same as [rust#133146] but for matrix extensions + - Out-of-scope for this RFC + +- **[rust#116558]: The `extern "C"` ABI of SIMD vector types depends on target features (tracking issue for `abi_unsupported_vector_types` future-incompatibility lint)** + - Identified ABI incompatibility when calling `extern "C"` functions that used SIMD types + - Rust passes SIMD types indirectly for functions with and without + `target_feature` annotations in its ABI. `extern "C"` functions take SIMD + types as immediates + - Calls from annotated functions to `extern "C"` could use immediates, but + calls from non-annotated functions could not. Rust did not prevent calls + from non-annotated functions. + - A `abi_unsupported_vector_types` future-incompatibility lint was introduced + to enforce that `extern "C"` functions could not have SIMD types in their + signatures without the appropriate target feature being enabled + - The lint has since been removed and replaced with a hard error + - It only triggers when such a function is called + +- **[rust#132865]: Support calling functions with SIMD vectors that couldn't be used in the caller** + - Follow-up to [rust#116558] + - There are valid calls to `extern "C"` functions which take SIMD types that + are not currently accepted, such as checking for the presence of the target + feature and then calling the `extern "C"` function with a newly created + vector + - It is hard to support this as it is not possible to generate a call with a + specific ABI without annotating the entire containing function as having the + target feature ([llvm#70563]) + - This limitation also causes similar issues with inlining ([rust#116573]) + +- **[Pre-RFC: Fixing ABI for SIMD types][pre_rfc_simd]** + - Proposes requiring appropriate target features be enabled when a x86 SIMD + type is used in a function signature + - Written primarily considering x86 SIMD + - Considers both globally-enabled target features (e.g. `-Ctarget-feature` + or default features from target specification) and per-function-enabled + target features (`#[target_feature]`) + - Proposes generating shims to translate between ABIs when calling annotated + functions with SIMD type arguments from non-annotated functions + - Avoids breakage in cases similar to [rust#132865] but between annotated + and non-annotated functions, rather than just Rust ABI to non-Rust ABI + - Errors will be emitted for function pointers based on the target features + of the caller + - Prompted by discussion in [rust#116558] + - Never progressed to being a submitted RFC + - Discussed [on Zulip][pre_rfc_simd_zulip] + - How does the pre-RFC interact with Portable SIMD efforts? + - The inherent portability of these types means that they will need a + matching featureful and featureless ABI. It is suggested that this be + the current indirect ABI, but this isn't seen as desirable - ABI shims + or per-target-feature monomorphisation is to be explored + - Should there be a difference in codegen for calls to function items vs + function pointers (e.g. use of a shim)? + - Suggestion that an ABI shim be used for function pointers rather than + requiring target feature on functions with the call + - Is there a proper featureless ABI for x86 SIMD types? + - Yes, details in thread + - Should these changes also apply to `extern "Rust"`? + - Mixed opinions - enables use of performant ABI, larger breaking change + - Concern that doing this jeopardizes the entire proposal and that Rust is + stuck with the current behaviour + + > I still don't like the idea I'm forced to use an FFI calling + > convention in pure rust code because the default is fundamentally too + > slow + + > it's a tradeoff. should the default be portable or fast. I dont think + > there is an obvious right answer here. might be worth digging out the + > history that led to the current situation -- possibly this decision + > has been made in the past, in favor of "portable", and that's why the + > ABI works the way it does? + - Could use the performant ABI when global target features have feature + enabled + - References later design meeting ([lang-team#235]) + +- **[lang-team#235]: Design meeting: resolve ABI issues around target-feature** + ([meeting notes][lang-team#235-notes]) + - Proposes property that functions with the same signature will always have + the same ABI (i.e. that target features will not be considered in the ABI) + and three possible fixes: + - Track target features as part of function signatures, which is hard to do + without changing function pointer syntax + - Discussed briefly but not proposed due to concerns regarding breaking + change and expectation that function pointer syntax would need + changed/extended, and that it introduces a new semver hazard (adding a + SIMD type field) + - Suggested that if this route were taken then allowing target features + in extern blocks would be desirable and passing SIMD types using + registers could be considered + - Did not discuss challenges related to trait methods and generic + functions + - References [rust#111836] + - Define an ABI which does not depend on target features + - i.e. as the Rust ABI today with indirect passing of SIMD types + - Reject declaring/calling functions with target-feature-requiring types + when the ABI target feature is not available/enabled + - References [Pre-RFC: Fixing ABI for SIMD types][pre_rfc_simd], proposing + a variant of the RFC: + - Instead of applying to all ABIs (as in the pre-RFC) and using the + performant calling convention, it would only to non-Rust ABIs (e.g. + `extern "C"`) and would be based on a size of the vector to feature + mapping (rather than annotating types w/ the required features) + - This narrowly fixes the soundness issue and is the basis for the + currently implemented future incompatibility warning + - Reject enabling/disabling certain ABI-affecting features + - Discusses this as a solution for ABI issues relating to floats + - During the meeting, various points were discussed: + - Should it be possible to declare (non-Rust ABI) functions which take SIMD + vectors as long as there aren't calls? + - Intended to reduce potential opportunities breakage. + - How plausible is it to include ABI-affecting target features in function + pointer types? + - It is suggested that this information could be smuggled through the ABI + name: e.g. `extern "Rust+avx"` + - Concern that this is a wider breaking change and would require treating + ABIs specially or would require shims + - Feeling that this should be possible but not discussed further as an + immediate solution was desired to resolve unsoundness + - There was consensus that crater runs were needed to gather data + - After the meeting, the language team concluded: + > We discussed this question in the T-lang design meeting on 2023-12-20. The + > consensus was generally in favor of fixing this, and we were interested in + > seeing more work in this direction. While we considered various ways that + > this could be fixed, we were supportive of finding the most minimal, + > simplest way to fix this as the first step, assuming that such an approach + > proves to be feasible. We'll need to see at least the results of a crater + > run and further analysis to confirm that feasibility. + - It was explicitly noted in the notes that the language team didn't want to + rule out considering target features as part of the ABI: + + > > Track the target features in the function signature. This would + > > basically mean that function pointers now have to also list the set of + > > ABI-relevant target features that were enabled. This would be a rather + > > fundamental change requiring new function pointer syntax, and hard to + > > do without breaking code, so we mention it only for completeness' + > > sake. + > + > I don't think we should rule this out for the future. We've already + > talked about being able to track calling-convention ABI (extern "C" vs + > other ABIs) in function pointers somehow, so that we can safely track + > which kind of function we have. We've also talked about having this work + > in generics somehow, so that the Fn traits have a (defaulted) parameter + > for ABI or similar, and the monomorphization of a call will call the + > right ABI. + +See [*ABI*][abi] for discussion of these challenges as they apply to scalable +vectors. + +### Multiversioning and effects +[prior-art-future]: #multiversioning-and-effects + +There are ongoing efforts related to improving Rust's SIMD support: + +- **[rust-project-goals#261]: Nightly support for ergonomic SIMD multiversioning** + - Generating efficient code for specific SIMD ISAs requires `target_feature` + attributes on functions, which isn't particularly ergonomic + - Need to do runtime checks then dispatch to functions with target features + - Must be repeated when leaving and entering these functions + - Intermediate functions use `inline(always)` to avoid having to have + different versions for each target feature, which impacts code size + - Various solutions have been proposed - witness types carrying target feature + information, inherited target features from callers, features being const + generic arguments + - Goal aims to explore design space and experiment + +- **[lang-team#309]: SIMD multiversioning** ([pre-read][lang-team#309-notes]) + - Design meeting proposal, has not yet taken place + - Compares two related proposals that address some of the problems with + multiversioning + - Ideas similar to [rfcs#3525] + - In brief: attaching target features to types, introduce traits that + abstract over common operations, functions are generic over those traits + and when instantiated with a function-carrying type, inherit the target + feature and use the trait methods to do SIMD operations + - Ideas similar to [unopened contextual target features RFC][rfcs-contextual] + - In brief: `#[target_features(caller)]` which causes a function to + inherit the features of the caller + - Expands on this with a `#[target_features(generic)]` attribute which + takes target features from the first const generic argument of the + function (i.e. a `const FEATURES: str`) + - Both ideas have many open design questions + +- **[rust#143352]: Tracking issue for Effective Target Features** + - [Initial proposal][rust#143352-proposal] aims to experiment with SIMD + multiversioning based on the effect model used with const traits + - In brief: traits can be defined as having a target feature effect, + implementations of those traits define the target feature that is enabled by + the effect, bounds on the trait in functions will enable the target feature + from the impl + +- **[lang-team#317]: Design meeting: "Marker effects"** ([meeting notes][lang-team#317-notes]) + - Discusses the findings from investigations into keyword generics and the + implementation of const traits, proposing a categorisation of effects and a + subset to focus on initially + - Briefly discusses that effects overlap with the SIMD multiversioning efforts + +These efforts are followed with interest as they may synergise well with +resolving the similar challenges that scalable vectors face. See +[*Trait implementations and generic instantiation*][trait-implementations-and-generic-instantiation]. # Unresolved questions [unresolved-questions]: #unresolved-questions @@ -695,7 +1130,7 @@ restrictions on trait implementations and generic instantiation to be lifted: - Any mechanism that could be applied to scalable vector types could also be used to enforce that existing SIMD types are only used in `target_feature`-annotated functions, which would enable fixed-length - vectors to be passed by-register, improving performance + vectors to be passed as immediates, improving performance ## Compound types [compound-types]: #compound-types @@ -732,15 +1167,67 @@ Later, there are a variety of approaches that could be taken to incorporate support for scalable vectors into Portable SIMD. [acle_sizeless]: https://arm-software.github.io/acle/main/acle.html#formal-definition-of-sizeless-types +[compiler-team#838]: https://github.com/rust-lang/compiler-team/issues/838 [dotnet]: https://github.com/dotnet/runtime/issues/93095 +[lang-team#235-notes]: https://hackmd.io/Dnd0ZIN6RjqbRlEs2GWE5Q +[lang-team#235]: https://github.com/rust-lang/lang-team/issues/235 +[lang-team#309-notes]: https://hackmd.io/@veluca93/simd-multiversioning +[lang-team#309]: https://github.com/rust-lang/lang-team/issues/309 +[lang-team#317-notes]: https://hackmd.io/xydafCtMQ1aqUbm6wqmEmA?view +[lang-team#317]: https://github.com/rust-lang/lang-team/issues/317 +[llvm#70563]: https://github.com/llvm/llvm-project/issues/70563 +[portable-simd#339]: https://github.com/rust-lang/portable-simd/issues/339 [prctl]: https://www.kernel.org/doc/Documentation/arm64/sve.txt +[pre_rfc_simd_zulip]: https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/Pre-RFC.20discussion.3A.20Forbidding.20SIMD.20types.20w.2Fo.20features/near/399174036 +[pre_rfc_simd]: https://hackmd.io/@chorman0773/SJ1rZPWZ6 [quote_amanieu]: https://github.com/rust-lang/rust/pull/118917#issuecomment-2202256754 +[rfcs-contextual]: https://github.com/calebzulawski/rfcs/blob/contextual-target-features/text/0000-contextual-target-features.md [rfcs#1199]: https://rust-lang.github.io/rfcs/1199-simd-infrastructure.html +[rfcs#2045-abi]: https://rust-lang.github.io/rfcs/2045-target-feature.html#how-do-we-handle-abi-issues-with-portable-vector-types +[rfcs#2045]: https://github.com/rust-lang/rfcs/pull/2045 +[rfcs#2396]: https://github.com/rust-lang/rfcs/pull/2396 +[rfcs#2574]: https://github.com/rust-lang/rfcs/pull/2574 [rfcs#3268]: https://github.com/rust-lang/rfcs/pull/3268 -[rfcs#3729]: https://github.com/rust-lang/rfcs/pull/3729 [rfcs#3280]: https://github.com/rust-lang/rfcs/pull/3280 -[rust#63633]: https://github.com/rust-lang/rust/issues/63633 +[rfcs#3525]: https://github.com/rust-lang/rfcs/pull/3525 +[rfcs#3729]: https://github.com/rust-lang/rfcs/pull/3729 +[rust-project-goals#261]: https://github.com/rust-lang/rust-project-goals/issues/261 +[rust#105439]: https://github.com/rust-lang/rust/issues/105439 +[rust#105583]: https://github.com/rust-lang/rust/issues/105583 +[rust#108338]: https://github.com/rust-lang/rust/issues/108338 +[rust#111836]: https://github.com/rust-lang/rust/issues/111836 +[rust#113465]: https://github.com/rust-lang/rust/issues/113465 +[rust#116558]: https://github.com/rust-lang/rust/issues/116558 +[rust#116573]: https://github.com/rust-lang/rust/issues/116573 +[rust#126217]: https://github.com/rust-lang/rust/issues/126217 +[rust#130402]: https://github.com/rust-lang/rust/issues/130402 +[rust#131800]: https://github.com/rust-lang/rust/issues/131800 +[rust#132865]: https://github.com/rust-lang/rust/issues/132865 +[rust#133144]: https://github.com/rust-lang/rust/issues/133144 +[rust#133146]: https://github.com/rust-lang/rust/issues/133146 +[rust#137108]: https://github.com/rust-lang/rust/issues/137108 +[rust#137256]: https://github.com/rust-lang/rust/issues/137256 +[rust#143352-proposal]: https://hackmd.io/M5ZAoRqSTb27oLBuEpByIA [rust#143352]: https://github.com/rust-lang/rust/issues/143352 +[rust#143833]: https://github.com/rust-lang/rust/issues/143833 +[rust#18147]: https://github.com/rust-lang/rust/issues/18147 +[rust#27731]: https://github.com/rust-lang/rust/issues/27731 +[rust#44367]: https://github.com/rust-lang/rust/issues/44367 +[rust#44839]: https://github.com/rust-lang/rust/issues/44839 +[rust#47103]: https://github.com/rust-lang/rust/issues/47103 +[rust#47743]: https://github.com/rust-lang/rust/issues/47743 +[rust#48745]: https://github.com/rust-lang/rust/issues/48745 +[rust#53346]: https://github.com/rust-lang/rust/issues/53346 +[rust#58729]: https://github.com/rust-lang/rust/issues/58729 +[rust#63633]: https://github.com/rust-lang/rust/issues/63633 +[rust#73631]: https://github.com/rust-lang/rust/issues/73631 +[rust#77529]: https://github.com/rust-lang/rust/issues/77529 +[rust#77866]: https://github.com/rust-lang/rust/issues/77866 +[rust#78231]: https://github.com/rust-lang/rust/issues/78231 +[rust#81931]: https://github.com/rust-lang/rust/issues/81931 +[rust#87438]: https://github.com/rust-lang/rust/issues/87438 +[rust#90972]: https://github.com/rust-lang/rust/issues/90972 +[rust#99211]: https://github.com/rust-lang/rust/issues/99211 [rvv_bitsperblock]: https://github.com/llvm/llvm-project/blob/837b2d464ff16fe0d892dcf2827747c97dd5465e/llvm/include/llvm/TargetParser/RISCVTargetParser.h#L51 [rvv_typesystem]: https://github.com/riscv-non-isa/rvv-intrinsic-doc/blob/main/doc/rvv-intrinsic-spec.adoc#type-system [sve_minlength]: https://developer.arm.com/documentation/102476/0101/Introducing-SVE#:~:text=a%20minimum%20of%20128%20bits From 494da8afe28e4e8f3001ae92948281f7c51dd92a Mon Sep 17 00:00:00 2001 From: David Wood Date: Wed, 6 Aug 2025 06:18:24 +0000 Subject: [PATCH 18/25] move vector length at runtime warning into section --- text/3838-scalable-vectors.md | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index b0194169cef..6a6c198f58b 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -289,6 +289,20 @@ not possible for these types to be used with generic functions or traits. See [*Trait implementations and generic instantiation*][trait-implementations-and-generic-instantiation]. +## Changing vector lengths at runtime +[changing-vector-lengths-at-runtime]: #changing-vector-lengths-at-runtime + +It is possible to change the vector length at runtime using a +[`prctl()`][prctl] call to the Linux kernel, or via similar mechanisms in other +operating systems. + +Doing so would require that `vscale` change, which Rust will not supported. + +`prctl` or similar must only be used to set up the vector length for child +processes, not to change the vector length of the current process. As Rust +cannot prevent users from doing this, it will be documented as undefined +behaviour, consistent with C and C++. + ## Implementing `rustc_scalable_vector` [implementing-rustc_scalable_vector]: #implementing-rustc_scalable_vector @@ -314,13 +328,6 @@ result in register sizes of 128, 256, 512, 1024 or 2048 and 4, 8, 16, 32, or 64 The `N` in the `#[rustc_scalable_vector(N)]` determines the `element_count` used in the LLVM type for a scalable vector. -While it is possible to change the vector length at runtime using a -[`prctl()`][prctl] call to the kernel, this would require that `vscale` change, -which is unsupported. `prctl` must only be used to set up the vector length for -child processes, not to change the vector length of the current process. As Rust -cannot prevent users from doing this, it will be documented as undefined -behaviour, consistent with C and C++. - Tuples in RISC-V's V Extension lower to target-specific types in LLVM rather than generic scalable vector types, so `rustc_scalable_vector` will not initially support RVV tuples (see From 6186245a5f44f9d41852284c00cd9c066640936c Mon Sep 17 00:00:00 2001 From: David Wood Date: Wed, 6 Aug 2025 06:41:10 +0000 Subject: [PATCH 19/25] updated representation of tuples of vectors --- text/3838-scalable-vectors.md | 88 ++++++++++++++++++++--------------- 1 file changed, 50 insertions(+), 38 deletions(-) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index 6a6c198f58b..9a59ca6551c 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -214,6 +214,33 @@ backend. It must one of the following types: It is not permitted to project into scalable vector types and access the type marker field. +## Tuples of vectors +[tuples-of-vectors]: #tuples-of-vectors + +Structs of scalable vectors are supported, but every element of the struct must +have the same scalable vector type. This will enable definition of "tuple of +vector" types, such as `svfloat32x2_t` below, that are used in some load and +store intrinsics. + +```rust +#[rustc_scalable_vector] +pub struct svfloat32x2_t(svfloat32_t, svfloat32_t); +``` + +```text +◁───────────── vscale x f32 x 4 ─────────────▷ ◁────── f32 x 4 ──────▷ +┌──────────────────────────────────────────────┬───────────────────────┐ ┐ +│ ... │ f32 │ f32 │ f32 │ f32 │ │ +└──────────────────────────────────────────────┴───────────────────────┘ ├─ svfloat32x2_t +┌──────────────────────────────────────────────┬───────────────────────┐ │ +│ ... │ f32 │ f32 │ f32 │ f32 │ │ +└──────────────────────────────────────────────┴───────────────────────┘ ┘ +``` + +Structs must be still be annotated with `#[rustc_scalable_vector]`, so end-users +cannot define their own structs of scalable vectors. It is not permitted to +project into structs and access the individual vectors. + ## Properties of scalable vectors [properties-of-scalable-vector-types]: #properties-of-scalable-vectors @@ -236,6 +263,12 @@ codegen backend: - `repr(transparent)` newtypes could be permitted with scalable vectors + - **Exception:** Scalable vectors can be stored in arrays + + - **Exception:** Scalable vectors can be stored in structs with every + element of the same type (but only if that struct is annotated with + `#[rustc_scalable_vector]`) + - Cannot be used in arrays - Cannot be the type of a static variable @@ -309,12 +342,12 @@ behaviour, consistent with C and C++. Implementing `rustc_scalable_vector` largely involves lowering scalable vectors to the appropriate type in the codegen backend. LLVM has robust support for scalable vectors and is the default backend, so this section will focus on -implementation in the LLVM codegen backend. Other codegen backends can implement -support when scalable vectors are supported by the backend. +implementation in the LLVM codegen backend. Other backends should be able to +support scalable vectors in Rust once they support scalable vectors in general. -Most of the complexity of SVE is handled by LLVM: lowering Rust's scalable -vectors to the correct type in LLVM and the `vscale` modifier that is applied to -LLVM's vector types. +Most of the complexity of scalable vectors are handled by LLVM: lowering Rust's +scalable vectors to the correct type in LLVM and the `vscale` modifier that is +applied to LLVM's vector types. LLVM's scalable vector type is of the form ``. `vscale` is the scaling factor determined by the hardware at runtime, it can be @@ -328,6 +361,14 @@ result in register sizes of 128, 256, 512, 1024 or 2048 and 4, 8, 16, 32, or 64 The `N` in the `#[rustc_scalable_vector(N)]` determines the `element_count` used in the LLVM type for a scalable vector. +Structs of vectors are lowered to LLVM as struct types containing scalable +vector types. This is supported since the +[*Permit load/store/alloca for struct of the same scalable vector type* LLVM RFC][llvm-rfc-structs]. + +Arrays of vectors are lowered to LLVM as array types containing scalable vector +types. Arrays of vectors are also supported by LLVM since the +[*Enable arrays of scalable vector types* LLVM RFC][llvm-rfc-arrays]. + Tuples in RISC-V's V Extension lower to target-specific types in LLVM rather than generic scalable vector types, so `rustc_scalable_vector` will not initially support RVV tuples (see @@ -437,38 +478,7 @@ calculations for `N` or architecture-specific knowledge: attribute accepting arbitrary specification of `N` or a type to calculate `N` with. -3. Also with Arm SVE, some load and store intrinsics take tuples of vectors, - such as `svfloat32x2_t`: - - ```text - ◁───────────── vscale x f32 x 4 ─────────────▷ ◁────── f32 x 4 ──────▷ - ┌──────────────────────────────────────────────┬───────────────────────┐ ┐ - │ ... │ f32 │ f32 │ f32 │ f32 │ │ - └──────────────────────────────────────────────┴───────────────────────┘ ├─ svfloat32x2_t - ┌──────────────────────────────────────────────┬───────────────────────┐ │ vscale x f32 x 8 - │ ... │ f32 │ f32 │ f32 │ f32 │ │ - └──────────────────────────────────────────────┴───────────────────────┘ ┘ - ``` - - These types are the opposite of the previous complicating case, containing - more elements than `vunit / element_size`. These use two or more registers to - represent the vector. - - `vscale x f32 x 8` cannot be defined without the attribute accepting - arbitrary specification of `N` or an argument to the attribute to specify the - number of registers used: - - ```rust - // alternative: user-provided arbitrary `N` - #[rustc_scalable_vector(8)] - struct svfloat32x2_t(f32); - - // alternative: add `tuple_of` to attribute - #[rustc_scalable_vector(tuple_of = "2")] // either `1` (default), `2`, `3` or `4` - struct svfloat32x2_t(f32); - ``` - -4. RISC-V RVV's scalable vectors are quite different from Arm's SVE, while +3. RISC-V RVV's scalable vectors are quite different from Arm's SVE, while sharing the same underlying infrastructure in LLVM. SVE's scalable vector types map directly onto LLVM scalable vector types, and @@ -1146,7 +1156,7 @@ The restriction that scalable vectors cannot be used in compound types could be relaxed at a later time either by extending rustc's codegen or leveraging newly added support in LLVM. -However, as C also has thus restriction and scalable vectors are nevertheless +However, as C also has this restriction and scalable vectors are nevertheless used in production code, it is unlikely there will be much demand for those restrictions to be relaxed in LLVM. @@ -1182,6 +1192,8 @@ support for scalable vectors into Portable SIMD. [lang-team#309]: https://github.com/rust-lang/lang-team/issues/309 [lang-team#317-notes]: https://hackmd.io/xydafCtMQ1aqUbm6wqmEmA?view [lang-team#317]: https://github.com/rust-lang/lang-team/issues/317 +[llvm-rfc-arrays]: https://discourse.llvm.org/t/rfc-enable-arrays-of-scalable-vector-types/72935 +[llvm-rfc-structs]: https://discourse.llvm.org/t/rfc-ir-permit-load-store-alloca-for-struct-of-the-same-scalable-vector-type/69527 [llvm#70563]: https://github.com/llvm/llvm-project/issues/70563 [portable-simd#339]: https://github.com/rust-lang/portable-simd/issues/339 [prctl]: https://www.kernel.org/doc/Documentation/arm64/sve.txt From 9466d266087ffe734326c2854a69bf10cc039e2a Mon Sep 17 00:00:00 2001 From: David Wood Date: Wed, 6 Aug 2025 09:00:40 +0000 Subject: [PATCH 20/25] another possibility for removing restrictions --- text/3838-scalable-vectors.md | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index 9a59ca6551c..8ff8ee0ea9c 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -1149,6 +1149,19 @@ restrictions on trait implementations and generic instantiation to be lifted: `target_feature`-annotated functions, which would enable fixed-length vectors to be passed as immediates, improving performance +- It may be possible to support scalable vector types without the target feature + being enabled by using an indirect ABI similarly to fixed length vectors. + + - This would enable these restrictions to be lifted and for scalable vector + types to be the same as fixed length vectors with respect to interactions + with the `target_feature` attribute. + + - As with fixed length vectors, it would still be desirable for them to + avoid needing to be passed indirectly between annotated functions, but + this could be addressed in a follow-up. + + - Experimentation is required to determine if this is feasible. + ## Compound types [compound-types]: #compound-types From c6d836de9fca7b4e739478141ffd3a1643fd1ec9 Mon Sep 17 00:00:00 2001 From: David Wood Date: Wed, 6 Aug 2025 09:07:29 +0000 Subject: [PATCH 21/25] clarify function pointers --- text/3838-scalable-vectors.md | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index 8ff8ee0ea9c..9b5d496c549 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -300,6 +300,13 @@ Therefore, there is an additional restriction that these types cannot be used in the argument or return types of functions unless those functions are annotated with the relevant target feature. +It is permitted to create pointers to function that have scalable vector types +in their arguments or return types, even though function pointers themselves +cannot be annotated as having the target feature. When the function pointer was +created, the user must have made the unsafe promise that it was okay to call the +`#[target_feature]`-annotated function, so it is sound to permit function +pointers. + As scalable vectors will always be passed as immediates, they will therefore have the same ABI as in C, so should be considered FFI-safe. From 968ae3ed02e0bfa11db28378bbf8cf23e22d3928 Mon Sep 17 00:00:00 2001 From: David Wood Date: Wed, 6 Aug 2025 09:08:00 +0000 Subject: [PATCH 22/25] correct no action rationale --- text/3838-scalable-vectors.md | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index 9b5d496c549..6ae1dce5848 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -392,10 +392,9 @@ initially support RVV tuples (see Without support for scalable vectors in the language and compiler, it is not possible to leverage hardware with scalable vectors from Rust. As extensions -with scalable vectors are available in architectures as either the only or -recommended way to do SIMD, lack of support in Rust would severely limit Rust's -suitability on these architectures compared to other systems programming -languages. +with scalable vectors are available in architectures as the recommended way to +do SIMD, lack of support in Rust would limit Rust's suitability on these +architectures compared to other systems programming languages. `rustc_scalable_vector` is preferred over a `repr(scalable)` attribute as there is existing dissatisfaction with fixed-length vectors being defined using the From 530378cf66ce504c386f61001c682dbf199bd4e6 Mon Sep 17 00:00:00 2001 From: David Wood Date: Wed, 6 Aug 2025 09:12:15 +0000 Subject: [PATCH 23/25] mention dyn incompatibility --- text/3838-scalable-vectors.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index 6ae1dce5848..1807198e0a3 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -300,6 +300,8 @@ Therefore, there is an additional restriction that these types cannot be used in the argument or return types of functions unless those functions are annotated with the relevant target feature. +Any such functions would make a trait containing them dyn-incompatible. + It is permitted to create pointers to function that have scalable vector types in their arguments or return types, even though function pointers themselves cannot be annotated as having the target feature. When the function pointer was From 0fe50a869d7d33d48ff46e016609d8cb165f1727 Mon Sep 17 00:00:00 2001 From: David Wood Date: Thu, 7 Aug 2025 02:04:20 +0000 Subject: [PATCH 24/25] does not build on `repr(simd)` --- text/3838-scalable-vectors.md | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index 1807198e0a3..c122885c0ff 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -15,9 +15,7 @@ only in the standard library to introduce scalable vector types which can then be stabilised. Only the infrastructure to define these types are introduced in this RFC, not the types or intrinsics that use it. -This RFC builds on Rust's existing SIMD infrastructure, introduced in -[rfcs#1199: SIMD Infrastructure][rfcs#1199]. It depends on -[rfcs#3729: Hierarchy of Sized traits][rfcs#3729]. +This RFC depends on [rfcs#3729: Hierarchy of Sized traits][rfcs#3729]. SVE is used in examples throughout this RFC, but the proposed features should be sufficient to enable support for similar extensions in other architectures, such From 442beeae4054837c76635636b4a7f7da3b3c7802 Mon Sep 17 00:00:00 2001 From: David Wood Date: Tue, 12 Aug 2025 10:26:47 +0000 Subject: [PATCH 25/25] fix inconsistency with array support --- text/3838-scalable-vectors.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/text/3838-scalable-vectors.md b/text/3838-scalable-vectors.md index c122885c0ff..85d27c4c19e 100644 --- a/text/3838-scalable-vectors.md +++ b/text/3838-scalable-vectors.md @@ -267,8 +267,6 @@ codegen backend: element of the same type (but only if that struct is annotated with `#[rustc_scalable_vector]`) -- Cannot be used in arrays - - Cannot be the type of a static variable - Cannot be instantiated into generic functions (see