Skip to content

repr(scalable) #3838

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
386 changes: 386 additions & 0 deletions text/3838-repr-scalable.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,386 @@
- Feature Name: `repr_scalable`
- Start Date: 2025-07-07
- RFC PR: [rust-lang/rfcs#3838](https://github.com/rust-lang/rfcs/pull/3838)
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)

# Summary
[summary]: #summary

Extends Rust's existing SIMD infrastructure, `#[repr(simd)]`, with a
complementary scalable representation, `#[repr(scalable(N)]`, to support
scalable vector types, such as Arm's Scalable Vector Extension (SVE), or
RISC-V's Vector Extension (RVV).

Like the existing `repr(simd)` representation, `repr(scalable(N))` is internal
compiler infrastructure that will be used only in the standard library to
introduce scalable vector types which can then be stablised. Only the
infrastructure to define these types are introduced in this RFC, not the types
or intrinsics that use it.

This RFC builds on Rust's existing SIMD infrastructure, introduced in
[rfcs#1199: SIMD Infrastructure][rfcs#1199]. It depends on
[rfcs#3729: Hierarchy of Sized traits][rfcs#3729].

SVE is used in examples throughout this RFC, but the proposed features should be
sufficient to enable support for similar extensions in other architectures, such
as RISC-V's V Extension.

# Motivation
[motivation]: #motivation

SIMD types and instructions are a crucial element of high-performance Rust
applications and allow for operating on multiple values in a single instruction.
Many processors have SIMD registers of a known fixed length and provide
intrinsics which operate on these registers. For example, Arm's Neon extension
is well-supported by Rust and provides 128-bit registers and a wide range of
intrinsics.

Instead of releasing more extensions with ever increasing register bit widths,
AArch64 has introduced a Scalable Vector Extension (SVE). Similarly, RISC-V has
a Vector Extension (RVV). These extensions have vector registers whose width
depends on the CPU implementation and bit-width-agnostic intrinsics for
operating on these registers. By using scalable vectors, code won't need to be
re-written using new architecture extensions with larger registers, new types
and intrinsics, but instead will work on newer processors with different vector
register lengths and performance characteristics.

Scalable vectors have interesting and challenging implications for Rust,
introducing value types with sizes that can only be known at runtime, requiring
significant changes to the language's notion of sizedness - this support is
being proposed in the [rfcs#3729].

Hardware is generally available with SVE, and key Rust stakeholders want to be
able to use these architecture features from Rust. In a [recent discussion on
SVE, Amanieu, co-lead of the library team, said][quote_amanieu]:

> I've talked with several people in Google, Huawei and Microsoft, all of whom
> have expressed a rather urgent desire for the ability to use SVE intrinsics in
> Rust code, especially now that SVE hardware is generally available.

Without support in the compiler, leveraging the
[*Hierarchy of Sized traits*][rfcs#3729] proposal, it is not possible to
introduce intrinsics and types exposing the scalable vector support in hardware.

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

None of the infrastructure proposed in this RFC is intended to be used directly
by Rust users.

`repr(scalable)` as described later in
[*Reference-level explanation*][reference-level-explanation] is perma-unstable
and exists only enables scalable vector types to be defined in the standard
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
and exists only enables scalable vector types to be defined in the standard
and exists only to enable scalable vector types to be defined in the standard

library. The specific vector types are intended to eventually be stabilised, but
none are proposed in this RFC.

## Using scalable vectors
[using-scalable-vectors]: #using-scalable-vectors

Scalable vector types correspond to vector registers in hardware with unknown
size at compile time. However, it will be a known and fixed size at runtime.
Additional properties could be known during compilation, depending on the
architecture, such as a minimum or maximum size or that the size must be a
multiple of some factor.

As previously described, users will not define their own scalable vector types
and instead use intrinsics from `std::arch`, and this RFC is not proposing any
such intrinsics, just the infrastructure. However, to illustrate how the types
and intrinsics that this infrastructure will enable can be used, consider the
following example that sums two input vectors:

```rust
fn sve_add(in_a: Vec<f32>, in_b: Vec<f32>, out_c: &mut Vec<f32>) {
let len = in_a.len();
Comment on lines +92 to +93
Copy link
Member

@RalfJung RalfJung Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are assuming here that len is a multiple of step? The scalar version would not typically make such an assumption, so this is worth calling out.

EDIT: Or does the svwhilelt_b32 part handle that? The comments are not clear enough for me to tell. The mask being "based on" index and len doesn't say how it is based on them...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let len = in_a.len();
assert_eq!(a.len(), b.len());
assert_eq!(a.len(), c.len());
let len = in_a.len();

unsafe {
// `svcntw` returns the actual number of elements that are in a 32-bit
// element vector
let step = svcntw() as usize;
for i in (0..len).step_by(step) {
let a = in_a.as_ptr().add(i);
let b = in_b.as_ptr().add(i);
let c = out_c as *mut f32;
let c = c.add(i);

// `svwhilelt_b32` generates a mask based on comparing the current
// index against the `len`
Comment on lines +104 to +105
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// `svwhilelt_b32` generates a mask based on comparing the current
// index against the `len`
// `svwhilelt_b32` deals with the tail of the iteration: it generates a mask that
// is enabled for the first `len` elements overall, but disables the last `len % step`
// elements in the last iteration.

Is this correct? I am guessing here.

let pred = svwhilelt_b32(i as _, len as _);

// `svld1_f32` loads a vector register with the data from address
// `a`, zeroing any elements in the vector that are masked out
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And crucially, those "zeroed" elements never even get loaded, so it is okay of they are out-of-bounds?

let sva = svld1_f32(pred, a);
let svb = svld1_f32(pred, b);

// `svadd_f32_m` adds `a` and `b`, any lanes that are masked out will
// take the keep value of `a`
let svc = svadd_f32_m(pred, sva, svb);

// `svst1_f32` will store the result without accessing any memory
// locations that are masked out
svst1_f32(svc, pred, c);
}
}
}
```

From a user's perspective, writing code for scalable vectors isn't too different
from when writing code with a fixed sized vector.

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

Types annotated with the `#[repr(simd)]` attribute contains either an array
field or multiple fields to indicate the intended size of the SIMD vector that
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as of rust-lang/rust#129403, #[repr(simd)] only supports an array field.

the type represents.

Similarly, a `scalable(N)` representation is introduced to define a scalable
vector type. `scalable(N)` accepts an integer to determine the minimum number of
elements the vector contains. For example:
Comment on lines +128 to +137
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use repr(C) because otherwise some hypothetical CStruct repr type would have to be parameterized by every single arrangement of types, in an ordered fashion.

But repr(simd) does not pair with either repr(packed) or repr(align(N)) coherently, and neither would repr(scalable).

I do not think this should be handled by a new repr... modeling it on a still-unstable repr(simd)... which, as we have discovered over time with existing reprs, has a million footguns for how they work and get compiled. A lang item seems more to-the-point. I intend to make this true for repr(simd) as well anyways.


```rust
#[repr(simd, scalable(4))]
pub struct svfloat32_t { _ty: [f32], }
Copy link
Member

@workingjubilee workingjubilee Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of the aspects of using these random repr attributes is it is misleading. This looks like it is unsized to begin with, but these are unconst Sized.

```

As with the existing `repr(simd)`, `_ty` is purely a type marker, used to get
the element type for the codegen backend.

`svfloat32_t` is a scalable vector with a minimum of four `f32` elements and
potentially more depending on the length of the vector register at runtime.

## Choosing `N`
[choosing-n]: #choosing-n

Many intrinsics using scalable vectors accept both a predicate vector argument
and data vector arguments. Predicate vectors determine whether a lane is on or
off for the operation performed by any given intrinsic. Predicate vectors may
use different registers of sizes to the vectors containing data.
`repr(scalable)` is used to define vectors containing both data and predicates.

As `repr(scalable(N))` is intended to be a permanently unstable attribute, any
value of `N` is accepted by the attribute and it is the responsibility of
whomever is defining the type to provide a valid value. A correct value for `N`
depends on the purpose of the specific scalable vector type and the
architecture.
Comment on lines +159 to +163
Copy link
Member

@workingjubilee workingjubilee Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of abdication of responsibility in practice, when the rules are simply not that hard to work out, just leaves it harder to maintain the compiler in the future as it amounts to saying "We don't have to write down any reasoning about validity, not even in documentation, and we will allow someone in the future to discover why our decisions were fragile and sometimes incorrect without protecting them against more easily-induced miscompilations". I realize rust-lang has accepted this kind of reasoning in the past, (edited by moderator) future maintainers once you can no longer ask the previous ones due to their inevitable burnout.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if you engaged on this RFC without describing decisions you disagree with as "sucks ass" or "pissing in the eyes of future maintainers". I'm open to discussing the details of this proposal and changing them, as with any proposal, but I'm much more open to that when concerns are phrased like this concern was in Ralf's comment rather than how you've chosen to phrase yours.


For example, with SVE, the scalable vector register length is a minimum of 128
bits, must be a multiple of 128 bits and a power of 2; and predicate registers
have one bit for each byte in the vector registers. So, for `svfloat32_t`
defined shown above, an `f32` is 32-bits and with `N=4`, the entire minimum
register length of 128 bits is used (4 x 32 = 128). An intrinsic that takes a
`svfloat32_t` may also want to accept as an argument a predicate vector with a
matching four elements (`N=4`), which would only use 4 bits of the predicate
register rather than the full 16 bits.

See
[*Manually-chosen or compiler-calculated element count*][manual-or-calculated-element-count]
for a discussion on why `N` is not calculated by the compiler.

## Properties of scalable vectors
[properties-of-scalable-vector-types]: #properties-of-scalable-vectors

Scalable vectors are necessarily non-`const Sized` (from [rfcs#3729]) as they
behave like value types but the exact size cannot be known at compilation time.

[rfcs#3729] allows these types to implement `Clone` (and consequently `Copy`) as
`Clone` only requires an implementation of `Sized`, irrespective of constness.

Scalable vector types have some further restrictions due to limitations of the
codegen backend:

- Cannot be stored in compound types (structs, enums, etc)

- Including coroutines, so these types cannot be held across an await
boundary in async functions

- `repr(transparent)` newtypes could be permitted with scalable vectors

- Cannot be used in arrays

- Cannot be the type of a static variable.

Some of these limitations may be able to be lifted in future depending on what
is supported by rustc's codegen backends.

## ABI
[abi]: #abi

Rust currently always passes SIMD vectors on the stack to avoid ABI mismatches
between functions annotated with `target_feature` - where the relevant vector
register is guaranteed to be present - and those without - where the relevant
vector register might not be present.

However, this approach will not work for scalable vector types as the relevant
target feature must to be present to use the instruction that can allocate the
correct size on the stack for the scalable vector.

Therefore, there is an additional restriction that these types cannot be used in
the argument or return types of functions unless those functions are annotated
with the relevant target feature.
Comment on lines +216 to +218
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this not mean that these types cannot be used in function pointers, as we have no knowledge of the target features of a function pointer?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the same reason it would also clash with dyn-compatible traits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is similar to the existing checks we have for __m256 only being allowed in extern "C" functions when AVX is available. They can be used in function ptrs just fine, because someone somewhere must have made the unsafe promise that it is okay to call a #[target_feature] function since we actually do have the target feature.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean for generic functions? I could write a fn foo<T>(x: Box<T>) that internally passes a T to some function by-value. Exactly during which stage of the compiler do we stop compilation if T ends up being a scalable vector type?

It seems to me that the only practical answer is "during monomorphization". Post-mono errors are rather undesirable though (and this would be easily observable by users on stable) so this would definitely need t-lang signoff. Please explicitly not this as a drawback.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I missed the sentence about automatically adding target features to generic functions... for reasons discussed here, that is unfortunately not practical, so I discarded it as an option in my comment above.


## Target features
[target-features]: #target-features

Similarly to the issues with the ABI of scalable vectors, without the relevant
target features, few operations can actually be performed on scalable vectors -
causing issues for the use of scalable vectors in generic code and with traits.
For example, implementations of traits like `Clone` would not be able to
actually perform a clone, and generic functions that are instantiated with
scalable vectors would during instruction selection in the codegen backend.

When a scalable vector is instantiated into a generic function during
monomorphisation, or a trait method is being implemented for a scalable vector,
then the relevant target feature will be added to the function.
Comment on lines +230 to +232
Copy link

@hanna-kruppe hanna-kruppe Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding target features to functions based on the types being mentioned in them was also proposed for a different reason in #3525 but that ran into all sorts of complications (I'm not sure off-hand which of these are relevant here).

Unlike that other RFC, what's proposed here isn't even visible in the function signature and not tied to some "if a value of this type exists, the target feature must be enabled, so it's safe" reasoning, which causes new problems. So far the unsafe enforcement and manual safety reasoning for target features assumes it's clear which target features a function has, long before monomorphization. For example, consider this program:

// crate A
fn foo<T: Sized>() -> usize {
    size_of::<T>()
}

// crate B
use crate_a::foo;
fn main() {
    println!("vector size: {}", foo::<svfloat32_t>());
}

If the instantiation of foo gets the target feature, it needs to be unsafe to call. How can the compiler ensure this? And how is it surfaced to the programmer so they can write the correct #[cfg] or is_aarch64_feature_detected!(...) condition?

Copy link
Member

@RalfJung RalfJung Jul 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding target features depending on the concrete instantiation is also fundamentally incompatible with how target features currently work in rustc. Of course the architecture can be changed, but it's a quite non-trivial change, and it has non-trivial ramifications: the MIR inliner crucially relies (for soundness) on knowing the target features of a function without knowing the concrete instance of the generics (inlining a function with more features into a function with fewer features is unsound), so functions that are generic in their target feature set risk serious perf regressions due the MIR inliner becoming less effective. Making every generic function potentially have target-feature thus seems to be like a non-starter both in terms of soundness (that's @hanna-kruppe's issue) and because it completely kills the MIR inliner for generic code (which is the code where we most care about the MIR inliner).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a scalable vector is instantiated into a generic function during
monomorphisation

It's also somewhat unclear what this means -- does instantiating T with fn() -> svfloat32_t also add the target feature? It pretty much has to, for soundness.

But then what about something like this

trait Tr { type Assoc; fn make() -> Self::Assoc }

fn foo<T: Tr>() {
  size_of_val(&T::make());
}

Now if I call foo with a type T such that T::Assoc is svfloat32_t, I also need the target feature:

impl Tr for i32 { type Assoc = svfloat32_t; ...}

foo::<i32>(); // oopsie, this must be unsafe

The target feature is needed any time svfloat32_t is in any way reachable via the given trait instance. This is not something the type system can even express, which also means I don't think it can possibly be sound.


For example, when instantiating `std::mem::size_of_val` with a scalable vector
during monomorphisation, the relevant target feature will be added to `size_of_val`
for codegen.

## Implementing `repr(scalable)`
[implementing-repr-scalable]: #implementing-reprscalable

Implementing `repr(scalable)` largely involves lowering scalable vectors to the
appropriate type in the codegen backend. LLVM has robust support for scalable
vectors and is the default backend, so this section will focus on implementation
in the LLVM codegen backend. Other codegen backends can implement support when
scalable vectors are supported by the backend.

Most of the complexity of SVE is handled by LLVM: lowering Rust's scalable
vectors to the correct type in LLVM and the `vscale` modifier that is applied to
LLVM's vector types.

LLVM's scalable vector type is of the form `<vscale x element_count x type>`.
`vscale` is the scaling factor determined by the hardware at runtime, it can be
any value providing it gives a legal vector register size for the architecture.

For example, a `<vscale x 4 x f32>` is a scalable vector with a minimum of four
`f32` elements and with SVE, `vscale` could then be any power of two which would
result in register sizes of 128, 256, 512, 1024 or 2048 and 4, 8, 16, 32, or 64
`f32` elements respectively.

The `N` in the `#[repr(scalable(N))]` determines the `element_count` used in the
LLVM type for a scalable vector.

While it is possible to change the vector length at runtime using a
[`prctl()`][prctl] call to the kernel, this would require that `vscale` change,
which is unsupported. `prctl` must only be used to set up the vector length for
child processes, not to change the vector length of the current process. As Rust
cannot prevent users from doing this, it will be documented as undefined
behaviour, consistent with C and C++.

# Drawbacks
[drawbacks]: #drawbacks

- `repr(scalable(N))` is inherently additional complexity to the language,
despite being largely hidden from users.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

Without support for scalable vectors in the language and compiler, it is not
possible to leverage hardware with scalable vectors from Rust. As extensions
with scalable vectors are available in architectures as either the only or
recommended way to do SIMD, lack of support in Rust would severely limit Rust's
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

assuming you're referring to RISC-V V by "only", you can use fixed-length vectors (e.g. std::simd::Simd<f32, 16>) just fine, LLVM just uses the minimum guaranteed size when deciding how many registers to use, so assuming the V extension (which implies Zvl128b giving a minimum length of 128 bits), a std::simd::Simd<f32, 16> would take 4 registers (though LLVM would probably just set LMUL to 4 and use 1 register).

suitability on these architectures compared to other systems programming
languages.

By aligning with the approach taken by C (discussed in the
[*Prior art*][prior-art] below), most of the documentation that already exists
for scalable vector intrinsics in C should still be applicable to Rust.

## Manually-chosen or compiler-calculated element count
[manual-or-calculated-element-count]: #manually-chosen-or-compiler-calculated-element-count

`repr(scalable(N))` expects `N` to be provided rather than calculating it. This
avoids needing to teach the compiler how to calculate the required `element`
count, which isn't always trivial.

Many of the intrinsics which accept scalable vectors as an argument also accept
a predicate vector. Predicate vectors decide which lanes are on or off for an
operation (i.e. which elements in the vector are operated on). Predicate vectors
can be in different and smaller registers than the data. For example,
`<vscale x 16 x i1>` could be the predicate vector for a `<vscale x 16 x u8>`
vector

For non-predicate scalable vectors, it will be typical that `N` will be
`$minimum_register_length / $type_size` (e.g. `4` for `f32` or `8` for `f16`
with a minimum 128-bit register length). In this circumstance, `N` could be
trivially calculated by the compiler.
Comment on lines +304 to +307

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this even this part doesn't really work for RVV, which has the added dimension of LMUL: there's not a single $minimum_register_length but several useful sizes of N for every element type. See the table at https://llvm.org/docs//RISCV/RISCVVectorExtension.html#mapping-to-llvm-ir-types


For predicate vectors, it is desirable to be able to to define types where `N`
matches the number of elements in the non-predicate vector, i.e. a
`<vscale x 4 x i1>` to match a `<vscale x 4 x f32>`, `<vscale x 8 x i1>` to
match `<vscale x 8 x u16>`, or `<vscale x 16 x i1>` to match
`<vscale x 16 x u8>`. In this circumstance, it might still be possible to give
rustc all of the relevant information such that it could compute `N`, but it
would add extra complexity.

This RFC takes the position that the additional complexity required to have the
compiler always be able to calculate `N` isn't justified given the permanently
unstable nature of the `repr(scalable(N))` attribute and the scalable vector
types defined in `std::arch` are likely to be few in number, automatically
generated and well-tested.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want there to be a std::simd way to use scalable vectors at some point, so we would need a way for the compiler to either generate the correct N or be able to correctly handle any N (within some architecture-independent bounds, e.g. N is always 1, 2, 4, 8, or 16) by translating to some other N that the current architecture can handle.


# Prior art
[prior-art]: #prior-art

There are not many languages with support for scalable vectors:

- SVE in C takes a similar approach as this proposal by using sizeless
incomplete types to represent scalable vectors. However, sizeless types are
not part of the C specification and Arm's C Language Extensions (ACLE) provide
[an edit to the C standard][acle_sizeless] which formally define "sizeless
types".
- [.NET 9 has experimental support for SVE][dotnet], but as a managed language,
the design and implementation considerations in .NET are quite different to
Rust.

[rfcs#3268] was a previous iteration of this RFC.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

There are currently no unresolved questions.

# Future possibilities
[future-possibilities]: #future-possibilities

There are a handful of future possibilities enabled by this RFC:

## General mechanism for target-feature-affected types
[general-mechanism-target-feature-types]: #general-mechanism-for-target-feature-affected-types

A more general mechanism for enforcing that SIMD types are only used in
`target_feature`-annotated functions would be useful, as this would enable SVE
types to have fewer distinct restrictions than other SIMD types, and would
enable SIMD vectors to be passed by-register, a performance improvement.

Such a mechanism would need be introduced gradually to existing SIMD types with
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Such a mechanism would need be introduced gradually to existing SIMD types with
Such a mechanism would need to be introduced gradually to existing SIMD types with

a forward compatibility lint. This will be addressed in a forthcoming RFC.

## Relaxed restrictions
[relaxed-restrictions]: #relaxed-restrictions

Some of the restrictions on these types (e.g. use in compound types) could be
relaxed at a later time either by extending rustc's codegen or leveraging newly
added support in LLVM.

However, as C also has restriction and scalable vectors are nevertheless used in
production code, it is unlikely there will be much demand for those restrictions
to be relaxed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

structs containing a scalable SIMD type may actually end up being in large demand due to project-portable-simd:
One idea IIRC we've talked about is that we would end up with something kinda like a SIMD version of ArrayVec where we'd have a struct that contains both an active length (like Vec::len, not to be confused with vscale which is more like Vec::capacity) and a scalable SIMD type, so we could do operations where the tail of the SIMD type is not valid but user code wouldn't need unsafe since those extra elements are ignored by using auto-generated masks or setvl.

this would need structs like so:

pub struct SimdArrayVec<T> {
    len: usize,
    value: ScalableSimd<T>,
}


## Portable SIMD
[portable-simd]: #portable-simd

Given that there are significant differences between scalable vectors and
fixed-length vectors, and that `std::simd` is unstable, it is worth
experimenting with architecture-specific support and implementation initially.
Later, there are a variety of approaches that could be taken to incorporate
support for scalable vectors into Portable SIMD.

[acle_sizeless]: https://arm-software.github.io/acle/main/acle.html#formal-definition-of-sizeless-types
[dotnet]: https://github.com/dotnet/runtime/issues/93095
[prctl]: https://www.kernel.org/doc/Documentation/arm64/sve.txt
[rfcs#1199]: https://rust-lang.github.io/rfcs/1199-simd-infrastructure.html
[rfcs#3268]: https://github.com/rust-lang/rfcs/pull/3268
[rfcs#3729]: https://github.com/rust-lang/rfcs/pull/3729
[quote_amanieu]: https://github.com/rust-lang/rust/pull/118917#issuecomment-2202256754