As spotted by @Mel-Chen in this review comment: #149981 (comment)
Consider an EVL tail-folded loop with a VF of 4 and a trip count of 5. With EVL tail folding, it's possible that the loop executes in two iterations, one with EVL=3 and one with EVL=2.
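For context, here is a minimal sketch (plain C++, not LLVM code) of how a target might arrive at an even 3/2 split: RISC-V's `vsetvli`, for example, is permitted to return `ceil(AVL/2)` when the remaining element count is between `VLMAX` and `2*VLMAX`. The `computeEVL` helper below is invented purely for illustration.

```cpp
#include <algorithm>
#include <cstdio>

// Hypothetical model of per-iteration EVL computation: when the remaining
// trip count is between VF and 2*VF, split it evenly (one behaviour RISC-V's
// vsetvli permits); otherwise clamp to the remaining count or VF.
static unsigned computeEVL(unsigned Remaining, unsigned VF) {
  if (Remaining > VF && Remaining < 2 * VF)
    return (Remaining + 1) / 2;
  return std::min(Remaining, VF);
}

int main() {
  unsigned TripCount = 5, VF = 4;
  for (unsigned Remaining = TripCount; Remaining != 0;) {
    unsigned EVL = computeEVL(Remaining, VF);
    std::printf("EVL = %u\n", EVL); // prints 3, then 2
    Remaining -= EVL;
  }
  return 0;
}
```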
A header mask will come in with the form `icmp ule wide-canonical-iv, backedge-tc`.
Most recipes will be converted to a VP intrinsic to use EVL in `optimizeMaskToEVL`. This should really be thought of as an optimisation, but consider a recipe that isn't handled yet or slips through, and so still uses the header mask.
The header mask is generated as `icmp ule wide-canonical-iv, backedge-tc`.
On the first iteration, the mask will look like:

```
[0, 1, 2, 3] <= 4 = [T, T, T, T]
```

However, the recipes which were optimized to VP intrinsics will have an EVL of 3, so effectively a mask of `[T, T, T, F]`.
On the second iteration, the mask will look like:

```
[4, 5, 6, 7] <= 4 = [T, F, F, F]
```

But the VP intrinsics will have an EVL of 2, so a mask of `[T, T, F, F]`.
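To make the divergence concrete, here is a small standalone sketch (plain C++, not LLVM code) that recomputes both masks for the two iterations above, with the wide canonical IV stepping by VF as in the example and the EVLs of 3 and 2 assumed from before:

```cpp
#include <cstdio>

int main() {
  const unsigned VF = 4;
  const unsigned BackedgeTC = 4;  // trip count 5 => backedge-taken count 4
  const unsigned EVLs[] = {3, 2}; // per-iteration EVLs from the example
  unsigned IV = 0;                // first lane of the wide canonical IV
  for (unsigned EVL : EVLs) {
    for (unsigned Lane = 0; Lane < VF; ++Lane) {
      // Header mask: icmp ule wide-canonical-iv, backedge-tc
      bool HeaderMask = (IV + Lane) <= BackedgeTC;
      // Mask implied by a recipe converted to a VP intrinsic with this EVL
      bool EVLMask = Lane < EVL;
      std::printf("iv=%u lane=%u header=%c evl=%c\n", IV + Lane, Lane,
                  HeaderMask ? 'T' : 'F', EVLMask ? 'T' : 'F');
    }
    IV += VF; // the wide canonical IV advances by VF each iteration
  }
  return 0;
}
```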
We need to convert the header masks to something of the form `icmp ult step-vector, EVL`, otherwise we end up processing a different number of elements per iteration depending on whether or not a recipe was converted to a VP intrinsic.
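For reference, a mask of that form reproduces exactly what the VP intrinsics do in the example above (again a sketch, with the same assumed EVLs of 3 and 2):

```cpp
#include <cstdio>

int main() {
  const unsigned VF = 4;
  const unsigned EVLs[] = {3, 2}; // per-iteration EVLs from the example
  for (unsigned EVL : EVLs) {
    // Proposed header mask: icmp ult step-vector, EVL
    for (unsigned Lane = 0; Lane < VF; ++Lane)
      std::printf("%c ", Lane < EVL ? 'T' : 'F'); // T T T F, then T T F F
    std::printf("\n");
  }
  return 0;
}
```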