-
Notifications
You must be signed in to change notification settings - Fork 14.7k
[RISCV][TTI] Enable masked interleave vectorization #150074
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -398,6 +398,10 @@ class RISCVTTIImpl final : public BasicTTIImplBase<RISCVTTIImpl> { | |
|
||
bool enableInterleavedAccessVectorization() const override { return true; } | ||
|
||
bool enableMaskedInterleavedAccessVectorization() const override { | ||
return ST->hasVInstructions(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This comment isn't opposing the use of hasVInstructions() here — in fact, I think it's a good thing. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this is a case where the existing code probably should be checking for vector instructions, but that the difference is essentially stylistic. The cost model results will penalize trying to vectorize without V enough that the feature being enabled won't really matter. The major question (long term) is what we want to do if P ever stabilizes, but that's definitely future work. |
||
} | ||
|
||
unsigned getMinTripCountTailFoldingThreshold() const override; | ||
|
||
enum RISCVRegisterClass { GPRRC, FPRRC, VRRC }; | ||
|
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To make sure I'm understanding this right, we do support fixed-length deinterleave/interleave intrinsics, i.e.
Will get lowered to a vlseg. It's just that we don't currently match a masked.load/store with shufflevector [de]interleaves? Which is what the loop vectorizer emits for fixed-length vectors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For fixed-length VF, shuffles are used instead of interleave intrinsics.
However, I believe getMask in the InterleavedAccessPass should already handle shuffle mask:
Fixed-length should be supported.
Therefore, I think doing it as shown in #149981 should be good enough, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's because we don't handle llvm.masked.{load,store} with shufflevectors, only load and vp.load:
llvm-project/llvm/lib/CodeGen/InterleavedAccessPass.cpp
Lines 689 to 695 in b7889a6
But yes, it seems a shame that we already have this fixed-length mask functionality in getMask. Hopefully it's not too difficult to extend InterleavedAccessPass to handle llvm.masked.{load,store} with shufflevectors. I think that would round out support for all the different ways of expressing interleaves.
Which as a side note, I think that would mean there's 12 possible different types of [de]interleaves: Either shufflevector or intrinsic based, for each of load, store, masked.load, masked.store, vp.load or vp.store. That's a lot!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. Then I think it does make sense to initially limit support to scalable only. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Luke is correct here. I am actively working on fully supporting the shuffle path with masked.load/store, but we're not there yet. The next change is #150241, and we've got at least one more needed after that before we could reasonable enable fixed vectors by default.