-
Notifications
You must be signed in to change notification settings - Fork 14.7k
[RFC][LV] Add support for speculative loads in loops that may fault #151300
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Regarding EVL propagation, I just noticed that header mask handling is fixed in #150202. This fix will need to be incorporated accordingly. |
You can test this locally with the following command:git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/include/llvm/CodeGen/SelectionDAG.h llvm/include/llvm/CodeGen/SelectionDAGNodes.h llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h llvm/lib/Analysis/TargetTransformInfo.cpp llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h llvm/lib/IR/IntrinsicInst.cpp llvm/lib/Target/RISCV/RISCVISelLowering.cpp llvm/lib/Target/RISCV/RISCVISelLowering.h llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/VPlan.h llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp llvm/lib/Transforms/Vectorize/VPlanValue.h llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp llvm/unittests/IR/VPIntrinsicTest.cpp View the diff from clang-format here.diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index 557bd838d..32bb13644 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -2282,8 +2282,8 @@ static VPValue *transformRecipestoEVLRecipes(VPlan &Plan) {
continue;
}
// TODO: Split optimizeMaskToEVL out and move into
- // VPlanTransforms::optimize. transformRecipestoEVLRecipes should be run in
- // tryToBuildVPlanWithVPRecipes beforehand.
+ // VPlanTransforms::optimize. transformRecipestoEVLRecipes should be run
+ // in tryToBuildVPlanWithVPRecipes beforehand.
VPRecipeBase *EVLRecipe =
optimizeMaskToEVL(Plan, CurRecipe, TypeInfo, *AllOneMask, *LastEVL);
if (!EVLRecipe)
|
if (!EnableSpeculativeLoads) { | ||
reportVectorizationFailure("Auto-vectorization of loops with speculative " | ||
"load is not enabled", | ||
"SpeculativeLoadsDisabled", ORE, L); | ||
return false; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check this first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated. Thanks!
I think it would be good for reviewers if this patch was split up into several parts, probably roughly in this order:
|
Thanks for outlining the steps. That’s very helpful! I’ll follow this order for splitting up the patch. |
This patch enables vectorization of early-exit loops containing a single potentially faulting load, by supporting unit-stride first-faulting loads via the vp.load.ff intrinsic (as introduced in #128593 and cherry-picked here)
Key changes
Add VPWidenFFLoadRecipe, which defines two results: the loaded data and the updated vector length.
Lower VPWidenFFLoad to VPWidenFFLoadEVL in the EVL transform.
The data flow of the header mask does not reflect control flow order. To handle vector length changes introduced by WidenFFLoadEVL, control flow is used to adjust the EVL update logic. Recipes and basic blocks following WidenFFLoadEVL must now consume its returned vector length.
Note: While some changes might be better split into separate PRs for review clarity, presenting them together provides a complete overview and helps identify potential blockers.
And it depends on the following conditions being met: