Skip to content

[RFC][LV] Add support for speculative loads in loops that may fault #151300

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

arcbbb
Copy link
Contributor

@arcbbb arcbbb commented Jul 30, 2025

This patch enables vectorization of early-exit loops containing a single potentially faulting load, by supporting unit-stride first-faulting loads via the vp.load.ff intrinsic (as introduced in #128593 and cherry-picked here)

Key changes

  • Relax legality check: Allow a single non-dereferenceable load in the loop header for early-exit loops.
  • Tail-folding support for early-exit loop
  • First-faulting load support
    Add VPWidenFFLoadRecipe, which defines two results: the loaded data and the updated vector length.
    Lower VPWidenFFLoad to VPWidenFFLoadEVL in the EVL transform.
  • EVL propagation
    The data flow of the header mask does not reflect control flow order. To handle vector length changes introduced by WidenFFLoadEVL, control flow is used to adjust the EVL update logic. Recipes and basic blocks following WidenFFLoadEVL must now consume its returned vector length.

Note: While some changes might be better split into separate PRs for review clarity, presenting them together provides a complete overview and helps identify potential blockers.

And it depends on the following conditions being met:

  • The speculative load must reside in the loop header (non-predicated), avoiding the need for EVL phis across multiple predecessors.
  • Tail-folding kind must be DataWithEVL, and the predicate kind must be PredicateOrDontVectorize.

@arcbbb
Copy link
Contributor Author

arcbbb commented Jul 30, 2025

Regarding EVL propagation, I just noticed that header mask handling is fixed in #150202. This fix will need to be incorporated accordingly.

Copy link

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff HEAD~1 HEAD --extensions h,cpp -- llvm/include/llvm/Analysis/TargetTransformInfo.h llvm/include/llvm/Analysis/TargetTransformInfoImpl.h llvm/include/llvm/CodeGen/SelectionDAG.h llvm/include/llvm/CodeGen/SelectionDAGNodes.h llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h llvm/lib/Analysis/TargetTransformInfo.cpp llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.h llvm/lib/IR/IntrinsicInst.cpp llvm/lib/Target/RISCV/RISCVISelLowering.cpp llvm/lib/Target/RISCV/RISCVISelLowering.h llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp llvm/lib/Transforms/Vectorize/LoopVectorize.cpp llvm/lib/Transforms/Vectorize/VPlan.h llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp llvm/lib/Transforms/Vectorize/VPlanValue.h llvm/lib/Transforms/Vectorize/VPlanVerifier.cpp llvm/unittests/IR/VPIntrinsicTest.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
index 557bd838d..32bb13644 100644
--- a/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
+++ b/llvm/lib/Transforms/Vectorize/VPlanTransforms.cpp
@@ -2282,8 +2282,8 @@ static VPValue *transformRecipestoEVLRecipes(VPlan &Plan) {
         continue;
       }
       // TODO: Split optimizeMaskToEVL out and move into
-      // VPlanTransforms::optimize. transformRecipestoEVLRecipes should be run in
-      // tryToBuildVPlanWithVPRecipes beforehand.
+      // VPlanTransforms::optimize. transformRecipestoEVLRecipes should be run
+      // in tryToBuildVPlanWithVPRecipes beforehand.
       VPRecipeBase *EVLRecipe =
           optimizeMaskToEVL(Plan, CurRecipe, TypeInfo, *AllOneMask, *LastEVL);
       if (!EVLRecipe)

Comment on lines 10002 to 10007
if (!EnableSpeculativeLoads) {
reportVectorizationFailure("Auto-vectorization of loops with speculative "
"load is not enabled",
"SpeculativeLoadsDisabled", ORE, L);
return false;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check this first?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated. Thanks!

@david-arm
Copy link
Contributor

I think it would be good for reviewers if this patch was split up into several parts, probably roughly in this order:

  1. Codegen specific parts related to the RISCV intrinsic,
  2. A standalone loop vectorisation legality patch similar to [LV] Add initial legality checks for ee loops with stores #145663,
  3. A loop vectoriser patch that starts vectorising the loops you're interested in.

@arcbbb arcbbb changed the title [LV] Add support for speculative loads in loops that may fault [RFC][LV] Add support for speculative loads in loops that may fault Jul 31, 2025
@arcbbb
Copy link
Contributor Author

arcbbb commented Aug 1, 2025

I think it would be good for reviewers if this patch was split up into several parts, probably roughly in this order:

  1. Codegen specific parts related to the RISCV intrinsic,
  2. A standalone loop vectorisation legality patch similar to [LV] Add initial legality checks for ee loops with stores #145663,
  3. A loop vectoriser patch that starts vectorising the loops you're interested in.

Thanks for outlining the steps. That’s very helpful! I’ll follow this order for splitting up the patch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants