Merged
Changes from 21 commits
27 commits
13d1fbb
[PGO] Add `llvm.loop.estimated_trip_count` metadata
jdenny-ornl Jul 15, 2025
db5920a
Merge branch 'main' into pgo-estimated-trip-count
jdenny-ornl Jul 21, 2025
47fbe85
Add PGOEstimateTripCounts in more cases
jdenny-ornl Jul 21, 2025
f8097fb
Add unused initialization
jdenny-ornl Jul 21, 2025
7b27203
Simplify some test changes
jdenny-ornl Jul 22, 2025
4c4669a
Extend verify pass to cover new metadata
jdenny-ornl Jul 24, 2025
0f40efd
Fix test for some builds
jdenny-ornl Jul 24, 2025
2791a1c
Merge branch 'main' into pgo-estimated-trip-count
jdenny-ornl Jul 24, 2025
6148922
Apply some small reviewer suggestions
jdenny-ornl Jul 24, 2025
3a49b43
Attempt to fix windows pre-commit CI
jdenny-ornl Jul 24, 2025
2f7daa8
Merge branch 'main' into pgo-estimated-trip-count
jdenny-ornl Jul 28, 2025
c627fc5
Merge branch 'main' into pgo-estimated-trip-count
jdenny-ornl Aug 7, 2025
f1fa8d9
Run update script on new test from last merge
jdenny-ornl Aug 7, 2025
38ace1e
Reapply 3a18fe33f0763cd9276c99c276448412100f6270
jdenny-ornl Aug 7, 2025
92ddaa0
Convert to function pass, avoid needless pass invalidation
jdenny-ornl Aug 8, 2025
a3e0d72
Fix layering violation
jdenny-ornl Aug 8, 2025
67f22cd
Apply clang-format
jdenny-ornl Aug 8, 2025
f0ff2e2
Merge branch 'main' into pgo-estimated-trip-count
jdenny-ornl Aug 9, 2025
e7eb1fe
Merge branch 'main' into pgo-estimated-trip-count
jdenny-ornl Aug 13, 2025
0973ab3
Merge branch 'main' into pgo-estimated-trip-count
jdenny-ornl Aug 18, 2025
680bdc2
Remove PGOEstimateTripCountsPass and no-value form of metadata
jdenny-ornl Aug 18, 2025
59cd184
Fix case where nested loops share latch
jdenny-ornl Aug 19, 2025
5d00250
Merge branch 'main' into pgo-estimated-trip-count
jdenny-ornl Aug 25, 2025
5719779
Remove redundant code
jdenny-ornl Aug 25, 2025
98cab7b
Clarify recent comments some
jdenny-ornl Aug 25, 2025
b3831b6
Merge branch 'main' into pgo-estimated-trip-count
jdenny-ornl Sep 1, 2025
b8aed9b
Merge branch 'main' into pgo-estimated-trip-count
jdenny-ornl Sep 9, 2025
48 changes: 48 additions & 0 deletions llvm/docs/LangRef.rst
@@ -7966,6 +7966,54 @@ The attributes in this metadata are added to all followup loops of the
loop distribution pass. See
:ref:`Transformation Metadata <transformation-metadata>` for details.

'``llvm.loop.estimated_trip_count``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This metadata records an estimated trip count for the loop. The first operand
is the string ``llvm.loop.estimated_trip_count``. The second operand is an
integer constant of type ``i32`` or smaller specifying the estimate. For
Contributor

What's the motivation for allowing "or smaller"?

Collaborator Author

I set a maximum when I realized the current implementation cannot handle anything wider. For flexibility, I did not set a minimum, but I have no specific use case in mind. Do you prefer i32 only? I think that would be fine for me.

example:

.. code-block:: llvm

!0 = !{!"llvm.loop.estimated_trip_count", i32 8}

Purpose
"""""""

A loop's estimated trip count is an estimate of the average number of loop
iterations (specifically, the number of times the loop's header executes) each
time execution reaches the loop. It is usually only an estimate based on, for
example, profile data. The actual number of iterations might vary widely.

The estimated trip count serves as a parameter for various loop transformations
and typically helps estimate transformation cost. For example, it can help
determine how many iterations to peel or how aggressively to unroll.

Initialization and Maintenance
""""""""""""""""""""""""""""""

Passes should interact with estimated trip counts always via
``llvm::getLoopEstimatedTripCount`` and ``llvm::setLoopEstimatedTripCount``.

When the ``llvm.loop.estimated_trip_count`` metadata is not present on a loop,
``llvm::getLoopEstimatedTripCount`` estimates the loop's trip count from the
loop's ``branch_weights`` metadata under the assumption that the latter still
accurately encodes the program's original profile data. However, as passes
transform existing loops and create new loops, they must be free to update and
create ``branch_weights`` metadata in a way that maintains accurate block
frequencies. Trip counts estimated from this new ``branch_weights`` metadata
are not necessarily useful to the passes that consume estimated trip counts.

For this reason, when a pass transforms or creates loops, the pass should
separately estimate new trip counts based on the estimated trip counts that
``llvm::getLoopEstimatedTripCount`` returns at the start of the pass, and the
pass should record the new estimates by calling
``llvm::setLoopEstimatedTripCount``, which creates or updates
``llvm.loop.estimated_trip_count`` metadata. Once this metadata is present on a
loop, ``llvm::getLoopEstimatedTripCount`` returns its value instead of
estimating the trip count from the loop's ``branch_weights`` metadata.
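
The lookup order and maintenance rule described above can be sketched outside LLVM. This is a minimal model, not the real implementation: `MockLoop`, `getEstimatedTripCount`, `setEstimatedTripCount`, and `mockUnrollBy` are hypothetical stand-ins, the fallback ignores saturation, and dividing the estimate by the unroll factor is only an illustrative choice of "new estimate".

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a loop; this is not llvm::Loop.
struct MockLoop {
  // Models llvm.loop.estimated_trip_count; empty when the metadata is absent.
  std::optional<unsigned> EstimatedTripCountMD;
  // Models the latch's branch_weights (backedge weight and exit weight).
  uint64_t BackedgeWeight = 0;
  uint64_t ExitWeight = 0;
};

// Metadata wins when present; otherwise fall back to the branch weights.
std::optional<unsigned> getEstimatedTripCount(const MockLoop &L) {
  if (L.EstimatedTripCountMD)
    return L.EstimatedTripCountMD;
  if (L.ExitWeight == 0)
    return std::nullopt; // No meaningful estimate is possible.
  // Simplified fallback: round(backedge / exit) + 1, saturation omitted.
  uint64_t ExitCount = (L.BackedgeWeight + L.ExitWeight / 2) / L.ExitWeight;
  return static_cast<unsigned>(ExitCount + 1);
}

// Creates or updates the modeled metadata.
void setEstimatedTripCount(MockLoop &L, unsigned TripCount) {
  L.EstimatedTripCountMD = TripCount;
}

// Hypothetical transformation: read the estimate at the start, rewrite the
// branch weights for block-frequency purposes, then record the new estimate
// separately so later consumers do not derive it from the new weights.
void mockUnrollBy(MockLoop &L, unsigned Factor) {
  if (Factor == 0)
    return;
  std::optional<unsigned> Old = getEstimatedTripCount(L);
  L.BackedgeWeight = 1; // Arbitrary new weights, chosen only for the demo.
  L.ExitWeight = 1;
  if (Old)
    setEstimatedTripCount(L, *Old / Factor); // Illustrative policy only.
}
```

With weights 70 and 10 the fallback yields 8. After `mockUnrollBy(L, 2)` the query returns the recorded 4, even though the rewritten weights alone would suggest 2.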

'``llvm.licm.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

4 changes: 2 additions & 2 deletions llvm/include/llvm/IR/Metadata.h
@@ -919,8 +919,8 @@ class MDOperand {

// Check if MDOperand is of type MDString and equals `Str`.
bool equalsStr(StringRef Str) const {
-return isa<MDString>(this->get()) &&
-cast<MDString>(this->get())->getString() == Str;
+return isa_and_nonnull<MDString>(get()) &&
+cast<MDString>(get())->getString() == Str;
}

~MDOperand() { untrack(); }
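
The move from `isa` to `isa_and_nonnull` in `equalsStr` matters when the operand is null: `isa<>` asserts on a null pointer, whereas `isa_and_nonnull<>` simply returns false. The same pattern can be sketched without LLVM's casting machinery. The types `MockMetadata`, `MockMDString`, and `MockOther` and the helper names below are hypothetical, and `dynamic_cast` stands in for LLVM's RTTI replacement.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-ins for LLVM's metadata hierarchy.
struct MockMetadata {
  virtual ~MockMetadata() = default;
};
struct MockMDString : MockMetadata {
  std::string Str;
  explicit MockMDString(std::string S) : Str(std::move(S)) {}
};
struct MockOther : MockMetadata {};

// Null-tolerant type test, mirroring the intent of isa_and_nonnull<MDString>.
bool isStringAndNonNull(const MockMetadata *MD) {
  return MD != nullptr && dynamic_cast<const MockMDString *>(MD) != nullptr;
}

// Safe for null operands: the cast is only reached after the type test passes.
bool equalsStr(const MockMetadata *MD, const std::string &Str) {
  return isStringAndNonNull(MD) &&
         static_cast<const MockMDString *>(MD)->Str == Str;
}
```

A null operand and a non-string operand both compare unequal instead of tripping an assertion. That is presumably what lets the new verifier check call `equalsStr` on operand 0 of an arbitrary node.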
4 changes: 4 additions & 0 deletions llvm/include/llvm/IR/ProfDataUtils.h
@@ -30,6 +30,10 @@ struct MDProfLabels {
LLVM_ABI static const char *UnknownBranchWeightsMarker;
};

/// Profile-based loop metadata that should be accessed only by using
/// \c llvm::getLoopEstimatedTripCount and \c llvm::setLoopEstimatedTripCount.
LLVM_ABI extern const char *LLVMLoopEstimatedTripCount;

/// Checks if an Instruction has MD_prof Metadata
LLVM_ABI bool hasProfMD(const Instruction &I);

52 changes: 40 additions & 12 deletions llvm/include/llvm/Transforms/Utils/LoopUtils.h
@@ -323,22 +323,50 @@ LLVM_ABI TransformationMode hasLICMVersioningTransformation(const Loop *L);
LLVM_ABI void addStringMetadataToLoop(Loop *TheLoop, const char *MDString,
unsigned V = 0);

-/// Returns a loop's estimated trip count based on branch weight metadata.
-/// In addition if \p EstimatedLoopInvocationWeight is not null it is
-/// initialized with weight of loop's latch leading to the exit.
-/// Returns a valid positive trip count, saturated at UINT_MAX, or std::nullopt
-/// when a meaningful estimate cannot be made.
+/// Return either:
+/// - The value of \c llvm.loop.estimated_trip_count from the loop metadata of
+/// \p L, if that metadata is present.
+/// - Else, a new estimate of the trip count from the latch branch weights of
+/// \p L, if the estimation's implementation is able to handle the loop form
+/// of \p L (e.g., \p L must have a latch block that controls the loop exit).
+/// - Else, \c std::nullopt.
+///
+/// An estimated trip count is always a valid positive trip count, saturated at
+/// \c UINT_MAX.
+///
+/// In addition, if \p EstimatedLoopInvocationWeight, then either:
+/// - Set \c *EstimatedLoopInvocationWeight to the weight of the latch's branch
+/// to the loop exit.
+/// - Do not set it, and return \c std::nullopt, if the current implementation
+/// cannot compute that weight (e.g., if \p L does not have a latch block that
+/// controls the loop exit) or the weight is zero (because zero cannot be
+/// used to compute new branch weights that reflect the estimated trip count).
+///
+/// TODO: Eventually, once all passes have migrated away from setting branch
+/// weights to indicate estimated trip counts, this function will drop the
+/// \p EstimatedLoopInvocationWeight parameter.
LLVM_ABI std::optional<unsigned>
getLoopEstimatedTripCount(Loop *L,
unsigned *EstimatedLoopInvocationWeight = nullptr);

-/// Set a loop's branch weight metadata to reflect that loop has \p
-/// EstimatedTripCount iterations and \p EstimatedLoopInvocationWeight exits
-/// through latch. Returns true if metadata is successfully updated, false
-/// otherwise. Note that loop must have a latch block which controls loop exit
-/// in order to succeed.
-LLVM_ABI bool setLoopEstimatedTripCount(Loop *L, unsigned EstimatedTripCount,
-unsigned EstimatedLoopInvocationWeight);
+/// Set \c llvm.loop.estimated_trip_count with the value \p EstimatedTripCount
+/// in the loop metadata of \p L.
+///
+/// In addition, if \p EstimatedLoopInvocationWeight, set the branch weight
+/// metadata of \p L to reflect that \p L has an estimated
+/// \p EstimatedTripCount iterations and has \c *EstimatedLoopInvocationWeight
+/// exit weight through the loop's latch.
+///
+/// Return false if \p EstimatedLoopInvocationWeight and if branch weight
+/// metadata could not be successfully updated (e.g., if \p L does not have a
+/// latch block that controls the loop exit). Otherwise, return true.
+///
+/// TODO: Eventually, once all passes have migrated away from setting branch
+/// weights to indicate estimated trip counts, this function will drop the
+/// \p EstimatedLoopInvocationWeight parameter.
+LLVM_ABI bool setLoopEstimatedTripCount(
+Loop *L, unsigned EstimatedTripCount,
+std::optional<unsigned> EstimatedLoopInvocationWeight = std::nullopt);
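
The return contract documented for the new `setLoopEstimatedTripCount` can be modeled as follows. This is a sketch under assumptions: `MockLoop` and its fields are hypothetical, and the actual branch-weight arithmetic is omitted. It shows only that the metadata is always written and that `false` is possible only when an invocation weight is supplied.

```cpp
#include <optional>

// Hypothetical stand-in for a loop; this is not llvm::Loop.
struct MockLoop {
  // True if the loop has a latch block that controls the loop exit.
  bool HasExitingLatchBranch = false;
  // Models llvm.loop.estimated_trip_count.
  std::optional<unsigned> EstimatedTripCountMD;
  // Records whether the modeled branch weights were rewritten.
  bool BranchWeightsUpdated = false;
};

bool setEstimatedTripCount(
    MockLoop &L, unsigned EstimatedTripCount,
    std::optional<unsigned> EstimatedLoopInvocationWeight = std::nullopt) {
  // The metadata is written unconditionally.
  L.EstimatedTripCountMD = EstimatedTripCount;

  // Without an invocation weight, branch weights are left alone.
  if (!EstimatedLoopInvocationWeight)
    return true;

  // Branch weights can only be updated on a suitable latch branch.
  if (!L.HasExitingLatchBranch)
    return false;

  L.BranchWeightsUpdated = true; // Weight computation omitted in this sketch.
  return true;
}
```

In this model a `false` return still leaves the trip-count metadata updated, which matches the order of operations in the `LoopUtils.cpp` change later in this diff.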

/// Check inner loop (L) backedge count is known to be invariant on all
/// iterations of its outer loop. If the loop has no parent, this is trivially
1 change: 1 addition & 0 deletions llvm/lib/IR/ProfDataUtils.cpp
@@ -95,6 +95,7 @@ const char *MDProfLabels::FunctionEntryCount = "function_entry_count";
const char *MDProfLabels::SyntheticFunctionEntryCount =
"synthetic_function_entry_count";
const char *MDProfLabels::UnknownBranchWeightsMarker = "unknown";
const char *LLVMLoopEstimatedTripCount = "llvm.loop.estimated_trip_count";

bool hasProfMD(const Instruction &I) {
return I.hasMetadata(LLVMContext::MD_prof);
12 changes: 12 additions & 0 deletions llvm/lib/IR/Verifier.cpp
@@ -1074,6 +1074,18 @@ void Verifier::visitMDNode(const MDNode &MD, AreDebugLocsAllowed AllowLocs) {
}
}

// Check llvm.loop.estimated_trip_count.
if (MD.getNumOperands() > 0 &&
MD.getOperand(0).equalsStr(LLVMLoopEstimatedTripCount)) {
Check(MD.getNumOperands() == 2, "Expected two operands", &MD);
auto *Count = dyn_cast_or_null<ConstantAsMetadata>(MD.getOperand(1));
Check(Count && Count->getType()->isIntegerTy() &&
cast<IntegerType>(Count->getType())->getBitWidth() <= 32,
"Expected second operand to be an integer constant of type i32 or "
"smaller",
&MD);
}

// Check these last, so we diagnose problems in operands first.
Check(!MD.isTemporary(), "Expected no forward declarations!", &MD);
Check(MD.isResolved(), "All nodes should be resolved!", &MD);
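
The rule enforced by the new verifier check can be restated as a standalone predicate. This is a minimal sketch on hypothetical types (`IntConstant`, `Operand`, `Node`), not the verifier's real data structures. A node is rejected only if its first operand is the string `llvm.loop.estimated_trip_count` and it does not have exactly two operands, or its second operand is not an integer constant of at most 32 bits.

```cpp
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

// Hypothetical model of a metadata operand: null, a string, or an integer
// constant with an explicit bit width.
struct IntConstant {
  unsigned BitWidth;
  uint64_t Value;
};
using Operand = std::variant<std::monostate, std::string, IntConstant>;
using Node = std::vector<Operand>;

// Returns true if the node passes the estimated-trip-count check, including
// the case where the check does not apply to this node at all.
bool verifyEstimatedTripCount(const Node &N) {
  if (N.empty())
    return true;
  const std::string *Name = std::get_if<std::string>(&N[0]);
  if (!Name || *Name != "llvm.loop.estimated_trip_count")
    return true; // Some other kind of node; nothing to check here.
  if (N.size() != 2)
    return false; // "Expected two operands"
  const IntConstant *Count = std::get_if<IntConstant>(&N[1]);
  return Count != nullptr && Count->BitWidth <= 32;
}
```

Under this model `{"llvm.loop.estimated_trip_count", i32 8}` and an `i16` variant pass, while an `i64` count, a missing count, or a null second operand fail.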
112 changes: 83 additions & 29 deletions llvm/lib/Transforms/Utils/LoopUtils.cpp
@@ -804,26 +804,51 @@ static BranchInst *getExpectedExitLoopLatchBranch(Loop *L) {
return LatchBR;
}

/// Return the estimated trip count for any exiting branch which dominates
/// the loop latch.
static std::optional<unsigned> getEstimatedTripCount(BranchInst *ExitingBranch,
Loop *L,
uint64_t &OrigExitWeight) {
struct DbgLoop {
const Loop *L;
explicit DbgLoop(const Loop *L) : L(L) {}
};

#ifndef NDEBUG
static inline raw_ostream &operator<<(raw_ostream &OS, DbgLoop D) {
OS << "function ";
D.L->getHeader()->getParent()->printAsOperand(OS, /*PrintType=*/false);
return OS << " " << *D.L;
}
#endif // NDEBUG

static std::optional<unsigned> estimateLoopTripCount(Loop *L) {
// Currently we take the estimate exit count only from the loop latch,
// ignoring other exiting blocks. This can overestimate the trip count
// if we exit through another exit, but can never underestimate it.
// TODO: incorporate information from other exits
BranchInst *ExitingBranch = getExpectedExitLoopLatchBranch(L);
if (!ExitingBranch) {
LLVM_DEBUG(dbgs() << "estimateLoopTripCount: Failed to find exiting "
<< "latch branch of required form in " << DbgLoop(L)
<< "\n");
return std::nullopt;
}

// To estimate the number of times the loop body was executed, we want to
// know the number of times the backedge was taken, vs. the number of times
// we exited the loop.
uint64_t LoopWeight, ExitWeight;
if (!extractBranchWeights(*ExitingBranch, LoopWeight, ExitWeight))
if (!extractBranchWeights(*ExitingBranch, LoopWeight, ExitWeight)) {
LLVM_DEBUG(dbgs() << "estimateLoopTripCount: Failed to extract branch "
<< "weights for " << DbgLoop(L) << "\n");
return std::nullopt;
}

if (L->contains(ExitingBranch->getSuccessor(1)))
std::swap(LoopWeight, ExitWeight);

if (!ExitWeight)
if (!ExitWeight) {
// Don't have a way to return predicated infinite
LLVM_DEBUG(dbgs() << "estimateLoopTripCount: Failed because of zero exit "
<< "probability for " << DbgLoop(L) << "\n");
return std::nullopt;

OrigExitWeight = ExitWeight;
}

// Estimated exit count is a ratio of the loop weight by the weight of the
// edge exiting the loop, rounded to nearest.
@@ -834,33 +859,62 @@ static std::optional<unsigned> getEstimatedTripCount(BranchInst *ExitingBranch,
return std::numeric_limits<unsigned>::max();

// Estimated trip count is one plus estimated exit count.
return ExitCount + 1;
uint64_t TC = ExitCount + 1;
LLVM_DEBUG(dbgs() << "estimateLoopTripCount: Estimated trip count of " << TC
<< " for " << DbgLoop(L) << "\n");
return TC;
}
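
The arithmetic in `estimateLoopTripCount` can be tried in isolation. The sketch below assumes, based on the comments in the hunk, that the exit count is the loop weight divided by the exit weight rounded to nearest, that the result saturates at `UINT_MAX`, and that the trip count is the exit count plus one. The collapsed part of the diff is not visible here, so the exact rounding helper and saturation condition are assumptions, and `estimateTripCountFromWeights` is a hypothetical name.

```cpp
#include <cstdint>
#include <limits>
#include <optional>

std::optional<unsigned> estimateTripCountFromWeights(uint64_t LoopWeight,
                                                     uint64_t ExitWeight) {
  // A zero exit weight would mean an unbounded estimate; report "no estimate".
  if (ExitWeight == 0)
    return std::nullopt;

  // Round-to-nearest division, written with a remainder test so that
  // LoopWeight + ExitWeight / 2 cannot overflow.
  uint64_t ExitCount = LoopWeight / ExitWeight;
  if (LoopWeight % ExitWeight >= ExitWeight - ExitWeight / 2)
    ++ExitCount;

  // Saturate so that the result always fits in unsigned.
  constexpr unsigned Max = std::numeric_limits<unsigned>::max();
  if (ExitCount >= Max)
    return Max;

  // Estimated trip count is one plus the estimated exit count.
  return static_cast<unsigned>(ExitCount + 1);
}
```

For example, weights of 70 (backedge) and 10 (exit) give an exit count of 7 and a trip count of 8, while a backedge weight of 0 with a nonzero exit weight gives a trip count of 1.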

std::optional<unsigned>
llvm::getLoopEstimatedTripCount(Loop *L,
unsigned *EstimatedLoopInvocationWeight) {
// Currently we take the estimate exit count only from the loop latch,
// ignoring other exiting blocks. This can overestimate the trip count
// if we exit through another exit, but can never underestimate it.
// TODO: incorporate information from other exits
if (BranchInst *LatchBranch = getExpectedExitLoopLatchBranch(L)) {
uint64_t ExitWeight;
if (std::optional<uint64_t> EstTripCount =
getEstimatedTripCount(LatchBranch, L, ExitWeight)) {
if (EstimatedLoopInvocationWeight)
*EstimatedLoopInvocationWeight = ExitWeight;
return *EstTripCount;
// If requested, either compute *EstimatedLoopInvocationWeight or return
// nullopt if cannot.
//
// TODO: Eventually, once all passes have migrated away from setting branch
// weights to indicate estimated trip counts, this function will drop the
// EstimatedLoopInvocationWeight parameter.
if (EstimatedLoopInvocationWeight) {
if (BranchInst *ExitingBranch = getExpectedExitLoopLatchBranch(L)) {
uint64_t LoopWeight = 0, ExitWeight = 0; // Inits expected to be unused.
if (!extractBranchWeights(*ExitingBranch, LoopWeight, ExitWeight))
return std::nullopt;
if (L->contains(ExitingBranch->getSuccessor(1)))
std::swap(LoopWeight, ExitWeight);
if (!ExitWeight)
return std::nullopt;
*EstimatedLoopInvocationWeight = ExitWeight;
}
}
return std::nullopt;

// Return the estimated trip count from metadata unless the metadata is
// missing or has no value.
if (auto TC = getOptionalIntLoopAttribute(L, LLVMLoopEstimatedTripCount)) {
LLVM_DEBUG(dbgs() << "getLoopEstimatedTripCount: "
<< LLVMLoopEstimatedTripCount << " metadata has trip "
<< "count of " << *TC << " for " << DbgLoop(L) << "\n");
return TC;
}

// Estimate the trip count from latch branch weights.
return estimateLoopTripCount(L);
}

bool llvm::setLoopEstimatedTripCount(Loop *L, unsigned EstimatedTripCount,
unsigned EstimatedloopInvocationWeight) {
// At the moment, we currently support changing the estimate trip count of
// the latch branch only. We could extend this API to manipulate estimated
// trip counts for any exit.
bool llvm::setLoopEstimatedTripCount(
Loop *L, unsigned EstimatedTripCount,
std::optional<unsigned> EstimatedloopInvocationWeight) {
// Set the metadata.
addStringMetadataToLoop(L, LLVMLoopEstimatedTripCount, EstimatedTripCount);

// At the moment, we currently support changing the estimated trip count in
// the latch branch's branch weights only. We could extend this API to
// manipulate estimated trip counts for any exit.
//
// TODO: Eventually, once all passes have migrated away from setting branch
// weights to indicate estimated trip counts, we will not set branch weights
// here at all.
if (!EstimatedloopInvocationWeight)
return true;
BranchInst *LatchBranch = getExpectedExitLoopLatchBranch(L);
if (!LatchBranch)
return false;
@@ -869,8 +923,8 @@ bool llvm::setLoopEstimatedTripCount(Loop *L, unsigned EstimatedTripCount,
unsigned LatchExitWeight = 0;
unsigned BackedgeTakenWeight = 0;

if (EstimatedTripCount > 0) {
LatchExitWeight = EstimatedloopInvocationWeight;
if (EstimatedTripCount != 0) {
LatchExitWeight = *EstimatedloopInvocationWeight;
BackedgeTakenWeight = (EstimatedTripCount - 1) * LatchExitWeight;
}
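
The weight computation in this hunk is the inverse of the estimate: the latch exit weight is the invocation weight, and the backedge weight is `(EstimatedTripCount - 1)` times that. A small sketch makes the relationship concrete. `LatchWeights` and `weightsForTripCount` are hypothetical names, and like the original expression this uses plain `unsigned` arithmetic, so very large inputs would wrap.

```cpp
// Hypothetical pair of latch branch weights.
struct LatchWeights {
  unsigned LatchExit = 0;
  unsigned BackedgeTaken = 0;
};

// Mirrors the computation shown in the hunk: both weights stay zero for an
// estimated trip count of zero.
LatchWeights weightsForTripCount(unsigned EstimatedTripCount,
                                 unsigned EstimatedLoopInvocationWeight) {
  LatchWeights W;
  if (EstimatedTripCount != 0) {
    W.LatchExit = EstimatedLoopInvocationWeight;
    W.BackedgeTaken = (EstimatedTripCount - 1) * W.LatchExit;
  }
  return W;
}
```

A trip count of 8 with an invocation weight of 10 produces weights 10 (exit) and 70 (backedge), and dividing 70 by 10 and adding one recovers the trip count of 8.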
