Commit 13d1fbb

[PGO] Add llvm.loop.estimated_trip_count metadata
This patch implements the `llvm.loop.estimated_trip_count` metadata discussed in [[RFC] Fix Loop Transformations to Preserve Block Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).

As [suggested in the RFC comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4), it adds the new metadata to all loops at the time of profile ingestion and estimates each trip count from the loop's `branch_weights` metadata.

As suggested in the PR #128785 review, it does so via a `PGOEstimateTripCountsPass` pass, which creates the new metadata for the loop but omits the value if it cannot estimate a trip count due to the loop's form.

An important observation not previously discussed is that `PGOEstimateTripCountsPass` *often* cannot estimate a loop's trip count, but later passes can transform the loop in a way that makes it possible. Currently, such passes do not necessarily update the metadata, but eventually that should be fixed. Until then, if the new metadata has no value, `llvm::getLoopEstimatedTripCount` disregards it and tries again to estimate the trip count from the loop's `branch_weights` metadata.
1 parent 4b52d22 commit 13d1fbb

18 files changed

+724
-137
lines changed

llvm/docs/LangRef.rst

Lines changed: 61 additions & 0 deletions
@@ -7933,6 +7933,67 @@ The attributes in this metadata is added to all followup loops of the
 loop distribution pass. See
 :ref:`Transformation Metadata <transformation-metadata>` for details.
 
+'``llvm.loop.estimated_trip_count``' Metadata
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This metadata records an estimated trip count for the loop. The first operand
+is the string ``llvm.loop.estimated_trip_count``. The second operand is an
+integer specifying the count, which might be omitted for the reasons described
+below. For example:
+
+.. code-block:: llvm
+
+    !0 = !{!"llvm.loop.estimated_trip_count", i32 8}
+    !1 = !{!"llvm.loop.estimated_trip_count"}
+
+Purpose
+"""""""
+
+A loop's estimated trip count is an estimate of the average number of loop
+iterations (specifically, the number of times the loop's header executes) each
+time execution reaches the loop. It is usually only an estimate based on, for
+example, profile data. The actual number of iterations might vary widely.
+
+The estimated trip count serves as a parameter for various loop transformations
+and typically helps estimate transformation cost. For example, it can help
+determine how many iterations to peel or how aggressively to unroll.
+
+Initialization and Maintenance
+""""""""""""""""""""""""""""""
+
+The ``pgo-estimate-trip-counts`` pass typically runs immediately after profile
+ingestion to add this metadata to all loops. It estimates each loop's trip
+count from the loop's ``branch_weights`` metadata. This way of initially
+estimating trip counts appears to be useful for the passes that consume them.
+
+As passes transform existing loops and create new loops, they must be free to
+update and create ``branch_weights`` metadata to maintain accurate block
+frequencies. Trip counts estimated from this new ``branch_weights`` metadata
+are not necessarily useful to the passes that consume them. In general, when
+passes transform and create loops, they should separately estimate new trip
+counts from previously estimated trip counts, and they should record them by
+creating or updating this metadata. For this or any other work involving
+estimated trip counts, passes should always call
+``llvm::getLoopEstimatedTripCount`` and ``llvm::setLoopEstimatedTripCount``.
+
+Missing Metadata and Values
+"""""""""""""""""""""""""""
+
+If the current implementation of ``pgo-estimate-trip-counts`` cannot estimate a
+trip count from the loop's ``branch_weights`` metadata due to the loop's form or
+due to missing profile data, it creates this metadata for the loop but omits the
+value. This situation is currently common (e.g., the LLVM IR loop that Clang
+emits for a simple C ``for`` loop). A later pass (e.g., ``loop-rotate``) might
+modify the loop's form in a way that enables estimating its trip count even if
+those modifications provably never impact the actual number of loop iterations.
+That later pass should then add an appropriate value to the metadata.
+
+However, not all such passes currently do so. Thus, if this metadata has no
+value, ``llvm::getLoopEstimatedTripCount`` will disregard it and estimate the
+trip count from the loop's ``branch_weights`` metadata. It does the same when
+the metadata is missing altogether, perhaps because ``pgo-estimate-trip-counts``
+was not specified in a minimal pass list to a tool like ``opt``.
+
 '``llvm.licm.disable``' Metadata
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
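
Reader's note: the LangRef text above says the initial estimate comes from the loop's `branch_weights` metadata, but it does not spell out the arithmetic. The following is a minimal, self-contained sketch of one plausible calculation, not the code from this commit. The `LatchWeights` struct, the `estimateTripCount` name, and the round-to-nearest rule are assumptions made for illustration; only the "positive, saturated at `UINT_MAX`, or no value" contract is taken from the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>

// Hypothetical model of the two profile weights on a loop latch branch.
struct LatchWeights {
  uint32_t Backedge; // weight of the edge back to the loop header
  uint32_t Exit;     // weight of the edge that leaves the loop
};

// Estimate how many times the header runs per loop entry.
// Returns std::nullopt when the exit weight is zero, because no ratio exists.
std::optional<unsigned> estimateTripCount(const LatchWeights &W) {
  if (W.Exit == 0)
    return std::nullopt;
  // Backedges taken per loop exit, rounded to nearest (assumed rounding rule).
  uint64_t BackedgesPerEntry =
      (uint64_t(W.Backedge) + W.Exit / 2) / W.Exit;
  // The header executes once more than the backedge is taken.
  uint64_t TripCount = BackedgesPerEntry + 1;
  assert(TripCount >= 1 && "an estimated trip count is always positive");
  // Saturate at the largest representable unsigned value.
  const uint64_t Max = std::numeric_limits<unsigned>::max();
  return static_cast<unsigned>(TripCount > Max ? Max : TripCount);
}
```

With weights of 7 (backedge) and 1 (exit), this sketch yields 8, matching the `i32 8` example in the LangRef text. A zero exit weight yields no value, which corresponds to the metadata form without a second operand.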

llvm/include/llvm/Analysis/LoopInfo.h

Lines changed: 7 additions & 3 deletions
@@ -637,9 +637,13 @@ LLVM_ABI std::optional<bool> getOptionalBoolLoopAttribute(const Loop *TheLoop,
 /// Returns true if Name is applied to TheLoop and enabled.
 LLVM_ABI bool getBooleanLoopAttribute(const Loop *TheLoop, StringRef Name);
 
-/// Find named metadata for a loop with an integer value.
-LLVM_ABI std::optional<int> getOptionalIntLoopAttribute(const Loop *TheLoop,
-                                                        StringRef Name);
+/// Find named metadata for a loop with an integer value. Return
+/// \c std::nullopt if the metadata has no value or is missing altogether. If
+/// \p Missing, set \c *Missing to indicate whether the metadata is missing
+/// altogether.
+LLVM_ABI std::optional<int>
+getOptionalIntLoopAttribute(const Loop *TheLoop, StringRef Name,
+                            bool *Missing = nullptr);
 
 /// Find named metadata for a loop with an integer value. Return \p Default if
 /// not set.
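
Reader's note: the new `Missing` out-parameter lets a caller tell apart three states that the return value alone conflates: metadata absent, metadata present without a value, and metadata present with a value. The sketch below models that contract with standard containers only. `LoopAttrs` and `getOptionalIntAttr` are hypothetical stand-ins, not LLVM types or functions.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for loop metadata: attribute name -> optional value.
// A key mapped to std::nullopt models metadata that is present with no value.
using LoopAttrs = std::map<std::string, std::optional<int>>;

// Mirrors the documented contract: return the value if there is one, and
// report through *Missing (when non-null) whether the attribute is absent.
std::optional<int> getOptionalIntAttr(const LoopAttrs &Attrs,
                                      const std::string &Name,
                                      bool *Missing = nullptr) {
  assert(!Name.empty() && "attribute names are non-empty");
  auto It = Attrs.find(Name);
  if (Missing)
    *Missing = (It == Attrs.end());
  if (It == Attrs.end())
    return std::nullopt; // missing altogether
  return It->second;     // either a value or std::nullopt (no value)
}
```

A caller that only needs the value can keep ignoring the third argument, which is why defaulting it to `nullptr` leaves existing call sites unchanged.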
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+//===- PGOEstimateTripCounts.h ----------------------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TRANSFORMS_INSTRUMENTATION_PGOESTIMATETRIPCOUNTS_H
+#define LLVM_TRANSFORMS_INSTRUMENTATION_PGOESTIMATETRIPCOUNTS_H
+
+#include "llvm/IR/PassManager.h"
+
+namespace llvm {
+
+struct PGOEstimateTripCountsPass
+    : public PassInfoMixin<PGOEstimateTripCountsPass> {
+  PGOEstimateTripCountsPass() {}
+  PreservedAnalyses run(Module &M, ModuleAnalysisManager &AM);
+};
+
+} // namespace llvm
+
+#endif // LLVM_TRANSFORMS_INSTRUMENTATION_PGOESTIMATETRIPCOUNTS_H

llvm/include/llvm/Transforms/Utils/LoopUtils.h

Lines changed: 65 additions & 20 deletions
@@ -316,28 +316,73 @@ LLVM_ABI TransformationMode hasDistributeTransformation(const Loop *L);
 LLVM_ABI TransformationMode hasLICMVersioningTransformation(const Loop *L);
 /// @}
 
-/// Set input string into loop metadata by keeping other values intact.
-/// If the string is already in loop metadata update value if it is
-/// different.
-LLVM_ABI void addStringMetadataToLoop(Loop *TheLoop, const char *MDString,
-                                      unsigned V = 0);
-
-/// Returns a loop's estimated trip count based on branch weight metadata.
-/// In addition if \p EstimatedLoopInvocationWeight is not null it is
-/// initialized with weight of loop's latch leading to the exit.
-/// Returns a valid positive trip count, saturated at UINT_MAX, or std::nullopt
-/// when a meaningful estimate cannot be made.
+/// Set the string \p MDString into the loop metadata of \p TheLoop while
+/// keeping other loop metadata intact. Set \p *V as its value, or set it
+/// without a value if \p V is \c std::nullopt to indicate the value is unknown.
+/// If \p MDString is already in the loop metadata, update it if its value (or
+/// lack of value) is different. Return true if metadata was changed.
+LLVM_ABI bool addStringMetadataToLoop(Loop *TheLoop, const char *MDString,
+                                      std::optional<unsigned> V = 0);
+
+/// Return either:
+/// - The value of \c llvm.loop.estimated_trip_count from the loop metadata of
+///   \p L, if that metadata is present and has a value.
+/// - Else, a new estimate of the trip count from the latch branch weights of
+///   \p L, if the estimation's implementation is able to handle the loop form
+///   of \p L (e.g., \p L must have a latch block that controls the loop exit).
+/// - Else, \c std::nullopt.
+///
+/// An estimated trip count is always a valid positive trip count, saturated at
+/// \c UINT_MAX.
+///
+/// Via \c LLVM_DEBUG, emit diagnostics that include "WARNING" when the metadata
+/// is in an unexpected state as that indicates some transformation has
+/// corrupted it. If \p DbgForInit, expect the metadata to be missing.
+/// Otherwise, expect the metadata to be present, and expect it to have no value
+/// only if the trip count is currently inestimable from the latch branch
+/// weights.
+///
+/// In addition, if \p EstimatedLoopInvocationWeight, then either:
+/// - Set \p *EstimatedLoopInvocationWeight to the weight of the latch's branch
+///   to the loop exit.
+/// - Do not set it and return \c std::nullopt if the current implementation
+///   cannot compute that weight (e.g., if \p L does not have a latch block that
+///   controls the loop exit) or the weight is zero (because zero cannot be
+///   used to compute new branch weights that reflect the estimated trip count).
+///
+/// TODO: Eventually, once all passes have migrated away from setting branch
+/// weights to indicate estimated trip counts, this function will drop the
+/// \p EstimatedLoopInvocationWeight parameter.
 LLVM_ABI std::optional<unsigned>
 getLoopEstimatedTripCount(Loop *L,
-                          unsigned *EstimatedLoopInvocationWeight = nullptr);
-
-/// Set a loop's branch weight metadata to reflect that loop has \p
-/// EstimatedTripCount iterations and \p EstimatedLoopInvocationWeight exits
-/// through latch. Returns true if metadata is successfully updated, false
-/// otherwise. Note that loop must have a latch block which controls loop exit
-/// in order to succeed.
-LLVM_ABI bool setLoopEstimatedTripCount(Loop *L, unsigned EstimatedTripCount,
-                                        unsigned EstimatedLoopInvocationWeight);
+                          unsigned *EstimatedLoopInvocationWeight = nullptr,
+                          bool DbgForInit = false);
+
+/// Set \c llvm.loop.estimated_trip_count with the value \c *EstimatedTripCount
+/// in the loop metadata of \p L, or set it without a value if
+/// \c !EstimatedTripCount to indicate that \c getLoopEstimatedTripCount cannot
+/// estimate the trip count from latch branch weights. If
+/// \c !EstimatedTripCount but \c getLoopEstimatedTripCount can estimate the
+/// trip counts, future calls to \c getLoopEstimatedTripCount will diagnose the
+/// metadata as corrupt.
+///
+/// In addition, if \p EstimatedLoopInvocationWeight, set the branch weight
+/// metadata of \p L to reflect that \p L has an estimated
+/// \c *EstimatedTripCount iterations and has \c *EstimatedLoopInvocationWeight
+/// exit weight through the loop's latch.
+///
+/// Return false if \c llvm.loop.estimated_trip_count was already set according
+/// to \p EstimatedTripCount and so was not updated. Return false if
+/// \p EstimatedLoopInvocationWeight and if branch weight metadata could not be
+/// successfully updated (e.g., if \p L does not have a latch block that
+/// controls the loop exit). Otherwise, return true.
+///
+/// TODO: Eventually, once all passes have migrated away from setting branch
+/// weights to indicate estimated trip counts, this function will drop the
+/// \p EstimatedLoopInvocationWeight parameter.
+LLVM_ABI bool setLoopEstimatedTripCount(
+    Loop *L, std::optional<unsigned> EstimatedTripCount,
+    std::optional<unsigned> EstimatedLoopInvocationWeight = std::nullopt);
 
 /// Check inner loop (L) backedge count is known to be invariant on all
 /// iterations of its outer loop. If the loop has no parent, this is trivially
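
Reader's note: the doc comments above describe a three-step lookup for `getLoopEstimatedTripCount` and a "changed or not" return value for `setLoopEstimatedTripCount`. The sketch below models just those two rules on a plain struct, leaving out the branch-weight side effects and the `LLVM_DEBUG` diagnostics. `LoopModel`, `getEstimate`, and `setEstimate` are hypothetical names, and the `FromBranchWeights` field stands in for whatever the real estimation would compute.

```cpp
#include <cassert>
#include <optional>

// Hypothetical model of the state the two functions consult.
struct LoopModel {
  bool HasMetadata = false;              // is the metadata attached at all?
  std::optional<unsigned> MetadataValue; // its value, if it has one
  std::optional<unsigned> FromBranchWeights; // estimate from latch weights
};

// Lookup order from the doc comment: a recorded value wins; otherwise fall
// back to the branch-weight estimate; otherwise there is no estimate.
std::optional<unsigned> getEstimate(const LoopModel &L) {
  if (L.HasMetadata && L.MetadataValue)
    return L.MetadataValue;
  return L.FromBranchWeights; // may itself be std::nullopt
}

// Record TC, or record "no value" when TC is std::nullopt. Returns false
// when the metadata already matched and was therefore left untouched.
bool setEstimate(LoopModel &L, std::optional<unsigned> TC) {
  assert((!TC || *TC > 0) && "a recorded estimate is positive");
  if (L.HasMetadata && L.MetadataValue == TC)
    return false;
  L.HasMetadata = true;
  L.MetadataValue = TC;
  return true;
}
```

In this model, metadata that is present without a value behaves the same as absent metadata for lookups, which is the fallback the commit message describes for passes that have not yet been taught to update the metadata.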

llvm/lib/Analysis/LoopInfo.cpp

Lines changed: 7 additions & 3 deletions
@@ -1112,9 +1112,13 @@ bool llvm::getBooleanLoopAttribute(const Loop *TheLoop, StringRef Name) {
 }
 
 std::optional<int> llvm::getOptionalIntLoopAttribute(const Loop *TheLoop,
-                                                     StringRef Name) {
-  const MDOperand *AttrMD =
-      findStringMetadataForLoop(TheLoop, Name).value_or(nullptr);
+                                                     StringRef Name,
+                                                     bool *Missing) {
+  std::optional<const MDOperand *> AttrMDOpt =
+      findStringMetadataForLoop(TheLoop, Name);
+  if (Missing)
+    *Missing = !AttrMDOpt;
+  const MDOperand *AttrMD = AttrMDOpt.value_or(nullptr);
   if (!AttrMD)
     return std::nullopt;

llvm/lib/Passes/PassBuilder.cpp

Lines changed: 1 addition & 0 deletions
@@ -248,6 +248,7 @@
 #include "llvm/Transforms/Instrumentation/NumericalStabilitySanitizer.h"
 #include "llvm/Transforms/Instrumentation/PGOCtxProfFlattening.h"
 #include "llvm/Transforms/Instrumentation/PGOCtxProfLowering.h"
+#include "llvm/Transforms/Instrumentation/PGOEstimateTripCounts.h"
 #include "llvm/Transforms/Instrumentation/PGOForceFunctionAttrs.h"
 #include "llvm/Transforms/Instrumentation/PGOInstrumentation.h"
 #include "llvm/Transforms/Instrumentation/RealtimeSanitizer.h"

llvm/lib/Passes/PassBuilderPipelines.cpp

Lines changed: 8 additions & 2 deletions
@@ -80,6 +80,7 @@
 #include "llvm/Transforms/Instrumentation/MemProfUse.h"
 #include "llvm/Transforms/Instrumentation/PGOCtxProfFlattening.h"
 #include "llvm/Transforms/Instrumentation/PGOCtxProfLowering.h"
+#include "llvm/Transforms/Instrumentation/PGOEstimateTripCounts.h"
 #include "llvm/Transforms/Instrumentation/PGOForceFunctionAttrs.h"
 #include "llvm/Transforms/Instrumentation/PGOInstrumentation.h"
 #include "llvm/Transforms/Scalar/ADCE.h"
@@ -1268,8 +1269,13 @@ PassBuilder::buildModuleSimplificationPipeline(OptimizationLevel Level,
     MPM.addPass(MemProfUsePass(PGOOpt->MemoryProfile, PGOOpt->FS));
 
   if (PGOOpt && (PGOOpt->Action == PGOOptions::IRUse ||
-                 PGOOpt->Action == PGOOptions::SampleUse))
+                 PGOOpt->Action == PGOOptions::SampleUse)) {
     MPM.addPass(PGOForceFunctionAttrsPass(PGOOpt->ColdOptType));
+    // TODO: Is this the right place for this pass? Should we enable it in any
+    // other case, such as when __builtin_expect_with_probability or
+    // __builtin_expect appears in the source code but profiles are not read?
+    MPM.addPass(PGOEstimateTripCountsPass());
+  }
 
   MPM.addPass(AlwaysInlinerPass(/*InsertLifetimeIntrinsics=*/true));
 
@@ -2355,4 +2361,4 @@ AAManager PassBuilder::buildDefaultAAPipeline() {
 bool PassBuilder::isInstrumentedPGOUse() const {
   return (PGOOpt && PGOOpt->Action == PGOOptions::IRUse) ||
          !UseCtxProfile.empty();
-}
+}

llvm/lib/Passes/PassRegistry.def

Lines changed: 1 addition & 0 deletions
@@ -124,6 +124,7 @@ MODULE_PASS("openmp-opt", OpenMPOptPass())
 MODULE_PASS("openmp-opt-postlink",
             OpenMPOptPass(ThinOrFullLTOPhase::FullLTOPostLink))
 MODULE_PASS("partial-inliner", PartialInlinerPass())
+MODULE_PASS("pgo-estimate-trip-counts", PGOEstimateTripCountsPass())
 MODULE_PASS("pgo-icall-prom", PGOIndirectCallPromotion())
 MODULE_PASS("pgo-instr-gen", PGOInstrumentationGen())
 MODULE_PASS("pgo-instr-use", PGOInstrumentationUse())

llvm/lib/Transforms/Instrumentation/CMakeLists.txt

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ add_llvm_component_library(LLVMInstrumentation
   LowerAllowCheckPass.cpp
   PGOCtxProfFlattening.cpp
   PGOCtxProfLowering.cpp
+  PGOEstimateTripCounts.cpp
   PGOForceFunctionAttrs.cpp
   PGOInstrumentation.cpp
   PGOMemOPSizeOpt.cpp
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm/Transforms/Instrumentation/PGOEstimateTripCounts.h"
+#include "llvm/Analysis/LoopInfo.h"
+#include "llvm/IR/Module.h"
+#include "llvm/Transforms/Utils/LoopUtils.h"
+
+using namespace llvm;
+
+#define DEBUG_TYPE "pgo-estimate-trip-counts"
+
+static bool runOnLoop(Loop *L) {
+  bool MadeChange = false;
+  std::optional<unsigned> TC = getLoopEstimatedTripCount(
+      L, /*EstimatedLoopInvocationWeight=*/nullptr, /*DbgForInit=*/true);
+  MadeChange |= setLoopEstimatedTripCount(L, TC);
+  for (Loop *SL : *L)
+    MadeChange |= runOnLoop(SL);
+  return MadeChange;
+}
+
+PreservedAnalyses PGOEstimateTripCountsPass::run(Module &M,
+                                                 ModuleAnalysisManager &AM) {
+  FunctionAnalysisManager &FAM =
+      AM.getResult<FunctionAnalysisManagerModuleProxy>(M).getManager();
+  bool MadeChange = false;
+  LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": start\n");
+  for (Function &F : M) {
+    if (F.isDeclaration())
+      continue;
+    LoopInfo *LI = &FAM.getResult<LoopAnalysis>(F);
+    if (!LI)
+      continue;
+    for (Loop *L : *LI)
+      MadeChange |= runOnLoop(L);
+  }
+  LLVM_DEBUG(dbgs() << DEBUG_TYPE << ": end\n");
+  return MadeChange ? PreservedAnalyses::none() : PreservedAnalyses::all();
+}
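
Reader's note: `runOnLoop` above annotates a loop and then recurses into its sub-loops, OR-ing the results so the pass can report whether anything changed. The sketch below reproduces that traversal shape on a hypothetical `LoopNode` tree so it can be run without LLVM. The field names and the `annotate` function are illustrative assumptions, with `Estimate` standing in for whatever the estimation step would return.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical loop-nest node; not an LLVM type.
struct LoopNode {
  std::optional<unsigned> Estimate; // what estimation would yield, if anything
  bool HasMetadata = false;         // has the metadata been attached?
  std::optional<unsigned> Metadata; // recorded value, possibly none
  std::vector<LoopNode> SubLoops;   // directly nested loops
};

// Attach metadata to L and every loop nested inside it. Returns true if any
// loop's metadata was created or changed.
bool annotate(LoopNode &L) {
  bool Changed = !L.HasMetadata || L.Metadata != L.Estimate;
  L.HasMetadata = true;
  L.Metadata = L.Estimate; // std::nullopt records "present, no value"
  assert(L.HasMetadata);
  // Use |= rather than || so every sub-loop is visited even after a change.
  for (LoopNode &Sub : L.SubLoops)
    Changed |= annotate(Sub);
  return Changed;
}
```

Running `annotate` a second time on the same tree returns false in this model, mirroring how the pass returns `PreservedAnalyses::all()` when no metadata needed updating.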
