Skip to content

[PGO] Add llvm.loop.estimated_trip_count metadata #152775

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

jdenny-ornl
Copy link
Collaborator

This patch implements the llvm.loop.estimated_trip_count metadata discussed in [RFC] Fix Loop Transformations to Preserve Block Frequencies. As suggested in the RFC comments, it adds the new metadata to all loops at the time of profile ingestion and estimates each trip count from the loop's branch_weights metadata. As suggested in the PR #128785 review, it does so via a new PGOEstimateTripCountsPass pass, which creates the new metadata for each loop but omits the value if it cannot estimate a trip count due to the loop's form.

An important observation is that PGOEstimateTripCountsPass often cannot estimate a loop's trip count, but later passes can sometimes transform the loop in a way that makes it possible. Currently, such passes do not necessarily update the metadata, but eventually that should be fixed. Until then, if the new metadata has no value, llvm::getLoopEstimatedTripCount disregards it and tries again to estimate the trip count from the loop's current branch_weights metadata.

This patch implements the `llvm.loop.estimated_trip_count` metadata
discussed in [[RFC] Fix Loop Transformations to Preserve Block
Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).
As [suggested in the RFC
comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4),
it adds the new metadata to all loops at the time of profile ingestion
and estimates each trip count from the loop's `branch_weights`
metadata.  As [suggested in the PR#128785
review](#128785 (comment)),
it does so via a `PGOEstimateTripCountsPass` pass, which creates the
new metadata for the loop but omits the value if it cannot estimate a
trip count due to the loop's form.

An important observation not previously discussed is that
`PGOEstimateTripCountsPass` *often* cannot estimate a loop's trip
count but later passes can transform the loop in a way that makes it
possible.  Currently, such passes do not necessarily update the
metadata, but eventually that should be fixed.  Until then, if the new
metadata has no value, `llvm::getLoopEstimatedTripCount` disregards it
and tries again to estimate the trip count from the loop's
`branch_weights` metadata.
Somehow, on some of my builds, `llvm::` prefixes are dropped from some
symbol names in the printed past list.
For estimated trip count metdata.
That was applied after merging this PR last time.
@jdenny-ornl
Copy link
Collaborator Author

This resurrects PR #148758. I'll address previous issues shortly.

Copy link

github-actions bot commented Aug 8, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@jdenny-ornl
Copy link
Collaborator Author

I believe I have addressed most issues raised in PR #148758.

For now, this PR shows what the pass pipeline tests (e.g., llvm/test/Other/new-pm-defaults.ll) look like after recent changes. As @nikic pointed out, there are still changes to where DominatorTreeAnalysis and LoopAnalysis run. LastRunTrackingAnalysis too.

I have not addressed the general issue raised there about whether pgo-estimated-trip-counts is actually worthwhile and whether we should constrain it or disable it by default. What do people think now after seeing the above changes?

@nikic
Copy link
Contributor

nikic commented Aug 8, 2025

Please rebase out the merge commits so I can test this. Found a way to apply it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants