-
Notifications
You must be signed in to change notification settings - Fork 15.2k
IRLinker: avoid quadratic behavior #157045
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IRLinker: avoid quadratic behavior #157045
Conversation
Created using spr 1.3.6
@llvm/pr-subscribers-lto Author: Vitaly Buka (vitalybuka) Changesis_contained does linear search trough On large binaries it can take up to 30% Full diff: https://github.com/llvm/llvm-project/pull/157045.diff 1 Files Affected:
diff --git a/llvm/lib/Linker/IRMover.cpp b/llvm/lib/Linker/IRMover.cpp
index d6c15de4c4cdc..8903f6e9775d8 100644
--- a/llvm/lib/Linker/IRMover.cpp
+++ b/llvm/lib/Linker/IRMover.cpp
@@ -8,6 +8,7 @@
#include "llvm/Linker/IRMover.h"
#include "LinkDiagnosticInfo.h"
+#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/SetVector.h"
#include "llvm/ADT/SmallString.h"
#include "llvm/IR/AutoUpgrade.h"
@@ -290,6 +291,9 @@ class IRLinker {
Module &DstM;
std::unique_ptr<Module> SrcM;
+ // Lookup table to optimize IRMover::linkNamedMDNodes().
+ DenseMap<StringRef, DenseSet<MDNode *>> NamedMDNodes;
+
/// See IRMover::move().
IRMover::LazyCallback AddLazyFor;
@@ -1132,12 +1136,20 @@ void IRLinker::linkNamedMDNodes() {
continue;
NamedMDNode *DestNMD = DstM.getOrInsertNamedMetadata(NMD.getName());
+
+ auto &Inserted = NamedMDNodes[DestNMD->getName()];
+ if (Inserted.size() < DestNMD->getNumOperands()) {
+ // Must be the first module, copy everything from DestNMD.
+ Inserted.insert(DestNMD->operands().begin(), DestNMD->operands().end());
+ }
+
// Add Src elements into Dest node.
for (const MDNode *Op : NMD.operands()) {
MDNode *MD = Mapper.mapMDNode(*Op);
- if (!is_contained(DestNMD->operands(), MD))
+ if (Inserted.insert(MD).second)
DestNMD->addOperand(MD);
}
+ assert(Inserted.size() == DestNMD->getNumOperands());
}
}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR optimizes the IRLinker to avoid quadratic behavior during ThinLTO linking by replacing a linear search with a hash-based lookup. The change addresses performance issues where is_contained
performs linear searches through previously inserted operands, causing up to 30% overhead on large binaries with CFI metadata.
Key changes:
- Introduces a
DenseMap<StringRef, DenseSet<MDNode *>>
lookup table to track inserted operands - Replaces
is_contained
linear search with O(1) hash-based duplicate detection - Adds initialization logic to populate the lookup table from existing destination operands
Created using spr 1.3.6
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/181/builds/27321 Here is the relevant piece of the build log for the reference
|
In #157045 I didn't realize that IRMover object is one for entire LTO, but IRLinker is created per module. We need one chache for combined module. I timed "IRLinker::linkNamedMDNodes" code on one of our internal binaries, not very large, but above average. Before #146020: 0.4859s After #146020: 624.4686s After #157045: 322.3493s After this patch: 0.5574s
We're hitting the assert when there are duplicated I'll go ahead and remove the assert for now and add a test. |
It was added in #157045, but will fail if there are duplicated nodes in DstM already, as illustrated by the test. See comments on the PR.
After #146020
is_contained
does linear search troughall previously inserted operands.
On large binaries it can take up to 30% for ThinLTO linking with CFI, which
uses long lists of
cfi.functions
metadata.