-
Notifications
You must be signed in to change notification settings - Fork 14.7k
[LLD][COFF] Add support for ARM64X same-address thunks #151255
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-lld-coff Author: Jacek Caban (cjacek) ChangesFixes MSVC CRT thread-local constructors support on hybrid ARM64X targets.
MSVC appears to generate thunks even for non-hybrid ARM64EC images. As a side effect, the native symbol is pulled in. Since this is used in the CRT for thread-local constructors, it results in the image containing unnecessary native code. Because these thunks do not appear to be useful in that context, we limit this behavior to actual hybrid targets. This may change if compatibility requires it. The tricky part is that thunks should be skipped if the symbol is not live in either view, and symbol replacement must be reflected in weak aliases. This requires thunk generation to happen before resolving weak aliases but after the GC pass. To enable this, the Patch is 28.16 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/151255.diff 14 Files Affected:
diff --git a/lld/COFF/Chunks.cpp b/lld/COFF/Chunks.cpp
index 01752cdc6a9da..b9231b5d9bec5 100644
--- a/lld/COFF/Chunks.cpp
+++ b/lld/COFF/Chunks.cpp
@@ -422,12 +422,6 @@ void SectionChunk::writeTo(uint8_t *buf) const {
applyRelocation(buf + rel.VirtualAddress, rel);
}
-
- // Write the offset to EC entry thunk preceding section contents. The low bit
- // is always set, so it's effectively an offset from the last byte of the
- // offset.
- if (Defined *entryThunk = getEntryThunk())
- write32le(buf - sizeof(uint32_t), entryThunk->getRVA() - rva + 1);
}
void SectionChunk::applyRelocation(uint8_t *off,
@@ -881,6 +875,18 @@ void RangeExtensionThunkARM64::writeTo(uint8_t *buf) const {
applyArm64Imm(buf + 4, target->getRVA() & 0xfff, 0);
}
+void SameAddressThunkARM64EC::setDynamicRelocs(COFFLinkerContext &ctx) const {
+ // Add ARM64X relocations replacing adrp/add instructions with a version using
+ // the hybrid target.
+ RangeExtensionThunkARM64 hybridView(ARM64EC, hybridTarget);
+ uint8_t buf[sizeof(arm64Thunk)];
+ hybridView.setRVA(rva);
+ hybridView.writeTo(buf);
+ ctx.dynamicRelocs->set(this, *reinterpret_cast<uint32_t *>(buf));
+ ctx.dynamicRelocs->set(Arm64XRelocVal(this, sizeof(uint32_t)),
+ *reinterpret_cast<uint32_t *>(buf + sizeof(uint32_t)));
+}
+
LocalImportChunk::LocalImportChunk(COFFLinkerContext &c, Defined *s)
: sym(s), ctx(c) {
setAlignment(ctx.config.wordsize);
@@ -1264,7 +1270,8 @@ void DynamicRelocsChunk::finalize() {
}
// Set the reloc value. The reloc entry must be allocated beforehand.
-void DynamicRelocsChunk::set(uint32_t rva, Arm64XRelocVal value) {
+void DynamicRelocsChunk::set(Arm64XRelocVal offset, Arm64XRelocVal value) {
+ uint32_t rva = offset.get();
auto entry =
llvm::find_if(arm64xRelocs, [rva](const Arm64XDynamicRelocEntry &e) {
return e.offset.get() == rva;
diff --git a/lld/COFF/Chunks.h b/lld/COFF/Chunks.h
index d03a64cc6b812..6d88f5ec73776 100644
--- a/lld/COFF/Chunks.h
+++ b/lld/COFF/Chunks.h
@@ -193,6 +193,8 @@ class NonSectionChunk : public Chunk {
// allowed ranges. Return the additional space required for the extension.
virtual uint32_t extendRanges() { return 0; };
+ virtual Defined *getEntryThunk() const { return nullptr; };
+
static bool classof(const Chunk *c) { return c->kind() >= OtherKind; }
protected:
@@ -633,7 +635,7 @@ class ImportThunkChunkARM64EC : public ImportThunkChunk {
bool verifyRanges() override;
uint32_t extendRanges() override;
- Defined *exitThunk;
+ Defined *exitThunk = nullptr;
Defined *sym = nullptr;
bool extended = false;
@@ -675,6 +677,26 @@ class RangeExtensionThunkARM64 : public NonSectionCodeChunk {
MachineTypes machine;
};
+// A chunk used to guarantee the same address for a function in both views of
+// a hybrid image. Similar to RangeExtensionThunkARM64 chunks, it calls the
+// target symbol using a BR instruction. It also contains an entry thunk for EC
+// compatibility and additional ARM64X relocations that swap targets between
+// views.
+class SameAddressThunkARM64EC : public RangeExtensionThunkARM64 {
+public:
+ explicit SameAddressThunkARM64EC(Defined *t, Defined *hybridTarget,
+ Defined *entryThunk)
+ : RangeExtensionThunkARM64(ARM64EC, t), hybridTarget(hybridTarget),
+ entryThunk(entryThunk) {}
+
+ Defined *getEntryThunk() const override { return entryThunk; }
+ void setDynamicRelocs(COFFLinkerContext &ctx) const;
+
+private:
+ Defined *hybridTarget;
+ Defined *entryThunk;
+};
+
// Windows-specific.
// See comments for DefinedLocalImport class.
class LocalImportChunk : public NonSectionChunk {
@@ -843,13 +865,13 @@ class Arm64XRelocVal {
public:
Arm64XRelocVal(uint64_t value = 0) : value(value) {}
Arm64XRelocVal(Defined *sym, int32_t offset = 0) : sym(sym), value(offset) {}
- Arm64XRelocVal(Chunk *chunk, int32_t offset = 0)
+ Arm64XRelocVal(const Chunk *chunk, int32_t offset = 0)
: chunk(chunk), value(offset) {}
uint64_t get() const;
private:
Defined *sym = nullptr;
- Chunk *chunk = nullptr;
+ const Chunk *chunk = nullptr;
uint64_t value;
};
@@ -884,7 +906,7 @@ class DynamicRelocsChunk : public NonSectionChunk {
arm64xRelocs.emplace_back(type, size, offset, value);
}
- void set(uint32_t rva, Arm64XRelocVal value);
+ void set(Arm64XRelocVal offset, Arm64XRelocVal value);
private:
std::vector<Arm64XDynamicRelocEntry> arm64xRelocs;
@@ -940,6 +962,8 @@ inline bool Chunk::isHotPatchable() const {
inline Defined *Chunk::getEntryThunk() const {
if (auto *c = dyn_cast<const SectionChunkEC>(this))
return c->entryThunk;
+ if (auto *c = dyn_cast<const NonSectionChunk>(this))
+ return c->getEntryThunk();
return nullptr;
}
diff --git a/lld/COFF/Config.h b/lld/COFF/Config.h
index 91b6e632fa7ed..a03bb57641670 100644
--- a/lld/COFF/Config.h
+++ b/lld/COFF/Config.h
@@ -223,6 +223,9 @@ struct Configuration {
StringRef manifestUIAccess = "'false'";
StringRef manifestFile;
+ // used for /arm64xsameaddress
+ std::vector<std::pair<Symbol *, Symbol *>> sameAddresses;
+
// used for /dwodir
StringRef dwoDir;
diff --git a/lld/COFF/Driver.cpp b/lld/COFF/Driver.cpp
index 570b8f9d05906..7580b469e4f87 100644
--- a/lld/COFF/Driver.cpp
+++ b/lld/COFF/Driver.cpp
@@ -500,7 +500,9 @@ void LinkerDriver::parseDirectives(InputFile *file) {
file->symtab.parseAlternateName(arg->getValue());
break;
case OPT_arm64xsameaddress:
- if (!file->symtab.isEC())
+ if (file->symtab.isEC())
+ parseSameAddress(arg->getValue());
+ else
Warn(ctx) << arg->getSpelling()
<< " is not allowed in non-ARM64EC files (" << toString(file)
<< ")";
@@ -1318,13 +1320,9 @@ void LinkerDriver::convertResources() {
}
void LinkerDriver::maybeCreateECExportThunk(StringRef name, Symbol *&sym) {
- Defined *def;
if (!sym)
return;
- if (auto undef = dyn_cast<Undefined>(sym))
- def = undef->getDefinedWeakAlias();
- else
- def = dyn_cast<Defined>(sym);
+ Defined *def = sym->getDefined();
if (!def)
return;
@@ -1356,11 +1354,7 @@ void LinkerDriver::createECExportThunks() {
Symbol *sym = ctx.symtab.find(targetName);
if (!sym)
continue;
- Defined *targetSym;
- if (auto undef = dyn_cast<Undefined>(sym))
- targetSym = undef->getDefinedWeakAlias();
- else
- targetSym = dyn_cast<Defined>(sym);
+ Defined *targetSym = sym->getDefined();
if (!targetSym)
continue;
@@ -2303,6 +2297,13 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
args.filtered(OPT_dependentloadflag, OPT_dependentloadflag_opt))
parseDependentLoadFlags(arg);
+ for (auto *arg : args.filtered(OPT_arm64xsameaddress)) {
+ if (ctx.hybridSymtab)
+ parseSameAddress(arg->getValue());
+ else
+ Warn(ctx) << arg->getSpelling() << " is allowed only on EC targets";
+ }
+
if (tar) {
llvm::TimeTraceScope timeScope("Reproducer: response file");
tar->append(
@@ -2676,12 +2677,46 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
createECExportThunks();
// Resolve remaining undefined symbols and warn about imported locals.
+ std::vector<Undefined *> aliases;
ctx.forEachSymtab(
- [&](SymbolTable &symtab) { symtab.resolveRemainingUndefines(); });
+ [&](SymbolTable &symtab) { symtab.resolveRemainingUndefines(aliases); });
if (errorCount())
return;
+ ctx.forEachActiveSymtab([](SymbolTable &symtab) {
+ symtab.initializeECThunks();
+ symtab.initializeLoadConfig();
+ });
+
+ // Identify unreferenced COMDAT sections.
+ if (config->doGC) {
+ if (config->mingw) {
+ // markLive doesn't traverse .eh_frame, but the personality function is
+ // only reached that way. The proper solution would be to parse and
+ // traverse the .eh_frame section, like the ELF linker does.
+ // For now, just manually try to retain the known possible personality
+ // functions. This doesn't bring in more object files, but only marks
+ // functions that already have been included to be retained.
+ ctx.forEachSymtab([&](SymbolTable &symtab) {
+ for (const char *n : {"__gxx_personality_v0", "__gcc_personality_v0",
+ "rust_eh_personality"}) {
+ Defined *d = dyn_cast_or_null<Defined>(symtab.findUnderscore(n));
+ if (d && !d->isGCRoot) {
+ d->isGCRoot = true;
+ config->gcroot.push_back(d);
+ }
+ }
+ });
+ }
+
+ markLive(ctx);
+ }
+
+ ctx.symtab.initializeSameAddressThunks();
+ for (auto alias : aliases)
+ alias->resolveWeakAlias();
+
if (config->mingw) {
// Make sure the crtend.o object is the last object file. This object
// file can contain terminating section chunks that need to be placed
@@ -2773,35 +2808,6 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
if (auto *arg = args.getLastArg(OPT_print_symbol_order))
config->printSymbolOrder = arg->getValue();
- if (ctx.symtab.isEC())
- ctx.symtab.initializeECThunks();
- ctx.forEachActiveSymtab(
- [](SymbolTable &symtab) { symtab.initializeLoadConfig(); });
-
- // Identify unreferenced COMDAT sections.
- if (config->doGC) {
- if (config->mingw) {
- // markLive doesn't traverse .eh_frame, but the personality function is
- // only reached that way. The proper solution would be to parse and
- // traverse the .eh_frame section, like the ELF linker does.
- // For now, just manually try to retain the known possible personality
- // functions. This doesn't bring in more object files, but only marks
- // functions that already have been included to be retained.
- ctx.forEachSymtab([&](SymbolTable &symtab) {
- for (const char *n : {"__gxx_personality_v0", "__gcc_personality_v0",
- "rust_eh_personality"}) {
- Defined *d = dyn_cast_or_null<Defined>(symtab.findUnderscore(n));
- if (d && !d->isGCRoot) {
- d->isGCRoot = true;
- config->gcroot.push_back(d);
- }
- }
- });
- }
-
- markLive(ctx);
- }
-
// Needs to happen after the last call to addFile().
convertResources();
diff --git a/lld/COFF/Driver.h b/lld/COFF/Driver.h
index 5a9bd5c6d9682..b500ac8bba569 100644
--- a/lld/COFF/Driver.h
+++ b/lld/COFF/Driver.h
@@ -214,6 +214,8 @@ class LinkerDriver {
void parsePDBPageSize(StringRef);
void parseSection(StringRef);
+ void parseSameAddress(StringRef);
+
// Parses a MS-DOS stub file
void parseDosStub(StringRef path);
diff --git a/lld/COFF/DriverUtils.cpp b/lld/COFF/DriverUtils.cpp
index d8b41c7f45400..dc4039f116f25 100644
--- a/lld/COFF/DriverUtils.cpp
+++ b/lld/COFF/DriverUtils.cpp
@@ -328,6 +328,22 @@ void LinkerDriver::parseSwaprun(StringRef arg) {
} while (!arg.empty());
}
+void LinkerDriver::parseSameAddress(StringRef arg) {
+ auto mangledName = getArm64ECMangledFunctionName(arg);
+ Symbol *sym = ctx.symtab.addUndefined(mangledName ? *mangledName : arg);
+
+ // MSVC appears to generate thunks even for non-hybrid ARM64EC images.
+ // As a side effect, the native symbol is pulled in. Since this is used
+ // in the CRT for thread-local constructors, it results in the image
+ // containing unnecessary native code. As these thunks don't appear to
+ // be useful, we limit this behavior to actual hybrid targets. This may
+ // change if compatibility becomes necessary.
+ if (ctx.config.machine != ARM64X)
+ return;
+ Symbol *nativeSym = ctx.hybridSymtab->addUndefined(arg);
+ ctx.config.sameAddresses.emplace_back(sym, nativeSym);
+}
+
// An RAII temporary file class that automatically removes a temporary file.
namespace {
class TemporaryFile {
diff --git a/lld/COFF/MarkLive.cpp b/lld/COFF/MarkLive.cpp
index f40810c6805aa..78f5030e8fc2b 100644
--- a/lld/COFF/MarkLive.cpp
+++ b/lld/COFF/MarkLive.cpp
@@ -49,7 +49,10 @@ void markLive(COFFLinkerContext &ctx) {
addSym(file->impchkThunk->exitThunk);
};
- addSym = [&](Symbol *b) {
+ addSym = [&](Symbol *s) {
+ Defined *b = s->getDefined();
+ if (!b)
+ return;
if (auto *sym = dyn_cast<DefinedRegular>(b)) {
enqueue(sym->getChunk());
} else if (auto *sym = dyn_cast<DefinedImportData>(b)) {
diff --git a/lld/COFF/Options.td b/lld/COFF/Options.td
index 0d66b49a4fdb8..2c393cc94b5e3 100644
--- a/lld/COFF/Options.td
+++ b/lld/COFF/Options.td
@@ -31,6 +31,9 @@ multiclass B_priv<string name> {
def align : P<"align", "Section alignment">;
def aligncomm : P<"aligncomm", "Set common symbol alignment">;
def alternatename : P<"alternatename", "Define weak alias">;
+def arm64xsameaddress
+ : P<"arm64xsameaddress", "Generate a thunk for the symbol with the same "
+ "address in both native and EC views on ARM64X.">;
def base : P<"base", "Base address of the program">;
def color_diagnostics: Flag<["--"], "color-diagnostics">,
HelpText<"Alias for --color-diagnostics=always">;
@@ -373,4 +376,3 @@ def tlbid : P_priv<"tlbid">;
def tlbout : P_priv<"tlbout">;
def verbose_all : P_priv<"verbose">;
def guardsym : P_priv<"guardsym">;
-def arm64xsameaddress : P_priv<"arm64xsameaddress">;
diff --git a/lld/COFF/SymbolTable.cpp b/lld/COFF/SymbolTable.cpp
index 189e75dfc3ff5..de04cdff6483d 100644
--- a/lld/COFF/SymbolTable.cpp
+++ b/lld/COFF/SymbolTable.cpp
@@ -452,7 +452,7 @@ void SymbolTable::reportUnresolvable() {
reportProblemSymbols(undefs, /*localImports=*/nullptr, true);
}
-void SymbolTable::resolveRemainingUndefines() {
+void SymbolTable::resolveRemainingUndefines(std::vector<Undefined *> &aliases) {
llvm::TimeTraceScope timeScope("Resolve remaining undefined symbols");
SmallPtrSet<Symbol *, 8> undefs;
DenseMap<Symbol *, Symbol *> localImports;
@@ -468,8 +468,10 @@ void SymbolTable::resolveRemainingUndefines() {
StringRef name = undef->getName();
// A weak alias may have been resolved, so check for that.
- if (undef->resolveWeakAlias())
+ if (undef->getWeakAlias()) {
+ aliases.push_back(undef);
continue;
+ }
// If we can resolve a symbol by removing __imp_ prefix, do that.
// This odd rule is for compatibility with MSVC linker.
@@ -620,10 +622,10 @@ void SymbolTable::initializeECThunks() {
return;
for (auto it : entryThunks) {
- auto *to = dyn_cast<Defined>(it.second);
+ Defined *to = it.second->getDefined();
if (!to)
continue;
- auto *from = dyn_cast<DefinedRegular>(it.first);
+ auto *from = dyn_cast_or_null<DefinedRegular>(it.first->getDefined());
// We need to be able to add padding to the function and fill it with an
// offset to its entry thunks. To ensure that padding the function is
// feasible, functions are required to be COMDAT symbols with no offset.
@@ -642,7 +644,8 @@ void SymbolTable::initializeECThunks() {
Symbol *sym = exitThunks.lookup(file->thunkSym);
if (!sym)
sym = exitThunks.lookup(file->impECSym);
- file->impchkThunk->exitThunk = dyn_cast_or_null<Defined>(sym);
+ if (sym)
+ file->impchkThunk->exitThunk = sym->getDefined();
}
// On ARM64EC, the __imp_ symbol references the auxiliary IAT, while the
@@ -659,6 +662,35 @@ void SymbolTable::initializeECThunks() {
});
}
+void SymbolTable::initializeSameAddressThunks() {
+ for (auto iter : ctx.config.sameAddresses) {
+ auto sym = dyn_cast_or_null<DefinedRegular>(iter.first->getDefined());
+ if (!sym || !sym->isLive())
+ continue;
+ auto nativeSym =
+ dyn_cast_or_null<DefinedRegular>(iter.second->getDefined());
+ if (!nativeSym || !nativeSym->isLive())
+ continue;
+ Defined *entryThunk = sym->getChunk()->getEntryThunk();
+ if (!entryThunk)
+ continue;
+
+ // Replace symbols with symbols referencing the thunk. Store the original
+ // symbol as equivalent DefinedSynthetic instances for use in the thunk
+ // itself.
+ auto symClone = make<DefinedSynthetic>(sym->getName(), sym->getChunk(),
+ sym->getValue());
+ auto nativeSymClone = make<DefinedSynthetic>(
+ nativeSym->getName(), nativeSym->getChunk(), nativeSym->getValue());
+ SameAddressThunkARM64EC *thunk =
+ make<SameAddressThunkARM64EC>(nativeSymClone, symClone, entryThunk);
+ sameAddressThunks.push_back(thunk);
+
+ replaceSymbol<DefinedSynthetic>(sym, sym->getName(), thunk);
+ replaceSymbol<DefinedSynthetic>(nativeSym, nativeSym->getName(), thunk);
+ }
+}
+
Symbol *SymbolTable::addUndefined(StringRef name, InputFile *f,
bool overrideLazy) {
auto [s, wasInserted] = insert(name, f);
diff --git a/lld/COFF/SymbolTable.h b/lld/COFF/SymbolTable.h
index 7eb067640dc85..aadd366c7d39f 100644
--- a/lld/COFF/SymbolTable.h
+++ b/lld/COFF/SymbolTable.h
@@ -31,6 +31,7 @@ class DefinedAbsolute;
class DefinedRegular;
class ImportThunkChunk;
class LazyArchive;
+class SameAddressThunkARM64EC;
class SectionChunk;
class Symbol;
@@ -67,7 +68,7 @@ class SymbolTable {
// Try to resolve any undefined symbols and update the symbol table
// accordingly, then print an error message for any remaining undefined
// symbols and warn about imported local symbols.
- void resolveRemainingUndefines();
+ void resolveRemainingUndefines(std::vector<Undefined *> &aliases);
// Try to resolve undefined symbols with alternate names.
void resolveAlternateNames();
@@ -140,6 +141,7 @@ class SymbolTable {
void addEntryThunk(Symbol *from, Symbol *to);
void addExitThunk(Symbol *from, Symbol *to);
void initializeECThunks();
+ void initializeSameAddressThunks();
void reportDuplicate(Symbol *existing, InputFile *newFile,
SectionChunk *newSc = nullptr,
@@ -159,6 +161,8 @@ class SymbolTable {
// A list of EC EXP+ symbols.
std::vector<Symbol *> expSymbols;
+ std::vector<SameAddressThunkARM64EC *> sameAddressThunks;
+
// A list of DLL exports.
std::vector<Export> exports;
llvm::DenseSet<StringRef> directivesExports;
diff --git a/lld/COFF/Symbols.cpp b/lld/COFF/Symbols.cpp
index b571ce9ce432c..ba4f95d14bc64 100644
--- a/lld/COFF/Symbols.cpp
+++ b/lld/COFF/Symbols.cpp
@@ -91,6 +91,14 @@ bool Symbol::isLive() const {
return true;
}
+Defined *Symbol::getDefined() {
+ if (auto d = dyn_cast<Defined>(this))
+ return d;
+ if (auto u = dyn_cast<Undefined>(this))
+ return u->getDefinedWeakAlias();
+ return nullptr;
+}
+
void Symbol::replaceKeepingName(Symbol *other, size_t size) {
StringRef origName = getName();
memcpy(this, other, size);
diff --git a/lld/COFF/Symbols.h b/lld/COFF/Symbols.h
index fd3d8cec8c113..c86ded860876b 100644
--- a/lld/COFF/Symbols.h
+++ b/lld/COFF/Symbols.h
@@ -95,6 +95,10 @@ class Symbol {
symbolKind == LazyDLLSymbolKind;
}
+ // Get the Defined symbol associated with this symbol, either itself or its
+ // weak alias.
+ Defined *getDefined();
+
private:
void computeName();
diff --git a/lld/COFF/Writer.cpp b/lld/COFF/Writer.cpp
index 076561807af47..4c7a52eda151b 100644
--- a/lld/COFF/Writer.cpp
+++ b/lld/COFF/Writer.cpp
@@ -314,6 +314,7 @@ class Writer {
uint32_t dataDirOffset64;
OutputSection *textSec;
+ OutputSection *wowthkSec;
OutputSection *hexpthkSec;
OutputSection *bssSec;
OutputSection *rdataSec;
@@ -1076,8 +1077,10 @@ void Writer::createSections() {
// Try to match the section order used by link.exe.
textSec = createSection(".text", code | r | x);
- if (isArm64EC(ctx.config.machine))
+ if (isArm64EC(ctx.config.machine)) {
+ wowthkSec = createSection(".wowthk", code | r | x);
hexpthkSec = createSection(".hexpthk", code | r | x);
+ }
bssSec = createSection(".bss", bss | r | w);
rdataSec = createSection(".rdata", data | r);
buildidSec = createSection(".buildid", data | r);
@@ -2310,6 +2313,18 @@ void Writer::createECChunks() {
ctx.symtab.findUnderscore("__arm64x_redirection_metadata");
replaceSymbol<DefinedSynthetic>(entryPointsSym, entryPointsSym->getName(),
entryPoints);
+
+ for (auto thunk : ctx.symtab.sameAddressThunks) {
+ wowthkSec->insertChunkAtStart(thunk);
+ if (ctx.config.machine == ARM64X) {
+ ...
[truncated]
|
@llvm/pr-subscribers-lld Author: Jacek Caban (cjacek) ChangesFixes MSVC CRT thread-local constructors support on hybrid ARM64X targets.
MSVC appears to generate thunks even for non-hybrid ARM64EC images. As a side effect, the native symbol is pulled in. Since this is used in the CRT for thread-local constructors, it results in the image containing unnecessary native code. Because these thunks do not appear to be useful in that context, we limit this behavior to actual hybrid targets. This may change if compatibility requires it. The tricky part is that thunks should be skipped if the symbol is not live in either view, and symbol replacement must be reflected in weak aliases. This requires thunk generation to happen before resolving weak aliases but after the GC pass. To enable this, the Patch is 28.16 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/151255.diff 14 Files Affected:
diff --git a/lld/COFF/Chunks.cpp b/lld/COFF/Chunks.cpp
index 01752cdc6a9da..b9231b5d9bec5 100644
--- a/lld/COFF/Chunks.cpp
+++ b/lld/COFF/Chunks.cpp
@@ -422,12 +422,6 @@ void SectionChunk::writeTo(uint8_t *buf) const {
applyRelocation(buf + rel.VirtualAddress, rel);
}
-
- // Write the offset to EC entry thunk preceding section contents. The low bit
- // is always set, so it's effectively an offset from the last byte of the
- // offset.
- if (Defined *entryThunk = getEntryThunk())
- write32le(buf - sizeof(uint32_t), entryThunk->getRVA() - rva + 1);
}
void SectionChunk::applyRelocation(uint8_t *off,
@@ -881,6 +875,18 @@ void RangeExtensionThunkARM64::writeTo(uint8_t *buf) const {
applyArm64Imm(buf + 4, target->getRVA() & 0xfff, 0);
}
+void SameAddressThunkARM64EC::setDynamicRelocs(COFFLinkerContext &ctx) const {
+ // Add ARM64X relocations replacing adrp/add instructions with a version using
+ // the hybrid target.
+ RangeExtensionThunkARM64 hybridView(ARM64EC, hybridTarget);
+ uint8_t buf[sizeof(arm64Thunk)];
+ hybridView.setRVA(rva);
+ hybridView.writeTo(buf);
+ ctx.dynamicRelocs->set(this, *reinterpret_cast<uint32_t *>(buf));
+ ctx.dynamicRelocs->set(Arm64XRelocVal(this, sizeof(uint32_t)),
+ *reinterpret_cast<uint32_t *>(buf + sizeof(uint32_t)));
+}
+
LocalImportChunk::LocalImportChunk(COFFLinkerContext &c, Defined *s)
: sym(s), ctx(c) {
setAlignment(ctx.config.wordsize);
@@ -1264,7 +1270,8 @@ void DynamicRelocsChunk::finalize() {
}
// Set the reloc value. The reloc entry must be allocated beforehand.
-void DynamicRelocsChunk::set(uint32_t rva, Arm64XRelocVal value) {
+void DynamicRelocsChunk::set(Arm64XRelocVal offset, Arm64XRelocVal value) {
+ uint32_t rva = offset.get();
auto entry =
llvm::find_if(arm64xRelocs, [rva](const Arm64XDynamicRelocEntry &e) {
return e.offset.get() == rva;
diff --git a/lld/COFF/Chunks.h b/lld/COFF/Chunks.h
index d03a64cc6b812..6d88f5ec73776 100644
--- a/lld/COFF/Chunks.h
+++ b/lld/COFF/Chunks.h
@@ -193,6 +193,8 @@ class NonSectionChunk : public Chunk {
// allowed ranges. Return the additional space required for the extension.
virtual uint32_t extendRanges() { return 0; };
+ virtual Defined *getEntryThunk() const { return nullptr; };
+
static bool classof(const Chunk *c) { return c->kind() >= OtherKind; }
protected:
@@ -633,7 +635,7 @@ class ImportThunkChunkARM64EC : public ImportThunkChunk {
bool verifyRanges() override;
uint32_t extendRanges() override;
- Defined *exitThunk;
+ Defined *exitThunk = nullptr;
Defined *sym = nullptr;
bool extended = false;
@@ -675,6 +677,26 @@ class RangeExtensionThunkARM64 : public NonSectionCodeChunk {
MachineTypes machine;
};
+// A chunk used to guarantee the same address for a function in both views of
+// a hybrid image. Similar to RangeExtensionThunkARM64 chunks, it calls the
+// target symbol using a BR instruction. It also contains an entry thunk for EC
+// compatibility and additional ARM64X relocations that swap targets between
+// views.
+class SameAddressThunkARM64EC : public RangeExtensionThunkARM64 {
+public:
+ explicit SameAddressThunkARM64EC(Defined *t, Defined *hybridTarget,
+ Defined *entryThunk)
+ : RangeExtensionThunkARM64(ARM64EC, t), hybridTarget(hybridTarget),
+ entryThunk(entryThunk) {}
+
+ Defined *getEntryThunk() const override { return entryThunk; }
+ void setDynamicRelocs(COFFLinkerContext &ctx) const;
+
+private:
+ Defined *hybridTarget;
+ Defined *entryThunk;
+};
+
// Windows-specific.
// See comments for DefinedLocalImport class.
class LocalImportChunk : public NonSectionChunk {
@@ -843,13 +865,13 @@ class Arm64XRelocVal {
public:
Arm64XRelocVal(uint64_t value = 0) : value(value) {}
Arm64XRelocVal(Defined *sym, int32_t offset = 0) : sym(sym), value(offset) {}
- Arm64XRelocVal(Chunk *chunk, int32_t offset = 0)
+ Arm64XRelocVal(const Chunk *chunk, int32_t offset = 0)
: chunk(chunk), value(offset) {}
uint64_t get() const;
private:
Defined *sym = nullptr;
- Chunk *chunk = nullptr;
+ const Chunk *chunk = nullptr;
uint64_t value;
};
@@ -884,7 +906,7 @@ class DynamicRelocsChunk : public NonSectionChunk {
arm64xRelocs.emplace_back(type, size, offset, value);
}
- void set(uint32_t rva, Arm64XRelocVal value);
+ void set(Arm64XRelocVal offset, Arm64XRelocVal value);
private:
std::vector<Arm64XDynamicRelocEntry> arm64xRelocs;
@@ -940,6 +962,8 @@ inline bool Chunk::isHotPatchable() const {
inline Defined *Chunk::getEntryThunk() const {
if (auto *c = dyn_cast<const SectionChunkEC>(this))
return c->entryThunk;
+ if (auto *c = dyn_cast<const NonSectionChunk>(this))
+ return c->getEntryThunk();
return nullptr;
}
diff --git a/lld/COFF/Config.h b/lld/COFF/Config.h
index 91b6e632fa7ed..a03bb57641670 100644
--- a/lld/COFF/Config.h
+++ b/lld/COFF/Config.h
@@ -223,6 +223,9 @@ struct Configuration {
StringRef manifestUIAccess = "'false'";
StringRef manifestFile;
+ // used for /arm64xsameaddress
+ std::vector<std::pair<Symbol *, Symbol *>> sameAddresses;
+
// used for /dwodir
StringRef dwoDir;
diff --git a/lld/COFF/Driver.cpp b/lld/COFF/Driver.cpp
index 570b8f9d05906..7580b469e4f87 100644
--- a/lld/COFF/Driver.cpp
+++ b/lld/COFF/Driver.cpp
@@ -500,7 +500,9 @@ void LinkerDriver::parseDirectives(InputFile *file) {
file->symtab.parseAlternateName(arg->getValue());
break;
case OPT_arm64xsameaddress:
- if (!file->symtab.isEC())
+ if (file->symtab.isEC())
+ parseSameAddress(arg->getValue());
+ else
Warn(ctx) << arg->getSpelling()
<< " is not allowed in non-ARM64EC files (" << toString(file)
<< ")";
@@ -1318,13 +1320,9 @@ void LinkerDriver::convertResources() {
}
void LinkerDriver::maybeCreateECExportThunk(StringRef name, Symbol *&sym) {
- Defined *def;
if (!sym)
return;
- if (auto undef = dyn_cast<Undefined>(sym))
- def = undef->getDefinedWeakAlias();
- else
- def = dyn_cast<Defined>(sym);
+ Defined *def = sym->getDefined();
if (!def)
return;
@@ -1356,11 +1354,7 @@ void LinkerDriver::createECExportThunks() {
Symbol *sym = ctx.symtab.find(targetName);
if (!sym)
continue;
- Defined *targetSym;
- if (auto undef = dyn_cast<Undefined>(sym))
- targetSym = undef->getDefinedWeakAlias();
- else
- targetSym = dyn_cast<Defined>(sym);
+ Defined *targetSym = sym->getDefined();
if (!targetSym)
continue;
@@ -2303,6 +2297,13 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
args.filtered(OPT_dependentloadflag, OPT_dependentloadflag_opt))
parseDependentLoadFlags(arg);
+ for (auto *arg : args.filtered(OPT_arm64xsameaddress)) {
+ if (ctx.hybridSymtab)
+ parseSameAddress(arg->getValue());
+ else
+ Warn(ctx) << arg->getSpelling() << " is allowed only on EC targets";
+ }
+
if (tar) {
llvm::TimeTraceScope timeScope("Reproducer: response file");
tar->append(
@@ -2676,12 +2677,46 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
createECExportThunks();
// Resolve remaining undefined symbols and warn about imported locals.
+ std::vector<Undefined *> aliases;
ctx.forEachSymtab(
- [&](SymbolTable &symtab) { symtab.resolveRemainingUndefines(); });
+ [&](SymbolTable &symtab) { symtab.resolveRemainingUndefines(aliases); });
if (errorCount())
return;
+ ctx.forEachActiveSymtab([](SymbolTable &symtab) {
+ symtab.initializeECThunks();
+ symtab.initializeLoadConfig();
+ });
+
+ // Identify unreferenced COMDAT sections.
+ if (config->doGC) {
+ if (config->mingw) {
+ // markLive doesn't traverse .eh_frame, but the personality function is
+ // only reached that way. The proper solution would be to parse and
+ // traverse the .eh_frame section, like the ELF linker does.
+ // For now, just manually try to retain the known possible personality
+ // functions. This doesn't bring in more object files, but only marks
+ // functions that already have been included to be retained.
+ ctx.forEachSymtab([&](SymbolTable &symtab) {
+ for (const char *n : {"__gxx_personality_v0", "__gcc_personality_v0",
+ "rust_eh_personality"}) {
+ Defined *d = dyn_cast_or_null<Defined>(symtab.findUnderscore(n));
+ if (d && !d->isGCRoot) {
+ d->isGCRoot = true;
+ config->gcroot.push_back(d);
+ }
+ }
+ });
+ }
+
+ markLive(ctx);
+ }
+
+ ctx.symtab.initializeSameAddressThunks();
+ for (auto alias : aliases)
+ alias->resolveWeakAlias();
+
if (config->mingw) {
// Make sure the crtend.o object is the last object file. This object
// file can contain terminating section chunks that need to be placed
@@ -2773,35 +2808,6 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
if (auto *arg = args.getLastArg(OPT_print_symbol_order))
config->printSymbolOrder = arg->getValue();
- if (ctx.symtab.isEC())
- ctx.symtab.initializeECThunks();
- ctx.forEachActiveSymtab(
- [](SymbolTable &symtab) { symtab.initializeLoadConfig(); });
-
- // Identify unreferenced COMDAT sections.
- if (config->doGC) {
- if (config->mingw) {
- // markLive doesn't traverse .eh_frame, but the personality function is
- // only reached that way. The proper solution would be to parse and
- // traverse the .eh_frame section, like the ELF linker does.
- // For now, just manually try to retain the known possible personality
- // functions. This doesn't bring in more object files, but only marks
- // functions that already have been included to be retained.
- ctx.forEachSymtab([&](SymbolTable &symtab) {
- for (const char *n : {"__gxx_personality_v0", "__gcc_personality_v0",
- "rust_eh_personality"}) {
- Defined *d = dyn_cast_or_null<Defined>(symtab.findUnderscore(n));
- if (d && !d->isGCRoot) {
- d->isGCRoot = true;
- config->gcroot.push_back(d);
- }
- }
- });
- }
-
- markLive(ctx);
- }
-
// Needs to happen after the last call to addFile().
convertResources();
diff --git a/lld/COFF/Driver.h b/lld/COFF/Driver.h
index 5a9bd5c6d9682..b500ac8bba569 100644
--- a/lld/COFF/Driver.h
+++ b/lld/COFF/Driver.h
@@ -214,6 +214,8 @@ class LinkerDriver {
void parsePDBPageSize(StringRef);
void parseSection(StringRef);
+ void parseSameAddress(StringRef);
+
// Parses a MS-DOS stub file
void parseDosStub(StringRef path);
diff --git a/lld/COFF/DriverUtils.cpp b/lld/COFF/DriverUtils.cpp
index d8b41c7f45400..dc4039f116f25 100644
--- a/lld/COFF/DriverUtils.cpp
+++ b/lld/COFF/DriverUtils.cpp
@@ -328,6 +328,22 @@ void LinkerDriver::parseSwaprun(StringRef arg) {
} while (!arg.empty());
}
+void LinkerDriver::parseSameAddress(StringRef arg) {
+ auto mangledName = getArm64ECMangledFunctionName(arg);
+ Symbol *sym = ctx.symtab.addUndefined(mangledName ? *mangledName : arg);
+
+ // MSVC appears to generate thunks even for non-hybrid ARM64EC images.
+ // As a side effect, the native symbol is pulled in. Since this is used
+ // in the CRT for thread-local constructors, it results in the image
+ // containing unnecessary native code. As these thunks don't appear to
+ // be useful, we limit this behavior to actual hybrid targets. This may
+ // change if compatibility becomes necessary.
+ if (ctx.config.machine != ARM64X)
+ return;
+ Symbol *nativeSym = ctx.hybridSymtab->addUndefined(arg);
+ ctx.config.sameAddresses.emplace_back(sym, nativeSym);
+}
+
// An RAII temporary file class that automatically removes a temporary file.
namespace {
class TemporaryFile {
diff --git a/lld/COFF/MarkLive.cpp b/lld/COFF/MarkLive.cpp
index f40810c6805aa..78f5030e8fc2b 100644
--- a/lld/COFF/MarkLive.cpp
+++ b/lld/COFF/MarkLive.cpp
@@ -49,7 +49,10 @@ void markLive(COFFLinkerContext &ctx) {
addSym(file->impchkThunk->exitThunk);
};
- addSym = [&](Symbol *b) {
+ addSym = [&](Symbol *s) {
+ Defined *b = s->getDefined();
+ if (!b)
+ return;
if (auto *sym = dyn_cast<DefinedRegular>(b)) {
enqueue(sym->getChunk());
} else if (auto *sym = dyn_cast<DefinedImportData>(b)) {
diff --git a/lld/COFF/Options.td b/lld/COFF/Options.td
index 0d66b49a4fdb8..2c393cc94b5e3 100644
--- a/lld/COFF/Options.td
+++ b/lld/COFF/Options.td
@@ -31,6 +31,9 @@ multiclass B_priv<string name> {
def align : P<"align", "Section alignment">;
def aligncomm : P<"aligncomm", "Set common symbol alignment">;
def alternatename : P<"alternatename", "Define weak alias">;
+def arm64xsameaddress
+ : P<"arm64xsameaddress", "Generate a thunk for the symbol with the same "
+ "address in both native and EC views on ARM64X.">;
def base : P<"base", "Base address of the program">;
def color_diagnostics: Flag<["--"], "color-diagnostics">,
HelpText<"Alias for --color-diagnostics=always">;
@@ -373,4 +376,3 @@ def tlbid : P_priv<"tlbid">;
def tlbout : P_priv<"tlbout">;
def verbose_all : P_priv<"verbose">;
def guardsym : P_priv<"guardsym">;
-def arm64xsameaddress : P_priv<"arm64xsameaddress">;
diff --git a/lld/COFF/SymbolTable.cpp b/lld/COFF/SymbolTable.cpp
index 189e75dfc3ff5..de04cdff6483d 100644
--- a/lld/COFF/SymbolTable.cpp
+++ b/lld/COFF/SymbolTable.cpp
@@ -452,7 +452,7 @@ void SymbolTable::reportUnresolvable() {
reportProblemSymbols(undefs, /*localImports=*/nullptr, true);
}
-void SymbolTable::resolveRemainingUndefines() {
+void SymbolTable::resolveRemainingUndefines(std::vector<Undefined *> &aliases) {
llvm::TimeTraceScope timeScope("Resolve remaining undefined symbols");
SmallPtrSet<Symbol *, 8> undefs;
DenseMap<Symbol *, Symbol *> localImports;
@@ -468,8 +468,10 @@ void SymbolTable::resolveRemainingUndefines() {
StringRef name = undef->getName();
// A weak alias may have been resolved, so check for that.
- if (undef->resolveWeakAlias())
+ if (undef->getWeakAlias()) {
+ aliases.push_back(undef);
continue;
+ }
// If we can resolve a symbol by removing __imp_ prefix, do that.
// This odd rule is for compatibility with MSVC linker.
@@ -620,10 +622,10 @@ void SymbolTable::initializeECThunks() {
return;
for (auto it : entryThunks) {
- auto *to = dyn_cast<Defined>(it.second);
+ Defined *to = it.second->getDefined();
if (!to)
continue;
- auto *from = dyn_cast<DefinedRegular>(it.first);
+ auto *from = dyn_cast_or_null<DefinedRegular>(it.first->getDefined());
// We need to be able to add padding to the function and fill it with an
// offset to its entry thunks. To ensure that padding the function is
// feasible, functions are required to be COMDAT symbols with no offset.
@@ -642,7 +644,8 @@ void SymbolTable::initializeECThunks() {
Symbol *sym = exitThunks.lookup(file->thunkSym);
if (!sym)
sym = exitThunks.lookup(file->impECSym);
- file->impchkThunk->exitThunk = dyn_cast_or_null<Defined>(sym);
+ if (sym)
+ file->impchkThunk->exitThunk = sym->getDefined();
}
// On ARM64EC, the __imp_ symbol references the auxiliary IAT, while the
@@ -659,6 +662,35 @@ void SymbolTable::initializeECThunks() {
});
}
+void SymbolTable::initializeSameAddressThunks() {
+ for (auto iter : ctx.config.sameAddresses) {
+ auto sym = dyn_cast_or_null<DefinedRegular>(iter.first->getDefined());
+ if (!sym || !sym->isLive())
+ continue;
+ auto nativeSym =
+ dyn_cast_or_null<DefinedRegular>(iter.second->getDefined());
+ if (!nativeSym || !nativeSym->isLive())
+ continue;
+ Defined *entryThunk = sym->getChunk()->getEntryThunk();
+ if (!entryThunk)
+ continue;
+
+ // Replace symbols with symbols referencing the thunk. Store the original
+ // symbol as equivalent DefinedSynthetic instances for use in the thunk
+ // itself.
+ auto symClone = make<DefinedSynthetic>(sym->getName(), sym->getChunk(),
+ sym->getValue());
+ auto nativeSymClone = make<DefinedSynthetic>(
+ nativeSym->getName(), nativeSym->getChunk(), nativeSym->getValue());
+ SameAddressThunkARM64EC *thunk =
+ make<SameAddressThunkARM64EC>(nativeSymClone, symClone, entryThunk);
+ sameAddressThunks.push_back(thunk);
+
+ replaceSymbol<DefinedSynthetic>(sym, sym->getName(), thunk);
+ replaceSymbol<DefinedSynthetic>(nativeSym, nativeSym->getName(), thunk);
+ }
+}
+
Symbol *SymbolTable::addUndefined(StringRef name, InputFile *f,
bool overrideLazy) {
auto [s, wasInserted] = insert(name, f);
diff --git a/lld/COFF/SymbolTable.h b/lld/COFF/SymbolTable.h
index 7eb067640dc85..aadd366c7d39f 100644
--- a/lld/COFF/SymbolTable.h
+++ b/lld/COFF/SymbolTable.h
@@ -31,6 +31,7 @@ class DefinedAbsolute;
class DefinedRegular;
class ImportThunkChunk;
class LazyArchive;
+class SameAddressThunkARM64EC;
class SectionChunk;
class Symbol;
@@ -67,7 +68,7 @@ class SymbolTable {
// Try to resolve any undefined symbols and update the symbol table
// accordingly, then print an error message for any remaining undefined
// symbols and warn about imported local symbols.
- void resolveRemainingUndefines();
+ void resolveRemainingUndefines(std::vector<Undefined *> &aliases);
// Try to resolve undefined symbols with alternate names.
void resolveAlternateNames();
@@ -140,6 +141,7 @@ class SymbolTable {
void addEntryThunk(Symbol *from, Symbol *to);
void addExitThunk(Symbol *from, Symbol *to);
void initializeECThunks();
+ void initializeSameAddressThunks();
void reportDuplicate(Symbol *existing, InputFile *newFile,
SectionChunk *newSc = nullptr,
@@ -159,6 +161,8 @@ class SymbolTable {
// A list of EC EXP+ symbols.
std::vector<Symbol *> expSymbols;
+ std::vector<SameAddressThunkARM64EC *> sameAddressThunks;
+
// A list of DLL exports.
std::vector<Export> exports;
llvm::DenseSet<StringRef> directivesExports;
diff --git a/lld/COFF/Symbols.cpp b/lld/COFF/Symbols.cpp
index b571ce9ce432c..ba4f95d14bc64 100644
--- a/lld/COFF/Symbols.cpp
+++ b/lld/COFF/Symbols.cpp
@@ -91,6 +91,14 @@ bool Symbol::isLive() const {
return true;
}
+Defined *Symbol::getDefined() {
+ if (auto d = dyn_cast<Defined>(this))
+ return d;
+ if (auto u = dyn_cast<Undefined>(this))
+ return u->getDefinedWeakAlias();
+ return nullptr;
+}
+
void Symbol::replaceKeepingName(Symbol *other, size_t size) {
StringRef origName = getName();
memcpy(this, other, size);
diff --git a/lld/COFF/Symbols.h b/lld/COFF/Symbols.h
index fd3d8cec8c113..c86ded860876b 100644
--- a/lld/COFF/Symbols.h
+++ b/lld/COFF/Symbols.h
@@ -95,6 +95,10 @@ class Symbol {
symbolKind == LazyDLLSymbolKind;
}
+ // Get the Defined symbol associated with this symbol, either itself or its
+ // weak alias.
+ Defined *getDefined();
+
private:
void computeName();
diff --git a/lld/COFF/Writer.cpp b/lld/COFF/Writer.cpp
index 076561807af47..4c7a52eda151b 100644
--- a/lld/COFF/Writer.cpp
+++ b/lld/COFF/Writer.cpp
@@ -314,6 +314,7 @@ class Writer {
uint32_t dataDirOffset64;
OutputSection *textSec;
+ OutputSection *wowthkSec;
OutputSection *hexpthkSec;
OutputSection *bssSec;
OutputSection *rdataSec;
@@ -1076,8 +1077,10 @@ void Writer::createSections() {
// Try to match the section order used by link.exe.
textSec = createSection(".text", code | r | x);
- if (isArm64EC(ctx.config.machine))
+ if (isArm64EC(ctx.config.machine)) {
+ wowthkSec = createSection(".wowthk", code | r | x);
hexpthkSec = createSection(".hexpthk", code | r | x);
+ }
bssSec = createSection(".bss", bss | r | w);
rdataSec = createSection(".rdata", data | r);
buildidSec = createSection(".buildid", data | r);
@@ -2310,6 +2313,18 @@ void Writer::createECChunks() {
ctx.symtab.findUnderscore("__arm64x_redirection_metadata");
replaceSymbol<DefinedSynthetic>(entryPointsSym, entryPointsSym->getName(),
entryPoints);
+
+ for (auto thunk : ctx.symtab.sameAddressThunks) {
+ wowthkSec->insertChunkAtStart(thunk);
+ if (ctx.config.machine == ARM64X) {
+ ...
[truncated]
|
@llvm/pr-subscribers-platform-windows Author: Jacek Caban (cjacek) ChangesFixes MSVC CRT thread-local constructors support on hybrid ARM64X targets.
MSVC appears to generate thunks even for non-hybrid ARM64EC images. As a side effect, the native symbol is pulled in. Since this is used in the CRT for thread-local constructors, it results in the image containing unnecessary native code. Because these thunks do not appear to be useful in that context, we limit this behavior to actual hybrid targets. This may change if compatibility requires it. The tricky part is that thunks should be skipped if the symbol is not live in either view, and symbol replacement must be reflected in weak aliases. This requires thunk generation to happen before resolving weak aliases but after the GC pass. To enable this, the Patch is 28.16 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/151255.diff 14 Files Affected:
diff --git a/lld/COFF/Chunks.cpp b/lld/COFF/Chunks.cpp
index 01752cdc6a9da..b9231b5d9bec5 100644
--- a/lld/COFF/Chunks.cpp
+++ b/lld/COFF/Chunks.cpp
@@ -422,12 +422,6 @@ void SectionChunk::writeTo(uint8_t *buf) const {
applyRelocation(buf + rel.VirtualAddress, rel);
}
-
- // Write the offset to EC entry thunk preceding section contents. The low bit
- // is always set, so it's effectively an offset from the last byte of the
- // offset.
- if (Defined *entryThunk = getEntryThunk())
- write32le(buf - sizeof(uint32_t), entryThunk->getRVA() - rva + 1);
}
void SectionChunk::applyRelocation(uint8_t *off,
@@ -881,6 +875,18 @@ void RangeExtensionThunkARM64::writeTo(uint8_t *buf) const {
applyArm64Imm(buf + 4, target->getRVA() & 0xfff, 0);
}
+void SameAddressThunkARM64EC::setDynamicRelocs(COFFLinkerContext &ctx) const {
+ // Add ARM64X relocations replacing adrp/add instructions with a version using
+ // the hybrid target.
+ RangeExtensionThunkARM64 hybridView(ARM64EC, hybridTarget);
+ uint8_t buf[sizeof(arm64Thunk)];
+ hybridView.setRVA(rva);
+ hybridView.writeTo(buf);
+ ctx.dynamicRelocs->set(this, *reinterpret_cast<uint32_t *>(buf));
+ ctx.dynamicRelocs->set(Arm64XRelocVal(this, sizeof(uint32_t)),
+ *reinterpret_cast<uint32_t *>(buf + sizeof(uint32_t)));
+}
+
LocalImportChunk::LocalImportChunk(COFFLinkerContext &c, Defined *s)
: sym(s), ctx(c) {
setAlignment(ctx.config.wordsize);
@@ -1264,7 +1270,8 @@ void DynamicRelocsChunk::finalize() {
}
// Set the reloc value. The reloc entry must be allocated beforehand.
-void DynamicRelocsChunk::set(uint32_t rva, Arm64XRelocVal value) {
+void DynamicRelocsChunk::set(Arm64XRelocVal offset, Arm64XRelocVal value) {
+ uint32_t rva = offset.get();
auto entry =
llvm::find_if(arm64xRelocs, [rva](const Arm64XDynamicRelocEntry &e) {
return e.offset.get() == rva;
diff --git a/lld/COFF/Chunks.h b/lld/COFF/Chunks.h
index d03a64cc6b812..6d88f5ec73776 100644
--- a/lld/COFF/Chunks.h
+++ b/lld/COFF/Chunks.h
@@ -193,6 +193,8 @@ class NonSectionChunk : public Chunk {
// allowed ranges. Return the additional space required for the extension.
virtual uint32_t extendRanges() { return 0; };
+ virtual Defined *getEntryThunk() const { return nullptr; };
+
static bool classof(const Chunk *c) { return c->kind() >= OtherKind; }
protected:
@@ -633,7 +635,7 @@ class ImportThunkChunkARM64EC : public ImportThunkChunk {
bool verifyRanges() override;
uint32_t extendRanges() override;
- Defined *exitThunk;
+ Defined *exitThunk = nullptr;
Defined *sym = nullptr;
bool extended = false;
@@ -675,6 +677,26 @@ class RangeExtensionThunkARM64 : public NonSectionCodeChunk {
MachineTypes machine;
};
+// A chunk used to guarantee the same address for a function in both views of
+// a hybrid image. Similar to RangeExtensionThunkARM64 chunks, it calls the
+// target symbol using a BR instruction. It also contains an entry thunk for EC
+// compatibility and additional ARM64X relocations that swap targets between
+// views.
+class SameAddressThunkARM64EC : public RangeExtensionThunkARM64 {
+public:
+ explicit SameAddressThunkARM64EC(Defined *t, Defined *hybridTarget,
+ Defined *entryThunk)
+ : RangeExtensionThunkARM64(ARM64EC, t), hybridTarget(hybridTarget),
+ entryThunk(entryThunk) {}
+
+ Defined *getEntryThunk() const override { return entryThunk; }
+ void setDynamicRelocs(COFFLinkerContext &ctx) const;
+
+private:
+ Defined *hybridTarget;
+ Defined *entryThunk;
+};
+
// Windows-specific.
// See comments for DefinedLocalImport class.
class LocalImportChunk : public NonSectionChunk {
@@ -843,13 +865,13 @@ class Arm64XRelocVal {
public:
Arm64XRelocVal(uint64_t value = 0) : value(value) {}
Arm64XRelocVal(Defined *sym, int32_t offset = 0) : sym(sym), value(offset) {}
- Arm64XRelocVal(Chunk *chunk, int32_t offset = 0)
+ Arm64XRelocVal(const Chunk *chunk, int32_t offset = 0)
: chunk(chunk), value(offset) {}
uint64_t get() const;
private:
Defined *sym = nullptr;
- Chunk *chunk = nullptr;
+ const Chunk *chunk = nullptr;
uint64_t value;
};
@@ -884,7 +906,7 @@ class DynamicRelocsChunk : public NonSectionChunk {
arm64xRelocs.emplace_back(type, size, offset, value);
}
- void set(uint32_t rva, Arm64XRelocVal value);
+ void set(Arm64XRelocVal offset, Arm64XRelocVal value);
private:
std::vector<Arm64XDynamicRelocEntry> arm64xRelocs;
@@ -940,6 +962,8 @@ inline bool Chunk::isHotPatchable() const {
inline Defined *Chunk::getEntryThunk() const {
if (auto *c = dyn_cast<const SectionChunkEC>(this))
return c->entryThunk;
+ if (auto *c = dyn_cast<const NonSectionChunk>(this))
+ return c->getEntryThunk();
return nullptr;
}
diff --git a/lld/COFF/Config.h b/lld/COFF/Config.h
index 91b6e632fa7ed..a03bb57641670 100644
--- a/lld/COFF/Config.h
+++ b/lld/COFF/Config.h
@@ -223,6 +223,9 @@ struct Configuration {
StringRef manifestUIAccess = "'false'";
StringRef manifestFile;
+ // used for /arm64xsameaddress
+ std::vector<std::pair<Symbol *, Symbol *>> sameAddresses;
+
// used for /dwodir
StringRef dwoDir;
diff --git a/lld/COFF/Driver.cpp b/lld/COFF/Driver.cpp
index 570b8f9d05906..7580b469e4f87 100644
--- a/lld/COFF/Driver.cpp
+++ b/lld/COFF/Driver.cpp
@@ -500,7 +500,9 @@ void LinkerDriver::parseDirectives(InputFile *file) {
file->symtab.parseAlternateName(arg->getValue());
break;
case OPT_arm64xsameaddress:
- if (!file->symtab.isEC())
+ if (file->symtab.isEC())
+ parseSameAddress(arg->getValue());
+ else
Warn(ctx) << arg->getSpelling()
<< " is not allowed in non-ARM64EC files (" << toString(file)
<< ")";
@@ -1318,13 +1320,9 @@ void LinkerDriver::convertResources() {
}
void LinkerDriver::maybeCreateECExportThunk(StringRef name, Symbol *&sym) {
- Defined *def;
if (!sym)
return;
- if (auto undef = dyn_cast<Undefined>(sym))
- def = undef->getDefinedWeakAlias();
- else
- def = dyn_cast<Defined>(sym);
+ Defined *def = sym->getDefined();
if (!def)
return;
@@ -1356,11 +1354,7 @@ void LinkerDriver::createECExportThunks() {
Symbol *sym = ctx.symtab.find(targetName);
if (!sym)
continue;
- Defined *targetSym;
- if (auto undef = dyn_cast<Undefined>(sym))
- targetSym = undef->getDefinedWeakAlias();
- else
- targetSym = dyn_cast<Defined>(sym);
+ Defined *targetSym = sym->getDefined();
if (!targetSym)
continue;
@@ -2303,6 +2297,13 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
args.filtered(OPT_dependentloadflag, OPT_dependentloadflag_opt))
parseDependentLoadFlags(arg);
+ for (auto *arg : args.filtered(OPT_arm64xsameaddress)) {
+ if (ctx.hybridSymtab)
+ parseSameAddress(arg->getValue());
+ else
+ Warn(ctx) << arg->getSpelling() << " is allowed only on EC targets";
+ }
+
if (tar) {
llvm::TimeTraceScope timeScope("Reproducer: response file");
tar->append(
@@ -2676,12 +2677,46 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
createECExportThunks();
// Resolve remaining undefined symbols and warn about imported locals.
+ std::vector<Undefined *> aliases;
ctx.forEachSymtab(
- [&](SymbolTable &symtab) { symtab.resolveRemainingUndefines(); });
+ [&](SymbolTable &symtab) { symtab.resolveRemainingUndefines(aliases); });
if (errorCount())
return;
+ ctx.forEachActiveSymtab([](SymbolTable &symtab) {
+ symtab.initializeECThunks();
+ symtab.initializeLoadConfig();
+ });
+
+ // Identify unreferenced COMDAT sections.
+ if (config->doGC) {
+ if (config->mingw) {
+ // markLive doesn't traverse .eh_frame, but the personality function is
+ // only reached that way. The proper solution would be to parse and
+ // traverse the .eh_frame section, like the ELF linker does.
+ // For now, just manually try to retain the known possible personality
+ // functions. This doesn't bring in more object files, but only marks
+ // functions that already have been included to be retained.
+ ctx.forEachSymtab([&](SymbolTable &symtab) {
+ for (const char *n : {"__gxx_personality_v0", "__gcc_personality_v0",
+ "rust_eh_personality"}) {
+ Defined *d = dyn_cast_or_null<Defined>(symtab.findUnderscore(n));
+ if (d && !d->isGCRoot) {
+ d->isGCRoot = true;
+ config->gcroot.push_back(d);
+ }
+ }
+ });
+ }
+
+ markLive(ctx);
+ }
+
+ ctx.symtab.initializeSameAddressThunks();
+ for (auto alias : aliases)
+ alias->resolveWeakAlias();
+
if (config->mingw) {
// Make sure the crtend.o object is the last object file. This object
// file can contain terminating section chunks that need to be placed
@@ -2773,35 +2808,6 @@ void LinkerDriver::linkerMain(ArrayRef<const char *> argsArr) {
if (auto *arg = args.getLastArg(OPT_print_symbol_order))
config->printSymbolOrder = arg->getValue();
- if (ctx.symtab.isEC())
- ctx.symtab.initializeECThunks();
- ctx.forEachActiveSymtab(
- [](SymbolTable &symtab) { symtab.initializeLoadConfig(); });
-
- // Identify unreferenced COMDAT sections.
- if (config->doGC) {
- if (config->mingw) {
- // markLive doesn't traverse .eh_frame, but the personality function is
- // only reached that way. The proper solution would be to parse and
- // traverse the .eh_frame section, like the ELF linker does.
- // For now, just manually try to retain the known possible personality
- // functions. This doesn't bring in more object files, but only marks
- // functions that already have been included to be retained.
- ctx.forEachSymtab([&](SymbolTable &symtab) {
- for (const char *n : {"__gxx_personality_v0", "__gcc_personality_v0",
- "rust_eh_personality"}) {
- Defined *d = dyn_cast_or_null<Defined>(symtab.findUnderscore(n));
- if (d && !d->isGCRoot) {
- d->isGCRoot = true;
- config->gcroot.push_back(d);
- }
- }
- });
- }
-
- markLive(ctx);
- }
-
// Needs to happen after the last call to addFile().
convertResources();
diff --git a/lld/COFF/Driver.h b/lld/COFF/Driver.h
index 5a9bd5c6d9682..b500ac8bba569 100644
--- a/lld/COFF/Driver.h
+++ b/lld/COFF/Driver.h
@@ -214,6 +214,8 @@ class LinkerDriver {
void parsePDBPageSize(StringRef);
void parseSection(StringRef);
+ void parseSameAddress(StringRef);
+
// Parses a MS-DOS stub file
void parseDosStub(StringRef path);
diff --git a/lld/COFF/DriverUtils.cpp b/lld/COFF/DriverUtils.cpp
index d8b41c7f45400..dc4039f116f25 100644
--- a/lld/COFF/DriverUtils.cpp
+++ b/lld/COFF/DriverUtils.cpp
@@ -328,6 +328,22 @@ void LinkerDriver::parseSwaprun(StringRef arg) {
} while (!arg.empty());
}
+void LinkerDriver::parseSameAddress(StringRef arg) {
+ auto mangledName = getArm64ECMangledFunctionName(arg);
+ Symbol *sym = ctx.symtab.addUndefined(mangledName ? *mangledName : arg);
+
+ // MSVC appears to generate thunks even for non-hybrid ARM64EC images.
+ // As a side effect, the native symbol is pulled in. Since this is used
+ // in the CRT for thread-local constructors, it results in the image
+ // containing unnecessary native code. As these thunks don't appear to
+ // be useful, we limit this behavior to actual hybrid targets. This may
+ // change if compatibility becomes necessary.
+ if (ctx.config.machine != ARM64X)
+ return;
+ Symbol *nativeSym = ctx.hybridSymtab->addUndefined(arg);
+ ctx.config.sameAddresses.emplace_back(sym, nativeSym);
+}
+
// An RAII temporary file class that automatically removes a temporary file.
namespace {
class TemporaryFile {
diff --git a/lld/COFF/MarkLive.cpp b/lld/COFF/MarkLive.cpp
index f40810c6805aa..78f5030e8fc2b 100644
--- a/lld/COFF/MarkLive.cpp
+++ b/lld/COFF/MarkLive.cpp
@@ -49,7 +49,10 @@ void markLive(COFFLinkerContext &ctx) {
addSym(file->impchkThunk->exitThunk);
};
- addSym = [&](Symbol *b) {
+ addSym = [&](Symbol *s) {
+ Defined *b = s->getDefined();
+ if (!b)
+ return;
if (auto *sym = dyn_cast<DefinedRegular>(b)) {
enqueue(sym->getChunk());
} else if (auto *sym = dyn_cast<DefinedImportData>(b)) {
diff --git a/lld/COFF/Options.td b/lld/COFF/Options.td
index 0d66b49a4fdb8..2c393cc94b5e3 100644
--- a/lld/COFF/Options.td
+++ b/lld/COFF/Options.td
@@ -31,6 +31,9 @@ multiclass B_priv<string name> {
def align : P<"align", "Section alignment">;
def aligncomm : P<"aligncomm", "Set common symbol alignment">;
def alternatename : P<"alternatename", "Define weak alias">;
+def arm64xsameaddress
+ : P<"arm64xsameaddress", "Generate a thunk for the symbol with the same "
+ "address in both native and EC views on ARM64X.">;
def base : P<"base", "Base address of the program">;
def color_diagnostics: Flag<["--"], "color-diagnostics">,
HelpText<"Alias for --color-diagnostics=always">;
@@ -373,4 +376,3 @@ def tlbid : P_priv<"tlbid">;
def tlbout : P_priv<"tlbout">;
def verbose_all : P_priv<"verbose">;
def guardsym : P_priv<"guardsym">;
-def arm64xsameaddress : P_priv<"arm64xsameaddress">;
diff --git a/lld/COFF/SymbolTable.cpp b/lld/COFF/SymbolTable.cpp
index 189e75dfc3ff5..de04cdff6483d 100644
--- a/lld/COFF/SymbolTable.cpp
+++ b/lld/COFF/SymbolTable.cpp
@@ -452,7 +452,7 @@ void SymbolTable::reportUnresolvable() {
reportProblemSymbols(undefs, /*localImports=*/nullptr, true);
}
-void SymbolTable::resolveRemainingUndefines() {
+void SymbolTable::resolveRemainingUndefines(std::vector<Undefined *> &aliases) {
llvm::TimeTraceScope timeScope("Resolve remaining undefined symbols");
SmallPtrSet<Symbol *, 8> undefs;
DenseMap<Symbol *, Symbol *> localImports;
@@ -468,8 +468,10 @@ void SymbolTable::resolveRemainingUndefines() {
StringRef name = undef->getName();
// A weak alias may have been resolved, so check for that.
- if (undef->resolveWeakAlias())
+ if (undef->getWeakAlias()) {
+ aliases.push_back(undef);
continue;
+ }
// If we can resolve a symbol by removing __imp_ prefix, do that.
// This odd rule is for compatibility with MSVC linker.
@@ -620,10 +622,10 @@ void SymbolTable::initializeECThunks() {
return;
for (auto it : entryThunks) {
- auto *to = dyn_cast<Defined>(it.second);
+ Defined *to = it.second->getDefined();
if (!to)
continue;
- auto *from = dyn_cast<DefinedRegular>(it.first);
+ auto *from = dyn_cast_or_null<DefinedRegular>(it.first->getDefined());
// We need to be able to add padding to the function and fill it with an
// offset to its entry thunks. To ensure that padding the function is
// feasible, functions are required to be COMDAT symbols with no offset.
@@ -642,7 +644,8 @@ void SymbolTable::initializeECThunks() {
Symbol *sym = exitThunks.lookup(file->thunkSym);
if (!sym)
sym = exitThunks.lookup(file->impECSym);
- file->impchkThunk->exitThunk = dyn_cast_or_null<Defined>(sym);
+ if (sym)
+ file->impchkThunk->exitThunk = sym->getDefined();
}
// On ARM64EC, the __imp_ symbol references the auxiliary IAT, while the
@@ -659,6 +662,35 @@ void SymbolTable::initializeECThunks() {
});
}
+void SymbolTable::initializeSameAddressThunks() {
+ for (auto iter : ctx.config.sameAddresses) {
+ auto sym = dyn_cast_or_null<DefinedRegular>(iter.first->getDefined());
+ if (!sym || !sym->isLive())
+ continue;
+ auto nativeSym =
+ dyn_cast_or_null<DefinedRegular>(iter.second->getDefined());
+ if (!nativeSym || !nativeSym->isLive())
+ continue;
+ Defined *entryThunk = sym->getChunk()->getEntryThunk();
+ if (!entryThunk)
+ continue;
+
+ // Replace symbols with symbols referencing the thunk. Store the original
+ // symbol as equivalent DefinedSynthetic instances for use in the thunk
+ // itself.
+ auto symClone = make<DefinedSynthetic>(sym->getName(), sym->getChunk(),
+ sym->getValue());
+ auto nativeSymClone = make<DefinedSynthetic>(
+ nativeSym->getName(), nativeSym->getChunk(), nativeSym->getValue());
+ SameAddressThunkARM64EC *thunk =
+ make<SameAddressThunkARM64EC>(nativeSymClone, symClone, entryThunk);
+ sameAddressThunks.push_back(thunk);
+
+ replaceSymbol<DefinedSynthetic>(sym, sym->getName(), thunk);
+ replaceSymbol<DefinedSynthetic>(nativeSym, nativeSym->getName(), thunk);
+ }
+}
+
Symbol *SymbolTable::addUndefined(StringRef name, InputFile *f,
bool overrideLazy) {
auto [s, wasInserted] = insert(name, f);
diff --git a/lld/COFF/SymbolTable.h b/lld/COFF/SymbolTable.h
index 7eb067640dc85..aadd366c7d39f 100644
--- a/lld/COFF/SymbolTable.h
+++ b/lld/COFF/SymbolTable.h
@@ -31,6 +31,7 @@ class DefinedAbsolute;
class DefinedRegular;
class ImportThunkChunk;
class LazyArchive;
+class SameAddressThunkARM64EC;
class SectionChunk;
class Symbol;
@@ -67,7 +68,7 @@ class SymbolTable {
// Try to resolve any undefined symbols and update the symbol table
// accordingly, then print an error message for any remaining undefined
// symbols and warn about imported local symbols.
- void resolveRemainingUndefines();
+ void resolveRemainingUndefines(std::vector<Undefined *> &aliases);
// Try to resolve undefined symbols with alternate names.
void resolveAlternateNames();
@@ -140,6 +141,7 @@ class SymbolTable {
void addEntryThunk(Symbol *from, Symbol *to);
void addExitThunk(Symbol *from, Symbol *to);
void initializeECThunks();
+ void initializeSameAddressThunks();
void reportDuplicate(Symbol *existing, InputFile *newFile,
SectionChunk *newSc = nullptr,
@@ -159,6 +161,8 @@ class SymbolTable {
// A list of EC EXP+ symbols.
std::vector<Symbol *> expSymbols;
+ std::vector<SameAddressThunkARM64EC *> sameAddressThunks;
+
// A list of DLL exports.
std::vector<Export> exports;
llvm::DenseSet<StringRef> directivesExports;
diff --git a/lld/COFF/Symbols.cpp b/lld/COFF/Symbols.cpp
index b571ce9ce432c..ba4f95d14bc64 100644
--- a/lld/COFF/Symbols.cpp
+++ b/lld/COFF/Symbols.cpp
@@ -91,6 +91,14 @@ bool Symbol::isLive() const {
return true;
}
+Defined *Symbol::getDefined() {
+ if (auto d = dyn_cast<Defined>(this))
+ return d;
+ if (auto u = dyn_cast<Undefined>(this))
+ return u->getDefinedWeakAlias();
+ return nullptr;
+}
+
void Symbol::replaceKeepingName(Symbol *other, size_t size) {
StringRef origName = getName();
memcpy(this, other, size);
diff --git a/lld/COFF/Symbols.h b/lld/COFF/Symbols.h
index fd3d8cec8c113..c86ded860876b 100644
--- a/lld/COFF/Symbols.h
+++ b/lld/COFF/Symbols.h
@@ -95,6 +95,10 @@ class Symbol {
symbolKind == LazyDLLSymbolKind;
}
+ // Get the Defined symbol associated with this symbol, either itself or its
+ // weak alias.
+ Defined *getDefined();
+
private:
void computeName();
diff --git a/lld/COFF/Writer.cpp b/lld/COFF/Writer.cpp
index 076561807af47..4c7a52eda151b 100644
--- a/lld/COFF/Writer.cpp
+++ b/lld/COFF/Writer.cpp
@@ -314,6 +314,7 @@ class Writer {
uint32_t dataDirOffset64;
OutputSection *textSec;
+ OutputSection *wowthkSec;
OutputSection *hexpthkSec;
OutputSection *bssSec;
OutputSection *rdataSec;
@@ -1076,8 +1077,10 @@ void Writer::createSections() {
// Try to match the section order used by link.exe.
textSec = createSection(".text", code | r | x);
- if (isArm64EC(ctx.config.machine))
+ if (isArm64EC(ctx.config.machine)) {
+ wowthkSec = createSection(".wowthk", code | r | x);
hexpthkSec = createSection(".hexpthk", code | r | x);
+ }
bssSec = createSection(".bss", bss | r | w);
rdataSec = createSection(".rdata", data | r);
buildidSec = createSection(".buildid", data | r);
@@ -2310,6 +2313,18 @@ void Writer::createECChunks() {
ctx.symtab.findUnderscore("__arm64x_redirection_metadata");
replaceSymbol<DefinedSynthetic>(entryPointsSym, entryPointsSym->getName(),
entryPoints);
+
+ for (auto thunk : ctx.symtab.sameAddressThunks) {
+ wowthkSec->insertChunkAtStart(thunk);
+ if (ctx.config.machine == ARM64X) {
+ ...
[truncated]
|
// containing unnecessary native code. As these thunks don't appear to | ||
// be useful, we limit this behavior to actual hybrid targets. This may | ||
// change if compatibility becomes necessary. | ||
if (ctx.config.machine != ARM64X) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This option can be handled on the command line, too, right? In that case, the machine probably isn't known yet (unless explicitly specified)? Then again, arm64x mode isn't ever really implied, it is always specified explicitly with -machine:arm64x
, so this is probably fine. (Plus, this is mostly passed in practice through directives in object files in practice anyway, I guess.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it can be handled via the command line, but as you said, arm64x needs to be specified explicitly (as does arm64ec; see #116281), so the check should be safe. And yes, I would expect it to be passed through directives anyway (I don't expect it to be widely used in application code, but its use in the CRT itself makes it relevant).
|
||
ctx.symtab.initializeSameAddressThunks(); | ||
for (auto alias : aliases) | ||
alias->resolveWeakAlias(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if resolveWeakAlias()
fails? Previously that would cause resolveRemainingUndefines
to go on to do other things, but those are entirely skipped, if the symbo had a weak alias now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
resolveRemainingUndefines
checks whether a symbol is resolvable before storing it in the vector. We know that these symbols will not become unresolvable before reaching this code. Replacing symbols for same-address handling cannot make a resolvable alias unresolvable, nor can it make an unresolvable alias resolvable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, so you mean that as long as undef->getWeakAlias()
returns non-null, we can be sure that the later alias->resolveWeakAlias()
also will succeed? Then this should be fine.
lld/COFF/Writer.cpp
Outdated
@@ -2310,6 +2313,18 @@ void Writer::createECChunks() { | |||
ctx.symtab.findUnderscore("__arm64x_redirection_metadata"); | |||
replaceSymbol<DefinedSynthetic>(entryPointsSym, entryPointsSym->getName(), | |||
entryPoints); | |||
|
|||
for (auto thunk : ctx.symtab.sameAddressThunks) { | |||
wowthkSec->insertChunkAtStart(thunk); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that for each thunk, we insert it at the start, i.e. adding them in reverse order? That seems both a bit odd and inefficient. Wouldn't wowthkSec
be empty here and we could just append to it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wowthkSec
is also used by thunks provided by object files, so they would be inserted after those. I did it this way because it more closely matches MSVC's behavior (in terms of placement, not necessarily order). I moved the insertion into createSection
, where wowthkSec
is still empty. We also add some other chunks there.
|
||
if (errorCount()) | ||
return; | ||
|
||
ctx.forEachActiveSymtab([](SymbolTable &symtab) { | ||
symtab.initializeECThunks(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously, initializeECThunks()
only was called on one out of two symtabs - now it's called on both. Presumably just for code simplicity here, or is it a functional change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
initializeECThunks()
already checks the machine type, so the previous check was redundant.
Thanks for the review! I’ve pushed a new version that addresses the comments. I also removed one more arm64x check, since those chunks are currently only generated in that case. It was a leftover from an earlier version, where I generated the chunks for arm64ec as well (like MSVC). That approach ended up being worse than a no-op, since it pulled in additional, unnecessary code. I’ve removed it for now, but I’m happy to revisit if we find a case where it’s needed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I don't think I have further concerns here then, and I believe all of my comments were addressed.
Fixes MSVC CRT thread-local constructors support on hybrid ARM64X targets. -arm64xsameaddress is an undocumented option that ensures the specified function has the same address in both native and EC views of hybrid images. To achieve this, the linker emits additional thunks and replaces the symbols of those functions with the thunk symbol (the same thunk is used in both views). The thunk code jumps to the native function (similar to range extension thunks), but additional ARM64X relocations are emitted to replace the target with the EC function in the EC view. MSVC appears to generate thunks even for non-hybrid ARM64EC images. As a side effect, the native symbol is pulled in. Since this is used in the CRT for thread-local constructors, it results in the image containing unnecessary native code. Because these thunks do not appear to be useful in that context, we limit this behavior to actual hybrid targets. This may change if compatibility requires it. The tricky part is that thunks should be skipped if the symbol is not live in either view, and symbol replacement must be reflected in weak aliases. This requires thunk generation to happen before resolving weak aliases but after the GC pass. To enable this, the markLive call was moved earlier, and the final weak alias resolution was postponed until afterward. This requires more code to be aware of weak aliases, which previously could assume they were already resolved.
Hi @cjacek, I have been seeing some random crashes in a test on one of my build bots which after bisecting it back, I think this change seems to be causing it. The crash seems to mainly affect Windows (although I have seen it twice I think on my linux bot) with some recent examples from the Windows bot here:
The affected test seems to be
Any idea what might be happening here? |
Thanks for the bisect and the report! I can’t reproduce it, but I believe I see what’s going wrong. We likely store the alias as |
I could reproduce it reliably with an explicit assert and test extended to make the problem more likely. I created #152000 with a fix. |
…152000) Fixes crashes reported in #151255. The alias may have already been stored for later resolution, which can lead to treating a resolved alias as if it were still undefined. Instead, use the alias target directly for the import. Also extended the test to make reproducing the problem more likely, and added an assert that catches the issue.
Fixes MSVC CRT thread-local constructors support on hybrid ARM64X targets.
-arm64xsameaddress
is an undocumented option that ensures the specified function has the same address in both native and EC views of hybrid images. To achieve this, the linker emits additional thunks and replaces the symbolsof those functions with the thunk symbol (the same thunk is used in both views). The thunk code jumps to the native function (similar to range extension thunks), but additional ARM64X relocations are emitted to replace the target with the EC function in the EC view.
MSVC appears to generate thunks even for non-hybrid ARM64EC images. As a side effect, the native symbol is pulled in. Since this is used in the CRT for thread-local constructors, it results in the image containing unnecessary native code. Because these thunks do not appear to be useful in that context, we limit this behavior to actual hybrid targets. This may change if compatibility requires it.
The tricky part is that thunks should be skipped if the symbol is not live in either view, and symbol replacement must be reflected in weak aliases. This requires thunk generation to happen before resolving weak aliases but after the GC pass. To enable this, the
markLive
call was moved earlier, and the final weak alias resolution was postponed until afterward. This requires more code to be aware of weak aliases, which previously could assume they were already resolved.