-
Notifications
You must be signed in to change notification settings - Fork 14.7k
ELF,SystemZ: Don't sort relocations for TLS GD/LD optimization; support CREL #149640
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
ELF,SystemZ: Don't sort relocations for TLS GD/LD optimization; support CREL #149640
Conversation
Created using spr 1.3.5-bogner
@llvm/pr-subscribers-lld @llvm/pr-subscribers-lld-elf Author: Fangrui Song (MaskRay) ChangesThe General dynamic and Local Dynamic TLS models call However, sorting relocations does not play well with CREL. Since GNU ld Full diff: https://github.com/llvm/llvm-project/pull/149640.diff 5 Files Affected:
diff --git a/lld/ELF/Arch/SystemZ.cpp b/lld/ELF/Arch/SystemZ.cpp
index a9125806c0952..6771ff01dae58 100644
--- a/lld/ELF/Arch/SystemZ.cpp
+++ b/lld/ELF/Arch/SystemZ.cpp
@@ -23,7 +23,6 @@ namespace {
class SystemZ : public TargetInfo {
public:
SystemZ(Ctx &);
- int getTlsGdRelaxSkip(RelType type) const override;
RelExpr getRelExpr(RelType type, const Symbol &s,
const uint8_t *loc) const override;
RelType getDynRel(RelType type) const override;
@@ -277,29 +276,6 @@ RelExpr SystemZ::adjustTlsExpr(RelType type, RelExpr expr) const {
return expr;
}
-int SystemZ::getTlsGdRelaxSkip(RelType type) const {
- // A __tls_get_offset call instruction is marked with 2 relocations:
- //
- // R_390_TLS_GDCALL / R_390_TLS_LDCALL: marker relocation
- // R_390_PLT32DBL: __tls_get_offset
- //
- // After the relaxation we no longer call __tls_get_offset and should skip
- // both relocations to not create a false dependence on __tls_get_offset
- // being defined.
- //
- // Note that this mechanism only works correctly if the R_390_TLS_[GL]DCALL
- // is seen immediately *before* the R_390_PLT32DBL. Unfortunately, current
- // compilers on the platform will typically generate the inverse sequence.
- // To fix this, we sort relocations by offset in RelocationScanner::scan;
- // this ensures the correct sequence as the R_390_TLS_[GL]DCALL applies to
- // the first byte of the brasl instruction, while the R_390_PLT32DBL applies
- // to its third byte (the relative displacement).
-
- if (type == R_390_TLS_GDCALL || type == R_390_TLS_LDCALL)
- return 2;
- return 1;
-}
-
void SystemZ::relaxTlsGdToIe(uint8_t *loc, const Relocation &rel,
uint64_t val) const {
// The general-dynamic code sequence for a global `x`:
@@ -320,6 +296,9 @@ void SystemZ::relaxTlsGdToIe(uint8_t *loc, const Relocation &rel,
// Relaxing to initial-exec entails:
// 1) Replacing the call by a load from the GOT.
// 2) Replacing the relocation on the constant LC0 by R_390_TLS_GOTIE64.
+ //
+ // While we no longer call __tls_get_offset after optimization, we still
+ // generate an unused PLT entry. This simple behavior matches GNU ld.
switch (rel.type) {
case R_390_TLS_GDCALL:
diff --git a/lld/ELF/Relocations.cpp b/lld/ELF/Relocations.cpp
index 4333b032c9d4e..845ce199b62f2 100644
--- a/lld/ELF/Relocations.cpp
+++ b/lld/ELF/Relocations.cpp
@@ -1657,7 +1657,7 @@ void RelocationScanner::scan(Relocs<RelTy> rels) {
// On SystemZ, all sections need to be sorted by r_offset, to allow TLS
// relaxation to be handled correctly - see SystemZ::getTlsGdRelaxSkip.
SmallVector<RelTy, 0> storage;
- if (isa<EhInputSection>(sec) || ctx.arg.emachine == EM_S390)
+ if (isa<EhInputSection>(sec))
rels = sortRels(rels, storage);
if constexpr (RelTy::IsCrel) {
diff --git a/lld/test/ELF/systemz-plt.s b/lld/test/ELF/systemz-plt.s
index 717343ce4c4d5..1207f0704db8e 100644
--- a/lld/test/ELF/systemz-plt.s
+++ b/lld/test/ELF/systemz-plt.s
@@ -3,7 +3,7 @@
# RUN: llvm-mc -filetype=obj -triple=s390x-unknown-linux %t1.s -o %t1.o
# RUN: ld.lld -shared %t1.o -soname=t1.so -o %t1.so
-# RUN: llvm-mc -filetype=obj -triple=s390x-unknown-linux %s -o %t.o
+# RUN: llvm-mc -filetype=obj -triple=s390x-unknown-linux --crel %s -o %t.o
# RUN: ld.lld %t.o %t1.so -z separate-code -o %t
# RUN: llvm-readelf -S -s -r -x .got.plt %t | FileCheck %s
# RUN: llvm-objdump -d %t | FileCheck --check-prefixes=DIS %s
diff --git a/lld/test/ELF/systemz-tls-gd.s b/lld/test/ELF/systemz-tls-gd.s
index 742797e2d62e4..0042127eea153 100644
--- a/lld/test/ELF/systemz-tls-gd.s
+++ b/lld/test/ELF/systemz-tls-gd.s
@@ -1,6 +1,7 @@
# REQUIRES: systemz
# RUN: llvm-mc -filetype=obj -triple=s390x-unknown-linux %s -o %t.o
-# RUN: echo '.tbss; .globl b, c; b: .zero 4; c:' | llvm-mc -filetype=obj -triple=s390x-unknown-linux - -o %t1.o
+# RUN: echo '.globl __tls_get_offset; __tls_get_offset:; .tbss; .globl b, c; b: .zero 4; c:' | \
+# RUN: llvm-mc -filetype=obj -triple=s390x-unknown-linux - -o %t1.o
# RUN: ld.lld -shared -soname=t1.so %t1.o -o %t1.so
# RUN: ld.lld -shared %t.o %t1.o -o %t.so
@@ -19,12 +20,12 @@
# RUN: llvm-objdump --section .data.rel.ro --full-contents %t.ie | FileCheck --check-prefix=IE-DATA %s
# GD-REL: Relocation section '.rela.dyn' at offset {{.*}} contains 6 entries:
-# GD-REL: 0000000000002570 0000000200000036 R_390_TLS_DTPMOD 0000000000000008 a + 0
-# GD-REL-NEXT: 0000000000002578 0000000200000037 R_390_TLS_DTPOFF 0000000000000008 a + 0
-# GD-REL-NEXT: 0000000000002580 0000000300000036 R_390_TLS_DTPMOD 000000000000000c b + 0
-# GD-REL-NEXT: 0000000000002588 0000000300000037 R_390_TLS_DTPOFF 000000000000000c b + 0
-# GD-REL-NEXT: 0000000000002590 0000000400000036 R_390_TLS_DTPMOD 0000000000000010 c + 0
-# GD-REL-NEXT: 0000000000002598 0000000400000037 R_390_TLS_DTPOFF 0000000000000010 c + 0
+# GD-REL: 0000000000002570 {{.*}} R_390_TLS_DTPMOD 0000000000000008 a + 0
+# GD-REL-NEXT: 0000000000002578 {{.*}} R_390_TLS_DTPOFF 0000000000000008 a + 0
+# GD-REL-NEXT: 0000000000002580 {{.*}} R_390_TLS_DTPMOD 000000000000000c b + 0
+# GD-REL-NEXT: 0000000000002588 {{.*}} R_390_TLS_DTPOFF 000000000000000c b + 0
+# GD-REL-NEXT: 0000000000002590 {{.*}} R_390_TLS_DTPMOD 0000000000000010 c + 0
+# GD-REL-NEXT: 0000000000002598 {{.*}} R_390_TLS_DTPOFF 0000000000000010 c + 0
## _GLOBAL_OFFSET_TABLE is at 0x2558
# GD: larl %r12, 0x2558
@@ -80,33 +81,36 @@
# IE-REL: Relocation section '.rela.dyn' at offset {{.*}} contains 2 entries:
-# IE-REL: 0000000001002430 0000000200000038 R_390_TLS_TPOFF 0000000000000000 b + 0
-# IE-REL-NEXT: 0000000001002438 0000000300000038 R_390_TLS_TPOFF 0000000000000000 c + 0
+# IE-REL: 0000000001002500 {{.*}} R_390_TLS_TPOFF 0000000000000000 b + 0
+# IE-REL-NEXT: 0000000001002508 {{.*}} R_390_TLS_TPOFF 0000000000000000 c + 0
+## Benign false dependency on __tls_get_offset
+# IE-REL: Relocation section '.rela.plt' at offset {{.*}} contains 1
+# IE-REL: R_390_JMP_SLOT 0000000000000000 __tls_get_offset
-## _GLOBAL_OFFSET_TABLE is at 0x1002418
-# IE: larl %r12, 0x1002418
+## _GLOBAL_OFFSET_TABLE
+# IE: larl %r12, 0x10024e8
-## TP offset of a is at 0x1002340
-# IE-NEXT: lgrl %r2, 0x1002340
+## TP offset of a
+# IE-NEXT: lgrl %r2, 0x10023d0
# IE-NEXT: jgnop
# IE-NEXT: lgf %r2, 0(%r2,%r7)
-## GOT offset of the TP offset for b is at 0x1002348
-# IE-NEXT: lgrl %r2, 0x1002348
+## GOT offset of the TP offset for b
+# IE-NEXT: lgrl %r2, 0x10023d8
# IE-NEXT: lg %r2, 0(%r2,%r12)
# IE-NEXT: lgf %r2, 0(%r2,%r7)
-## GOT offset of the TP offset for c is at 0x1002350
-# IE-NEXT: lgrl %r2, 0x1002350
+## GOT offset of the TP offset for c
+# IE-NEXT: lgrl %r2, 0x10023e0
# IE-NEXT: lg %r2, 0(%r2,%r12)
# IE-NEXT: lgf %r2, 0(%r2,%r7)
## TP offsets (a) / GOT offset of TP offsets (b, c)
# a: -4
-# b: 0x1002430 / 0x18
-# c: 0x1002438 / 0x20
-# IE-DATA: 1002340 ffffffff fffffffc 00000000 00000018
-# IE-DATA-NEXT: 1002350 00000000 00000020
+# b: 0x10023d0 / 0x18
+# c: 0x10023e0 / 0x20
+# IE-DATA: 10023d0 ffffffff fffffffc 00000000 00000018
+# IE-DATA-NEXT: 10023e0 00000000 00000020
ear %r7,%a0
diff --git a/lld/test/ELF/systemz-tls-ld.s b/lld/test/ELF/systemz-tls-ld.s
index ef104b82644ce..fc96f3a558a1f 100644
--- a/lld/test/ELF/systemz-tls-ld.s
+++ b/lld/test/ELF/systemz-tls-ld.s
@@ -91,6 +91,9 @@ lgf %r1,0(%r1,%r2)
lgrl %r1, .LC3
lgf %r1,0(%r1,%r2)
+.globl __tls_get_offset
+__tls_get_offset:
+
.section .data.rel.ro,"aw"
.align 8
.LC0:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I'd be OK with this. It would still be nicer if this false dependency could be avoided. Not sure what CREL has to do with relocation sorting: is the problem indeed the swap of these two TLS-related relocs, or is the problem that generic sorting also introduces other changes? If the latter, maybe we could just swap those TLS relocs instead of a generic sorting path? Or simply allow a more generic relocation-skipping mechanism that allows skipping the reloc before a TLS reloc as well as the reloc after the TLS reloc?
lld/ELF/Relocations.cpp
Outdated
@@ -1657,7 +1657,7 @@ void RelocationScanner::scan(Relocs<RelTy> rels) { | |||
// On SystemZ, all sections need to be sorted by r_offset, to allow TLS | |||
// relaxation to be handled correctly - see SystemZ::getTlsGdRelaxSkip. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should remove that comment then as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed!
CREL does not support random access. To allow the GDCALL/PLT32, we would need to decode them to RELA and sort them first... |
Ah, I see. But it should be possible to "peek" at the next relocation, right? So we might add something to the main |
Created using spr 1.3.5-bogner
…rt CREL The General dynamic and Local Dynamic TLS models call `__tls_get_offset`, which is eliminated after optimization. To avoid an unneeded PLT entry and its .dynsym entry, llvm#75643 sorted the relocations. However, sorting relocations does not play well with CREL. Since GNU ld has a false dependency on `__tls_get_offset` as well, let's remove the relocation sorting. Fix llvm#149511 Pull Request: llvm#149640
In I plan not to do anything with with GDCALL/PLT32 skipping because it is a minor thing and GNU ld doesn't skip PLT32DBL... |
The General dynamic and Local Dynamic TLS models call
__tls_get_offset
, which is eliminated after optimization. To avoid anunneeded PLT entry and its .dynsym entry, #75643 sorted the relocations.
However, sorting relocations does not play well with CREL. Since GNU ld
has a false dependency on
__tls_get_offset
as well, let's remove therelocation sorting. Fix #149511