Skip to content

Conversation

trenouf
Copy link
Collaborator

@trenouf trenouf commented Jul 10, 2025

We don't want that padding in a module that only contains data, not code.

Also fix MCSection::hasInstructions() so it works with the asm streamer too.

We don't want that padding in a module that only contains data, not
code.
@llvmbot
Copy link
Member

llvmbot commented Jul 10, 2025

@llvm/pr-subscribers-mc

@llvm/pr-subscribers-backend-amdgpu

Author: Tim Renouf (trenouf)

Changes

We don't want that padding in a module that only contains data, not code.


Full diff: https://github.com/llvm/llvm-project/pull/147980.diff

1 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp (+3-1)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
index 749b9efc81378..3765c1d2b106c 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp
@@ -486,10 +486,12 @@ bool AMDGPUAsmPrinter::doFinalization(Module &M) {
   // Pad with s_code_end to help tools and guard against instruction prefetch
   // causing stale data in caches. Arguably this should be done by the linker,
   // which is why this isn't done for Mesa.
+  // Don't do it if there are no functions.
   const MCSubtargetInfo &STI = *getGlobalSTI();
   if ((AMDGPU::isGFX10Plus(STI) || AMDGPU::isGFX90A(STI)) &&
       (STI.getTargetTriple().getOS() == Triple::AMDHSA ||
-       STI.getTargetTriple().getOS() == Triple::AMDPAL)) {
+       STI.getTargetTriple().getOS() == Triple::AMDPAL) &&
+      !M.empty()) {
     OutStreamer->switchSection(getObjFileLowering().getTextSection());
     getTargetStreamer()->EmitCodeEnd(STI);
   }

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Testcase?

(STI.getTargetTriple().getOS() == Triple::AMDHSA ||
STI.getTargetTriple().getOS() == Triple::AMDPAL)) {
STI.getTargetTriple().getOS() == Triple::AMDPAL) &&
!M.empty()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable but I wonder if there's a better way to test "did we emit any code?". E.g. what if the module contains some function declarations but no definitions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Plus this may misdetect available_externally functions which will be dropped

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed to using MCSection::hasInstructions(), which I had to fix to work with the asm streamer.

Also added test. The positive case (when it does add padding) already has a test.

@llvmbot llvmbot added the llvm:mc Machine (object) code label Jul 11, 2025
@trenouf trenouf changed the title AMDGPU: Don't pad .text with s_code_end if it would otherwise be empty MC,AMDGPU: Don't pad .text with s_code_end if it would otherwise be empty Jul 11, 2025
@trenouf
Copy link
Collaborator Author

trenouf commented Jul 22, 2025

@jayfoad @arsenm Is this ok now?

@@ -0,0 +1,9 @@
; Test that there is no s_code_end padding if .text is othertwise empty.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
; Test that there is no s_code_end padding if .text is othertwise empty.
; Test that there is no s_code_end padding if .text is otherwise empty.

@trenouf trenouf merged commit e99c565 into llvm:main Aug 6, 2025
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:AMDGPU llvm:mc Machine (object) code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants