
Conversation

Muzammiluddin-Syed-ECE
Contributor

The current chip guard fails to prevent scaling_extf/truncf patterns from being applied on gfx1100, which does not have scaling support.

@llvmbot
Member

llvmbot commented Aug 26, 2025

@llvm/pr-subscribers-mlir-gpu
@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-backend-amdgpu

Author: Muzammil (Muzammiluddin-Syed-ECE)

Changes

The current chip guard fails to prevent scaling_extf/truncf patterns from being applied on gfx1100, which does not have scaling support.


Full diff: https://github.com/llvm/llvm-project/pull/155431.diff

1 file affected:

  • (modified) mlir/lib/Conversion/ArithToAMDGPU/ArithToAMDGPU.cpp (+1-1)
diff --git a/mlir/lib/Conversion/ArithToAMDGPU/ArithToAMDGPU.cpp b/mlir/lib/Conversion/ArithToAMDGPU/ArithToAMDGPU.cpp
index 8230591123661..c0c063b4d4923 100644
--- a/mlir/lib/Conversion/ArithToAMDGPU/ArithToAMDGPU.cpp
+++ b/mlir/lib/Conversion/ArithToAMDGPU/ArithToAMDGPU.cpp
@@ -702,7 +702,7 @@ void mlir::arith::populateArithToAMDGPUConversionPatterns(
   if (allowPackedF16Rtz)
     patterns.add<TruncfToFloat16RewritePattern>(patterns.getContext(), benefit);
 
-  if (chipset >= kGfx950) {
+  if (chipset == kGfx950) {
     patterns.add<ScalingExtFRewritePattern>(patterns.getContext(), benefit);
     patterns.add<ScalingTruncFRewritePattern>(patterns.getContext(), benefit);
   }
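
For context on why the original guard admitted gfx1100: the chipset comparison orders chips by their version triple, and 11.0.0 compares greater than 9.5.0 even though the gfx11xx family has no scaled-conversion support. Below is a minimal standalone sketch of that ordering; the Chip struct is illustrative only, not the actual mlir::amdgpu::Chipset type.

#include <iostream>
#include <tuple>

// Illustrative stand-in for a chipset version triple (major, minor, stepping).
struct Chip {
  int major, minor, stepping;
  bool operator>=(const Chip &o) const {
    return std::tie(major, minor, stepping) >=
           std::tie(o.major, o.minor, o.stepping);
  }
  bool operator==(const Chip &o) const {
    return std::tie(major, minor, stepping) ==
           std::tie(o.major, o.minor, o.stepping);
  }
};

int main() {
  Chip gfx950{9, 5, 0};   // has the scaled extf/truncf intrinsics
  Chip gfx1100{11, 0, 0}; // does not
  std::cout << (gfx1100 >= gfx950) << "\n"; // 1: the old guard admits gfx1100
  std::cout << (gfx1100 == gfx950) << "\n"; // 0: the new guard rejects it
}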


github-actions bot commented Aug 26, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

Signed-off-by: Muzammiluddin Syed <[email protected]>
Contributor

@arsenm arsenm left a comment


Missing test

Signed-off-by: Muzammiluddin Syed <[email protected]>
@Muzammiluddin-Syed-ECE
Contributor Author

Muzammiluddin-Syed-ECE commented Sep 3, 2025

@arsenm I've added a minimal test that checks that the extf/truncf aren't expanded on gfx1100 for scalar types.

@@ -241,6 +242,9 @@ func.func @conversion_broadcast(%in: vector<4xf8E5M2>, %scale: f8E8M0FNU) -> vec

// -----

// CHECK-GFX1100-LABEL: @conversion_scalar
// CHECK-GFX1100: arith.scaling_extf
Contributor


I don't understand the operations involved here. Why does this need target knowledge, to emit something that isn't a target operation? Why isn't there target legalization?

Contributor Author


There does not exist support for scaled mxfp4 types on this target (gfx1100), so there is no viable legalization IIUC (cc @krzysz00).

The main reason this change is necessary is to avoid running the scaled extf/truncf rewrites on unsupported chips when calling populateArithToAMDGPUConversionPatterns (see here).

However, this check does not need to live in ArithToAMDGPU.cpp. If preferred, I can create a separate populateConversionPatterns function in the same vein as Math/Transforms, where we pass in a list of the ops we are interested in expanding, moving the check to the caller rather than keeping it here (see the sketch after the quoted declaration below).

/// Adds patterns to expand math operations into other more fundamental
/// operations. For example, hyperbolic functions are expanded into expressions
/// using `exp`. If `opMnemonics` is empty then all available patterns will be
/// added, otherwise only the patterns corresponding to ops in `opMnemonics`
/// will be added to the set.
void populateExpansionPatterns(RewritePatternSet &patterns,
                               ArrayRef<StringRef> opMnemonics = {});
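
For illustration, a hypothetical sketch of what such a caller-driven entry point could look like for ArithToAMDGPU. Neither this function name nor the opMnemonics filter exists in the tree; only the pattern class names are taken from the diff above, and the usual includes from ArithToAMDGPU.cpp (RewritePatternSet, PatternBenefit, ArrayRef, StringRef, llvm::is_contained) are assumed.

// Hypothetical variant: the caller names the arith ops it wants rewritten,
// and the chipset check moves to the call site. An empty opMnemonics list
// means "add every available pattern".
void populateArithToAMDGPUScalingPatterns(RewritePatternSet &patterns,
                                          ArrayRef<StringRef> opMnemonics,
                                          PatternBenefit benefit) {
  auto wants = [&](StringRef name) {
    return opMnemonics.empty() || llvm::is_contained(opMnemonics, name);
  };
  if (wants("arith.scaling_extf"))
    patterns.add<ScalingExtFRewritePattern>(patterns.getContext(), benefit);
  if (wants("arith.scaling_truncf"))
    patterns.add<ScalingTruncFRewritePattern>(patterns.getContext(), benefit);
}

// The caller would then request these rewrites only on chips that support them:
//   if (chipset == kGfx950)
//     populateArithToAMDGPUScalingPatterns(
//         patterns, {"arith.scaling_extf", "arith.scaling_truncf"}, benefit);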

Contributor


So, these rewrite patterns are specifically for "lower arith operations to target intrinsics if they exist". If the target intrinsic doesn't exist, these patterns shouldn't run, and later patterns (the "generic" expansion over in ExpandArithOps) will run instead.

So, the main point of the test here is to ensure that we don't emit intrinsic calls that can't be fulfilled, or rewrite things into a more complex form that only makes sense if you're targeting intrinsics (the amdgpu.* operations are intrinsic wrappers that provide somewhat higher-level APIs).

I think these target checks are fine right where they are, especially since this lowering already has to do a decent number of target checks (for example, which FP8 formats get lowered to intrinsic calls)
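
To make that policy concrete, a small hedged sketch of the guard shape it implies; hasScaledConversionIntrinsics is a made-up helper, while the pattern class and kGfx950 names come from the diff in this PR.

// Hypothetical helper capturing the policy: add the intrinsic-backed rewrites
// only when the chip has the instructions; otherwise leave arith.scaling_extf /
// arith.scaling_truncf untouched so the later generic expansion lowers them.
static bool hasScaledConversionIntrinsics(Chipset chipset) {
  return chipset == kGfx950; // currently the only chip with these intrinsics
}

void populateScalingPatternsIfSupported(RewritePatternSet &patterns,
                                        Chipset chipset,
                                        PatternBenefit benefit) {
  if (!hasScaledConversionIntrinsics(chipset))
    return; // no-op on e.g. gfx1100; the generic expansion patterns take over
  patterns.add<ScalingExtFRewritePattern, ScalingTruncFRewritePattern>(
      patterns.getContext(), benefit);
}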

Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arith.scaling_extf isn't a target-specific operation, but, if it has an implementation on a given chipset, this pass lowers it to that target-specific implementation. Hence the test checks that, on gfx1100, this pass is a no-op (because other passes introduce a less efficient lowering).

Contributor


(If there were an easy, low-dependency way to query the CPU features - that is, if we substantially refactored LLVM - these target checks could be checks on the same flags LLVM uses to test for the ultimate presence of the intrinsics. But there isn't, so chipset checks it is.)


@krzysz00
Contributor

krzysz00 commented Sep 8, 2025

I'm open to landing this through the changes-requested flag, since the tests have been added.

@arsenm ping if you'd still like to hold this up
