Fix CI after LLVM upgrade #1897

nikic · 2025-08-07T12:48:03Z

This fixes a number of issues after the LLVM upgrade:

- Drop feature gates that are no longer needed (unrelated to the upgrade, but required to build with the current nightly).
- Use intrinsics for some operations on s390x, as these are no longer recognized after rust-lang/rust#145144 ("Stop using uadd.with.overflow"). This is fixed upstream by llvm/llvm-project#153557 ("[SystemZ] Allow forming overflow op for i128").
- Add missing avx512vl features to tests using assert_eq_m128h. This avoids a selection failure (llvm/llvm-project#153570, "[X86] Selection failure with llvm.x86.avx512fp16.mask.mul.sh.round"), but is also necessary for correctness.
- Add some more avx512vl features to tests to work around another selection failure (llvm/llvm-project#154492, "[x86] LLVM ERROR: Cannot select: t28: v8i32 = X86ISD::CVTTP2UI t27"). These should be removed once the issue is fixed.
- Adjust the immediate for the vrndscalepd tests to ensure they actually use scaled rounding.

rustbot · 2025-08-07T12:48:07Z

rustbot has assigned @folkertdev.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

nikic · 2025-08-07T12:58:00Z

Failures like this on s390x:

---- core_arch::s390x::vector::assert_vec_addec_u128_vacccq stdout ----
disassembly for stdarch_test_shim_vec_addec_u128_vacccq: 
	 0: stmg %r14,%r15,112(%r15)
	 1: aghi %r15,-216
	 2: vst %v26,160(%r15),3
	 3: vlgvg %r0,%v28,1
	 4: lghi %r5,1
	 5: ngr %r5,%r0
	 6: la %r2,192(%r15)
	 7: la %r3,176(%r15)
	 8: la %r4,160(%r15)
	 9: vst %v24,176(%r15),3
	10: brasl %r14,458b0 <_ZN4core3num22_$LT$impl$u20$u128$GT$12carrying_add17hf49cfddc246426f1E>
	11: vzero %v24
	12: vleb %v24,208(%r15),15
	13: lmg %r14,%r15,328(%r15)
	14: br %r14
	15: nopr %r7
	16: nopr %r7
	17: nopr %r7

thread 'core_arch::s390x::vector::assert_vec_addec_u128_vacccq' panicked at crates/stdarch-test/src/lib.rs:204:9:
failed to find instruction `vacccq` in the disassembly
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

The carrying_add here is no longer inlined:

stdarch/crates/core_arch/src/s390x/vector.rs

Lines 4719 to 4734 in 97bf36d

    
    /// Vector Add With Carry Compute Carry unsigned 128-bits
    #[inline]
    #[target_feature(enable = "vector")]
    #[unstable(feature = "stdarch_s390x", issue = "135681")]
    #[cfg_attr(test, assert_instr(vacccq))]
    pub unsafe fn vec_addec_u128(
        a: vector_unsigned_char,
        b: vector_unsigned_char,
        c: vector_unsigned_char,
    ) -> vector_unsigned_char {
        let a: u128 = transmute(a);
        let b: u128 = transmute(b);
        let c: u128 = transmute(c);
        let (_d, carry) = a.carrying_add(b, c & 1 != 0);
        transmute(carry as u128)
    }
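For readers unfamiliar with the operation, here is a minimal portable sketch of the value this function is expected to produce: the carry out of `a + b + carry_in`, where the carry-in is the lowest bit of `c`. It uses only the stable `overflowing_add`; the name `addec_carry_out` is hypothetical and not part of stdarch.

```rust
/// Carry out of the 128-bit addition `a + b + (c & 1)`, returned as 0 or 1.
/// Hypothetical helper for illustration; it models the result, not the codegen.
fn addec_carry_out(a: u128, b: u128, c: u128) -> u128 {
    // First partial sum: may wrap, in which case `c1` is true.
    let (sum, c1) = a.overflowing_add(b);
    // Add the carry-in (only the lowest bit of `c` is used).
    let (_sum, c2) = sum.overflowing_add(c & 1);
    // At most one of the two additions can overflow, so OR is sufficient.
    (c1 | c2) as u128
}
```

On s390x the whole computation should ideally collapse into a single `vacccq`, which is what the `assert_instr` check verifies.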

First guess would be that this is a regression from llvm/llvm-project#132976. But I'd expect that to allow more inlining, not less...

nikic · 2025-08-07T13:07:17Z

> First guess would be that this is a regression from llvm/llvm-project#132976. But I'd expect that to allow more inlining, not less...

No, the description in that PR is just incorrect. The implementation that matters here is the BasicTTIImpl one introduced in llvm/llvm-project@e26af09, which implemented a subset check. This one then incorrectly made it a strict equality.
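To illustrate the difference between the two checks, here is a simplified model, not LLVM's actual code. Target features are represented as a bit set, and the constant `VECTOR` and both function names are made up for this sketch. The assumption is that the callee (`carrying_add`, compiled with baseline features) has fewer features than the caller (which enables `vector`).

```rust
/// Hypothetical feature bit; the assignment is arbitrary.
const VECTOR: u32 = 1 << 0;
/// No extra target features enabled.
const BASELINE: u32 = 0;

/// Subset check: inlining is allowed if every callee feature
/// is also enabled in the caller.
fn inline_compatible_subset(caller: u32, callee: u32) -> bool {
    callee & !caller == 0
}

/// Strict equality: inlining is allowed only if both feature sets match.
fn inline_compatible_strict(caller: u32, callee: u32) -> bool {
    caller == callee
}
```

Under this model, a `VECTOR` caller can inline a `BASELINE` callee with the subset check but not with strict equality, which would match the missing inlining seen in the disassembly.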

nikic · 2025-08-07T13:33:43Z

Posted a revert at llvm/llvm-project#152494.

folkertdev · 2025-08-07T13:37:42Z

cc @uweigand

Maybe some of the test coverage we have here should be added as LLVM tests too?

uweigand · 2025-08-08T13:30:17Z

> No, the description in that PR is just incorrect. The implementation that matters here is the BasicTTIImpl one introduced in llvm/llvm-project@e26af09, which implemented a subset check. This one then incorrectly made it a strict equality.

Thanks for catching this!

> Maybe some of the test coverage we have here should be added as LLVM tests too?

As part of @nikic's PR there is now already a test that inlining does happen in this case. Not sure if anything else is necessary.

nikic · 2025-08-14T08:04:34Z

So that's fixed ... but we still get a failure on s390x:

---- core_arch::s390x::vector::assert_vec_addc_u128_vaccq stdout ----
disassembly for stdarch_test_shim_vec_addc_u128_vaccq: 
	 0: vno %v0,%v24,%v24
	 1: veclg %v0,%v26
	 2: jlh 42b06 <stdarch_test_shim_vec_addc_u128_vaccq+0x16>
	 3: vchlgs %v0,%v26,%v0
	 4: ipm %r0
	 5: xilf %r0,268435456
	 6: afi %r0,-268435456
	 7: vlvgp %v0,%r0,%r0
	 8: larl %r1,113628 <anon.0a249e466206e8b1f15729d2ca130c65.271.llvm.7352218685919627983>
	 9: vl %v1,0(%r1),3
	10: vrepib %v2,31
	11: vsrlb %v0,%v0,%v2
	12: vsrl %v0,%v0,%v2
	13: vn %v24,%v0,%v1
	14: br %r14
	15: nopr %r7
	16: nopr %r7
	17: nopr %r7
	18: nopr %r7
	19: nopr %r7
	20: nopr %r7
	21: nopr %r7

From:

stdarch/crates/core_arch/src/s390x/vector.rs

Lines 4691 to 4698 in 97bf36d

    
    pub unsafe fn vec_addc_u128(
        a: vector_unsigned_char,
        b: vector_unsigned_char,
    ) -> vector_unsigned_char {
        let a: u128 = transmute(a);
        let b: u128 = transmute(b);
        transmute(a.overflowing_add(b).1 as u128)
    }

nikic · 2025-08-14T08:09:44Z

I think this one might be due to rust-lang/rust#145144 rather than the LLVM upgrade: https://rust.godbolt.org/z/W9fTGhq1G

nikic · 2025-08-14T08:36:03Z

The problem is the SystemZ shouldFormOverflowOp() hook: https://github.com/llvm/llvm-project/blob/04aebbfbe2b40ee947a00c50a6e1ab62405a6c80/llvm/lib/Target/SystemZ/SystemZISelLowering.h#L522-L527 It only allows forming uaddo for i32 and i64, but not i128.

The SystemZ backend also has a DAGCombine to recognize (a + b) < a independent of that, but not for InstCombine's canonical a > ~b pattern.
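The two patterns mentioned here are equivalent ways of spelling an unsigned overflow check. A small sketch, with hypothetical function names, shows both forms side by side; each should agree with `overflowing_add` for all inputs.

```rust
/// Overflow check in the `(a + b) < a` form: the wrapped sum is smaller
/// than an operand exactly when the addition overflowed.
fn overflows_via_sum(a: u128, b: u128) -> bool {
    a.wrapping_add(b) < a
}

/// Overflow check in the `a > ~b` form: `!b` equals `u128::MAX - b`,
/// so `a > !b` holds exactly when `a + b` exceeds `u128::MAX`.
fn overflows_via_not(a: u128, b: u128) -> bool {
    a > !b
}
```

The second form never computes the sum, so a backend that only matches the first form will not recognize it as an add-with-carry candidate.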

nikic · 2025-08-14T09:16:42Z

I've put up llvm/llvm-project#153557 to fix this in the SystemZ backend. In the meantime, I have switched stdarch to use the intrinsics instead.

But now we're hitting a failure on i686-unknown-linux-gnu:

 rustc-LLVM ERROR: Cannot select: 0x7f4ae5be2000: v8f16 = X86ISD::SELECTS 0x7f4ae5bd3070, 0x7f4ae5bd3540, 0x7f4ae5bd3700, crates/core_arch/src/x86/avx512fp16.rs:2207:9 @[ crates/core_arch/src/x86/avx512fp16.rs:2230:5 @[ crates/core_arch/src/x86/avx512fp16.rs:18269:13 ] ]
  0x7f4ae5bd3070: v1i1 = BUILD_VECTOR Constant:i8<0>, crates/core_arch/src/x86/avx512fp16.rs:2207:9 @[ crates/core_arch/src/x86/avx512fp16.rs:2230:5 @[ crates/core_arch/src/x86/avx512fp16.rs:18269:13 ] ]
  0x7f4ae5bd3540: v8f16 = X86ISD::FMULS_RND 0x7f4ae5bd3af0, 0x7f4ae5bc43f0, TargetConstant:i32<0>, crates/core_arch/src/x86/avx512fp16.rs:2207:9 @[ crates/core_arch/src/x86/avx512fp16.rs:2230:5 @[ crates/core_arch/src/x86/avx512fp16.rs:18269:13 ] ]
    0x7f4ae5bd3af0: v8f16,ch = X86ISD::VZEXT_LOAD<(load (s16) from constant-pool)> 0x7f4ae5a46520, 0x7f4ae5be2310, crates/core_arch/src/x86/avx512fp16.rs:2207:9 @[ crates/core_arch/src/x86/avx512fp16.rs:2230:5 @[ crates/core_arch/src/x86/avx512fp16.rs:18269:13 ] ]
      0x7f4ae5be2310: i32 = X86ISD::Wrapper TargetConstantPool:i32<half 0xH3C00> 0
    0x7f4ae5bc43f0: v8f16,ch = X86ISD::VZEXT_LOAD<(load (s16) from constant-pool)> 0x7f4ae5a46520, 0x7f4ae5bd3460, crates/core_arch/src/x86/avx512fp16.rs:2207:9 @[ crates/core_arch/src/x86/avx512fp16.rs:2230:5 @[ crates/core_arch/src/x86/avx512fp16.rs:18269:13 ] ]
      0x7f4ae5bd3460: i32 = X86ISD::Wrapper TargetConstantPool:i32<half 0xH4000> 0
  0x7f4ae5bd3700: v8f16 = BUILD_VECTOR ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>, ConstantFP:f16<APFloat(0)>, crates/core_arch/src/x86/avx512fp16.rs:2207:9 @[ crates/core_arch/src/x86/avx512fp16.rs:2230:5 @[ crates/core_arch/src/x86/avx512fp16.rs:18269:13 ] ]
In function: _ZN9core_arch9core_arch3x8610avx512fp165tests26test_mm_maskz_mul_round_sh26test_mm_maskz_mul_round_sh17hf59eff26b6019e79E.llvm.4621278697345228905
error: could not compile `core_arch` (lib test)

Edit: Reduced: https://rust.godbolt.org/z/5nT35vfM9

nikic · 2025-08-14T12:13:38Z

Upstream issue: llvm/llvm-project#153570

tgross35 · 2025-08-14T21:26:55Z

> But now we're hitting a failure on i686-unknown-linux-gnu:

Something comical about hitting AVX512 issues on an architecture from 1995. Not that it shouldn't work, of course.

sayantn · 2025-08-19T21:11:16Z

Do we really need to wait for the LLVM fix btw? Seems like we are stepping in (almost) UB territory, because _mm_cmp_ph requires avx512vl, and the caller of assert_eq_m128h doesn't have that #[target_feature]. I believe this is being triggered only now due to llvm/llvm-project#137450. For now, just adding avx512vl to the caller test_* function seems to do the job

nikic · 2025-08-20T08:36:44Z

@sayantn That's a good point. I've updated the relevant tests to add the avx512vl feature.

This unlocks a new selection failure:

rustc-LLVM ERROR: Cannot select: 0x7fe8a7ecbe70: v8i32 = X86ISD::CVTTP2UI 0x7fe8a7ed42a0, crates/core_arch/src/x86/avx512f.rs:15933:19 @[ crates/core_arch/src/x86/avx512f.rs:49644:17 ]
  0x7fe8a7ed42a0: v8f32,ch = load<(load (s256) from constant-pool, align 64)> 0x7fe8a464d120, 0x7fe8a7ed41c0, undef:i64, crates/core_arch/src/x86/avx512f.rs:15933:19 @[ crates/core_arch/src/x86/avx512f.rs:49644:17 ]
    0x7fe8a7ed41c0: i64 = X86ISD::WrapperRIP TargetConstantPool:i64<<16 x float> <float 0.000000e+00, float -1.500000e+00, float 2.000000e+00, float -3.500000e+00, float 4.000000e+00, float -5.500000e+00, float 6.000000e+00, float -7.500000e+00, float 8.000000e+00, float 9.500000e+00, float 1.000000e+01, float 1.150000e+01, float 1.200000e+01, float 1.350000e+01, float 1.400000e+01, float 1.550000e+01>> 0
In function: _ZN9core_arch9core_arch3x867avx512f5tests29test_mm512_maskz_cvttps_epu3229test_mm512_maskz_cvttps_epu3217h7b313469c08d09d3E.llvm.15918738188031402023

nikic · 2025-08-20T08:49:35Z

Reduced:

; RUN: llc -mattr=+avx512f < %s
define <16 x i32> @test() {
  %res = call <16 x i32> @llvm.x86.avx512.mask.cvttps2udq.512(<16 x float> zeroinitializer, <16 x i32> zeroinitializer, i16 255, i32 4)
  ret <16 x i32> %res
}

It works with +avx512f,+avx512vl.

Upstream issue: llvm/llvm-project#154492

nikic · 2025-08-20T09:33:39Z

Okay, the CI should pass now.

folkertdev

@sayantn can you confirm that the avx512 changes look good?

folkertdev · 2025-08-20T09:35:41Z

crates/core_arch/src/s390x/vector.rs

+    #[link_name = "llvm.s390.vaccq"] fn vaccq(a: u128, b: u128) -> u128;
+    #[link_name = "llvm.s390.vacccq"] fn vacccq(a: u128, b: u128, c: u128) -> u128;
+


A fix was merged for this s390x issue. Is that just not on the version of LLVM that rustc uses yet?

nikic · 2025-08-20

The fix is currently in LLVM 22 only. I wasn't planning to backport it, as it's just an optimization improvement.

folkertdev · 2025-08-20

Right, it kind of looks like a regression from our perspective given that it worked before, but I can see how from LLVM's perspective this is sort of OK.

rustbot assigned folkertdev Aug 7, 2025

sayantn mentioned this pull request Aug 7, 2025

Implement arithmetic operation traits for x86 SIMD types #1898

Open

folkertdev mentioned this pull request Aug 11, 2025

Add testing for Arm64EC Windows #1899

Merged

nikic closed this Aug 14, 2025

nikic reopened this Aug 14, 2025

nikic force-pushed the dummy branch 3 times, most recently from 78d9ceb to 37ecc87 on August 14, 2025 09:12

nikic force-pushed the dummy branch from 99f91be to ef4b7ef on August 20, 2025 08:32

nikic added 5 commits August 20, 2025 11:23

Drop no longer needed feature gates

9810ae9

Use intrinsics for some s390x operations

a435aca

Add missing avx512vl target features

f9a7115

Work around selection failure without avx512vl

f54d816

Adjust immediate for vrndscalepd tests

ebb9606

The immediate here encodes both the rounding mode (in the low bits) and the scale (in the high bits). Make sure the scale is non-zero.
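As a rough illustration of why the scale matters, here is a software model of a single lane. It assumes the usual imm8 layout for the round-scale instructions (bits 7:4 hold the number of fraction bits M to keep, bits 1:0 select the rounding mode) and ignores bits 3:2. The function names are hypothetical and this is a sketch, not the stdarch test code.

```rust
/// Round to nearest, ties to even (helper for the model below).
fn round_half_even(x: f64) -> f64 {
    let r = x.round(); // rounds ties away from zero
    if (x - x.trunc()).abs() == 0.5 && r % 2.0 != 0.0 {
        r - x.signum() // pull an odd tie result back to the even neighbour
    } else {
        r
    }
}

/// Model of one lane: round `x` to a multiple of 2^-M.
/// Assumed layout: M = imm8[7:4], rounding mode = imm8[1:0].
fn roundscale_model(x: f64, imm8: u8) -> f64 {
    let m = i32::from(imm8 >> 4);
    let scale = 2f64.powi(m);
    let scaled = x * scale;
    let rounded = match imm8 & 0b11 {
        0 => round_half_even(scaled),
        1 => scaled.floor(),
        2 => scaled.ceil(),
        _ => scaled.trunc(),
    };
    rounded / scale
}
```

With an immediate of `0x00` the scale is zero and the operation degenerates to plain rounding to an integer (1.375 becomes 1.0), whereas `0x10` keeps one fraction bit (1.375 becomes 1.5). A test that only uses a zero scale therefore does not exercise the scaled path.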

nikic force-pushed the dummy branch from e9fc4aa to ebb9606 on August 20, 2025 09:27

nikic changed the title ~~Run CI after LLVM upgrade~~ Fix CI after LLVM upgrade Aug 20, 2025

folkertdev reviewed Aug 20, 2025

sayantn approved these changes Aug 20, 2025

folkertdev added this pull request to the merge queue Aug 20, 2025

Merged via the queue into rust-lang:master with commit 69d1978 Aug 20, 2025
62 checks passed
