Commit 3e0d4aa

pkwasnie-intel authored and pszymich committed
run early MemCpy Optimization pass
When the input SPIR-V performs a memcpy whose source points to zero-initialized memory, the legacy SPIR-V translator optimized early by replacing the memcpy with a memset. The Khronos SPIR-V translator emits the memcpy intact. This affects the AlignmentAnalysis pass, which produces a different dst alignment for memcpy than for memset, which in turn affects the optimizations that follow. Run LLVM's MemCpy Optimization pass before AlignmentAnalysis to replace the memcpy with a memset, producing a result closer to the legacy translator's. (cherry picked from commit 8bc3ee1)
1 parent 463a950 commit 3e0d4aa
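
The rewrite described in the commit message relies on a simple equivalence: copying N bytes from an all-zero constant leaves the destination in the same state as a memset of N zero bytes. The following is a minimal host-side sketch of that equivalence in plain C++, not IGC or LLVM code. The names `Complex`, `kZeros`, and `zero_copy_matches_memset` are hypothetical and only mirror the shapes used in the test input (a 5 x 6 array of `{ double, double }` structs and a 16-byte zero-initialized global).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Mirrors the test input: struct.complex = { double, double }.
struct Complex {
    double re;
    double im;
};
static_assert(sizeof(Complex) == 16, "sketch assumes a 16-byte struct");

// Stand-in for the zero-initialized [16 x i8] constant global.
static const unsigned char kZeros[16] = {};

// Hypothetical helper: returns true when copying from the all-zero source
// leaves the destination array in the same state as a 16-byte memset.
bool zero_copy_matches_memset() {
    Complex via_memcpy[5][6];
    Complex via_memset[5][6];

    // Fill both arrays with a non-zero pattern so the write is observable.
    std::memset(via_memcpy, 0xAB, sizeof via_memcpy);
    std::memset(via_memset, 0xAB, sizeof via_memset);

    // What the Khronos translator path keeps: a copy from the zero global.
    std::memcpy(&via_memcpy[0][0], kZeros, sizeof kZeros);
    // What the legacy translator path (and MemCpyOpt) produces instead.
    std::memset(&via_memset[0][0], 0, sizeof kZeros);

    return std::memcmp(via_memcpy, via_memset, sizeof via_memcpy) == 0;
}
```

Because both forms write the same bytes, the choice between them only matters to later passes that treat memcpy and memset differently, which is the situation the commit message describes for AlignmentAnalysis.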

2 files changed: +61 -0 lines changed


IGC/AdaptorOCL/UnifyIROCL.cpp

Lines changed: 1 addition & 0 deletions
@@ -331,6 +331,7 @@ static void CommonOCLBasedPasses(
 #ifdef IGC_SCALAR_USE_KHRONOS_SPIRV_TRANSLATOR
     mpmSPIR.add(new PreprocessSPVIR());
     mpmSPIR.add(new PromoteBools());
+    mpmSPIR.add(llvm::createMemCpyOptPass());
 #endif // IGC_SCALAR_USE_KHRONOS_SPIRV_TRANSLATOR
     mpmSPIR.add(new TypesLegalizationPass());
     mpmSPIR.add(new TargetLibraryInfoWrapperPass());
Lines changed: 60 additions & 0 deletions
@@ -0,0 +1,60 @@
+; Input spv:
+; 1. Defines struct struct.complex = { double, double }
+; 2. Defines local array variable [5 x [6 x struct.complex ]]
+; 3. Defines global array variable [16 x i8] = { 0 }
+; 4. Kernel sets first struct in array to zero by doing memcpy of global i8 array.
+;
+; Legacy translator does early optimization by replacing memcpy with memset.
+; Khronos translator emits memcpy as-is, which is later optimized to memset by compiler.
+; Test if in both cases AlignmentAnalysis pass assigns correct alignment to pointer.
+
+; REQUIRES: regkeys, spirv-as, pvc-supported
+; RUN: spirv-as --target-env spv1.0 -o %t.spv %s
+; RUN: ocloc compile -spirv_input -file %t.spv -device pvc -options " -igc_opts 'PrintToConsole=1 PrintAfter=AlignmentAnalysisPass'" 2>&1 | FileCheck %s
+
+OpCapability Addresses
+OpCapability Kernel
+OpMemoryModel Physical32 OpenCL
+OpEntryPoint Kernel %1 "test"
+OpName %cpx "struct.complex" ; 0x00000884
+%void = OpTypeVoid
+%double = OpTypeFloat 64
+%uchar = OpTypeInt 8 0
+%uint = OpTypeInt 32 0
+%ulong = OpTypeInt 64 0
+%uint_0 = OpConstant %uint 0
+%ulong_0 = OpConstant %ulong 0
+%ulong_5 = OpConstant %ulong 5
+%ulong_6 = OpConstant %ulong 6
+%ulong_16 = OpConstant %ulong 16
+%ptr_uchar = OpTypePointer UniformConstant %uchar
+%arr_uchar = OpTypeArray %uchar %ulong_16
+%ptr_arr_uchar = OpTypePointer UniformConstant %arr_uchar
+%cpx = OpTypeStruct %double %double
+%arr_cpx = OpTypeArray %cpx %ulong_6
+%arr_arr_cpx = OpTypeArray %arr_cpx %ulong_5
+%ptr_arr_arr_cpx = OpTypePointer Function %arr_arr_cpx
+%ptr_cpx = OpTypePointer Function %cpx
+%ptr_double = OpTypePointer Function %double
+%ptr_char = OpTypePointer Function %uchar
+OpDecorate %349 Constant
+%347 = OpConstantNull %arr_uchar
+%349 = OpVariable %ptr_arr_uchar UniformConstant %347
+%7 = OpTypeFunction %void
+%1 = OpFunction %void None %7
+%10 = OpLabel
+%317 = OpVariable %ptr_arr_arr_cpx Function
+%334 = OpInBoundsPtrAccessChain %ptr_cpx %317 %ulong_0 %ulong_0 %ulong_0
+%339 = OpInBoundsPtrAccessChain %ptr_double %334 %ulong_0 %uint_0
+%345 = OpBitcast %ptr_char %339
+%351 = OpBitcast %ptr_uchar %349
+OpCopyMemorySized %345 %351 %ulong_16 Aligned 0
+OpReturn
+OpFunctionEnd
+
+; CHECK-LABEL: @test(
+; CHECK-NOT: call void @llvm.memcpy
+; CHECK: [[ALLOCA:%.*]] = alloca %struct.complex
+; CHECK: [[PTR:%.*]] = bitcast %struct.complex* [[ALLOCA]] to i8*
+; CHECK: call void @llvm.memset.p0i8.i64(i8* align 8 [[PTR]], i8 0, i64 16, i1 false)
+; CHECK: ret void
