Closed
Issue Description
Slang makes unnecessary copies of function parameters.
This makes both compile time and runtime very slow when targeting CUDA.
Reproducer Code
The following shader reproduces the issue:
struct BigStruct {
    ...
    void member() { ... }
}
ParameterBlock<BigStruct> pb;
void test()
{
    pb.member();
}
Actual Behavior
Slang generates the following CUDA code:
struct BigStruct {...}
void BigStruct_member(BigStruct* this_0) { ... }
void test(BigStruct* pb) {
    BigStruct tmp = *pb; // Unnecessary copy
    BigStruct_member(&tmp);
}
Expected Behavior
The copy from pb to tmp is unnecessary. The generated code should pass pb directly, as follows:
void test(BigStruct* pb) {
    BigStruct_member(pb);
}
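To make the cost of the extra copy concrete, here is a minimal host-side C++ sketch of the two code shapes. It is not Slang output: the payload size, the copy counter, the `const` qualifiers, the `float` return value, and the names `testWithCopy` and `testDirect` are all assumptions added for illustration.

```cpp
#include <cstddef>

// Hypothetical stand-in for the shader's BigStruct. The 1024-float payload
// and the static copy counter are illustrative assumptions.
struct BigStruct {
    float data[1024];
    static int copies;

    BigStruct() : data{} {}

    // Count every whole-struct copy so the difference is observable.
    BigStruct(const BigStruct& other) {
        for (std::size_t i = 0; i < 1024; ++i) data[i] = other.data[i];
        ++copies;
    }
};
int BigStruct::copies = 0;

// Plays the role of the lowered member function; it only reads via this_0.
float BigStruct_member(const BigStruct* this_0) { return this_0->data[0]; }

// Shape with the temporary: the whole struct is copied before the call.
float testWithCopy(const BigStruct* pb) {
    BigStruct tmp = *pb; // unnecessary copy
    return BigStruct_member(&tmp);
}

// Shape without the temporary: the pointer is forwarded as-is.
float testDirect(const BigStruct* pb) {
    return BigStruct_member(pb);
}
```

Both functions return the same value as long as the callee only reads through its pointer, but only `testWithCopy` increments `BigStruct::copies`.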
Additional context
This can be resolved at the IR level by removing the unnecessary Load and Store instructions.
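As a rough illustration of that idea, the sketch below runs a copy-forwarding pass over a toy instruction list. Everything here is hypothetical: `Op`, `Inst`, and `forwardCopies` are invented for this example and are not Slang's IR or API. The pass also assumes, without checking, straight-line code in which callees only read through their pointer argument and nothing writes to the source memory between the load and the call.

```cpp
#include <string>
#include <vector>

// Hypothetical toy IR (not Slang's real IR).
enum class Op { Load, Store, Var, Call };

struct Inst {
    Op op;
    std::string dst; // Load: result value; Var: variable; Store: target; Call: callee
    std::string src; // Load: address; Store: stored value; Call: pointer argument
    bool dead;
    Inst(Op o, const std::string& d, const std::string& s)
        : op(o), dst(d), src(s), dead(false) {}
};

// Rewrites "v = load p; var tmp; store tmp, v; call f(tmp)" into "call f(p)".
// Assumption (unchecked): callees do not write through the pointer and the
// memory behind p is not modified in between.
std::vector<Inst> forwardCopies(std::vector<Inst> insts) {
    for (Inst& st : insts) {
        if (st.op != Op::Store) continue;
        const std::string tmp = st.dst;
        const std::string val = st.src;

        // Find the load that produced the stored value and count uses.
        Inst* load = nullptr;
        int storesToTmp = 0, usesOfVal = 0;
        for (Inst& i : insts) {
            if (i.op == Op::Load && i.dst == val) load = &i;
            if (i.op == Op::Store && i.dst == tmp) ++storesToTmp;
            if (i.src == val) ++usesOfVal;
        }
        // Only handle temporaries that are initialized once from a load.
        if (!load || storesToTmp != 1) continue;
        const std::string ptr = load->src;

        // Redirect every read of tmp to the original pointer, drop the variable.
        for (Inst& i : insts) {
            if ((i.op == Op::Call || i.op == Op::Load) && i.src == tmp) i.src = ptr;
            if (i.op == Op::Var && i.dst == tmp) i.dead = true;
        }
        st.dead = true;
        if (usesOfVal == 1) load->dead = true; // the load fed only this store
    }

    std::vector<Inst> out;
    for (const Inst& i : insts)
        if (!i.dead) out.push_back(i);
    return out;
}
```

Feeding it the four instructions that correspond to `BigStruct tmp = *pb; BigStruct_member(&tmp);` leaves a single call that takes `pb` directly. A real pass would additionally need alias and side-effect analysis to prove that the unchecked assumptions hold.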