If a function argument is byval and RV is located in default or alloca address space
an optimization of creating addrspacecast instead of memcpy is performed. That is
not correct for OpenCL, where that can lead to a situation of address space casting
from private * to global *. See an example below:
typedef struct { int x; } MyStruct; void foo(MyStruct val) {} kernel void KernelOneMember(__global MyStruct* x) { foo (*x); }
for this code clang generated following IR:
...
%0 = load %struct.MyStruct addrspace(1)*, %struct.MyStruct addrspace(1)**
%x.addr, align 4
%1 = addrspacecast %struct.MyStruct addrspace(1)* %0 to %struct.MyStruct*
...
So the optimization was disallowed for OpenCL if RV is located in an address space
different than that of the argument (0).
This statement is incorrect. Private is only default AS in CL versions before 2.0.
I feel this can be expressed simpler. I guess the observation here is that if addr space mismatches it has to generate a copy because even if it would be a generic we can't get what the original specific addr space was? So it's safe to generate a copy.