[OpenCL][CodeGen] Fix replacing memcpy with addrspacecast


[OpenCL][CodeGen] Fix replacing memcpy with addrspacecast

If a function argument is byval and RV is located in default or alloca address space
an optimization of creating addrspacecast instead of memcpy is performed. That is
not correct for OpenCL, where that can lead to a situation of address space casting
from private * to global *. See an example below:

typedef struct {
  int x;
} MyStruct;

void foo(MyStruct val) {}

kernel void KernelOneMember(__global MyStruct* x) {
  foo (*x);

for this code clang generated following IR:
%0 = load %struct.MyStruct addrspace(1)*, %struct.MyStruct addrspace(1)**
%x.addr, align 4
%1 = addrspacecast %struct.MyStruct addrspace(1)* %0 to %struct.MyStruct*

So the optimization was disallowed for OpenCL if RV is located in an address space
different than that of the argument (0).

Reviewers: yaxunl, Anastasia

Reviewed By: Anastasia

Subscribers: cfe-commits, asavonic

Differential Revision: https://reviews.llvm.org/D54947