Some targets need special LLVM calling convention for CUDA kernel.
This patch does that through a TargetCodeGenInfo hook.
It only affects amdgcn target.
Patch by Greg Rodgers.
Revised and lit tests added by Yaxun Liu.
Differential D45223
[CUDA] Set LLVM calling convention for CUDA kernel yaxunl on Apr 3 2018, 12:32 PM. Authored by
Details Some targets need special LLVM calling convention for CUDA kernel. It only affects amdgcn target. Patch by Greg Rodgers.
Diff Detail
Event TimelineComment Actions I think the appropriate place to do this is in IsStandardConversion, immediately after the call to ResolveAddressOfOverloadedFunction. You might want to add a general utility for getting the type-of-reference of a function decl. Comment Actions We may need to resolve overloaded functions with dropped calling conventions, e.g. __global__ void EmptyKernel(float) {} __global__ void EmptyKernel(double) {} struct Dummy { /// Type definition of the EmptyKernel kernel entry point typedef void (*EmptyKernelPtr)(float); EmptyKernelPtr Empty() { return EmptyKernel; } }; In this case we have to drop the calling convention during the resolution. Since the calling convention is invisible in the AST, why don't we just do not represent it in AST? Going back to the original implementation in CodeGen: if ((getTriple().getArch() == llvm::Triple::amdgcn) && D->hasAttr<CUDAGlobalAttr>()) Fn->setCallingConv(llvm::CallingConv::AMDGPU_KERNEL); It is much simpler and straightforward. Can we just reconsider implement this in CodeGen instead of Sema? Comment Actions Yes, I'm sorry, I think you're right. I had misunderstood the language problem when I suggested going down this road. |