This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Set LLVM calling convention for CUDA kernel
ClosedPublic

Authored by yaxunl on Apr 3 2018, 12:32 PM.

Details

Summary

Some targets need special LLVM calling convention for CUDA kernel.
This patch does that through a TargetCodeGenInfo hook.

It only affects amdgcn target.

Patch by Greg Rodgers.
Revised and lit tests added by Yaxun Liu.

Diff Detail

Repository
rL LLVM

Event Timeline

yaxunl created this revision.Apr 3 2018, 12:32 PM

I think the appropriate place to do this is in IsStandardConversion, immediately after the call to ResolveAddressOfOverloadedFunction. You might want to add a general utility for getting the type-of-reference of a function decl.

yaxunl added a comment.EditedApr 18 2018, 8:31 AM

I think the appropriate place to do this is in IsStandardConversion, immediately after the call to ResolveAddressOfOverloadedFunction. You might want to add a general utility for getting the type-of-reference of a function decl.

We may need to resolve overloaded functions with dropped calling conventions, e.g.

__global__ void EmptyKernel(float) {}

__global__ void EmptyKernel(double) {}

struct Dummy {
  /// Type definition of the EmptyKernel kernel entry point
  typedef void (*EmptyKernelPtr)(float);
  EmptyKernelPtr Empty() { return EmptyKernel; } 
};

In this case we have to drop the calling convention during the resolution.

Since the calling convention is invisible in the AST, why don't we just do not represent it in AST?

Going back to the original implementation in CodeGen:

if ((getTriple().getArch() == llvm::Triple::amdgcn) &&
    D->hasAttr<CUDAGlobalAttr>())
  Fn->setCallingConv(llvm::CallingConv::AMDGPU_KERNEL);

It is much simpler and straightforward.

Can we just reconsider implement this in CodeGen instead of Sema?

Yes, I'm sorry, I think you're right. I had misunderstood the language problem when I suggested going down this road.

Yes, I'm sorry, I think you're right. I had misunderstood the language problem when I suggested going down this road.

Yes, I'm sorry, I think you're right. I had misunderstood the language problem when I suggested going down this road.

Never mind. I will update the diff for CodeGen approach.

yaxunl updated this revision to Diff 143001.Apr 18 2018, 2:30 PM
yaxunl retitled this revision from [CUDA] Fix overloading resolution failure due to kernel calling convention to [CUDA] Set LLVM calling convention for CUDA kernel.
yaxunl edited the summary of this revision. (Show Details)

Use CodeGen approach.

tra accepted this revision.Apr 18 2018, 2:48 PM

AFAICT this is the replacement for D44747. LGTM.

This revision is now accepted and ready to land.Apr 18 2018, 2:48 PM
In D45223#1071452, @tra wrote:

AFAICT this is the replacement for D44747. LGTM.

Yes. Thanks.

This revision was automatically updated to reflect the committed changes.
This revision was automatically updated to reflect the committed changes.