This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Set LLVM calling convention for CUDA kernel
Closed, Public

Authored by yaxunl on Apr 3 2018, 12:32 PM.

Details

Reviewers

rjmccall
tra

Commits

rG4306f2086fe8: [CUDA] Set LLVM calling convention for CUDA kernel
rL330447: [CUDA] Set LLVM calling convention for CUDA kernel
rC330447: [CUDA] Set LLVM calling convention for CUDA kernel

Summary

Some targets need a special LLVM calling convention for CUDA kernels.
This patch sets it through a TargetCodeGenInfo hook.

It only affects the amdgcn target.

Patch by Greg Rodgers.
Revised and lit tests added by Yaxun Liu.
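
The hook approach described in the summary can be sketched with a small self-contained model. This is only an illustration under stated assumptions: `CallingConv`, `FunctionModel`, `TargetHooksModel`, `AMDGPUHooksModel`, `makeHooksFor` and `callingConvFor` are hypothetical names invented for this sketch, not Clang or LLVM APIs, and the real hook's signature may differ.

```cpp
#include <memory>
#include <string>

// Hypothetical, simplified model; none of these types are Clang's or LLVM's.
enum class CallingConv { C, AMDGPUKernel };

struct FunctionModel {
  bool IsCUDAKernel = false;        // stands in for a __global__ function
  CallingConv CC = CallingConv::C;  // default calling convention
};

// Base hook: by default a target leaves the calling convention alone.
struct TargetHooksModel {
  virtual ~TargetHooksModel() = default;
  virtual void setCUDAKernelCallingConv(FunctionModel &) const {}
};

// Only the amdgcn-like target overrides the hook.
struct AMDGPUHooksModel : TargetHooksModel {
  void setCUDAKernelCallingConv(FunctionModel &Fn) const override {
    if (Fn.IsCUDAKernel)
      Fn.CC = CallingConv::AMDGPUKernel;
  }
};

// Picks the hook implementation from an architecture name (assumed strings).
std::unique_ptr<TargetHooksModel> makeHooksFor(const std::string &Arch) {
  if (Arch == "amdgcn")
    return std::unique_ptr<TargetHooksModel>(new AMDGPUHooksModel());
  return std::unique_ptr<TargetHooksModel>(new TargetHooksModel());
}

// Models target-independent code calling the hook without knowing the target.
CallingConv callingConvFor(const std::string &Arch, bool IsCUDAKernel) {
  FunctionModel Fn;
  Fn.IsCUDAKernel = IsCUDAKernel;
  makeHooksFor(Arch)->setCUDAKernelCallingConv(Fn);
  return Fn.CC;
}
```

In this model, the target-independent caller never tests the architecture itself; only the amdgcn-like hook changes the convention, which mirrors the summary's statement that other targets are unaffected.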

Event Timeline

yaxunl created this revision. Apr 3 2018, 12:32 PM

I think the appropriate place to do this is in IsStandardConversion, immediately after the call to ResolveAddressOfOverloadedFunction. You might want to add a general utility for getting the type-of-reference of a function decl.

In D45223#1056187, @rjmccall wrote:

I think the appropriate place to do this is in IsStandardConversion, immediately after the call to ResolveAddressOfOverloadedFunction. You might want to add a general utility for getting the type-of-reference of a function decl.

We may need to resolve overloaded functions with dropped calling conventions, e.g.

__global__ void EmptyKernel(float) {}

__global__ void EmptyKernel(double) {}

struct Dummy {
  /// Type definition of the EmptyKernel kernel entry point
  typedef void (*EmptyKernelPtr)(float);
  EmptyKernelPtr Empty() { return EmptyKernel; } 
};

In this case we have to drop the calling convention during the resolution.
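
As a plain C++ analogy to the example above (no `__global__`, so no calling-convention question arises), the following sketch shows how the target function-pointer type selects among overloads when the address of an overloaded function is taken. The name `EmptyKernelHost` and the integer return tags are made up for illustration only.

```cpp
// Hypothetical host-side stand-ins for the two overloaded kernels; each
// returns a tag so the selected overload can be observed.
int EmptyKernelHost(float) { return 1; }
int EmptyKernelHost(double) { return 2; }

struct Dummy {
  typedef int (*EmptyKernelPtr)(float);
  // The return type EmptyKernelPtr is the target type that drives
  // overload selection for the name EmptyKernelHost.
  EmptyKernelPtr Empty() { return EmptyKernelHost; }
};
```

Here the `float` overload is chosen because the pointer type must match exactly. If the function type of a kernel carried an extra calling convention that the pointer type lacked, that exact match would fail, which is the concern the comment raises.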

Since the calling convention is invisible in the AST, why don't we simply not represent it in the AST?

Going back to the original implementation in CodeGen:

if ((getTriple().getArch() == llvm::Triple::amdgcn) &&
    D->hasAttr<CUDAGlobalAttr>())
  Fn->setCallingConv(llvm::CallingConv::AMDGPU_KERNEL);

It is much simpler and more straightforward.

Can we reconsider implementing this in CodeGen instead of Sema?

Yes, I'm sorry, I think you're right. I had misunderstood the language problem when I suggested going down this road.

In D45223#1071358, @rjmccall wrote:

Yes, I'm sorry, I think you're right. I had misunderstood the language problem when I suggested going down this road.

Never mind. I will update the diff for the CodeGen approach.

Use the CodeGen approach.

AFAICT this is the replacement for D44747. LGTM.

This revision is now accepted and ready to land. Apr 18 2018, 2:48 PM

In D45223#1071452, @tra wrote:

AFAICT this is the replacement for D44747. LGTM.

Yes. Thanks.

Closed by commit rC330447: [CUDA] Set LLVM calling convention for CUDA kernel (authored by yaxunl). Apr 20 2018, 10:06 AM

Closed by commit rL330447: [CUDA] Set LLVM calling convention for CUDA kernel (authored by yaxunl).

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: llvm-commits. Apr 20 2018, 10:06 AM

Revision Contents

Path                                  Size
lib/Sema/SemaOverload.cpp             12 lines
test/CodeGenCUDA/kernel-amdgcn.cu