This is an archive of the discontinued LLVM Phabricator instance.

[CUDA][HIP] Externalize kernels with internal linkage
ClosedPublic

Authored by yaxunl on Apr 21 2022, 10:46 AM.

Details

Summary

This patch is a continuation of https://reviews.llvm.org/D123353.

Not only kernels in anonymous namespace, but also template
kernels with template arguments in anonymous namespace
need to be externalized.

To be more generic, this patch checks the linkage of a kernel
assuming the kernel does not have __global__ attribute. If
the linkage is internal then clang will externalize it.

This patch also fixes the postfix for externalized symbol
since nvptx does not allow '.' in symbol name.

Diff Detail

Event Timeline

yaxunl created this revision.Apr 21 2022, 10:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 21 2022, 10:46 AM
Herald added a subscriber: mattd. · View Herald Transcript
yaxunl requested review of this revision.Apr 21 2022, 10:46 AM
yaxunl edited the summary of this revision. (Show Details)Apr 21 2022, 10:49 AM
mkuron added a subscriber: mkuron.Apr 21 2022, 11:07 AM
tra added inline comments.Apr 21 2022, 11:52 AM
clang/lib/AST/ASTContext.cpp
11322

Nit: I'd phrase it as a positive assertion ConsiderCudaGlobalAttr and default it to true.

DontDoX always trips me and gets me to question it -- "what *are* we doing then? what else is there besides X?".
With a DoX things are usually simpler and limited to X -- we're either doing X or not.

12300

Perhaps we don't need to change the public AST API and plumb IgnoreCUDAGlobalAttr through.
We cold create CUDA-aware static version of GetGVALinkageForCudaKernel instead, which would call adjustGVALinkageForExternalDefinitionKind(..., adjustGVALinkageForAttributes(IgnoreCUDAGlobalAttr=true)).

yaxunl marked an inline comment as done.Apr 21 2022, 12:17 PM
yaxunl added inline comments.
clang/lib/AST/ASTContext.cpp
11322

will do

12300

We could have a static function but it would be GetGVALinkageForCUDAKernelWithoutGlobalAttr since we need to know the linkage of the kernel assuming it has no __global__ attribute.

If you think it is OK I can make the change.

yaxunl updated this revision to Diff 424280.Apr 21 2022, 1:33 PM
yaxunl marked an inline comment as done.

use static function

tra added inline comments.Apr 21 2022, 1:56 PM
clang/lib/AST/ASTContext.cpp
12300

No point making public what's of no use to anybody other than this particular instance.

To think of it, we don't even need a function and could just do

if (D->hasAttr<CUDAGlobalAttr>() ) {
  bool OriginalKernelLinkage = adjustGVALinkageForExternalDefinitionKind(..., adjustGVALinkageForAttributes(IgnoreCUDAGlobalAttr=true));
  return OriginalKernelLinkage == GVA_Internal;
}
return (IsStaticVar &&....)
yaxunl added inline comments.Apr 21 2022, 2:02 PM
clang/lib/AST/ASTContext.cpp
12300

One disadvantage of this approach is that it duplicates the code in GetGVALinkageForFunction. In the future, if GetGVALinkageForFunction changes, the same change needs to be applied to the duplicated code, which is error-prone.

tra added inline comments.Apr 21 2022, 3:41 PM
clang/lib/AST/ASTContext.cpp
12300

Good point. Looking at the code closer, it appears that what we're interested in is whether the kernel was internal and *became* externally visible due to it being a kernel.

Right now we're checking if the function would normally be GVA_Internal (shouldn't we have considered GVA_DiscardableODR, too, BTW?)
This is a somewhat indirect way of figuring out what we really need.

The code that determines what we want is essentially this code in adjustGVALinkageForAttributes that we're trying to enable/disable with ConsiderCudaGlobalAttr.

It can be easily extracted into a static function, which could then be used from both adjustGVALinkageForAttributes, (which would no longer need ConsiderCudaGlobalAttr) and from here.

bool isInternalKernel(ASTContext *Context, Decl *D) {
  L=basicGVALinkageForFunction(Context, D);
  return (D->hasAttr<CUDAGlobalAttr>() &&
          (L == GVA_DiscardableODR || L == GVA_Internal));
}

This would both avoid logic duplication and would better match our intent.

Does it make sense? Or did I miss something else?

yaxunl added inline comments.Apr 21 2022, 7:42 PM
clang/lib/AST/ASTContext.cpp
12300

GVA_DiscardableODR usually maps to linkonce_odr linkage in LLVM IR. It follows the ODR, therefore we should not make them unique.

If we use isInternalKernel in adjustGVALinkageForAttributes, there will be two calls of basicGVALinkageForFunction when GetGVALinkageForFunction is called, which seems inefficient. I think we can keep GetGVALinkageForFunction as it was, and use basicGVALinkageForFunction directly in mayExternalize.

yaxunl updated this revision to Diff 424364.Apr 21 2022, 8:35 PM

use basicGVALinkageForFunction

tra accepted this revision.Apr 22 2022, 10:56 AM

LGTM overal, with few test nits.

clang/lib/AST/ASTContext.cpp
12300

SGTM.

clang/test/CodeGenCUDA/device-var-linkage.cu
1–2

This is odd -- the tests use -x hip and -triple nvptx.

I think we need to change them into HIP+amdgpu and CUDA +nvptx variants ans we now have language-dependent behavior here and are interested in the language/triple combinations that we do use in practice.

clang/test/CodeGenCUDA/kernel-in-anon-ns.cu
3

We should have CUDA test variants here, too.

clang/test/CodeGenCUDA/managed-var.cu
0–1

Tests above do not have REQUIRED. Is it needed here?

This revision is now accepted and ready to land.Apr 22 2022, 10:56 AM
yaxunl marked 7 inline comments as done.Apr 22 2022, 11:37 AM
yaxunl added inline comments.
clang/test/CodeGenCUDA/device-var-linkage.cu
1–2

will change them to amdgcn and add CUDA variant when committing.

clang/test/CodeGenCUDA/kernel-in-anon-ns.cu
3

will add CUDA test when committing.

clang/test/CodeGenCUDA/managed-var.cu
0–1

No. I will remove it when committing.

This revision was landed with ongoing or failed builds.Apr 22 2022, 2:07 PM
This revision was automatically updated to reflect the committed changes.
yaxunl marked 3 inline comments as done.
Herald added a project: Restricted Project. · View Herald TranscriptApr 22 2022, 2:07 PM