This is an archive of the discontinued LLVM Phabricator instance.

[HIP] Add hip input kind and codegen for kernel launching
ClosedPublic

Authored by yaxunl on Mar 28 2018, 9:25 AM.

Details

Summary

HIP is a language similar to CUDA (https://github.com/ROCm-Developer-Tools/HIP/blob/master/docs/markdown/hip_kernel_language.md ).
The language syntax is very similar, which allows a hip program to be compiled as a CUDA program by Clang. The main difference
is the host API. HIP has a set of vendor neutral host API which can be implemented on different platforms. Currently there is open source
implementation of HIP runtime on amdgpu target (https://github.com/ROCm-Developer-Tools/HIP).

This patch adds support of input kind and language standard hip.

When hip file is compiled, both LangOpts.CUDA and LangOpts.HIP is turned on. This allows compilation of hip program as CUDA
in most cases and only special handling of hip program is needed LangOpts.HIP is checked.

This patch also adds support of kernel launching of HIP program using HIP host API.

When -x hip is not specified, there is no behaviour change for CUDA.

Patch by Greg Rodgers.
Revised and lit test added by Yaxun Liu.

Diff Detail

Event Timeline

yaxunl created this revision.Mar 28 2018, 9:25 AM
tra added a reviewer: tra.Mar 28 2018, 9:32 AM

The changes appear to cover only some of the functionality needed to enable HIP support. Do you have more patches in queue? Having complete picture would help to make sense of the overall plan.
I did ask for it in D42800, but I don't think I've got the answers. It would help a lot if you or @gregrodgers could write a doc somewhere outlining overall plan for HIP support in clang, what are the main issues that need to be dealt with, and at least a general idea on how to handle them.

As far as "add -x hip, and tweak runtime glue codegen" goes, the change looks OK, but it's not very useful all by itself. It leaves a lot of other issues unsolved and no clear plan on whether/when/how you are planning to deal with them.

As things stand right now, with this patch clang will still attempt to include CUDA headers, which, among other things will provide threadIdx/blockIdx and other CUDA-specific features.
Perhaps it would make sense to disable pre-inclusion of CUDA headers and, probably, disable use of CUDA's libdevice bitcode library if we're compiling with -x hip (i.e. -nocudainc -nocudalib).
If you do depend on CUDA headers, then, I suspect, you may need to adjust some wrapper headers we use for CUDA and that change should probably come before this one.

test/CodeGenCUDA/device-stub.cu
2–9

Please wrap the long lines.

In D44984#1050526, @tra wrote:

The changes appear to cover only some of the functionality needed to enable HIP support. Do you have more patches in queue? Having complete picture would help to make sense of the overall plan.
I did ask for it in D42800, but I don't think I've got the answers. It would help a lot if you or @gregrodgers could write a doc somewhere outlining overall plan for HIP support in clang, what are the main issues that need to be dealt with, and at least a general idea on how to handle them.

As far as "add -x hip, and tweak runtime glue codegen" goes, the change looks OK, but it's not very useful all by itself. It leaves a lot of other issues unsolved and no clear plan on whether/when/how you are planning to deal with them.

As things stand right now, with this patch clang will still attempt to include CUDA headers, which, among other things will provide threadIdx/blockIdx and other CUDA-specific features.
Perhaps it would make sense to disable pre-inclusion of CUDA headers and, probably, disable use of CUDA's libdevice bitcode library if we're compiling with -x hip (i.e. -nocudainc -nocudalib).
If you do depend on CUDA headers, then, I suspect, you may need to adjust some wrapper headers we use for CUDA and that change should probably come before this one.

Hi Artem, I am responsible for upstreaming Greg's work and addressing reviewers' comments.

Yes we already have a basic working implementation of HIP compiler due to Greg's work. I will either update D42800 or create a new review about the toolchain changes for compiling and linking HIP programs. Essentially HIP has its own header files and device libraries which are taken care of by the toolchain patch.

Since the header file and library seem not to affect this patch, is it OK to defer their changes to be part of the toolchain patch?

Thanks.

tra added a comment.Mar 28 2018, 11:22 AM

Yes we already have a basic working implementation of HIP compiler due to Greg's work.

That is great, but it's not necessarily true that all these changes will make it into clang/llvm as is. LLVM/Clang is a community effort and it helps a lot to get the changes in when the community understands what is it you're planning to do. I personally am very glad to see AMD moving towards making clang a viable compiler for AMD GPUs, but there's only so much I'll be able to do to help you with reviews if all I have is either piecemeal patches with little idea how they all fit together or one humongous patch I would have no time to dive in and really understand. Considering that compilation for GPU is a fairly niche market my bet is that your progress will be bottlenecked by the code reviews. Whatever you can do to make reviewers jobs easier by giving more context will help a lot with upstreaming the patches.

I will either update D42800 or create a new review about the toolchain changes for compiling and linking HIP programs. Essentially HIP has its own header files and device libraries which are taken care of by the toolchain patch.

Fair enough. I'll wait for the rest of the patches. If you have multiple pending patches, it helps if you could arrange them as dependent patches in phabricator. It makes it easier to see the big picture.

Since the header file and library seem not to affect this patch, is it OK to defer their changes to be part of the toolchain patch?

I'm not sure I understand. Could you elaborate?

You should send an RFC to cfe-dev about adding this new language mode. I understand that it's very similar to an existing language mode that we already support, and that's definitely we'll consider, but we shouldn't just agree to add new language modes in patch review.

You should send an RFC to cfe-dev about adding this new language mode. I understand that it's very similar to an existing language mode that we already support, and that's definitely we'll consider, but we shouldn't just agree to add new language modes in patch review.

RFC sent http://lists.llvm.org/pipermail/cfe-dev/2018-March/057426.html

Thanks.

yaxunl marked an inline comment as done.Mar 29 2018, 12:00 PM

Since the header file and library seem not to affect this patch, is it OK to defer their changes to be part of the toolchain patch?

I'm not sure I understand. Could you elaborate?

clang -cc1 does not include __clang_cuda_runtime_wrapper.h by default when clang -cc1 is called directly to compile CUDA programs. CUDA toolchain adds -include __clang_cuda_runtime_wrapper.h when compiling CUDA program as kernel code. Therefore if clang -cc1 is used to compile HIP program in lit test, there is no need to use -fnocudainc.

This patch mainly changes kernel launching API function names. The implement and testing of this change does not depend on the CUDA/HIP header files. A minimum header like test/CodeGenCUDA/Input/cuda.h is sufficient for testing this patch.

Basically this patch is only concerns about -cc1 and therefore is independent of the toolchain changes about header and library files.

mkuron added a subscriber: mkuron.Mar 31 2018, 2:38 AM
yaxunl added a comment.Apr 9 2018, 2:00 PM

ping. Any further changes need to be done for this patch? Thanks.

rjmccall added inline comments.Apr 13 2018, 12:06 AM
lib/CodeGen/CGCUDANV.cpp
98

Can you take these as StringRefs or Twines?

104

I think "addUnderscoredPrefixToName" would be better.

134

Please move this comment down into the else clause (and terminate it with a semicolon) and add your own declaration comment in your clause.

lib/Frontend/CompilerInvocation.cpp
2109

Why is this done here? We infer the language mode from the input kind somewhere else.

yaxunl marked 4 inline comments as done.Apr 13 2018, 8:35 AM
yaxunl added inline comments.
lib/Frontend/CompilerInvocation.cpp
2109

It is usually done through CompilerInvocation::setLangDefaults. However, HIP does not have its own input kind nor is it defined as a language standard. Therefore it cannot use CompilerInvocation::setLangDefaults to set Opts.HIP.

yaxunl updated this revision to Diff 142422.Apr 13 2018, 8:42 AM
yaxunl marked an inline comment as done.

Revised by John's comments.

rjmccall added inline comments.Apr 14 2018, 2:59 AM
lib/Frontend/CompilerInvocation.cpp
2109

What are the values of -x if not input kinds or language standards?

yaxunl marked 2 inline comments as done.Apr 17 2018, 12:46 PM
yaxunl added inline comments.
lib/Frontend/CompilerInvocation.cpp
2109

I will add hip as input kind and language standard since it really is both.

yaxunl updated this revision to Diff 142818.Apr 17 2018, 12:50 PM
yaxunl marked an inline comment as done.
yaxunl retitled this revision from [HIP] Add hip file type and codegen for kernel launching to [HIP] Add hip input kind and codegen for kernel launching.
yaxunl edited the summary of this revision. (Show Details)

Add hip as input kind and language standard.

tra added inline comments.Apr 17 2018, 3:29 PM
lib/CodeGen/CGCUDANV.cpp
51–52

const CodeGenModule &CGM

lib/Frontend/InitPreprocessor.cpp
466–469

Is __CUDA__ supposed to be set during HIP compilation? My guess is that __HIP__ and __CUDA__ should be mutually exclusive.
You do set LangOpts.CUDA during HIP compilation, so this should be changed to if (CUDA && ! HIP)

lib/Sema/SemaDecl.cpp
9054–9055

This would be somewhat easier to read:

if (II && II->isStr(getLangOpts().HIP ? "hipConfigureCall" : "cudaConfigureCall") && ...
test/CodeGenCUDA/device-stub.cu
3–10

The changes in this file do not seem to have anything related to the code changes in this patch.
Did you intend to add some HIP tests here?

yaxunl marked 4 inline comments as done.Apr 18 2018, 11:59 AM
yaxunl added inline comments.
lib/Frontend/InitPreprocessor.cpp
466–469

HIP documentation does not require __CUDA__ to be defined. Will make changes as you suggested.

yaxunl updated this revision to Diff 142978.Apr 18 2018, 12:00 PM
yaxunl marked an inline comment as done.

Revised by Artem's comments.

Otherwise LGTM.

lib/CodeGen/CGCUDANV.cpp
51–52

Why doesn't the CGNVCUDARuntime just hold on to a reference to the CGM? That's what we do with all the other separated singletons (like the CGCXXABI), and it would let you avoid some of the redundant fields like Context and TheModule.

tra added inline comments.Apr 24 2018, 10:50 AM
lib/CodeGen/CGCUDANV.cpp
51–52

Actually, CGCUDARuntime already has CGM field, so the CGM argument can be just dropped.

yaxunl updated this revision to Diff 143792.Apr 24 2018, 11:55 AM

Remove CodeGenModule argument from addPrefix* functions.

tra added inline comments.Apr 24 2018, 12:04 PM
test/CodeGenCUDA/device-stub.cu
3–10

Do you need these changes?

yaxunl marked 2 inline comments as done.Apr 24 2018, 12:36 PM
yaxunl added inline comments.
test/CodeGenCUDA/device-stub.cu
3–10

Sorry, some changes about HIP were lost during revision. I will get back those changes.

yaxunl updated this revision to Diff 143800.Apr 24 2018, 1:19 PM
yaxunl marked 5 inline comments as done.

Add back HIP related changes to the tests.

tra accepted this revision.Apr 24 2018, 1:37 PM
This revision is now accepted and ready to land.Apr 24 2018, 1:37 PM
rjmccall accepted this revision.Apr 24 2018, 4:26 PM

Thank you.

This revision was automatically updated to reflect the committed changes.