This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP][AMDGCN] Support OpenMP offloading for AMDGCN architecture - Part 1
ClosedPublic

Authored by saiislam on May 11 2020, 5:13 PM.

Details

Summary

Allow AMDGCN as a GPU offloading target for OpenMP during compiler
invocation and allow setting CUDAMode for it.

Originally authored by Greg Rodgers (@gregrodgers).

Diff Detail

Event Timeline

saiislam created this revision.May 11 2020, 5:13 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald Transcript
saiislam updated this revision to Diff 263314.May 11 2020, 5:15 PM

Fixed lint errors.

arsenm added a subscriber: arsenm.May 11 2020, 6:00 PM

Test?

clang/lib/AST/Decl.cpp
3224

This is also identical to the cuda handling above, can you merge them

jdoerfert added inline comments.May 11 2020, 6:43 PM
clang/lib/AST/Decl.cpp
3224

Yes, wasn't there an idea to have isGPU()?

clang/lib/Frontend/CompilerInvocation.cpp
3162

Can we please not call it CUDA mode for non-CUDA targets? Or do you expect the compilation to "identify" as CUDA compilation?

llvm/include/llvm/ADT/Triple.h
700

What's the difference between an AMDGPU and AMDGCN?

sameerds added inline comments.May 11 2020, 6:45 PM
clang/lib/AST/Decl.cpp
3224

It's not identical, because CUDA is filtering out host code and its check is target independent.

But then, Saiyed, it seems that the new patch disallows library builtins on all languages that reach this point, both on host and device code. Is that the intention? Does this point in the flow preclude any side-effects outside of "OpenMP on AMDGCN"?

llvm/include/llvm/ADT/Triple.h
696

Why not just isAMDGPU()? I myself don't have an opinion either way, but still curious.

sameerds added inline comments.May 11 2020, 7:21 PM
clang/lib/Frontend/CompilerInvocation.cpp
3162

I suspect it's just a lot of water under the bridge. All over Clang, HIP has traditionally co-opted CUDA code without introducing new identifiers, like "-fcuda-is-device". It won't be easy to redo that now, and definitely looks out of scope for this change. A typical HIP compilation usually does identify itself as a HIP compilation like setting the isHIP flag. This allows the frontend to distinguish between CUDA and HIP when it matters.

yaxunl added inline comments.May 11 2020, 9:30 PM
llvm/include/llvm/ADT/Triple.h
700

AMDGPU inlude r600 and amdgcn. r600 are old AMD GPUs. They do not support flat address space therefore cannot support CUDA or HIP.

sameerds added inline comments.May 11 2020, 10:03 PM
llvm/include/llvm/ADT/Triple.h
700

As a separate topic, does that mean that Clang should reject r600 at an earlier entry point itself?

sameerds added inline comments.May 11 2020, 10:27 PM
clang/lib/AST/Decl.cpp
3227

This needs a test.

clang/lib/Frontend/CompilerInvocation.cpp
3102

Needs a test.

3137

This is probably too fundamental to need a test on its own.

saiislam marked 5 inline comments as done.May 14 2020, 4:55 AM
saiislam added inline comments.
clang/lib/AST/Decl.cpp
3224

@sameerds this function returns a value indicating whether input function corresponds to a builtin function (returns BuiltinID), or not (returns 0) i.e. all conditions returning 0 are the exceptions where a valid BuiltinID can't be returned. The new condition only applies to non-OpenCL (line 3213) builtin functions which are not printf and malloc, and have been targeted for amdgcn.

3224

@jdoerfert Can you please elaborate on isGPU() idea? I am not aware about it.
Guessing by the name, I have added isGPU() as a wrapper over isNVPTX() and isAMDGCN() in the next revision.

3225

this condition is not required because it has already been checked in line #3200 earlier.

clang/lib/Frontend/CompilerInvocation.cpp
3162

@jdoerfert OpenMP support document of clang defines two data sharing modes for cuda devices: CUDA and Generic mode. NVPTX and AMDGCN both are cuda devices. Doesn't it make sense to not refactor CUDA mode as of now?

saiislam updated this revision to Diff 263972.May 14 2020, 4:57 AM
saiislam marked an inline comment as done.

Added test cases. Added a wrapper isGPU() for isNVPTX()/isAMDGCN().

saiislam marked 7 inline comments as done and 2 inline comments as done.May 14 2020, 5:00 AM
yaxunl added inline comments.May 14 2020, 5:30 AM
llvm/include/llvm/ADT/Triple.h
700

I think HIP checks triple and rejects triple other than amdgcn

arsenm added inline comments.May 14 2020, 9:31 AM
llvm/include/llvm/ADT/Triple.h
699

This is skipping out r600, and probably should leave this in clang?

saiislam updated this revision to Diff 264192.May 15 2020, 2:35 AM

Moved isGPU() from llvm's Triple.h to clang's TargetInfo. Renamed it to isOpenMPGPU()
to represent target's compatibility with OpenMP offloading and reduce its scope.

sameerds added inline comments.May 15 2020, 9:15 AM
clang/include/clang/Basic/TargetInfo.h
1261 ↗(On Diff #264192)

How is "OpenMP-compatible GPU" defined? I think it's too early to start designing predicates about whether a target is a GPU and whether it supports OpenMP.

clang/lib/AST/Decl.cpp
3224

This seems awkward to me. Why mix it up with only CUDA and HIP? The earlier factoring is better, where CUDA/HIP took care of their own business, and the catch-all case of AMDGCN was a separate clause by itself. It doesn't matter that the builtins being checked for AMDGCN on OpenMP are currently identical to CUDA/HIP. When this situation later changes (I am sure OpenMP will support more builtins), we will have to split it out again anyway.

clang/lib/Frontend/CompilerInvocation.cpp
3100

I am not particularly in favour of introducing a variable just because it looks smaller than a call at each appropriate location. If you really want it this way, at least make it a const.

3102–3103

Looking at the comment before this line, the correct predicate would "target supports exceptions with OpenMP". Is it always true that every GPU that supports OpenMP will not support exception handling? I would recommend just checking individual targets for now instead of inventing predicates.

3156

Is there any reason to believe that every future GPU added to this predicate will also want the CUDA mode set? I would recommend using individual targets for now instead of inventing a new predicate.

3161

Same doubt about this use of an artificial predicate as commented earlier.

sameerds added inline comments.May 16 2020, 7:51 PM
clang/lib/AST/Decl.cpp
3224

This clause seems to assume that AMDGCN automatically means OpenMP whenever it is not HIP. That does not sound right. Is there a Context.getLangOpts().OpenMP flag? If not, why not? There also seems to be an Opts.OpenMPIsDevice ... perhaps that could be used here to write a separate OpenMP clause?

saiislam updated this revision to Diff 264529.May 17 2020, 9:55 PM

Removed isOpenMPGPU() to avoid defining OpenMP compatibility of an architecture.
Reverting back to explicitly checking NVPTX and AMDGCN architectures. Also, split
handling of NVPTX's and AMDGCN's handling of getBuiltinID. For AMDGCN it now uses
OpenMPIsDevice LangOpt and returns 0 for every device library function, except for
printf and malloc.

sameerds added inline comments.May 17 2020, 10:35 PM
clang/lib/AST/Decl.cpp
3227

This still needs a test. One that shows that specific builtins are allowed and others are not.

clang/test/Driver/openmp-offload-gpu.c
257 ↗(On Diff #264529)

Is there a functional reason to move these lines? Changes to existing files should be minimized to show only functional changes. Any NFC rearrangement should be a separate patch.

sameerds added inline comments.May 17 2020, 10:51 PM
clang/lib/AST/Decl.cpp
3224

This should say "OpenMP does not have ..."?

sameerds added inline comments.May 18 2020, 8:27 AM
clang/lib/AST/Decl.cpp
3226

Why is the check for AMDGCN required here? Doesn't the language define the set of supported builtins in a language-independent way? At least that seems to be true for OpenCL and CUDA above.

saiislam updated this revision to Diff 264766.May 18 2020, 6:52 PM

Fixed typo. Added test case to check for device side functions.

saiislam updated this revision to Diff 265716.May 22 2020, 5:51 AM

Added test case to show treatment of specific functions as builtins or functions on the device

saiislam marked 19 inline comments as done.May 22 2020, 6:00 AM
saiislam added inline comments.
clang/lib/AST/Decl.cpp
3226

AMDGCN check is required because this block deals with the openmp target when it is amdgcn. A future or existing target device might decide to deal with library calls in its own way.

3227

Added test case to show difference in treatment of printf() and other predefined library functions, on the device.

saiislam updated this revision to Diff 266171.May 26 2020, 5:04 AM
saiislam marked 2 inline comments as done.

Shifted test cases in openmp-offload-gpu.c for better visual segmentation.
Updated device function test case to be more accurate.

I'm generally fine with this, don't wait for my approval.

clang/test/OpenMP/amdgcn_device_function_call.cpp
27 ↗(On Diff #266171)

Not for this patch:
FWIW, we will need to make math functions work inside target regions. The way aomp does it is afaik different from the way we do it. We can however adopt our way for this target though. Feel free to ping me on this later.

saiislam marked an inline comment as done.May 26 2020, 10:34 AM
saiislam added inline comments.
clang/test/OpenMP/amdgcn_device_function_call.cpp
27 ↗(On Diff #266171)

Yes, sure. I will ping you as we move towards it. Thanks.

This revision is now accepted and ready to land.May 27 2020, 12:24 AM
This revision was automatically updated to reflect the committed changes.