[libomptarget] Enable AMDGPU devicertl
ClosedPublic

Authored by JonChesterfield on Apr 23 2021, 5:18 PM.

Details

Summary

[libomptarget] Enable AMDGPU devicertl

The amdgpu devicertl is written in freestanding openmp and compiles to a
bitcode library (per listed gfx arch) with no unresolved symbols. It requires
a recent clang, preferably the one from the same monorepo checkout.

This is D98658, with printf explicitly stubbed out, after patching clang to no
longer require an llvm with the amdgpu target enabled.

Event Timeline

JonChesterfield requested review of this revision. Apr 23 2021, 5:18 PM
Herald added a project: Restricted Project. Apr 23 2021, 5:18 PM
tianshilei1992 accepted this revision. Apr 23 2021, 5:23 PM

LG. The only thing missing here is that if the compiler doesn't support the target (whether NVPTX or AMDGCN), we need to disable it, so we need a check in CMake. I'll do that once I'm done with my current work.
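
One way such a check could work is to look at the targets the compiler reports as registered. This is a minimal sketch, not the check that was eventually written: `has_registered_target` is a hypothetical helper, and it assumes the clang in use accepts `--print-targets` and prints the usual "Registered Targets" listing. Note that, as discussed later in this thread, clang may still emit device bitcode without the backend, so this test is conservative.

```shell
#!/bin/sh
# Hypothetical helper, not part of libomptarget.
# Reads a registered-targets listing on stdin and succeeds if the target
# named in $1 appears in it. Target lines are assumed to look like:
#     amdgcn     - AMD GCN GPUs
has_registered_target() {
    awk -v want="$1" '$1 == want && $2 == "-" { found = 1 } END { exit !found }'
}

# Example use (assumes this clang supports --print-targets):
#   clang --print-targets | has_registered_target amdgcn && echo "enable amdgcn"
```

A CMake check could run the same query through `execute_process` and only add the device runtime directory when it succeeds.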

This revision is now accepted and ready to land. Apr 23 2021, 5:23 PM
JonChesterfield added a comment. Edited Apr 23 2021, 6:15 PM

There are some interesting configuration choices.

In particular, we can build the nvptx or amdgcn device bitcode with a (recent/same-commit) clang, even if the llvm doesn't have either of those targets enabled. If that same clang is later used to build an application, it (hopefully!) errors about the missing target at that point.

We currently use the existence of cuda to control whether to run tests on nvptx. That's not strictly the right test: a machine may have cuda installed but no nvptx hardware.
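
As a rough illustration of testing for hardware rather than for an installed toolkit, the sketch below looks for driver device nodes. It is a heuristic under stated assumptions: it is Linux-only, the helper names are hypothetical, and it assumes the NVIDIA driver exposes `/dev/nvidia0` and the AMD compute driver exposes `/dev/kfd` when a usable device is present.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the libomptarget test setup.

# Succeeds if at least one of the given paths exists.
any_path_exists() {
    for p in "$@"; do
        [ -e "$p" ] && return 0
    done
    return 1
}

# Assumption: these device nodes indicate a GPU the driver can talk to.
nvptx_hardware_present()  { any_path_exists /dev/nvidia0; }
amdgcn_hardware_present() { any_path_exists /dev/kfd; }

# Example use:
#   nvptx_hardware_present && echo "run nvptx tests"
```

This separates "the toolkit is installed" from "a device is usable", which is the distinction the comment above draws.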

I'm trying to get a simplified cmake (based on llvm-config) to run, and will hold off on committing this until I can tell whether that's going to work out.

edit: didn't work. llvm-config not found during a runtimes build for some reason. Will land this and iterate in tree.

  • detect aux triple, following nvptx
This revision was landed with ongoing or failed builds. Apr 23 2021, 6:25 PM
This revision was automatically updated to reflect the committed changes.

I think we are in dire need of a mechanism that tests whether the compiler can actually compile the source code, without making any assumptions. The corresponding directory should ONLY be included if the compiler qualifies; otherwise we should bail out ahead of time. So far I have found multiple issues:

  • libomptarget depends on LLVM components, so when I try to build OpenMP standalone with GCC, CMake configuration fails.
  • The AMDGCN device runtime compilation fails because my Clang doesn't support AMDGCN backend. This failure can block the whole compilation of OpenMP.

I'll put up a patch that performs the corresponding detection before including any directory in libomptarget. By default, everything will be disabled.
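
The "probe first, disabled by default" idea can be sketched as a try-compile: feed the compiler a trivial source and enable a target only if bitcode comes out. This is an illustration rather than the eventual patch. The function names are hypothetical, and the flags are borrowed from the device runtime build command quoted further down this thread.

```shell
#!/bin/sh
# Hypothetical helpers, not part of libomptarget.

# Succeeds if compiler $1 can turn a trivial C++ source into LLVM bitcode
# for target triple $2. All compiler output is discarded.
can_emit_bitcode_for() {
    compiler=$1
    triple=$2
    "$compiler" -x c++ -c -emit-llvm -ffreestanding -nogpulib \
        -target "$triple" -o /dev/null - >/dev/null 2>&1 <<'EOF'
int probe(void) { return 0; }
EOF
}

# Prints the subset of the given triples that pass the probe, one per line.
# Nothing is enabled unless its probe succeeds.
enabled_targets() {
    compiler=$1
    shift
    for triple in "$@"; do
        if can_emit_bitcode_for "$compiler" "$triple"; then
            echo "$triple"
        fi
    done
}

# Example use:
#   enabled_targets clang amdgcn-amd-amdhsa nvptx64-nvidia-cuda
```

Because the probe exercises the same driver path as the real build, a compiler that rejects the target fails here instead of partway through the OpenMP build.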

  • The AMDGCN device runtime compilation fails because my Clang doesn't support AMDGCN backend. This failure can block the whole compilation of OpenMP.

For what it's worth, that's a bug in clang that has been fixed. I guess it was introduced before the last release and fixed after, so there's a window of vulnerability there.

tianshilei1992 added a comment. Edited May 16 2021, 1:44 PM
  • The AMDGCN device runtime compilation fails because my Clang doesn't support AMDGCN backend. This failure can block the whole compilation of OpenMP.

For what it's worth, that's a bug in clang that has been fixed. I guess it was introduced before the last release and fixed after, so there's a window of vulnerability there.

I'm using the latest trunk, and I can still observe:

[31/174] Generating sync.gfx900.bc
FAILED: libomptarget/deviceRTLs/amdgcn/sync.gfx900.bc
cd /nvm/0/shiltian/build/openmp/debug/libomptarget/deviceRTLs/amdgcn && /home/shiltian/.local/llvm-12.0.0/bin/clang -xc++ -c -std=c++14 -ffreestanding -target amdgcn-amd-amdhsa -emit-llvm -Xclang -aux-triple -Xclang x86_64-unknown-linux-gnu -fopenmp -fopenmp-cuda-mode -Xclang -fopenmp-is-device -D__AMDGCN__ -Xclang -target-cpu -Xclang gfx900 -fvisibility=default -Wno-unused-value -nogpulib -O2 -I/home/shiltian/Documents/vscode/llvm-project/openmp/libomptarget/deviceRTLs/amdgcn/src -I/home/shiltian/Documents/vscode/llvm-project/openmp/libomptarget/deviceRTLs/common/include -I/home/shiltian/Documents/vscode/llvm-project/openmp/libomptarget/deviceRTLs /home/shiltian/Documents/vscode/llvm-project/openmp/libomptarget/deviceRTLs/common/src/sync.cu -o sync.gfx900.bc
clang (LLVM option parsing): Unknown command line argument '--amdhsa-code-object-version=3'.  Try: 'clang (LLVM option parsing) --help'
clang (LLVM option parsing): Did you mean '--pgo-memop-max-version=3'?

Interestingly, although I already set CMAKE_C_COMPILER and CMAKE_CXX_COMPILER in the CMake configuration, the build still uses the clang in my $PATH.

That's annoying. D101095 fixed that at the time, and hasn't been reverted, though I don't have a CI system to catch regressions.
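
For illustration, the lookup order being asked for (an explicitly configured compiler first, `$PATH` only as a fallback) might look like the sketch below. `pick_device_clang` is a hypothetical name and this is not the logic from D101095.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the libomptarget build.

# Prints the clang to use for device code: the explicitly configured
# compiler in $1 if it is set and executable, otherwise whatever `clang`
# resolves to on PATH (printing nothing and failing if there is none).
pick_device_clang() {
    configured=${1:-}
    if [ -n "$configured" ] && [ -x "$configured" ]; then
        echo "$configured"
    else
        command -v clang
    fi
}

# Example use:
#   device_clang=$(pick_device_clang "$CMAKE_CXX_COMPILER")
```

With this order, a stale clang on `$PATH` such as the llvm-12.0.0 one in the log above would only be picked up when no compiler was configured.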